# KPF Processing Scripts
This page describes the scripts used for processing KPF data, including the ones used for production processing.

Before listing the scripts, we need to import them.

In [1]:
import sys
codepath = '/code/KPF-Pipeline/scripts/'
sys.path.append(codepath)
import generate_time_series_plots
import ingest_dates_kpf_tsdb
import ingest_watch_kpf_tsdb
import kpf_processing_progress
import launch_kpf_tsdb_plotting
import make_plots_kpf_tsdb
import qlp_parallel
from modules.Utils.string_proc import print_shell_script_docstring



All plotting processes have completed.


### generate_time_series_plots.py
This commandline script generates KPF time series plots over several time intervals (day, month, year, decade, and custom date ranges) and saves them to predefined directories.  It runs the plotting tasks in multiple threads and reports on their status.  The docstring is below.

In [2]:
print(generate_time_series_plots.__doc__)


Script Name: generate_time_series_plots.py

Description:
    This script generates KPF time series plots over various time intervals and
    saves the results to predefined directories. It supports multithreaded execution
    for tasks with different intervals and date ranges, including daily, monthly,
    yearly, and decade-based plots. Additionally, the script monitors the status
    of running threads and reports on their activity.

Features:
    - Generates plots for multiple time intervals (day, month, year, decade).
    - Supports custom date ranges for plot generation.
    - Multithreaded execution for efficiency, allowing simultaneous tasks.
    - Monitors thread status and execution time for each task.
    - Saves results in a structured format for further analysis.

Usage:
    Run this script with optional arguments to specify the database path:

        python generate_time_series_plots.py --db_path /path/to/database.db

Options:
    --db_path   Path to the time series data
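
The docstring above mentions multithreaded execution with per-task timing. The following is a minimal sketch of that pattern using only the standard library. It is not the script's actual implementation: `make_plots` and `run_all` are hypothetical names, and the plotting work is replaced by a short sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_plots(interval):
    """Hypothetical stand-in for the real plotting work for one interval."""
    time.sleep(0.01)  # placeholder for plot generation
    return f"plots for {interval} done"


def run_all(intervals, max_workers=4):
    """Run one task per interval in a thread pool and record elapsed times."""
    results = {}
    started = {}
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for interval in intervals:
            started[interval] = time.monotonic()
            futures[pool.submit(make_plots, interval)] = interval
        # Collect each task as it finishes and note how long it took.
        for future in as_completed(futures):
            interval = futures[future]
            elapsed = time.monotonic() - started[interval]
            results[interval] = (future.result(), elapsed)
    return results


if __name__ == "__main__":
    report = run_all(["day", "month", "year", "decade"])
    for interval, (status, elapsed) in report.items():
        print(f"{interval}: {status} ({elapsed:.2f} s)")
```

A thread pool keeps the number of simultaneous tasks bounded, which matters if every task reads from the same database file.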

### ingest_dates_kpf_tsdb.py
This commandline script is used to ingest keywords, RVs, and other information into an observational database that can be used for various purposes, including making plots of performance over time.  Compared with ingest_watch_kpf_tsdb.py (below), this script is used for a specified date range and is not typically used in production processing.  The docstring is below.

In [3]:
print(ingest_dates_kpf_tsdb.main.__doc__)


    Script Name: ingest_dates_kpf_tsdb.py
   
    Description:
      This script is used to ingest KPF observations over a date range into a 
      KPF Time Series Database.

    Options:
      --help        Display help message
      --start_date  Start date in YYYYMMDD format
      --end_date    End date in YYYYMMDD format
   
    Usage:
      ./ingest_dates_kpf_tsdb.py YYYYMMDD YYYYMMDD dbname.db
   
    Example:
      ./ingest_dates_kpf_tsdb.py 20231201 20240101 kpfdb.db
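
The usage line above takes two YYYYMMDD dates and a database name as positional arguments. As a hedged illustration of how such arguments could be parsed and validated with `argparse`, consider the sketch below. The names `yyyymmdd` and `build_parser` are hypothetical and are not taken from the script itself.

```python
import argparse
from datetime import datetime


def yyyymmdd(text):
    """argparse type that accepts only valid dates written as YYYYMMDD."""
    if len(text) != 8 or not text.isdigit():
        raise argparse.ArgumentTypeError(f"not a YYYYMMDD date: {text!r}")
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        raise argparse.ArgumentTypeError(f"not a valid calendar date: {text!r}")
    return text


def build_parser():
    """Build a parser for: start_date end_date db_path (all positional)."""
    parser = argparse.ArgumentParser(
        description="Ingest KPF observations over a date range (sketch)."
    )
    parser.add_argument("start_date", type=yyyymmdd, help="start date, YYYYMMDD")
    parser.add_argument("end_date", type=yyyymmdd, help="end date, YYYYMMDD")
    parser.add_argument("db_path", help="database file, e.g. kpfdb.db")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["20231201", "20240101", "kpfdb.db"])
    print(args.start_date, args.end_date, args.db_path)
```

Validating dates at parse time means a malformed argument is rejected before any database work starts.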
    


### ingest_watch_kpf_tsdb.py
The commandline script ingest_watch_kpf_tsdb.py is used to ingest keywords, RVs, and other information into an observational database that can be used for various purposes, including making plots of performance over time.  The docstring is below.

In [4]:
print(ingest_watch_kpf_tsdb.__doc__)


Script Name: ingest_watch_kpf_tsdb.py

Description:
    This script watches directories for new or modified KPF files and ingests their
    data into a KPF Time Series Database. The script utilizes the Watchdog library
    to monitor filesystem events and triggers ingestion processes. Additionally, it 
    performs periodic scans of data directories to ensure all observations are 
    ingested.

Features:
    - Ingests file metadata and telemetry into the database.
    - Watches multiple directories for new or modified KPF files.
    - Performs periodic scans of data directories.
    - Supports multithreaded execution.

Usage:
    Run this script with optional arguments to specify the database path:
    
        python ingest_watch_kpf_tsdb.py --db_path /path/to/database.db

Options:
    --db_path   Path to the time series database file. Default: /data/time_series/kpf_ts.db

Examples:
    1. Using default database path:
        python ingest_watch_kpf_tsdb.py

    2. Specifying a cust
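
The docstring describes periodic scans of the data directories that complement the Watchdog-based monitoring. The sketch below shows one way such a scan could detect new or modified files by comparing modification times, using only the standard library. The function name `scan_for_changes` and the `.fits` filter are assumptions for illustration and do not reflect the script's actual code.

```python
import os


def scan_for_changes(directory, last_seen):
    """Return paths of .fits files that are new or modified since the last scan.

    last_seen maps path -> modification time from earlier scans and is
    updated in place, so repeated calls only report fresh changes.
    """
    changed = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(".fits"):
                continue
            path = os.path.join(root, name)
            mtime = os.path.getmtime(path)
            if last_seen.get(path) != mtime:
                last_seen[path] = mtime
                changed.append(path)
    return changed
```

Calling this function on a timer would pick up files whose filesystem events were missed, at the cost of walking the directory tree on every scan.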

### kpf_slowtouch.sh
This shell script touches a list of KPF L0 files, which is useful for initiating reprocessing with the KPF DRP.  The docstring is below.

In [5]:
print_shell_script_docstring(codepath + 'kpf_slowtouch.sh')

Script name: kpf_slowtouch.sh
Author: Andrew Howard
        with assistance from Chat-GPT4
        or maybe the other way around
Date: June 23, 2023

This script is used to touch a list of KPF L0 files that have names like
KP.20230623.12345.67.fits.  This is useful to initiate reprocessing
using the KPF DRP.  The list of L0 files can be provided in multiple ways:
   1. As command-line arguments when invoking the script.
   2. In the first column of a CSV file specified with the -f option.
      This is useful for CSV files with a large set of L0 filenames
      downloaded from Jump.  Such files might have double quotes around
      the L0 filename, which the script will remove when appropriate.
   3. All filenames in a directory specified with the -d option.

Command-line options (all are optional):
-f <filename>       : The script will read the KPF L0 filenames
                      from the first column of a CSV with the name <filename>.
                      Useful for lists of L0 f
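
To illustrate the CSV input mode described above, here is a minimal Python sketch that reads filenames from the first column of a CSV file and touches each existing file with a pause in between. It is not a translation of the shell script: the function names and the `delay_seconds` parameter are hypothetical, and header rows are not handled.

```python
import csv
import time
from pathlib import Path


def read_l0_names(csv_path):
    """Read L0 filenames from the first column of a CSV file."""
    names = []
    with open(csv_path, newline="") as handle:
        # csv.reader already removes standard double-quote wrapping;
        # the extra strip covers stray quotes left in the field.
        for row in csv.reader(handle):
            if row and row[0].strip():
                names.append(row[0].strip().strip('"'))
    return names


def slow_touch(paths, delay_seconds=0.2):
    """Touch each existing file in turn, pausing between files."""
    touched = []
    for path in map(Path, paths):
        if path.exists():
            path.touch()  # updates the modification time
            touched.append(path)
            time.sleep(delay_seconds)
    return touched
```

Spacing out the touches avoids triggering a large burst of reprocessing events at once, which is presumably the motivation behind the script's name.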

### kpf_processing_progress.py
This commandline script is used to assess the status and progress of processing KPF data.  The docstring is below.

In [6]:
print(kpf_processing_progress.main.__doc__)


    Script Name: kpf_processing_progress.py
   
    Description:
      This script is used to assess the status and progress of processing KPF data.
      It searches over a range of dates specified by the first two arguments which are 
      of the form YYYYMMDD.  For each date (with /data/kpf/L0/YYYYMMDD as the 
      assumed L0 directory), it examines each L0 file and the associated 2D/L1/L2 
      files in their related directories.  If the first argument is a date after the 
      second argument, then the dates are printed in reverse chronological order (later 
      dates first).  The output of this script is a table with columns indicating the 
      date for each row, the most recent modification date for an L0 file in that 
      directory, the fraction of 2D files processed, the fraction of L1 files processed, 
      and the fraction of L2 files processed.  Sample output is shown below.
      
      > ./scripts/kpf_processing_progress.py 20231231 20230101 --current_version
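
Two behaviors from the docstring can be sketched in a few lines: iterating over dates in reverse when the first argument is later than the second, and computing the fraction of L0 files that have a matching processed file. The code below is an illustration only. The function names are hypothetical, and the naming rule (L0 stem plus a suffix such as `_2D`) is an assumption rather than something stated on this page.

```python
from datetime import datetime, timedelta


def ordered_dates(first, second):
    """List YYYYMMDD dates from first to second, descending if first is later."""
    a = datetime.strptime(first, "%Y%m%d").date()
    b = datetime.strptime(second, "%Y%m%d").date()
    step = 1 if a <= b else -1
    n_days = abs((b - a).days)
    return [(a + timedelta(days=step * i)).strftime("%Y%m%d")
            for i in range(n_days + 1)]


def processed_fraction(l0_files, product_files, suffix):
    """Fraction of L0 files that have a matching product file.

    Assumes a product is named like the L0 file with a suffix before
    '.fits', e.g. KP.20230623.12345.67_2D.fits (an assumption).
    """
    if not l0_files:
        return 0.0
    available = set(product_files)
    done = sum(
        1 for name in l0_files
        if name[: -len(".fits")] + suffix + ".fits" in available
    )
    return done / len(l0_files)
```

For example, `ordered_dates("20230103", "20230101")` yields the three dates with the latest first, matching the reverse chronological behavior described in the docstring.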

### launch_kpf_tsdb_plotting.py
This commandline script is used to automatically ingest information from KPF data products into an Observational Database.  It is part of the production processing pipeline.  The docstring is below.

In [7]:
print(launch_kpf_tsdb_plotting.__doc__)

### launch_qlp.sh
This commandline script is used to launch multiple instances of the quicklook pipeline to generate standard diagnostic plots for L0/2D/L1/L2/master data products.  It is part of the production processing pipeline.  The docstring is below.

In [8]:
print_shell_script_docstring(codepath + 'launch_qlp.sh')

Script name: launch_qlp.sh
Author: Andrew Howard

This script launches 15 QLP (Quicklook Pipeline) instances for data levels
L0, 2D, L1, L2, and masters. It utilizes the specified recipe and config
files to process observational data in the KPF pipeline. The script can
optionally process only recent observations from the current day.

Command-line options (all are optional):
  --only_recent       Use a specialized recipe to process only observations
                      from the current day.
  -h, --help          Display this help message and exit.

Examples:
1. Launch QLP instances for all data levels with the default recipe:
   ./launch_qlp.sh

2. Launch QLP instances for only recent observations:
   ./launch_qlp.sh --only_recent

3. Display the help message:
   ./launch_qlp.sh -h
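
The shell script docstrings on this page are printed with `print_shell_script_docstring`, whose implementation is not shown here. As a rough idea of how such a helper could work, the sketch below extracts the leading comment block from a shell script. The function `shell_script_docstring` is hypothetical and may differ from the real helper in `modules.Utils.string_proc`.

```python
def shell_script_docstring(path):
    """Return the leading comment block of a shell script without '#' markers."""
    lines = []
    with open(path) as handle:
        for number, raw in enumerate(handle):
            line = raw.rstrip("\n")
            if number == 0 and line.startswith("#!"):
                continue  # skip the shebang line
            if line.startswith("#"):
                # Drop the marker and one following space, if present.
                lines.append(line[2:] if line.startswith("# ") else line[1:])
            elif lines or line.strip():
                break  # the comment block ended, or code came first
    return "\n".join(lines)
```

The block ends at the first line that does not start with `#`, so comments further down in the script are not included.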


### make_plots_kpf_tsdb.py
This commandline script is used to generate standard KPF time series plots of telemetry and other information.  It uses the Observational database and is part of the production processing pipeline.  The docstring is below.

In [9]:
print(make_plots_kpf_tsdb.__doc__)


Script Name: make_plots_kpf_tsdb.py

Description:
    This script generates standard KPF Time Series plots from the database. 
    It supports plotting data over specific intervals such as day, month, year, 
    decade, or custom ranges like the last N days. The plots are saved to 
    predefined directories for further use or analysis.

Features:
    - Creates time series plots for various intervals (day, month, year, decade).
    - Supports custom ranges like "last N days".
    - Saves plots in a structured directory format.
    - Includes configurable wait times for process orchestration.

Usage:
    Run this script with required arguments to specify the database path, 
    interval, and wait time:

        python make_plots_kpf_tsdb.py --db_path /path/to/database.db --interval <interval> --wait_time <seconds>

Options:
    --db_path       Path to the time series database file. 
                    Default: /data/time_series/kpf_ts.db
    --interval      Interval for plotting. Supp
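
The intervals named in the docstring (day, month, year, decade, last N days) each correspond to a date window. The sketch below shows one plausible way to compute those windows. The function names and the exact boundaries (start inclusive, end exclusive) are assumptions for illustration, not the script's documented behavior.

```python
from datetime import date, timedelta


def interval_bounds(interval, today):
    """Return (start, end) dates for the interval containing today.

    The end date is exclusive. Supported values: day, month, year, decade.
    """
    if interval == "day":
        return today, today + timedelta(days=1)
    if interval == "month":
        start = today.replace(day=1)
        if start.month == 12:
            return start, date(start.year + 1, 1, 1)
        return start, date(start.year, start.month + 1, 1)
    if interval == "year":
        return date(today.year, 1, 1), date(today.year + 1, 1, 1)
    if interval == "decade":
        first_year = today.year - today.year % 10
        return date(first_year, 1, 1), date(first_year + 10, 1, 1)
    raise ValueError(f"unknown interval: {interval!r}")


def last_n_days_bounds(n_days, today):
    """Return (start, end) covering the n_days ending with today (end exclusive)."""
    if n_days < 1:
        raise ValueError("n_days must be at least 1")
    return today - timedelta(days=n_days - 1), today + timedelta(days=1)
```

Using an exclusive end date makes adjacent windows line up without overlap, for example one month ending exactly where the next begins.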

### qlp_parallel.py
This commandline script is used to reprocess Quicklook data products over a specified date range.  The docstring is below.

In [10]:
print(qlp_parallel.main.__doc__)


    Script Name: qlp_parallel.py
   
    Description:
      This command line script uses the 'parallel' utility to execute the recipe 
      called 'recipes/quicklook_match.recipe' to generate standard Quicklook data 
      products.  The script selects all KPF files based on their
      type (L0/2D/L1/L2/master) from the standard data directory using a date 
      range specified by the parameters start_date and end_date.  L0 files are 
      included if the --l0 flag is set or none of the --l0, --2d, --l1, --l2
      flags are set (in which case all data types are included).  The --2d, 
      --l1, and --l2 flags have similar functions.  The script assumes that it
      is being run in Docker and will return with an error message if not. 
      If start_date is later than end_date, the arguments will be reversed 
      and the files with later dates will be processed first.
      
      Invoking the --print_files flag causes the script to print filenames
      but not create QLP d
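
The flag logic described in the docstring (a data level is included if its flag is set, or if none of the four flags are set) can be sketched with `argparse` as follows. This is an illustration under assumptions: `build_parser` and `selected_levels` are hypothetical names, and only the four flags named above are modeled.

```python
import argparse


def build_parser():
    """Parser with two date arguments and one flag per data level."""
    parser = argparse.ArgumentParser(description="QLP flag selection (sketch).")
    parser.add_argument("start_date", help="YYYYMMDD")
    parser.add_argument("end_date", help="YYYYMMDD")
    parser.add_argument("--l0", action="store_true")
    # '2d' is not a valid attribute name, so give the flag an explicit dest.
    parser.add_argument("--2d", dest="d2", action="store_true")
    parser.add_argument("--l1", action="store_true")
    parser.add_argument("--l2", action="store_true")
    return parser


def selected_levels(args):
    """Return the data levels to process; all of them if no flag was given."""
    flags = {"L0": args.l0, "2D": args.d2, "L1": args.l1, "L2": args.l2}
    if not any(flags.values()):
        return list(flags)
    return [level for level, enabled in flags.items() if enabled]


if __name__ == "__main__":
    args = build_parser().parse_args(["20230101", "20230102", "--2d", "--l2"])
    print(selected_levels(args))
```

Treating "no flags" as "all levels" keeps the default invocation short while still allowing a reprocessing run to be narrowed to specific levels.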