from-parquet-pr

  • domain(s): pairs
  • generates: ldc.api.supervised.pairs.PairData

Reads prompt/output pairs from Parquet database files.

usage: from-parquet-pr [-h] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                       [-N LOGGER_NAME] [-i [INPUT [INPUT ...]]]
                       [-I [INPUT_LIST [INPUT_LIST ...]]]
                       [--col_instruction COL] [--col_input COL]
                       [--col_output COL] [--col_id COL]
                       [--col_meta [COL [COL ...]]]

Reads prompt/output pairs from Parquet database files.

optional arguments:
  -h, --help            show this help message and exit
  -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --logging_level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        The logging level to use. (default: WARN)
  -N LOGGER_NAME, --logger_name LOGGER_NAME
                        The custom name to use for the logger, uses the plugin
                        name by default (default: None)
  -i [INPUT [INPUT ...]], --input [INPUT [INPUT ...]]
                        Path to the parquet file(s) to read; glob syntax is
                        supported (default: None)
  -I [INPUT_LIST [INPUT_LIST ...]], --input_list [INPUT_LIST [INPUT_LIST ...]]
                        Path to the text file(s) listing the parquet files to
                        use (default: None)
  --col_instruction COL
                        The name of the column with the instructions (default:
                        None)
  --col_input COL       The name of the column with the inputs (default: None)
  --col_output COL      The name of the column with the outputs (default:
                        None)
  --col_id COL          The name of the column with the row IDs (gets stored
                        under 'id' in meta-data) (default: None)
  --col_meta [COL [COL ...]]
                        The name of the columns to store in the meta-data
                        (default: None)
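
The options above suggest two steps: collecting the files named by `-i`/`-I`, and mapping table columns onto the fields of a pair. The following is a minimal sketch of that logic, not the plugin's actual implementation. The helper names `resolve_inputs` and `row_to_pair` are hypothetical, rows are represented as plain dicts rather than being read from a real Parquet file, the returned dict is only a stand-in for `ldc.api.supervised.pairs.PairData`, and the one-path-per-line format of the list files is an assumption.

```python
import glob
from pathlib import Path


def resolve_inputs(inputs=None, input_lists=None):
    """Expand glob patterns (-i) and list files (-I) into a flat list of paths.

    Assumption: each list file holds one path per line; blank lines are skipped.
    """
    paths = []
    for pattern in inputs or []:
        # sorted() keeps the order deterministic across platforms
        paths.extend(sorted(glob.glob(pattern)))
    for list_file in input_lists or []:
        for line in Path(list_file).read_text().splitlines():
            if line.strip():
                paths.append(line.strip())
    return paths


def row_to_pair(row, col_instruction=None, col_input=None, col_output=None,
                col_id=None, col_meta=None):
    """Map one row (a plain dict) to a pair-like dict.

    Hypothetical stand-in for ldc.api.supervised.pairs.PairData.
    Columns that were not specified (None) yield None for that field.
    """
    meta = {}
    if col_id is not None:
        # the row ID is stored under the fixed key 'id' in the meta-data
        meta["id"] = row[col_id]
    for col in col_meta or []:
        # additional meta-data columns keep their own column name as key
        meta[col] = row[col]
    return {
        "instruction": row[col_instruction] if col_instruction else None,
        "input": row[col_input] if col_input else None,
        "output": row[col_output] if col_output else None,
        "meta": meta or None,
    }


# Example row with made-up column names, as it might appear in a Parquet table
row = {"idx": 7, "question": "What is 2+2?", "answer": "4", "source": "demo"}
pair = row_to_pair(row, col_instruction="question", col_output="answer",
                   col_id="idx", col_meta=["source"])
print(pair)
```

In this sketch the equivalent command-line options would be `--col_instruction question --col_output answer --col_id idx --col_meta source`. Because `--col_input` is omitted, the `input` field stays empty.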