Run the pipeline
The new pipeline uses a configuration file to specify how the pipeline should be run, i.e., to specify paths, which actions to perform, etc. You always need a configuration file for running the pipeline. The configuration filename should be kcor.FLAGS.cfg. See config/kcor.spec.cfg for the definition of each possible option. You must change at least the paths in this file in order to run the pipeline.
If you want to use the database, you must have your database login credentials in a file. I recommend creating a file that is not viewable by anyone but yourself. The format of the file should be:
[pipeline]
host : {database URL}
user : {actual username here}
password : {actual password here}
port : 3306
database : MLSO
The file can contain multiple logins, each in its own section. The config file specifies the location of the credentials file and which section of it to use.
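The credentials format above is INI-style, so it can be read with Python's standard configparser module. The following is a minimal sketch for checking that a credentials file parses as expected; it is not the pipeline's own code. The function name read_login and the sample host, user, and password values are hypothetical.

```python
import configparser

# Sample credentials text in the format described above.
# The host, user, and password values are placeholders, not real settings.
SAMPLE = """\
[pipeline]
host : db.example.org
user : example_user
password : example_password
port : 3306
database : MLSO
"""


def read_login(text, section):
    """Return the login settings stored in `section` as a dict."""
    # Disable interpolation so a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(section):
        raise KeyError(f"no section [{section}] in credentials file")
    login = dict(parser[section])
    login["port"] = int(login["port"])  # port is stored as text in the file
    return login


login = read_login(SAMPLE, "pipeline")
print(login["host"], login["port"], login["database"])
```

To read a real file instead of the sample text, parser.read(path) can replace parser.read_string(text).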
Running the pipeline requires a Python 3 installation with the psutil package installed (can be installed via pip).
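A quick way to confirm these requirements before a run is to check them from the interpreter that will run the kcor script. This is a small sketch, not part of the pipeline; the function name check_requirements is hypothetical.

```python
import importlib.util
import sys


def check_requirements():
    """Return a list of problems found; an empty list means all is well."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("psutil") is None:
        problems.append("psutil is not installed (try: pip install psutil)")
    return problems


for problem in check_requirements():
    print(problem)
```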
Use the kcor script in the bin directory of a kcor-pipeline installation to run the pipeline. As stated above, you will need to have a valid and appropriately named configuration file.
To run the pipeline for a day, or range of days, do something like the following:
$ kcor process -f reprocess-2018 20180702
where reprocess-2018 is the "FLAGS" portion of the configuration filename to be used for the run (here, kcor.reprocess-2018.cfg). To run the pipeline for multiple days, use "-" for a range (start date inclusive, end date exclusive) and "," to combine dates, like:
$ kcor process -f reprocess-2018 20180601-20180701,20180704
which runs the pipeline for all of June 2018 plus 20180704 (but not 20180701).
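The date syntax can be illustrated with a short Python sketch that expands an expression into the individual days it covers. This only demonstrates the semantics described above (inclusive start, exclusive end, commas to combine); it is not the kcor script's actual parsing code, and the function name expand_dates is hypothetical.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y%m%d"


def expand_dates(expression):
    """Expand an expression like '20180601-20180701,20180704' into days."""
    dates = []
    for part in expression.split(","):
        if "-" in part:
            start_text, end_text = part.split("-")
            day = datetime.strptime(start_text, DATE_FORMAT)
            end = datetime.strptime(end_text, DATE_FORMAT)
            # The start date is included; the end date is excluded.
            while day < end:
                dates.append(day.strftime(DATE_FORMAT))
                day += timedelta(days=1)
        else:
            # Parsing a single date also validates it.
            day = datetime.strptime(part, DATE_FORMAT)
            dates.append(day.strftime(DATE_FORMAT))
    return dates


dates = expand_dates("20180601-20180701,20180704")
print(len(dates), dates[0], dates[-1])
```

For the example expression, the result holds the 30 days of June 2018 followed by 20180704, and 20180701 is absent.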
To run the calibration for a day, or range of days, do something like the following:
$ kcor cal -f reprocess-2018 20180702
where reprocess-2018 is the "FLAGS" portion of a configuration filename to be used for the run. You can also do:
$ kcor cal --list files.txt -f reprocess-2018 20180702
where files.txt is a file containing a list of files to use for a calibration.