☝️ NOTE |
---|
Model parameter schema files have moved to the CTAO GitLab; see this repository. |
CWL implementations of workflows for the CTA Simulation Pipeline.
Workflows are used to set, derive, and validate simulation model parameters following all or a subset of these steps:
Setting Workflows:
- Receive: data or parameter update through an API.
- Assert: verify that input data or parameters are in correct units, formats, allowed ranges, etc.
- Derive: derive simulation model parameter from received data (not applicable to all input data).
- Validate: validate updated simulation model parameter (e.g., by comparison with measurements).
- Review: review updates to simulation model parameter(s) and validation steps (by an expert; in some cases automated).
Acceptance Workflows:
- Feedback: provide feedback on validation and review (successful / unsuccessful validation).
- Update DB: update simulation model database with new parameter (successful validation).
- Document: document all derivation and validation steps.
The implementation in simtools-workflows consists of the following main components:
- workflows encoded in Common Workflow Language (CWL) (see ./workflows in this repository)
- parameter schemas defining all input data and simulation model parameters (see ./schemas in this repository)
The workflows use simtools and the simulation software (e.g., CORSIKA and sim_telarray).
CWL workflows consist of:
- Tools: called by the steps of a workflow, each performing a single task. In most cases, a tool calls a simtools application with all required configuration parameters (e.g., workflows/tools/derive_array_elements_coordinates.cwl).
- Workflows: connect tools and execute the steps discussed above (receive, assert, derive, validate, ...); see, e.g., workflows/DeriveArrayElementCoordinates.cwl.
The parameter schema files are used to describe all input data and simulation model parameters and can be found in ./schemas in this repository.
See the Guide to the Simulation Model Parameter File Schema for details.
Clone the simtools workflows repository and install dependencies using mamba or conda:
git clone git@github.com:gammasim/workflows.git
cd workflows
mamba env create -f environment.yml
mamba activate simtools-workflows-dev
All examples require Docker to be installed and running. The tools use the simtools-prod:latest Docker image.
The following command runs a workflow that converts telescope coordinates from UTM to CORSIKA coordinates:
cwltool DeriveArrayElementCoordinates.cwl ../tests/resources/test_derive_array_elements_coordinates.yml
The workflow steps executed are:
- assert input data (as defined in the configuration file; in this example in tests/resources/test_derive_array_elements_coordinates.yml)
- convert coordinates
- assert derived parameter values
Expected output:
- telescope position file in CORSIKA coordinates (ecsv format)
- metadata files describing each workflow step (yml format)
- log files (stdout and stderr; ascii format)
To run workflows that require access to the model database, use the following command:
cwltool --custom-net bridge \
--preserve-environment DB_API_PORT \
--preserve-environment DB_SERVER \
--preserve-environment DB_API_USER \
--preserve-environment DB_API_PW \
--preserve-environment DB_API_AUTHENTICATION_DATABASE \
SetMirrorPanelRandomReflection.cwl ../tests/resources/test_derive_mirror_panel_rnda.yml
The environment variables are used to configure the database connection (see simtools).
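Because cwltool only passes on variables that exist in the calling shell, it can help to check them before starting a workflow. The following is a minimal sketch, assuming a POSIX shell; the helper name check_db_env is hypothetical and not part of simtools or cwltool, and the variable names are taken from the command above.

```shell
# Variable names as used in the cwltool command above.
required_vars="DB_API_PORT DB_SERVER DB_API_USER DB_API_PW DB_API_AUTHENTICATION_DATABASE"

# Hypothetical helper: report every unset or empty variable on stderr
# and return non-zero if at least one is missing.
check_db_env() {
    missing=0
    for name in $required_vars; do
        # Indirect lookup via eval keeps this POSIX-sh compatible.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A possible use is to run check_db_env && cwltool ... so that the workflow only starts when all database settings are present.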
The following workflows are implemented and available for testing:
- DeriveArrayElementCoordinates.cwl: derive telescope coordinates from array element IDs
- SetMirrorRandomReflection.cwl: derive random reflection for all mirrors in a telescope
Use cwltool --validate file_name.cwl
to check a workflow file or command line tool for valid CWL code (this is also done by the CI).
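To check all CWL files at once, the validation command can be wrapped in a loop. This is a sketch under stated assumptions: the helper validate_all and the override variable CWL_VALIDATE are hypothetical names introduced here, and file paths are assumed to contain no whitespace.

```shell
# Hypothetical helper: validate every *.cwl file below a directory
# (default: the current directory).
# CWL_VALIDATE is an illustrative override for the validation command;
# it defaults to "cwltool --validate".
validate_all() {
    dir="${1:-.}"
    status=0
    # Paths are assumed to contain no whitespace.
    for file in $(find "$dir" -name '*.cwl' | sort); do
        if ${CWL_VALIDATE:-cwltool --validate} "$file" >/dev/null; then
            echo "ok: $file"
        else
            echo "FAILED: $file"
            status=1
        fi
    done
    # Non-zero if at least one file failed validation.
    return "$status"
}
```

Calling validate_all workflows would then print one line per file and return a non-zero status if any file is invalid.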
Prepare a workflow graph, e.g.:
cwltool --print-dot DeriveArrayElementCoordinates.cwl | dot -Tsvg > DeriveArrayElementCoordinates.cwl.svg
Alternatively, use https://view.commonwl.org/ (e.g., see this example).
The following notes are unsorted, but possibly useful for the implementation of new workflows:
- set the output directory with --outdir (otherwise result files are written to the current directory)
- capture the stdout of tools with cwltool --log-dir ./my-log-dir... This helps to find errors generated by the tools (otherwise the reported errors are mostly that output is not found).
- allow containers in tools to access the network: use cwltool --custom-net bridge... and add NetworkAccess: networkAccess: true to the tool requirements.
- set environment variables (propagated into containers) with cwltool --preserve-environment DB_API_PORT --preserve-environment DB_SERVER ...
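The options from the notes above can be combined in a small wrapper function. This is a minimal sketch, not part of the repository: the function name run_workflow and the default directories ./output and ./logs are assumptions chosen for illustration, and only two of the database variables are passed on to keep the example short.

```shell
# Hypothetical wrapper combining the cwltool options from the notes above.
# Arguments: workflow file, job file, output directory, log directory.
run_workflow() {
    workflow="$1"
    job="$2"
    outdir="${3:-./output}"   # illustrative default
    logdir="${4:-./logs}"     # illustrative default
    mkdir -p "$outdir" "$logdir"
    cwltool \
        --outdir "$outdir" \
        --log-dir "$logdir" \
        --custom-net bridge \
        --preserve-environment DB_API_PORT \
        --preserve-environment DB_SERVER \
        "$workflow" "$job"
}
```

With this in place, run_workflow DeriveArrayElementCoordinates.cwl ../tests/resources/test_derive_array_elements_coordinates.yml would write results to ./output and tool logs to ./logs.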
This project is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 460248186 (PUNCH4NFDI).