Workflows for simtools

☝️ NOTE
Model parameter schema files have moved to the CTAO GitLab; see this repository.

Introduction

CWL implementations of workflows for the CTA Simulation Pipeline.

Workflows are used to set, derive, and validate simulation model parameters following all or a subset of these steps:

Setting Workflows:

  1. Receive: data or parameter update through an API.
  2. Assert: verify that input data or parameters are in correct units, formats, allowed ranges, etc.
  3. Derive: derive simulation model parameters from received data (not applicable to all input data).
  4. Validate: validate updated simulation model parameters (e.g., by comparison with measurements).
  5. Review: review of updates to simulation model parameter(s) and validation steps (by an expert; in some cases automated).

Acceptance Workflows:

  1. Feedback: provide feedback on validation and review (successful / unsuccessful validation).
  2. Update DB: update simulation model database with new parameter (successful validation).
  3. Document: document all derivation and validation steps.

The implementation in simtools-workflows consists of the following main components:

The workflows use simtools and the simulation software (e.g., CORSIKA and sim_telarray).

Workflows and tools

CWL workflows consist of

Parameter and input data schema files

The parameter schema files are used to describe all input data and simulation model parameters and can be found in ./schemas in this repository.

See the Guide to the Simulation Model Parameter File Schema for details.

Using workflows

Getting started

Clone the simtools workflows repository and install dependencies using mamba or conda:

git clone git@github.com:gammasim/workflows.git
cd workflows
mamba env create -f environment.yml
mamba activate simtools-workflows-dev

All examples require Docker to be installed and running. The tools use the simtools-prod:latest Docker image.
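Before running the examples, it can help to confirm that the required commands are on the PATH. The following is a minimal sketch, not part of the repository; the function name missing_commands is hypothetical, and the check only tests that the commands exist, not that the Docker daemon is running.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Prints each of the given commands that is NOT found on PATH, one per line.
# Prints nothing if all commands are available.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Check the tools used in this README and report any that are missing.
missing=$(missing_commands git docker cwltool)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing >&2
fi
```

If nothing is reported, the commands were found; whether the Docker daemon is actually running still has to be checked separately.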

Workflow examples

Simple workflow (no database access required)

The following command line tool converts telescope coordinates from UTM to CORSIKA coordinates.

cwltool DeriveArrayElementCoordinates.cwl ../tests/resources/test_derive_array_elements_coordinates.yml

The workflow steps executed are:

Expected output:

  • telescope position file in CORSIKA coordinates (ecsv format)
  • metadata files describing each workflow step (yml format)
  • log files (stdout and stderr; ascii format)

Workflow with database access

To run workflows that require access to the model database, use the following command:

cwltool --custom-net bridge \
    --preserve-environment DB_API_PORT \
    --preserve-environment DB_SERVER \
    --preserve-environment DB_API_USER \
    --preserve-environment DB_API_PW \
    --preserve-environment DB_API_AUTHENTICATION_DATABASE \
    SetMirrorPanelRandomReflection.cwl ../tests/resources/test_derive_mirror_panel_rnda.yml

The environment variables configure the database connection (see simtools).
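A workflow started with an unset database variable tends to fail only inside the container, so a check up front may save time. The sketch below uses the five variable names from the command above; the function names missing_db_vars and preserve_env_args are hypothetical helpers and not part of this repository or of simtools.

```shell
#!/bin/sh
# Variable names taken from the cwltool command above.
DB_VARS="DB_API_PORT DB_SERVER DB_API_USER DB_API_PW DB_API_AUTHENTICATION_DATABASE"

# Hypothetical helper: print the name of every variable in DB_VARS
# that is unset or empty, one per line.
missing_db_vars() {
    for name in $DB_VARS; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Hypothetical helper: print one "--preserve-environment NAME" pair per line,
# suitable for unquoted expansion on a cwltool command line.
preserve_env_args() {
    for name in $DB_VARS; do
        printf '%s %s\n' --preserve-environment "$name"
    done
}
```

A possible use is to abort when missing_db_vars prints anything, and otherwise to run cwltool --custom-net bridge $(preserve_env_args) followed by the workflow and input files.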

Implemented workflows available for testing

The following workflows are implemented and available for testing:

Writing and testing workflows

CWL validation

Use cwltool --validate file_name.cwl to check a workflow file or command line tool for valid CWL code (this is also done by the CI).
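To validate several files at once, the command can be wrapped in a loop. This is a sketch under assumptions: the function name validate_cwl_dir and the CWLTOOL override variable are hypothetical, and it does not claim to reproduce what the CI runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository).
# Runs "cwltool --validate" on every *.cwl file in the given directory
# (default: current directory). Prints "invalid: FILE" for each failure
# and returns non-zero if at least one file failed.
# CWLTOOL may be set to use a different executable (assumption of this sketch).
validate_cwl_dir() {
    dir=${1:-.}
    failed=0
    for file in "$dir"/*.cwl; do
        # If no file matches, the pattern stays literal; skip it.
        [ -e "$file" ] || continue
        if ! "${CWLTOOL:-cwltool}" --validate "$file"; then
            echo "invalid: $file"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, validate_cwl_dir . checks all CWL files in the current directory.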

Workflow Graphs

Prepare a workflow graph, e.g.:

cwltool --print-dot DeriveArrayElementCoordinates.cwl | dot -Tsvg > DeriveArrayElementCoordinates.cwl.svg

Alternatively, use https://view.commonwl.org/ (see this example).

Developer Notes

The following notes are unsorted but may be useful when implementing new workflows:

  • Set the output directory with --outdir (otherwise result files are written to the current directory).
  • Capture the stdout of tools with cwltool --log-dir ./my-log-dir... . This helps to find errors generated by the tools (otherwise the reported errors are mostly that output is not found).
  • Allow containers in tools to access the network: use cwltool --custom-net bridge... and add NetworkAccess: networkAccess: true to the tool requirements.
  • Set environment variables (propagated into containers) with cwltool --preserve-environment DB_API_PORT --preserve-environment DB_SERVER ...
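The options from these notes can be combined in a single call. The wrapper below is a hypothetical sketch: the function name run_workflow, the DRY_RUN switch, and the output/logs directory layout are assumptions made for illustration, and only the two variable names spelled out in the notes are forwarded.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of this repository).
# Usage: run_workflow WORKFLOW.cwl INPUTS.yml [RUN_DIR]
# With DRY_RUN=1 the cwltool command is printed instead of executed.
run_workflow() {
    workflow=$1
    inputs=$2
    run_dir=${3:-./run}

    # Build the cwltool argument list in the function's positional parameters.
    set -- --outdir "$run_dir/output" --log-dir "$run_dir/logs" --custom-net bridge
    # Only the variables named in the notes; extend the list as needed.
    for name in DB_API_PORT DB_SERVER; do
        set -- "$@" --preserve-environment "$name"
    done
    set -- "$@" "$workflow" "$inputs"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo cwltool "$@"
    else
        mkdir -p "$run_dir/output" "$run_dir/logs"
        cwltool "$@"
    fi
}
```

Running it first with DRY_RUN=1 shows the full command line, which makes it easy to check the options before starting containers.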

Acknowledgements

This project is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 460248186 (PUNCH4NFDI).
