- test_websocket.py
- requirements.txt --- contains modules to run `pip install` on
These logs will contain:
- the stderr/stdout of ICA as it stages the analysis run before it runs it
- the stderr/stdout collected at each step during an analysis run
- the stderr/stdout of CWL/Nextflow as it orchestrates the analysis run
- the stderr/stdout of ICA as it brings the result back to your ICA project
More details about the logs that ICA collects during an analysis run can be found here
You can use the docker image keng404/monitor_ica_analysis_run:0.0.2, which has all the appropriate scripts and libraries installed.
python3 test_websocket.py --api_key_file {FILE} [--project_name {STR}|--project_id {STR}] [OPTIONAL:--analysis_name {STR} | --analysis_id {STR}]
- `--api_key_file`: path to a text file that contains your API key
- `--project_name`: name of your ICA project, or `--project_id`: project id of your ICA project
- `--analysis_name`: user_reference or name of your analysis run, or `--analysis_id`: analysis id of the analysis you want to monitor
If both `--analysis_name` and `--analysis_id` are undefined, the script will try to grab/monitor logs from the most recent analysis run in your ICA project.
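The option handling and fallback behavior described above can be sketched with `argparse`. The option names mirror the usage line, but the selection logic below is a simplified assumption about what test_websocket.py does internally, not its actual implementation:

```python
import argparse


def build_parser():
    # Option names mirror the documented usage line.
    parser = argparse.ArgumentParser(description="Monitor ICA analysis logs")
    parser.add_argument("--api_key_file", required=True,
                        help="path to a text file containing your API key")
    project = parser.add_mutually_exclusive_group(required=True)
    project.add_argument("--project_name")
    project.add_argument("--project_id")
    analysis = parser.add_mutually_exclusive_group()  # optional pair
    analysis.add_argument("--analysis_name")
    analysis.add_argument("--analysis_id")
    return parser


def pick_analysis(args, recent_analyses):
    # If neither --analysis_name nor --analysis_id is given, fall back to the
    # most recent analysis. The 'userReference', 'timeCreated', and 'id'
    # fields are assumed names, not a documented ICA response schema.
    if args.analysis_id:
        return args.analysis_id
    if args.analysis_name:
        # Several runs may share a user_reference; take the newest match.
        matches = [a for a in recent_analyses
                   if a["userReference"] == args.analysis_name]
        return max(matches, key=lambda a: a["timeCreated"])["id"]
    return max(recent_analyses, key=lambda a: a["timeCreated"])["id"]
```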
- An additional Rscript is provided to help parse the JSON message returned from the ICA getAnalysisSteps endpoint and provide a table containing steps to monitor a running pipeline. This can be particularly useful for nextflow-based pipelines. An example command-line to run this script can be found below:
Rscript ica.analysis_table.R --process-steps $PWD/analysis_id_{ANALYSIS_ID}/step_metadata.txt
- The directory where `step_metadata.txt` is generated will be created by the python script above.
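The same JSON-to-table flattening that the Rscript performs can be sketched in Python. The field names below (`items`, `name`, `status`, `startDate`, `endDate`) are assumptions about the getAnalysisSteps payload shape, not the documented ICA schema:

```python
import json


def steps_to_table(step_metadata_json):
    """Flatten an assumed getAnalysisSteps-style payload into table rows.

    The keys used here are illustrative guesses at the JSON shape; adjust
    them to match the actual step_metadata.txt contents.
    """
    steps = json.loads(step_metadata_json).get("items", [])
    header = ("name", "status", "startDate", "endDate")
    rows = [(s.get("name", ""), s.get("status", ""),
             s.get("startDate", ""), s.get("endDate", "")) for s in steps]
    return [header] + rows


def format_table(rows):
    # Simple fixed-width rendering, one step per line.
    widths = [max(len(str(r[i])) for r in rows) for i in range(len(rows[0]))]
    return "\n".join("  ".join(str(c).ljust(w) for c, w in zip(r, widths))
                     for r in rows)
```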
- Distinguishes between analysis runs that have the same user_reference by picking the most recent analysis with that user_reference name
- ICA CLI limitation: when launching an ICA pipeline with a null (i.e. not specified) multi-value parameter, you won't be able to configure this in the CLI.
- This is possible when launching via the API (default settings).
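A minimal sketch of the API-side workaround: an unspecified multi-value parameter can simply be omitted from the request body before launch. The payload keys (`code`, `multiValue`, `parameters`) are illustrative assumptions; consult the ICA swagger page for the real request schema:

```python
def build_analysis_input(parameters):
    """Drop parameters whose multi-value list is null/empty before launch.

    Key names here are assumed for illustration; the point is that the API,
    unlike the CLI, tolerates leaving such a parameter out entirely.
    """
    cleaned = []
    for p in parameters:
        if "multiValue" in p and not p["multiValue"]:
            # Skip null/empty multi-value parameters instead of sending them.
            continue
        cleaned.append(p)
    return {"parameters": cleaned}
```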
See this file for recommendations
Rscript ica_pipelines.check_out_workflow_metrics.R --db-file {db_file}
`db_file` is an SQLite DB generated by the kubernetes pod that runs your CWL/NF based ICA pipelines.
The R script will generate graphs that can be used to identify how to optimize your pipeline runs (i.e. w.r.t CPU and memory).
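The kind of per-process summary the R script builds can be sketched with the standard `sqlite3` module. The schema below (`metrics(process, cpu_pct, mem_mb)`) is an assumption for illustration; the real metrics.db produced by the kubernetes pod may use different table and column names:

```python
import sqlite3


def peak_usage_per_process(con):
    """Summarize peak CPU and memory per process from a metrics DB.

    Takes an open sqlite3 connection. Assumes a table shaped like
    metrics(process TEXT, cpu_pct REAL, mem_mb REAL); inspect your
    metrics.db for the actual schema before relying on this.
    """
    rows = con.execute(
        "SELECT process, MAX(cpu_pct), MAX(mem_mb) "
        "FROM metrics GROUP BY process ORDER BY process"
    ).fetchall()
    # Peak values highlight over- or under-provisioned steps.
    return {name: {"peak_cpu_pct": cpu, "peak_mem_mb": mem}
            for name, cpu, mem in rows}
```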
This script is run automatically by test_websocket.py. You will see a warning message if the script cannot find a file `metrics.db` in the analysis run output.
If your analysis run request specifies analysis output files in a way that moves the metrics.db file to a user-defined location, this script will not work.
Your pipeline may have done this by using the ICA endpoints `/api/projects/{project_id}/analysis:nextflow` or `/api/projects/{project_id}/analysis:cwl`. See the swagger page here.
Your pipeline request would have included the following parameter:

```json
"analysisOutput": [
  {
    "sourcePath": "string",
    "type": "FILE",
    "targetProjectId": "string",
    "targetPath": "string",
    "actionOnExist": "string"
  }
]
```
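For example, such an entry could be assembled in a request body as follows. The helper name and the `"OVERWRITE"` default for `actionOnExist` are illustrative assumptions; check the swagger page for the allowed values:

```python
def make_analysis_output(source_path, target_path, target_project_id=None,
                         action_on_exist="OVERWRITE"):
    """Build one analysisOutput entry matching the shape shown above.

    The "OVERWRITE" default is an assumption for illustration only.
    """
    entry = {
        "sourcePath": source_path,
        "type": "FILE",
        "targetPath": target_path,
        "actionOnExist": action_on_exist,
    }
    if target_project_id is not None:
        # Only include the target project when redirecting output elsewhere.
        entry["targetProjectId"] = target_project_id
    return {"analysisOutput": [entry]}
```

Note that redirecting `metrics.db` this way is exactly the case where the monitoring script above will not find it.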
- create a Docker image bundling the python script and supplementary R scripts
- create documentation identifying the edits required to pull back the SQLite DB generated by the kubernetes pod that runs your CWL/NF based ICA pipelines