Monitor ICA analysis run

This repo contains scripts and demo code to monitor and troubleshoot analysis runs in ICA

Scripts and demo code to monitor analysis runs in ICA

test_websocket.py
requirements.txt --- lists the Python modules to install with pip (pip install -r requirements.txt)

If the analysis run is InProgress --- this script attempts to stream the logs

If the analysis run has completed (i.e. Succeeded or Failed) --- this script will download the logs
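The status-based behaviour described above can be sketched as a small dispatch function. This is an illustration only, not the code in test_websocket.py: the function name `choose_log_action` and the "unknown" fallback for any other status are assumptions made for the example.

```python
def choose_log_action(status: str) -> str:
    """Map an analysis status to the log-handling action described above.

    Hypothetical helper: the real script may structure this differently.
    """
    normalized = status.strip().lower()
    if normalized == "inprogress":
        # Run is still executing, so logs are streamed.
        return "stream"
    if normalized in {"succeeded", "failed"}:
        # Run has completed, so logs are downloaded.
        return "download"
    # Any other status is not covered by this README (assumption).
    return "unknown"


if __name__ == "__main__":
    for status in ("InProgress", "Succeeded", "Failed"):
        print(status, "->", choose_log_action(status))
```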

These logs will contain:

the stderr/stdout of ICA as it stages the analysis run before it runs it
the stderr/stdout collected at each step during an analysis run
the stderr/stdout of CWL/Nextflow as it orchestrates the analysis run
the stderr/stdout of ICA as it brings the result back to your ICA project

More details about the logs that ICA collects during an analysis run can be found here

You can use the docker image `keng404/monitor_ica_analysis_run:0.0.2` with all the appropriate scripts and libraries installed

See here for the Docker image

Template command line

python3 test_websocket.py --api_key_file {FILE} [--project_name {STR}|--project_id {STR}] [OPTIONAL:--analysis_name {STR} | --analysis_id {STR}]

--api_key_file : path to a text file that contains your API key
--project_name : name of your ICA project
--project_id : project id of your ICA project (use instead of --project_name)
--analysis_name : user_reference or name of your analysis run
--analysis_id : analysis id of the analysis you want to monitor (use instead of --analysis_name)

If both --analysis_name and --analysis_id are undefined, the script will try to grab/monitor logs from the most recent analysis run in your ICA project.
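A minimal sketch of how the options above could be parsed with argparse, including the fallback to the most recent analysis run. The flag names come from the template command line; the helper names `build_parser` and `analysis_selector`, and the choice to enforce the alternatives with mutually exclusive groups, are assumptions and may not match what test_websocket.py actually does.

```python
import argparse
from typing import Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the template command line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="test_websocket.py")
    parser.add_argument("--api_key_file", required=True,
                        help="path to a text file that contains your API key")

    # Exactly one of --project_name / --project_id must be given.
    project = parser.add_mutually_exclusive_group(required=True)
    project.add_argument("--project_name", help="name of your ICA project")
    project.add_argument("--project_id", help="project id of your ICA project")

    # At most one of --analysis_name / --analysis_id; both are optional.
    analysis = parser.add_mutually_exclusive_group(required=False)
    analysis.add_argument("--analysis_name", help="user_reference of the analysis run")
    analysis.add_argument("--analysis_id", help="analysis id of the analysis run")
    return parser


def analysis_selector(args: argparse.Namespace) -> Tuple[str, Optional[str]]:
    """Decide how the analysis run is looked up (hypothetical helper)."""
    if args.analysis_id:
        return ("id", args.analysis_id)
    if args.analysis_name:
        return ("name", args.analysis_name)
    # Neither option given: fall back to the most recent run in the project.
    return ("most_recent", None)


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    print(analysis_selector(parsed))
```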

Rscript extension

An additional Rscript is provided to parse the JSON message returned from the ICA getAnalysisSteps endpoint and build a table of steps for monitoring a running pipeline. This can be particularly useful for Nextflow-based pipelines. An example command line to run this script can be found below:

 Rscript ica.analysis_table.R --process-steps $PWD/analysis_id_{ANALYSIS_ID}/step_metadata.txt

The directory containing step_metadata.txt is created by the Python script above.

Limitations

Analysis runs that share the same user_reference are disambiguated by recency
- the script picks the most recent analysis with that user_reference
ICA CLI limitation: you cannot launch an ICA pipeline with a null (i.e. not specified) multi-value parameter, because this cannot be configured in the CLI.
- This is possible when launching via the API (default settings).
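The first limitation (choosing among runs that share a user_reference) can be illustrated with the sketch below. It assumes each analysis is a dict with `userReference` and an ISO 8601 `timeCreated` field; those field names, and the helper `most_recent_by_reference`, are assumptions for the example and should be checked against the actual API response.

```python
from datetime import datetime
from typing import Dict, List, Optional


def _parse_time(value: str) -> datetime:
    # Accept a trailing "Z" (UTC), which older datetime.fromisoformat rejects.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def most_recent_by_reference(analyses: List[Dict[str, str]],
                             user_reference: str) -> Optional[Dict[str, str]]:
    """Return the newest analysis whose userReference matches, or None."""
    matches = [a for a in analyses if a.get("userReference") == user_reference]
    if not matches:
        return None
    return max(matches, key=lambda a: _parse_time(a["timeCreated"]))


if __name__ == "__main__":
    # Made-up records purely for illustration.
    runs = [
        {"id": "a1", "userReference": "my_run", "timeCreated": "2023-01-01T10:00:00Z"},
        {"id": "a2", "userReference": "my_run", "timeCreated": "2023-03-05T08:30:00Z"},
        {"id": "a3", "userReference": "other", "timeCreated": "2023-04-01T00:00:00Z"},
    ]
    print(most_recent_by_reference(runs, "my_run"))
```

Older runs with the same user_reference are only reachable by passing --analysis_id explicitly.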

Supplementary addition to get CPU, memory, disk usage on ICA for each analysis/pipeline run

Adding logic to pull back kubernetes logs and metrics files to your ICA analysis run

See adding_cpu_and_memory_montoring_to_ica_pipeline.md for recommendations

Getting CPU and memory usage in an ICA pipeline run --- follow recommendations above

Rscript ica_pipelines.check_out_workflow_metrics.R --db-file {db_file}

db_file is an SQLite DB generated by the Kubernetes pod that runs your CWL/NF-based ICA pipelines. The R script generates graphs that can help you optimize your pipeline runs (i.e. with respect to CPU and memory). This script is also run automatically when you run test_websocket.py. You will see a warning message if the script cannot find a metrics.db file in the analysis run output.
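Before plotting, it can help to confirm that the file exists and see what it contains. The sketch below uses only Python's standard sqlite3 module to list the tables in a db_file. The schema of metrics.db is not documented here, so the code makes no assumption about table names; `list_tables` is a hypothetical helper, not part of this repo.

```python
import sqlite3
import sys
from contextlib import closing
from pathlib import Path
from typing import List


def list_tables(db_file: str) -> List[str]:
    """Return the names of all tables in an SQLite file, sorted by name."""
    if not Path(db_file).is_file():
        # Mirrors the README's note: the file may be missing from the run output.
        raise FileNotFoundError(f"cannot find SQLite DB: {db_file}")
    with closing(sqlite3.connect(db_file)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python3 list_tables.py {db_file}")
    print(list_tables(sys.argv[1]))
```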

Limitations of finding the db_file

If your analysis run request specifies the analysis output files and moves the metrics.db file to a user-defined location, this script will not be able to find it.

Your pipeline may have done this by using the ICA endpoints /api/projects/{project_id}/analysis:nextflow or /api/projects/{project_id}/analysis:cwl. See the swagger page here.

Your pipeline request would have included the following parameter shown below:

  "analysisOutput": [
    {
      "sourcePath": "string",
      "type": "FILE",
      "targetProjectId": "string",
      "targetPath": "string",
      "actionOnExist": "string"
    }
  ]
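As a rough check, you could inspect a launch request body for `analysisOutput` entries that mention metrics.db. The sketch below assumes the request body is available as a Python dict shaped like the fragment above; `relocates_metrics_db` is a hypothetical helper. It only detects entries whose `sourcePath` names metrics.db directly, so an entry that moves a whole folder containing the file would not be caught.

```python
from typing import Any, Dict


def relocates_metrics_db(request_body: Dict[str, Any]) -> bool:
    """Return True if any analysisOutput entry has a sourcePath ending in metrics.db."""
    for entry in request_body.get("analysisOutput") or []:
        source = str(entry.get("sourcePath", ""))
        if source.rstrip("/").endswith("metrics.db"):
            return True
    return False


if __name__ == "__main__":
    # Made-up request body purely for illustration.
    body = {
        "analysisOutput": [
            {
                "sourcePath": "out/metrics.db",
                "type": "FILE",
                "targetProjectId": "example-project",
                "targetPath": "/custom/location/",
                "actionOnExist": "string",
            }
        ]
    }
    print(relocates_metrics_db(body))
```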

Todos

create Docker image bundling the python script and supplementary R scripts
create documentation identifying the edits required to pull back the SQLite DB generated by the kubernetes pod that runs your CWL/NF based ICA pipelines


