
How to run genotoscope_classify.py

Damianos P. Melidis edited this page Nov 25, 2021 · 3 revisions

genotoscope_classify.py

This command accepts as input a single GSvar file (single WES) or a set of GSvar files (WES of a set of patients). It examines the ACMG/AMP evidence-based criteria for each variant included in a GSvar file.

Based on the triggered criteria, it assigns the ACMG/AMP pathogenicity class and computes the pathogenicity probability for each variant included in a WES file.

It outputs extended GSvar files with information on the criteria examination, ACMG classification and probability. If selected, it also summarizes each classified WES file into the pathogenic variants found (ACMG class 4-5) and the variants of uncertain significance (VUS, ACMG class 3) whose pathogenicity probability is higher than the user threshold.
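The summary rule described above can be sketched in a few lines of Python. This is only an illustration, not GenOtoScope's own code: the dictionary keys `acmg_class` and `probability` are hypothetical names, and treating the threshold as inclusive (`>=`) is an assumption based on the "minimum" wording of the --min_pathogenicity option.

```python
def select_for_summary(variants, min_pathogenicity):
    """Return the variants that would go into the 'summary' file.

    variants: list of dicts with the hypothetical keys
              'acmg_class' (int, 1-5) and 'probability' (float).
    min_pathogenicity: threshold in [0.0, 1.0], or -1 to deactivate extraction.
    """
    if min_pathogenicity == -1:
        # -1 switches the extraction step off, so nothing is summarized.
        return []
    if not 0.0 <= min_pathogenicity <= 1.0:
        raise ValueError("min_pathogenicity must be in [0.0, 1.0] or -1")

    selected = []
    for variant in variants:
        if variant["acmg_class"] in (4, 5):
            # Class 4-5 variants are always kept.
            selected.append(variant)
        elif (variant["acmg_class"] == 3
              and variant["probability"] >= min_pathogenicity):
            # VUS are kept only at or above the threshold (assumed inclusive).
            selected.append(variant)
    return selected
```

With a threshold of 0.49, a class 3 variant with probability 0.30 would be left out, while class 4 and class 5 variants are kept regardless of their probability.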

Analysis root path

genotoscope_classify.py requires the parameter analysis_root. Specify the analysis root path so that it contains at least 2 subfolders:

  • analysis_root/genotoscope_data (please also here)
  • analysis_root/input_gsvars_folder
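Before starting a run, the folder layout can be checked with a short Python sketch such as the one below. The helper `missing_subfolders` is a hypothetical name and not part of GenOtoScope; the subfolder names are copied from the list above, and `input_gsvars_folder` may need to be replaced with the name of your own input folder.

```python
from pathlib import Path

# Subfolder names taken from the list above; adjust them to your own setup.
REQUIRED_SUBFOLDERS = ("genotoscope_data", "input_gsvars_folder")


def missing_subfolders(analysis_root, required=REQUIRED_SUBFOLDERS):
    """Return the required subfolder names that do not exist under analysis_root."""
    root = Path(analysis_root)
    return [name for name in required if not (root / name).is_dir()]
```

An empty list means that both subfolders are present; otherwise the returned names show what still has to be created.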

Settings file

Move the settings file genotoscope_local.yaml outside the GenOtoScope repository root folder. Then update the paths in the file to point to the annotation and open data that genotoscope_classify.py needs to run.

Add in bashrc

(If not already done for genotoscope_annotate.py)

  • Add to ~/.bashrc: export PYTHONPATH="path2repository/GenOtoScope:$PYTHONPATH"
  • Reload ~/.bashrc: source ~/.bashrc
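To confirm that the export took effect in a new shell, a small Python check along these lines may help. It is a sketch only: `repo_on_pythonpath` is a hypothetical helper, and the repository location passed to it is whatever path you used in place of path2repository/GenOtoScope.

```python
import os


def repo_on_pythonpath(repo_dir, pythonpath=None):
    """Return True if repo_dir is one of the entries of PYTHONPATH."""
    if pythonpath is None:
        # Fall back to the value of the current environment.
        pythonpath = os.environ.get("PYTHONPATH", "")
    target = os.path.normpath(repo_dir)
    # Split on the platform separator (':' on Linux) and ignore empty entries.
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    return target in entries
```

Calling `repo_on_pythonpath("path2repository/GenOtoScope")` with your real repository path should return True once ~/.bashrc has been reloaded.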

Options

  • -h, --help show this help message and exit
  • --analysis_root ANALYSIS_ROOT |String| Absolute path to directory under which GenOtoScope data is placed
  • --input_variants INPUT_VARIANTS |String| Absolute path to a folder with the patients' variant files or to a single variant file
  • --output_path OUTPUT_PATH |String| Absolute path to output directory
  • --settings_file SETTINGS_FILE Settings file (.yaml) path
  • --filter_HL_genes {0,1} |int flag| Apply classification to variants affecting ONLY known HL genes (1) or to variants in all genes present (0)
  • --min_pathogenicity MIN_PATHOGENICITY |float| Minimum pathogenicity probability, in the range [0.0,1.0], for variants of ACMG class 3 to be extracted, along with all variants of class 4/5, into a separate 'summary' file. Choose -1 to deactivate the extraction step.
  • --logging {INFO,DEBUG} Logging level
  • --threads THREADS Number of threads (Current version of GenOtoScope uses 1 thread)

Example runs

  1. Classify all GSvar files in given folder: /home/username/example_gsvars,

    analysis root: /home/username/analysis and

    output folder: /home/username/example_gsvars_classify.

    python3.7 genotoscope_classify.py --analysis_root /home/username/analysis --input_variants /home/username/example_gsvars --output_path /home/username/example_gsvars_classify --settings_file /home/username/analysis/genotoscope_server.yaml --filter_HL_genes 0 --min_pathogenicity 0.49 --logging INFO --threads 1

  2. Classify a single GSvar file: /home/username/example_gsvars/example_gsvar.GSvar,

    analysis root: /home/username/analysis and

    output folder: /home/username/example_gsvars_classify.

    python3.7 genotoscope_classify.py --analysis_root /home/username/analysis --input_variants /home/username/example_gsvars/example_gsvar.GSvar --output_path /home/username/example_gsvars_classify --settings_file /home/username/analysis/genotoscope_server.yaml --filter_HL_genes 0 --min_pathogenicity 0.49 --logging INFO --threads 1
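When the same run has to be repeated for several inputs, the command line can be assembled from Python. The sketch below only builds the argument list from the options documented above; `build_classify_command` is a hypothetical helper, its defaults are assumptions chosen for illustration, and it presumes that genotoscope_classify.py is called from the directory that contains it.

```python
def build_classify_command(analysis_root, input_variants, output_path,
                           settings_file, filter_hl_genes=0,
                           min_pathogenicity=-1, logging_level="INFO",
                           threads=1, python="python3.7"):
    """Build the argument list for one genotoscope_classify.py run."""
    # Reject values outside the choices listed under Options.
    if filter_hl_genes not in (0, 1):
        raise ValueError("filter_hl_genes must be 0 or 1")
    if logging_level not in ("INFO", "DEBUG"):
        raise ValueError("logging_level must be INFO or DEBUG")
    return [
        python, "genotoscope_classify.py",
        "--analysis_root", str(analysis_root),
        "--input_variants", str(input_variants),
        "--output_path", str(output_path),
        "--settings_file", str(settings_file),
        "--filter_HL_genes", str(filter_hl_genes),
        "--min_pathogenicity", str(min_pathogenicity),
        "--logging", logging_level,
        "--threads", str(threads),
    ]

# Possible usage (not executed here):
#   import subprocess
#   cmd = build_classify_command("/home/username/analysis",
#                                "/home/username/example_gsvars",
#                                "/home/username/example_gsvars_classify",
#                                "/home/username/analysis/genotoscope_server.yaml",
#                                min_pathogenicity=0.49)
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.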