Commit

line length fix
tkmamidi committed Jan 25, 2024
1 parent 23bee66 commit 0383375
Showing 4 changed files with 17 additions and 7 deletions.
File renamed without changes.
6 changes: 4 additions & 2 deletions README.md
@@ -11,7 +11,8 @@ Markdown](https://github.com/uab-cgds-worthey/DITTO/actions/workflows/linting.ym
> Gitlab](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/ditto). It was migrated to
> Github in April 2023, and the Gitlab version has been archived.
-We aim to develop a pipeline for accurate and rapid interpretation of genetic variants for pathogenicity using patient’s genotype (VCF) information.
+We aim to develop a pipeline for accurate and rapid interpretation of genetic variants for pathogenicity using patient’s
+genotype (VCF) information.

## Usage

@@ -26,7 +27,8 @@ in this [GitHub repo](https://github.com/uab-cgds-worthey/DITTO-API).

### Setting up to use locally

-> ***NOTE:*** This setup will allow one to annotate a VCF sample and make DITTO predictions. Currently tested only in Cheaha (UAB HPC). Docker versions may need to be explored later to make it
+> ***NOTE:*** This setup will allow one to annotate a VCF sample and make DITTO predictions. Currently tested only in
+> Cheaha (UAB HPC). Docker versions may need to be explored later to make it
> useable in Mac and Windows.
#### System Requirements
11 changes: 7 additions & 4 deletions docs/build_DITTO.md
@@ -77,11 +77,13 @@ oc run clinvar.vcf.gz -l hg38 -t csv --package mypackage -d path/to/output/direc
## Preprocessing

-By default, OpenCravat annotates all transcript level annotations for each variant in a single row. DITTO makes transcript level
-predictions for each variant. To parse out each transcript level annotations to different rows, use the below command
+By default, OpenCravat annotates all transcript level annotations for each variant in a single row. DITTO makes
+transcript level predictions for each variant. To parse out each transcript level annotations to different rows, use
+the below command

```sh
-python src/annotation_parsing/parse_single_sample.py -i clinvar.vcf.gz.variant.csv -e parse -o clinvar.vcf.gz.variant.csv_parsed.csv.gz -c configs/opencravat_train_config.json
+python src/annotation_parsing/parse_single_sample.py -i clinvar.vcf.gz.variant.csv -e parse \
+-o clinvar.vcf.gz.variant.csv_parsed.csv.gz -c configs/opencravat_train_config.json
```
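
As an aside to the diff, the idea of splitting transcript-level annotations into separate rows can be sketched in a few lines of Python. This is only an illustration, not the logic of `parse_single_sample.py`: the `transcripts` column name, the `;` separator, and the toy rows are all assumptions made for the example.

```python
import csv
import io


def explode_transcripts(rows, column="transcripts", sep=";"):
    """Yield one output row per transcript listed in `column`.

    `column` and `sep` are hypothetical; the real OpenCravat output
    may pack transcript annotations differently.
    """
    for row in rows:
        for transcript in row[column].split(sep):
            new_row = dict(row)  # copy the variant-level fields
            new_row[column] = transcript.strip()
            yield new_row


# Toy input: one row per variant, transcripts packed into a single field.
toy_csv = io.StringIO(
    "chrom,pos,ref,alt,transcripts\n"
    "chr1,100,A,G,ENST0001;ENST0002\n"
    "chr2,200,C,T,ENST0003\n"
)

parsed = list(explode_transcripts(csv.DictReader(toy_csv)))
for row in parsed:
    print(row["chrom"], row["pos"], row["transcripts"])
```

Each variant-level field is copied onto every transcript row, so a downstream model can score each transcript independently while still knowing which variant it belongs to.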

Filter and process the annotations as shown in this [python
@@ -95,7 +97,8 @@ uses the testing data to calculate accuracy, roc, and prc metrics along with a S
to train the model.

```sh
-python training/NN.py --train_x /data/train_class_data_80.csv.gz --test_x /data/test_class_data_20.csv.gz -c configs/col_config.yaml -o /data/
+python training/NN.py --train_x /data/train_class_data_80.csv.gz \
+--test_x /data/test_class_data_20.csv.gz -c configs/col_config.yaml -o /data/
```

This script took 10 CPU cores, 100 GB memory and ~17 hrs to tune and train DITTO.
7 changes: 6 additions & 1 deletion docs/install_openCravat.md
@@ -32,7 +32,12 @@ Test it by using `oc config md` command. It should output the new modules direct
```sh
oc module install-base

-oc module install aloft cadd cadd_exome cancer_genome_interpreter ccre_screen chasmplus civic clingen clinpred clinvar cosmic cosmic_gene cscape dann dann_coding dbscsnv dbsnp dgi ensembl_regulatory_build ess_gene exac_gene fathmm fathmm_xf_coding funseq2 genehancer gerp ghis gnomad gnomad3 gnomad_gene gtex gwas_catalog linsight loftool lrt mavedb metalr metasvm mutation_assessor mutationtaster mutpred1 mutpred_indel ncbigene ndex ndex_chd ndex_signor omim pangalodb phastcons phdsnpg phi phylop polyphen2 prec provean repeat revel rvis segway sift siphy spliceai uniprot vest cgc cgd varity_r
+oc module install aloft cadd cadd_exome cancer_genome_interpreter ccre_screen chasmplus civic clingen clinpred clinvar \
+cosmic cosmic_gene cscape dann dann_coding dbscsnv dbsnp dgi ensembl_regulatory_build ess_gene exac_gene fathmm \
+fathmm_xf_coding funseq2 genehancer gerp ghis gnomad gnomad3 gnomad_gene gtex gwas_catalog linsight loftool lrt mavedb \
+metalr metasvm mutation_assessor mutationtaster mutpred1 mutpred_indel ncbigene ndex ndex_chd ndex_signor omim \
+pangalodb phastcons phdsnpg phi phylop polyphen2 prec provean repeat revel rvis segway sift siphy spliceai uniprot \
+vest cgc cgd varity_r
```

Please look at the [install logs](../docs/install_openCravat.logfile) for the versions of all the above annotators used to
