Based on the HPO profiles and input files provided, this pipeline runs GADO and/or Exomiser.
Before you can use the pipeline, you need to install Exomiser, GADO and some supporting files.
You can download the latest version of Exomiser from the Monarch Initiative FTP.
- Update the exomiser_cli param in nextflow.config to point to the .jar file of the Exomiser CLI.
- Update the config/application.properties file to point to your Exomiser data folder.
- Note that the current configuration also uses CADD scores, so you need to have the CADD score files installed and configure the corresponding file location in config/application.properties (or remove CADD from the template files in config).
- Download the GADO CLI 1.0.1 from the official release.
- Download the dataset files. You can find the links in the GADO GitHub wiki.
- Uncompress the prediction matrix .zip file and rename the files so that you have a folder (let's say GADO_resources) containing the following files:
  - hpo_predictions_info.txt
  - hpo_predictions_genes.txt
  - hpo_predictions_matrix_spiked.cols.txt
  - hpo_predictions_matrix_spiked.rows.txt
  - hpo_predictions_matrix_spiked.dat
- Download the HPO ontology .obo file from the GADO wiki or directly from the HPO ontology.
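Before going further, it can help to confirm that the resources folder really contains the five files listed above. The following is a minimal sketch, not part of the pipeline: the function name check_gado_resources is hypothetical, and it only checks that the files exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which expected GADO resource files are missing.
# Returns 0 when all five files are present, 1 otherwise.
check_gado_resources() {
    local folder="$1" missing=0 f
    for f in hpo_predictions_info.txt \
             hpo_predictions_genes.txt \
             hpo_predictions_matrix_spiked.cols.txt \
             hpo_predictions_matrix_spiked.rows.txt \
             hpo_predictions_matrix_spiked.dat; do
        if [ ! -f "$folder/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_gado_resources GADO_resources && echo "GADO resources OK"` prints the confirmation only when nothing is missing.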
Then you need to update the following params in nextflow.config:
- GADO_cli: path to your GADO CLI .jar file
- GADO_datafolder: path to the folder containing the GADO files (GADO_resources in this example)
- HPO_obofile: path to your .obo file
When everything is properly configured in nextflow.config, you can run the pipeline using:
nextflow main.nf --GADO --exomiser \
--HPO HPO_profiles.tsv \
--exomiser_input exomiser_input.tsv \
--exomiser_template config/template_GRCh38.yml \
--out results
NB: The config file currently includes profiles for sge and slurm, but you need to configure the queue names for your system.
Only the HPO profiles file is required for GADO, while Exomiser also requires the Exomiser input file.
The HPO profiles file is a tab-separated file without a header containing one case per line, with the case ID in column 1 followed by one HPO term per column:
case1 HP:00001 HP:000002
case2 HP:00003 HP:000004 HP:000006
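The HPO profiles layout described above can be sanity-checked before launching a run. This is a sketch only: the function name validate_hpo_profiles is hypothetical, and it assumes every HPO term is written as HP: followed by digits, without checking that the term exists in the ontology.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every line that is not
# "case ID <TAB> HPO term [<TAB> HPO term ...]" and return non-zero if any is found.
# Assumption: an HPO term is "HP:" followed by one or more digits.
validate_hpo_profiles() {
    awk -F'\t' '
        NF < 2 {
            printf "line %d: expected a case ID and at least one HPO term\n", NR
            bad = 1
            next
        }
        {
            for (i = 2; i <= NF; i++)
                if ($i !~ /^HP:[0-9]+$/) {
                    printf "line %d: column %d is not an HPO term: %s\n", NR, i, $i
                    bad = 1
                }
        }
        END { exit bad + 0 }
    ' "$1"
}
```

A line that uses spaces instead of tabs is reported as having no HPO terms, which is a common cause of silently empty profiles.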
The Exomiser input file is a tab-separated file without a header containing one case per line, with the case ID, proband ID, VCF file and PED file. NB: the case ID must match the case ID from the HPO profiles, and the proband ID must match the ID of the proband sample in the VCF file.
case1 proband1 case1_family.vcf.gz case1.ped
case2 proband2 case2_family.vcf.gz case2.ped
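Because the case IDs have to agree between the two files, a quick cross-check can catch mismatches early. The sketch below is illustrative and not part of the pipeline: check_exomiser_input is a hypothetical name, and it verifies only the column count and the case IDs, not whether the proband ID is present in the VCF file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that every row of the Exomiser input has 4
# tab-separated columns and that its case ID appears in column 1 of the
# HPO profiles file. Returns non-zero if any problem is reported.
check_exomiser_input() {
    local hpo_profiles="$1" exomiser_input="$2"
    awk -F'\t' '
        # First file (HPO profiles): remember every case ID.
        FILENAME == ARGV[1] { known[$1] = 1; next }
        # Second file (Exomiser input): check layout, then the case ID.
        NF != 4 {
            printf "line %d: expected 4 columns, found %d\n", FNR, NF
            bad = 1
            next
        }
        !($1 in known) {
            printf "line %d: case ID %s has no HPO profile\n", FNR, $1
            bad = 1
        }
        END { exit bad + 0 }
    ' "$hpo_profiles" "$exomiser_input"
}
```

It could be called as `check_exomiser_input HPO_profiles.tsv exomiser_input.tsv` with the same files that are passed to --HPO and --exomiser_input.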
The Exomiser annotation and filter settings are stored in the .yml templates in the config folder. The provided files filter for protein-changing variants with population AF < 1% and use CADD, PP2 and SIFT scores for variant scoring. All possible segregation models are evaluated and hiPhive is used for HPO-based prioritization. You can edit these templates to change the analysis settings for Exomiser. Please refer to the Exomiser documentation for the available options.