Commit

Update ClinVar paths
krassowski committed Oct 4, 2020
1 parent 8cfd7e5 commit 80468ca
Showing 3 changed files with 7 additions and 10 deletions.
7 changes: 2 additions & 5 deletions website/data/download.sh
@@ -95,11 +95,8 @@ cd mutations

 wget https://www.dropbox.com/s/lhou9rnwl6lwuwj/mc3.v0.2.8.PUBLIC.maf.gz
 wget https://www.dropbox.com/s/zodasbvinx339tw/ESP6500_muts_annotated.txt.gz
-# v2017:
-# wget https://www.dropbox.com/s/du2qe1skxwmuep2/clinvar_muts_annotated.txt.gz
-# v2019:
-wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar_20190520.vcf.gz
-wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/xml/ClinVarFullRelease_2019-05.xml.gz
+wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar_20201003.vcf.gz
+wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/xml/ClinVarFullRelease_2020-10.xml.gz

 wget https://www.dropbox.com/s/pm74k3qwxrqmu2q/all_mimp_annotations_p085.rsav

2 changes: 1 addition & 1 deletion website/data/mutations/annotate_clinvar.sh
@@ -1,6 +1,6 @@
 #!/bin/bash
 ./ensure_annovar.sh
-file=clinvar_20190520.vcf.gz
+file=clinvar_20201003.vcf.gz
 gunzip ${file} -c > clinvar.avinput

 ./annovar/table_annovar.pl clinvar.avinput humandb/ -buildver hg19 -out clinvar_annotated -remove -protocol refGene -operation g -nastring . -thread 2 -otherinfo -vcfinput
8 changes: 4 additions & 4 deletions website/imports/mutations/clinvar.py
@@ -23,7 +23,7 @@ class ClinVarImporter(MutationImporter):
     name = 'clinvar'
     model = InheritedMutation
     default_path = 'data/mutations/clinvar_muts_annotated.txt.gz'
-    default_xml_path = 'data/mutations/ClinVarFullRelease_2019-05.xml.gz'
+    default_xml_path = 'data/mutations/ClinVarFullRelease_2020-10.xml.gz'
     header = [
         'Chr', 'Start', 'End', 'Ref', 'Alt', 'Func.refGene', 'Gene.refGene',
         'GeneDetail.refGene', 'ExonicFunc.refGene', 'AAChange.refGene', 'Otherinfo',
@@ -346,7 +346,7 @@ def import_disease_associations(self):

     def remove_muts_without_origin(self):

-        origin_blacklist = {
+        origin_exclusion_list = {
             'not applicable',
             'not provided',
             'not-reported',
@@ -355,12 +355,12 @@ def remove_muts_without_origin(self):
             'unknown'
         }

-        print('ClinVar mutations origin blacklist: ', origin_blacklist)
+        print('ClinVar mutations origin exclusion list: ', origin_exclusion_list)

         print('Removing ClinVar associations with blacklisted or missing origin; NOTE:')
         print('\torigin is not set also when the mutation was skipped due to other reasons, such as non-human species')
         removed_cnt = ClinicalData.query.filter(
-            or_(ClinicalData.origin == None, ClinicalData.origin.in_(origin_blacklist))
+            or_(ClinicalData.origin == None, ClinicalData.origin.in_(origin_exclusion_list))
         ).delete(synchronize_session='fetch')
         db.session.commit()
         print(f'Removed {removed_cnt} associations')
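
Note: this commit changes the same ClinVar release date by hand in three files (download.sh, annotate_clinvar.sh and clinvar.py). As an illustration only, the file names could be derived from a single release date, roughly as sketched below. The helper names `clinvar_file_names` and `clinvar_urls` are hypothetical and not part of the repository. The naming patterns are inferred from the two releases visible in this diff and are assumed, not verified, to hold for other releases.

```python
from datetime import date

# Base URL copied from the wget lines in download.sh above.
CLINVAR_FTP = 'ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar'


def clinvar_file_names(release: date) -> dict:
    """Hypothetical helper: build ClinVar file names for one release date.

    Assumes the patterns seen in this commit: the VCF name carries the
    full date (YYYYMMDD) and the XML full release carries year and month.
    """
    return {
        'vcf': f'clinvar_{release:%Y%m%d}.vcf.gz',
        'xml': f'ClinVarFullRelease_{release:%Y-%m}.xml.gz',
    }


def clinvar_urls(release: date) -> dict:
    """Hypothetical helper: build the download URLs for the same release."""
    names = clinvar_file_names(release)
    return {
        'vcf': f'{CLINVAR_FTP}/vcf_GRCh37/{names["vcf"]}',
        'xml': f'{CLINVAR_FTP}/xml/{names["xml"]}',
    }


if __name__ == '__main__':
    release = date(2020, 10, 3)  # the release this commit switches to
    for kind, url in clinvar_urls(release).items():
        print(kind, url)
```

With something like this, the importer's default XML path and the download URLs would stay in sync when the release date changes; the shell scripts would still need to receive the date, for example as an argument.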