Join GitHub today
GitHub is home to over 36 million developers working together to host and review code, manage projects, and build software together.
Sign up
Collection of manually-curated variants that cause exon skipping
Cannot retrieve the latest commit at this time.
| Type | Name | Latest commit message | Commit time |
|---|---|---|---|
| Failed to load latest commit information. | |||
|
|
data | ||
|
|
logs | ||
|
|
src | ||
|
|
README | ||
README
Here is the pipeline used to generate this set of variants causing exon skipping: # BEFORE CURATION 1. run query_hgmd_splice_MySQL.py to produce hgmd_splice_variants.tsv 2. pull out hgmd_splice_pmids (trivial, can use cut or some other command line utility) 3. run scrape_pubmed_abstracts.py which takes hgmd_splice_pmids.txt as input, produces exon_skipping_pmids.txt 4. run map_pmids_to_variants.py which takes exon_skipping_pmids.txt and hgmd_splice_variants.tsv as input, and produces exon_skipping.prelim.vcf 5. annotate preliminary VCF file with VEP (this is just to collect useful info for the manual curation step) 6. run tableize_vep_vcf.py on annotated to get data into spreadsheet format (exon_skipping.prelim.tsv) 7. finally, run create_spreadsheet.R to get exon_skipping_spreadsheet.tsv. This will serve as the medium for manual curation. Note: merge_spreadsheets.R merges manual annotations from a partially curated spreadsheet with a newly generated spreadsheet # AFTER CURATION python convert_tsv_to_vcf.py -i curated_spreadsheet.tsv --info status,donorDist,acceptorDist,type | grep -v "status=discard" | grep -v "type=as" > donor_skip.vcf python convert_tsv_to_vcf.py -i curated_spreadsheet.tsv --info status,donorDist,acceptorDist,type | grep -v "status=discard" | grep -v "type=ds" > acceptor_skip.vcf # Other notes convert_tsv_to_vcf.py is in maclab_scripts/vcf/ tableize_vep_vcf.py is in maclab_scripts/vep/