Investigating the frequency of DNA words near loci associated with human complex traits
- Kartikay Chadha
- Dr. Jo Knight
- Dr. Andrew D. Paterson
Program Advisory Committee members:
- Dr. Michael D. Wilson
- Dr. Mario Masellis
- Centre for Addiction and Mental Health, Toronto.
- Institute of Medical Science, University of Toronto.
- The Hospital for Sick Children, Toronto.
- Lancaster University, United Kingdom.
- Sunnybrook Health Sciences Centre, Toronto.
Genome-wide association studies and expression quantitative trait loci (eQTL) studies have identified thousands of variants associated with complex diseases and gene expression levels. The frequency of DNA words associated with these variants has not been extensively evaluated. These words may help in understanding the biological role of trait-associated variants and may also enable their identification in future studies.
An exact word-counting method was developed to investigate the hypothesis that short DNA words have different frequencies near single nucleotide polymorphisms (SNPs) associated with (1) Alzheimer’s disease (AD) and (2) thyroid eQTLs, compared to the rest of the genome. No DNA words differed significantly in frequency near AD-associated SNPs.
Some GC-rich words have significantly higher frequency around thyroid eQTLs than around controls. These DNA words were no longer significant when the controls were matched for nucleotide frequency, but this is likely due to over-matching.
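To illustrate the general idea of exact word counting near SNPs, here is a minimal Python sketch. It is not the code from the thesis: the function names (`count_words`, `window_word_counts`, `word_frequencies`), the 0-based coordinates, the window size (`flank`) and the word length (`k`) are all assumptions chosen for illustration.

```python
from collections import Counter


def count_words(sequence, k):
    """Count every overlapping word of length k in sequence (exact matching).

    Words containing characters other than A, C, G or T (e.g. N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def window_word_counts(genome, positions, flank, k):
    """Sum word counts over windows of +/- flank bases around each position.

    Positions are assumed to be 0-based. Windows are clipped at the sequence
    ends. Overlapping windows are counted independently, so shared bases
    contribute more than once (a simplification in this sketch).
    """
    total = Counter()
    for pos in positions:
        start = max(0, pos - flank)
        end = min(len(genome), pos + flank + 1)
        total += count_words(genome[start:end], k)
    return total


def word_frequencies(counts):
    """Convert raw word counts to relative frequencies."""
    n = sum(counts.values())
    return {word: c / n for word, c in counts.items()} if n else {}
```

In such a setup, frequencies from windows around trait-associated SNPs would be compared with frequencies from windows around control positions. The statistical test and the control-matching scheme are described in the thesis and are not reproduced here.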
This GitHub repository release contains all the code needed to perform the analysis described in the MSc thesis undertaken by Kartikay Chadha at the University of Toronto.
Thesis release date : To be announced
Thesis Link : To be announced
Thesis Defense date : 2018-07-27
GitHub release date : 2018-07-17
For any questions or concerns, please contact the author at: firstname.lastname@example.org