Integrative genomic analyses of promoter G-quadruplexes reveal their selective constraint and association with gene activation
Li Guangyue
Code and processed data for a study (in submission) that assesses evidence for negative selection on promoter G-quadruplex (G4) forming sequences using gnomAD and the 1000 Genomes Project, and tests for enrichment of functional associations using data from ENCODE and GTEx.
The analysis is separated into five major parts:
- Putative G-quadruplex (pG4) forming sequences within promoters and their stability scores, obtained with the Quadron software.
- Population genetics analysis of canonical putative G-quadruplex (pG4) forming sequences within promoters (the code for this part is split into four sub-parts, 2.1-2.4).
- The effects of G4s on promoter activity, assessed by comparing BG4 ChIP-seq and RNA-seq data in K562 and HepG2 cells.
- Analysis of the BG4 ChIP-seq and TT-seq datasets from this study (GSE178668) to interrogate the relationship between G4 formation and gene expression.
- Testing for enrichment of functional associations, including cis-eQTLs mapped by GTEx, and histone marks, chromatin remodelers and transcription factor binding sites mapped by ENCODE.
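As a rough illustration of what a "canonical" pG4 means in the parts above, the sketch below scans a sequence with a regular expression. It assumes the commonly used motif of four runs of at least three guanines separated by loops of 1-7 nucleotides; that loop range, the function name `find_canonical_pg4`, and the tuple layout are assumptions made here for illustration. This is not Quadron's method and not necessarily the definition used in the repository scripts, and it does not compute stability scores.

```python
import re

# Assumption: canonical motif G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
# The loop length range (1-7 nt) is a common convention, not taken from this repository.
LOOP = "[ACGTN]{1,7}"
G_PATTERN = re.compile(rf"G{{3,}}(?:{LOOP}G{{3,}}){{3}}")  # motif on the given strand
C_PATTERN = re.compile(rf"C{{3,}}(?:{LOOP}C{{3,}}){{3}}")  # motif on the opposite strand


def find_canonical_pg4(seq):
    """Return sorted (start, end, strand, sequence) tuples for non-overlapping matches.

    Coordinates are 0-based with an exclusive end, relative to the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", G_PATTERN), ("-", C_PATTERN)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand, match.group()))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical promoter fragment, used only to exercise the function.
    for hit in find_canonical_pg4("TTGGGAGGGTGGGAGGGTT"):
        print(hit)
```

Because `finditer` reports non-overlapping matches, nested or overlapping candidates within a long G-rich stretch are collapsed into whichever match the regular expression engine finds first; a dedicated tool such as Quadron handles these cases differently.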
Script: contains all of the analysis scripts used to generate the main figures of this study
Data: contains processed, publicly available data organized by source