Computational-Rare-Disease-Genomics-WHG/5-UTR_characterisation

Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes

This repository contains the code accompanying the paper "Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes", including the scripts needed to reproduce our figures.

We recommend installing the following packages used in our RMarkdown scripts prior to reproducing the results.

# "remotes" is included because remotes::install_github() is used below
packages <- c("data.table", "ggplot2", "ggrepel", "ggthemes", 
    "ggalt", "stringr", "tidyverse", 
    "splitstackshape", "BiocManager", 
    "gridExtra", "grid", "ape", "seqinr", "remotes")

# Install CRAN Packages
install.packages(packages)

# Install Bioconductor Packages
BiocManager::install(c("GenomicScores", "BSgenome.Hsapiens.UCSC.hg38", "BSgenome"))

# Install remote packages.
remotes::install_github('jorvlan/raincloudplots')
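
If some of these packages are already present, it may be convenient to install only the ones that are missing. The following is a minimal sketch, not part of the repository: `missing_packages` is a hypothetical helper name, and the check relies on base R's `installed.packages()`.

```r
# Hypothetical helper (not from the repository): return the packages in
# `pkgs` that are not found in the current R library paths.
missing_packages <- function(pkgs) {
  pkgs[!pkgs %in% rownames(installed.packages())]
}

# CRAN packages listed in this README
cran_packages <- c("data.table", "ggplot2", "ggrepel", "ggthemes",
                   "ggalt", "stringr", "tidyverse",
                   "splitstackshape", "BiocManager",
                   "gridExtra", "grid", "ape", "seqinr")

to_install <- missing_packages(cran_packages)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install them:
  # install.packages(to_install)
} else {
  message("All listed CRAN packages are already installed.")
}
```

The same helper can be applied to the Bioconductor package names before calling `BiocManager::install()`.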

To download the data, we recommend installing wget, bgzip and gsutil (following the installation instructions provided by Google).
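
Before running the download script, it can help to confirm that these command-line tools are visible from the environment you are working in. The sketch below uses base R's `Sys.which()`; the executable names `wget`, `bgzip` and `gsutil` are assumed to match how the tools are installed on your system.

```r
# Assumed executable names; adjust if your installation uses different ones.
required_tools <- c("wget", "bgzip", "gsutil")

# Sys.which() returns the full path of each executable, or "" if not found.
tool_paths <- Sys.which(required_tools)
missing_tools <- required_tools[tool_paths == ""]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All required tools were found.")
}
```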

download_all.sh downloads MANE v1 and gnomAD exomes 2.1.1 constraint metrics and stores them in the 5-UTR_char directory.

R markdown analysis files:

The scripts are split into four RMarkdown notebooks, which need to be run in the order shown below.

  1. MANE 5'UTR analysis
  2. 5'UTR composition analysis
  3. 5'UTRs across LOEUF
  4. 5'UTRs of disease genes
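
One way to run the notebooks in sequence is to render them from a single R session. This is only a sketch under stated assumptions: the `.Rmd` file names below are hypothetical placeholders (replace them with the actual file names in the repository), `render_in_order` is an illustrative helper, and the `rmarkdown` package, which is not in the package list above, is assumed to be installed.

```r
# Hypothetical file names, listed in the required order.
# Replace these with the actual notebook files in the repository.
notebooks <- c(
  "1_MANE_5UTR_analysis.Rmd",
  "2_5UTR_composition_analysis.Rmd",
  "3_5UTRs_across_LOEUF.Rmd",
  "4_5UTRs_of_disease_genes.Rmd"
)

# Illustrative helper: check that every notebook exists, then render
# them one after another, each in its own fresh environment.
render_in_order <- function(files) {
  not_found <- files[!file.exists(files)]
  if (length(not_found) > 0) {
    stop("Notebook(s) not found: ", paste(not_found, collapse = ", "))
  }
  for (f in files) {
    message("Rendering ", f)
    rmarkdown::render(f, envir = new.env(parent = globalenv()))
  }
  invisible(files)
}

# Uncomment once the file names above have been adjusted:
# render_in_order(notebooks)
```

Because the helper stops before rendering anything if a file is missing, a mistyped name does not leave the analysis half-run.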

The folder 5-UTR_char contains all data used by and generated by the markdown files.

Questions or Help?

For any questions or help regarding the code, please get in touch with us via the contact details from our manuscript.