Before running any script, edit the file paths in it to match your local setup.
The overall pipeline and the files used for the analyses are shown in AnalysesOverview.jpg.
The GEO accession number for the RNA-seq data reported in this paper is GSE218480; the accession number for the ChIP-seq data is GSE218479.
The folder with the RData files from the RNA-seq and ChIP-seq analyses can be downloaded from https://data.cyverse.org/dav-anon/iplant/home/lijingyang23/Hari_files/RData.zip
The folder with the TCGA data used in the analyses below can be downloaded from https://data.cyverse.org/dav-anon/iplant/home/lijingyang23/Hari_files/TCGA.zip
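The two archives above could be fetched and unpacked with a short script such as the sketch below. It is not part of the repository. It assumes that curl and unzip are installed, and DEST_DIR and DRY_RUN are hypothetical variables introduced only for this example. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: fetch and unpack the two data archives named in this README.
# Assumptions (not from the repository): curl and unzip are available;
# DEST_DIR and DRY_RUN are hypothetical variables used only in this example.

RDATA_URL="https://data.cyverse.org/dav-anon/iplant/home/lijingyang23/Hari_files/RData.zip"
TCGA_URL="https://data.cyverse.org/dav-anon/iplant/home/lijingyang23/Hari_files/TCGA.zip"
DEST_DIR="${DEST_DIR:-data}"   # hypothetical target folder
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands, 0 = run them

# The archive file name is the last path component of the URL.
archive_name() {
  printf '%s\n' "${1##*/}"
}

# Download one archive into DEST_DIR and unpack it there.
fetch_archive() {
  local url zip_path
  url="$1"
  zip_path="${DEST_DIR}/$(archive_name "$url")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "curl -fL -o ${zip_path} ${url}"
    echo "unzip -o ${zip_path} -d ${DEST_DIR}"
    return 0
  fi
  mkdir -p "$DEST_DIR" || return 1
  # -f: fail on HTTP errors, -L: follow redirects, -o: write to this file
  curl -fL -o "$zip_path" "$url" || return 1
  # -o: overwrite existing files without prompting, -d: extraction directory
  unzip -o "$zip_path" -d "$DEST_DIR"
}

fetch_archive "$RDATA_URL"
fetch_archive "$TCGA_URL"
```

To actually download, set DRY_RUN=0 when calling the script, for example `DRY_RUN=0 DEST_DIR=downloads bash fetch_data.sh` (the script name is hypothetical).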
RNA-seq analyses
- The file Salmon_alignment.sh contains the code for aligning the RNA-seq data.
- RNA_seq_Analyses.Rmd processes the aligned files: it summarizes the Salmon-aligned transcripts to genes, normalizes the RNA-seq data, and performs the differential expression analyses. The file RNA_seq_Analyses.R initiates the RNA_seq_Analyses.Rmd run. The output is Lijing_RNA_seq_Analyses_062922.Rdata, which stores all the analyzed objects and can be used for downstream analyses.
- RNA_seq_Analyses.html contains the R Markdown output from executing RNA_seq_Analyses.Rmd.
- The gene set enrichment analysis of the RNA-seq data can be reproduced by running the code in the folder FigureCodes.
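The RNA-seq steps above could be chained roughly as in the sketch below. It is not part of the repository. It assumes that both scripts are run from the repository root without arguments, that their file paths have already been edited, and that the output RData file is written to the working directory. DRY_RUN, run_step and run_rnaseq are hypothetical names used only here. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: run the RNA-seq steps in the order this README describes.
# Assumptions (not confirmed by the repository): the scripts take no
# arguments, are run from the repository root, and the output RData file
# lands in the working directory. DRY_RUN is a hypothetical switch.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them
RNA_OUTPUT="Lijing_RNA_seq_Analyses_062922.Rdata"

# Run one script with the given interpreter, e.g. run_step bash foo.sh
run_step() {
  local runner script
  runner="$1"
  script="$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $runner $script"
    return 0
  fi
  if [ ! -f "$script" ]; then
    echo "missing script: $script" >&2
    return 1
  fi
  "$runner" "$script"
}

run_rnaseq() {
  # Step 1: align the RNA-seq reads.
  run_step bash Salmon_alignment.sh || return 1
  # Step 2: RNA_seq_Analyses.R initiates the RNA_seq_Analyses.Rmd run.
  run_step Rscript RNA_seq_Analyses.R || return 1
  # Step 3: in a real run, confirm that the expected output file exists.
  if [ "$DRY_RUN" != "1" ] && [ ! -f "$RNA_OUTPUT" ]; then
    echo "expected output not found: $RNA_OUTPUT" >&2
    return 1
  fi
}

run_rnaseq
```

Stopping at the first failed step avoids knitting the R Markdown report against missing or partial alignment output.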
ChIP-seq analyses
- ChIP-seq reads are aligned and peaks are called using ChIPSeq_Alignment_MACSPeak.sh.
- Differential peak analysis is performed using DiffBind_Analyses.R.
- The peak calls are analyzed by run_Peak_Calls_Analysis.R, which executes Peak_Calls_Analysis.Rmd.
- The file Peak_Calls_Analysis_062922.html contains the R Markdown output of Peak_Calls_Analysis.Rmd.
- Peak_Calls_Analysis.RData contains all R objects from running Peak_Calls_Analysis.Rmd. This file can be used for further analyses of the ChIP-seq data.
The folder FigureCodes contains the code for reproducing the figures.