- Comparison of RNA QC run using either Trimmomatic or fastp.
- Provision of scripts for RNA-seq processing steps, from QC to alignment and quantification.
To use the package, install it from GitHub using the following lines of code:
install.packages("devtools")
library(devtools)
install_github("RHReynolds/RNAseqProcessing")
The executables for all tools have already been downloaded to /tools/. The scripts below assume that you can call the various tools from the command line without pointing to their exact location on the server. To enable this, edit the .profile file in your home directory:
cd
nano .profile
Then ensure that each tool (fastp, STAR, etc.) has an export PATH line, which tells bash to look in /tools/ for commands. For example, add export PATH="/tools/fastp/:$PATH" for fastp.
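The PATH set-up described above could be scripted roughly as follows. This is a minimal sketch and not part of the repository: the function name `add_tool_to_path` is hypothetical, and it assumes that each tool sits in its own directory under /tools/, as in the fastp example. It only appends the export line if an identical line is not already present, so it can be run repeatedly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of RNAseqProcessing): append an
# `export PATH=...` line for one tool directory to a profile file,
# unless exactly the same line is already there.
add_tool_to_path() {
  local tool_dir="$1"                    # e.g. /tools/fastp/
  local profile="${2:-$HOME/.profile}"   # file to edit; defaults to ~/.profile
  local line="export PATH=\"${tool_dir}:\$PATH\""

  touch "$profile"
  # -F: fixed string, -x: match the whole line, -q: no output
  if ! grep -Fxq -- "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Example usage (commented out so that sourcing this file changes nothing):
#   add_tool_to_path /tools/fastp/
#   . ~/.profile          # reload the profile in the current shell
#   command -v fastp      # prints the resolved path if the set-up worked
```

After reloading the profile, `command -v` is a quick way to confirm that bash can find each tool without its full path.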
Script | Processing Step | Description | Author(s) |
---|---|---|---|
prealignmentQC_fastp_PEadapters.R | Pre-alignment QC | This will perform fastp trimming, with adapter sequence auto-detection for PE data enabled, followed by FastQC and MultiQC. If you wish to specify adapters, this flag needs to be enabled. Script not yet produced. | DZ, KD & RHR |
prealignmentQC_fastp_notrimming.R | Pre-alignment QC | This will run fastp, but with trimming disabled, followed by FastQC and MultiQC. | DZ, KD & RHR |
STAR_alignment_withReadGroups_multi2pass.R | Alignment | Performs STAR alignment, with the option of adding read groups if needed (this is important if you're planning to use your BAM files for later de-duplication with UMIs). By default, this script will perform 1st pass mapping. To use it for 2nd pass mapping, together with a file of filtered junctions, use the --sj_file flag. This script is primarily for use with reads of length > 75 bp. If read length is shorter, different parameters may be necessary. For details of the alignment process, read the alignment workflow. | DZ & RHR |
STAR_splice_junction_merge.R | Alignment | Performs merging of SJ.out.tab files from 1st pass mapping, removes duplicated splice junctions (as determined by genomic location) and outputs one SJ.out.tab file with the genomic coordinates. Also has optional flag for filtering junctions by the number of samples they are present in. For details of alignment process, read the alignment workflow. | RHR |
post_alignment_QC_RSeQC.R | Post-alignment QC | Performs (i) sorting and indexing of .bam files using samtools and (ii) post-alignment QC using RSeQC. For details, read the alignment workflow. | DZ, KD & RHR |
quantification_Salmon.R | Quantification | Performs mapping-based quantification of transcripts and genes (the latter is only if a transcript-to-gene map is provided). This script can be used following trimming, as it does not require aligned files. Instead, Salmon will perform quasi-mapping prior to quantification. The benefit of using Salmon for quantification is its speed and ability to correct for sequence-specific biases, GC-biases and positional biases. This script is adapted for paired-end reads. For more details, read the quantification workflow. | RHR |
leafcutter_ds_multi_pairwise.R | Differential splicing | Leafcutter's command line tool for differential splicing currently only permits pairwise comparisons. If a grouping variable contains more than two groups, multiple pairwise comparisons can be performed using this script, which still calls the original Leafcutter command but loops across each of the pairwise comparisons. For more details, read the leafcutter workflow. | RHR |
leafviz_multi_pairwise.R | Differential splicing | To visualise results of Leafcutter's differential splicing, Leafviz can be used. This requires that the results of the differential splicing have been formatted for use. Leafcutter provides a script, prepare_results.R, which performs this formatting, albeit for only one pairwise comparison. Formatting the results of multiple pairwise comparisons requires looping across the comparisons and running prepare_results.R for each one. This is what the leafviz_multi_pairwise.R script does. For more details, read the leafcutter workflow. | RHR |
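
To illustrate the kind of merging that STAR_splice_junction_merge.R is described as doing, the sketch below combines several SJ.out.tab files, keeps one row per unique genomic location and optionally filters by the number of samples a junction appears in. It is an assumption-laden shell approximation, not the repository's R implementation: the function name `merge_sj` is hypothetical, a junction's location is taken to be the first four columns of SJ.out.tab (chromosome, intron start, intron end, strand), each input file is assumed to represent one sample, and the output columns are printed exactly as they appear in the input.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the repository's R script).
# Usage: merge_sj MIN_SAMPLES sample1.SJ.out.tab sample2.SJ.out.tab ...
# Prints one line per unique junction (chr, start, end, strand) that is
# present in at least MIN_SAMPLES of the input files.
merge_sj() {
  local min_samples="$1"
  shift
  awk -v min="$min_samples" '
    BEGIN { FS = OFS = "\t" }
    {
      # Junction identity: chromosome, intron start, intron end, strand
      key = $1 OFS $2 OFS $3 OFS $4
      # Count each junction at most once per input file (i.e. per sample)
      if (!((key, FILENAME) in seen)) {
        seen[key, FILENAME] = 1
        n_samples[key]++
      }
    }
    END {
      for (key in n_samples)
        if (n_samples[key] >= min) print key
    }
  ' "$@" | sort -t $'\t' -k1,1 -k2,2n -k3,3n
}

# Example usage (file names are placeholders):
#   merge_sj 1 sampleA.SJ.out.tab sampleB.SJ.out.tab > merged.SJ.out.tab   # no filtering
#   merge_sj 2 sampleA.SJ.out.tab sampleB.SJ.out.tab > shared.SJ.out.tab   # junctions in >= 2 samples
```

Counting per file rather than per row means a junction listed twice in one sample's file still counts as a single sample for the filter.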