Bulk RNA-seq Cheatsheet

The tables below list functions and commands that will help you through this module.

Each table represents a different library/tool and its corresponding commands.

Please note that these tables do not cover everything you need to know about each command.

The hyperlinks found in each piece of code will take you to the documentation for further information on the usage of each command.

Base R

Read the Base R package documentation here.

| Library/Package | Piece of Code | What it's called | What it does |
|---|---|---|---|
| Base R | `list.files()` | List files | Produces a character vector of files or directories in the specified directory |
| Base R | `names()` | Names | Gets or sets the names of an object |
| Base R | `colnames()` | Column names | Gets or sets the column names of a matrix or data frame |
| Base R | `all.equal()` | All equal | Checks if two R objects are nearly equal |
| Base R | `attr()` | Object Attributes | Gets or sets the attributes of an object |
| Base R | `rowSums()` | Row Sums | Returns the sum of each row in a numeric matrix-like object (e.g., a matrix or data.frame) |
| Base R | `relevel()` | Relevel | Reorders the levels of a factor so that the specified level comes first (the reference level) |
| Base R | `summary()` | Object Summary | Returns a result summary of an object |
| Base R | `as.data.frame()` | Data Frame | Coerces an object into a data.frame, if possible |
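
As a quick illustration of how the Base R functions in the table above fit together, here is a minimal sketch. The count matrix, the gene and sample names, and the `genotype` factor are all made up for this example and do not come from the module's data.

```r
# Hypothetical toy count matrix: 3 genes x 4 samples (values are made up).
counts_mat <- matrix(
  c(10, 0, 5, 3,
     0, 0, 0, 0,
     7, 8, 9, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("s1", "s2", "s3", "s4"))
)

# colnames() gets the column names (here, the sample IDs).
sample_ids <- colnames(counts_mat)

# rowSums() totals each gene across samples; names() reads the gene names.
gene_totals <- rowSums(counts_mat)
names(gene_totals)

# Keep only genes with at least one count in any sample.
filtered <- counts_mat[gene_totals > 0, , drop = FALSE]

# as.data.frame() coerces the matrix into a data frame.
filtered_df <- as.data.frame(filtered)

# all.equal() confirms the sample IDs survived the conversion.
same_names <- all.equal(colnames(filtered_df), sample_ids)

# relevel() moves the chosen level to the front so it acts as the reference.
genotype <- factor(c("mutant", "wildtype", "mutant", "wildtype"))
genotype <- relevel(genotype, ref = "wildtype")
levels(genotype)

# attr() sets and then gets a custom attribute on an object.
attr(filtered_df, "source") <- "toy example"
attr(filtered_df, "source")

# summary() gives a quick numeric summary of the per-gene totals.
summary(gene_totals)

# list.files() lists the contents of a directory; here, a temporary
# directory holding a single file created just for this example.
tmp_dir <- file.path(tempdir(), "cheatsheet_demo")
dir.create(tmp_dir, showWarnings = FALSE)
file.create(file.path(tmp_dir, "sample_01.txt"))
list.files(tmp_dir)
```

In this sketch, `geneB` is dropped because its row sum is zero, and `wildtype` becomes the first (reference) level of `genotype`.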

DESeq2

Read the DESeq2 package documentation here, and the package vignette by Love, Anders, and Huber here.

| Library/Package | Piece of Code | What it's called | What it does |
|---|---|---|---|
| DESeq2 | `vst()` | Variance Stabilizing Transformation | Applies a variance stabilizing transformation to the data (log2-like scale) |
| DESeq2 | `DESeqDataSet()` | DESeqDataSet constructor that can take a SummarizedExperiment | Creates a DESeqDataSet object |
| DESeq2 | `DESeqDataSetFromMatrix()` | DESeqDataSet constructor | Creates a DESeqDataSet object from a matrix of count data |
| DESeq2 | `DESeq()` | Differential Expression Analysis Based on the Negative Binomial Distribution | Estimates size factors, estimates dispersion, and performs negative binomial fitting and Wald statistics as steps in the default DESeq2 differential expression analysis |
| DESeq2 | `plotPCA()` | PCA plot | Produces a principal component analysis plot for transformed data. It can be used to visually inspect the data, which might allow an analyst to identify batch effects. |
| DESeq2 | `counts()` | Counts | Returns the count matrix from a DESeqDataSet object |
| DESeq2 | `results()` | Results | Returns the results table from a DESeq2 analysis |
| DESeq2 | `assay()` | Assay | Returns the matrix from the assay slot of a DESeqDataSet object |

FastQC and fastp

Read the FastQC documentation here and the fastp documentation here.

| Library/Package | Piece of Code | What it's called | What it does |
|---|---|---|---|
| fastp | `fastp` | FASTQ preprocessor | Preprocesses FASTQ files through adapter trimming, quality filtering, length filtering, and a number of additional options |
| FastQC | `fastqc` | FastQC (Quality Control) | Performs quality control checks on raw sequence data and outputs a QC (quality control) report |

ggplot2

Read the ggplot2 package documentation here. A vignette on the usage of the ggplot2 package can be found here.

| Library/Package | Piece of Code | What it's called | What it does |
|---|---|---|---|
| ggplot2 | `ggsave()` | GG Save | Saves the last plot to a file (in the working directory unless a path is given) |
| ggplot2 | `last_plot()` | Last plot | Returns the last plot produced |
| ggplot2 | `geom_point()` | Geom point | Creates a scatterplot (when added to the `ggplot()` function) |
| ggplot2 | `xlab()`; `ylab()` | X Axis Labels; Y Axis Labels | Modifies the labels on the x axis and on the y axis, respectively |
| ggplot2 | `coord_fixed()` | Cartesian Coordinates with Fixed Aspect Ratio | Forces the plot to use a fixed, specified ratio between the units on the y axis and the units on the x axis |

tximeta and SummarizedExperiment

Read the tximeta package documentation here, and the package vignette by Love et al. here. Read the SummarizedExperiment package documentation here, and the package vignette by Morgan et al. here.

| Library/Package | Piece of Code | What it's called | What it does |
|---|---|---|---|
| tximeta | `tximeta()` | tximeta | Imports transcript-level estimates, attaches transcriptome annotation, and returns a SummarizedExperiment object |
| tximeta | `makeLinkedTxome()` | Make Linked Transcriptome | Sets up transcriptome annotation to be used by the `tximeta()` function (only necessary if `tximeta()` fails to find annotation, as can happen with data from species other than human or mouse) |
| tximeta | `summarizeToGene()` | Summarize to Gene | Takes a SummarizedExperiment that was set up by tximeta and summarizes transcript data to the gene level |
| SummarizedExperiment | `rowData()`; `colData()` | Row/Col Data | Accesses the row or column data, respectively, from a SummarizedExperiment object |
| SummarizedExperiment | `assay()`; `assayNames()` | Assay; AssayNames | Accesses the assay data or the names of the assays, respectively, from a SummarizedExperiment object |

stringr, readr, dplyr, pheatmap

Documentation for each of these packages can be accessed by clicking the package name in the table below.

| Library/Package | Piece of Code | What it's called | What it does |
|---|---|---|---|
| stringr | `word()` | Word | Extracts words from a character vector |
| readr | `write_rds()` | Write RDS | Writes data to a .RDS output file |
| dplyr | `pull()` | Pull | Extracts a variable (column) as a vector |
| pheatmap | `pheatmap()` | Pretty heatmap | Plots clustered heatmaps |

Salmon

Read the Salmon tool documentation here.

| Tool | Piece of Code | What it's called | What it does |
|---|---|---|---|
| Salmon | `salmon index` | Salmon index | Builds a transcriptome index which is required for Salmon quantification (from the command line) |
| Salmon | `salmon quant` | Salmon quantification | Runs Salmon’s quantification of transcript expression (from the command line) |

Useful Command Line Commands

Feel free to give these commands a try on your own! (Note that our example begins in the training-module directory.)