hbctraining/Intro-to-rnaseq-hpc-gt

Introduction to RNA-seq using high-performance computing (HPC)

Description

This repository has teaching materials for a 2-day Introduction to RNA-sequencing data analysis workshop. This workshop focuses on teaching basic computational skills to enable the effective use of a high-performance computing environment to implement an RNA-seq data analysis workflow. It includes an introduction to shell (bash) and shell scripting. In addition to running the RNA-seq workflow from FASTQ files to count data, the workshop covers best practice guidelines for RNA-seq experimental design and data organization/management.

These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.

Learning Objectives

  1. Understand the necessity for, and use of, the command line interface (bash) and HPC for analyzing high-throughput sequencing data.
  2. Understand best practices for designing an RNA-seq experiment and analyzing the resulting data.

Contents

Basics of Shell

| Lessons | Estimated Duration |
|---|---|
| Introduction to the shell | 70 min |
| Searching and redirection in shell | 45 min |
| Introduction to the Vim text editor | 30 min |
| Shell scripts and for loops | 75 min |
| Permissions and environment variables | 50 min |
| Project and data organization | 40 min |

HPC and RNA-seq workflow

| Lessons | Estimated Duration |
|---|---|
| RNA-seq experimental design best practices | 50 min |
| Introduction to High-Performance Computing | 45 min |
| RNA-seq data QC with FastQC | 75 min |
| RNA-seq workflow: Alignment and Counting | 90 min |
| Automating the RNA-seq workflow | 60 min |
| Alternative workflows for analyzing RNA-seq data | 15 min |
| Quantifying expression using alignment-free methods (Salmon) | 75 min |

Installations

This workshop requires that the following programs are installed on your laptop:

Mac users

  1. Java

  2. Integrative Genomics Viewer (IGV)

  3. Sublime Text

Windows users

  1. Git Bash

  2. Java

  3. Integrative Genomics Viewer (IGV)

  4. Notepad++
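
Before the workshop, it can help to confirm that the command-line prerequisite is actually reachable from your terminal (Terminal on Mac, Git Bash on Windows). The snippet below is a minimal sketch and is not part of the workshop materials: `check_tool` is a hypothetical helper name, and it assumes Java exposes a `java` command on your `PATH`. IGV, Sublime Text, and Notepad++ are graphical applications, so they are usually easiest to verify by launching them.

```shell
#!/usr/bin/env bash

# Hypothetical helper (an assumption, not from the workshop materials):
# reports whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# Java is listed as a requirement for both Mac and Windows users.
check_tool java || echo "Java does not appear to be on your PATH; install it before the workshop."
```

If Java is reported as found, running `java -version` prints the installed version.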


These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.