# RNA-Seq expression analysis


## Introduction

RNA sequencing (**RNA-Seq**) is a high-throughput method used to profile the **transcriptome**, quantify gene expression and discover novel RNA molecules. This tutorial uses RNA sequencing of **malaria parasites** to walk you through visualising the transcriptome, performing simple quality control checks and profiling transcriptomic differences by identifying differentially expressed genes.

For an introduction to RNA-Seq principles and best practices see:

> **A survey of best practices for RNA-Seq data analysis**  
> Ana Conesa, Pedro Madrigal, Sonia Tarazona, David Gomez-Cabrero, Alejandra Cervera, Andrew McPherson, Michał Wojciech Szcześniak, Daniel J. Gaffney, Laura L. Elo, Xuegong Zhang and Ali Mortazavi  
> _Genome Biol. 2016 Jan 26;17:13 doi:[10.1186/s13059-016-0881-8](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0881-8)_

## Learning outcomes

By the end of this tutorial you can expect to be able to:  

 * Align RNA-Seq reads to a reference genome and a transcriptome 
 * Visualise transcription data using standard tools 
 * Perform QC of NGS transcriptomic data 
 * Quantify the expression values of your transcripts using standard tools 

## Tutorial sections
This tutorial comprises the following sections:    

  1. [Introducing the tutorial dataset](dataset-intro.ipynb) 
  2. [Mapping RNA-Seq reads to the genome with HISAT2](genome-mapping.ipynb) 
  3. [Visualising transcriptomes with IGV](transcriptome-visualisation.ipynb) 
  4. [Transcript quantification with Kallisto](transcript-quantification.ipynb) 
  5. [Identifying differentially expressed genes with Sleuth](sleuth-de.ipynb) 
  6. [Interpreting the results](de-interpretation.ipynb) 
  7. [Key aspects of differential expression analysis](key-aspects.ipynb)

## Authors
This tutorial was written by [Victoria Offord](https://github.com/vaofford) based on materials from [Adam Reid](https://www.sanger.ac.uk/people/directory/reid-adam-james).

## Prerequisites

This tutorial assumes that you have the following software or packages and their dependencies installed on your computer. The software or packages used in this tutorial may be updated from time to time, so we have also given the version that was used when writing the tutorial.

| Package               | Link for download/installation instructions                          | Version tested |
| :-------------------: | :------------------------------------------------------------------: |:-------------: |
| HISAT2                | https://ccb.jhu.edu/software/hisat2/index.shtml                      | 2.1.0          |
| samtools              | https://github.com/samtools/samtools                                 | 1.10           |
| IGV                   | https://software.broadinstitute.org/software/igv/                    | 2.7.2          |
| kallisto              | https://pachterlab.github.io/kallisto/download                       | 0.46.2         |
| R                     | https://www.r-project.org/                                           | 4.0.2          |
| sleuth                | https://pachterlab.github.io/sleuth/download                         | 0.30.0         |
| bedtools              | http://bedtools.readthedocs.io/en/latest/content/installation.html   | 2.29.2         |
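
If you would like to compare your installed versions against the table above, the following is a minimal sketch rather than an official check. The helper name `report_version` is our own invention for illustration, and the script assumes each tool is on your `PATH` and prints its version on the first line of its version output. IGV is a graphical application and is not checked here; its version is normally shown in the application itself.

```bash
#!/usr/bin/env bash
# Print the first line of each tool's version output, or a note if it is missing.
# report_version is a hypothetical helper written for this sketch.
report_version() {
    tool="$1"
    shift
    if command -v "$tool" >/dev/null 2>&1; then
        # Remaining arguments are the tool's own version flag or subcommand.
        printf '%s: %s\n' "$tool" "$("$tool" "$@" 2>&1 | head -n 1)"
    else
        printf '%s: not found on PATH\n' "$tool"
    fi
}

report_version hisat2 --version
report_version samtools --version
report_version kallisto version
report_version bedtools --version
report_version R --version

# sleuth is an R package, so ask R for its version instead.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'cat("sleuth:", as.character(packageVersion("sleuth")), "\n")' 2>/dev/null \
        || echo "sleuth: not installed as an R package"
else
    echo "sleuth: cannot check, Rscript not found on PATH"
fi
```

Newer versions than those listed may well work, but the output you see could differ slightly from what is shown in the tutorial.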

## Where can I find the tutorial data?

You can find the data for this tutorial by typing the following command in a new terminal window.

```bash
cd /home/manager/course_data/rna_seq
```

Now, let's head to the first section of this tutorial: **[Introducing the tutorial dataset](dataset-intro.ipynb)**.