# RNA-Seq

## Overview

**If this is your first time running this workflow, please read this section fully, as well as the explanatory text in other sections. Skipping text can result in running the analysis incorrectly. eResearch provides support and assistance in running these Jupyter workflows, but assumes that you have read and followed the instructions within.**

This is a [Jupyter notebook](https://jupyter.org/) containing a workflow for analysing RNA-Seq (gene expression) data.

This workflow consists of several analysis steps, grouped into **two main sections**:

1. **Upstream analysis using nfcore/rnaseq, which is**:

> a bioinformatics pipeline that can be used to analyse RNA sequencing data obtained from organisms with a reference genome and annotation

nfcore/rnaseq is built with the [Nextflow](https://www.nextflow.io/) workflow manager.

Upstream analysis involves mapping sequence reads to a reference genome and quantifying the number of reads that fall within defined genomic regions (genes, transcripts, exons).

2. **Differential expression analysis using R tools, mainly DESeq2**

The output files (mainly, the count table) of nfcore/rnaseq are used in downstream analysis.

Downstream analysis involves differential expression analysis using the R package DESeq2, and generation of gene expression tables and plots.

Although this downstream workflow was designed specifically for nfcore/rnaseq output files, any count table can be used as input (e.g. one generated by featureCounts, another common tool for producing RNA-Seq count tables), with some small modifications to the workflow script.
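As a rough illustration of the kind of small modification mentioned above, the sketch below trims a featureCounts table down to a gene ID column plus one count column per sample. featureCounts writes a leading comment line starting with `#`, followed by a header whose first six columns are `Geneid`, `Chr`, `Start`, `End`, `Strand` and `Length`. The file names and the toy table here are hypothetical, and the exact layout the downstream notebook expects is an assumption, so compare the result against an nfcore/rnaseq count table before using it.

```shell
# Hypothetical file names; replace with your own paths.
fc_table="featurecounts_example.txt"
count_table="gene_counts_example.tsv"

# Build a tiny featureCounts-style table, for demonstration only.
printf '# Program:featureCounts; Command: (example)\n' > "$fc_table"
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\tsampleB.bam\n' >> "$fc_table"
printf 'gene1\tchr1\t100\t900\t+\t801\t12\t30\n' >> "$fc_table"
printf 'gene2\tchr1\t2000\t2500\t-\t501\t0\t7\n' >> "$fc_table"

# Drop the comment line, then keep the gene ID (column 1)
# and the per-sample counts (column 7 onwards).
grep -v '^#' "$fc_table" | cut -f 1,7- > "$count_table"

# Show the simplified count table.
cat "$count_table"
```

The sample columns keep whatever names featureCounts gave them (typically the input BAM file paths), so they may need renaming to match your sample names.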

This workflow was prepared by the [eResearch Office, QUT](https://qutvirtual4.qut.edu.au/group/staff/governance/organisational-structure/academic-division/research-portfolio/research-infrastructure/eresearch).

For assistance in running this or other bioinformatics analyses, submit a request in the eResearch support portal:
https://eresearchqut.atlassian.net/servicedesk/customer/portal/14/create/184

**********************************

# Table of Contents

[How to use this Jupyter Notebook](#overview)

1. [nfcore/rnaseq workflow](./rnaseq_1.ipynb)

2. [Differential expression analysis using R tools](./rnaseq_2.ipynb)

Clicking on the above links will open a separate Jupyter Notebook to run either of the two main analysis sections.


***************************

## How to use this Jupyter Notebook <a class="anchor" id="overview"></a>

Jupyter Notebooks run a 'kernel' that allows code to be run in code 'cells' in the Notebook. This Notebook is running the BASH kernel, which allows commands to be run on QUT's high performance compute cluster (HPC).

You can run a code cell by clicking on the cell itself and clicking the run button (at the top of this Notebook), or by pressing shift+enter.

![](https://data36.com/wp-content/uploads/2021/07/how-to-run-cell-in-jupyter-notebook.png)

<div class="alert alert-block alert-warning">
As an example, run the following code cell to list the contents of your HPC home directory.
</div>

In [None]:
ls $HOME

**Before each code cell is a colour-coded text box that tells you what the cell does. The colour of the text box tells you whether a code cell must be run as-is, is optional, or requires you to type input.**

<div class="alert alert-block alert-success">
A green text box indicates a code cell that must be run, without alteration, to complete the workflow.
</div>

<div class="alert alert-block alert-warning">
A yellow text box indicates an optional code cell that doesn't have to be run to complete the workflow, but can be run to complete optional tasks.
</div>

<div class="alert alert-block alert-info">
A blue text box indicates a code cell that requires user input - this cell also must be run to complete the workflow, but the user needs to modify the command in the cell.
</div>

<div class="alert alert-block alert-danger">
In addition, some text boxes contain particularly important information. These will be coloured red.
</div>

*******************************

[**Click here to open the nfcore/rnaseq Notebook**](./rnaseq_1.ipynb)