qbic-pipelines · luiskuhn · Jan 14, 2021 · Jan 14, 2021 · Jan 14, 2021 · Jan 14, 2021
diff --git a/docs/markdown/rnaseq.md b/docs/markdown/rnaseq.md
@@ -55,3 +55,68 @@ nextflow run qbicsoftware/rnadeseq -r 1.3.2 -profile docker \
 ## Reporting
 
 The reporting is taken care of as part of the [qbic-pipelines/rnadeseq](https://github.com/qbic-pipelines/rnadeseq) pipeline. Check the previous section.
+
+## Differential Exon Usage analysis
+
+This guides through the execution of a DEU analysis, pointing out the necessary scripts and files.
+
+### Prerequisites
+
+* BAM files to be analysed
+* A conda environment with HTSeq installed (conda install -c bioconda htseq)
+* The script dexseq_prepare_annotation2.py from [https://github.com/vivekbhr/Subread_to_DEXSeq](https://github.com/vivekbhr/Subread_to_DEXSeq)
+* Python
+* The annotation .gtf file you have obtained the BAM files with
+* The file `DESeq2/raw_counts/merged_gene_counts.txt` copied in its location by the DESeq2 script
+
+### Definitions
+
+* DEU: Differential Exon Usage.
+
+### Process
+
+Run nf-core/rnaseq pipeline on rnaseq data **->** Transform your annotation .gtf **->** execute featureCounts on the BAM files in results/MarkDuplicates/ **->** Analyse this output in R
+
+### Procedure
+
+#### Step 1: Run nf-core/rnaseq pipeline on your data
+
+1. Create an appropriate workspace on the cluster
+2. Download the data
+3. Check that you have annotation and fasta file at hand for the organism you’re working with
+4. Start the workflow in a screen session
+
+#### Step 2: Transform your annotation .gtf file
+
+1. Create a conda environment, or use one that you already have, and activate it
+2. Install HTSeq:
+```bash
+conda install -c bioconda htseq
+```
+3. Download the script dexseq_prepare_annotation2.py from [https://github.com/vivekbhr/Subread_to_DEXSeq](https://github.com/vivekbhr/Subread_to_DEXSeq)
+4. Execute: 
+```bash
+python dexseq_prepare_annotation2.py -f out.gtf in.gtf out.gff
+```
+
+#### Step 3: execute featureCounts
+
+1. On cfc load the subread module: 
+```bash
+module load qbic/subread/1.6.0
+```
+2. Run featureCounts on the BAM files produced by the pipeline: 
+```bash
+featureCounts -p -F GTF -a out.gtf -o counts.out Cont_1.bam Cont_2.bam
+```
+3. As output you have now a count table with a column for every input file; copy it locally on your machine
+4. Open Rstudio and the script
+
+#### Step 4: execute R analysis
+
+1. Open Rstudio and the script demo_dexseq.R.
+2. In the same folder as the script you need to place:
+    * load_SubreadOutput.R
+    * Counts.out (output from featureCounts)
+    * out.gtf
+3. NOTE: in the script, the number of workers is set to 1 to adapt the condition to my local machine (setting it to higher numbers was causing R not to execute the command successfully), but this parameter is typically increased on other machines.