# Running Nextflow from Colab

This guide provides the code needed to run nf-core pipelines from Google Colab notebooks.

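## Installing Java

In [None]:
!apt update
!apt install -y openjdk-17-jdk
# Each `!` line runs in its own subshell, so `export` and `source` would not persist.
# Setting os.environ changes the notebook process, which later `!` commands inherit.
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-17-openjdk-amd64"
os.environ["PATH"] = os.environ["JAVA_HOME"] + "/bin:" + os.environ["PATH"]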
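
To confirm which Java the notebook actually sees, the banner printed by `java -version` can be parsed from Python. This is a minimal sketch: the helper names `parse_java_major` and `installed_java_major` are hypothetical, and the minimum Java version to compare against depends on the Nextflow release, so check its documentation.

```python
import re
import shutil
import subprocess


def parse_java_major(banner):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: Java 8 reports itself as "1.8.0_...".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` if java is on PATH and parse its banner."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # The banner is normally written to stderr; stdout is appended as a fallback.
    return parse_java_major(result.stderr + result.stdout)


print("Java major version:", installed_java_major())
```

A result of `None` means no `java` executable was found on the `PATH` of the notebook process.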

## Installing Nextflow

In [None]:
!wget -qO- https://get.nextflow.io | bash # Download Nextflow
!mv nextflow /usr/bin/nextflow # Move to a path Colab can access
!chmod +x /usr/bin/nextflow # Make it executable
!nextflow -v # Test it

## Setting up Conda

In [None]:
!pip install -q condacolab # -q here means quiet
import condacolab
condacolab.install() # Restarts the kernel; wait for it to finish before running the next cell

In [None]:
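# `--add` prepends to the channel list, so the channel added last gets the highest priority.
# The resulting order is conda-forge > bioconda > defaults.
!conda config --add channels defaults
!conda config --add channels bioconda
!conda config --add channels conda-forge
!conda config --set channel_priority strict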

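## Running a test pipeline

The two cells below are an optional smoke test and are left commented out. Uncomment them to check the installation with nf-core/demo before moving on to the full RNA-seq run.

In [None]:
#! nextflow pull nf-core/demo

In [None]:
#! nextflow run nf-core/demo -profile conda,test --outdir demo-results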

# Running the nf-core/rnaseq pipeline

![RNASeq pipeline](https://raw.githubusercontent.com/Multiomics-Analytics-Group/course_multi-omics_data_science/refs/heads/main/transcriptomics/notebooks/img/rnaseq.png)

The input data come from the GEO series [GSE137344](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE137344).

### Creating Data Folders for Transcriptomics Analysis

In [None]:
!mkdir -p transcriptomics/data # -p creates parent folders and does not fail if they already exist

In [None]:
! rm -rf transcriptomics/results # Clear results from a previous run; -f avoids an error if the folder is missing

### Downloading Sample Sheet File for nf-core/rnaseq

In [None]:
! wget https://raw.githubusercontent.com/Multiomics-Analytics-Group/course_multi-omics_data_science/refs/heads/main/transcriptomics/data/sample_sheet.csv -O transcriptomics/data/sample_sheet.csv
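
Before launching the pipeline it can help to sanity-check the sample sheet. The sketch below assumes the column layout documented for nf-core/rnaseq (`sample`, `fastq_1`, `fastq_2`, `strandedness`). The helper `check_sample_sheet` is hypothetical and not part of nf-core, the example rows are made up for illustration, and the pipeline performs its own, stricter validation.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample", "fastq_1", "fastq_2", "strandedness")
STRANDEDNESS_VALUES = {"auto", "forward", "reverse", "unstranded"}


def check_sample_sheet(text):
    """Return a list of problems found in sample-sheet CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        sample = (row["sample"] or "").strip()
        fastq_1 = (row["fastq_1"] or "").strip()
        strandedness = (row["strandedness"] or "").strip()
        if not sample:
            problems.append(f"line {line_no}: empty sample name")
        if not fastq_1.endswith((".fastq.gz", ".fq.gz")):
            problems.append(f"line {line_no}: fastq_1 is not a gzipped FASTQ path")
        if strandedness not in STRANDEDNESS_VALUES:
            problems.append(f"line {line_no}: unexpected strandedness {strandedness!r}")
    return problems


# Illustrative rows only; the real file is transcriptomics/data/sample_sheet.csv.
example = (
    "sample,fastq_1,fastq_2,strandedness\n"
    "sample_A,transcriptomics/data/SRR10104255.fastq.gz,,auto\n"
)
print(check_sample_sheet(example))
```

To check the downloaded file, read it first and pass its text, for example `check_sample_sheet(open("transcriptomics/data/sample_sheet.csv").read())`.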

### Downloading the Config File for the Pipeline

In [None]:
! wget https://raw.githubusercontent.com/Multiomics-Analytics-Group/course_multi-omics_data_science/refs/heads/main/transcriptomics/low_resources.config -O transcriptomics/low_resources.config

### Downloading the FASTQ Files

In [None]:
# Quote the URLs so the shell does not treat `?` as a glob character.
!wget "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR10104255" -O transcriptomics/data/SRR10104255.fastq.gz
!wget "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR10104256" -O transcriptomics/data/SRR10104256.fastq.gz
!wget "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR10104257" -O transcriptomics/data/SRR10104257.fastq.gz
!wget "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR10104258" -O transcriptomics/data/SRR10104258.fastq.gz
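
A failed download can leave an error page or an uncompressed file saved under a `.fastq.gz` name, which only surfaces later as a pipeline error. The sketch below checks the gzip magic bytes and reads the first FASTQ header. The helper names are hypothetical, the demo record is made up, and whether the NCBI endpoint always returns gzip-compressed content is an assumption worth verifying this way.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # First two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def first_fastq_header(path):
    """Return the first line of a gzipped FASTQ file as text."""
    with gzip.open(path, "rt") as handle:
        return handle.readline().rstrip("\n")


# Self-contained demonstration on a tiny, made-up FASTQ record.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.fastq.gz")
    with gzip.open(demo, "wt") as handle:
        handle.write("@read1\nACGT\n+\nIIII\n")
    print(looks_like_gzip(demo), first_fastq_header(demo))
```

On the real data, call `looks_like_gzip("transcriptomics/data/SRR10104255.fastq.gz")` for each downloaded file; a valid FASTQ header line starts with `@`.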

In [None]:
! nextflow pull nf-core/rnaseq

In [None]:
! nextflow run \
    nf-core/rnaseq \
    --input transcriptomics/data/sample_sheet.csv \
    --outdir transcriptomics/results \
    --igenomes_ignore \
    --genome null \
    -profile conda \
    -c transcriptomics/low_resources.config \
    -resume
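
Once the run finishes, the MultiQC report is usually the first file to open. Its exact subfolder depends on the pipeline version and the aligner used, so this sketch searches the output directory recursively instead of assuming a path. The helper `find_reports` is hypothetical, and the file name `multiqc_report.html` is the MultiQC default, assumed unchanged here.

```python
from pathlib import Path


def find_reports(outdir, pattern="multiqc_report.html"):
    """Return sorted paths under outdir whose file name matches pattern."""
    root = Path(outdir)
    if not root.is_dir():
        return []  # The run has not produced this directory (yet)
    return sorted(root.rglob(pattern))


for report in find_reports("transcriptomics/results"):
    print(report)
```

In Colab, a located report can be downloaded through the file browser in the left sidebar.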