# Nextflow Demo, Hsiao Lab (Sequence Analysis Workshop)
Author: Zohaib Anwar <br />
Date: April 29, 2021

## Setup

Installation instructions are available on the [Nextflow website](https://www.nextflow.io/index.html). <br />
Nextflow has only one prerequisite: <br />
* Java (version 8 or higher)


In [1]:
# Check Java version on your system
java -version

openjdk version "11.0.9.1" 2020-11-04 LTS
OpenJDK Runtime Environment Zulu11.43+55-CA (build 11.0.9.1+1-LTS)
OpenJDK 64-Bit Server VM Zulu11.43+55-CA (build 11.0.9.1+1-LTS, mixed mode)


**If the Java version on your system is older than 8, use this [link](https://java.com/en/download/help/download_options.html) to install a newer version.**

In [2]:
# Installation
# curl -s https://get.nextflow.io | bash

In [2]:
# Let's check the Nextflow version
nextflow -v

Let's start with the Nextflow Hello World example.

In [3]:
nextflow run hello

N E X T F L O W  ~  version 20.10.0
Launching `nextflow-io/hello` [dreamy_curie] - revision: e6d9427e5b [master]
executor >  local (4)
[d6/0f0f50] process > sayHello (2) [100%] 4 of 4 ✔
Hola world!

Hello world!

Bonjour world!

Ciao world!




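When no script with the given name exists in the current directory, Nextflow looks for a project of that name in the [nextflow-io](https://github.com/nextflow-io/) organization on GitHub, downloads it, and runs it. This is why the log above reports launching `nextflow-io/hello`.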

# Introduction

## Basic components

* **Processes**
* **Channels**

In practice a Nextflow pipeline script is made by joining together different processes. Each process can be written in any scripting language that can be executed by the Linux platform (Bash, Perl, Ruby, Python, etc.).

Processes are executed independently and are isolated from each other, i.e. they do not share a common (writable) state. The only way they can communicate is via asynchronous FIFO queues, called channels in Nextflow.

Any process can define one or more channels as input and output. The interaction between these processes, and ultimately the pipeline execution flow itself, is implicitly defined by these input and output declarations.
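As a minimal sketch of how two processes are joined by a channel, consider the script below. It assumes the DSL1 syntax that was the default in the 20.10.0 release used in this demo (newer releases use DSL2, where the wiring is written in a `workflow` block instead of `from`/`into`). The names `samples`, `toUpper`, `report`, `upper_ch` and `report_ch` are illustrative and not part of the workshop material.

```Nextflow
// Source channel emitting three values (illustrative data).
samples = Channel.from('a', 'b', 'c')

// First process: reads each value from `samples`, writes its result to `upper_ch`.
process toUpper {
    input:
    val x from samples

    output:
    stdout into upper_ch

    script:
    """
    printf '%s' '$x' | tr '[:lower:]' '[:upper:]'
    """
}

// Second process: its input declaration names `upper_ch`,
// which is the only thing linking it to `toUpper`.
process report {
    input:
    val y from upper_ch

    output:
    stdout into report_ch

    script:
    """
    echo 'received: $y'
    """
}

// Print whatever the last process emits.
report_ch.view { it.trim() }
```

Saved under a hypothetical name such as `two_steps.nf`, the script can be launched with `nextflow run two_steps.nf`. Tasks run in parallel, so the order of the printed lines is not guaranteed.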

## Processes

A process may contain five definition blocks: directives, inputs, outputs, a when clause, and finally the process script. The syntax is defined as follows:

```Nextflow

process < name > {

   [ directives ]

   input:
    < process inputs >

   output:
    < process outputs >

   when:
    < condition >

   [script|shell|exec]:
   < user script to be executed >

}

```
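The template above can be filled in as follows. This is a sketch only, again assuming DSL1 syntax; the process name `countLines`, the parameter `params.run_count`, the channel names, and the `data/*.fq` glob are hypothetical and chosen purely for illustration.

```Nextflow
// Hypothetical switch used by the `when` block below.
params.run_count = true

// Assumes FASTQ files exist under ./data relative to the launch directory.
reads_ch = Channel.fromPath('data/*.fq')

process countLines {
    // directives: optional settings that affect how the task is run
    tag "$reads"
    cpus 1

    // inputs: one task is started per file received from `reads_ch`
    input:
    path reads from reads_ch

    // outputs: the text printed by the script is sent to `counts_ch`
    output:
    stdout into counts_ch

    // when: the task is skipped unless the condition is true
    when:
    params.run_count

    // script: a Bash snippet; `\$` keeps the dollar sign for Bash,
    // while `$reads` is substituted by Nextflow before execution
    script:
    """
    echo "$reads: \$(wc -l < $reads) lines"
    """
}

counts_ch.view { it.trim() }
```

Passing `--run_count false` on the command line would override the parameter and cause every task to be skipped.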

## Channels

Nextflow is based on the Dataflow programming model in which processes communicate through channels.

A channel has two major properties:

* Sending a message is an asynchronous operation which completes immediately, without having to wait for the receiving process.
* Receiving data is a blocking operation which stops the receiving process until the message has arrived.

### Channel factory

Channels may be created implicitly by the process output(s) declaration or explicitly using the following channel factory methods.

The available factory methods are:

* create
* empty
* from
* fromPath
* fromFilePairs
* fromSRA
* value
* watchPath
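A few of the factory methods listed above are sketched below. The file globs assume a `data` directory laid out like the one used later in this demo, and the value `'GRCh38'` is a placeholder. Note that `Channel.from` belongs to the 20.10.0-era API; later Nextflow releases recommend `Channel.of` instead.

```Nextflow
// from: each argument becomes a separate item in the channel
Channel.from(1, 2, 3)
       .view { "from: $it" }

// value: a single value that can be read any number of times
Channel.value('GRCh38')
       .view { "value: $it" }

// fromPath: one item per file matching the glob pattern
Channel.fromPath('data/*.fq')
       .view { "fromPath: $it" }

// fromFilePairs: one item per pair, shaped as [common prefix, [file_1, file_2]]
Channel.fromFilePairs('data/*_{1,2}.fq')
       .view { "fromFilePairs: $it" }
```

Each `view` call simply prints the items as they are emitted, which is a convenient way to inspect what a channel contains before connecting it to a process.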


## Scripting language

Nextflow is designed to have a minimal learning curve, without having to pick up a new programming language. In most cases, users can utilise their current skills to develop Nextflow workflows. However, it also provides a powerful scripting DSL.

Nextflow scripting is an extension of the Groovy programming language, which in turn is a superset of the Java programming language. Groovy can be thought of as "Python for Java" in that it simplifies the writing of code and is more approachable.
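To give a flavour of the Groovy constructs that appear most often in Nextflow scripts, here is a small sketch. It contains no processes, so it only prints to the terminal; all variable names and values are made up for illustration.

```Nextflow
// Variables are dynamically typed; no declaration keyword is required in a script.
sample  = 'gut'
threads = 4

// Lists and maps have literal syntax; "${...}" interpolates expressions into strings.
reads    = ["${sample}_1.fq", "${sample}_2.fq"]
settings = [sample: sample, threads: threads]

// A closure is an anonymous function that can be stored in a variable.
shout = { text -> text.toUpperCase() }

println "Sample ${shout(sample)} has ${reads.size()} read files"
println "Threads requested: ${settings.threads}"

// `it` is the implicit parameter of a single-argument closure.
reads.each { println "  read file: $it" }
```

The same interpolation and closure syntax is what makes expressions such as `view { it.trim() }` work on channels.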

## Working Demo

During this tutorial we will implement a proof of concept of an RNA-Seq pipeline which:

* Indexes a transcriptome file.
* Performs quality control.
* Performs quantification.
* Creates a MultiQC report.
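The scripts themselves are not reproduced in this demo, so the following is only a sketch of what an indexing step could look like in DSL1. It assumes Salmon as the indexing tool, which this document does not confirm, and it assumes Salmon is installed and on the `PATH`. The parameter names mirror the ones printed in the run logs; the channel name `index_ch` is illustrative.

```Nextflow
// Default parameters; both can be overridden on the command line,
// e.g. --transcriptome /path/to/other.fa
params.transcriptome = "$baseDir/data/transcriptome.fa"
params.outdir        = "results"

println "transcriptome: ${params.transcriptome}"
println "outdir       : ${params.outdir}"

process index {
    input:
    path transcriptome from params.transcriptome

    output:
    // The `index` directory created by the tool is passed downstream.
    path 'index' into index_ch

    script:
    // ASSUMPTION: Salmon is used here purely as an example indexer.
    """
    salmon index --threads $task.cpus -t $transcriptome -i index
    """
}
```

A later quantification step would declare `index_ch` as one of its inputs, which is how the steps listed above end up chained together.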

In [1]:
nextflow run 1.indexing.nf


N E X T F L O W  ~  version 20.10.0
Launching `1.indexing.nf` [hungry_engelbart] - revision: 52b8519ad3
R N A S E Q - N F   P I P E L I N E    
transcriptome: /Users/au572806/GitHub/Nextflow_demo_HsiaoLab/data/transcriptome.fa
reads        : /Users/au572806/GitHub/Nextflow_demo_HsiaoLab/data/*_{1,2}.fq
outdir       : /Users/au572806/GitHub/Nextflow_demo_HsiaoLab/results

executor >  local (1)
[ca/4782d7] process > index (indexing_/Users/au57... [100%] 1 of 1 ✔




In [7]:
nextflow run 2.fastqc.nf

N E X T F L O W  ~  version 20.10.0
Launching `2.fastqc.nf` [voluminous_poincare] - revision: 59604aac74
R N A S E Q - N F   P I P E L I N E    
transcriptome: /Users/au572806/GitHub/Nextflow_demo_HsiaoLab/data/transcriptome.fa
reads        : /Users/au572806/GitHub/Nextflow_demo_HsiaoLab/data/gut_{1,2}.fq
outdir       : results

executor >  local (2)
[cc/42ca64] process > index                  [100%] 1 of 1 ✔
[24/810856] process > fastqc (FASTQC on gut) [100%] 1 of 1 ✔


