# HERA Bioinformatics training

Hi!

Welcome to the hands-on course part of the HERA bioinformatics training.

Before getting started, look underneath this text for a button labelled "*Show code*" with a play button next to it.  
Please click that play button now. It will install and configure everything necessary for this course.

Installation will take about 5 to 6 minutes.

---

## Table of contents

1. [Removing sequencing adapters & quality control](#scrollTo=O2yVf3BatvPS)
  * [Illumina data](#w)
  * [Nanopore data](#e)
2. [Removing primer sequences]()
  * [Illumina data]()
  * [Nanopore data]()
3. [Aligning reads to reference]()
  * [Illumina data]()
  * [Nanopore data]()
4. [Consensus calling]()

In [None]:
#@title
!pip install igv-jupyter --quiet > /dev/null 2>&1
!sed -i -e '1,2d' ~/.bashrc && source ~/.bashrc && bash -c "$(curl -sL https://raw.githubusercontent.com/RIVM-bioinformatics/HERA-Bioinformatics-Training/main/setup.sh)"

# Quality Control, cleaning reads and removing primers

In [None]:
%%bash
source activate base; conda activate Alignments
mkdir -p output_data/
# Align the raw paired-end reads, drop unmapped (4), secondary (256), QC-failed (512)
# and supplementary (2048) alignments, then sort and index the result.
minimap2 -ax sr source/extra/GCF_009858895_2_ASM985889v3_genomic.fasta example_data/illumina_fastq_1.fastq.gz example_data/illumina_fastq_2.fastq.gz \
  | samtools view -F 256 -F 512 -F 4 -F 2048 -uS \
  | samtools sort -o output_data/illumina_raw_alignment.bam
samtools index output_data/illumina_raw_alignment.bam
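
To make the `-F` filters in the alignment command above easier to read, here is a minimal Python sketch of how SAM flag bits work. It assumes that repeated `-F` options are combined into a single exclude mask (which is our understanding of recent samtools releases). The names `EXCLUDED_FLAGS`, `EXCLUDE_MASK` and `is_kept` are made up for this illustration and are not part of samtools.

```python
# SAM flag bits excluded by the -F options in the alignment command.
EXCLUDED_FLAGS = {
    4: "read unmapped",
    256: "secondary alignment",
    512: "read fails platform/vendor quality checks",
    2048: "supplementary alignment",
}

# Assumption: samtools ORs repeated -F values into one exclude mask.
EXCLUDE_MASK = 0
for bit in EXCLUDED_FLAGS:
    EXCLUDE_MASK |= bit


def is_kept(read_flag, mask=EXCLUDE_MASK):
    """Return True if none of the excluded bits are set in the read's flag."""
    return (read_flag & mask) == 0


print("Combined exclude mask:", EXCLUDE_MASK)
# 99 = paired (1) + proper pair (2) + mate reverse (32) + first in pair (64)
for flag in (99, 4, 355, 2064):
    print(flag, "kept" if is_kept(flag) else "filtered out")
```

If the assumption about combining holds, passing the single summed value to `-F` should select the same reads as the four separate options.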

In [None]:
#@markdown << click to show alignment results
import igv_notebook
igv_notebook.init()
b = igv_notebook.Browser(
    {
        "genome": "ASM985889v3",
        "locus": "NC_045512.2:1-300",
        "tracks": [
          {
            "name": "Local BAM",
            "path": "/content/output_data/illumina_raw_alignment.bam",
            "indexPath": "/content/output_data/illumina_raw_alignment.bam.bai",
            "type": "alignment",
            "format": "bam",
            "showSoftClips": True,
            "colorBy": "strand"
           }
        ]
    }
)

In [None]:
!echo "Example of an Illumina read"
!zcat example_data/illumina_fastq_1.fastq.gz | head -n 4
!echo "Example of a Nanopore read"
!zcat example_data/nanopore_fastq.fastq.gz | head -n 4
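
The four lines printed for each read above form one FASTQ record: a header starting with `@`, the sequence, a separator starting with `+`, and one quality character per base. The sketch below decodes such a record in plain Python. It assumes Phred+33 quality encoding, which is the usual convention for current Illumina and Nanopore data but is not verified here against the example files. The function name `parse_fastq_record` and the example read are hypothetical.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into a dictionary.

    Assumes Phred+33 encoding: quality score = ASCII code of the character - 33.
    """
    if len(lines) != 4:
        raise ValueError("a FASTQ record has exactly four lines")
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@"):
        raise ValueError("header line must start with '@'")
    if not separator.startswith("+"):
        raise ValueError("separator line must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    return {
        "name": header[1:],
        "sequence": sequence,
        "phred": [ord(char) - 33 for char in quality],
    }


# Hypothetical record, not taken from the course data.
example = ["@example_read\n", "ACGT\n", "+\n", "II5!\n"]
record = parse_fastq_record(example)
print(record["name"], record["sequence"], record["phred"])
```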

In [None]:
%%bash
source activate base; conda activate Data_cleanup

fastp \
  --in1 example_data/illumina_fastq_1.fastq.gz --in2 example_data/illumina_fastq_2.fastq.gz \
  --out1 output_data/illumina_fastq_1.fastq --out2 output_data/illumina_fastq_2.fastq \
  --unpaired1 output_data/illumina_fastq_unpaired.fastq --unpaired2 output_data/illumina_fastq_unpaired.fastq \
  --detect_adapter_for_pe

In [None]:
# Rewrite http:// links in the generated HTML report(s) to https://
!sed -i 's|http://|https://|g' ./*.html
from IPython.display import HTML

HTML(filename="/content/fastp.html")

In [None]:
%%bash
source activate base; conda activate Data_cleanup

ampligone -v

ampligone --input output_data/illumina_fastq_1.fastq -o output_data/illumina_fastq_1_cleaned.fastq -ref source/extra/GCF_009858895_2_ASM985889v3_genomic.fasta -pr source/extra/articv3.bed -at end-to-mid -ep output_data/illumina_1_found-primers.bed
ampligone --input output_data/illumina_fastq_2.fastq -o output_data/illumina_fastq_2_cleaned.fastq -ref source/extra/GCF_009858895_2_ASM985889v3_genomic.fasta -pr source/extra/articv3.bed -at end-to-mid -ep output_data/illumina_2_found-primers.bed
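
The primer coordinates passed with `-pr` and the found primers exported with `-ep` are BED files. As a rough sketch of how such a file can be inspected, the Python below parses one BED line. It relies only on the general BED convention (tab-separated columns, 0-based start, end not included) and reads just the first four columns. The function name `parse_bed_line` and the example line are hypothetical and are not taken from `articv3.bed` or from AmpliGone's output.

```python
def parse_bed_line(line):
    """Parse the first four columns of a tab-separated BED line.

    BED coordinates are 0-based and half-open, so length = end - start.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    start, end = int(fields[1]), int(fields[2])
    return {
        "chrom": fields[0],
        "start": start,
        "end": end,
        "name": fields[3] if len(fields) > 3 else "",
        "length": end - start,
    }


# Hypothetical primer entry for illustration only.
primer = parse_bed_line("NC_045512.2\t30\t54\texample_primer_LEFT\n")
print(primer)
```

To look at a real file, the same function could be applied to each non-empty line, for example those of `output_data/illumina_1_found-primers.bed` once the cell above has finished.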