In [7]:
from IPython.display import Image, display, HTML

image_path = 'trimmomatic.png'
display(HTML(f'<div style="text-align:center;"><img src="{image_path}" /></div>'))


# **Trimmomatic** #

## **Adapter Trimming** ##



>- **Trimming of adapter sequences from short read data is a common preprocessing step during NGS data analysis.**
>- **Adapter sequences should be removed from reads because they interfere with downstream analyses, such as**
>>- **alignment of reads to a reference.**
>- **The adapters contain**
>>- **the sequencing primer binding sites,**
>>- **the index sequences, and**
>>- **the sites that allow library fragments to attach to the flow cell lawn.**



## **Tool(s)** ##

>- **Trimmomatic offers a wide range of trimming and filtering options, which makes it versatile across sequencing platforms and experimental setups.**
>- **It provides algorithms for**
>>- **adapter removal and**
>>- **quality trimming.**
>- **Its flexible command-line interface lets users customize the trimming strategy to their specific requirements.**

## **How the tool works:** ##
>- **Trimmomatic operates on sequencing data in FASTQ format.**
>- **It uses a sliding window approach to find low-quality regions: it scans each read from the 5' end and cuts the read once the average quality within the window falls below a threshold.**
>- **It uses a FASTA file of adapter sequences to detect and remove adapter contamination from reads.**
>- **Each processing step (adapter clipping, quality trimming, length filtering) is a separate module, and the steps are applied in the order given on the command line.**
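
The sliding window idea can be illustrated with a short Python sketch. It is a simplified illustration, not Trimmomatic's own implementation: the function names are hypothetical, Phred+33 encoding is assumed, and the exact cut position in Trimmomatic may differ from the one chosen here (the start of the first failing window).

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in quality_string]


def sliding_window_keep_length(scores, window=4, threshold=20):
    """Return how many bases to keep from the 5' end of a read.

    The read is scanned from the 5' end. At the first window whose mean
    quality is below `threshold`, the read is cut at the start of that
    window (a simplification chosen for this sketch).
    """
    for start in range(len(scores) - window + 1):
        window_scores = scores[start:start + window]
        if sum(window_scores) / window < threshold:
            return start
    # No failing window (or the read is shorter than the window): keep all.
    return len(scores)


sequence = "ACGTACGTACGT"
quality = "IIIIIIII####"  # 'I' decodes to 40, '#' decodes to 2
keep = sliding_window_keep_length(phred33_scores(quality))
print(sequence[:keep])
```

With `window=4` and `threshold=20`, this mirrors the `SLIDINGWINDOW:4:20` setting used in this notebook.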

## **Installation:** ##
```bash
conda install -c bioconda trimmomatic
```

## **Input file:** ##
- `<forward_reads.fastq.gz>`
- `<reverse_reads.fastq.gz>`


## **Adapter file:** ##
- `TruSeq3-PE-2.fa` (shipped in the `adapters` directory of the Trimmomatic distribution)


## **Standard use:** ##
```bash
trimmomatic PE -threads 92 \
    <forward_reads.fastq.gz> <reverse_reads.fastq.gz> \
    <forward_reads_Trim_P.fastq.gz> <forward_reads_Trim_S.fastq.gz> \
    <reverse_reads_Trim_P.fastq.gz> <reverse_reads_Trim_S.fastq.gz> \
    ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:36
```

### **Options used in the command:**


- `PE`: Run in paired-end mode, taking two input files.
- `-threads 92`: The number of threads to use.
- `forward_reads.fastq.gz`: The first (forward) input file.
- `reverse_reads.fastq.gz`: The second (reverse) input file.
- `forward_reads_Trim_P.fastq.gz`: The output file for forward reads whose mate also survived trimming (surviving pairs).
- `forward_reads_Trim_S.fastq.gz`: The output file for forward reads that survived trimming but whose mate was discarded (single reads).
- `reverse_reads_Trim_P.fastq.gz`: The output file for reverse reads whose mate also survived trimming (surviving pairs).
- `reverse_reads_Trim_S.fastq.gz`: The output file for reverse reads that survived trimming but whose mate was discarded (single reads).
- `ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10`: Clip Illumina adapters from the reads using the adapter sequences listed in `TruSeq3-PE-2.fa`. The numbers `2:30:10` are the maximum number of seed mismatches (2), the palindrome clip threshold (30), and the simple clip threshold (10).
- `SLIDINGWINDOW:4:20`: Scan each read with a window of 4 bases and cut the read once the average Phred score within the window falls below 20.
- `MINLEN:36`: Discard any read that has fewer than 36 bases remaining when this step is reached.
- `LEADING:20`: Remove bases from the start of a read while their Phred score is below 20.
- `TRAILING:20`: Remove bases from the end of a read while their Phred score is below 20.

Trimmomatic applies the steps in the order they appear on the command line, so `MINLEN` only enforces the final read length when it comes after the other trimming steps.
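
To show how these options fit together, here is a minimal Python sketch that assembles the same argument list. The function name `build_trimmomatic_pe_command`, its defaults, and the way output names are derived from the input names are assumptions made for this example. Placing `MINLEN` last is an ordering choice of the sketch, so that the length filter sees the fully trimmed read.

```python
def build_trimmomatic_pe_command(forward, reverse,
                                 adapters="TruSeq3-PE-2.fa", threads=4):
    """Return a Trimmomatic paired-end command as a list of arguments."""

    def outputs(path):
        # Derive the paired (_Trim_P) and single (_Trim_S) output names.
        suffix = ".fastq.gz"
        stem = path[:-len(suffix)] if path.endswith(suffix) else path
        return [f"{stem}_Trim_P{suffix}", f"{stem}_Trim_S{suffix}"]

    return [
        "trimmomatic", "PE",
        "-threads", str(threads),          # options go before the file names
        forward, reverse,
        *outputs(forward),
        *outputs(reverse),
        f"ILLUMINACLIP:{adapters}:2:30:10",
        "LEADING:20",
        "TRAILING:20",
        "SLIDINGWINDOW:4:20",
        "MINLEN:36",                       # last, so it checks the final length
    ]


command = build_trimmomatic_pe_command("forward_reads.fastq.gz",
                                       "reverse_reads.fastq.gz")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where Trimmomatic is installed and the adapter file is reachable at the given path.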


#### **Results:** ####

- ` forward_reads_Trim_P.fastq.gz`
- ` forward_reads_Trim_S.fastq.gz`
- ` reverse_reads_Trim_P.fastq.gz`
- ` reverse_reads_Trim_S.fastq.gz`
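
A quick way to check the results is to count the reads in each output file. The sketch below assumes standard four-line FASTQ records and gzip-compressed files; `count_fastq_reads` is a hypothetical helper, not part of Trimmomatic. Files that do not exist are skipped.

```python
import gzip
import os


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


output_files = [
    "forward_reads_Trim_P.fastq.gz",
    "forward_reads_Trim_S.fastq.gz",
    "reverse_reads_Trim_P.fastq.gz",
    "reverse_reads_Trim_S.fastq.gz",
]

for name in output_files:
    if os.path.exists(name):
        print(name, count_fastq_reads(name))
```

The two `_Trim_P` files should contain the same number of reads, because a pair is written there only when both mates survive.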



In [None]:
trimmomatic \
    PE -threads 92 \
    "$fastq_1" "$fastq_2" \
    forward_reads_Trim_P.fastq.gz forward_reads_Trim_S.fastq.gz \
    reverse_reads_Trim_P.fastq.gz reverse_reads_Trim_S.fastq.gz \
    ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 \
    LEADING:20 \
    TRAILING:20 \
    SLIDINGWINDOW:4:20 \
    MINLEN:36