
## **Tool: Unicycler**

**Authors: Ryan R. Wick, Louise M. Judd, Claire L. Gorrie, Kathryn E. Holt**


**Year: 2017**



### **Use of the tool:** ###

Unicycler is a bioinformatics tool for de novo assembly of bacterial genomes. It can assemble short-read data alone, long-read data alone, or both together in a hybrid assembly, where the short reads provide base-level accuracy and the long reads help resolve repeats. Unicycler is designed specifically for bacterial genomes, including those with repetitive regions, plasmids, and complex genomic structures, and it aims to produce accurate and complete assemblies.

#### **Distinguishing features:** ####  

Unicycler differs from many other assembly tools by working on the assembly graph rather than only on finished contigs. It first builds an initial assembly graph from the short reads with the de Bruijn graph assembler SPAdes, then uses read alignments (including long reads, when provided) to resolve repetitive regions in that graph, and finally uses the short reads to correct remaining errors. When both short and long reads are supplied, this hybrid approach can yield highly accurate and complete bacterial genome assemblies.

#### **How the tool works:** #### 

Unicycler works in several steps. It starts with a preliminary de novo assembly of the short reads using the SPAdes assembler. This step produces an assembly graph of contigs and the connections between them, which can include circular contigs representing plasmids or other circular genomic elements. Unicycler then resolves repetitive regions by bridging contigs in the graph, using long-read alignments when long reads are available. Finally, it maps the short reads back to the contigs and uses them to correct errors; this correction is repeated until no further changes are made. Sequences that form a closed loop in the final graph are reported as circular.

#### **Installation:** #### 
``` bash
conda install -c bioconda unicycler

```

#### **Input files:** #### 

To run Unicycler on short reads, you need sequencing data in FASTQ format (gzip-compressed files are accepted). The tool accepts paired-end reads (`-1` and `-2`) and unpaired reads (`-s`). Additionally, you may provide long-read data in FASTQ or FASTA format (`-l`) for improved assembly results.
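Before starting an assembly, it can help to confirm that the forward and reverse files contain the same number of reads, since a mismatch usually points to a truncated or mismatched file. The sketch below is not part of Unicycler; the function names `count_reads` and `check_pair` are made up for this example, and it assumes standard FASTQ with exactly four lines per read.

```bash
# Count reads in a FASTQ file, assuming 4 lines per record.
# Files ending in .gz are decompressed on the fly.
count_reads() {
    case "$1" in
        *.gz) lines=$(gzip -cd "$1" | wc -l) ;;
        *)    lines=$(wc -l < "$1") ;;
    esac
    echo $(( lines / 4 ))
}

# Print both counts; succeed only if forward and reverse counts match.
check_pair() {
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    echo "forward=$n1 reverse=$n2"
    [ "$n1" -eq "$n2" ]
}

# Example with hypothetical file names:
# check_pair sample_R1_001.fastq.gz sample_R2_001.fastq.gz || echo "read counts differ" >&2
```

Equal read counts do not prove that the files are correctly paired, but an unequal count is a cheap early warning before a long assembly run.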

#### **Standard use:** #### 

``` bash 
unicycler -1 <forward_reads.fastq> -2 <reverse_reads.fastq> -o <output_directory> --min_fasta_length 300 -t 85 --keep 1

```

##### **Options used in the command:** #####


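- `-1 <forward_reads.fastq>`: Specifies the input file containing forward reads in FASTQ format.
- `-2 <reverse_reads.fastq>`: Specifies the input file containing reverse reads in FASTQ format.
- `-o <output_directory>`: Specifies the output directory where the results will be saved.
- `--min_fasta_length 300`: Excludes contigs shorter than 300 bp from the final FASTA file.
- `-t 85`: Sets the number of threads. Choose a value that does not exceed the number of CPU cores available on your machine.
- `--keep 1`: Sets the level of intermediate file retention. Level 0 keeps only the final files, level 1 also saves the assembly graphs from the main checkpoints, and higher levels keep progressively more temporary files.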


##### **Additional options:** ##### 

Unicycler provides additional options to fine-tune the assembly process, such as supplying long-read data (`-l`), changing the assembly mode (`--mode`), and adjusting the number of threads (`-t`). Run `unicycler --help` or refer to the tool's documentation for a comprehensive list of options and their usage.
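As an illustration of these options, the sketch below shows a hybrid run that adds long reads with `-l` and sets the mode with `--mode` (accepted values are `conservative`, `normal`, and `bold`). The file names, the output directory, and the small `run` wrapper are assumptions made for this example. The wrapper only prints the command by default, so the call can be checked before committing to a long run.

```bash
# Print a command; execute it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hybrid assembly: paired short reads plus long reads (hypothetical file names).
run unicycler \
    -1 short_R1.fastq.gz \
    -2 short_R2.fastq.gz \
    -l long_reads.fastq.gz \
    --mode normal \
    -t 8 \
    -o hybrid_out
```

Setting `DRY_RUN=0` in the shell before calling `run` executes the command instead of only printing it.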


##### **Results:** ##### 

The output files of Unicycler include:

- `assembly.fasta`: The final assembly in FASTA format.
- `assembly.gfa`: The final assembly graph in GFA format.
- `unicycler.log`: A log file containing information about the assembly process.
- Intermediate files, such as assembly graphs saved at earlier stages; how many are kept depends on the `--keep` level.

##### **Structure of important result files:** #####

`assembly.fasta`: This file contains the final assembled genome, including the main chromosome and any plasmids or other circular genomic elements. It can be used for downstream analyses, such as gene prediction, annotation, comparative genomics, etc.
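A quick way to inspect the final FASTA file is to list each contig with its length. The sketch below uses a hypothetical helper, `summarize_assembly`, which computes lengths from the sequence lines themselves. It also reports whether the header contains the tag `circular=true`; that tag is an assumption about the header format, so check it against the headers in your own output.

```bash
# Print one tab-separated line per contig: name, length in bp, circular (yes/no).
summarize_assembly() {
    awk '
        function emit() { if (name != "") print name "\t" len "\t" circ }
        /^>/ {
            emit()                       # finish the previous contig, if any
            name = substr($1, 2)         # first word of the header without ">"
            circ = ($0 ~ /circular=true/) ? "yes" : "no"
            len = 0
            next
        }
        { len += length($0) }            # sequence lines add to the current contig
        END { emit() }
    ' "$1"
}

# Example with a hypothetical path:
# summarize_assembly 3_Unicycler/assembly.fasta
```

For a typical bacterial isolate, one long contig for the chromosome plus a few shorter ones for plasmids suggests a near-complete assembly, while many short contigs are expected from short reads alone.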


``` bash
conda activate bioinfo
```

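``` bash
rawpath="/home/asus/Desktop/CHRF_Project/Bioinformatics_Traning/Advance_Bioinformatics_Traning/Data"

# out_dir="/media/chrf/GenomeBack/SPN/All_Batchs_SPN"

# Loop over forward-read files only, so that each sample is handled once.
for r1 in "$rawpath"/*_R1_001.fastq.gz
do
    # Sample name = first two underscore-separated fields of the file name.
    name=$(basename "$r1" | cut -f 1,2 -d '_')
    echo "$name"

    # Move the read pair into a per-sample raw-data directory.
    mkdir -p -v "$rawpath/$name/1_RawData"
    mv "$rawpath/$name"*_R1_001.fastq.gz "$rawpath/$name"*_R2_001.fastq.gz "$rawpath/$name/1_RawData/"

    # Assemble the sample, then rename the final assembly after the sample.
    unicycler -1 "$rawpath/$name/1_RawData/$name"*_R1*.fastq.gz -2 "$rawpath/$name/1_RawData/$name"*_R2*.fastq.gz --min_fasta_length 300 -t 85 --keep 1 -o "$rawpath/$name/3_Unicycler/"
    mv "$rawpath/$name/3_Unicycler/assembly.fasta" "$rawpath/$name/3_Unicycler/${name}_contigs.fasta"

    break  # Stops after the first sample; remove this line to process every sample.
done
```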