### Prepare

In [None]:
# Create anvio232 virtual env
conda create -n anvio232 -c bioconda -c conda-forge gsl anvio=2.3.2
# Activate anvio232 virtual env
source activate anvio232
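
Once the environment is active, it can help to confirm that the tools used later in this notebook are on `PATH`. The following is a minimal sketch: `missing_tools` is a hypothetical helper (not part of anvi'o), and the tool list is simply taken from the commands below. MEGAHIT is left out because it is called by absolute path here.

```shell
# missing_tools is a hypothetical helper: it prints the name of every
# argument that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Anything printed here still needs to be installed or activated.
missing_tools bowtie2-build bowtie2 samtools anvi-gen-contigs-database anvi-run-hmms anvi-profile anvi-merge
```

No output means every listed tool was found.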

In [None]:
# Create working directory
mkdir anvio-work
cd anvio-work

### Co-assembly

In [None]:
# prepare reads
cp /d5/lin/Bee_microbiome/Fq/HN1.nobee_qc_1.fastq ./
cp /d5/lin/Bee_microbiome/Fq/HN1.nobee_qc_2.fastq ./
cp /d5/lin/Bee_microbiome/Fq/SX3.nobee_qc_1.fastq ./
cp /d5/lin/Bee_microbiome/Fq/SX3.nobee_qc_2.fastq ./

In [None]:
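# Co-assembly (using MEGAHIT)
# Single-sample form, for reference only (MEGAHIT stops if the output directory already exists):
# /home/lin/software/megahit/bin/megahit -1 AB1.nobee_qc_1.fastq -2 AB1.nobee_qc_2.fastq -o assembly
# Co-assembly of both samples: pass the read files as comma-separated lists
/home/lin/software/megahit/bin/megahit -1 HN1.nobee_qc_1.fastq,SX3.nobee_qc_1.fastq -2 HN1.nobee_qc_2.fastq,SX3.nobee_qc_2.fastq --min-contig-len 100 -o assembly -t 10
# --min-contig-len is usually set to 1000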
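
With more than two samples, typing the comma-separated read lists by hand gets error-prone. A possible sketch for building them from a list of sample names follows; `join_reads` is a hypothetical helper, and the command is only printed (dry run), not executed.

```shell
# join_reads is a hypothetical helper: given a file suffix and sample names,
# print "<sample1><suffix>,<sample2><suffix>,..."
join_reads() {
    suffix=$1; shift
    out=""
    for sample in "$@"; do
        out="${out:+$out,}${sample}${suffix}"
    done
    printf '%s\n' "$out"
}

samples="HN1 SX3"
r1=$(join_reads .nobee_qc_1.fastq $samples)
r2=$(join_reads .nobee_qc_2.fastq $samples)

# Dry run: print the MEGAHIT command instead of running it
echo /home/lin/software/megahit/bin/megahit -1 "$r1" -2 "$r2" --min-contig-len 100 -o assembly -t 10
```

Removing the `echo` would run the co-assembly itself.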

In [None]:
# Convert format
# Simplify contig names; add --min-len 2500 to also remove short contigs
anvi-script-reformat-fasta assembly/final.contigs.fa -o contigs.fa  --simplify-names --report name_conversions.txt

![image.png](attachment:image.png)

### Map

In [None]:
# Build an index for our contigs
bowtie2-build contigs.fa contigs

# Map reads to the co-assembly
mkdir map
# For HN sample
bowtie2 --threads 10 -x contigs -1 HN1.nobee_qc_1.fastq -2 HN1.nobee_qc_2.fastq -S map/sample_HN.sam
samtools view -F 4 -bS map/sample_HN.sam > map/sample_HN-raw.bam
anvi-init-bam map/sample_HN-raw.bam -o map/sample_HN.bam
# For SX sample
bowtie2 --threads 10 -x contigs -1 SX3.nobee_qc_1.fastq -2 SX3.nobee_qc_2.fastq -S map/sample_SX.sam
samtools view -F 4 -bS map/sample_SX.sam > map/sample_SX-raw.bam
anvi-init-bam map/sample_SX-raw.bam -o map/sample_SX.bam
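
The three mapping steps are identical for every sample, so they can be generated in a loop. A minimal sketch, assuming the same file naming as above; `map_commands` is a hypothetical helper that only prints the commands so they can be inspected first.

```shell
# map_commands is a hypothetical helper: print the mapping commands for one sample.
#   $1 = label used in output file names (e.g. HN)
#   $2 = read-file prefix (e.g. HN1)
map_commands() {
    label=$1
    reads=$2
    echo "bowtie2 --threads 10 -x contigs -1 ${reads}.nobee_qc_1.fastq -2 ${reads}.nobee_qc_2.fastq -S map/sample_${label}.sam"
    echo "samtools view -F 4 -bS map/sample_${label}.sam > map/sample_${label}-raw.bam"
    echo "anvi-init-bam map/sample_${label}-raw.bam -o map/sample_${label}.bam"
}

# Each entry is label:read-prefix
for pair in HN:HN1 SX:SX3; do
    map_commands "${pair%%:*}" "${pair#*:}"
done
```

Piping the loop output to `sh` would execute the printed commands in order.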

![image.png](attachment:image.png)

#### At this point we have contigs.fa (from the co-assembly) and our BAM files (from mapping).
#### We can therefore start our anvi'o journey.

### Create an anvi’o contigs database

In [None]:
anvi-gen-contigs-database -f contigs.fa -o contigs.db 
# When running this command, anvi-gen-contigs-database will,
# 1. Compute k-mer frequencies for each contig (the default is 4, but you can change it using --kmer-size parameter if you feel adventurous).
# 2. Soft-split contigs longer than 20,000 bp into smaller ones (you can change the split size using the --split-length). When the gene calling step is not skipped, the process of splitting contigs will consider where genes are and avoid cutting genes in the middle. For very very large assemblies this process can take a while, and you can skip it with --skip-mindful-splitting flag.
# 3. Identify open reading frames using Prodigal, the bacterial and archaeal gene finding program developed at Oak Ridge National Laboratory and the University of Tennessee. If you don’t want gene calling to be done, you can use the flag --skip-gene-calling to skip it. If you have your own gene calls, you can provide them to be used to identify where genes are in your contigs. All you need to do is to use the parameter --external-gene-calls .
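
To make the k-mer step more concrete, here is a small illustration of what counting 4-mers in a sequence means. This is only a sketch and not anvi'o's implementation: `kmer_counts` is a hypothetical helper, it works on a single sequence string, and it does not merge reverse-complement k-mers.

```shell
# kmer_counts is a hypothetical helper: slide a window of width k along the
# sequence and count how often each k-mer occurs.
#   $1 = sequence, $2 = k
kmer_counts() {
    printf '%s\n' "$1" | awk -v k="$2" '
        {
            for (i = 1; i + k - 1 <= length($0); i++)
                count[substr($0, i, k)]++
        }
        END { for (kmer in count) print kmer, count[kmer] }
    ' | sort
}

kmer_counts ACGTACGT 4
# ACGT 2
# CGTA 1
# GTAC 1
# TACG 1
```

Dividing each count by the total number of windows gives the frequency profile for the contig.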

### Identify single-copy genes (optional)

In [None]:
anvi-run-hmms -c contigs.db --num-threads 10
# When running this command, anvi-run-hmms will
# use several default bacterial single-copy core gene collections and search your genes for hits to them using HMMER.

### Create an anvi’o profile database, which stores sample-specific information about contigs

In [None]:
# Each sample should have a profile database.
# For HN sample
anvi-profile -i map/sample_HN.bam -c contigs.db --min-contig-length 10000 --output-dir profiledb_HN
# For SX sample
anvi-profile -i map/sample_SX.bam -c contigs.db --min-contig-length 10000 --output-dir profiledb_SX
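
Because the assembly above kept contigs as short as 100 bp while profiling uses `--min-contig-length 10000`, it is worth checking how many contigs actually pass the cutoff. A minimal sketch with `awk`; `count_contigs_min_len` is a hypothetical helper, demonstrated on a tiny made-up FASTA file.

```shell
# count_contigs_min_len is a hypothetical helper: count FASTA records whose
# total sequence length is at least the given threshold.
#   $1 = FASTA file, $2 = minimum length
count_contigs_min_len() {
    awk -v min="$2" '
        /^>/ { if (seen && len >= min + 0) n++; seen = 1; len = 0; next }
             { len += length($0) }
        END  { if (seen && len >= min + 0) n++; print n + 0 }
    ' "$1"
}

# Demonstration: c1 has 12 bases (over two lines), c2 has 3 bases
demo=$(mktemp)
printf '>c1\nACGTACGT\nACGT\n>c2\nACG\n' > "$demo"
count_contigs_min_len "$demo" 10   # prints 1
rm -f "$demo"
```

On the real data the call would be `count_contigs_min_len contigs.fa 10000`; a very small number suggests lowering the profiling cutoff.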

![image.png](attachment:image.png)

![image.png](attachment:image.png)

### Binning

#### As of version 6, anvi’o no longer runs a default binning program with anvi-merge. Binning within anvi’o is now handled by anvi-cluster-contigs, and external binning results can also be imported.
#### Here, however, we use version 2.3.2, where anvi-merge still runs CONCOCT by default.

In [None]:
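# Merge the profiles; in this version anvi-merge also runs CONCOCT binning by default
anvi-merge */PROFILE.db -o bin -c contigs.db --enforce-hierarchical-clustering
# In anvi'o 6+, binning is a separate step run on the merged profile, for example:
# anvi-cluster-contigs -p bin/PROFILE.db -c contigs.db -C CONCOCT --driver concoct
# Available values for --driver: concoct, metabat2, maxbin2, dastool, binsanity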

In [None]:
# List the collections stored in the merged profile
anvi-script-get-collection-info -p bin/PROFILE.db -c contigs.db --list-collections
# or
anvi-show-collections-and-bins -p bin/PROFILE.db

##### However, no collection was produced in this demo.

![image.png](attachment:image.png)

### Visualization

#### The interactive interface lets us browse the data intuitively: it shows multiple aspects of the data at once and can be used to visualize the results of unsupervised binning, perform supervised binning, or refine existing bins.

#### In fact, we can use the interactive interface at every step.

In [None]:
# Default view, without loading a collection
anvi-interactive -p bin/PROFILE.db -c contigs.db
# For a specific collection
anvi-interactive -p bin/PROFILE.db -c contigs.db -C CONCOCT

### Summary

In [None]:
# List the available collections
anvi-summarize -p bin/PROFILE.db -c contigs.db -o summary --list-collections
# Summarize a specific collection
anvi-summarize -p bin/PROFILE.db -c contigs.db -o summary -C CONCOCT
# When running this command, anvi-summarize will
# compute completion and redundancy estimates for each bin in the collection and store them in the output directory.

### Refine bin

In [None]:
anvi-refine -p bin/PROFILE.db -c contigs.db -b Bin_4 -C CONCOCT