## Compare VAMB as a binner to other bin outputs
- Compare results from VAMB to the results when inputting mgmanysearch results
- Take set of 10 SRAs, symlink, and bin (from same study)
- Compare those bins to the original bins: number and quality.
- Then look at the input files for VAMB and see if we can create them with smash


### To run vamb
- make bowtie2 db from all contigs
- map to all contigs
- sam --> bam
- feed info to vamb

### Both VAMB and metabat2 use a depth txt file
This file is created by the `jgi_summarize_bam_contig_depths` command, which ships with MetaBAT 2.
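
Before feeding a depth file to either binner, it can help to check its shape. The following is a minimal sketch, assuming the usual tab-separated layout: `contigName`, `contigLen`, `totalAvgDepth`, then one depth column per BAM, plus a matching `-var` column per BAM unless `--noIntraDepthVariance` was used. The file `toy_depth.txt` and all values in it are made up for illustration.

```shell
# Toy depth file standing in for a real depth file (all values are invented)
printf 'contigName\tcontigLen\ttotalAvgDepth\tA.bam\tA.bam-var\tB.bam\tB.bam-var\n' > toy_depth.txt
printf 'contig_1\t2500\t12.5\t10.0\t3.1\t2.5\t0.8\n' >> toy_depth.txt
printf 'contig_2\t1800\t4.0\t1.0\t0.2\t3.0\t1.1\n' >> toy_depth.txt

# Number of contigs = number of rows after the header
n_contigs=$(awk 'NR > 1' toy_depth.txt | wc -l | tr -d ' ')

# Number of samples = header fields after the first three that do not end in -var
n_samples=$(awk -F'\t' 'NR == 1 { n = 0; for (i = 4; i <= NF; i++) if ($i !~ /-var$/) n++; print n }' toy_depth.txt)

# Are variance columns present?
has_var=$(awk -F'\t' 'NR == 1 { v = "no"; for (i = 4; i <= NF; i++) if ($i ~ /-var$/) v = "yes"; print v }' toy_depth.txt)

echo "contigs=$n_contigs samples=$n_samples variance_columns=$has_var"
```

Pointing the same `awk` calls at a real depth file should show whether the sample count matches the number of BAM files that went in.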


In [None]:
# Contig coverage depth quantification with MetaBAT 2, first with --noIntraDepthVariance (no variance columns)

mamba activate metabat2
jgi_summarize_bam_contig_depths --noIntraDepthVariance --outputDepth cov_depth_novar.txt coassembly_alignments/*.bam

# with variance
jgi_summarize_bam_contig_depths --outputDepth cov_depth_var.txt coassembly_alignments/*.bam


In [None]:
# creating bins with either vamb or metabat:
# METABAT
metabat -i contigs.fa \
-a metabat_depth.txt \
-o metabat_bins -t 20


# VAMB (min size of 50kb as a bin)
vamb --outdir ./vamb_bins \
--fasta contigs.fa \
--jgi ./cov_depth.txt \
--minfasta 50000 

In [None]:
# run with the depth file that includes intra-depth variance

srun --account=ctbrowngrp -p med2 -J metabat2 -t 10:00:00 -c 8 --mem=40gb --pty bash
metabat2 -m 1500 -i bowtie2/contigs.fa -o metabat_bins -a metabat_depth.txt -t 8
 
srun --account=ctbrowngrp -p med2 -J vamb -t 10:00:00 -c 1 --mem=40gb --pty bash
vamb --outdir ./vamb_bins \
--fasta bowtie2/contigs.fa \
--jgi ./metabat_depth.txt \
--minfasta 50000

In [None]:
# symlink the 11 SRAs
ln -s ../../2023-swine-sra/atlas/atlas_ERR113518* . 
ln -s ../../2023-swine-sra/atlas/atlas_ERR113517* . 

In [None]:
# symlink the reads
ln -s ../../swine_SRA/atlas_ERR*/ERR*/sequence_quality_control/ERR*_QC_R*.gz .

# srun for threads
srun --account=ctbrowngrp -p med2 -J fmg -t 10:00:00 -c 10 --mem=40gb --pty bash

# concat all contigs (n=349,706)
cat ../swine_SRA/atlas_ERR11351*/ERR*/ERR*.fasta > contigs.fa

# create bowtiedb
mamba activate bowtie2
bowtie2-build contigs.fa contigs_db -p 10

# Mapping
cd reads

for f in *_R1.*
do
bowtie2 -p 10 -x ../contigs_db \
-1 "$f" \
-2 "${f%_R1*}_R2.fastq.gz" \
-S ../samfiles/"${f%_QC*}".sam \
--sensitive
done

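# Compression: drop unmapped reads (-F 4), convert to BAM, sort
mamba activate samtools
# the SAM files were written to ../samfiles by the mapping step
cd ../samfiles
for f in *.sam
do
samtools view -@ 10 -F 4 -bS "$f" | samtools sort > "${f%.sam*}.bam"
done

# index each BAM separately
for b in *.bam
do
samtools index "$b"
done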

In [None]:
# srun for threads
srun --account=ctbrowngrp -p med2 -J samtool -t 2:00:00 -c 10 --mem 40gb --pty bash
srun --account=ctbrowngrp -p med2 -J fmg_bin -t 36:00:00 -c 24 --mem 50gb --pty bash

In [None]:
# Use checkm2 to check the created bins
srun --account=ctbrowngrp -p med2 -J checkm2 -t 2:00:00 -c 10 --mem 40gb --pty bash

mamba activate checkm2
checkm2 predict --threads 10 --input metabat_bins -x .fa --output-directory checkm_results/metabat
checkm2 predict --threads 10 --input vamb_bins/bins --output-directory checkm_results/vamb
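
To compare the two binners on amount and quality, one option is to summarize each CheckM2 report. This is a minimal sketch, assuming the report is a tab-separated `quality_report.tsv` with `Name`, `Completeness`, and `Contamination` columns. The cutoffs of at least 90% completeness and at most 5% contamination are a commonly used choice, not something taken from these notes. The function name `summarize_bins` and the toy file are hypothetical.

```shell
# Toy stand-in for a CheckM2 quality report (all values are invented)
printf 'Name\tCompleteness\tContamination\n' > toy_quality_report.tsv
printf 'bin.1\t95.2\t1.3\n' >> toy_quality_report.tsv
printf 'bin.2\t72.0\t4.8\n' >> toy_quality_report.tsv
printf 'bin.3\t91.0\t8.5\n' >> toy_quality_report.tsv

summarize_bins() {
  # $1: path to a quality report TSV; columns are located by header name
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      total++
      # assumed cutoffs: completeness >= 90 and contamination <= 5
      if ($(col["Completeness"]) >= 90 && $(col["Contamination"]) <= 5) hq++
    }
    END { printf "%d\t%d\n", total, hq }
  ' "$1"
}

# Prints: total bins, then bins passing both cutoffs (tab-separated)
summary=$(summarize_bins toy_quality_report.tsv)
echo "$summary"
```

Running `summarize_bins` on the report in `checkm_results/metabat` and on the one in `checkm_results/vamb` would give two rows that can be compared side by side.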