### Preparing assembly files for binning
To create a depth (coverage) file, the reads must first be re-aligned to the contigs. This has been done with bowtie2 (BWA also works). The next step is to create a depth file with MetaBat2, convert it into a form suitable for CONCOCT and MaxBin2, and then run the binners to produce bins.

This all assumes that you have installed the software mentioned here; conda is the quickest way to install it. If needed, the documentation can be found here:

MetaBat2: 

MaxBin2:

CONCOCT: https://github.com/BinPro/CONCOCT
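
Before starting, it can help to confirm that everything is actually installed. The following is a minimal sketch, assuming the executable names match the commands used in this notebook (`bowtie2`, `run_MaxBin.pl` and `checkm` are assumed names for the mapping, MaxBin2 and CheckM installs); `missing_tools` is a hypothetical helper, not part of any of these packages.

```shell
# Print the name of every requested executable that is not on PATH.
# The tool names passed in below are assumptions based on this notebook.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the name cannot be resolved
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

missing=$(missing_tools bowtie2 jgi_summarize_bam_contig_depths metabat2 \
    run_MaxBin.pl concoct cut_up_fasta.py concoct_coverage_table.py checkm)

if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All tools found."
fi
```

If anything is listed as missing, activate the right conda environment (or install the tool) before running the cells below.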

#### MetaBat2
The first piece of code generates a fairly simple text file with the coverage of the contigs in each mapping file. I am still looking into whether the same file can be used for all binners; so far no luck. The second piece of code runs MetaBat2 itself.

In [1]:
#this creates a depth file for MetaBat
#check names of the mapping files
for f in Coral1 Coral2 Coral4 Coral5 T3-21-Mmea T3-4-Past
do
    jgi_summarize_bam_contig_depths --outputDepth ../data/working/metabat_depth_"$f".txt ../../03_mapping/data/results/"$f".bam
done

SyntaxError: invalid syntax (<ipython-input-1-22de35fddd80>, line 3)
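
Since the depth step depends on the names of the mapping files, a quick check that every expected BAM file exists can save a failed run. This is a minimal sketch under the assumption that each sample has a file named `<sample>.bam` in the mapping results directory used above; `missing_bams` is a hypothetical helper.

```shell
# Print the samples whose BAM file is missing from a mapping directory.
# Usage: missing_bams <bam_dir> <sample> [<sample> ...]
missing_bams() {
    bam_dir=$1
    shift
    for f in "$@"; do
        # -f is true only for an existing regular file
        [ -f "$bam_dir/$f.bam" ] || echo "$f"
    done
}

# Any sample printed here has no BAM file under the expected name.
missing_bams ../../03_mapping/data/results Coral1 Coral2 Coral4 Coral5 T3-21-Mmea T3-4-Past
```

No output means that all six files were found.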

In [None]:
#this is the actual MetaBat2 script
#generic form
metabat2 -i assembly.fa.gz -a depth.txt -o resA1/bin -v
#pipeline form: $out, $metabat_len, $threads and $checkm must be set before running this
metabat2 -i "${out}/work_files/assembly.fa" -a "${out}/work_files/metabat_depth.txt" \
    -o "${out}/metabat2_bins/bin" -m "$metabat_len" -t "$threads" --unbinned
if [[ $? -ne 0 ]]; then echo "Something went wrong with running MetaBAT2. Exiting" >&2; exit 1; fi
echo "metaBAT2 finished successfully, and found $(ls "${out}/metabat2_bins" | grep -c '\.fa') bins!"

if [ "$checkm" = true ]; then
    #run_checkm is not defined in this notebook; it appears to be a helper function from the MetaWrap pipeline
    run_checkm "${out}/metabat2_bins"
fi

#### MaxBin2
This creates the depth file for MaxBin2. It has been copied directly from the MetaWrap pipeline, because MaxBin2's own documentation is hard to find. The program itself is not very helpful either, as it has no help message. Considering not running this module.

In [2]:
#this creates the MaxBin depth file SEE IF THERE IS AN ALTERNATIVE
for f in Coral1 Coral2 Coral4 Coral5 T3-21-Mmea T3-4-Past; do
    depth=../data/working/mb2_master_depth_"$f".txt
    jgi_summarize_bam_contig_depths --outputDepth "$depth" --noIntraDepthVariance ../../03_mapping/data/results/"$f".bam
    #count the columns in the header line (contig name, length, total depth, then one column per BAM file)
    A=($(head -n 1 "$depth"))
    N=${#A[*]}
    #the next bit breaks the depth file into one two-column file per BAM column for MaxBin to use
    for i in $(seq 4 $N); do
        sample=$(head -n 1 "$depth" | cut -f $i)
        sample=$(basename "$sample") #drop any directory part, in case the header holds a full path
        grep -v totalAvgDepth "$depth" | cut -f 1,$i > ../data/working/mb2_"${sample%.*}".txt
    done
done

SyntaxError: invalid syntax (<ipython-input-2-70357cb867cc>, line 2)

In [5]:
#maxbin2 final script for processing #documentation for maxbin is unavailable, do we want to continue?
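
If the MaxBin2 module is kept, the call could look roughly like the sketch below. It only prints the commands so they can be inspected first. The contig file name (`<sample>_assembly.fa`), the abundance file location and the `maxbin2_bins` output directory are assumptions made for illustration, and `maxbin_cmd` is a hypothetical helper; the abundance file is meant to be the two-column file produced by the depth-splitting step above.

```shell
# Print (do not run) a MaxBin2 command for one sample.
# Usage: maxbin_cmd <sample> <threads>
# All file and directory names below are assumptions for illustration.
maxbin_cmd() {
    sample=$1
    threads=$2
    echo "run_MaxBin.pl -contig ${sample}_assembly.fa" \
        "-abund ../data/working/mb2_${sample}.txt" \
        "-out ../data/results/maxbin2_bins/${sample}" \
        "-thread ${threads}"
}

# NSLOTS is set by the cluster scheduler; fall back to 1 thread when it is unset.
for f in Coral1 Coral2 Coral4 Coral5 T3-21-Mmea T3-4-Past; do
    maxbin_cmd "$f" "${NSLOTS:-1}"
done
```

Once the printed commands look right, they can be run directly, for example by piping the output to `sh`.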


#### CONCOCT
This set of commands runs CONCOCT in its standard mode. It first creates a depth/coverage file for CONCOCT to use and then runs CONCOCT with the standard settings. This means the k-mer length is set to 4, the minimum contig length is 1000, and CONCOCT runs on exactly the number of slots given to it by Hydra.

CONCOCT creates its depth file from the coverage produced in the mapping step. It is key that these files are all in the correct places before proceeding with binning.

In [6]:
#this creates the CONCOCT depth file
#check filenames
for f in Coral1 Coral2 Coral4 Coral5 T3-21-Mmea T3-4-Past
do
    #this part cuts up the contigs into 10kb pieces for CONCOCT to use !Are the chunks too large?!
    #-c sets the chunk size in bp (10000 = 10kb)
    cut_up_fasta.py "$f".contigs-fixed.fa -c 10000 -o 0 --merge_last -b ../data/working/"$f"_contigs_cut.bed > ../data/working/"$f"_contigs_cut.fa

    #this part estimates contig coverage (the BAM files need to be sorted and indexed)
    concoct_coverage_table.py ../data/working/"$f"_contigs_cut.bed ../../03_mapping/data/results/"$f".bam > ../data/working/coverage_table_"$f".tsv
done


SyntaxError: invalid syntax (<ipython-input-6-7db03b054bf7>, line 4)

In [9]:
#CONCOCT script
#make correct directories (-p does not fail if they already exist)
mkdir -p ../data/results/concoct_bins
mkdir -p ../data/working/concoct_temp
#run the following for all your samples
for f in Coral1 Coral2 Coral4 Coral5 T3-21-Mmea T3-4-Past
do
    #this creates separate directories for all your samples, especially useful at later stages with many samples
    mkdir -p ../data/results/concoct_bins/"$f"_concoct_bins
    mkdir -p ../data/working/concoct_temp/"$f"_concoct_temp
    #this next bit actually runs CONCOCT itself
    concoct --composition_file ../data/working/"$f"_contigs_cut.fa --coverage_file ../data/working/coverage_table_"$f".tsv -t "$NSLOTS" -b ../data/working/concoct_temp/"$f"_concoct_temp/
    #this maps the clustering of the cut-up pieces back onto the original contigs
    merge_cutup_clustering.py ../data/working/concoct_temp/"$f"_concoct_temp/clustering_gt1000.csv > ../data/working/concoct_temp/"$f"_concoct_temp/"$f"_clustering_merged.csv
    #"$f"_assembly.fa must be the original, uncut contig file that was given to cut_up_fasta.py (check filenames)
    extract_fasta_bins.py "$f"_assembly.fa ../data/working/concoct_temp/"$f"_concoct_temp/"$f"_clustering_merged.csv --output_path ../data/results/concoct_bins/"$f"_concoct_bins
done

SyntaxError: invalid syntax (<ipython-input-9-1ee438ed8a32>, line 3)

### Continuing
You should now have three sets of bins, each created with a slightly different algorithm. It is now important to run the CheckM software with the script below and generate output files for all of them. This will tell you the quality of your bins in terms of completeness and contamination. After this, you can proceed to the "Refine Bins" part of the workflow.
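
Before running CheckM, it is worth confirming that each binner actually produced bins. This is a minimal sketch: `count_bins` is a hypothetical helper, and the `metabat2_bins` and `maxbin2_bins` directory names are assumptions (only `concoct_bins` is created in this notebook). It counts `.fa` and `.fasta` files, including those in per-sample subdirectories.

```shell
# Count the FASTA files (bins) in a directory and its subdirectories.
# Prints 0 when the directory does not exist.
count_bins() {
    find "$1" -type f \( -name '*.fa' -o -name '*.fasta' \) 2>/dev/null | wc -l | tr -d ' '
}

# Directory names other than concoct_bins are assumptions; adjust to your layout.
for d in ../data/results/metabat2_bins ../data/results/maxbin2_bins ../data/results/concoct_bins; do
    echo "$d: $(count_bins "$d") bins"
done
```

A count of 0 for any binner usually points to a failed run or a wrong output path, and should be fixed before assessing bin quality.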

