04_MAPPING
eolesin edited this page Feb 23, 2021 · 7 revisions
Yet another loop. Round and round we go... Within the 03_ASSEMBLIES directory:
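# Reformat contig names (dropping contigs shorter than 2500 bp) and build an anvi'o contigs database for each set
for SET in `cat set.txt`
do
    # Write the reformatted FASTA to the path that anvi-gen-contigs-database reads from
    anvi-script-reformat-fasta $SET/$SET-contigs.fa -l 2500 --simplify-names -o 04_CONTIGS/$SET.fa
    anvi-gen-contigs-database -f 04_CONTIGS/$SET.fa -o 04_CONTIGS/$SET-CONTIGS.db
done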
# HMM profiling
for SET in `cat set.txt`; do anvi-run-hmms --num-threads 20 -c 04_CONTIGS/$SET-CONTIGS.db ; done
# Run against NCBI COG database
for file in 04_CONTIGS/*-CONTIGS.db; do anvi-run-ncbi-cogs --num-threads 16 -c $file ; done
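# Build bowtie2 DB for each co-assembly (bowtie2-build takes --threads; paths match the contigs and mapping directories used in the other steps)
for SET in `cat set.txt`; do bowtie2-build --threads 20 04_CONTIGS/$SET.fa 05_MAPPING/$SET; done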
# Map each sample's reads to its co-assembly, then convert, sort, and index the alignments.
# Each line of samples_in_sets.txt holds a set name and a comma-separated sample list, separated by a space.
while read -r SET samples
do
    # Split the comma-separated sample list into an array
    IFS="," read -r -a Smparray <<< "$samples"
    for samp in "${Smparray[@]}"
    do
        bowtie2 --threads 40 \
            -x "05_MAPPING/$SET" \
            -1 "02_HUMAN_Decontam/$samp-cleanR1.fq" \
            -2 "02_HUMAN_Decontam/$samp-cleanR2.fq" \
            --no-unal \
            -S "05_MAPPING/$samp.sam"
        # Keep mapped reads only (-F 4) and write them as BAM
        samtools view -F 4 -bS "05_MAPPING/$samp.sam" > "05_MAPPING/$samp-RAW.bam"
        samtools sort "05_MAPPING/$samp-RAW.bam" -o "05_MAPPING/$samp.bam"
        samtools index "05_MAPPING/$samp.bam"
        # Remove the intermediate SAM and unsorted BAM
        rm "05_MAPPING/$samp.sam" "05_MAPPING/$samp-RAW.bam"
    done
done < samples_in_sets.txt
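Because the mapping loop above can run for hours, it may be worth checking first that every read file it expects is actually on disk. The following is only a minimal sketch and not part of the original pipeline: the function name `check_mapping_inputs` is hypothetical, and it assumes the same `samples_in_sets.txt` layout (set name, a space, then a comma-separated sample list) and the same `<sample>-cleanR1.fq` / `<sample>-cleanR2.fq` naming used in the loop.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the original pipeline):
# list read files the mapping loop would need but that are missing or empty.
# Usage: check_mapping_inputs <sets_file> <reads_dir>
check_mapping_inputs() {
    local sets_file=$1 reads_dir=$2
    local SET samples samp mate missing=0
    local -a Smparray
    # The "|| [ -n "$SET" ]" part also handles a last line without a trailing newline
    while read -r SET samples || [ -n "$SET" ]; do
        [ -z "$SET" ] && continue                      # skip blank lines
        IFS=',' read -r -a Smparray <<< "$samples"     # split the sample list on commas
        for samp in "${Smparray[@]}"; do
            for mate in R1 R2; do
                if [ ! -s "$reads_dir/$samp-clean$mate.fq" ]; then
                    echo "missing: $reads_dir/$samp-clean$mate.fq (set $SET)"
                    missing=$((missing + 1))
                fi
            done
        done
    done < "$sets_file"
    [ "$missing" -eq 0 ]    # exit status is non-zero if anything was missing
}

# Example call (commented out so that sourcing this block has no side effects):
# check_mapping_inputs samples_in_sets.txt 02_HUMAN_Decontam && echo "all inputs present"
```

The function only reads the sample sheet and tests for files, so it is safe to run before starting bowtie2. It prints one line per missing file and returns a non-zero status if any are absent.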
In 2020, the Dahle group sent 60 samples from various chimneys across the AMOR for sequencing. This wiki shares the pipeline I used to process that dataset. The intent is to be specific about every step involved, so that other lab members do not have to repeat the same time-consuming processes. Hosting it on my Git page adds the benefit of accountability, and of having someone to email if something doesn't work for you. :)