
03_ASSEMBLIES

eolesin edited this page Apr 21, 2021 · 14 revisions

Each sample is assembled separately. This decision was made after much deliberation. Not being able to co-assemble large metagenome datasets makes downstream comparison between samples harder in some cases, but we hope this can be overcome at the various levels of information we are seeking.

We use MEGAHIT for the assembly. I have chosen to process the samples one after another in a single loop, run as one large job. This could probably be optimized, for instance on SAGA, by splitting the work into several parallel jobs.
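As an illustration of the parallel alternative, here is a minimal sketch of a Slurm array job in which each array task assembles one sample. This is not what was run for this project. It assumes SAGA is used through Slurm, that the reads and the sample list have been copied there, and that the file naming matches the loop used on kjempefuru. The account name, time limit, memory request, array range, default reads path and the function names `sample_for_task` and `assemble_sample` are all hypothetical placeholders.

```shell
#!/bin/bash
#SBATCH --job-name=megahit_indiv
#SBATCH --account=nnXXXXk        # hypothetical placeholder: use your own project account
#SBATCH --time=24:00:00          # assumption: adjust to the size of your samples
#SBATCH --cpus-per-task=40
#SBATCH --mem-per-cpu=4G         # assumption: adjust to the size of your samples
#SBATCH --array=1-10             # assumption: 1..N, where N is the number of lines in the sample list

# Hypothetical default: location of the QC'ed reads on the cluster.
AMOR_2019_path="${AMOR_2019_path:-/path/to/01_QC}"

# Print the sample name found on line N of a sample list (one name per line).
sample_for_task() {
  local list="$1" n="$2"
  sed -n "${n}p" "$list"
}

# Assemble one sample with the same MEGAHIT options as the serial loop.
assemble_sample() {
  local reads_dir="${1%/}" sample="$2"
  megahit -1 "${reads_dir}/${sample}-QUALITY_PASSED_R1.fastq" \
          -2 "${reads_dir}/${sample}-QUALITY_PASSED_R2.fastq" \
          --min-contig-len 1000 -m 0.85 \
          -o "03_INDIV_ASSEMBLY/${sample}" -t "${SLURM_CPUS_PER_TASK:-40}"
}

# Only start an assembly when Slurm has set an array index,
# so the functions can also be sourced and inspected outside a job.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
  sample=$(sample_for_task AMOR_2019 "$SLURM_ARRAY_TASK_ID")
  assemble_sample "$AMOR_2019_path" "$sample"
fi
```

The script would be submitted once with `sbatch`, and Slurm would start one task per array index. Note that MEGAHIT reads `-m 0.85` as a fraction of the machine's total memory, so on a shared node it may be safer to pass an absolute value in bytes that matches the job's allocation.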

# On kjempefuru:
AMOR_2019_path='/export/dahlefs/work/Shotgun/Metagenomes_chimneys_2019/01_QC'
AMOR_2020_path='/export/dahlefs/work/Metagenomes_chimneys_2020_workfolder/02_HUMAN_Decontam/'

# 2019 data first
while IFS= read -r Dataset; do
  # ${AMOR_2019_path%/} strips any trailing slash, so the "/" join below is always correct
  R1="${AMOR_2019_path%/}/${Dataset}-QUALITY_PASSED_R1.fastq"
  R2="${AMOR_2019_path%/}/${Dataset}-QUALITY_PASSED_R2.fastq"
  # One assembly per sample, written to its own output directory
  megahit -1 "$R1" -2 "$R2" --min-contig-len 1000 -m 0.85 -o "03_INDIV_ASSEMBLY/$Dataset" -t 40
done < AMOR_2019

# Then all the 2020 samples we deemed "good"
# Then all the iron mat samples
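
Because each sample list is processed in one long run, a missing or misnamed read file would only surface partway through. A minimal sketch of a check that could be run on a sample list beforehand, assuming the same `-QUALITY_PASSED_R1.fastq` / `-QUALITY_PASSED_R2.fastq` naming as the 2019 data; the function name `check_reads` is hypothetical:

```shell
# Report samples whose R1 or R2 file is missing or empty.
# Returns non-zero if at least one file is missing.
check_reads() {
  local reads_dir="${1%/}" list="$2" missing=0 sample suffix
  while IFS= read -r sample; do
    [ -n "$sample" ] || continue   # skip blank lines in the list
    for suffix in R1 R2; do
      if [ ! -s "${reads_dir}/${sample}-QUALITY_PASSED_${suffix}.fastq" ]; then
        echo "missing or empty: ${sample} ${suffix}" >&2
        missing=1
      fi
    done
  done < "$list"
  return "$missing"
}

# Example call (commented out so that defining the function has no side effects):
# check_reads "$AMOR_2019_path" AMOR_2019 && echo "all inputs present"
```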
