# 1. Sequence read processing and metagenome assembly

## Software and versions used in this study

- Trimmomatic v0.38 / v0.39
- SortMeRNA v4.3.6 (database: smr_v4.3_fast_db)
- metaSPAdes v3.11.1
- seqmagick v0.7.0

***

## Quality trim and filter sequencing reads

Note:

- DNA sequencing reads in this study were generated via Illumina HiSeq paired-end 2x250 bp sequencing
- RNA sequencing reads in this study were generated via Illumina HiSeq paired-end 2x125 bp sequencing

In [None]:
# Make relevant adapter file if not already created (n.b. truncated adapter included here)
if [ ! -f iua.fna ]; then
    echo ">FastQC_adapter" > iua.fna
    echo "AGATCGGAAGAG" >> iua.fna
fi

#### DNA sequencing reads

Trim and filter with Trimmomatic

In [None]:
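# Ensure the output directory exists before running Trimmomatic
mkdir -p DNA/1.Qual_filtered_trimmomatic
for i in {1..9}; do
    # Output order: R1 paired, R1 unpaired, R2 paired, R2 unpaired
    # ILLUMINACLIP:<adapters>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold>
    # CROP: truncate reads to 240 bp; SLIDINGWINDOW: cut where mean quality over 4 bases drops below 30; MINLEN: drop reads shorter than 80 bp
    trimmomatic PE -threads 30 -phred33 -quiet \
    DNA/0.raw_data/S${i}_R1.fastq.gz DNA/0.raw_data/S${i}_R2.fastq.gz \
    DNA/1.Qual_filtered_trimmomatic/S${i}_R1.fastq DNA/1.Qual_filtered_trimmomatic/S${i}_R1.single1.fastq \
    DNA/1.Qual_filtered_trimmomatic/S${i}_R2.fastq DNA/1.Qual_filtered_trimmomatic/S${i}_R2.single2.fastq \
    ILLUMINACLIP:iua.fna:1:25:7 CROP:240 SLIDINGWINDOW:4:30 MINLEN:80
    # Merge the unpaired (singleton) reads from both directions into one file per sample
    cat DNA/1.Qual_filtered_trimmomatic/S${i}_R1.single1.fastq DNA/1.Qual_filtered_trimmomatic/S${i}_R2.single2.fastq > DNA/1.Qual_filtered_trimmomatic/S${i}_single.fastq
    rm DNA/1.Qual_filtered_trimmomatic/S${i}_R1.single1.fastq DNA/1.Qual_filtered_trimmomatic/S${i}_R2.single2.fastq
done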
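
As a quick sanity check on the trimming step, read counts before and after Trimmomatic can be compared. The sketch below is not part of the original workflow: `count_fastq_reads` is a hypothetical helper, and it assumes standard four-line FASTQ records. Samples whose files are not present are skipped.

```shell
# Hypothetical helper (not from the original workflow): count reads in a
# FASTQ file, assuming four lines per record. Handles plain and gzipped input.
count_fastq_reads() (
    case "$1" in
        *.gz) lines=$(gzip -dc "$1" | wc -l) ;;
        *)    lines=$(wc -l < "$1") ;;
    esac
    echo $(( lines / 4 ))
)

# Report how many read pairs survive trimming for each sample that is present.
for i in 1 2 3 4 5 6 7 8 9; do
    raw="DNA/0.raw_data/S${i}_R1.fastq.gz"
    trimmed="DNA/1.Qual_filtered_trimmomatic/S${i}_R1.fastq"
    if [ -f "$raw" ] && [ -f "$trimmed" ]; then
        n_raw=$(count_fastq_reads "$raw")
        n_trim=$(count_fastq_reads "$trimmed")
        echo "S${i}: ${n_trim} of ${n_raw} read pairs retained"
    fi
done
```

Only the R1 files are counted, since the paired outputs of Trimmomatic contain the same number of reads in R1 and R2.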


#### RNA sequencing reads

Trim and filter with Trimmomatic

In [None]:
# Ensure the output directory exists before running Trimmomatic
mkdir -p RNA/1.Qual_filtered_trimmomatic
for i in {1..9}; do
    # Same settings as for DNA, but with CROP and MINLEN scaled to the shorter 125 bp reads
    # CROP: truncate reads to 115 bp; MINLEN: drop reads shorter than 50 bp
    trimmomatic PE -threads 10 -phred33 -quiet \
    RNA/0.raw_data/S${i}_R1.fastq.gz RNA/0.raw_data/S${i}_R2.fastq.gz \
    RNA/1.Qual_filtered_trimmomatic/S${i}_R1.fastq RNA/1.Qual_filtered_trimmomatic/S${i}_R1.single1.fastq \
    RNA/1.Qual_filtered_trimmomatic/S${i}_R2.fastq RNA/1.Qual_filtered_trimmomatic/S${i}_R2.single2.fastq \
    ILLUMINACLIP:iua.fna:1:25:7 CROP:115 SLIDINGWINDOW:4:30 MINLEN:50
    # Merge the unpaired (singleton) reads from both directions into one file per sample
    cat RNA/1.Qual_filtered_trimmomatic/S${i}_R1.single1.fastq RNA/1.Qual_filtered_trimmomatic/S${i}_R2.single2.fastq > RNA/1.Qual_filtered_trimmomatic/S${i}_single.fastq
    rm RNA/1.Qual_filtered_trimmomatic/S${i}_R1.single1.fastq RNA/1.Qual_filtered_trimmomatic/S${i}_R2.single2.fastq
done

Filter residual rRNA reads with SortMeRNA



In [None]:
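# n.b. sortmerna_path must be set beforehand to the SortMeRNA installation directory
# Ensure the output directories exist
mkdir -p RNA/2.rRNA_filtered/aligned RNA/2.rRNA_filtered/unaligned

# paired files
# --paired_in: if either read of a pair aligns to rRNA, both reads go to the aligned output
# --out2: write forward and reverse reads to separate output files
for i in {1..9}; do
    ${sortmerna_path}/bin/sortmerna --num_alignments 1 --fastx --paired_in --out2 \
    --reads RNA/1.Qual_filtered_trimmomatic/S${i}_R1.fastq \
    --reads RNA/1.Qual_filtered_trimmomatic/S${i}_R2.fastq \
    --ref ${sortmerna_path}/databases_v4.3.4/smr_v4.3_fast_db.fasta \
    --workdir RNA/2.rRNA_filtered/tmp/S${i}/ \
    --aligned RNA/2.rRNA_filtered/aligned/S${i}_rRNA \
    --other RNA/2.rRNA_filtered/unaligned/S${i}_non_rRNA
    # Remove the per-sample working directory once the run has finished
    rm -r RNA/2.rRNA_filtered/tmp/S${i}
done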

# single files
for i in {1..9}; do
    ${sortmerna_path}/bin/sortmerna --num_alignments 1 --fastx \
    --reads RNA/1.Qual_filtered_trimmomatic/S${i}_single.fastq \
    --ref ${sortmerna_path}/databases_v4.3.4/smr_v4.3_fast_db.fasta \
    --workdir RNA/2.rRNA_filtered/tmp/S${i}_single/ \
    --aligned RNA/2.rRNA_filtered/aligned/S${i}_single_rRNA \
    --other RNA/2.rRNA_filtered/unaligned/S${i}_single_non_rRNA
    rm -r RNA/2.rRNA_filtered/tmp/S${i}_single
done
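
To gauge how much residual rRNA was removed, the fraction of reads assigned to the aligned (rRNA) output can be computed per sample. The following is a minimal sketch, not part of the original workflow: `rrna_percent` is a hypothetical helper, the inputs are assumed to be uncompressed four-line FASTQ, and the `.fq` extension in the usage loop is an assumption about how SortMeRNA named its outputs, so adjust it to the files actually written.

```shell
# Hypothetical helper: percentage of reads flagged as rRNA, given the FASTQ of
# aligned (rRNA) reads ($1) and the FASTQ of other (non-rRNA) reads ($2).
rrna_percent() (
    aligned=$(( $(wc -l < "$1") / 4 ))
    other=$(( $(wc -l < "$2") / 4 ))
    total=$(( aligned + other ))
    if [ "$total" -eq 0 ]; then
        echo "NA"
    else
        awk -v a="$aligned" -v t="$total" 'BEGIN { printf "%.2f\n", 100 * a / t }'
    fi
)

# Usage on the singleton outputs; file extensions are assumed, not confirmed.
for i in 1 2 3 4 5 6 7 8 9; do
    rrna="RNA/2.rRNA_filtered/aligned/S${i}_single_rRNA.fq"
    non_rrna="RNA/2.rRNA_filtered/unaligned/S${i}_single_non_rRNA.fq"
    if [ -f "$rrna" ] && [ -f "$non_rrna" ]; then
        echo "S${i} singletons: $(rrna_percent "$rrna" "$non_rrna")% rRNA"
    fi
done
```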

***

## Metagenome assembly via metaSPAdes

Assemble trimmed and filtered DNA sequencing reads with metaSPAdes

In [None]:
for i in {1..9}; do
    spades.py --meta -k 41,61,81,101,127 \
    -1 DNA/1.Qual_filtered_trimmomatic/S${i}_R1.fastq \
    -2 DNA/1.Qual_filtered_trimmomatic/S${i}_R2.fastq \
    -s DNA/1.Qual_filtered_trimmomatic/S${i}_single.fastq \
    -o DNA/1.assembly/S${i}.spades/
done


#### Filter out short contigs

In [None]:
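mkdir -p DNA/1.assembly.m1000
# n.b. SPAdes itself writes contigs.fasta and scaffolds.fasta; assembly.fasta is assumed here to be a renamed copy of one of these
# Keep only contigs of at least 1000 bp
for i in {1..9}; do
    seqmagick convert --min-length 1000 DNA/1.assembly/S${i}.spades/assembly.fasta DNA/1.assembly.m1000/S${i}.assembly.m1000.fasta
done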
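
The length filter can be verified independently of seqmagick by counting any contigs below the threshold in the filtered files. This is a sketch under stated assumptions rather than part of the original workflow: `count_short_contigs` is a hypothetical helper that sums sequence lengths across wrapped FASTA lines with awk.

```shell
# Hypothetical helper: count FASTA records shorter than a minimum length.
# $1 = minimum length, $2 = FASTA file (sequences may span multiple lines).
count_short_contigs() (
    awk -v min="$1" '
        /^>/ { if (seen && len < min) short++; seen = 1; len = 0; next }
             { len += length($0) }
        END  { if (seen && len < min) short++; print short + 0 }
    ' "$2"
)

# Each filtered assembly that is present should report zero contigs under 1000 bp.
for i in 1 2 3 4 5 6 7 8 9; do
    filtered="DNA/1.assembly.m1000/S${i}.assembly.m1000.fasta"
    if [ -f "$filtered" ]; then
        echo "S${i}: $(count_short_contigs 1000 "$filtered") contigs shorter than 1000 bp"
    fi
done
```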


***