sampled_reads=0
increment=50000
total_reads_number=1000000 # do not exceed the number of reads in your fastq file
fastq_file_R1=my_R1.fq # replace with path to your R1 fastq file
fastq_file_R2=my_R2.fq # replace with path to your R2 fastq file
refnuc=my_ref_nuc.fa # replace with path to your reference nucleotide fasta file
refprot=ref_prot.fa # replace with path to your reference protein fasta file
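The value of total_reads_number has to be known before the loop runs. One way to obtain it is sketched below. It assumes an uncompressed FASTQ file with the standard four lines per record, and count_fastq_reads is a hypothetical helper name, not part of seqtk or rmseq.

```shell
# Hypothetical helper (not part of seqtk or rmseq): count the reads in an
# uncompressed FASTQ file, assuming exactly four lines per record.
count_fastq_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Possible usage, once fastq_file_R1 points at a real file:
# total_reads_number=$(count_fastq_reads "$fastq_file_R1")
```

For a gzip-compressed file the line count would have to be taken from the decompressed stream instead, which this sketch does not cover.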
Subsample reads with seqtk and run the rmseq pipeline
while [ "$sampled_reads" -lt "$total_reads_number" ]
do
sampled_reads=$((sampled_reads+increment))
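echo "oooo Subsampling $sampled_reads reads with seqtk oooo"
# Use the same seed (-s100) for R1 and R2 so that read pairs stay in sync
seqtk sample -s100 "$fastq_file_R1" "$sampled_reads" > "${sampled_reads}_R1.fq"
seqtk sample -s100 "$fastq_file_R2" "$sampled_reads" > "${sampled_reads}_R2.fq"
echo "oooo Running rmseq on $sampled_reads subsampled reads oooo"
# Braces delimit the variable name; without them the shell would look up sampled_reads_R1
rmseq run "${sampled_reads}_R1.fq" "${sampled_reads}_R2.fq" "$refnuc" "$refprot" "$sampled_reads"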
done
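The loop above only stops exactly at total_reads_number when that number is a multiple of increment. Otherwise the last iteration requests more reads than the stated limit. A minimal sketch of one way to cap the final step follows. subsample_sizes is a hypothetical helper, and it assumes both arguments are positive integers.

```shell
# Hypothetical helper: print one subsample size per line, stepping by the
# increment (second argument) and capping the last size at the total
# (first argument). The body runs in a subshell so its variables do not
# overwrite the script's own.
subsample_sizes() (
    total=$1
    step=$2
    n=0
    while [ "$n" -lt "$total" ]; do
        n=$((n + step))
        if [ "$n" -gt "$total" ]; then
            n=$total
        fi
        echo "$n"
    done
)

# Example: a total of 120000 reads in steps of 50000
subsample_sizes 120000 50000
```

The sizes could then drive the subsampling with a loop such as `for sampled_reads in $(subsample_sizes "$total_reads_number" "$increment")`, leaving the seqtk and rmseq commands unchanged.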
Concatenate the amplicons.effect files obtained from all read subsamples
cat ./*/amplicons.effect > all.amplicons.tab # overwrite, so a rerun does not append duplicate rows
Plot the number of consensus reads (from 10 reads) against the number of reads (sequencing depth) in RStudio
Install and load the tidyverse package
install.packages("tidyverse")
library(tidyverse)
Import the concatenated consensus amplicon table as a dataframe