Running in parallel

Ryan Wick edited this page Jan 4, 2019 · 2 revisions

Badread is not intrinsically parallel – each instance will only use one thread. However, since each generated read is independent, you can simply run multiple instances and combine their output.

Here's a Bash loop that will accomplish this. Just set the total_bases and processes variables, and modify the badread simulate command as needed:

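total_bases=500000000  # total bases to simulate across all instances
processes=8            # number of badread instances to run at once

# Each instance simulates an equal share (integer division, so any remainder is dropped).
bases=$((total_bases / processes))
for p in $(seq 1 "$processes"); do
    log=badread_"$p".log
    reads=temp_"$p".fastq
    # Run in the background: stderr goes to the log, stdout (the reads) to a temporary FASTQ.
    badread simulate --reference ref.fasta --quantity "$bases" 2> "$log" 1> "$reads" &
done
wait  # block until every background instance has finished
cat temp_*.fastq | gzip > reads.fastq.gz
rm temp_*.fastq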
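
The loop above splits the total with integer division, so when total_bases is not a multiple of processes the shares add up to slightly less than the total (by at most processes - 1 bases, which rarely matters in practice). If you want the shares to sum exactly, one option is to hand the remainder out one base at a time. The following is only a sketch of that arithmetic: split_bases is a hypothetical helper name, not part of Badread, and it does not call badread itself.

```shell
#!/usr/bin/env bash
# split_bases (hypothetical helper, not part of Badread):
# print one per-process base count per line so the counts sum exactly to the total.
split_bases() {
    local total=$1 procs=$2
    local base=$((total / procs))    # share every process receives
    local extra=$((total % procs))   # leftover bases to distribute
    local p
    for p in $(seq 1 "$procs"); do
        # The first $extra processes each take one additional base.
        if [ "$p" -le "$extra" ]; then
            echo $((base + 1))
        else
            echo "$base"
        fi
    done
}

split_bases 10 3   # prints 4, 3 and 3 on separate lines
```

Each printed value could then be used as the quantity for one instance in place of the single bases variable.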