Skip to content

Zach-Sisson-1/Demultiplex

Repository files navigation

Demultiplexing

Illumina output FASTQ files contain multiplexed sequence data from different biological samples. This repo contains the code for an algorithm that will sort each record within a set of FASTQ files to be reassigned to files containing only the FASTQ records for that biological sample group. To do this, paired-end sequence reads must be sorted using their unique dual-matched index barcode, however, the code needs to account for barcodes having undetermined base calls (N) as well as index hopping, which results in some reads containing two different indexes on either side of the sequence.

Final script titled Dplexer.py can be found under assignment, the third.

About

Code for demultiplexing Illumina sequence data

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages