Demultiplexing

Illumina FASTQ output files contain multiplexed sequence data from different biological samples. This repo contains the code for an algorithm that sorts every record in a set of FASTQ files into output files, each holding only the records for one biological sample group. To do this, paired-end sequence reads are sorted by their unique dual-matched index barcodes. The code must also account for barcodes with undetermined base calls (N) and for index hopping, which leaves some reads with two different indexes on either side of the sequence.
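The sorting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in Dplexer.py: the function name `classify_pair`, the category labels, and the example barcodes are all hypothetical, and the sketch assumes both index sequences have already been put in the same orientation before they are compared.

```python
def classify_pair(index1, index2, known_indexes):
    """Classify one read pair as 'matched', 'hopped', or 'unknown'.

    Hypothetical helper: assumes index1 and index2 are in the same
    orientation, so a correctly matched pair has identical strings.
    """
    # An N is an undetermined base call, so the barcode cannot be trusted.
    if "N" in index1 or "N" in index2:
        return "unknown"
    # A barcode that is not in the expected set cannot be assigned to a sample.
    if index1 not in known_indexes or index2 not in known_indexes:
        return "unknown"
    # Two valid but different barcodes indicate index hopping.
    if index1 != index2:
        return "hopped"
    return "matched"


# Made-up barcodes, for illustration only.
known = {"AACCGGTT", "TTGGCCAA"}
labels = [
    classify_pair("AACCGGTT", "AACCGGTT", known),
    classify_pair("AACCGGTT", "TTGGCCAA", known),
    classify_pair("AACCNGTT", "AACCGGTT", known),
]
```

Each record would then be written to the output file that matches its label, with one file per sample for the matched pairs.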

The final script, Dplexer.py, can be found under Assignment-the-third.

Assignment-the-first
Assignment-the-second
Assignment-the-third
TEST-input_FASTQ
TEST-output_FASTQ
README.md

Zach-Sisson-1/Demultiplex
