Match Barcode and Enzyme

c0deb0t edited this page Jan 24, 2018 · 4 revisions

To maximize the number of matches while keeping them as accurate as possible, the program checks every candidate barcode that matches within a threshold. The list of candidates is then narrowed down by checking whether the enzyme sequence can be found after the barcode. For paired-end reads, the reverse read is also checked to see whether it matches the forward read's barcode and enzyme. Of the remaining candidates, the barcode with the smallest edit distance is used; if several candidates are still tied, the longest barcode is used. The enzyme match is likewise the longest enzyme sequence that matches. Semi-global alignment is also allowed for barcode matching. Barcodes in the index files are matched with a similar algorithm, except that enzymes are not matched in the index files. Barcode and enzyme matching both support probability-based matching, which takes the base quality scores into account.
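
The selection rules above can be illustrated with a minimal Python sketch. This is not the program's actual code: the names `prefix_edit_distance`, `pick_barcode`, and `max_edits` are hypothetical, the enzyme is matched exactly for simplicity, and the paired-end check and probability-based matching are left out.

```python
def prefix_edit_distance(pattern, text):
    """Return (distance, end) for the best alignment of `pattern`
    against a prefix text[:end], using plain Levenshtein costs."""
    # Aligning an empty pattern with text[:j] costs j edits.
    prev = list(range(len(text) + 1))
    for i, p in enumerate(pattern, start=1):
        curr = [i]
        for j, t in enumerate(text, start=1):
            curr.append(min(prev[j] + 1,              # base missing from the read
                            curr[j - 1] + 1,          # extra base in the read
                            prev[j - 1] + (p != t)))  # match or substitution
        prev = curr
    # Lowest distance wins; on ties prefer the end closest to the pattern length.
    end = min(range(len(prev)), key=lambda j: (prev[j], abs(j - len(pattern))))
    return prev[end], end


def pick_barcode(read, barcodes, enzymes, max_edits=1):
    """Return (barcode, enzyme, distance) for the best candidate, or None."""
    candidates = []
    for barcode in barcodes:
        # Assumption: the barcode sits at the start of the read.
        window = read[:len(barcode) + max_edits]
        dist, end = prefix_edit_distance(barcode, window)
        if dist > max_edits:
            continue  # not within the threshold
        # Assumption: the enzyme must follow the barcode exactly.
        rest = read[end:]
        matched = [e for e in enzymes if rest.startswith(e)]
        if not matched:
            continue  # no enzyme after this barcode, drop the candidate
        enzyme = max(matched, key=len)  # longest enzyme match
        # Sort key: smallest edit distance first, then longest barcode.
        candidates.append((dist, -len(barcode), barcode, enzyme))
    if not candidates:
        return None
    dist, _, barcode, enzyme = min(candidates)
    return barcode, enzyme, dist


if __name__ == "__main__":
    print(pick_barcode("ACGTTGCAGGGG", ["ACGT", "ACGA"], ["TGCA", "TGCAG"]))
```

In the example call, both barcodes are within one edit of the start of the read and are followed by an enzyme, so the exact match `ACGT` is preferred over `ACGA`, and the longer enzyme `TGCAG` is preferred over `TGCA`.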