how to handle IUPAC ambiguous codes #11

lituan · 2017-07-21T08:33:16Z

I read a paragraph from the tmap-book, but I do not understand the phrase "lexicographically smallest DNA base" or why R is converted to C. Can you explain this?

Ambiguous IUPAC codes in the reference/target FASTA will be converted to the
lexicographically smallest DNA base that is not compatible to the IUPAC code to ensure
minimum reference bias. For example, an IUPAC base R, which represents an A or a G, will be
converted to a C. All Ns in the reference will be converted to As. Furthermore, any
non-IUPAC character will be treated as an N. The ambiguity codes will only be re-considered
when calculating the NM and MD SAM record optional tags.
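
One possible reading of the quoted rule, written out as a small Python sketch. "Lexicographically smallest" is taken here to mean first in alphabetical order (A < C < G < T), and "not compatible" to mean a base that the IUPAC code does not stand for. This is not TMAP's actual code: the names `IUPAC`, `convert_base`, and `convert_reference` are made up for illustration, and upper-casing the input and leaving out U are assumptions.

```python
# Sketch of the conversion rule quoted above; NOT TMAP's actual implementation.
# Standard IUPAC nucleotide codes mapped to the bases they are compatible with.
# Assumption: U is left out, so it falls through to the non-IUPAC case below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def convert_base(ch: str) -> str:
    """Return the base that would stand in for one reference character."""
    code = ch.upper()  # assumption: case is ignored
    # Per the quote, any non-IUPAC character is treated as an N.
    compatible = IUPAC.get(code, IUPAC["N"])
    if len(compatible) == 1:
        return compatible  # unambiguous base, kept as-is
    # Bases the code does NOT stand for, in alphabetical order.
    incompatible = [b for b in "ACGT" if b not in compatible]
    # N is compatible with every base, so nothing is left over;
    # the quote states that Ns become As.
    return incompatible[0] if incompatible else "A"


def convert_reference(seq: str) -> str:
    """Apply convert_base to every character of a reference sequence."""
    return "".join(convert_base(ch) for ch in seq)
```

Under this reading, R stands for A or G, the bases it does not stand for are C and T, and C comes first alphabetically, which matches the example in the quote. By the same logic `convert_reference("ACRNT")` would give `"ACCAT"`.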
