Details on the SAM output generated by GraphMap

Description of special tags in the SAM output:

  • ZE - The E-value. More precisely, a pessimistic approximation of the E-value, obtained by rescoring the generated alignment with scores/penalties for which pre-calculated Gumbel parameters exist: match = 5, mismatch = -4, gap_open = -8, gap_extend = -6. By default there is no threshold on the E-value, so even weak homologies are reported; a threshold can be set with the -z parameter, e.g.: -z 1e0.
  • ZF - An internal measure of alignment quality, calculated using equation (8) in our preprint (http://biorxiv.org/content/early/2015/06/10/020719). GraphMap sorts the potential regions for a read by this value, and the primary alignment is the one with the largest ZF value. ZF values for different reads are not mutually comparable.
  • ZQ - Query (read) length.
  • ZR - Reference length.
  • H0 - Specified by the SAM format as the "number of perfect hits". GraphMap uses it to report the number of possible mapping positions with the same number of k-mer hits.
  • NM - Edit distance, specified by the SAM format.
  • AS - Alignment score, specified by the SAM format.
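The tags above can be read from any SAM alignment line with a few lines of code. The following is a minimal Python sketch, not part of GraphMap itself. It assumes the standard SAM optional-field layout TAG:TYPE:VALUE, and the helper names (parse_sam_tags, passes_evalue) are hypothetical. The example record and the type codes it uses (for instance ZE:f) are made up for illustration; check them against real GraphMap output before relying on them.

```python
def parse_sam_tags(sam_line):
    """Return the optional fields of one SAM alignment line as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 columns are mandatory in SAM
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":
            tags[tag] = int(value)
        elif type_code == "f":
            tags[tag] = float(value)
        else:
            tags[tag] = value  # keep other types (e.g. Z strings) as text
    return tags


def passes_evalue(tags, max_evalue):
    """Keep an alignment only if it has a ZE tag no larger than max_evalue."""
    # float() also covers the case where ZE is stored as a string (assumption).
    return "ZE" in tags and float(tags["ZE"]) <= max_evalue


# Made-up record for illustration only; values and type codes are assumptions.
example_line = (
    "read1\t0\tref1\t101\t40\t10M\t*\t0\t0\tACGTACGTAC\t*\t"
    "NM:i:1\tAS:i:41\tH0:i:1\tZE:f:1e-05\tZF:f:0.75\tZQ:i:10\tZR:i:5000"
)
example_tags = parse_sam_tags(example_line)
```

Calling passes_evalue(example_tags, 1.0) mimics, after the fact, the kind of filtering that -z 1e0 applies during mapping. Because ZF values are not comparable between reads, a filter like this should only be built on ZE, not on ZF.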

There are two hidden gems in GraphMap's output that report the alignment process in more detail. Compiling GraphMap with make testing produces a binary at bin/graphmap-not_release. Running this binary with the parameter -b 3 generates a more verbose SAM output file, which contains two additional tags:

  • X3 - A string containing very verbose information about the alignment of a particular read.
  • X4 - Measurement of the CPU time spent on major parts of the algorithm, in a human-readable text format.