Chimeric alignment scores inconsistently checked #723

mengxiao · 2019-08-26T02:53:25Z

In STAR 2.7.2a, I believe there's a bug in chimeric alignment multimapper detection wherein the scores of candidate alignments are not consistently checked against all criteria.

For example, I observe the following line in Chimeric.out.junction:

6     91012516        +       2       89865676        +       2       0       0       NB501164:392:HHK3HBGX2:1:11106:2376:12280       91012498        18M20S  89865677        18S20M5372p10S28M       1       76      41      52      52      0

The default value of 20 was kept for --chimScoreDropMax, so the minimum acceptable chimeric alignment score is 76 - 20 = 56, but this alignment is scored at only 52.
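The arithmetic can be written down as a compile-time check. This is only a sketch: the helper and constant names are hypothetical and not part of STAR, and it assumes the threshold is simply the best possible score minus --chimScoreDropMax, as described above.

```cpp
// Hypothetical helper (not STAR code): lowest chimeric score that should be
// accepted, assuming threshold = best possible score - chimScoreDropMax.
constexpr int minAcceptableChimScore(int bestPossibleScore, int chimScoreDropMax) {
    return bestPossibleScore - chimScoreDropMax;
}

// Values taken from the example in this report.
constexpr int kBestPossibleScore = 76;  // best possible alignment score
constexpr int kChimScoreDropMax  = 20;  // default --chimScoreDropMax
constexpr int kReportedScore     = 52;  // score of the reported alignment

static_assert(minAcceptableChimScore(kBestPossibleScore, kChimScoreDropMax) == 56,
              "threshold should be 76 - 20 = 56");
static_assert(kReportedScore < minAcceptableChimScore(kBestPossibleScore, kChimScoreDropMax),
              "the reported alignment falls below the threshold");
```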

This issue arises in the loop that identifies the best possible chimeric alignment. Specifically, at ChimericDetection_chimericDetectionMult.cpp#L67, four criteria are checked:

1. score exceeds that of best non-chimeric alignment
2. score is within allowable range of best possible alignment score (i.e. read length)
3. score exceeds minimum allowable chimeric score
4. score is acceptably close to best chimeric alignment score

However, after stitching, which may reduce the score, only condition 1 is rechecked. If the stitched alignment's score exceeds the current high-water mark but fails condition 2 or 3, the alignment is still kept. Later code, which checks only condition 4 when counting chimeric multimapping alignments and when outputting chimeric junctions, can then emit alignments that should have been discarded based on the parameters.
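The inconsistency can be illustrated with a minimal sketch. This is not STAR's actual code: the struct, field, and function names are hypothetical, and the exact comparisons (strict versus non-strict, and the form of each threshold) are assumptions chosen to mirror the four conditions listed above. The point is only that the post-stitching check should reuse the same predicate as the pre-stitching check.

```cpp
// Hypothetical parameter bundle; names are illustrative, not STAR's.
struct ChimParams {
    int bestPossibleScore;  // best possible alignment score (read length)
    int scoreDropMax;       // cf. --chimScoreDropMax
    int scoreMin;           // minimum allowable chimeric score
    int scoreRange;         // allowed distance from the best chimeric score
};

// Conditions 1-3: checks that depend only on the candidate itself.
inline bool passesAbsoluteChecks(int chimScore, int bestNonChimScore,
                                 const ChimParams& p) {
    return chimScore > bestNonChimScore                          // condition 1
        && chimScore >= p.bestPossibleScore - p.scoreDropMax     // condition 2
        && chimScore >= p.scoreMin;                              // condition 3
}

// Condition 4: candidate is acceptably close to the best chimeric score.
inline bool closeToBest(int chimScore, int bestChimScore, const ChimParams& p) {
    return chimScore >= bestChimScore - p.scoreRange;
}

// Behaviour described in this report: after stitching, only condition 1
// is rechecked, so conditions 2 and 3 can silently fail.
inline bool acceptAfterStitchInconsistent(int stitchedScore, int bestNonChimScore,
                                          const ChimParams&) {
    return stitchedScore > bestNonChimScore;
}

// Consistent variant: the stitched score goes through the same checks
// as the pre-stitching score.
inline bool acceptAfterStitchConsistent(int stitchedScore, int bestNonChimScore,
                                        const ChimParams& p) {
    return passesAbsoluteChecks(stitchedScore, bestNonChimScore, p);
}
```

With a best possible score of 76, a drop maximum of 20, and a stitched score of 52, the inconsistent variant accepts the alignment whenever the best non-chimeric score is below 52, while the consistent variant rejects it because 52 < 56.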

I have attempted a fix for this in #722, but I'm not sure I fully understand the intended behavior. @brianjohnhaas would you be willing to chime in? Please note that this PR includes the change I suggested in #721, which addresses a memory leak but should not change outputs.

Rerunning with the suggested patch removes the line reported above from Chimeric.out.junction. Most of the output is identical. A non-trivial number of alignments are now removed, and a small number of additional alignments are output in cases where, with consistent filtering, the cap on the number of multimappers is no longer exceeded.
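Why stricter filtering can also add output may be easier to see in a toy model. The sketch below is an assumption about the mechanism, not STAR's implementation: it supposes that a read is reported only if the number of candidates counted as multimappers is between 1 and a cap, and all names are hypothetical. Candidates that should have been filtered out inflate the count and can push a read over the cap.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Sentinel meaning "no minimum-score filter", i.e. only the range check applies.
constexpr int kNoScoreFloor = std::numeric_limits<int>::min();

// Hypothetical model: count candidates within scoreRange of bestScore that
// also meet minScore, and report the read only if that count is within the cap.
inline bool readIsReported(const std::vector<int>& candidateScores,
                           int bestScore, int scoreRange,
                           int minScore, std::size_t multimapCap) {
    const auto counted = static_cast<std::size_t>(std::count_if(
        candidateScores.begin(), candidateScores.end(),
        [&](int s) { return s >= bestScore - scoreRange && s >= minScore; }));
    return counted > 0 && counted <= multimapCap;
}
```

For candidate scores {57, 56, 55} with a best score of 57, a range of 2, and a cap of 2, calling `readIsReported` with `kNoScoreFloor` counts three candidates and drops the read, whereas passing a minimum score of 56 counts two and reports it.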

Thanks!


brianjohnhaas · 2019-08-27T01:42:33Z

Excellent find and fix!

alexdobin · 2019-08-29T22:15:29Z

Thanks a lot, Meng Xiao.
Release 2.7.2b is out.

Cheers
Alex

mengxiao · 2019-08-30T03:42:39Z

Thanks, Alex!

@BiocondaBot autobump star

alexdobin added the "issue: code" label (Likely to be an issue with STAR code) Aug 26, 2019

mengxiao mentioned this issue Aug 27, 2019

Ensure chimeric alignment score requirements are consistently checked #722

Merged

alexdobin added the "resolved" label (problem or issue that has been resolved) Aug 29, 2019

alexdobin closed this as completed Aug 29, 2019
