Small alignment leads to error in consensus aligner #899

CBeelen · 2022-12-05T19:08:50Z

In sample MSVRNA3-HIV_S7 from run 210115_M04401, there is an alignment starting just 1 nucleotide before the end of vpr. Because there is nothing to align in amino acid space, there are no values in coord2conseq (the dictionary helping us translate conseq coordinates to reference coordinates), so the code fails in count_match when trying to find the maximum coordinate.
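A minimal sketch of the failure mode, assuming the maximum is taken with Python's built-in `max()` over the dictionary's keys. The real code in count_match may differ, and `find_max_coordinate` is a hypothetical helper used only for illustration:

```python
def find_max_coordinate(coord2conseq):
    """Return the largest coordinate key in the mapping.

    Hypothetical stand-in for the lookup done in count_match.
    """
    return max(coord2conseq)  # raises ValueError when the dict is empty


# An alignment with nothing to align in amino acid space leaves the dict empty.
empty_mapping = {}
try:
    find_max_coordinate(empty_mapping)
    failure = None
except ValueError as ex:
    failure = ex  # max() refuses an empty iterable
```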
There are two options to solve this:

1. In the consensus aligner, in find_amino_alignments, check the size of an alignment before adding it (we already check that it's larger than 0 for a match; we could check whether it's at least 3 nucleotides long), or
2. In the consensus aligner, in count_match, check whether the alignment is large enough to do anything.

I'd prefer to catch alignments that are too small as soon as possible (option 1), but option 2 might help catch other weird edge cases.
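The two options above could look roughly like this. It is only a sketch under stated assumptions: the `Alignment` class, `keep_alignment`, `max_coordinate`, and the threshold constant are hypothetical names, not the actual code in find_amino_alignments or count_match.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical stand-in for one alignment to the reference."""
    ref_start: int  # 0-based start, in reference nucleotides
    ref_end: int    # exclusive end, in reference nucleotides

    @property
    def size(self):
        return self.ref_end - self.ref_start


MIN_ALIGNMENT_SIZE = 3  # one codon; assumed threshold for option 1


def keep_alignment(alignment, min_size=MIN_ALIGNMENT_SIZE):
    """Option 1: reject alignments shorter than min_size before adding them."""
    return alignment.size >= min_size


def max_coordinate(coord2conseq):
    """Option 2: return None instead of failing when nothing was mapped."""
    if not coord2conseq:
        return None
    return max(coord2conseq)


# Usage: a long alignment is kept, a 1-nucleotide alignment is dropped.
alignments = [Alignment(0, 288), Alignment(287, 288)]
kept = [a for a in alignments if keep_alignment(a)]
```

Option 1 filters early, so later code never sees tiny alignments. Option 2 leaves them in place and makes the caller handle a missing maximum explicitly.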


donkirkby · 2022-12-05T19:25:33Z

Option 1 sounds good to me, particularly since it's just a change to an existing check.

CBeelen · 2023-05-09T18:17:51Z

This also happened for sample 58836A-HIV_S14 from run 150325_M01841.

CBeelen · 2023-05-12T19:32:31Z

Generally speaking, we can find the reading frame of an alignment even if it is shorter than 3 nucleotides: we usually just round up to the next whole number of amino acids and align. In this particular case, the error happened only because we were right at a region boundary and there was not enough sequence to align to.
I'm a little worried about very fragmented alignments if we throw alignments of 1 or 2 nucleotides away, so I'm working on option 2 now instead. I'm also double-checking some cases with very bad alignments to see if we ever need these small alignments.
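The rounding described above can be sketched as follows. The function names are hypothetical and the coordinates are assumed to be 0-based nucleotide positions within a region; this illustrates the arithmetic only, not the aligner's actual code.

```python
def reading_frame(ref_start):
    """Frame (0, 1, or 2) of a 0-based nucleotide position in a region."""
    return ref_start % 3


def amino_count(nuc_length):
    """Round a nucleotide length up to a whole number of amino acids."""
    return -(-nuc_length // 3)  # ceiling division without floats


# Alignments of 1 or 2 nucleotides still round up to one amino acid.
small_counts = [amino_count(n) for n in (1, 2, 3, 4)]
```

Because even a 1-nucleotide alignment rounds up to one amino acid, size alone does not make an alignment unusable, which is the argument for handling the empty case in count_match rather than discarding small alignments.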

CBeelen added this to the 7.16 milestone Dec 5, 2022

CBeelen self-assigned this Mar 21, 2023

CBeelen added a commit that referenced this issue Mar 21, 2023

Fix for #899

91c164d

CBeelen mentioned this issue May 12, 2023

Small alignments bug fix #962

Merged

CBeelen closed this as completed in #962 May 12, 2023
