In-frame deletion being missed by moPepGen #515

Closed
zhuchcn opened this issue Jul 15, 2022 · 3 comments · Fixed by #516

@zhuchcn
Member

zhuchcn commented Jul 15, 2022

This case was caught by a fuzz test with an in-frame deletion. In the graph below, the in-frame deletion causes the amino acids RRA to be deleted. The node AP at the bottom is what the deletion produces, and it has the same sequence as the AP at the top.

VMDASAFEIFSTFPPTLYQDDTLTLQAAGLVPK-AALLLR-AR-R-AP ->    VMDASAFEIFSTFPPTLYQDDTLTLQAAGLVPK-AALLLR-AR-R-AP-
                                                                                        \              /
                                                                                         AALLLR-AP-----

With our current algorithm, when the peptide cleavage graph is created, the end nodes of a variant bubble are collapsed if they have the same sequence and the same outgoing nodes, and only the node without any variant is kept. The graph is then turned into something like this:

VMDASAFEIFSTFPPTLYQDDTLTLQAAGLVPK-AALLLR-AR-R-AP-    ->  VMDASAFEIFSTFPPTLYQDDTLTLQAAGLVPK-AALLLR-AR-R-AP- 
                                 \              /                                         \           /
                                  AALLLR-AP-----                                           AALLLR-----

The node AP is then no longer a variant node, so the peptide AALLLRAP is missed.
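
To make the failure concrete, here is a minimal sketch of the collapse rule described above. It is not moPepGen's actual code: `Node`, `collapse_end_nodes`, and the variant ID `"DEL-RRA"` are hypothetical names made up for illustration, and outgoing nodes are reduced to a set of sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a peptide cleavage graph node."""
    seq: str
    variants: frozenset = frozenset()   # IDs of variants carried by the node
    out_nodes: frozenset = frozenset()  # sequences of the outgoing nodes


def collapse_end_nodes(nodes):
    """Collapse nodes with the same sequence and the same outgoing nodes,
    keeping the node without any variant when there is one."""
    kept = {}
    for node in nodes:
        key = (node.seq, node.out_nodes)
        current = kept.get(key)
        # Keep the first node seen, but let a variant-free node replace it.
        if current is None or (current.variants and not node.variants):
            kept[key] = node
    return list(kept.values())


ref_ap = Node("AP")                                   # reference AP
del_ap = Node("AP", variants=frozenset({"DEL-RRA"}))  # AP produced by the deletion

collapsed = collapse_end_nodes([del_ap, ref_ap])
print(collapsed)
```

Only the reference AP survives, so nothing downstream can tell that AALLLRAP contains a variant.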

@lydiayliu
Collaborator

OMG this is the edge case for bubble collapse! What are we gonna do about it?

@zhuchcn
Member Author

zhuchcn commented Jul 15, 2022

I initially tried to separate the reference nodes from nodes with any variants, so that nodes with variants would only be collapsed with each other. But that caused a weird problem where one of my existing test cases fails 50% of the time. I also realized that this isn't good because synonymous mutations could also be called. For coding it's fine, but not for noncoding. So I have now switched to separating only the nodes with INDELs. It seems to work now. Will open a PR shortly.
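
A rough sketch of the idea, not the code from the PR: the collapse key gets one extra field saying whether the node carries an INDEL, so INDEL nodes are never merged into reference nodes while everything else collapses as before. `Variant`, `Node`, `has_indel`, and `collapse_end_nodes` are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; only the INDEL flag matters here."""
    id: str
    is_indel: bool


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a peptide cleavage graph node."""
    seq: str
    variants: frozenset = frozenset()   # Variant objects carried by the node
    out_nodes: frozenset = frozenset()  # sequences of the outgoing nodes

    def has_indel(self):
        return any(v.is_indel for v in self.variants)


def collapse_end_nodes(nodes):
    """Collapse nodes with the same sequence and outgoing nodes, but keep
    nodes with INDELs in a separate group from nodes without them."""
    kept = {}
    for node in nodes:
        key = (node.seq, node.out_nodes, node.has_indel())
        current = kept.get(key)
        # Within a group, still prefer the node without any variant.
        if current is None or (current.variants and not node.variants):
            kept[key] = node
    return list(kept.values())


ref_ap = Node("AP")
del_ap = Node("AP", variants=frozenset({Variant("DEL-RRA", is_indel=True)}))
snv_ap = Node("AP", variants=frozenset({Variant("SNV-syn", is_indel=False)}))

collapsed = collapse_end_nodes([del_ap, snv_ap, ref_ap])
print(collapsed)
```

In this sketch the synonymous SNV node still collapses into the reference AP, while the AP from the deletion stays as its own variant node, so AALLLRAP can still be called.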

@lydiayliu
Collaborator

So basically ignoring the INDELs in bubble collapse, and still treating everything else the same

zhuchcn self-assigned this Jul 16, 2022