Fix bug in how we determine problematic peptides after running path-finding algorithm #581

Merged: 1 commit into hotfix from pvacvector_hotfix, Oct 5, 2020

susannasiebert
Contributor

Previously, after a failed path-finding attempt, we determined problematic peptides by simply calling identify_problematic_peptides again. That call never returned any results, because the same check already had to pass before a path-finding attempt was made at all. As a result, if a path-finding attempt was made but no path was found, no peptides were marked for trimming, so any subsequent trimming had no effect. Trimming only happened if the pre-check before path finding failed.

This bugfix now tracks the "best" (but failing) path created by the annealing algorithm and collects every pair of adjacent nodes on that path that has no edge between them (i.e., no good junction). It then returns the peptides that have a problematic end or start.
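
A minimal sketch of the idea described above, not the code from this PR. It assumes the graph is given as a plain set of directed (left, right) edges and that, for a missing junction, the left peptide has a problematic end and the right peptide a problematic start. The function names `junctions_without_edge` and `problematic_peptides` and the peptide names are hypothetical.

```python
def junctions_without_edge(path, edges):
    """Return adjacent (left, right) pairs on `path` that have no edge."""
    return [(a, b) for a, b in zip(path, path[1:]) if (a, b) not in edges]


def problematic_peptides(path, edges):
    """Split the missing junctions into problematic starts and ends.

    Assumption: the left peptide of a missing junction has a problematic
    end, the right peptide a problematic start.
    """
    problematic_start = set()
    problematic_end = set()
    for left, right in junctions_without_edge(path, edges):
        problematic_end.add(left)
        problematic_start.add(right)
    return problematic_start, problematic_end


# Hypothetical "best but failing" path: the pepB -> pepC junction has no edge.
best_path = ["pepA", "pepB", "pepC", "pepD"]
edges = {("pepA", "pepB"), ("pepC", "pepD")}
problematic_start, problematic_end = problematic_peptides(best_path, edges)
```

Here only pepB (end) and pepC (start) are marked, so a later trimming step would touch just the two peptides around the missing junction.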

susannasiebert changed the base branch from master to hotfix on August 14, 2020, 19:12
Member

tmooney left a comment

I think this change makes sense, but I don't really understand the implications of repeatedly overwriting problematic_start and problematic_end with the results for the latest spacer attempted when we come around for another try of the outer loop. Is there something special about the last spacer or does it just not matter?

try:
    min_score = min(min_score, edge['weight'])
except:
    min_score = edge['weight']
Member

This is a clever way to avoid an initialization error the first time through the loop! I'd have been tempted to initialize it to math.inf before the loop.
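
For comparison, a rough sketch of the two approaches mentioned here. The function names and the sample edge list are hypothetical, and the `except` is narrowed to `NameError` for the sketch (the snippet under review uses a bare `except`).

```python
import math


def min_weight_try_except(edges):
    # Pattern from the reviewed snippet: on the first pass min_score is
    # unbound, so the lookup raises and the except branch initializes it.
    for edge in edges:
        try:
            min_score = min(min_score, edge["weight"])
        except NameError:  # UnboundLocalError is a subclass of NameError
            min_score = edge["weight"]
    return min_score  # raises UnboundLocalError if `edges` is empty


def min_weight_inf(edges):
    # Alternative: start from infinity so no special first-pass case is needed.
    min_score = math.inf
    for edge in edges:
        min_score = min(min_score, edge["weight"])
    return min_score


edges = [{"weight": 7.5}, {"weight": 3.0}, {"weight": 9.1}]
```

Both give the same minimum for a non-empty edge list. The `math.inf` version also has a defined result for an empty list, and it does not hide unrelated exceptions the way a bare `except` would.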

@susannasiebert
Contributor Author

Yeah, in the current implementation only the last spacer run matters. After the last spacer runs, problematic_start and problematic_end are used to trim the problematic starts/ends from the peptides. The inner for loop over all the spacers then reruns with the trimmed peptides. In #580 this info will also be used to determine which junctions to add a spacer to (instead of adding spacers to all junctions).
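
The control flow described in this comment could look roughly like the sketch below. It is a toy illustration under stated assumptions, not the project's code: `run_with_trimming`, `trim`, `toy_attempt`, and `max_trims` are hypothetical names, peptides are plain strings, trimming removes one residue from the flagged side, and the toy attempt simply fails for any peptide that starts or ends with "X".

```python
def trim(peptide, trim_start, trim_end):
    """Toy trimming: drop one residue from each flagged side."""
    if trim_start:
        peptide = peptide[1:]
    if trim_end:
        peptide = peptide[:-1]
    return peptide


def toy_attempt(peptides, spacer):
    """Hypothetical stand-in for one path-finding attempt.

    Returns (path, problematic_start, problematic_end); path is None on failure.
    """
    starts = {p for p in peptides if p.startswith("X")}
    ends = {p for p in peptides if p.endswith("X")}
    if starts or ends:
        return None, starts, ends
    return spacer.join(peptides), set(), set()


def run_with_trimming(peptides, spacers, attempt, max_trims=3):
    for _ in range(max_trims + 1):
        problematic_start, problematic_end = set(), set()
        for spacer in spacers:
            # Each spacer overwrites the previous results, so only the
            # last spacer's problematic peptides survive the inner loop.
            path, problematic_start, problematic_end = attempt(peptides, spacer)
            if path is not None:
                return path
        # Trim the flagged sides, then rerun the inner loop over all spacers.
        peptides = [
            trim(p, p in problematic_start, p in problematic_end)
            for p in peptides
        ]
    return None


result = run_with_trimming(["XAAA", "BBBX", "CCC"], ["", "GG"], toy_attempt)
```

In this toy run the first round fails for both spacers, the flagged sides are trimmed, and the second round succeeds with the first spacer.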

susannasiebert merged commit 88476bc into hotfix on Oct 5, 2020
susannasiebert deleted the pvacvector_hotfix branch on January 27, 2021, 14:45