How to extract most likely position of series of nodes #104

Open
RenzoTale88 opened this issue Mar 25, 2021 · 5 comments

Comments

@RenzoTale88

Hello,
I've loaded a pg graph in Python using the bdsg module, and I'm processing a series of alignments to that graph.
For practical reasons I'm processing them as GAF alignments generated with vg convert -G.
For each read, I have a path in the >47102051>47102052>47102053 format. For each node I can extract every position it occupies on every path, but that makes it impractical to work out the most likely contiguous set of intervals. Is there a way to get that directly from the node sequence?
For example, if node 47102051 can come from chr1:0-10 or chr1:50-70, and node 47102052 comes from chr1:11-24, then the most likely succession is chr1:0-10 > chr1:11-24. I'm not sure I'm explaining the problem clearly, but I hope it makes sense.

Thank you in advance,
Andrea
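
To make the question concrete, here is a minimal sketch of the interval chaining described above. It assumes the candidate intervals for each node have already been collected as (path name, start, end) tuples, that the read runs along the forward strand of the path, and that a small gap between consecutive intervals is acceptable. The names `best_contiguous_chain` and `max_gap` are hypothetical and not part of bdsg.

```python
from typing import List, Optional, Tuple

Interval = Tuple[str, int, int]  # (path name, start, end)


def best_contiguous_chain(
    candidates: List[List[Interval]], max_gap: int = 1
) -> Optional[List[Interval]]:
    """Pick one interval per node so that consecutive intervals are adjacent.

    candidates[i] holds every interval that node i of the read can occupy.
    Two intervals are chained when they lie on the same path and the second
    starts between 0 and max_gap positions after the first one ends.
    Returns the chain with the smallest total gap, or None if no chain exists.
    """
    if not candidates or any(not options for options in candidates):
        return None

    # best[i][j] = (total gap, index of predecessor) for the best chain ending
    # at candidates[i][j], or None when no chain reaches that interval.
    best = [[(0, None) for _ in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for path, start, _end in candidates[i]:
            entry = None
            for k, (prev_path, _prev_start, prev_end) in enumerate(candidates[i - 1]):
                prev = best[i - 1][k]
                if prev is None or prev_path != path:
                    continue
                gap = start - prev_end
                if 0 <= gap <= max_gap:
                    cost = prev[0] + gap
                    if entry is None or cost < entry[0]:
                        entry = (cost, k)
            row.append(entry)
        best.append(row)

    finals = [(entry[0], j) for j, entry in enumerate(best[-1]) if entry is not None]
    if not finals:
        return None

    # Walk the back pointers from the cheapest final interval.
    _, j = min(finals)
    chain = []
    for i in range(len(candidates) - 1, -1, -1):
        chain.append(candidates[i][j])
        j = best[i][j][1]
    chain.reverse()
    return chain


if __name__ == "__main__":
    per_node = [
        [("chr1", 0, 10), ("chr1", 50, 70)],  # node 47102051
        [("chr1", 11, 24)],                   # node 47102052
    ]
    print(best_contiguous_chain(per_node))
```

With the two nodes from the example, only chr1:0-10 can be followed by chr1:11-24, so chr1:50-70 is discarded. A read aligned to the reverse strand would need the comparison flipped.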

@jeizenga
Contributor

I'm not sure if this is a libbdsg problem per se, but it sounds to me like what you're looking for is an alignment score. vg surject has an algorithm much like this, but I'm not sure if it will be very efficient on graphs with complex topologies.

@RenzoTale88
Author

Thank you for the reply. Yes, I did try surject, but with a graph derived from Cactus it kept failing around very large regions of the genome. Alternatively, is there a way to test whether a path visits a given sequence of nodes consecutively?

@jeizenga
Contributor

I think you could do what you're describing by using for_each_step_on_handle to get all of the path steps on the node and then using get_next_step or get_previous_step to walk the paths locally and check for the adjacent node.
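
A rough sketch of that approach, in case it helps. It assumes `graph` is an already-loaded bdsg graph exposing the usual path handle graph methods (`get_handle`, `for_each_step_on_handle`, `has_next_step`, `get_next_step`, `has_previous_step`, `get_previous_step`, `get_handle_of_step`, `get_id`, `get_is_reverse`, `get_path_handle_of_step`, `get_path_name`), and that `for_each_step_on_handle` reports steps in either orientation. The helper names `parse_gaf_path` and `paths_containing_walk` are made up for illustration, and this has not been run against a real graph.

```python
import re


def parse_gaf_path(path_field):
    """Split a GAF path such as '>1>2<3' into (node_id, is_reverse) pairs."""
    return [
        (int(node), sign == "<")
        for sign, node in re.findall(r"([<>])(\d+)", path_field)
    ]


def paths_containing_walk(graph, walk):
    """Return the names of paths that visit the nodes of `walk` consecutively.

    `walk` is a list of (node_id, is_reverse) pairs. A path counts as a hit
    whether the read runs along it or against it.
    """
    if not walk:
        return []

    first_id, first_rev = walk[0]
    start_steps = []

    def collect(step):
        start_steps.append(step)
        return True  # keep iterating

    graph.for_each_step_on_handle(graph.get_handle(first_id, False), collect)

    hits = []
    for start in start_steps:
        # If the path visits the first node in the read's orientation, the
        # read runs along the path; otherwise it runs against it and we walk
        # backwards, expecting flipped orientations.
        step_rev = graph.get_is_reverse(graph.get_handle_of_step(start))
        along = step_rev == first_rev
        has_more = graph.has_next_step if along else graph.has_previous_step
        advance = graph.get_next_step if along else graph.get_previous_step

        step, matched = start, True
        for node_id, is_rev in walk[1:]:
            if not has_more(step):
                matched = False
                break
            step = advance(step)
            handle = graph.get_handle_of_step(step)
            expected_rev = is_rev if along else not is_rev
            if graph.get_id(handle) != node_id or graph.get_is_reverse(handle) != expected_rev:
                matched = False
                break
        if matched:
            hits.append(graph.get_path_name(graph.get_path_handle_of_step(start)))
    return hits
```

Something like `paths_containing_walk(graph, parse_gaf_path(">47102051>47102052>47102053"))` would then list the paths on which those three nodes are adjacent. A path that visits the first node more than once can appear more than once in the result.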

@ekg
Member

ekg commented Mar 26, 2021 via email

@RenzoTale88
Author

@jeizenga OK, thanks, I can try to implement that.

@ekg yes, I think it might help. Do you think it's possible to get this from a graph.og and a list of node IDs as above (>47102051>47102052>47102053)?

Thanks to you both for your help, I really appreciate it!
