Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logic of taking the subset within a given radius #8

Closed
snurk opened this issue Nov 22, 2019 · 8 comments
Closed

Logic of taking the subset within a given radius #8

snurk opened this issue Nov 22, 2019 · 8 comments
Labels
question Further information is requested

Comments

@snurk
Copy link

snurk commented Nov 22, 2019

Dear @lh3,
Can you please provide a comment on how the subgraph extraction with a given radius works?
I tried using this function and was getting weird results. In particular the resulting graph had some non-branching paths that to the best of my knowledge shouldn't have been there and my own implementation wasn't producing them.

@lh3
Copy link
Owner

lh3 commented Nov 22, 2019

The code is here. Just a DFS. I think it doesn't guarantee to find all nodes within a radius, but the nodes it finds should be ok.

gfatools/gfa-util.c

Lines 6 to 43 in 5f7fe55

void gfa_sub(gfa_t *g, int n, char *const* seg, int step)
{
int32_t i;
int8_t *flag;
kvec_t(uint64_t) stack = {0,0,0};
if (n == 0) return;
GFA_CALLOC(flag, g->n_seg * 2);
for (i = 0; i < n; ++i) {
int32_t s;
s = gfa_name2id(g, seg[i]);
if (s >= 0) {
kv_push(uint64_t, stack, (uint64_t)(s<<1|0)<<32);
kv_push(uint64_t, stack, (uint64_t)(s<<1|1)<<32);
}
}
for (i = 0; i < g->n_seg; ++i) // mark all segments to be deleted
g->seg[i].del = 1;
while (stack.n) {
uint64_t x = kv_pop(stack);
uint32_t v = x>>32, r = (uint32_t)x;
if (flag[v]) continue; // already visited
flag[v] = 1;
g->seg[v>>1].del = 0;
if (r < step) {
uint32_t nv = gfa_arc_n(g, v);
gfa_arc_t *av = gfa_arc_a(g, v);
for (i = 0; i < nv; ++i) {
if (flag[av[i].w] == 0)
kv_push(uint64_t, stack, (uint64_t)av[i].w<<32 | (r + 1));
if (flag[av[i].w^1] == 0)
kv_push(uint64_t, stack, (uint64_t)(av[i].w^1)<<32 | (r + 1));
}
}
}
free(stack.a);
free(flag);
gfa_arc_rm(g);
}

@lh3
Copy link
Owner

lh3 commented Nov 22, 2019

Probably I should change DFS to BFS...

@snurk
Copy link
Author

snurk commented Nov 22, 2019

Got you! Do I understand correctly that the current implementation will produce weird result in a quite common case of querying the subgraph around the path in the graph?

@lh3
Copy link
Owner

lh3 commented Nov 22, 2019

the current implementation will produce weird result in a quite common case of querying the subgraph around the path in the graph?

I don't think so. The nodes gfatools brings should be correct.

@snurk
Copy link
Author

snurk commented Nov 22, 2019

Maybe I am missing something, but it seems that if I ask for radius 3 then when I start with the first segment in the path I will reach the 4-th segment from it. Then 4-th segment will already be visited and we will not start the 'fresh' DFS from it, missing its neighbours. I can try to allocate some time next week to provide an example missing relevant vertices.

@lh3
Copy link
Owner

lh3 commented Nov 23, 2019

gfatools may have false negatives, but it should not have false positives. If you can provide an example, I can explore BFS. Thanks.

@snurk
Copy link
Author

snurk commented Nov 26, 2019

You are correct, it will not have false positives. If you think that false negatives are not a problem please close the issue :)

@lh3 lh3 added the question Further information is requested label Nov 26, 2019
@lh3
Copy link
Owner

lh3 commented Nov 26, 2019

Thanks for the confirmation. I will mark this as a question and close it. Nonetheless, I still want to come back to subgraph extraction at some point to address the false negative part. BFS sounds the better approach anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants