Could not compute similarity for... #30
Comments
Hey @liz-is, thank you for the detailed bug report. Can you please try to plot the O/E matrix of a chromosome (or part thereof) that fails? I have a suspicion that the expected values might be the issue here, in which case this is probably related to the FAN-C dev version. Thanks!
Hey Liz!
Hi @nickmachnik, this was an issue with the FAN-C development version, which we could figure out independently, so I am closing this!
Hi folks,
Some of my region pairs are being deemed invalid, but I don't think they fall into any of the possible reasons given. Do you have any other ideas what the issue might be? Is there a way I can get more diagnostic info to try to debug this myself (without having to dig deep into the code and run each step manually, which I can do if necessary)?
Here's the error message:
This is Drosophila Hi-C data. I've tried different resolutions and two different window sizes (100x and 150x the bin size). The pairs file for each parameter combo was generated with `chess pairs` from the same text file with the chromosome sizes (and these files look okay to me from a quick glance).
In each example, all bins from certain chromosomes are missing! In particular, chr 2R and 3R. However, I get results for these chromosomes at 25 kb resolution, so I don't think there is a chromosome naming mismatch between the files or anything like that.
(N.B., it makes sense that there are no valid pairs on chr 4 at 25kb resolution, since I'm using a window size of at least 2.5 Mb, which is larger than the chromosome size. Same for 10 kb resolution with 150x window size)
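The arithmetic behind that note can be sketched roughly as below. This is only an illustration: the function name `chromosomes_too_small` is made up for this post, and the chromosome sizes are placeholder values, not the real assembly lengths.

```python
def chromosomes_too_small(chrom_sizes, bin_size, window_bins):
    """Return the chromosomes that are shorter than a single window.

    chrom_sizes: dict mapping chromosome name to length in bp
    bin_size:    Hi-C resolution in bp
    window_bins: window size expressed in bins (e.g. 100 or 150)
    """
    window_bp = bin_size * window_bins
    return sorted(name for name, size in chrom_sizes.items() if size < window_bp)


# Placeholder sizes for illustration only (NOT the real assembly values).
example_sizes = {"chr2R": 25_000_000, "chr4": 1_300_000}

# 25 kb bins, 100x window: 2.5 Mb window
print(chromosomes_too_small(example_sizes, bin_size=25_000, window_bins=100))
# 10 kb bins, 150x window: 1.5 Mb window
print(chromosomes_too_small(example_sizes, bin_size=10_000, window_bins=150))
# 10 kb bins, 100x window: 1 Mb window
print(chromosomes_too_small(example_sizes, bin_size=10_000, window_bins=100))
```

With these placeholder sizes, only the first two parameter combinations flag chr 4, which is the pattern I'd expect; it does not explain the missing chr 2R and 3R pairs.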
I would have thought that it would be a resolution issue (i.e. too many unmappable bins), but having plotted each chromosome at 10kb resolution in both my query and my reference, they look fine. Some unmappable bins but I'd expect to get some results - they don't look any worse than other chromosomes.
I'm happy to look into this further myself since I have some familiarity with the code by now, but I'm not really sure where to start. Do you have any ideas?
I am using a development version of FAN-C, but @kaukrise said that it should work fine.
Also, as a more general comment, would it be possible to implement a more informative version of this message?
2021-01-15 14:45:01,759 INFO Could not compute similarity for 6316 region pairs.This can be due to faulty coordinates, too smallregion sizes or too many unmappable bins
I've seen other questions relating to this, so it seems like a common issue/point of confusion. Although most of the time this is easy to solve, it would be helpful to know which of those three possibilities accounts for the invalid pairs as a starting point for debugging.
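To make the request concrete, here is a minimal sketch of the kind of per-reason breakdown I have in mind. It is not taken from the CHESS code base: `classify_region`, `min_bins`, and `max_unmappable_frac` are hypothetical names and thresholds chosen for illustration, and the real validity criteria may well differ.

```python
from collections import Counter


def classify_region(start, end, chrom_size, mappable, bin_size,
                    min_bins=20, max_unmappable_frac=0.5):
    """Return a reason string for one region (hypothetical criteria).

    mappable: list of booleans, one per bin along the chromosome,
              True where the bin has usable signal.
    """
    # Reason 1: coordinates fall outside the chromosome or are inverted.
    if start < 0 or end > chrom_size or start >= end:
        return "faulty coordinates"
    # Reason 2: region spans too few bins (assumed threshold).
    first_bin, last_bin = start // bin_size, end // bin_size
    n_bins = last_bin - first_bin
    if n_bins < min_bins:
        return "region too small"
    # Reason 3: too large a share of the bins is unmappable (assumed threshold).
    region_bins = mappable[first_bin:last_bin]
    unmappable_frac = region_bins.count(False) / len(region_bins)
    if unmappable_frac > max_unmappable_frac:
        return "too many unmappable bins"
    return "ok"


def summarise(regions, chrom_size, mappable, bin_size):
    """Count how many regions fall under each reason."""
    return Counter(
        classify_region(start, end, chrom_size, mappable, bin_size)
        for start, end in regions
    )
```

A log line built from such a count (so many pairs per reason) would already tell me where to start looking.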