cis/trans different using cooler vs. cooltools #244

Closed
biofilos opened this issue Mar 4, 2021 · 3 comments

Comments

@biofilos

biofilos commented Mar 4, 2021

Hi. I am using the sample cool file you provide in the notebook tutorial to learn how the cool* environment works.
I was trying to calculate cis/trans contacts by traversing the matrix, like this:

import cooler
import numpy as np

filepath = 'data/Rao2014-GM12878-MboI-allreps-filtered.5kb.cool'
c = cooler.Cooler(filepath)
def allvsall(lst):
    # Yield every unordered pair of chromosomes once, including self-pairs.
    for ix1, i1 in enumerate(lst):
        for i2 in lst[ix1:]:
            yield i1, i2
cis, trans = 0, 0
for c1, c2 in allvsall(c.chromsizes.index):
    print(c1, c2)
    mat = c.matrix(sparse=False, balance=False).fetch(c1, c2)
    counts = np.nansum(np.triu(mat))
    if c1 == c2:
        cis += counts
    else:
        trans += counts

When comparing the cis and trans values from this approach with the ones in the metadata on the cool file, I find that the cis values are the same, but the trans values are different.
When doing the same calculation using cooltools:

from concurrent import futures
from cooltools import coverage
with futures.ProcessPoolExecutor(4) as pool:
    cis, total = coverage.get_coverage(c, ignore_diags=0, map=pool.map)
cis_total = cis.sum()
trans_total = total.sum() - cis_total

cis_total and trans_total correspond to what I see in the metadata of the cool file.

I wrote the first method, which is incredibly slow, as a way of learning how to use the library. Even so, I expected its results to be consistent with the metadata of the file.
My question is, why do those two methods differ in their values?

@nvictus
Member

nvictus commented Mar 4, 2021

From a quick glance, it looks like you are applying np.triu(mat) even when mat comes from trans. Could that be the issue?
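For anyone landing here later, a minimal sketch of the fix suggested above: take the upper triangle only for cis blocks, and sum trans blocks in full. The helper name cis_trans_counts, the fetch callable, and the toy matrix are made up for illustration and are not part of cooler or cooltools. The sketch assumes each chromosome pair is visited once, as in the original loop.

```python
import numpy as np

def cis_trans_counts(fetch, chroms):
    """Sum cis and trans contacts, visiting each chromosome pair once.

    `fetch(c1, c2)` is assumed to return the dense block of counts for
    the two chromosomes, for example
    `c.matrix(sparse=False, balance=False).fetch` from the question.
    """
    cis, trans = 0, 0
    for i, c1 in enumerate(chroms):
        for c2 in chroms[i:]:
            mat = fetch(c1, c2)
            if c1 == c2:
                # A cis block is square and symmetric, so its upper
                # triangle (diagonal included) holds each contact once.
                cis += np.nansum(np.triu(mat))
            else:
                # A trans block is rectangular and not symmetric, and its
                # mirror block is never visited, so sum all of it.
                trans += np.nansum(mat)
    return cis, trans

# Toy symmetric genome-wide matrix: chrA covers bins 0-1, chrB bins 2-4.
rng = np.random.default_rng(0)
upper = np.triu(rng.integers(0, 10, size=(5, 5)))
full = upper + np.triu(upper, k=1).T
spans = {"chrA": slice(0, 2), "chrB": slice(2, 5)}

def toy_fetch(c1, c2):
    # Stand-in for the cooler matrix selector, slicing the toy matrix.
    return full[spans[c1], spans[c2]]

cis, trans = cis_trans_counts(toy_fetch, list(spans))
```

With the real file, the call would presumably look like cis_trans_counts(c.matrix(sparse=False, balance=False).fetch, list(c.chromsizes.index)). It will still be slow at 5 kb resolution because every block is loaded densely.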

@biofilos
Author

biofilos commented Mar 4, 2021

Oh yes, that makes sense. Thanks!
By the way, do you have a Google group, a forum, or something like that?
I realize that GitHub issues are not the best place to post these kinds of questions.

Thanks again.

@nvictus
Member

nvictus commented Mar 4, 2021

Yes, we're on Slack! Join here: https://bit.ly/2UaOpAe

@nvictus nvictus closed this as completed Mar 7, 2021