missing pairs - somalier relate v. >= 0.2.7 #76

Closed
LolloPero opened this issue Jun 10, 2021 · 5 comments

@LolloPero

I am running somalier relate and comparing the pairs.tsv output across different versions of the somalier package.

The input to somalier relate is n=875 .somalier files, so the corresponding pairs.tsv output file should contain the relatedness for every possible pairing: n(n-1)/2 = 382,375 pairs (no repetition, order does not matter).
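
As a quick check of the expected count, the number of unordered pairs among 875 samples can be computed directly. This is only the arithmetic, nothing somalier-specific:

```python
import math

n_samples = 875
# Unordered pairs without repetition: n choose 2 = n * (n - 1) / 2
n_pairs = math.comb(n_samples, 2)
print(n_pairs)  # 382375
```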

Yet, this is true only for some versions, and NOT for the latest (v 0.2.13):
[image: somlaier_relate_comparison]

  • How is the pairs.tsv output computed in versions >= 0.2.7?
@brentp
Owner

brentp commented Jun 10, 2021

Hi, I agree this is confusing.
somalier writes a message to stderr saying that it won't write all sample-pairs to file when the number of pairs is large. In e8a2c29 I lowered the cutoff for this from 400K to 200K sample-pairs; your 382,375 pairs fall between those two values, so you see a difference.

There is some randomness and that's why you see mildly different numbers between versions -- you'd also see this if you ran the same version multiple times.

Let me know if this answers your question or if you have suggestions to make it less jarring. I suppose I could seed the random number generator with a fixed number so you'd get identical results post v0.2.6.
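
To illustrate why a fixed seed would give identical results between runs, here is a generic sketch. How somalier actually selects pairs is not shown in this thread; `subsample` and its parameters are hypothetical names used only for illustration:

```python
import random


def subsample(pairs, k, seed=None):
    """Pick k pairs at random; a fixed seed makes the choice repeatable."""
    rng = random.Random(seed)  # seed=None gives a different draw on each run
    return rng.sample(pairs, k)


# Toy data: all unordered pairs among 10 hypothetical samples
pairs = [(a, b) for a in range(10) for b in range(a + 1, 10)]

first = subsample(pairs, 5, seed=42)
second = subsample(pairs, 5, seed=42)
print(first == second)  # True: same seed, same selection
```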

@LolloPero
Author

Hi, thank you for clarifying this.

As I understood it, when the number of sample-pairs is above the cutoff, some pairings will be left out for the sake of the html plot.

Is there a way to force somalier relate to write all pairings to the pairs.tsv output file?
Or could this be implemented next?

Thanks

@brentp
Owner

brentp commented Jun 11, 2021

Yes, I can probably make this a hidden option. Just to clarify, somalier will only skip writing pairs that are both:

  1. unrelated by genotypes (calculated rel < 0.05)
  2. expected to be unrelated (no relatedness in pedigree file)

all other pairs will be written.
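
The two conditions above can be sketched as a predicate. This is not somalier's code; the function name is hypothetical, and representing "no relatedness in pedigree file" as an expected relatedness of 0 is an assumption:

```python
REL_CUTOFF = 0.05  # threshold quoted in the list above


def may_be_skipped(genotype_rel: float, expected_rel: float) -> bool:
    """True only when BOTH conditions hold for a pair.

    genotype_rel: relatedness calculated from genotypes
    expected_rel: relatedness implied by the pedigree file
                  (assumed here to be 0.0 when none is recorded)
    """
    unrelated_by_genotype = genotype_rel < REL_CUTOFF
    expected_unrelated = expected_rel == 0.0
    return unrelated_by_genotype and expected_unrelated
```

For example, `may_be_skipped(0.01, 0.0)` is true, while a pair with calculated relatedness 0.5, or one that the pedigree lists as related, is always written.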

@LolloPero
Author

Hi Brent,

I would like to follow up on this thread and ask if it would be possible to implement a silent option to output all pairings in the .group.tsv file, regardless of the 2 conditions written above (1. unrelated by genotypes 2. expected to be unrelated).

It would be great if this could be done in the latest version (0.2.13).

Thanks :)

@brentp brentp closed this as completed in c0feff3 Jan 9, 2022
@brentp
Owner

brentp commented Jan 10, 2022

Hi, this is available in the latest release by setting the environment variable SOMALIER_REPORT_ALL_PAIRS to a non-empty value, e.g. export SOMALIER_REPORT_ALL_PAIRS=true
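
When somalier is launched from a script rather than an interactive shell, the variable can be set for that one process only. A minimal sketch, assuming the `somalier` binary is on PATH; the helper name and the `dry_run` flag are hypothetical:

```python
import os
import subprocess


def relate_all_pairs(somalier_files, somalier_bin="somalier", dry_run=False):
    """Run `somalier relate` with SOMALIER_REPORT_ALL_PAIRS set."""
    # Copy the current environment and add the variable; any non-empty value works
    env = dict(os.environ, SOMALIER_REPORT_ALL_PAIRS="true")
    cmd = [somalier_bin, "relate", *somalier_files]
    if not dry_run:
        # check=True raises CalledProcessError if somalier exits non-zero
        subprocess.run(cmd, env=env, check=True)
    return cmd, env
```

Calling it with `dry_run=True` returns the command and environment without executing anything, which is handy for checking what would be run.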
