Why were many SNPs excluded as invariant or singleton in population 2? #57

Closed
yuan102379 opened this issue Jun 21, 2020 · 3 comments

Comments

@yuan102379

Hi hardingnj,

First, thanks a lot for your outstanding XP-CLR software.

I have been using XP-CLR to run a selective sweep analysis. However, I noticed that some SNPs were excluded as invariant or singleton in population 2 and were not used in the analysis. The log output is as follows:

####################
2020-06-21 11:32:38 : INFO : running xpclr v1.1.2
2020-06-21 11:32:38 : INFO : Loading TXT
2020-06-21 11:33:54 : INFO : TXT loading complete
2020-06-21 11:33:54 : INFO : 951,143 SNPs in total are in the provided input files
2020-06-21 11:33:54 : INFO : 0 SNPs excluded as multiallelic
2020-06-21 11:33:54 : INFO : 0 SNPs excluded as missing in all samples in a population
2020-06-21 11:33:54 : INFO : 62,460 SNPs excluded as invariant or singleton in population 2
2020-06-21 11:33:54 : INFO : 888,683/951,143 SNPs included in the analysis (93.43%)
2020-06-21 11:33:55 : INFO : Done dropping above SNPs from analysis. XP-CLR algorithm starting.
2020-06-21 11:33:56 : INFO : Omega estimated as : 0.655994
2020-06-21 11:33:56 : DEBUG : Processing window 1/5713...
2020-06-21 11:38:02 : DEBUG : Processing window 11/5713...
2020-06-21 11:41:45 : DEBUG : Processing window 21/5713...
2020-06-21 11:41:59 : DEBUG : Processing window 31/5713...

#####################

Could you explain why these SNPs were excluded? Where can I find detailed information about the excluded SNPs? If these SNPs are specific to population 2, they should be important for population 2. Were these SNPs also filtered out in the original XP-CLR program of Chen (2010)?

Thanks for your help & have a nice day.

Yours,

Yuan

@hardingnj
Owner

hardingnj commented Jun 22, 2020

Hi there,

XPCLR is a one-way scan, so it only looks for selection in population 1, not population 2. Think of population 1 as the object population and population 2 as the reference population. If you flip the populations (1->2, 2->1), then SNPs excluded from the first analysis may well be informative.

This isn't mentioned explicitly in the paper, but the likelihoods don't work if your reference population is fixed at a SNP, because under the null assumption there is no way to drift from a fixed allele to the frequency observed in pop1. A frequency can drift from 0.5 to 0.0, but not from 0.0 to 0.5.

Hope that helps.
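
To see which SNPs fall into the "invariant or singleton in population 2" category, the filter described above can be reproduced outside the tool. The following is a minimal sketch in plain Python, not xpclr's actual code: the function names, the genotype encoding (tuples of 0/1 alleles, `None` for a missing call), and the `min_count=2` threshold are all assumptions made for illustration.

```python
def minor_allele_count(genotypes):
    """Count copies of the rarer allele among the called alleles.

    `genotypes` is a list of (a1, a2) tuples with alleles coded 0/1;
    None marks a missing call (hypothetical encoding for this sketch).
    """
    alleles = [a for gt in genotypes for a in gt if a is not None]
    alt = sum(alleles)
    return min(alt, len(alleles) - alt)


def excluded_in_reference(positions, ref_genotypes, min_count=2):
    """Return positions whose minor allele count in the reference
    population is below `min_count` (0 = invariant, 1 = singleton)."""
    return [
        pos
        for pos, gts in zip(positions, ref_genotypes)
        if minor_allele_count(gts) < min_count
    ]


positions = [101, 250, 388, 472]
pop2 = [
    [(0, 0), (0, 0), (0, 0)],        # invariant: only allele 0 observed
    [(0, 1), (0, 0), (0, 0)],        # singleton: allele 1 observed once
    [(0, 1), (1, 1), (0, 0)],        # polymorphic: would be kept
    [(1, 1), (1, 1), (None, None)],  # invariant for allele 1, one missing call
]

dropped = excluded_in_reference(positions, pop2)
print(dropped)  # [101, 250, 472]
```

Running the same check with the two populations swapped shows which SNPs would be dropped if population 1 were used as the reference instead.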

@yuan102379
Author

Hi @hardingnj ,

Thanks for the clear explanation, which helps me understand XPCLR much better.

May I ask another question?

I want to use map and geno files as input, in the same format as Chen's example data. The map file has six columns, and the third column is the genetic distance of the SNP location. I want to confirm the unit of genetic distance: is it Morgans or centimorgans (1 Morgan = 100 centimorgans)?

Thanks a lot, hardingnj.

Yours,

Yuan

@hardingnj
Owner

My apologies @yuan102379 - I missed your reply.

The unit is cM, so 1 cM is 1/100 of a Morgan. However, the choice of unit is not important to the output, because genetic distance is only used relatively within a window.
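
A small illustration of why a constant unit factor drops out when distances are only compared relative to each other within a window. This is not xpclr's internal computation; the helper names and the example positions are hypothetical, and the sketch assumes a window with at least two distinct genetic positions.

```python
def cm_to_morgans(cm):
    """Convert centimorgans to Morgans (1 Morgan = 100 cM)."""
    return cm / 100.0


def relative_positions(gen_pos):
    """Rescale genetic positions within one window to the range [0, 1].

    Assumes the window contains at least two distinct positions.
    """
    start, end = min(gen_pos), max(gen_pos)
    span = end - start
    return [(g - start) / span for g in gen_pos]


# Hypothetical genetic positions of four SNPs in one window, in cM.
window_cm = [12.0, 12.5, 13.0, 14.0]
window_morgans = [cm_to_morgans(g) for g in window_cm]

rel_cm = relative_positions(window_cm)
rel_m = relative_positions(window_morgans)

# The relative spacing agrees (up to floating-point rounding) in both units.
same = all(abs(a - b) < 1e-9 for a, b in zip(rel_cm, rel_m))
print(same)
```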
