unique markers #2

Open
DrLaurenCWhite opened this issue Apr 17, 2019 · 4 comments

@DrLaurenCWhite

Hi,
I've been trying to run CLAPPER, but I get an error that I cannot figure out how to solve.

"Thre are more than one markers in the same position in filename.tped! Every marker must have a unique position."

I have double-checked that all positions are unique. However, I don't have genetic distances in my tped file: the values in that column are all set to 0 (and #conditiononld is also set to 0 in my options file). Could this be the problem? Can you suggest a way around it?

Cheers

@amyko
Owner

amyko commented Apr 17, 2019 via email

@DrLaurenCWhite
Author

Thanks for the quick reply! And yes, that's really helpful. Cheers

@agilly

agilly commented Apr 28, 2019

Hi there, I'm having the same issue. I am not familiar with cM distances and recombination. My PLINK files don't have recombination rates, so I have to add them. I fetched them from: http://bochet.gcc.biostat.washington.edu/beagle/genetic_maps/plink.GRCh38.map.zip

and I converted them to a format PLINK likes, using the recombination rate average of 1.2:

for i in {1..22}; do awk '{print $4, "1.2", $3*100}' plink.chr$i.GRCh38.map | sponge plink.chr$i.GRCh38.map; done

This gives a file like this per chromosome (position, recomb_rate, Morgan distance):

55550 1.2 0
82571 1.2 8.0572
88169 1.2 9.2229
285245 1.2 43.9456
629218 1.2 147.815
629241 1.2 147.821
630053 1.2 148.056
632942 1.2 148.889
633147 1.2 148.948
785910 1.2 193.179

I then annotate my file with plink --cm-map plink.chr@.GRCh38.map

I still get the same error. What am I doing wrong?

Thanks,

A

@agilly

agilly commented May 1, 2019

So, it turns out that if you have sequencing data, consecutive variants can be so close together that they end up with the same genetic (Morgan) coordinate. I worked around this by removing variants that share a coordinate. Another way would be to add a small amount of random noise to the duplicates using R.
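
For anyone hitting the same thing, here is a minimal sketch of the first workaround (dropping duplicates). It assumes the usual PLINK .tped column order (chromosome, variant ID, genetic position, base-pair position, then genotypes) with whitespace-separated columns. The function name dedup_cm is made up for illustration; it is not part of CLAPPER or PLINK.

```shell
# Keep only the first variant seen for each (chromosome, genetic position) pair.
# Assumed .tped columns: $1 = chromosome, $3 = genetic position.
# Positions are compared as text, so "0.5" and "0.50" count as different.
dedup_cm() {
  awk '!seen[$1 FS $3]++'
}

# Hypothetical usage (file names are placeholders):
#   dedup_cm < input.tped > input.dedup.tped
```

Note that this only makes positions unique within a chromosome, and any filtering of the .tped has to be consistent with whatever other input files you pass alongside it.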

I also ran into another error after that, saying that my coordinates were decreasing. It turns out the chromosome map resets at the start of every chromosome, whereas CLAPPER expects monotonically increasing positions throughout the genome. This is solved by adding the last position of the previous chromosome to every position on the next one.
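
A rough sketch of that offsetting step, under the same assumed .tped layout and assuming the file is already sorted so that each chromosome forms one contiguous block. The name make_cumulative is hypothetical, and the output is re-joined with single spaces.

```shell
# Turn per-chromosome genetic positions into genome-wide cumulative ones.
# Assumed .tped columns: $1 = chromosome, $3 = genetic position.
make_cumulative() {
  awk '
    # On a chromosome change, carry over the last cumulative position.
    $1 != chr { offset = last; chr = $1 }
    {
      pos = $3 + offset
      # Fixed-point formatting avoids the default 6-significant-digit
      # rounding, which could merge nearby positions late in the genome.
      $3 = sprintf("%.6f", pos)
      last = pos
      print
    }
  '
}

# Hypothetical usage (file names are placeholders):
#   make_cumulative < input.dedup.tped > input.cumulative.tped
```

One caveat: if a chromosome's first marker sits at exactly 0, it will land on the same value as the last marker of the previous chromosome, so it may be worth checking for duplicates again after this step.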

I wonder if it would be possible to use chromosome and position instead of cM distance in a future release. Files annotated with cM distance are now becoming rarer and rarer, and recombination maps are hard to find for new builds.
