Error when using physical map sorted based on physical position and ID #26
Comments
Hi Vanessa, Sorry for the delay in getting back to you. At the moment selscan can only handle biallelic variants, and so when multiple variants are reported at the same physical position it will throw this error, although I can see how it can be a confusing message. I think your best bet would be to filter these two sites from your dataset. I hope this helps! -Zach |
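To find the sites that would need filtering, one option is to list every physical position that appears more than once in the map file. The following is only a minimal sketch, not part of selscan: the function name `duplicated_position_ids` is hypothetical, and it assumes a single-chromosome, whitespace-separated map file with four columns (chromosome, locus ID, genetic position, physical position), as in the sample rows from the original post.

```python
from collections import defaultdict


def duplicated_position_ids(map_lines):
    """Return {physical_position: [locus IDs]} for positions listed more than once.

    Assumes all rows belong to one chromosome and have four columns:
    chromosome, locus ID, genetic position, physical position.
    """
    ids_by_pos = defaultdict(list)
    for line in map_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        locus_id, physical_pos = fields[1], int(fields[3])
        ids_by_pos[physical_pos].append(locus_id)
    # Keep only the positions shared by two or more loci.
    return {pos: ids for pos, ids in ids_by_pos.items() if len(ids) > 1}


# Rows copied from the sample file in the original post.
sample = [
    "7 rs28498692 216570 216570",
    "7 rs112068709 216605 216605",
    "7 rs201044430 216605 216605",
    "7 rs188651719 216660 216660",
]
dups = duplicated_position_ids(sample)
print(dups)
```

The reported IDs would then have to be removed from both the map file and the matching haplotype data, so that the two stay in the same order.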
Hi Zach, |
Hi Zach, My problem is along the same lines as Vanessa's so I am posting in the same thread. I am trying to use selscan to calculate EHH scores. I have 1000 genomes vcf files which I have used to produce map files with vcftools, such that they look like this:
My selscan command is:
This produces the following error:
Now I have tried to identify the problem row by trying to grep for "-9999" but get nothing. I have also tried to sort on the physical position column but get the same error. There are no blank rows at the start or end of the file. To ensure there was no issue with my map file, I tried using different chromosomes but keep getting this error. By the way, I have also tried using hapbin with the same files using the following command:
But I always get an error:
I have checked and the locus is definitely within the .map and the .hap files (which were created from the vcf files). Therefore I think the problem must be within my map files but I cannot fathom what the issue is. |
So my first thought is that you should request the site by rsid and not position. Please try |
Hi Zach, Many thanks for getting back to me. Unfortunately this does not solve the issue. I still get the exact same error. Also would using IDs not prove an issue for de novo variants that have not been assigned an ID? |
Sorry for the delay in getting back to you. Yes, I think that I should modify the lookup scheme to allow for rsid or genomic position. I typically assign variants without an rsid a temporary id based on the chromosome and position, but I forget this isn't what everyone does. Are you using a publicly accessible vcf file? I would like to try to reproduce this problem. |
Physical map duplicated locations are now allowed, and statistics that are integrated over a map can directly use physical positions with |
Hi Zach, I am new to Selscan and I am having a similar issue. I have been able to get nSL output for one chromosome, but when I attempt to run whole genomes, I get this error: ERROR: Variant physical position must be monotonically increasing. 2:10610:G:A 10610 appears after 1:248945650:C:G 248945650. Example code: selscan --nsl --vcf nameofVCFfile.vcf --out selscannSLresults. It looks like --pmap is not available for nSL . . . any thoughts? Thanks in advance for your help (and for making Selscan user friendly), 😃 |
Hi Tim,
You’ll have to separate your files by chromosome and run each separately.
You can then normalize them together with norm. Let me know if you have
more issues.
Zachary
On Wed, Jun 14, 2023 at 5:42 PM TimothyCiesielski ***@***.***> wrote:
Hi Zach,
I am new to Selscan and I am having a similar issue. I have been able to
get nSL output for one chromosome but when I attempt to run whole genomes,
I get this error:
ERROR: Variant physical position must be monotonically increasing.
2:10610:G:A 10610 appears after 1:248945650:C:G 248945650
example code:
selscan --nsl --vcf nameofVCFfile.vcf --out selscannSLresults
It looks like --pmap is not available for nSL . . . any thoughts?
Thanks in advance for your help (and for making Selscan user friendly),
Tim
😃
|
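The advice above to separate the data by chromosome can be illustrated with a small sketch. It is not selscan code: the helper name `split_vcf_by_chrom` is hypothetical, it assumes an uncompressed, tab-delimited VCF small enough to hold in memory, and the genotype values in the example are made up (only the two variant IDs and positions come from the error message quoted in this thread). For large or compressed files, a dedicated tool such as bcftools would normally be the more practical choice.

```python
from collections import defaultdict


def split_vcf_by_chrom(vcf_lines):
    """Group VCF records by CHROM, repeating the header lines in every group."""
    header = []
    records = defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#"):
            header.append(line)  # meta-information and column header lines
        elif line.strip():
            chrom = line.split("\t", 1)[0]  # CHROM is the first column
            records[chrom].append(line)
    return {chrom: header + recs for chrom, recs in records.items()}


# Two records whose order triggers the error when kept in one file.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "1\t248945650\t1:248945650:C:G\tC\tG\t.\tPASS\t.\tGT\t0|1",
    "2\t10610\t2:10610:G:A\tG\tA\t.\tPASS\t.\tGT\t1|0",
]
per_chrom = split_vcf_by_chrom(example)
for chrom, lines in per_chrom.items():
    print(chrom, len(lines) - 2, "record(s)")
```

Each group could then be written to its own file and passed to selscan separately, with the per-chromosome outputs normalized together afterwards as described above.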
Thanks Zach - I appreciate the help on this. |
Hello! Please tell me how to solve the problem with the sheep genome map file.
With best regards, Lesya |
Hello,
From my perspective, the important question is why are there sites with map
positions out of order relative to the physical positions. In principle
this should be impossible, so I would investigate why this seemed to
happen. I could see this possibly resulting from a liftover of a genetic
map between genome builds, for example.
However, the simplest solution would be to drop one of the two offending
sites from your data. If one of the two sites has low MAF, you might as
well drop that one, as selscan would filter it out anyway.
Hope this helps,
Zachary
On Tue, Mar 26, 2024 at 3:10 PM malteze2024 ***@***.***> wrote:
Hello! Please tell me how to solve the problem with the sheep genome map
file.
If the map file is sorted by genetic position, the program generates a
physical position error and vice versa. Of the 26 chromosomes, only 12 are
processed without errors.
The initial map file was generated through GenomeStudio.
for reg in $(seq 1 26); do
  selscan --xpehh --vcf phasedRMMrenchr$reg.vcf.gz --vcf-ref phasedDMrenchr$reg.vcf.gz --map MAP_sorted$reg.map --threads 12 --out 2xpEhhcheap$reg
done
selscan v2.0.0
Opening phasedRMMrenchr1.vcf.gz...
Loading 108 haplotypes and 61259 loci...
Opening phasedDMrenchr1.vcf.gz...
Loading 106 haplotypes and 61259 loci...
Opening MAP_sorted1.map...
Loading map data for 61259 loci
ERROR: Variant genetic position must be monotonically increasing.
oar3_OAR1_101700644 122.639 appears after oar3_OAR1_101688882 122.64
for reg in $(seq 1 26); do
  selscan --xpehh --vcf phasedRMMrenchr$reg.vcf.gz --vcf-ref phasedDMrenchr$reg.vcf.gz --map 2MAP_sorted$reg.map --threads 12 --out 2xpEhhcheap$reg
done
selscan v2.0.0
Opening phasedRMMrenchr1.vcf.gz...
Loading 108 haplotypes and 61259 loci...
Opening phasedDMrenchr1.vcf.gz...
Loading 106 haplotypes and 61259 loci...
Opening MAP_sorted1.map...
Loading map data for 61259 loci
ERROR: Variant physical position must be monotonically increasing.
OAR19_64803054.1 204694 appears after DU281551_498.1 315497
With best regards, Lesya
|
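To investigate where the genetic and physical orders disagree, as suggested in the reply above, it can help to list every adjacent pair of rows where either position decreases. This is a hedged sketch rather than selscan's own check: `find_order_violations` is a hypothetical name, the four-column map layout (chromosome, locus ID, genetic position, physical position) is assumed, and the physical positions in the example are inferred from the marker names, not taken from the actual file.

```python
def find_order_violations(map_lines):
    """List adjacent rows where the genetic or physical position decreases.

    Each entry is (column, previous_id, previous_value, current_id, current_value).
    """
    violations = []
    prev = None
    for line in map_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        locus_id = fields[1]
        genetic, physical = float(fields[2]), float(fields[3])
        if prev is not None:
            prev_id, prev_gen, prev_phys = prev
            if genetic < prev_gen:
                violations.append(("genetic", prev_id, prev_gen, locus_id, genetic))
            if physical < prev_phys:
                violations.append(("physical", prev_id, prev_phys, locus_id, physical))
        prev = (locus_id, genetic, physical)
    return violations


# Genetic positions from the error message above; physical positions are
# an assumption based on the marker names.
rows = [
    "1 oar3_OAR1_101688882 122.64 101688882",
    "1 oar3_OAR1_101700644 122.639 101700644",
]
violations = find_order_violations(rows)
for v in violations:
    print(v)
```

Running this on a file sorted by physical position shows which pairs have a decreasing genetic position, which are the sites worth inspecting or dropping.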
Thank you for such a quick response! |
Hello,
Well, generally they should be comparable, although you may find slightly
more extreme scores in regions of low recombination. You could also choose
to use xp-nsl, which doesn't use either distance, although it may still
have similar properties. On the whole, I don't think it is too much of a
concern.
Zachary
On Wed, Mar 27, 2024 at 4:22 AM malteze2024 ***@***.***> wrote:
Thank you for such a quick response!
Apparently, I will still have to use the --pmap option.
Does using a physical map have a big impact on my results? There are 600k
SNP in my file.
|
Hello sir, I am working on the cow genome and need to run selscan for iHH12. I have phased the 29 chromosomes into a single VCF file. When I run the command, the error shown is "variant physical position must be monotonically increasing". I am just starting my studies in bioinformatics. Can you guide me on how to resolve this? Thank you, Sanchit |
Hello,
You will need to split your vcfs by chromosome in order to run it through
selscan.
-Zachary
On Mon, Apr 22, 2024 at 2:40 AM drsancho ***@***.***> wrote:
selscan.problem.jpg (view on web)
<https://github.com/szpiech/selscan/assets/167742045/ebdd5f39-62b2-4cf4-9765-e199c37ceece>
I have also tried sorting with the command sort -nk 4 xyz.map > xyz1.map,
but it gives a similar error.
|
okay sir |
It is showing the same error, namely that the variant genetic position should be monotonically increasing. |
Hello,
This error means that your positions are out of order in your file. You
need to either put them in order or remove the sites that are out of order.
-Zachary
On Wed, Apr 24, 2024 at 12:59 AM drsancho ***@***.***> wrote:
It is showing the same error, namely that the variant genetic position should be
monotonically increasing.
Can you please help me further?
|
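The second option mentioned above, removing the sites that are out of order, could be sketched as a single greedy pass over one chromosome's map file. This is an illustration under assumptions, not selscan behavior: `drop_out_of_order` and the `snpA` to `snpD` IDs are hypothetical, the four-column map layout (chromosome, locus ID, genetic position, physical position) is assumed, and ties are kept even though some selscan versions quoted in this thread require strictly increasing positions.

```python
def drop_out_of_order(map_lines):
    """Keep rows whose genetic and physical positions never decrease.

    Returns (kept_lines, dropped_ids). A row is dropped when either of its
    positions is lower than that of the last kept row.
    """
    kept, dropped = [], []
    last_gen = last_phys = float("-inf")
    for line in map_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        genetic, physical = float(fields[2]), float(fields[3])
        if genetic >= last_gen and physical >= last_phys:
            kept.append(line)
            last_gen, last_phys = genetic, physical
        else:
            dropped.append(fields[1])
    return kept, dropped


# Made-up rows: snpC has a lower genetic position than snpB.
rows = [
    "1 snpA 0.10 1000",
    "1 snpB 0.30 2000",
    "1 snpC 0.20 3000",
    "1 snpD 0.40 4000",
]
kept, dropped = drop_out_of_order(rows)
print(dropped)
```

Because the pass always keeps the earlier site of a conflicting pair, one misplaced early row can cause many later rows to be dropped, so the dropped list is worth reviewing. The same IDs would also need to be removed from the VCF so that both files list the same loci.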
I'm running into a silly problem. My physical map file has repeated physical positions with unique IDs. The data is sorted by physical position and then by ID. An example is at the end of the message.
When I try to use iHS, I get the following problem: ERROR: Variant physical position must be strictly increasing.
rs201044430 216605 comes after rs112068709 216605
My data is already sorted so that 'rs201044430 216605' comes after 'rs112068709 216605'. So I'm not sure what to do differently.
Best,
Vanessa
Sample file
7 rs28527214 216426 216426
7 rs66644650 216512 216512
7 rs148463803 216515 216515
7 rs28485819 216569 216569
7 rs28498692 216570 216570
7 rs112068709 216605 216605
7 rs201044430 216605 216605
7 rs188651719 216660 216660
7 rs193275413 216662 216662
7 rs137869704 216672 216672
7 rs139968177 216735 216735