merge issues with monoallelic positions #412

Closed
dwaggott opened this issue Apr 15, 2016 · 7 comments

Comments

@dwaggott

I'm not quite sure how to deal with the following scenario.

I'm merging lots of VCFs. Some of them have REF homozygotes called with no ALT specified. When merging with -m none, positions with no ALT don't seem to get merged properly, i.e. each observed genotype at the same position ends up on its own line.

How can I avoid this so that merging creates one line per position? I guess this requires two pieces of logic: first, that mono-allelic records get merged with each other, and second, that mono-allelic records can be merged with bi-allelic ones.

If this can't be done via merging, is there a way to resolve and collapse the variants using norm? Conventional multi-allelics should still be split across multiple lines.

example output:
https://gist.github.com/655c3c579098aa41881c15c68a5aa5cd
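To make the requested behaviour concrete, here is a minimal Python sketch of the two pieces of logic described above. It is only an illustration of the desired outcome, not how bcftools merge is implemented; the tuple layout and the name `merge_records` are hypothetical.

```python
from collections import defaultdict


def merge_records(records, samples):
    """Collapse per-sample records into one line per (site, ALT).

    records: iterable of (chrom, pos, ref, alt, sample, gt) tuples, where
             alt is "." for a monoallelic (REF-only) call.
    samples: output sample order.
    Hypothetical helper, for illustration only.
    """
    by_site = defaultdict(lambda: defaultdict(dict))
    for chrom, pos, ref, alt, sample, gt in records:
        by_site[(chrom, pos, ref)][alt][sample] = gt

    merged = []
    for (chrom, pos, ref), by_alt in sorted(by_site.items()):
        # Genotypes from records that carried no ALT allele.
        mono = by_alt.pop(".", {})
        # If nobody has an ALT, emit a single monoallelic line.
        alts = sorted(by_alt) or ["."]
        for alt in alts:
            gts = dict(mono)                 # REF-only calls join each ALT line
            gts.update(by_alt.get(alt, {}))  # calls made against this ALT
            merged.append(
                (chrom, pos, ref, alt, [gts.get(s, "./.") for s in samples])
            )
    return merged
```

With this logic, two samples called 0/0 with no ALT and a third called 0/1 for A>G at the same position end up on a single A>G line, while a site with two different ALT alleles still yields two lines, as with -m none.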

@dwaggott
Author

Running bcftools norm -m + on the above file seems to just remove the mono-allelic positions. The genotypes 0/0 from these positions are not present in the output.

@pd3
Member

pd3 commented Apr 18, 2016

Indeed, this is a bug. There is a new version of bcftools merge which should handle this properly; please check http://pd3.github.io/bcftools (commit df7f451 is required for this to work). Thank you for the bug report.

@dwaggott
Author

Thanks! Are the fixes in the experimental fork being merged back here?

@dwaggott
Author

Output looks wonky. Each GT at the same position is now on its own line.

https://gist.github.com/cf8c16830b71efea264d577eaf0e03ee
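A quick way to check merged output for this symptom is to list the positions that occur on more than one line, together with their ALT alleles. The following is a minimal Python sketch that assumes uncompressed, tab-separated VCF text; `lines_per_position` is a hypothetical helper, not part of bcftools.

```python
from collections import defaultdict


def lines_per_position(vcf_lines):
    """Return {(chrom, pos): [alt, ...]} for positions with more than one record.

    Hypothetical helper: expects an iterable of uncompressed VCF text lines.
    """
    alts_by_site = defaultdict(list)
    for line in vcf_lines:
        # Skip meta-information, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        alts_by_site[(chrom, pos)].append(alt)
    return {site: alts for site, alts in alts_by_site.items() if len(alts) > 1}
```

A position reported with several "." entries points at the problem described here, whereas a position listed with distinct ALT alleles is expected when multi-allelics are kept on separate lines.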

@pd3
Member

pd3 commented Apr 20, 2016

Apologies, the bug did not show up in my tests. This commit should fix it: pd3@9746e17

RE your question about merging the fixes back into the main repository: the merge command in the experimental fork was significantly reworked about a year ago, and since then we've been waiting for other work to happen before it can be merged back. There is a long overdue bugfix release under way, and the merge is planned after that. Unfortunately, I cannot give any estimate of when this will finally happen.

@dwaggott
Author

Ah, thanks. I'll use the experimental fork for merging.

I'm still getting multiple lines for the same position.
https://gist.github.com/e7c37b9bd3326db387413ff36391c78d

@pd3
Member

pd3 commented Apr 21, 2016

The output you showed was produced by the previous commit, df7f451, not the fixed one, 9746e17.

mcshane added a commit that referenced this issue Apr 28, 2016
Major overhaul of merge to accommodate merging of gvcf files
produced by the new bcftools mpileup.

Update also closes a number of long standing issues.

Closes #412, #408, #361, #296 and possibly resolves #401

[NEWS] Major overhaul of `bcftools merge` to allow merging
       of gvcf files produced by `bcftools mpileup`
mcshane added a commit that referenced this issue Jun 20, 2016
mcshane added a commit that referenced this issue Jul 22, 2016
mcshane closed this as completed in 6d202ed Aug 5, 2016
mcshane added a commit that referenced this issue Aug 5, 2016
[NEWS]
* merge: major overhaul of `bcftools merge` to allow merging of gVCF files produced by `bcftools mpileup`
* merge: resolved a number of longstanding issues #296, #361, #401, #408, #412
* merge: new options -- `-F` to control filter logic and `-0` to set missing data to REF