Confusion about SV genotypes in merged delly results #60

Closed
LeileiCui opened this issue Nov 9, 2016 · 2 comments

Comments

@LeileiCui

Hi Tobias,

Recently, I tried your "Germline SV calling" pipeline with Delly on human NGS data:
Step 1: Use "delly call" to generate a separate BCF file for each individual;
Step 2: Use "delly merge" to generate a unified site list;
Step 3: Use "delly call" again to generate new per-individual BCF files, genotyped against the unified site list;
Step 4: Use "bcftools merge" to merge all the new per-individual BCF files together;
Step 5: Use "delly filter" to filter out bad SVs.
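
For reference, the five steps above can be sketched as a list of commands. This is only an illustration: the flags follow the Delly README as I understand it and may differ between Delly versions, and the function name, the reference name ref.fa, and the output file names are all hypothetical. The code only builds the command lines; it does not run anything.

```python
def germline_sv_commands(samples, ref="ref.fa"):
    """Build the command lines for the five-step germline workflow.

    samples: dict mapping a sample name to its BAM path (hypothetical names).
    Flags are assumed from the Delly README; check `delly call --help`
    for the version you have installed.
    """
    cmds = []
    # Step 1: per-sample SV discovery
    for name, bam in samples.items():
        cmds.append(["delly", "call", "-g", ref, "-o", f"{name}.bcf", bam])
    # Step 2: merge the per-sample calls into one unified site list
    first_pass = [f"{name}.bcf" for name in samples]
    cmds.append(["delly", "merge", "-o", "sites.bcf", *first_pass])
    # Step 3: re-genotype every sample at every site in the unified list
    for name, bam in samples.items():
        cmds.append(["delly", "call", "-g", ref, "-v", "sites.bcf",
                     "-o", f"{name}.geno.bcf", bam])
    # Step 4: combine the re-genotyped files into one multi-sample BCF
    second_pass = [f"{name}.geno.bcf" for name in samples]
    cmds.append(["bcftools", "merge", "-m", "id", "-O", "b",
                 "-o", "merged.bcf", *second_pass])
    # Step 5: apply the germline filter to the merged file
    cmds.append(["delly", "filter", "-f", "germline",
                 "-o", "germline.bcf", "merged.bcf"])
    return cmds


if __name__ == "__main__":
    for cmd in germline_sv_commands({"s1": "s1.bam", "s2": "s2.bam"}):
        print(" ".join(cmd))
```

Because Step 3 genotypes every sample against the same site list, every sample gets a record for every site in the merged file, whether or not the SV was discovered in that sample in Step 1.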

According to the VCF specification (http://samtools.github.io/hts-specs/VCFv4.2.pdf), it's easy to understand that "./." stands for "missing genotype", "0/0" stands for "homozygous reference", "0/1" stands for "heterozygous" and "1/1" stands for "homozygous alternative".

My questions are:
(1) In Step 1, if an SV is coded "./." in one individual's BCF file, does "./." mean "this SV occurred in this individual, but we just don't know its genotype" or "this SV did not occur in this individual"? (In fact, many other SV callers that don't generate SV genotypes seem to simply include or exclude an SV in the BCF file to tell us whether an individual has that SV or not.)

(2) In Step 3 + Step 4, when we merge all the per-individual BCF files based on the unified site list, many individuals will simply not carry some of the SVs, which is different from having a missing genotype. How do you code those loci? (Do you remove those SVs because some individuals don't have them, or do you code those loci as "./."?)

Looking forward to your response!

Best regards,
Leilei

@tobiasrausch
Member

The main reason to merge and re-genotype is to get accurate genotypes across the same loci in all samples. This is very powerful for filtering germline SVs and removing redundant SV calls, especially if you have several dozen or hundreds of samples. ./. means that the data is insufficient to make a confident genotype call, whereas 0/0 means the data supports homozygous reference.
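
To make the distinction concrete, here is a minimal Python sketch that sorts GT strings into the four cases discussed in this thread. The function names and the example genotypes are made up for illustration and are not part of Delly; treating a partially missing genotype such as "./1" as missing is a simplifying assumption.

```python
import re
from collections import Counter


def classify_gt(gt: str) -> str:
    """Map a diploid VCF GT string to a coarse genotype class."""
    # Alleles are separated by "/" (unphased) or "|" (phased).
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        # Not enough data for a confident call; says nothing about presence.
        return "missing"
    if all(a == "0" for a in alleles):
        # The data supports the reference allele on both copies.
        return "hom_ref"
    if len(set(alleles)) == 1:
        return "hom_alt"
    return "het"


def summarize_site(gts):
    """Count genotype classes across all samples at one SV site."""
    return Counter(classify_gt(gt) for gt in gts)


if __name__ == "__main__":
    # Hypothetical genotypes for five samples at one re-genotyped site.
    print(summarize_site(["0/0", "0/1", "./.", "1/1", "0/0"]))
```

In a re-genotyped, merged file, a sample that does not carry the SV and has enough coverage would fall into the hom_ref class, while missing is reserved for samples where the data cannot support any call.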

@tobiasrausch
Member

By the way, there is a user group that has more information on these usage questions.

https://groups.google.com/forum/#!forum/delly-users

You are, of course, welcome to join that group if you have further questions.
