Discrepancy between number of SNVs and count of signatures #24

Open
kmavrommatis opened this issue Mar 23, 2023 · 0 comments

Hi,
I am working with a VCF file containing 2950214 records.
It is the output of filtering with bcftools for high-quality SNPs, followed by normalization, i.e. multiallelic sites are split into separate records.
After running helmsman, I get output with a total of 4329064 events distributed across the various triplets.

The command line I run is:

python /opt/helmsman-1.5.3/helmsman.py --input test.filter.vcf.gz --fastafile genome.fa -w --mode vcf -m test --projectdir .

For comparison, running the equivalent process with the R package MutationalPatterns gives the expected number of events (i.e. the sum of all triplet counts equals the number of records in the input VCF).

Am I missing something here?
Thanks
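
For anyone trying to reproduce the comparison, here is a minimal sketch of the check described above: count the SNV records in the VCF and compare that with the sum of all cells in a count matrix. It uses only the Python standard library. The matrix layout (tab-delimited, one header row, first column holding a sample ID) is an assumption, not something confirmed from helmsman's documentation, and `counts.txt` is a hypothetical file name.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def count_snv_records(vcf_lines):
    """Count VCF data lines whose REF and ALT are single, different A/C/G/T bases."""
    bases = {"A", "C", "G", "T"}
    n = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record
        ref, alt = fields[3].upper(), fields[4].upper()
        if ref in bases and alt in bases and ref != alt:
            n += 1
    return n


def sum_count_matrix(matrix_lines):
    """Sum all numeric cells of a matrix.

    ASSUMED layout: tab-delimited, one header row, first column is an ID.
    """
    rows = iter(matrix_lines)
    next(rows, None)  # skip the header row
    total = 0
    for line in rows:
        cells = line.rstrip("\n").split("\t")[1:]  # drop the ID column
        total += sum(int(float(c)) for c in cells if c.strip())
    return total


# Example usage (paths are placeholders):
# with open_text("test.filter.vcf.gz") as vcf, open_text("counts.txt") as mat:
#     print(count_snv_records(vcf), sum_count_matrix(mat))
```

If the two numbers differ, also summing the matrix one row at a time would show whether the extra events are spread across several rows or concentrated in one.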
