Compatibility of paragraph and manta #42
Comments
Related, Manta calls inversions using two I renamed the question to fit both queries |
Hi there, sorry for the thread, I got two more errors. One about unresolved reported by manta (ALV values have a placeholder The second is |
Hi Kamil, You have quite a few questions and here I try to answer them in one comment. Let me know if I miss anything!
|
Cool, thanks a lot for a quick and clear response.
I had this worry before, but manually inspecting the calls showed that many of the SV calls in different individuals have exactly the same borders, suggesting that at least some of the SVs have well-resolved borders. Anyway, it will take a bit of exploring on my side now. I will let you know how it goes. |
Hello again, I am looking at this manta conversion script and I started to turn it into
I am actually not sure whether all this will succeed, so I am testing § in parallel. So, my question is: is this script of any interest to § or manta developers? If yes, I am happy to do a PR. More generally, are you interested in tuning the tools for non-model datasets? I wonder how much effort I should put into sharing my progress/findings... |
Hello again, I really hope this spamming is not too annoying. I am just actively trying to make paragraph work on my data and every time I find it relevant to you, I get back here. This time I got an error I don't actually understand
Not even sure where to start to look for the cause... |
The command:
The error:
|
Interesting, it's very rare to see an error message from the genotyper without any further details. Would it be OK to share your input VCF? |
The
Here is the vcf file: Tms_00_manta_diploidSV_corrected.vcf.zip |
|
Sorry for the delayed answer, it took me a while to get everything sorted out. I followed your advice. I adjusted the Manta script for converting BND to INV and added all the other adjustments you suggested; the script is available here. Furthermore, I added a flag to separate SVs into INV/INS/DEL/DUP, to speed up the testing. Then I ran all four SV types in parallel, but I ended up with the problem reported above #42 (comment) for three of them (INS/DUP/DEL), while I found that, although that's all that is in the log, there are some temporary files created. There are a bunch of
followed by plenty of other lines. My apologies for another long post. I am trying to provide as many details as possible. The vcf entry:
The json file:
--- edit --- I tried to dig a bit deeper. I managed to turn on the info log, to zoom in on the place where the program fails, and I get:
|
I think I figured it out. Paragraph cannot handle zipped FASTA files. |
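If the failure really does come from a compressed reference, as the comment above suggests, one workaround is to give the tool an uncompressed copy. The following is a minimal sketch assuming the reference is gzip-compressed; the function name `decompress_fasta` and the file names are hypothetical and not part of paragraph or manta.

```python
import gzip
import shutil
from pathlib import Path


def decompress_fasta(gz_path, out_path=None):
    """Write an uncompressed copy of a gzipped FASTA and return its path.

    Hypothetical helper for illustration; not part of any tool in this thread.
    """
    gz_path = Path(gz_path)
    # Default output name: drop the trailing ".gz" (ref.fa.gz -> ref.fa).
    out_path = Path(out_path) if out_path else gz_path.with_suffix("")
    # Stream the data so large references are never loaded fully into memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Calling `decompress_fasta("reference.fa.gz")` would write `reference.fa` next to the original, which can then be passed as the reference instead of the compressed file.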
Sorry for the silence. I think I have now managed to get everything working. This script takes the manta vcf output file (the diploidSV.vcf) and generates a paragraph-compatible file while reporting the filtering stats (not all SVs are "genotypable"). If you would be interested in having the script as part of this repo, let me know and I can do a PR. I also found out what the non-INS SVs with the SVINSSEQ tag are. These are variants that combine some other SV type with an insertion (for instance, when there is an insertion on one side of an inversion): It's briefly discussed here: Illumina/manta#158 I am closing the issue as I believe I finally managed to make manta and paragraph compatible. Thanks for your help, it was truly essential. |
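To see how common these combined variants are in a given call set, one can tally the records that carry SVINSSEQ by their SVTYPE. This is only a sketch that parses plain VCF text lines; the function name `count_svinsseq_by_type` is hypothetical, and this is not the conversion script mentioned above.

```python
from collections import Counter


def count_svinsseq_by_type(vcf_lines):
    """Count records carrying an SVINSSEQ INFO tag, grouped by SVTYPE."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        # INFO is the 8th tab-separated column of a VCF record.
        info_field = line.rstrip("\n").split("\t")[7]
        # Build {key: value}; flag entries without "=" get an empty value.
        info = dict(item.partition("=")[::2] for item in info_field.split(";"))
        if "SVINSSEQ" in info:
            counts[info.get("SVTYPE", "UNKNOWN")] += 1
    return counts
```

Any non-zero count for a type other than INS points at the kind of combined variant described in the comment above.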
Hello, thanks a lot for making paragraph!
I am just figuring out how it works now, so I started by taking one .vcf file generated by manta, and I used the exact same .bam file to genotype the variants I called (just to see the consistency of manta and paragraph), but I got back an error. I tried to dig a bit, but the lines of code were not very indicative of why manta has no trouble calling variants close to the scaffold edges while paragraph does. I removed all variants that started < 150 bases from the start of scaffolds and restarted genotyping, and now it seems to run.
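The filtering step described above can be sketched as follows. It works on plain VCF text lines and uses the 150-base threshold from the post; the function name `drop_variants_near_scaffold_start` is hypothetical. It only looks at the scaffold start, as in the post; filtering near scaffold ends would additionally need the contig lengths.

```python
def drop_variants_near_scaffold_start(vcf_lines, min_pos=150):
    """Yield header lines unchanged and records whose POS is >= min_pos."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep meta-information and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 2 of a VCF record is the 1-based start position (POS).
        if int(fields[1]) >= min_pos:
            yield line
```

Because it is a generator over lines, it can be applied to an open file handle and written straight to a new file without holding the whole VCF in memory.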
So, I wanted to ask: what is the reason for this? Why is it not possible to genotype SVs close to scaffold borders? And would it be worth making manta and paragraph compatible?
Thanks
background
I have a bunch of Illumina reseq data (1 reference, 5 reseq individuals) with reasonable coverage (~60x ref, ~15x reseq). I have a non-model species, i.e. one without a good library of SVs, but I still think that genotyping individuals is a far smarter idea than just merging SV calls. I am just figuring out the best way to create a library of SVs out of SV calls, which I will feed to Paragraph to get the same data genotyped on the pool of variants I found in the population.
I was previously thinking about using SURVIVOR, but the merging does not explicitly resolve the sequences of SVs (discussed here). Now I am thinking about just pasting together the .vcf files of all 6 individuals while filtering out only the exact overlaps. I am not sure what the best approach is here; any input is welcome.
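The "paste and drop exact overlaps" idea could be sketched like this. It treats two records as the same variant only when CHROM, POS, END, SVTYPE and ALT all match, which is an assumption about what "exact overlap" should mean here; the function names are hypothetical. Headers are skipped and the output is not re-sorted, so both would need handling before feeding the result to a genotyper.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flag entries map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def merge_exact_duplicates(vcf_line_iterables):
    """Concatenate records from several VCFs, keeping the first copy of
    each (CHROM, POS, END, SVTYPE, ALT) combination."""
    seen = set()
    merged = []
    for lines in vcf_line_iterables:
        for line in lines:
            if line.startswith("#"):
                continue  # headers must be merged separately
            f = line.rstrip("\n").split("\t")
            info = parse_info(f[7])
            key = (f[0], int(f[1]), info.get("END"), info.get("SVTYPE"), f[4])
            if key not in seen:
                seen.add(key)
                merged.append(line)
    return merged
```

Only the first individual's record survives for each duplicated variant, so per-sample columns from the other individuals are discarded; for building a variant library to genotype against, that may be acceptable.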