Issue with Pathogen Profiler combine_vcf_variants.py script #345
Comments
Hi @taranewman, thanks for reporting this. I'll take a look now.
@taranewman, can you give me the commands you used to download the reads and run tb-profiler? It ran through without any error on my system, but there may be some differences in our commands.
Hi @jodyphelan, thanks so much for taking a look. These are the commands:
Download the reads:
Run fastp:
Run tb-profiler:
Please let me know if you need any other info!
Thanks, running those commands still seems to work for me, so it might be the version of one of the packages?
Thanks for testing that out! Attached is the conda environment. This has bcftools v1.20. I've tried downgrading this environment to bcftools v1.12, which was the version we were using prior to the update, but had no luck there.
Good to know, thank you for your time checking this! I am running this within a Nextflow wrapper, so I'll look further on my side to see whether something in Nextflow could be causing this.
Hi again, I manually ran
to produce I then used this VCF file as input to the combine_vcf_variants.py script within a Jupyter notebook. I'm not sure why the error only occurred when running with the Nextflow wrapper, but it looks like 'AF' is missing from the variant info here: Do you think adding a line
One thing I did notice was that the VCF file produced when running tb-profiler on this SRA sample outside of Nextflow was empty:
Whereas the file produced when running the same command within Nextflow had many more SNPs:
In this issue it seems the AF value may first need to be calculated from the AN and AC values? samtools/bcftools#1060
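For anyone following along, here is a minimal sketch of the fallback described above: use AF when the record carries it, otherwise derive it as AC / AN. This is not the code in combine_vcf_variants.py. The function name `allele_frequencies` is hypothetical, and the INFO fields are assumed to have already been parsed into a plain mapping.

```python
from typing import List, Mapping, Optional


def allele_frequencies(info: Mapping[str, object]) -> Optional[List[float]]:
    """Return per-ALT allele frequencies from a parsed INFO mapping.

    Hypothetical helper (not part of pathogen-profiler). Uses AF when
    present, otherwise falls back to AC / AN. Returns None when neither
    route is possible, so the caller can decide how to handle the record.
    """

    def as_list(value):
        # INFO values may be scalars or per-allele sequences.
        if isinstance(value, (list, tuple)):
            return list(value)
        return [value]

    af = info.get("AF")
    if af is not None:
        return [float(x) for x in as_list(af)]

    ac, an = info.get("AC"), info.get("AN")
    if ac is None or an is None:
        return None

    # AN is a single number: the total count of called alleles.
    total = float(as_list(an)[0])
    if total == 0:
        return None
    return [float(count) / total for count in as_list(ac)]


print(allele_frequencies({"AC": [3], "AN": 4}))  # AF derived from AC / AN
print(allele_frequencies({"DP": 10}))            # no AF, AC or AN available
```

Returning `None` instead of raising keeps the decision about records without any frequency information with the caller, which may want to skip them or report them.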
Hi again, just double checking - are you using
Seems to work on my end. bcftools also seems to be giving an empty VCF for me, so I'll investigate that.
Thanks Jody! Yes, I am using bcftools.
Thanks, I think we're getting closer. When bcftools doesn't call any variants then the Why bcftools isn't working I'm not entirely sure yet. I noticed that your commands seem to use the forward read twice. Is this a typo?
This might explain why bcftools doesn't call any variants.
Oops, so sorry! Yes, that was a typo, thanks for noticing that! When I fix the command to use the reverse read, I get the same variants in the VCF file and the same AF error I got when running within the Nextflow wrapper. Mystery solved :)
Great, I'll release the patch this week.
Thank you Jody! Will this be a pathogen-profiler v4.1.1 patch release?
I've made a release as v4.2.0, as there were a few bigger things I changed in pathogen-profiler. This will be paired with tb-profiler v6.2.1. It should be available on conda tomorrow.
Hello,
I came across an error with the Pathogen Profiler combine_vcf_variants.py script that seems to occur in approximately half of my samples with TBProfiler v6.2.0. The same samples previously ran successfully using v4.3.0. The samples causing this error don't appear to have a clear lineage/QC pattern.
An SRA sample that produces this error is SRR10869015
If line 171 is commented out, then everything appears to run fine.
System specifications: conda, Linux HPC, SLURM