High coverage genotyping outputs only G #27
Hi Liren, can you try running NanoCaller with the `-nbr_t '0.2,0.6'` parameter? Also, is it possible for you to share an IGV plot of what this region (511-2232) looks like?
Hi Mian Umair Ahsan, thanks for your response. I have tried the new parameter that you suggested. It seems to remove the SNPs that are close neighbours, but the G genotypes are still there; see below:
##fileformat=VCFv4.2
For the IGV plot, please see the attachment.
Dear Developers,
I used the following parameters:
Hi, thank you for bringing this to our attention. We would like to test this on our end too. Is this data publicly available for us to download and test? If not, would it be possible for you to share a small sample of the data for us to evaluate?
Hi, MD5SUM:
Thank you, I will get back to you shortly after running some tests.
Hi, the error should be fixed now. There was a problem with normalizing coverage properly, and also an integer overflow caused by the very high coverage. Both issues have been fixed and you should get appropriate variant calls now. With the settings you used, I was able to get the following variants:
Thank you for bringing this to our attention. Let me know if you still have any trouble running NanoCaller.
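The thread does not say which counter overflowed or what integer width it used, so the following is only a generic sketch of the failure mode, not NanoCaller's code. The helper `wrap_signed` is a hypothetical name that simulates storing a value in a signed fixed-width integer; the 8-bit and 16-bit widths are assumptions chosen for illustration, and the depth of 2519 is taken from the VCF in this issue.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Simulate storing `value` in a signed two's-complement integer of `bits` width."""
    mask = (1 << bits) - 1
    wrapped = value & mask              # keep only the low `bits` bits
    if wrapped >= (1 << (bits - 1)):    # top bit set means a negative number
        wrapped -= (1 << bits)
    return wrapped


depth = 2519  # read depth reported in this issue

# Assumed widths, for illustration only.
print(wrap_signed(depth, 8))    # -41: the depth no longer fits and wraps around
print(wrap_signed(depth, 16))   # 2519: wide enough, value preserved
```

Any normalization that divides by a wrapped depth like this produces meaningless features, which is one plausible way a model could end up emitting the same call everywhere at very high coverage.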
Hi, Best,
Hi Mian,
I don't think the indel calling would be affected, but I will take a look at it to make sure. As for v3.0.1 versus 3.1.0, I think there may be an issue with syncing version numbers between GitHub and conda. There was only one release after v3.0.0, so 3.0.1 on conda and 3.1 on GitHub should be the same. I will try to fix the version numbers.
Dear Developer,
I am trying to use NanoCaller on a high-coverage dataset (2000x on microbial genomes). When I omit the sub-sampling step, the resulting genotypes all turn into G:
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=umi1bins>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Depth">
##FORMAT=<ID=FQ,Number=1,Type=Float,Description="Alternative Allele Frequency">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
xxx 511 . T G 37.230 PASS . GT:DP:FQ 1/1:2519:0.2588
xxx 1077 . C G 53.094 PASS . GT:DP:FQ 1/1:2519:0.2060
xxx 1078 . T G 60.799 PASS . GT:DP:FQ 1/1:2519:0.2557
xxx 1944 . C G 40.271 PASS . GT:DP:FQ 1/1:2518:0.2121
xxx 1949 . C G 31.177 PASS . GT:DP:FQ 1/1:2518:0.1898
xxx 2173 . C G 43.839 PASS . GT:DP:FQ 1/1:2517:0.5161
xxx 2232 . C G 38.336 PASS . GT:DP:FQ 1/1:2504:0.1773
Liren
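One way to see that these calls are suspect, independent of the caller, is to compare the genotype with the reported alternative allele frequency: a homozygous-alternate call (`1/1`) would normally be expected to have an `FQ` close to 1. The sketch below parses the records pasted above and flags `1/1` calls with a low `FQ`. The function name `suspicious_calls` and the 0.8 cutoff are assumptions made for this example, not part of NanoCaller.

```python
# Records copied from the VCF in this issue (whitespace-separated as pasted).
RECORDS = """\
xxx 511 . T G 37.230 PASS . GT:DP:FQ 1/1:2519:0.2588
xxx 1077 . C G 53.094 PASS . GT:DP:FQ 1/1:2519:0.2060
xxx 1078 . T G 60.799 PASS . GT:DP:FQ 1/1:2519:0.2557
xxx 1944 . C G 40.271 PASS . GT:DP:FQ 1/1:2518:0.2121
xxx 1949 . C G 31.177 PASS . GT:DP:FQ 1/1:2518:0.1898
xxx 2173 . C G 43.839 PASS . GT:DP:FQ 1/1:2517:0.5161
xxx 2232 . C G 38.336 PASS . GT:DP:FQ 1/1:2504:0.1773
"""


def suspicious_calls(vcf_text, min_hom_alt_fq=0.8):
    """Return (pos, alt, fq) for 1/1 calls whose FQ is below the cutoff.

    The 0.8 default is an arbitrary illustrative threshold.
    """
    flagged = []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()  # handles tabs as well as spaces
        # Pair the FORMAT keys (column 9) with the sample values (column 10).
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        fq = float(sample["FQ"])
        if sample["GT"] in ("1/1", "1|1") and fq < min_hom_alt_fq:
            flagged.append((int(fields[1]), fields[4], fq))
    return flagged


for pos, alt, fq in suspicious_calls(RECORDS):
    print(f"pos {pos}: ALT={alt} called 1/1 with FQ={fq}")
```

With the assumed cutoff, every record in the pasted output is flagged, and every flagged call has `G` as the alternative allele.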