50bp deletion not detected #64

Closed
HenrivdGeest opened this issue Nov 15, 2021 · 4 comments

Comments

@HenrivdGeest

I'm struggling to get Clair3 to detect a ~50bp deletion. Every other variant I can see by eye is detected, but this deletion, which is present in ~100% of my reads, seems hard for Clair3 to find.
image

I am using the v0.1-r8 release with the following parameters:
/opt/bin/run_clair3.sh --bam_fn ${NEWBASE}.bam --ref_fn ${reference} --threads=4 --platform=ont --model_path="/opt/models/ont"
--output ${NEWBASE}_clair3 --include_all_ctgs --snp_min_af=0.01 --indel_min_af=0.001
My data are ONT PCR amplicon reads at >1000x coverage, downsampled to ~300x.
I've tested a few subsamplings: in one case at 50x coverage Clair3 did find the deletion, but in another random subsample, also at 50x, it did not.
Is there something I can do about this, or a parameter I can tweak so that it finds the deletion?
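
One way to put a number on "present in ~100% of my reads" is to count how many alignments carry a long deletion in their CIGAR string (column 6 of the SAM records). The following is only a minimal sketch under stated assumptions: the function names and the example CIGAR strings are made up for illustration, and it does not check where in the read the deletion falls, so in practice the reads would first need to be restricted to the locus of interest.

```python
import re

# Matches one CIGAR operation, e.g. "50D" -> ("50", "D").
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def longest_deletion(cigar):
    """Return the length of the longest D operation in a CIGAR string (0 if none)."""
    return max((int(n) for n, op in CIGAR_OP.findall(cigar) if op == "D"), default=0)


def fraction_with_long_deletion(cigars, min_len=40):
    """Fraction of alignments containing a deletion of at least min_len bases."""
    cigars = list(cigars)
    if not cigars:
        return 0.0
    hits = sum(1 for c in cigars if longest_deletion(c) >= min_len)
    return hits / len(cigars)


# Made-up CIGAR strings, for illustration only.
example_cigars = ["200M50D300M", "150M2I48M51D300M", "500M", "10S190M49D300M"]
print(fraction_with_long_deletion(example_cigars))  # 0.75
```

The `min_len` default is set slightly below 50 because the aligner may report the same event with a length a few bases off in individual noisy reads.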

@aquaskyline
Member

Clair3 reports SNPs and indels, following the convention that indels are <50bp (variants ≥50bp are structural variants). A hardcoded threshold in Clair3 limits the length of the longest indel that is reported (https://github.com/HKU-BAL/Clair3/blob/main/clair3/CallVariants.py#L27). Raising the threshold enables Clair3 to report insertions and deletions ≥50bp. However, we have not tested Clair3's performance on indels ≥50bp; it depends heavily on the read length and on the performance of the sequence aligner.
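
To illustrate why a ~50bp deletion sits right on the boundary, here is a small sketch of the size convention described above. It is not Clair3's code: the constant name `INDEL_SV_CUTOFF` and the `classify` function are hypothetical, and the sequences are made up.

```python
# Hypothetical constant name; the value follows the convention that
# indels are <50bp and events >=50bp are structural variants.
INDEL_SV_CUTOFF = 50


def classify(ref, alt, cutoff=INDEL_SV_CUTOFF):
    """Classify a REF/ALT allele pair by the size convention above."""
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    kind = "deletion" if len(ref) > len(alt) else "insertion"
    size = abs(len(ref) - len(alt))
    if size < cutoff:
        return f"indel ({kind}, {size}bp)"
    return f"SV ({kind}, {size}bp)"


print(classify("A", "G"))             # SNP
print(classify("A" + "C" * 49, "A"))  # indel (deletion, 49bp)
print(classify("A" + "C" * 50, "A"))  # SV (deletion, 50bp)
```

Under this convention a single base decides whether the event counts as an indel or a structural variant, which may help explain why a deletion of about 50bp is called in one subsample and not in another.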

@HenrivdGeest
Author

I see, that makes sense. I will try an SV caller for this example. Thanks!

@aquaskyline
Member

No. Actually, you reminded me to question the rationale for using 50bp as the indel size cutoff in the ONT era. The 50bp cutoff was to a large extent a practical one, imposed by the short length of NGS reads. Before NGS, the cutoff was 1kbp. I will extend the limit to a larger value, perhaps somewhere between 200bp and 500bp, depending on the maximum length of a gap that can be reliably opened in an ONT read of typical length.

Raising the indel length cutoff is scheduled for v0.1-r9; stay tuned.

@aquaskyline
Member

Please try the latest version, v0.1-r9, with the option --enable_long_indel to see whether the 50bp deletion of interest can be identified.
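
After rerunning, one quick way to check whether the deletion was reported is to scan the output VCF for deletions at or above a given length. The sketch below makes several assumptions: `long_deletions` is a hypothetical helper, the records are made up, and it relies only on the standard VCF column layout (CHROM, POS, ID, REF, ALT, ...). For a gzip-compressed VCF, the lines could be read with `gzip.open(path, "rt")`.

```python
def long_deletions(vcf_lines, min_len=50):
    """Return (chrom, pos, deleted_bases) for each deletion of at least min_len bases."""
    found = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            if alt.startswith("<") or alt in (".", "*"):
                continue  # skip symbolic, missing, and spanning-deletion alleles
            size = len(ref) - len(alt)
            if size >= min_len:
                found.append((chrom, pos, size))
    return found


# Made-up records, for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "amplicon1\t120\t.\t" + "A" + "C" * 50 + "\tA\t30\tPASS\t.\tGT\t1/1",
    "amplicon1\t300\t.\tA\tG\t25\tPASS\t.\tGT\t0/1",
]
print(long_deletions(example_vcf))  # [('amplicon1', 120, 50)]
```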
