stoi: no conversion error #2

Open
bw2 opened this issue Jan 10, 2021 · 3 comments

@bw2

bw2 commented Jan 10, 2021

I got this error on one out of ~300 samples. When I ran

REViewer --reads CDS-ulp4x8.expansion_hunter4_realigned.sorted.bam --vcf CDS-ulp4x8.expansion_hunter4.vcf --reference hg38.fa --catalog ./FXN_variant_catalog_with_0.01_threshold_off_targets.json --locus FXN-chr9-69037286-69037304-GAA --output-prefix CDS-ulp4x8_FXN-chr9-69037286-69037304-GAA_ExpansionHunter4

the output was

[2021-01-10 18:14:17.325] [info] Loading specification of locus FXN-chr9-69037286-69037304-GAA
[2021-01-10 18:14:17.389] [info] Extracted 79 frags
[2021-01-10 18:14:17.389] [info] Calculating fragment length
[2021-01-10 18:14:17.389] [info] Fragment length is estimated to be 171
[2021-01-10 18:14:17.389] [info] Extracting genotype paths
[2021-01-10 18:14:17.393] [error] stoi: no conversion

for these input files:
files.zip

@egor-dolzhenko
Contributor

Thank you for catching this! The error was caused by a missing genotype, which prevents REViewer from determining haplotype sequences. I just updated the program to generate the following error message:

[error] Cannot create a plot because the genotype of FXN-chr9-69037286-69037304-GAA is missing

Do you think this is sufficient to address the issue?
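
For readers wondering where the original message comes from: `stoi: no conversion` is what C++'s `std::stoi` reports, in at least one common standard library implementation, when its input contains no digits. That is consistent with a missing-genotype placeholder being parsed as a number. The Python sketch below illustrates the same failure mode and a defensive alternative. It is not REViewer's code: the function name `parse_repeat_genotype` is hypothetical, and it assumes missing calls are written as `.` or `./.`, following the usual VCF convention.

```python
from typing import Optional, Tuple


def parse_repeat_genotype(field: str) -> Optional[Tuple[int, ...]]:
    """Parse a repeat-count genotype such as "12/45".

    Returns None when the genotype is missing. Assumption: missing calls
    are written as "." or "./.", as in the usual VCF convention.
    """
    alleles = field.strip().split("/")
    if all(allele == "." for allele in alleles):
        # Missing genotype: report it explicitly instead of letting the
        # integer conversion below fail with an opaque message.
        return None
    try:
        return tuple(int(allele) for allele in alleles)
    except ValueError as exc:
        # Anything else that is not an integer is treated as malformed.
        raise ValueError(f"malformed genotype field: {field!r}") from exc
```

Returning `None` for a missing call lets the caller decide whether to stop with a descriptive error, as in the message above, or to continue.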

@bw2
Author

bw2 commented Jan 11, 2021

Thanks, yes.
I know it's standard for SNP-calling pipelines to emit missing genotypes when there's no coverage, etc., and that isn't considered an error.
IIUC, EH also doesn't treat this as an error, or even a warning. The EHv4 log output for the above example was:

2021-01-10T21:44:06,[Starting ExpansionHunter v4.0.2]
2021-01-10T21:44:06,[Analyzing sample CDS-ulp4x8.wgs_ccle]
2021-01-10T21:44:06,[Initializing reference /gcsfuse_mounts/gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta]
2021-01-10T21:44:06,[Loading variant catalog from disk FXN_variant_catalog_with_0.01_threshold_off_targets.json]
2021-01-10T21:44:06,[Running sample analysis in seeking mode]
2021-01-10T21:44:06,[Analyzing FXN-chr9-69037286-69037304-GAA]
2021-01-10T21:44:09,[Analyzing FXN-chr9-69037286-69037304-GAA-official]
2021-01-10T21:44:09,[Analyzing FXN-chr9-69037286-69037304-GAA-with-0.01-threshold-off-targets]
2021-01-10T21:44:41,[Writing output to disk]

One thought: either both EHv4 and REViewer should treat this as normal, or both should treat it as an error (or warning?). It would make sense to me if REViewer just showed an empty plot for no-coverage situations. A warning from EH and/or REViewer might also be helpful, so that pipelines that expect to get all genotypes could catch it and treat it as an error.
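
Until such a warning exists in the tools themselves, a pipeline could run its own check on the VCF before plotting. The following is a minimal sketch under stated assumptions, not part of EH or REViewer: it relies only on the standard VCF column layout (CHROM, POS, ID, ..., FORMAT, sample), treats a `GT` made up entirely of `.` as missing, and the function names `find_missing_genotypes` and `check_genotypes` are hypothetical.

```python
import warnings
from typing import Iterable, List


def find_missing_genotypes(vcf_lines: Iterable[str]) -> List[str]:
    """Return an identifier for every record whose GT field is missing."""
    missing = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns to inspect
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        gt = sample.get("GT", ".")
        if all(allele == "." for allele in gt.replace("|", "/").split("/")):
            # Prefer the ID column; fall back to CHROM:POS when it is ".".
            ident = cols[2] if cols[2] != "." else f"{cols[0]}:{cols[1]}"
            missing.append(ident)
    return missing


def check_genotypes(vcf_lines: Iterable[str], strict: bool = False) -> List[str]:
    """Warn about missing genotypes, or raise when strict is True."""
    missing = find_missing_genotypes(vcf_lines)
    for ident in missing:
        message = f"genotype missing for {ident}"
        if strict:
            raise RuntimeError(message)
        warnings.warn(message)
    return missing
```

With `strict=False` a no-coverage locus produces a warning and the pipeline continues; with `strict=True` a pipeline that expects every genotype to be present fails early with a descriptive message.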

@egor-dolzhenko
Contributor

Thanks for the suggestion! Generating an empty plot (perhaps with a warning message) sounds good. I'll implement this.

egor-dolzhenko added the bug ("Something isn't working") label Jan 14, 2021
egor-dolzhenko self-assigned this Jan 21, 2021
egor-dolzhenko added this to To do in Improvements Jan 21, 2021
egor-dolzhenko moved this from To do to In progress in Improvements Jan 21, 2021
egor-dolzhenko added a commit that referenced this issue Jun 21, 2021