The script returns '-2147483648' in the TSV file if any value is a period ('.') in the Sample ID column of the VCF file. #2

Closed
razshaikh opened this issue Jan 11, 2022 · 9 comments

Comments

@razshaikh
Contributor

No description provided.

@razshaikh
Contributor Author

Input VCF File values

Screenshot from 2022-01-11 12-43-49

Output TSV File values

Screenshot from 2022-01-11 12-46-12

@razshaikh razshaikh changed the title from "The script returns '-2147483648' in the TSV file if any value is a period ('.') in the Format column of the VCF file." to "The script returns '-2147483648' in the TSV file if any value is a period ('.') in the Sample ID column of the VCF file." on Jan 11, 2022
@sigven
Owner

sigven commented Jan 11, 2022

Thanks @razshaikh for letting me know about this error! Could you share your VCF with me so I can test it and make sure it can be fixed? It might be related to the underlying library (cyvcf2), but I will check and see if I can fix it.

best,
Sigve

@razshaikh
Contributor Author

razshaikh commented Jan 11, 2022

Hi @sigven, thank you for your response. It is really appreciated.
Please find the sample VCF below:
test_sample_run.zip

@razshaikh
Contributor Author

Hi @sigven, do you have any suggestions on how to solve this issue?

Thanks and Regards,
Razin

@sigven
Owner

sigven commented Jan 19, 2022

Hi @razshaikh,
I am working on it, sorry for the delayed response.

@razshaikh
Contributor Author

Hi @sigven, I understand you must have other tasks on your hands. I really appreciate you looking into it. Thank you.
Best Wishes,
Razin

@sigven
Owner

sigven commented Jan 25, 2022

Hi again @razshaikh,

Been digging a bit, and it seems this matter is related to how cyvcf2 works. See this related issue, for instance.

Either way, I have now added a simple check for this in vcf2tsv (v0.3.6), which should fix the error you encountered. On another note, your test VCF was full of other strange formatting errors, so I had to change it quite a bit for a test run.
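For readers wondering where the number comes from: -2147483648 is the smallest signed 32-bit integer, which htslib-based readers appear to use as the placeholder for a missing integer value. The following is only a minimal sketch of what such a check could look like, not the actual code added in vcf2tsv v0.3.6. The names `INT32_MISSING` and `format_scalar` are hypothetical and chosen for illustration.

```python
# Hypothetical constant: the smallest signed 32-bit integer, which is the
# value reported in the TSV output when the VCF field holds '.'.
INT32_MISSING = -(2 ** 31)  # -2147483648


def format_scalar(value, missing="."):
    """Return a TSV-ready string, mapping the sentinel (or None) to `missing`."""
    if value is None or value == INT32_MISSING:
        return missing
    return str(value)


# Example: three per-sample GQ values, the second and third being missing.
row = [format_scalar(v) for v in (99, INT32_MISSING, None)]
print("\t".join(row))
```

The comparison is done on the raw value before any string conversion, so a genuine field value can never be mistaken for the placeholder after formatting.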

kind regards,
Sigve

@sigven sigven closed this as completed Jan 25, 2022
@razshaikh
Contributor Author

Hi @sigven,
Hope you are doing well.

Thank you for the update. I checked it, and it works for the 'GQ' values, but other values still have the same issue.

For example, if any value in the AD column is missing and has a period ('.') instead, the latest script still replaces it with '-2147483648'.
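Fields such as AD hold several numbers per sample, so a check written for a single value like GQ would not cover them. A minimal sketch of an element-wise version, assuming the values arrive as a sequence of integers and that the missing placeholder is the same int32 minimum (the helper name `format_vector` is hypothetical, not part of vcf2tsv):

```python
# Hypothetical constant: the value that shows up in place of '.'.
INT32_MISSING = -2147483648


def format_vector(values, missing=".", sep=","):
    """Join a multi-valued field, replacing each sentinel entry with `missing`."""
    if values is None:
        return missing
    return sep.join(missing if v == INT32_MISSING else str(v) for v in values)


# Example: an AD field where the second allele depth is missing.
print(format_vector([12, INT32_MISSING]))
```

Checking each element separately keeps the values that are present and only rewrites the missing ones.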

Let me know if you have any suggestions for it.

Thanks and regards,
Razin

@razshaikh
Contributor Author

Hi @sigven

I have opened a pull request. Let me know if it makes sense.

Best wishes,
Razin
