Problem with protein/transcript identifiers #2

Open
TDDB-limagrain opened this issue Aug 16, 2022 · 1 comment

Comments

@TDDB-limagrain

Hi Hesham,

this is not a real problem but solving it would make life easier :-D
I successfully ran VCF2PROT v0.1.4, but I had to correct the transcript identifier in the VCF file as well as in the reference FASTA file.

My bcftools-annotated VCF file has Solyc02g062560.3|Solyc02g062560.3.1 identifiers in the BCSQ fields and my protein file header is Solyc02g062560.3.1.
It seems that the "." in the sequence name causes a problem: in this case, the output .fasta file was empty. After removing the end of the sequence name (changing it to Solyc02g062560.3|Solyc02g062560 in the VCF and to Solyc02g062560 in the reference FASTA), vcf2prot finally succeeded in writing the proper corrected sequences.
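
For anyone hitting the same issue, the renaming described above could be scripted roughly as follows. This is a minimal Python sketch and not part of VCF2Prot: the helper names (`strip_version`, `rewrite_fasta_header`, `rewrite_bcsq_entry`) are hypothetical, and it assumes the transcript is the third `|`-separated field of a BCSQ entry, as in bcftools csq output.

```python
def strip_version(identifier: str) -> str:
    """Return the part of an identifier before its first dot."""
    return identifier.split(".", 1)[0]


def rewrite_fasta_header(line: str) -> str:
    """Shorten the sequence name of a FASTA header line; other lines pass through."""
    if not line.startswith(">"):
        return line
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # keep the original line ending, if any
    # The sequence name is the text between '>' and the first space.
    name, sep, description = body[1:].partition(" ")
    return ">" + strip_version(name) + sep + description + ending


def rewrite_bcsq_entry(entry: str) -> str:
    """Shorten the transcript field of a single BCSQ entry.

    Assumption: the fields are consequence|gene|transcript|..., so the
    transcript sits at index 2. Entries of the form '@position' only point
    at another record and are returned unchanged.
    """
    fields = entry.split("|")
    if entry.startswith("@") or len(fields) < 3:
        return entry
    fields[2] = strip_version(fields[2])
    return "|".join(fields)
```

Truncating at the first dot makes isoforms such as .3.1 and .3.2 collapse to the same name, so this is probably only safe when each gene has a single transcript in both files.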

Hope it will help for the future!

Best regards,

Thomas

@HeshamElAbd
Collaborator

Hi Thomas,

Indeed, this is an important feature to have. The current version depends on an exact match between the transcript name in the VCF file and in the reference FASTA file. Nevertheless, I will try my best to add a 'smarter function' that parses the transcript names in the VCF and FASTA files and then does the matching. This feature will be included in the next release of VCF2Prot (0.1.5), hopefully coming within 3-4 weeks.
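
To illustrate the idea (this is only a sketch in Python, not the VCF2Prot implementation, and all function names here are hypothetical), such a matching step could try the exact name first and fall back to a normalised key. The normalisation rule used here, keeping the text before the first dot, is an assumption based on the identifiers in this report.

```python
from typing import Dict, Iterable, Optional, Set


def normalise(name: str) -> str:
    """Reduce a transcript name to a comparison key (text before the first dot)."""
    return name.strip().split(".", 1)[0]


def build_index(fasta_names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each normalised key to its FASTA name, or to None if the key is ambiguous."""
    index: Dict[str, Optional[str]] = {}
    for name in fasta_names:
        key = normalise(name)
        # A key seen more than once cannot be resolved safely, so mark it as None.
        index[key] = None if key in index else name
    return index


def match_transcript(
    vcf_name: str,
    fasta_names: Set[str],
    index: Dict[str, Optional[str]],
) -> Optional[str]:
    """Return the FASTA name for a VCF transcript name, or None if there is no safe match."""
    if vcf_name in fasta_names:  # an exact match always wins
        return vcf_name
    return index.get(normalise(vcf_name))
```

Returning None for ambiguous keys, rather than picking one isoform, keeps a mismatch visible instead of silently writing the wrong sequence.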

Thanks a lot for your time and feedback,
All the best,
Hesham
