gget seq encounters missing gene name from uniprot and throws type error #107

Closed
rhngla opened this issue Oct 13, 2023 · 2 comments · Fixed by #109

rhngla commented Oct 13, 2023

What happened?

  • The error occurs for only a small fraction of IDs (I went through about 19,000).
  • Issue: df_uniprot['gene_name'] is NaN (a numpy.float64), but gget_seq.py expects a str.
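
The failure mode can be reproduced without calling UniProt. The sketch below is not gget's code: build_header and the "EXAMPLE_ID" value are hypothetical, and a plain float NaN stands in for the numpy.float64 NaN from the traceback (so the message names "float" rather than "numpy.float64").

```python
import math

# Stand-in for a missing df_uniprot['gene_name'] entry.
# Assumption: a plain float NaN behaves like the numpy.float64 NaN here.
missing_gene_name = math.nan


def build_header(uniprot_id, gene_name):
    """Hypothetical FASTA header builder; NOT the actual gget_seq.py code."""
    return ">" + uniprot_id + " gene_name: " + gene_name


try:
    build_header("EXAMPLE_ID", missing_gene_name)
    error_message = None
except TypeError as err:
    # str + float is not defined, so concatenation raises TypeError.
    error_message = str(err)

print(error_message)
```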

IDs where I encounter the error:

['ENSG00000275740', 'ENSG00000249624', 'ENSG00000288716', 'ENSG00000288712', 
 'ENSG00000288708', 'ENSG00000288706', 'ENSG00000288654', 'ENSG00000288646', 
 'ENSG00000288644', 'ENSG00000288634', 'ENSG00000288626', 'ENSG00000288625', 
 'ENSG00000288623', 'ENSG00000288608', 'ENSG00000288570', 'ENSG00000286224', 
 'ENSG00000286131', 'ENSG00000288629', 'ENSG00000288645', 'ENSG00000284934', 
 'ENSG00000284895', 'ENSG00000284684', 'ENSG00000285976', 'ENSG00000288715',
 'ENSG00000257046']

gget version

0.27.9

Operating System (OS)

Linux, macOS

User interface

Python

Are you using a computer with an Apple M1 chip?

Not M1

What is the exact command that was run?

import gget
gget.seq(['ENSG00000257046'], translate=True)

Which output/error did you get?

Thu Oct 12 18:16:04 2023 INFO Requesting amino acid sequence of the canonical transcript ENST00000540229 of gene ENSG00000257046 from UniProt.
Thu Oct 12 18:16:05 2023 WARNING No reviewed UniProt results were found for ID ENST00000540229. Returning all unreviewed results.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/fruity/miniconda3/envs/biopy/lib/python3.8/site-packages/gget/gget_seq.py", line 379, in seq
    ">"
TypeError: can only concatenate str (not "numpy.float64") to str
lauraluebbert added a commit that referenced this issue Oct 16, 2023
lauraluebbert (Member) commented

Hi Rohan, thank you very much for bringing this to my attention. I pushed a fix to the dev branch, which will be part of the next release. You can install gget from the dev branch as shown here:
https://colab.research.google.com/drive/1ifJeQPEzrziF0kkh7SX_Vd3NEe6QoUx3?usp=sharing


rhngla commented Oct 20, 2023

Type conversion fix works - thank you!
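
For anyone stuck on an older release, a type conversion of this kind might look roughly like the sketch below. It is only an illustration, not the change merged in #109: safe_str and the placeholder default are hypothetical names. It relies on numpy.float64 being a subclass of Python's float, so the isinstance check should also catch the NaN from the traceback.

```python
import math


def safe_str(value, placeholder=""):
    """Return value as a str, mapping None/NaN to a placeholder.

    Hypothetical helper for illustration; not part of gget.
    """
    if value is None:
        return placeholder
    # numpy.float64 subclasses float, so this also covers pandas/numpy NaN.
    if isinstance(value, float) and math.isnan(value):
        return placeholder
    return str(value)


# Concatenation now works even when the gene name is missing.
header = ">" + "EXAMPLE_ID" + " gene_name: " + safe_str(math.nan, "NA")
print(header)
```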
