Issue finding ensembl database for homo_sapiens release 112 37 #129

Closed
JackCurragh opened this issue May 20, 2024 · 3 comments

Comments

@JackCurragh

What happened?

gget.search() defaults to the first database that matches the specified organism. However, an error is raised because the database cannot be found, even though the FTP directory exists and contains the SQL file.

I have tested hg38 by specifying the db name and it works fine. Specifying homo_sapiens_core_112_37 reproduces the issue.

I suspect it is an ensembl issue but wanted to make you guys aware as I am a massive fan of the package. Perhaps a check can be added so the default selected in the case of multiple options is definitely a working db?

All the best,
Jack
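
A minimal sketch of the check suggested above, assuming the list of matching database names and the set of databases the MySQL server actually hosts are both known. `pick_default_database` is a hypothetical helper, not part of gget, and the `available` set is illustrative (in practice it might come from a `SHOW DATABASES` query against the Ensembl server).

```python
def pick_default_database(candidates, available):
    """Return the first candidate database that the server actually hosts."""
    for name in candidates:
        if name in available:
            return name
    raise LookupError(f"None of the candidate databases are available: {candidates}")


# Names taken from this issue; the server listing is assumed for illustration.
candidates = ["homo_sapiens_core_112_37", "homo_sapiens_core_112_38"]
available = {"homo_sapiens_core_112_38"}

print(pick_default_database(candidates, available))
```

With this ordering the first match is still preferred, but a name the server does not host is skipped instead of raising an error later.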

gget version

0.28.4 - run in google colab

Operating System (OS)

Other (please specify above)

User interface

Google Colab (please include a shareable link above)

Are you using a computer with an Apple M1 chip?

Not M1

What is the exact command that was run?

options = gget.search([gene_name], organism)['ensembl_id']

For context see here. Last cell



### Which output/error did you get?

```shell
WARNING:root:Species matches more than one database. Defaulting to first database: homo_sapiens_core_112_37.

All available databases can be found here:
Vertebrates: http://ftp.ensembl.org/pub/release-112/mysql/ 
Invertebrates: http://ftp.ensemblgenomes.org/pub/release-58 + kingdom + mysql/
---------------------------------------------------------------------------
MySQLInterfaceError                       Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/mysql/connector/connection_cext.py](https://localhost:8080/#) in _open_connection(self)
    245         try:
--> 246             self._cmysql.connect(**cnx_kwargs)
    247             self._cmysql.converter_str_fallback = self._converter_str_fallback

MySQLInterfaceError: Unknown database 'homo_sapiens_core_112_37'

During handling of the above exception, another exception occurred:

ProgrammingError                          Traceback (most recent call last)
12 frames
ProgrammingError: 1049 (42000): Unknown database 'homo_sapiens_core_112_37'

During handling of the above exception, another exception occurred:

MySQLInterfaceError                       Traceback (most recent call last)
MySQLInterfaceError: Access denied for user 'anonymous'@'%' to database 'homo_sapiens_core_112_37'

During handling of the above exception, another exception occurred:

ProgrammingError                          Traceback (most recent call last)
ProgrammingError: 1044 (42000): Access denied for user 'anonymous'@'%' to database 'homo_sapiens_core_112_37'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/gget/gget_search.py](https://localhost:8080/#) in search(searchwords, species, release, id_type, seqtype, andor, limit, wrap_text, json, save, verbose)
    219 
    220         except Exception as e:
--> 221             if "Access denied" in e:
    222                 raise RuntimeError(
    223                     f"""

TypeError: argument of type 'ProgrammingError' is not iterable
```

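
The final `TypeError` in the traceback comes from testing a substring against the exception object itself (`"Access denied" in e`), which masks the underlying database error. A minimal sketch of the difference, using a stand-in exception class rather than the real `mysql.connector` one; `is_access_denied` is a hypothetical helper and not gget's actual fix:

```python
class ProgrammingError(Exception):
    """Stand-in for mysql.connector's ProgrammingError, for illustration only."""


def is_access_denied(exc):
    # Check the message text: exception objects do not support `in`,
    # so the exception has to be converted to a string first.
    return "Access denied" in str(exc)


err = ProgrammingError(
    "1044 (42000): Access denied for user 'anonymous'@'%' "
    "to database 'homo_sapiens_core_112_37'"
)

try:
    "Access denied" in err  # the pattern shown at line 221 of the traceback
except TypeError as type_err:
    print(f"Membership test on the exception failed: {type_err}")

print(is_access_denied(err))
```

Converting the exception with `str()` lets the intended "Access denied" branch run, so the user would see the descriptive `RuntimeError` instead of an unrelated `TypeError`.
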
@lauraluebbert
Member

Hi Jack, thank you very much for reaching out and for using gget! I am aware of this issue, and it is indeed caused by some structural changes in the latest Ensembl release 112 (specifying release=111 or species=homo_sapiens_core_112_38 should prevent the error, as you already noted). I'll add a fix to the next gget release and am also working with Ensembl to hopefully prevent further issues in the future.
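
The workaround above can be sketched as follows, assuming the `<species>_core_<release>_<assembly>` naming pattern seen in this thread. `core_database_name` and `search_with_explicit_db` are hypothetical helpers; only the `gget.search(searchwords, species=...)` call reflects the signature shown in the traceback.

```python
def core_database_name(species, release, assembly):
    """Build a core database name such as 'homo_sapiens_core_112_38'."""
    return f"{species}_core_{release}_{assembly}"


def search_with_explicit_db(searchwords, species="homo_sapiens", release=112, assembly=38):
    """Run gget.search against an explicitly named database (requires gget)."""
    import gget  # imported lazily so the name helper works without gget installed

    return gget.search(searchwords, species=core_database_name(species, release, assembly))


print(core_database_name("homo_sapiens", 112, 38))
```

Naming the database explicitly bypasses the "defaulting to first database" step, so the unavailable `homo_sapiens_core_112_37` is never selected.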

@JackCurragh
Author

JackCurragh commented May 21, 2024

Hi Laura, just so you are aware: specifying the release also leads to an Unknown Database error, but specifying the species works perfectly.

edit: The unknown database error was occurring because of a misspelling

@lauraluebbert
Member

lauraluebbert commented May 30, 2024

This issue should be solved in gget version ≥0.28.6. Please let me know if you encounter any other problems. :)

There are still some issues when retrieving information about Drosophila and WormBase IDs (affecting gget info), but this is something Ensembl is working on fixing on their end.
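
A small sketch for checking whether an installed gget version includes the fix mentioned above, assuming purely numeric dotted version strings. `parse_version` and `has_release_112_fix` are hypothetical helpers; the installed version string could be obtained with `importlib.metadata.version("gget")`.

```python
def parse_version(version):
    """Turn a dotted version string such as '0.28.6' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_release_112_fix(installed, fixed_in="0.28.6"):
    """Return True if the installed version is at least the fixed version."""
    return parse_version(installed) >= parse_version(fixed_in)


print(has_release_112_fix("0.28.4"))  # version reported in this issue
print(has_release_112_fix("0.28.6"))  # first version stated to contain the fix
```

Comparing integer tuples avoids the string-comparison pitfall where "0.28.10" would sort before "0.28.6".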
