pip installs broken version of code #26

Open
gszep opened this issue Jul 13, 2021 · 4 comments

gszep commented Jul 13, 2021

On Windows 10 with Python 3.8.5, I've set up a MySQL database and can successfully connect to it from the command prompt:

> mysql --user=gszep  --password=station --host=localhost pyuniprot
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 78
Server version: 8.0.25 MySQL Community Server - GPL

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

I also connected it to the Python library successfully:

>pyuniprot mysql
server name/ IP address database is hosted [localhost]:
MySQL/MariaDB user [pyuniprot_user]: gszep
MySQL/MariaDB password [pyuniprot_passwd]: station
database name [pyuniprot]:
character set [utf8]:
Connection was successful

However, when I attempt to update the database with a small virus, the import fails:

>pyuniprot update --taxids 133704
WARNING: Update is very time consuming and can take several
hours depending which organisms you are importing!

bin was anderes
346934it [00:11, 30587.45it/s]
Traceback (most recent call last):
  File "c:\programdata\miniconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\programdata\miniconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\Scripts\pyuniprot.exe\__main__.py", line 7, in <module>
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\click\core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\click\core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\cli.py", line 72, in update
    database.update(taxids=taxids, connection=conn, force_download=force_download, silent=silent)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\manager\database.py", line 836, in update
    db.db_import_xml(urls, force_download, taxids, silent)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\manager\database.py", line 160, in db_import_xml
    self.import_xml(xml_gzipped_file_path, taxids, silent)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\manager\database.py", line 233, in import_xml
    self.insert_entries(entry_xml, taxids)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\manager\database.py", line 258, in insert_entries
    self.insert_entry(entry, taxids)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\manager\database.py", line 280, in insert_entry
    taxid = self.get_taxid(entry)
  File "C:\Users\gszep\AppData\Roaming\Python\Python38\site-packages\pyuniprot\manager\database.py", line 628, in get_taxid
    return int(entry.find(query).get('id'))
AttributeError: 'NoneType' object has no attribute 'get'
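
For context, the last frame fails because an ElementTree-style `find` returns `None` when nothing matches, so the chained `.get('id')` raises. Below is a minimal sketch that reproduces this with the standard library. The XML snippets, the query string, and the helper name `get_taxid_or_none` are assumptions for illustration and are not taken from pyuniprot.

```python
import xml.etree.ElementTree as ET

# Hypothetical query; the one pyuniprot actually uses may differ.
QUERY = "./organism/dbReference[@type='NCBI Taxonomy']"

# Toy entries: one with a taxonomy reference, one without.
with_taxid = ET.fromstring(
    "<entry><organism><dbReference type='NCBI Taxonomy' id='133704'/></organism></entry>"
)
without_taxid = ET.fromstring("<entry><organism/></entry>")


def get_taxid_or_none(entry):
    """Return the taxonomy id as an int, or None if the element is missing."""
    node = entry.find(QUERY)  # find() returns None when nothing matches
    if node is None:
        return None
    return int(node.get("id"))


# Chaining .get() directly onto find() reproduces the reported error.
try:
    int(without_taxid.find(QUERY).get("id"))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

A guard like this only hides the symptom; it does not explain why an element without the expected child reached `get_taxid` in the first place.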

On Ubuntu 18.04 with Python 3.6, I get the same error:

Import 173479167 lines:   0%|                         | 346934/173479167 [00:01<12:09, 237194.02it/s]

Traceback (most recent call last):
  File "/home/gszep/.local/bin/pyuniprot", line 8, in <module>
    sys.exit(main())
  File "/home/gszep/.local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/gszep/.local/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/gszep/.local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/gszep/.local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/gszep/.local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/cli.py", line 72, in update
    database.update(taxids=taxids, connection=conn, force_download=force_download, silent=silent)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/manager/database.py", line 836, in update
    db.db_import_xml(urls, force_download, taxids, silent)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/manager/database.py", line 160, in db_import_xml
    self.import_xml(xml_gzipped_file_path, taxids, silent)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/manager/database.py", line 233, in import_xml
    self.insert_entries(entry_xml, taxids)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/manager/database.py", line 258, in insert_entries
    self.insert_entry(entry, taxids)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/manager/database.py", line 280, in insert_entry
    taxid = self.get_taxid(entry)
  File "/home/gszep/.local/lib/python3.6/site-packages/pyuniprot/manager/database.py", line 628, in get_taxid
    return int(entry.find(query).get('id'))

gszep commented Jul 13, 2021

The scripts do seem to have created some tables, though:

mysql> SHOW TABLES;
+--------------------------------------+
| Tables_in_pyuniprot                  |
+--------------------------------------+
| pyuniprot_accession                  |
| pyuniprot_alternativefullname        |
| pyuniprot_alternativeshortname       |
| pyuniprot_appuser                    |
| pyuniprot_dbreference                |
| pyuniprot_disease                    |
| pyuniprot_diseasecomment             |
| pyuniprot_ecnumber                   |
| pyuniprot_entry                      |
| pyuniprot_entry__keyword             |
| pyuniprot_entry__pmid                |
| pyuniprot_entry__subcellularlocation |
| pyuniprot_entry__tissueinreference   |
| pyuniprot_feature                    |
| pyuniprot_function                   |
| pyuniprot_keyword                    |
| pyuniprot_organismhost               |
| pyuniprot_othergenename              |
| pyuniprot_pmid                       |
| pyuniprot_sequence                   |
| pyuniprot_subcellularlocation        |
| pyuniprot_tissueinreference          |
| pyuniprot_tissuespecificity          |
| pyuniprot_version                    |
+--------------------------------------+
24 rows in set (0.01 sec)

All of them have 0 rows except this one:

mysql> SELECT * FROM pyuniprot_version;
+----+---------------+--------------+--------------+-------------------+-----------------------+
| id | knowledgebase | release_name | release_date | import_start_date | import_completed_date |
+----+---------------+--------------+--------------+-------------------+-----------------------+
|  1 | Swiss-Prot    | 2021_03      | 2021-06-02   | NULL              | NULL                  |
|  2 | TrEMBL        | 2021_03      | 2021-06-02   | NULL              | NULL                  |
+----+---------------+--------------+--------------+-------------------+-----------------------+
2 rows in set (0.00 sec)

It appears that the .xml.gz file is downloaded correctly but is not imported into the SQL database.


gszep commented Jul 14, 2021

Fixed the bug. I think this line should be:

for entry in entries.iterfind("./entry"):
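
As a rough illustration of what that loop selects, here is a sketch using the standard library's `xml.etree.ElementTree`. The toy document and its non-`entry` sibling are assumptions, and pyuniprot may use a different XML library. `iterfind("./entry")` yields only direct children whose tag is `entry`, whereas plain iteration yields every child.

```python
import xml.etree.ElementTree as ET

# Toy document; the element names other than "entry" are assumptions.
root = ET.fromstring(
    "<uniprot>"
    "<entry id='a'/>"
    "<entry id='b'/>"
    "<copyright>text</copyright>"
    "</uniprot>"
)

# Iterating over the element visits every direct child, whatever its tag.
all_children = [child.tag for child in root]

# iterfind("./entry") visits only the direct children named "entry".
entries_only = [entry.get("id") for entry in root.iterfind("./entry")]
```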

pip install pyuniprot seems to install a broken version of the code. I have now installed the master branch, and it seems to work 👍🏼

pip install https://github.com/cebel/pyuniprot/archive/master.zip
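
To check which release is currently installed before and after switching, something like the following may help. It is a minimal sketch using `importlib.metadata` (Python 3.8+, so not available on the Python 3.6 setup mentioned above); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Example call (the result depends on the local environment):
# print(installed_version("pyuniprot"))
```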

gszep changed the title from "AttributeError: 'NoneType' object has no attribute 'get'" to "pip installs broken version of code" on Jul 14, 2021

psi-cmd commented Jul 29, 2021

@gszep you may want to keep the original issue title; I almost missed the solution 😂


psi-cmd commented Jul 30, 2021

BTW, the download function was changed to download and extract in the latest code. This means that if you have already downloaded the .xml.gz file to ~/.pyuniprot/data with pyuniprot from PyPI and don't want to download it again, you have to decompress it with gzip -d -c uniprot_sprot.xml.gz > uniprot_sprot.xml to avoid a "uniprot_sprot.xml not found" error.
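
On systems without the gzip command (for example, the Windows setup above), the same decompression can be done from Python. This is a minimal sketch using only the standard library; the helper name `decompress_gz` is hypothetical, and the path in the usage comment simply follows the location mentioned above.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src, dst=None):
    """Decompress a .gz file, keeping the original archive (like gzip -d -c)."""
    src = Path(src)
    # Default target: same path with the trailing ".gz" removed.
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # streams in chunks, so large files are fine
    return dst


# Example call (not run here):
# decompress_gz(Path.home() / ".pyuniprot" / "data" / "uniprot_sprot.xml.gz")
```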
