Import of Unimod schema crashes when Unimod is down #128

Closed
jonasscheid opened this issue Oct 5, 2023 · 5 comments · Fixed by #129

@jonasscheid

jonasscheid commented Oct 5, 2023

Thanks for the great tool!

I ran into this issue today:

File "/usr/local/lib/python3.10/site-packages/pyteomics/proforma.py", line 299, in database
  self._database = self.load_database()
File "/usr/local/lib/python3.10/site-packages/pyteomics/proforma.py", line 360, in load_database
  return Unimod()
File "/usr/local/lib/python3.10/site-packages/pyteomics/mass/mass.py", line 1085, in __init__
  self._tree = etree.parse(urlopen(source))
File "/usr/local/lib/python3.10/urllib/request.py", line 216, in urlopen
  return opener.open(url, data, timeout)
File "/usr/local/lib/python3.10/urllib/request.py", line 525, in open
  response = meth(req, response)
File "/usr/local/lib/python3.10/urllib/request.py", line 634, in http_response
  response = self.parent.error(
File "/usr/local/lib/python3.10/urllib/request.py", line 563, in error
  return self._call_chain(*args)
File "/usr/local/lib/python3.10/urllib/request.py", line 496, in _call_chain
  result = func(*args)
File "/usr/local/lib/python3.10/urllib/request.py", line 643, in http_error_default
  raise HTTPError(req.full_url, code, msg, hdrs, fp)

urllib.error.HTTPError: HTTP Error 404: Not Found

Since Unimod is down, this error is thrown. This impacts lots of tools built on pyteomics (deeplc, ms2pip, etc.).
Is there a way to implement a fall-back option if Unimod is not accessible?
Thanks!

@mobiusklein
Contributor

mobiusklein commented Oct 8, 2023

It was decided in #82 not to bundle a copy of Unimod with Pyteomics. Instead, libraries that depend on Unimod being available even when the Unimod web server is not can bundle a specific version with their code and use pyteomics.proforma.set_unimod_path to resolve Unimod.

Relevant piece from the documentation: https://pyteomics.readthedocs.io/en/latest/api/proforma.html#cv-disk-caching
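For downstream tools, the approach described above might look roughly like the sketch below. It is only an illustration: `use_bundled_unimod` and its `setter` parameter are hypothetical names, and passing a `file://` URL to `set_unimod_path` is an assumption based on how `pyteomics.mass.Unimod` opens its source with `urlopen`; check the linked documentation for the accepted argument types.

```python
from pathlib import Path


def use_bundled_unimod(xml_path, setter=None):
    """Point ProForma's Unimod lookups at a local unimod.xml file.

    `setter` is a hypothetical hook for testing; by default it is
    pyteomics.proforma.set_unimod_path.
    """
    path = Path(xml_path).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No bundled Unimod copy at {path}")
    if setter is None:
        # Imported lazily so this module loads even without pyteomics.
        from pyteomics.proforma import set_unimod_path as setter
    # Assumption: a file:// URL is accepted here, since the Unimod
    # loader in the traceback above opens its source with urlopen().
    return setter(path.as_uri())
```

A dependent package would call this once at import time with the path to the `unimod.xml` it ships, before any ProForma sequence with Unimod references is parsed.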

@levitsky
Owner

levitsky commented Oct 8, 2023

Thank you @mobiusklein for the recap. Indeed, I was hesitant to include a copy of Unimod and increase the distribution size 10-fold, even though it's still not a lot.

I am still open to ideas though. I realize that the current solution puts some extra load on the users and it doesn't seem ideal.
To me, an ideal solution would be to optionally install a fallback copy at installation time. However, it's not immediately clear to me how to do it cleanly in a way compatible with modern Python packaging tools.

@mobiusklein
Contributor

One option would be to depend upon the caching and fallback behavior in psims when it is available, which would be an install-time option already.

Another would be to write another caching mechanism in pyteomics whereby a backup copy is downloaded and stored the first time Unimod is used and updated periodically if the version changes by checking the version tag/digest. This could be done with around 300 lines or less, depending upon whether we aim for full XDG compatibility or just default to ~/.pyteomics_cache or something similar (or copy https://github.com/matplotlib/matplotlib/blob/main/lib/matplotlib/__init__.py#L510-L637).
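A minimal sketch of that second option, using only the standard library, could look like the following. It is not what pyteomics does: `load_with_cache` and its `download` hook are hypothetical names, the cache location is left to the caller, and the sketch simplifies "update periodically" to "re-download on every call and rewrite the backup only when the SHA-256 digest changes".

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def _download(url, timeout=30):
    """Fetch the raw bytes at `url`."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def load_with_cache(url, cache_file, download=_download):
    """Return the bytes at `url`, keeping a local backup for offline use."""
    cache_file = Path(cache_file).expanduser()
    try:
        data = download(url)
    except OSError:
        # URLError and HTTPError both subclass OSError.
        if cache_file.is_file():
            return cache_file.read_bytes()
        raise  # no backup yet, so surface the original error
    cached_digest = (
        hashlib.sha256(cache_file.read_bytes()).hexdigest()
        if cache_file.is_file()
        else None
    )
    # Rewrite the backup only when the remote content has changed.
    if hashlib.sha256(data).hexdigest() != cached_digest:
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_bytes(data)
    return data
```

The first successful call seeds the backup; a later 404 like the one in the traceback above would then be served from disk instead of raising.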

I think going the psims route is better because A) it's already there so there's no duplication of files or effort, B) it automatically uses the same caching that all the other CVs in pyteomics.proforma do, and C) managing config files and caches is a huge pain for a library.

Some applications might need to keep all their data in one place for easy deletion on uninstall, or they need to keep the reference file consistent so the program doesn't abruptly break because of a remote change.

@levitsky
Owner

levitsky commented Oct 9, 2023

@mobiusklein thank you for bearing with me. I looked at psims CV code again and I want to make sure I understand how it works.

As far as I can tell, there are two different mechanisms in there, caching and fallback. While caching needs to be enabled and configured at runtime, fallback to bundled versions is always available and psims covers Unimod fallbacks seamlessly.

If that is the case and we can add a psims Unimod resolver to the current functionality, that would be absolutely awesome. With psims installed, Unimod resolution would work even offline and without a local cache from a prior download, while ProForma parsing (with Unimod refs) would still be possible without necessarily installing psims dependencies.
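The optional-resolver idea can be sketched as a simple chain of loaders. Everything here is hypothetical: `resolve_first`, `build_loaders`, and the two loader callables are placeholder names, the actual psims call is deliberately not shown, and trying psims before the remote download is an assumption rather than something decided in this thread.

```python
import importlib.util


def resolve_first(loaders):
    """Return the result of the first (name, loader) pair that succeeds."""
    errors = []
    for name, loader in loaders:
        try:
            return loader()
        except Exception as exc:  # broad on purpose: any failure moves on
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("No Unimod source available: " + "; ".join(errors))


def build_loaders(remote_loader, psims_loader, have_psims=None):
    """Order the loaders, including psims only when it is installed."""
    if have_psims is None:
        have_psims = importlib.util.find_spec("psims") is not None
    loaders = []
    if have_psims:
        loaders.append(("psims", psims_loader))
    # The existing remote download stays available without psims.
    loaders.append(("remote", remote_loader))
    return loaders
```

Because psims is only probed with `find_spec`, it remains an optional dependency: without it the chain reduces to the current remote-only behavior.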

@jonasscheid
Author

Thank you! 🙏🏼
