Skip to content

Lazily load cli with module __getattr__ #13721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

hoodmane
Copy link

This is a reland of #12962. That PR broke accesses to spacy.cli:

import spacy
spacy.cli # NameError

This avoids such a problem by using module __getattr__.

@hoodmane
Copy link
Author

Well this does not work quite yet.

@hoodmane
Copy link
Author

Okay now it's better.

@nro-bot
Copy link

nro-bot commented Dec 23, 2024

Hi @adrianeboyd, spacy + pyodide still blocked on this and wondering if you could take a look! Thanks (and also to hoodmane)

@nro-bot
Copy link

nro-bot commented Jan 3, 2025

Would a comment like

Lazily load `cli` submodule (used by `info`) to avoid requiring packages such as `requests` to run SpaCy

be appropriate for __getattr__?

And for curiosity, is __all__ required for lazy loading or added as "this is good practice"?
Thanks!

@hoodmane
Copy link
Author

Yes. And __all__ is good practice but it's also needed to avoid dropping cli from the things imported by from mod import *

@nro-bot
Copy link

nro-bot commented Feb 13, 2025

@adrianeboyd Another ping on this (as it's been two months) :)

@honnibal
Copy link
Member

@nro-bot Sorry for the lack of attention on this. Adriane no longer works at Explosion (we're no longer venture-backed), and my time for support has been split over a lot of different things.

I'm now getting more on top of it, having just finished the support for Python 3.13.

Could you summarise a little more what the lazy loading is intending to do? Is it that you want to avoid the import of the pipeline at loading, so that you can avoid triggering stuff that doesn't work?

An issue with the lazy-loading is that spaCy heavily depends on entry-points. These end up importing everything to get decorators to run as side-effects. I've refactored this in the patch for 3.13, so we no longer rely on these import side-effects to run decorators. That said, the registration functions (which need to run to do anything much) still need to import pretty much everything. So it's not clear to me what the net impact for your use-case would be.

@hoodmane
Copy link
Author

Could you sumarise a little more what the lazy loading is intending to do? Is it that you want to avoid the import of the pipeline at loading, so that you can avoid triggering stuff that doesn't work?

Yeah that's what we're after.

it's not clear to me what the net impact for your use-case would be.

The test here works fine with this patch:
https://github.com/pyodide/pyodide/blob/144c3de286ad41575b7d7fcc10438465a1e093fe/packages/spacy/test_spacy.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants