Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate __all__ to find submodules #1845

Open
htgoebel opened this issue Feb 28, 2016 · 7 comments
Open

Evaluate __all__ to find submodules #1845

htgoebel opened this issue Feb 28, 2016 · 7 comments
Labels
area:modulegraph feature Feature request kind:to be discussed Please add our opionon and rational @low

Comments

@htgoebel
Copy link
Member

Issue #1834 shows an interesting case: The package Crypto.PublicKey uses __all__ to refer to the submodules, but does not contain any import statement. Thus for finding the imports, evaluation __all__ would be sufficient and may save some hooks.

We could extend modulegraph as follows:

  • If there is no import statement, check __all__.
  • From this list, remove all known globalnames
  • Assume the remaining to be submodules to be imported.

Deficits:

  • Modulegraph already collects global module attributes, but does not yet store values for them. We would need to extend this mechanisms (for storing constant lists, tuples and sets).
  • This most propably would not work for sqlalchemy which uses imports like from ..dialects.sqlite import base as sqlite plus lists these in __all__.

Is this feature worth the effort for implementing it?

@htgoebel htgoebel added feature Feature request @low area:modulegraph kind:to be discussed Please add our opionon and rational labels Feb 28, 2016
@bjones1
Copy link
Contributor

bjones1 commented Mar 15, 2016

I think this is a good idea: it's standard Python, so supporting it might cover a few more cases that otherwise would require a hook.

@leycec
Copy link
Contributor

leycec commented Apr 29, 2016

Seconded. It's not simply a good idea; it's a great idea. Three delicious green apples up! 🍏 🍏 🍏

But shouldn't this only (but always) be done for package-level star imports? My ideal algorithm might resemble:

  1. Ignore module-level star imports, as we currently do.
  2. For each package-level star import from a package foo into a module bar, set a list starimports of the names of all submodules imported from foo into bar as follows:
    • If foo.__init__.__all__ exists, set starimports = foo.__init__.__all__. (This is only pseudocode.)
    • Else, set starimports to the list of all public (i.e., not prefixed by "_") submodules of foo empty list. This is what we currently do, so... great! Laziness prevails.
  3. Remove all known globalnames of foo.__init__ from starimports.
  4. Import each remaining element of starimports as a submodule of foo into bar.

Also, we probably don't want to unconditionally store the values of all globalnames – just the values of all __all__ globalnames for all __init__.py modules. PyInstaller is already a bit memory consumptive. My bushy eyebrows shudder at the space costs of storing everything.

Also, I'm unconvinced that globalnames are behaving as intended at the moment. Our _scan_code() implementation contains a PyInstaller-specific hack that makes my stomach churn butter:

    def _scan_code(self, m, co, co_ast=None):
        ...

        # Actually import the modules collected while scanning.
        # We need to suspend the globalnames as otherwise
        # `_safe_import_hook()` would take identfiers imported via
        # `from . import abc` as existing global names and not try to
        # import them. Please note: This only effects this form of
        # import where relative_module is "." For all other cases, the
        # imported module is already processed and all global names
        # are in place anyway.
        globalnames = m.globalnames
        m.globalnames = set()
        self._process_imports(m)
        m.globalnames = globalnames

This hack appears to be responsible for at least one related issue. Why? Because _safe_import_hook() is called (under certain reproducible edge cases) in a manner in which globalnames is not suspended and the passed import is thus ignored entirely. This is bad.

But, yeah... I'm surprised we don't already process __all__ lists. It's a fantabulous idea.

@leycec
Copy link
Contributor

leycec commented May 10, 2016

O.K. If we could defer this until after I submit my in-process pull request for #1919, that'd be awesome. I've refactored ModuleGraph a wee bit, which should (in theory) simplify the implementation of this feature. :shipit:

@Legorooj
Copy link
Member

Has this been addressed, or is it closable?

@tourist-C
Copy link

Hi , I am facing similar issue when packaging tqdm.

tqdm_init_.py

__all__ = ['tqdm', 'tqdm_gui', 'trange', 'tgrange', 'tqdm_pandas',
           'tqdm_notebook', 'tnrange', 'main', 'TMonitor',
           'TqdmTypeError', 'TqdmKeyError',
           'TqdmWarning', 'TqdmDeprecationWarning',
           'TqdmExperimentalWarning',
           'TqdmMonitorWarning', 'TqdmSynchronisationWarning',
           '__version__']

@Legorooj
Copy link
Member

@tourist-C please open a new issue for this.

@pyinstaller pyinstaller locked and limited conversation to collaborators Apr 22, 2020
@Legorooj
Copy link
Member

@htgoebel I'll look into implementing this when we switch over to modulegraph2.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area:modulegraph feature Feature request kind:to be discussed Please add our opionon and rational @low
Projects
None yet
Development

No branches or pull requests

5 participants