Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mypy doesn't recognize intra-package imports when qualified by __all__ #10826

Open
stinodego opened this issue Jul 15, 2021 · 3 comments
Open
Labels
bug mypy got something wrong

Comments

@stinodego
Copy link

stinodego commented Jul 15, 2021

Bug Report

I have a package and am trying to control which classes/functions are available at the 'top level' of the package. For this purpose, I am using the __all__ dunder and star imports in the __init__.py files. I get unexpected behaviour from mypy when doing this: some of my objects are "not defined", when they in fact are.

To Reproduce

This bug has to do with imports between files, so to reproduce this, we need a bit of a package file structure. Consider the package gym:

gym
  sup/
    __init__.py
    bro.py
  __init__.py
  cool.py

File contents would be:

# sup/bro.py - This is the object we want to be available at the top level
__all__ = ["Bro"]

class Bro:
    pass
# sup/__init__.py - This brings the Bro object to the sup module level
from . import bro
from .bro import *

__all__ = bro.__all__
# __init__.py - This brings the sup module, including the Bro object, to the top module level
from .sup import *
# cool.py - This file uses the Bro object in a function
import gym

__all__ = ["story"]

def story() -> gym.Bro:
    return gym.Bro()

Now, when I run mypy gym/ I get the following errors:

gym/cool.py:6: error: Name "gym.Bro" is not defined
gym/cool.py:7: error: Module has no attribute "Bro"
Found 2 errors in 1 file (checked 4 source files)

However, when I remove the __all__ dunder from the sup/__init__.py file, everything works as expected, and I get no errors! Unfortunately this has the unwanted side-effect that the submodule gym.sup.bro becomes available as gym.bro, which I do not want.

Somehow that __all__ dunder stops mypy from recognizing that the Bro object should arrive at the module level above it. I can also work around the error by using the full path to the Bro object, gym.sup.bro.Bro, or even gym.sup.Bro, but not gym.Bro. This unfortunately gets very verbose in larger packages.

What's going on here?

Your Environment

  • Mypy version used: 0.910
  • Mypy command-line flags: None
  • Mypy configuration options from mypy.ini (and other config files): None
  • Python version used: 3.9.5
  • Operating system and version: WSL Ubuntu 20
@flisboac
Copy link

flisboac commented Mar 7, 2022

This also happens when your "import" module starts with underline _ (which is somewhat standard to denote said module is not a "public package"). Problem occurs when you want to export components from your internal modules into the public package.

Suppose the following project structure:

test.py
my_lib/
│    __init__.py
└─── func1/
    │    __init__.py
    └─── _concepts/
        │    __init__.py
        └─── some/
                 some_concept.py
                 __init__.py

And the following file contents:

File "my_lib/func1/_concepts/some/some_concept.py": has a class and associated definitions

Many of such module files exist, grouped by a sub-package (in this case, some) of an internal package (in this case, _concepts, in func1).

from abc import ABCMeta
from typing import TypeVar


TMyConcept = TypeVar('TMyConcept', bound='MyConcept')

TMyConcept_cov = TypeVar('TMyConcept_cov', bound='MyConcept', covariant=True)


class MyConcept(metaclass=ABCMeta):
    """Class representing some concept."""
File "my_lib/func1/_concepts/some/__init__.py": exports all elements of internal module "_concepts/some"
from .some_concept import TMyConcept, MyConcept, TMyConcept_cov


__all__ = [
    'TMyConcept',
    'MyConcept',
    'TMyConcept_cov',
]
File "my_lib/func1/_concepts/__init__.py": exports all public elements of internal module "_concepts" (uses star import)
from .some import *
from .some import __all__ as _some_all


__all__ = [*_some_all]
File "my_lib/func1/__init__.py": exports all public elements of external module "func1"; not supposed to be a namespace package

Note that __all__ is declared in terms of other inner modules, to avoid repetition. Star imports are used for the same reason.

from ._concepts import *
from ._concepts import __all__ as _concepts_all


__all__ = [*_concepts_all]
File "my_lib/__init__.py": a namespace (and public) package
# AFAICT, from the PEPs, I think this file can simply be deleted and it would work,
# but in my codebase that's how it is now, and I'm trying to replicate the bug, so...

__path__ = __import__('pkgutil').extend_path(__path__, __name__)

Note that __all__ in some of those modules are created in terms of other inner modules, to avoid repetition. Star imports are used for the same reason.

Now, looking at the my_lib/func1/_concepts/some/__init__.data.json files in mypy_cache, I can see that all the star-imported elements from my_lib.func1._concepts.some.some_concept are marked as "module_public": false, even though they're exported via __all__ -- albeit through some sort of indirection. This causes mypy to sometimes flag a deeply re-exported class import as "not defined", when it's accessed through the public package.

My guess is that mypy understands __all__, but only considers literal values. Such levels of indirection don't seem to be detected. Therefore, things like __all__ = some_other_lib.__all__ or __all__ = [*some_ther_lib.__all__] won't work, because mypy would need to import or execute code to get the depended-upon values, which goes a bit outside the realm of static analysis as modules in python are somewhat executable. And because imports are private by default unless explicitly exported with __all__, those symbols don't become visible outside of the module. The equivalent technique to a star-import for __all__ don't seem to be available (at static-analysis time).

But mypy could perhaps identify when __all__ is effectively final, ie. iff its definition is non-conditional, AND consists of string literals or splat'd references to other such sequences. In doing so, it could prevent us from having to do a myriad of imports and/or re-exports, either in __init__.py or amongst all class-specific source files, in case the codebase is big enough. It would also eliminate a lot of repetition. For example:

# Class names are fictitious

# ---
# In module "my_mod._internal1
from typing import Final
from .class1_1 import Class1_1
from .class1_2 import Class1_2
# ...
# Here, __all__:
# 1. is definitely final,
# 2. is non-conditionally defined, and
# 3. has a clearly immutable value (a tuple)
__all__: Final = ("Class1_1", "Class1_2", ...)

#---
# In module "my_mod._internal2
from typing import Final
from .class1_1 import Class1_1
from .class1_2 import Class1_2
# ...
# Same here
__all__: Final = ("Class2_1", "Class2_2", ...)

#---
# in module "my_mod"
from typing import Final
import .internal1
import .internal2
# ...
# Because __all__ from other modules are final, and non-conditionally
# defined, I think it's safe to assume the contents of __all__ as
# immutable and statically inferrable as well.
__all__: Final = (*internal1.__all__, *internal2.__all__)

Anyways, I'm not sure if that's already the case, please correct me if I'm wrong.

@erictraut
Copy link

erictraut commented Mar 8, 2022

The dunder all mechanism presents a challenge for static type checking in general. I've seen some "creative" (i.e. unnecessarily convoluted) dynamic manipulation of __all__ in some libraries. We discussed this in a typing-sig thread a couple of years ago, and I proposed that we standardize the list of supported idioms for defining and manipulating __all__ such that all Python type checkers, linters, and language servers would work consistently. This would also help library authors know which idioms to use and which to avoid.

The proposed list is documented here: https://github.com/python/typing/blob/master/docs/source/libraries.rst#library-interface. Pyright implements all of these idioms (and does not implement any others) and emits a diagnostic if it detects the use of an unsupported manipulation of __all__. I think mypy implements most (perhaps all?) of the idioms on the proposed list.

Note that Final should not be required, but your sample above does use forms that are not in the proposed "supported idioms" list. In particular, the form that uses an unpack operator is unsupported. If we were to stick to my proposed list, you'd need to switch to:

__all__ = ["Class2_1", "Class2_2"]
__all__ += internal1.__all__
__all__ += internal2.__all__

I'm open to changing (augmenting, etc.) my proposed list if there's good justification for doing so and it helps us get consensus among Python tool authors and library maintainers. Standardization in this area will benefit everyone.

I apologize if I went a bit off topic from the original bug report, but I thought this information might be relevant.

@flisboac
Copy link

flisboac commented Mar 8, 2022

@erictraut your input was invaluable, and I learned a lot from the proposed list you linked! Thank you very much!

The idiom I proposed was written out of ignorance in regards to how __all__ in __init__.py is identified by linters. I just tried to propose an idiom that would make it easier for linters to identify that __all__ will not change, neither locally nor outside the module it's defined in. I had no idea that there was a proposal for this use case, and I'm sorry for that.

In my case, specifically, could just put everything at once at variable definition time, like in the examples, without any need for extends, += or other list manipulation functions, because the list is somewhat "fixed" (even though some values come from other lists), and therefore it can be immutable (tuple) at the time it's assigned. I have no idea if that's a common idiom, though (probably not; otherwise it would be in the proposal).

In fact, looking into the proposal, for some of the codebases I work with, I realise that I can omit __all__ altogether (and it works in mypy); unfortunately, that's not always possible, and it seems like mypy does not implement the whole proposal (or at least, in my brief tests, it does not seem to implement; when __all__ is provided, no import idioms seem to result in public exports, neither does __all__ += ..., etc).

So, in a way, your comment unblocked me for some use cases, and I thank you for that. :)

But it seems like pyright (or pylance + some other selected linter? I'm using VSCode) identifies public imports in __init__.py as unused imports when __all__ is omitted. That's one of the reasons I started assigning __all__ explicitly (ie. to ensure such imports are "used").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug mypy got something wrong
Projects
None yet
Development

No branches or pull requests

3 participants