Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'import my_pkg.__init__' creates duplicate modules #59143

Closed
rlamy mannequin opened this issue May 28, 2012 · 18 comments
Closed

'import my_pkg.__init__' creates duplicate modules #59143

rlamy mannequin opened this issue May 28, 2012 · 18 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@rlamy
Copy link
Mannequin

rlamy mannequin commented May 28, 2012

BPO 14938
Nosy @brettcannon, @asmeurer, @rlamy
Files
  • my_pkg.tar.bz2
  • 1.patch
  • 2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2012-06-16.00:01:40.896>
    created_at = <Date 2012-05-28.18:52:19.510>
    labels = ['interpreter-core', 'type-bug']
    title = "'import my_pkg.__init__' creates duplicate modules"
    updated_at = <Date 2012-06-16.00:01:40.896>
    user = 'https://github.com/rlamy'

    bugs.python.org fields:

    activity = <Date 2012-06-16.00:01:40.896>
    actor = 'brett.cannon'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2012-06-16.00:01:40.896>
    closer = 'brett.cannon'
    components = ['Interpreter Core']
    creation = <Date 2012-05-28.18:52:19.510>
    creator = 'Ronan.Lamy'
    dependencies = []
    files = ['25749', '25789', '25792']
    hgrepos = []
    issue_num = 14938
    keywords = ['patch']
    message_count = 18.0
    messages = ['161799', '161809', '161825', '161826', '161889', '161914', '161958', '161964', '162007', '162016', '162068', '162075', '162083', '162103', '162106', '162152', '162240', '162942']
    nosy_count = 5.0
    nosy_names = ['brett.cannon', 'Arfrever', 'Aaron.Meurer', 'python-dev', 'Ronan.Lamy']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14938'
    versions = ['Python 3.3']

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented May 28, 2012

    If an __init__.py file contains relative imports, doing 'import my_pkg.__init__' or calling __import__('my_pkg.__init__') creates duplicate versions of the relatively imported modules, which (I believe) causes cryptic errors in some cases (cf. the metaclass issue in http://code.google.com/p/sympy/issues/detail?id=3272 ).

    More precisely, with my_pkg/init.py containing (see attachment for the full setup):

    from .module1 import a
    from my_pkg.module2 import b

    I get:

    Python 3.3.0a3+ (default:0685f51e9891, May 27 2012, 02:22:12) 
    [GCC 4.6.1] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import my_pkg.__init__
    >>> import sys
    >>> sorted(name for name in sys.modules if name.startswith('my_'))
    ['my_pkg', 'my_pkg.__init__', 'my_pkg.__init__.module1', 'my_pkg.module1', 'my_pkg.module2']
    >>> 

    Note the bogus 'my_pkg.__init__.module1' entry. For reference, in Python 3.2, the last line is:

    ['my_pkg', 'my_pkg.__init__', 'my_pkg.module1', 'my_pkg.module2']

    NB: calling __import__('my_pkg.__init__') might seem odd (I didn't actually expect it to work on any Python version), but doctest apparently relies on it to test __init__.py files.

    @rlamy rlamy mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels May 28, 2012
    @brettcannon
    Copy link
    Member

    You can't have it both ways. If you explicitly import __init__ then it becomes just another module to Python, but you still need the implicit package module (i.e. without the __init__ name) for everything else to work since so much of the import system relies on the name of the module itself. If one chooses to work around the import system by doing stuff that is non-standard you will get quirky results like this.

    Closing as "won't fix" as it isn't worth the code complication to support such an edge case.

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented May 29, 2012

    my_pkg.__init__ isn't treated as just another module but as a package. That is the reason why the bogus my_pkg.__init__.module1 is created, I guess:

    >>> import sys
    >>> sorted(name for name in sys.modules if name.startswith('my_'))
    []
    >>> import my_pkg.__init__
    >>> sorted(name for name in sys.modules if name.startswith('my_'))
    ['my_pkg', 'my_pkg.__init__', 'my_pkg.__init__.module1', 'my_pkg.module1', 'my_pkg.module2']
    >>> sys.modules['my_pkg.module1'].__package__
    'my_pkg'
    >>> sys.modules['my_pkg.__init__'].__package__
    'my_pkg.__init__'

    I agree that importing __init__ is a hack, but the way 3.3 reacts to it is nasty, because it can cause a whole application to be executed multiple times.

    @rlamy rlamy mannequin reopened this May 29, 2012
    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented May 29, 2012

    Grmf. I didn't mean to change the status.

    @rlamy rlamy mannequin closed this as completed May 29, 2012
    @brettcannon
    Copy link
    Member

    If you can come up with a patch to fix this in a way that is not messy then I will be happy to look at it.

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented May 29, 2012

    Well, I can try. I have trouble wrapping my head around all the finders and loaders, but I'm slowly understanding what does what.

    What I don't know is what the expected behaviour is. Should my_pkg.__init__ be allowed to exist as a separate module (that's backwards-compatible but seems wrong)? Can it exist as a synonym of my_pkg? Should the import attempt raise an ImportError?

    @brettcannon
    Copy link
    Member

    If you directly import __init__ then it would just be a module within the package (the "magic" of packages should stay with the implicit interpretation of __init__).

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented May 30, 2012

    Reverting to the previous behaviour, then? OK.

    As I understand it, the issue comes from a DRY violation: both FileFinder.find_loader() and _LoaderBasics.is_package() have their own notion of what is a package and they disagree. Since the finder needs to know what is a package, I think that the loader should be told whether it's a package or not instead of trying to guess.

    @brettcannon
    Copy link
    Member

    I see what you mean about the discrepancy, but you don't need to complicate the constructor to get the desired result. If you have is_package() check if the module name ends in __init__ to skip the package check and just say False (e.g. only if the path ends in __init__ but the module name does not) then you will get your desired semantics out of is_package (since this is what find_loader() is doing).

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented May 31, 2012

    That would force the Loaders to know how to convert a module name into a file path, which isn't the case now since FileLoader.get_filename() is just a shim that returns self.path. So I'd rather add an optional argument to FileLoader. Actually, I feel that the clean solution would be to have packages be a separate Loader class, but that would be a significant change...

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented Jun 1, 2012

    Here's a preliminary patch, without tests or documentation, implementing my approach (adding an optional is_package=False to FileLoader.__init__()).

    Next question (quite independent of the chosen implementation): how should this be tested?

    @brettcannon
    Copy link
    Member

    I think you misunderstood what I was suggesting as a solution: there is no conversion of module name to file name. It would simply be::

    fullname.rpartition('.')[2] == os.path.splitext(os.path.split(self.path)[1])[0]

    for path comparison. But honestly since init is the only special case here, is_package() could just do its current check and fullname.rpartition('.')[2] != '__init__'.

    And a problem with your patch is that it is only for FileLoader, which precludes SourceLoader from getting the beneficial fix for the issue and thus leaves out all other loaders which might use an alternative storage mechanism than files (e.g. zip files).

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented Jun 1, 2012

    Your second solution is simple and solves the problem, so I guess that's it, then.

    Still, I don't really understand the purpose of SourceLoader, which makes me uneasy about the implementation of _LoaderBasics.is_package(). It seems to be meant to load an arbitrary resource yet it assumes that get_filename returns an actual file path, so I don't see how it could be used to import from a URL, for instance.

    @brettcannon
    Copy link
    Member

    I have yet to see anyone use a URL loader in serious code beyond people just using it as an example. What I do see is lots of custom zipfile importers and thus for the majority if people this will continue to make sense. And even with a URL loader, get_filename would return a URL who probably has something like http://example.com/pkg/__init__.py which would still work as expected. If someone chooses to break from expectations they can just overload is_package().

    @brettcannon brettcannon reopened this Jun 1, 2012
    @brettcannon brettcannon self-assigned this Jun 1, 2012
    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented Jun 1, 2012

    OK, that makes sense pragmatically.

    Here's the patch then. I wasn't sure where to put the test, it doesn't actually have much in common with the rest of Lib/importlib/test/import_/test_packages.py but the name fits, so...

    @brettcannon
    Copy link
    Member

    Thanks for the patch, Ronan! The fix seems fine and I will have a more thorough look at the test later and figure out where it should go (probably only going to worry about testing is_package() directly since that was the semantic disconnect). I will also update the docs when I commit the patch.

    And as a matter of procedure, Ronan, can you submit a contributor agreement? http://python.org/psf/contrib/

    @rlamy
    Copy link
    Mannequin Author

    rlamy mannequin commented Jun 3, 2012

    I'm not sure that it's enough to test is_package() because that only involves the loader and not the interaction between it and FileFinder. That's the reason why my test works at a higher level.

    BTW, I sent the contributor agreement.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 16, 2012

    New changeset 240b7467e65c by Brett Cannon in branch 'default':
    Issue bpo-14938: importlib.abc.SourceLoader.is_package() now takes the
    http://hg.python.org/cpython/rev/240b7467e65c

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant