Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meta path finders and path entry finders are different, but share an ABC #59707

Closed
ncoghlan opened this issue Jul 30, 2012 · 64 comments
Closed
Assignees
Labels
release-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@ncoghlan
Copy link
Contributor

BPO 15502
Nosy @warsaw, @brettcannon, @birkenfeld, @pjeby, @ncoghlan, @ericvsmith, @ericsnowcurrently
Dependencies
  • bpo-15295: Import machinery documentation
  • Files
  • issue15502_file_path_entry_finder_of_doom.diff
  • issue15502_new_abc.diff: new ABCs (with doc and test changes)
  • issue15502_windows_registry_finder.diff: just a rename; for further fussing with this class see issue 15519
  • issue15502_invalidate_caches.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2012-08-10.16:21:52.434>
    created_at = <Date 2012-07-30.13:39:41.581>
    labels = ['type-bug', 'library', 'release-blocker']
    title = 'Meta path finders and path entry finders are different, but share an ABC'
    updated_at = <Date 2012-08-10.21:41:31.702>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2012-08-10.21:41:31.702>
    actor = 'python-dev'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2012-08-10.16:21:52.434>
    closer = 'brett.cannon'
    components = ['Library (Lib)']
    creation = <Date 2012-07-30.13:39:41.581>
    creator = 'ncoghlan'
    dependencies = ['15295']
    files = ['26657', '26658', '26659', '26671']
    hgrepos = []
    issue_num = 15502
    keywords = ['patch']
    message_count = 64.0
    messages = ['166897', '166919', '166959', '166960', '166961', '167039', '167061', '167065', '167073', '167075', '167079', '167084', '167085', '167086', '167087', '167088', '167090', '167134', '167138', '167147', '167151', '167171', '167174', '167176', '167177', '167185', '167186', '167187', '167188', '167190', '167192', '167203', '167204', '167205', '167209', '167212', '167214', '167216', '167218', '167219', '167225', '167227', '167235', '167238', '167244', '167245', '167253', '167258', '167259', '167277', '167281', '167286', '167289', '167291', '167525', '167875', '167876', '167877', '167878', '167881', '167887', '167888', '167897', '167918']
    nosy_count = 9.0
    nosy_names = ['barry', 'brett.cannon', 'georg.brandl', 'pje', 'ncoghlan', 'eric.smith', 'Arfrever', 'python-dev', 'eric.snow']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue15502'
    versions = ['Python 3.3']

    @ncoghlan
    Copy link
    Contributor Author

    In the review of bpo-15295, Brett noted that the APIs of meta path finders and path entry finders have now diverged substantially thanks to PEP-420, thus it's rather dubious that they continue to share a single Finder ABC in importlib.

    @ncoghlan ncoghlan added release-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jul 30, 2012
    @brettcannon
    Copy link
    Member

    Making this a dependency on the import machinery docs issues as the terminology settled on in those docs to differentiate between finders involving sys.meta_path and sys.path will help dictate what to name the new ABCs.

    @ericsnowcurrently
    Copy link
    Member

    Would importlib.abc.Finder become something like MetapathHandler, or would we just add a new one (like PathEntryHandler)? Obviously the actual names would depend on the outcome of bpo-15295, but the fate of importlib.abc.Finder is independent of the names.

    @ncoghlan
    Copy link
    Contributor Author

    As we want to eventually deprecate find_module in the PathEntryHandler API, I'd suggest two new subclasses, MetaPathFinder and PathEntryHandler, both inheriting from Finder.

    Current path hooks would be updated to inherit from PathEntryHandler, meta path hooks from MetaPathFinder.

    @ericsnowcurrently
    Copy link
    Member

    Yeah, but right now the API of importlib.abc.Finder is strictly find_module(name, path=None). I would expect that to remain the same for MetaPathFinder (bikeshedding aside :). So what would be left in importlib.abc.Finder if the ultimate plan is that PathEntryHandler would not implement find_module()? Would it simply be an alias to MetaPathFinder?

    I agree with the plan, but wonder if a common base class is warranted. The connection to "finder" in PEP-302 is nice, but rather superfluous in a practical context. Having a Finder class is necessary at this point for backward-compatibility, but I don't see how it is different from the more appropriately named MetaPathFinder.

    @brettcannon
    Copy link
    Member

    So I think Nick has the inheritance inverted and I would have Finder inherit from everything to stay backwards-compatible and to ease future deprecation (which could potentially happen starting in Python 3.3), although my approach weakens the usefulness of the ABCs to the point that Nick is probably right in the end. =)

    Ideally you would have MetaPathFinder which has find_module(fullname, path). Nnotice there is no default value for path since meta path finders always have something passed in for 'path', even if it's None.

    The other ABC would be PathImporter which defines find_loader(fullname). Notice how 'path' is no longer defined as it isn't used by PathFinder when calling a path importer.

    importlib.abc.Finder would then inherit from both MetaPathFinder and PathImporter. This is because if someone wants to be compatible with both Python 3.2 and 3.3 without any trickery, they will want to simply inherit from Finder for either a meta path finder or a path importer. Otherwise in Python 3.2 you would have to do some conditional work or define a fake e.g. MetaPathFinder ABC in order to inherit from the new ABCs.

    Finder would have find_module(fullname, path=None) defined but returning None by default. It would also inherit from PathImporter, but this is where it gets tricky since importlib decides whether to call find_loader() based on whether find_loader() is defined, which means PathImporter couldn't require it to instantiated. Otherwise some hack will be needed in Finder for it to claim that find_loader() is defined but while still failing a hasattr() check (which I can't think of how to do; maybe some insane metaclass which temporarily sets find_loader() so abstractmethod doesn't complain but then isn't defined in the end so __getattr__ still raises an AttributeError?).

    IOW I don't see an ideal situation here and we will probably want to do what Nick suggested.

    @ncoghlan
    Copy link
    Contributor Author

    I would largely use Brett's proposed interface for MetaPathFinder (but keep
    the path argument optional) and his PathImporter interface for
    PathEntryHandler.

    Finder can then just become a legacy alias for MetaPathFinder.

    @ericsnowcurrently
    Copy link
    Member

    I'm working up a patch. My intention is to match up the names to those that Barry has pushed out to the updated glossary: MetaPathFinder and PathEntryFinder. I'll also include doc changes.

    I'm going to toy with not having any inheritance between the 3, but I'm not going to bother if it's not simple. Brett made some good points that I'll have to cover. Otherwise I'll likely go with Nick's recommendation.

    I'm in favor of deprecating Finder sooner rather than later. Also, I plan on registering importlib.machinery.PathFinder against MetaPathFinder and importlib.machinery.FileFinder against PathEntryFinder.

    One last thing: with Barry's use of "path importer" in the docs, I'd like to change the name of importlib.machinery.PathFinder to PathImporter (and leave in a deprecated legacy alias like with importlib.abc.Finder). This would line up with the docs and would also match the naming of the other two default importers. If Brett has strong feelings on this I'll let it go.

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    (With the draft docs now comitted, this issue is the home for thrashing out the terminology. I'm not going to propose taking this to import-sig or python-dev, because the bikeshed would end up being fuschia or pink-with-yellow-polkadots or something)

    However this turns out, the names in the glossary and the ABCs in importlib should match for 3.3b2. I wouldn't bother with programmatic deprecation here (it's a PITA), just note them as deprecated aliases in the docs.

    However, I also think we need to take into account the naming precedents established (or endorsed) by importlib and think carefully about whether which ones we want to change.

    1. importlib currently uses "importer" to mean an object that is both a finder *and* a loader (hence BuiltinImporter, FrozenImporter as meta path hooks). This is the original meaning of the term as defined in PEP-302

    2. importlib uses the "PathFinder" term for what Barry has proposed to call the "path importer". If we go with this change in terminology, then we should change the general term "meta path finders" to "meta path importers", as that would make the usage of the term "importer" consistent across the three default meta path hooks. There would no longer be a special term to denote an importer that returned itself as the module loader. However, I'm now -1 on making this change: I'd prefer to preserve the distinction that an importer is a finder that returns itself as the loader, which means the path finder doesn't qualify.

    3. I still find it confusing to have two different kinds of "finder". If the meta path finders keep their name (as I now prefer), then it is the path entry finders that need to move. While "handler" is a ridiculously generic term, the chain "path finder"->"path hook"->"path entry handler"->"module loader" strikes me as being easier to follow than "path finder"->"path hook"->"path entry finder"->"module loader". The only name change required for this adjustment in terminology would be that importlib.FileFinder would become importlib.DirectoryHandler

    4. There would be a slight tweak to the definition of "importer" to mean objects that are either a meta path finder and module loader (i.e. BuiltinImporter, FrozenImporter) or else a path entry handler and module loader (i.e. zipimport.zipimporter)

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    Further thoughts:

    1. Let's keep Finder as a generic base class for "things which find module loaders", with its current implementation but an updated docstring and documentation.

    2. The reason both the MetaPathFinder and PathEntryHander ABCs need to preserve the Finder API for find_module is for backwards compatibility with third party reimplementations of the PEP-302 protocol.

    3. importlib changes:

    • add MetaPathFinder and PathEntryHandler
    • inherit from MetaPathFinder or PathEntryHandler as appropriate (instead of directly from Finder)
    • rename FileFinder to DirectoryHandler (keeping the former as a legacy alias)
    1. Longer term, we should deprecate find_module() completely, and add a meta-path appropriate find_loader(fullname, path) definition that allows multiple meta path finders to contribute to a namespace package. Finder would effectively become an empty categorisation ABC. This plan should be elaborated in a PEP for 3.4

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    Reading through What's New, I withdraw the suggestion of changing the "path entry finder" term. It's just not worth the hassle.

    That also simplifies things a lot. All that needs to change to make things consistent:

    1. Switch the docs over to using "path finder" instead of "path importer"
    2. Add the MetaPathFinder and PathEntryFinder ABCs (with only the latter have find_loader() defined)
    3. Adjust the inheritance in importlib to use the appropriate parent rather than inheriting from Finder directly

    @ericsnowcurrently
    Copy link
    Member

    Here's a patch that addresses Nick's 2 & 3. I think I found a satisfactory solution for the inheritance that is both correct and backward compatible.

    One question for now: does FileFinder need to keep its find_module() method? That should be resolved before 3.3 is released to avoid backward compatibility issues.

    @ericsnowcurrently
    Copy link
    Member

    I just noticed that WindowsRegistryImporter does not have load_module(). If we are going for consistency, we should change the name to WindowsRegistryFinder.

    @ericsnowcurrently
    Copy link
    Member

    With the draft docs now comitted, this issue is the home for thrashing
    out the terminology.

    In my mind the key is consistency. While a worthy goal regardless, for the import system and its attendant complexity, consistency is crucial.

    ("importer" == finder + loader) sounds fine with me. Even if some of these terms have been ambiguous in the past, I think that a consistent usage in the current docs will leave the ambiguity in the past. I will advocate Barry's recommendation that we put a warning in PEP-302 that the terms there may be outdated.

    I'm find with changing "path importer" to "path finder". If no one does a patch first, I'll put one up tomorrow.

    I actually prefer "path entry handler" for the distinctiveness, but I'm fine with leaving it be. However, I get a sense that this is our one shot at this decision.

    I'm also okay with renaming FileFinder to DirectoryHandler, particularly since it is a new class to 3.3 (no backward-compatibility issues). Naturally this would parallel any decision on "path entry handler".

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    Reading the 3.3 What's New reminded me that we aren't really as free to redefine this terminology as we might hope (or as I claimed on the previous issue). We can still try, of course, but the PEP-302 naming scheme has been around for 10 years, and there are plenty of guides and other things that assume that the things on the meta path and things returned by path hooks are both finders. That's actually reasonable, since the key role of both is to find a module loader given a module name. They differ in API details, but they fundamentally do the same thing. Better to go with the flow, especially given how close we are to release.

    As far as the implementation goes:

    +1 for renaming WindowsRegistryImporter to WindowsRegistryFinder

    -1 on breaking any find_module implementations. Backwards compatibility requirements still apply to the importlib API - while the default import system won't call FileFinder.find_module() any more, third party import reimplementations are still free to do so.

    @ericsnowcurrently
    Copy link
    Member

    Backwards compatibility requirements still apply to the importlib API -
    while the default import system won't call FileFinder.find_module() any
    more, third party import reimplementations are still free to do so.

    Except that a good portion of the importlib API is new in 3.3, including FileFinder. I've removed the changes to the 3 default meta path finders don't belong in the patch.

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    Fair point. However, the find_module() variant of the API isn't dead yet, so we should still support it. find_module() likely won't die completely until 4.0.

    @warsaw
    Copy link
    Member

    warsaw commented Aug 1, 2012

    I've mostly run out of time to work on the docs, but I do want to say that I
    thought long and hard about all the terminology decisions. Please don't
    change them lightly, and definitely don't change them until you've tried to go
    through the documentation and make all the updates and read through them to
    see how well it flows and makes sense. It is *not* easy.

    On Aug 01, 2012, at 05:13 AM, Eric Snow wrote:

    ("importer" == finder + loader) sounds fine with me.

    That's how I tried to think about it. In some sense, it's just an
    implementation detail, but where it comes into the play for the documentation
    is the "path importer", which is the thing on sys.meta_path that implements
    all the sys.path/sys.path_hooks/sys.path_importer_cache<sigh> semantics.

    While technically a finder, calling it the "path finder" just does not work,
    IMHO. It's too confusing and causes the documentation to be less
    comprehensible. So the one compromise I made was to call this thing the "path
    importer" because I think conceptually, if not strictly implementation-wise,
    that term ties everything together.

    A diagram would seal the deal I think.

    I'm find with changing "path importer" to "path finder". If no one does a
    patch first, I'll put one up tomorrow.

    Please don't, for reasons stated above!

    I actually prefer "path entry handler" for the distinctiveness, but I'm fine
    with leaving it be. However, I get a sense that this is our one shot at this
    decision.

    I don't think "handler" is a very good term. These things *are* finders, even
    if the API they support is richer and slightly different than meta path
    finders. Also, I think "path entry handler" could almost be just as useful
    for the things on sys.path_hooks, although I do use "path entry hook" here.
    But in a sense the callables on sys.path_hooks also "handle" path entries,
    just in a different way. I think "path entry finder" neatly describes both
    its similarities and differences from other objects in the picture.

    I'm going to leave all the abc names and implementations to you guys. :)

    Thanks for all your excellent feedback and your own work on this nasty little
    corner of Python. :)

    @brettcannon
    Copy link
    Member

    So just to weigh in on this, I think "meta path finder" and "path entry finder" (as Barry stated in the glossary already) clearly delineate what is from sys.meta_path and what is from some path entry hook by having a clarifying word to go with "finder" (either kind of finder is still trying to find a loader). Finder/loader/importer has history in terms of usage so we can't muck with these terms too much.

    This does mean, though, that PathFinder and FileFinder are misnamed. Unfortunately PathFinder existed in previous versions of Python, so it will need an alias to its old name from a new name (PathImporter as Barry wants or MetaPathEntryFinder?). But FileFinder is new, so that can be renamed to FilePathEntryFinder (or FileEntryFinder). And of course WindowsRegistryImporter should be changed to *Finder.

    I think if we harp on the "meta path" vs. "path entry" and use it consistently and extensively then the finder/loader/importer terminology does not need to change meaning and we don't have to misuse it for path importer to classify PathFinder while keeping the rest of Barry's terms intact which make it obvious when something is directly because of how import works vs. just a single meta path finder that happens to be what nearly everyone ends up using to handle their imports.

    @ericsnowcurrently
    Copy link
    Member

    @barry: I'm fine with what you've said. My big concern is that we be really consistent here because of the complexity of the import system. Things are at a point that I think we're pretty close in the regard, so I'm mostly fine if we stay put. My only real regret would likely be "path importer" vs. PathFinder, but I can live with it. :)

    @brett: I'll update my patch for WindowsRegistryImporter and FileFinder.

    @anyone: do the doc-based deprecations in the patch look okay? The inheritance model?

    @brettcannon
    Copy link
    Member

    The patch will have Finder break all subclasses that don't define find_loader which is backwards-incompatible. And if you really want to deprecate Finder you should have a warning in an __init__ method (as one would hope people are calling super() as necessary). Otherwise the patch looks fine (from a cursory glance).

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    As a final attempt at giving the path import machinery the emphasis it
    deserves without misusing the "importer" term, how about:

    1. "path import subsystem" for that whole section of the import machinery;
      and
    2. "path import finder" specifically for the meta path finder that
      importlib calls "PathFinder" and the new docs currently call "path importer"

    "path_importer_cache" would still be a bit of a misnomer, but that ship
    sailed a long time ago.

    @warsaw
    Copy link
    Member

    warsaw commented Aug 1, 2012

    On Aug 01, 2012, at 10:03 PM, Nick Coghlan wrote:

    1. "path import subsystem" for that whole section of the import machinery;
      and
    2. "path import finder" specifically for the meta path finder that
      importlib calls "PathFinder" and the new docs currently call "path importer"

    -0. To me, this doesn't make things vastly more clear for the additional
    awkwardness in readability.

    "path_importer_cache" would still be a bit of a misnomer, but that ship
    sailed a long time ago.

    Yeah, we all agree that one's unfortunate! ;)

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 1, 2012

    The problem with "path importer" is it's just plain *wrong*. That object is
    not an importer and thus calling it one makes it much harder to learn the
    finder/loader/importer distinctions correctly. "import finder" is only
    slightly longer than "importer" and has the huge advantage of *not being
    wrong*.

    @warsaw
    Copy link
    Member

    warsaw commented Aug 1, 2012

    On Aug 01, 2012, at 10:44 PM, Nick Coghlan wrote:

    The problem with "path importer" is it's just plain *wrong*. That object is
    not an importer and thus calling it one makes it much harder to learn the
    finder/loader/importer distinctions correctly.

    The term "importer" is sufficiently squishy though, and I think we could just
    as easily adjust the term's definition in the glossary to match what this
    thing is.

    "import finder" is only slightly longer than "importer" and has the huge
    advantage of *not being wrong*.

    The problem is that "import finder" isn't correct either, because this thing
    doesn't load modules, and it doesn't find imports. It finds loaders through
    different kinds of finders.

    It is unquestionably a finder, so probably the most accurate term would be
    "path finder" but that just didn't read well when I tried to use it. Part of
    the problem is that "path finder" isn't different enough from "path entry
    finder" to avoid confusion, and I really like the latter to describe the
    things that the sys.path_hooks callables return, and I'm sure none of us want
    to call it the "path entry loader finder finder" :).

    Keep trying, but so far "path importer" is the least crappy of the lot, IMHO.

    @ericsnowcurrently
    Copy link
    Member

    The patch will have Finder break all subclasses that don't define
    find_loader which is backwards-incompatible.

    How so? Finder is registered against PathEntryFinder. It doesn't inherit, so it doesn't get the abc.ABCMeta as a metaclass and thereby there is no instantiation-time checks for find_loader().

    Attached is a patch that makes the two names changes (FileFinder was all over the place).

    @ericsnowcurrently
    Copy link
    Member

    Regarding deprecating importlib.abc.Finder, I'll add a DeprecationWarning to the patch later.

    @warsaw
    Copy link
    Member

    warsaw commented Aug 2, 2012

    What about "Path Scanner"? Came to me in a dream, so it has to be perfect,
    right? :)

    @ericsnowcurrently
    Copy link
    Member

    Regarding the ABCs, I'm okay with it as long as invalidate_caches() is meant to be a part of meta path finders. I had considered this while writing the patch, so how I had it wasn't a fluke. The fluke was that I didn't bring it up for discussion like I had meant to do.

    Anyway, my rationale was that importlib.invalidate_caches() is specific to path entry finders. My patch reflected this. If invalidates_caches() were meaningful for meta path finders then we could address that later without losing backward compatibility.

    As to pulling find_module() off Finder, I felt that having a distinct signature, while semantically unimportant, made a clearer distinction between the old and the new.

    @warsaw
    Copy link
    Member

    warsaw commented Aug 2, 2012

    On Aug 02, 2012, at 11:33 AM, Nick Coghlan wrote:

    Specifically, what I did was to drop "find_module" from the Finder ABC, but
    keep the ABC itself as a way to document the common "invalidate_caches"
    API. The ABC definition no longer cares whether you implement find_module()
    or not. MetaPathFinder then enforces find_module(), while PathEntryFinder
    enforces find_loader().

    One problem I have, which I'm not sure how to solve, is that the protocol
    defines a couple of optional methods, specifically find_loader() on meta path
    finders, and module_repr() on loaders.

    You'd like for the ABC to be able to capture both the support of these methods
    if they exist, and their optional nature, but I don't believe that's possible
    with @abc.abstractmethod. Correct me if I'm wrong, but by adding these
    methods to the ABC, it requires subclasses to define them. I noticed this
    when I broke the test suite, and solved it by just adding the methods to the
    mocks and other test classes. But I think this is not the best solution.

    Maybe we need an @abc.optionalabstractmethod decorator? ;)

    @ericsnowcurrently
    Copy link
    Member

    I wish "path subsystem finder" weren't so long. Then PathFinder would probably still be appropriate for the class name (as FileFinder is appropriate for FilePathEntryFinder :).

    @ericsnowcurrently
    Copy link
    Member

    Maybe we need an @abc.optionalabstractmethod decorator?

    Perhaps you are kidding, but I've had a similar thought on a number of occasions. For kicks, I'll float this by python-ideas. :)

    @brettcannon
    Copy link
    Member

    In terms of invalidate_caches(), it could easily be changed to walk sys.meta_path and to call the method on the meta path finders. Not sure if that generalization is reasonable or not, though.

    @ericsnowcurrently
    Copy link
    Member

    Not to detract from the fun <wink> terminology discussion, but I have lingering concerns with the ABC inheritance model here.

    Looking over changeset 184700df5b6a, I'm still fine with the change if invalidate_caches() is appropriate for both. If not or if we don't know yet, I think we should hold off on moving invalidate_caches() up into Finder. If it is okay then we should add meta path finder traversal to importlib.invalidate_caches(). Either way there's a (not substantial) loose end.

    @brettcannon
    Copy link
    Member

    I guess the question is how extensive do we want this cache validation to go? Do we think that only path entry finders stored in sys.path_importer_cache will have stuff to be cleared, or do we think stuff on sys.meta_path or sys.path_hooks will have things that need to get cleared as well? The more I think about it, the more I want to generalize this to the point of sys.meta_path and then having PathFinder handle sys.path_hooks and sys.path_importer_cache. That stays in line with import not caring about sys.path, etc. and it really just being some cruft left over from Python 2.

    Regardless, though, I would not define invalidate_caches() on Finder and instead define it separately on PathEntryFinder and MetaPathFinder. It's new in 3.3 so it doesn't need to be on Finder for backwards-compatibility, and it isn't inherent to finders. Plus if Finder goes away we will have to push it up to the other ABCs anyway.

    @ericvsmith
    Copy link
    Member

    I think the cache invalidation should be moved to the sys.meta_path level.

    @ericsnowcurrently
    Copy link
    Member

    Good points, Brett. While I still favor my posted patch, wouldn't Nick's commit still be okay? Or would it be confusing to people to have this "Finder" class that no one is actually supposed to use?

    @brettcannon
    Copy link
    Member

    Nick's current patch is fine, although I would add back in invalidate_caches() into MetaPathFinder and PathEntryFinder where they do nothing but return NotImplemented as a form of no-op. I know you have a python-ideas discussion going about optional abstractmethod, but until that's settled I would rather have something there to document in code what is supported as part of the API.

    @brettcannon
    Copy link
    Member

    As for Finder containing nothing significant, that's a side-effect of the cleanup and why Finder should get deprecated, especially since the contract that a Finder implements find_module() is no longer being enforced. Tying MetaPathFinder and PathEntryFinder together because they are conceptually finders is just not useful from a programming perspective since their APIs have no overlap except for invalidate_caches(), which if generalized, is just happenstance and not by explicit design.

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Aug 2, 2012

    Per Nick's request that I redirect this here instead of bpo-15295... (w/already-fixed things deleted):

    Hope I'm not too late to the bikeshed painting party; just wanted to chip in with the suggestion of "self-contained package" for non-namespace packages. (i.e., a self-contained package is one that cannot be split across different sys.path entries due to its use of an __init__ module).

    Also, technically, namespace portions do not only contribute subpackages; they can contribute modules as well. (The trunk doc says they contribute subpackages ATM.)

    Another point of possible confusion: meta finders and path finders are both described as "hooks", but this terminology seems in conflict with the use of "hook" as describing a callable on path_hooks. Perhaps we could drop the term "hook" from this section, and retitle it "Import System Extensions" and say you can extend the system by writing meta finders and path entry finders. This would let the term "hook" be the exclusive property of path_hooks, which is how you extend the import system to use your custom finders.

    With respect to the current discussion about "importer", "handler", "finder", etc., I'd like to propose a slight tweak that I think can clear up the remaining confusing points:

    1. Replace "meta path finder" with "import handler"
    2. Replace "path finder" with "sys.path import handler"

    Rationale:

    Despite the similarity in interface between the two types of finders, they perform different sorts of functions. A object in sys.meta_path can (and MUST) implement a broad module-finding policy, whereas a path entry finder is much more limited in scope - it only gets to override later path entry finders, and can't, for example, decide to use some other sort of __path__ attribute.

    (I'm not married to the term "handler" here, either - I just think "importer" is too generic, "finder" conflates with the more-commonly used kind of finder. "Policy" could be another possibility. Both imply a broader scope of responsibility than "finder", and don't impinge on the special meaning of "importer".)

    With a bigger distinction drawn between these two types of finders, we can say that you...

    """Extend the import system by adding "import handlers" to sys.meta_path, and that one of the default handlers is the "sys.path import handler", which processes imports using sys.path and module __path__ attributes.

    The sys.path import handler can of course in turn be extended by adding "path hooks" to sys.path_hooks, which are then used to create "module finder" objects for the path entry strings found in sys.path and module __path__ attributes. A path hook must return a finder object, which implements similar methods to those of an import handler, but with some important differences.
    """

    Of course, if you make that "sys.path import policy" instead of "sys.path import handler", it still works.

    Thoughts?

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 3, 2012

    On the "self-contained" packages point, I was going to suggest that name, but the ability to set __path__ from __init__.py messes with it. "Regular packages" isn't great, but I'm willing to tolerate it (mainly because it was the terminology used in the approved PEP-420). The reason I still prefer "initialized" package though, is because the real difference between the two is the presence or absence of __init__.py. That means arbitrary code execution can occur as a side effect of import for initialized packages, in contrast to namespace packages which are comparatively "pure" (that is, no user provided code is executed as a side effect of importing them).

    +1 for renaming the current "hooks" section to "extensions" (with only the callables on path_hooks being referred to as hooks).

    On the topic of "import handlers" vs "meta path finders", the reason I'm happy with keeping the common suffix for "meta path finders" and "path entry finders" is that while there are substantial differences in *how* they do what they do, they are fundamentally the same in terms of *what* they do: given a module name, find a loader for it (or indicate that you can't find one). Meta path finders search more broadly than path entry finders do, but they're still doing much the same job.

    I've come to the conclusion that PEP-302 actually did a reasonable job with the finder vs loader vs importer classification scheme, and it's not worth trying to change that terminology too much after it's already been in the wild for 10+ years. Clarify and refine by introducing additional distinctions, yes, but change, no.

    @ericsnowcurrently
    Copy link
    Member

    The more I think about it the more I think there needs to be a harder separation between the import system proper and the path-based subsystem. I hadn't really thought of it this way until the discussion of the last few days, but it has really clicked in my mind.

    The new "import system" in the language reference already does a pretty good job of this. I'll try to get a patch up in the next few days that shows what I mean.

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Aug 3, 2012

    See my comment above about explicitly defining the "path import subsystem" and renaming the "path importer" to be the "path import finder". I think the subsystem is well worth calling out explicitly, as when most people think of Python's import system, they're far more likely to be thinking about sys.path and filesystem imports in general than they are the meta path machinery.

    @ericsnowcurrently
    Copy link
    Member

    This patch implements most of Brett's recommendation. I've held off on actually deprecating Finder just yet. However, it would probably entail a message in the docs like this:

    The API signatures for meta path finders and path entry finders
    were separated by PEP-420. This class is still available for
    compatibility with legacy third party reimplementations of the import
    system. Either :class:`MetaPathFinder` or :class:`PathEntryFinder`
    should be used instead.

    @brettcannon
    Copy link
    Member

    Eric's latest patch LGTM except for the missing patch for test.test_importlib.test_abc to fix the inheritance checks.

    Nick, Eric Smith, or Barry: can you double-check the patch is good? About the only thing that might be questionable is having MetaPathFinder and PathEntryFinder not inherit from Finder itself, but that does make sense and we do have multiple inheritance for a reason in Python.

    @brettcannon
    Copy link
    Member

    I still need a core committer to double-check me on Eric's patch to make sure the ABC changes are acceptable. I can handle the commit if someone will just sign off.

    @ncoghlan
    Copy link
    Contributor Author

    Done. Only substantive comment was that Finder should be registered with both of the more specific metaclasses, but there were a couple of docs comments as well.

    @ncoghlan
    Copy link
    Contributor Author

    Oh, and +1 for Eric's "path-based finder". States clearly that it is a finder that *uses* import paths, rather than finding them.

    @brettcannon
    Copy link
    Member

    Thanks for the review, Nick! I will get this patch committed sometime today with your comments integrated.

    @brettcannon
    Copy link
    Member

    Actually there is one issue with PathEntryFinder inheriting from Finder; it doesn't need to implement find_module() if it doesn't want to in Python 3.3. So I will need to stick in a dummy find_module() on the class that calls find_loader() and then returns any found loader and tosses the namespace paths.

    @ncoghlan
    Copy link
    Contributor Author

    Sounds good. Perhaps steal the impl from FileFinder.find_module (i.e. the one with the ImportWarning).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 10, 2012

    New changeset 0a75ce232f56 by Brett Cannon in branch 'default':
    Issue bpo-15502: Finish bringing importlib.abc in line with the current
    http://hg.python.org/cpython/rev/0a75ce232f56

    @brettcannon brettcannon self-assigned this Aug 10, 2012
    @brettcannon
    Copy link
    Member

    I will go back and steal the FileFinder.find_module implementation after I finish the current patch I'm working on.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 10, 2012

    New changeset e7a67f1bf604 by Brett Cannon in branch 'default':
    Issue bpo-15502: Refactor some code.
    http://hg.python.org/cpython/rev/e7a67f1bf604

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    release-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants