New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Meta path finders and path entry finders are different, but share an ABC #59707
Comments
Making this a dependency on the import machinery docs issues as the terminology settled on in those docs to differentiate between finders involving sys.meta_path and sys.path will help dictate what to name the new ABCs. |
Would importlib.abc.Finder become something like MetapathHandler, or would we just add a new one (like PathEntryHandler)? Obviously the actual names would depend on the outcome of bpo-15295, but the fate of importlib.abc.Finder is independent of the names. |
As we want to eventually deprecate find_module in the PathEntryHandler API, I'd suggest two new subclasses, MetaPathFinder and PathEntryHandler, both inheriting from Finder. Current path hooks would be updated to inherit from PathEntryHandler, meta path hooks from MetaPathFinder. |
Yeah, but right now the API of importlib.abc.Finder is strictly find_module(name, path=None). I would expect that to remain the same for MetaPathFinder (bikeshedding aside :). So what would be left in importlib.abc.Finder if the ultimate plan is that PathEntryHandler would not implement find_module()? Would it simply be an alias to MetaPathFinder? I agree with the plan, but wonder if a common base class is warranted. The connection to "finder" in PEP-302 is nice, but rather superfluous in a practical context. Having a Finder class is necessary at this point for backward-compatibility, but I don't see how it is different from the more appropriately named MetaPathFinder. |
So I think Nick has the inheritance inverted and I would have Finder inherit from everything to stay backwards-compatible and to ease future deprecation (which could potentially happen starting in Python 3.3), although my approach weakens the usefulness of the ABCs to the point that Nick is probably right in the end. =) Ideally you would have MetaPathFinder which has find_module(fullname, path). Nnotice there is no default value for path since meta path finders always have something passed in for 'path', even if it's None. The other ABC would be PathImporter which defines find_loader(fullname). Notice how 'path' is no longer defined as it isn't used by PathFinder when calling a path importer. importlib.abc.Finder would then inherit from both MetaPathFinder and PathImporter. This is because if someone wants to be compatible with both Python 3.2 and 3.3 without any trickery, they will want to simply inherit from Finder for either a meta path finder or a path importer. Otherwise in Python 3.2 you would have to do some conditional work or define a fake e.g. MetaPathFinder ABC in order to inherit from the new ABCs. Finder would have find_module(fullname, path=None) defined but returning None by default. It would also inherit from PathImporter, but this is where it gets tricky since importlib decides whether to call find_loader() based on whether find_loader() is defined, which means PathImporter couldn't require it to instantiated. Otherwise some hack will be needed in Finder for it to claim that find_loader() is defined but while still failing a hasattr() check (which I can't think of how to do; maybe some insane metaclass which temporarily sets find_loader() so abstractmethod doesn't complain but then isn't defined in the end so __getattr__ still raises an AttributeError?). IOW I don't see an ideal situation here and we will probably want to do what Nick suggested. |
I would largely use Brett's proposed interface for MetaPathFinder (but keep Finder can then just become a legacy alias for MetaPathFinder. |
I'm working up a patch. My intention is to match up the names to those that Barry has pushed out to the updated glossary: MetaPathFinder and PathEntryFinder. I'll also include doc changes. I'm going to toy with not having any inheritance between the 3, but I'm not going to bother if it's not simple. Brett made some good points that I'll have to cover. Otherwise I'll likely go with Nick's recommendation. I'm in favor of deprecating Finder sooner rather than later. Also, I plan on registering importlib.machinery.PathFinder against MetaPathFinder and importlib.machinery.FileFinder against PathEntryFinder. One last thing: with Barry's use of "path importer" in the docs, I'd like to change the name of importlib.machinery.PathFinder to PathImporter (and leave in a deprecated legacy alias like with importlib.abc.Finder). This would line up with the docs and would also match the naming of the other two default importers. If Brett has strong feelings on this I'll let it go. |
(With the draft docs now comitted, this issue is the home for thrashing out the terminology. I'm not going to propose taking this to import-sig or python-dev, because the bikeshed would end up being fuschia or pink-with-yellow-polkadots or something) However this turns out, the names in the glossary and the ABCs in importlib should match for 3.3b2. I wouldn't bother with programmatic deprecation here (it's a PITA), just note them as deprecated aliases in the docs. However, I also think we need to take into account the naming precedents established (or endorsed) by importlib and think carefully about whether which ones we want to change.
|
Further thoughts:
|
Reading through What's New, I withdraw the suggestion of changing the "path entry finder" term. It's just not worth the hassle. That also simplifies things a lot. All that needs to change to make things consistent:
|
Here's a patch that addresses Nick's 2 & 3. I think I found a satisfactory solution for the inheritance that is both correct and backward compatible. One question for now: does FileFinder need to keep its find_module() method? That should be resolved before 3.3 is released to avoid backward compatibility issues. |
I just noticed that WindowsRegistryImporter does not have load_module(). If we are going for consistency, we should change the name to WindowsRegistryFinder. |
In my mind the key is consistency. While a worthy goal regardless, for the import system and its attendant complexity, consistency is crucial. ("importer" == finder + loader) sounds fine with me. Even if some of these terms have been ambiguous in the past, I think that a consistent usage in the current docs will leave the ambiguity in the past. I will advocate Barry's recommendation that we put a warning in PEP-302 that the terms there may be outdated. I'm find with changing "path importer" to "path finder". If no one does a patch first, I'll put one up tomorrow. I actually prefer "path entry handler" for the distinctiveness, but I'm fine with leaving it be. However, I get a sense that this is our one shot at this decision. I'm also okay with renaming FileFinder to DirectoryHandler, particularly since it is a new class to 3.3 (no backward-compatibility issues). Naturally this would parallel any decision on "path entry handler". |
Reading the 3.3 What's New reminded me that we aren't really as free to redefine this terminology as we might hope (or as I claimed on the previous issue). We can still try, of course, but the PEP-302 naming scheme has been around for 10 years, and there are plenty of guides and other things that assume that the things on the meta path and things returned by path hooks are both finders. That's actually reasonable, since the key role of both is to find a module loader given a module name. They differ in API details, but they fundamentally do the same thing. Better to go with the flow, especially given how close we are to release. As far as the implementation goes: +1 for renaming WindowsRegistryImporter to WindowsRegistryFinder -1 on breaking any find_module implementations. Backwards compatibility requirements still apply to the importlib API - while the default import system won't call FileFinder.find_module() any more, third party import reimplementations are still free to do so. |
Except that a good portion of the importlib API is new in 3.3, including FileFinder. I've removed the changes to the 3 default meta path finders don't belong in the patch. |
Fair point. However, the find_module() variant of the API isn't dead yet, so we should still support it. find_module() likely won't die completely until 4.0. |
I've mostly run out of time to work on the docs, but I do want to say that I On Aug 01, 2012, at 05:13 AM, Eric Snow wrote:
That's how I tried to think about it. In some sense, it's just an While technically a finder, calling it the "path finder" just does not work, A diagram would seal the deal I think.
Please don't, for reasons stated above!
I don't think "handler" is a very good term. These things *are* finders, even I'm going to leave all the abc names and implementations to you guys. :) Thanks for all your excellent feedback and your own work on this nasty little |
So just to weigh in on this, I think "meta path finder" and "path entry finder" (as Barry stated in the glossary already) clearly delineate what is from sys.meta_path and what is from some path entry hook by having a clarifying word to go with "finder" (either kind of finder is still trying to find a loader). Finder/loader/importer has history in terms of usage so we can't muck with these terms too much. This does mean, though, that PathFinder and FileFinder are misnamed. Unfortunately PathFinder existed in previous versions of Python, so it will need an alias to its old name from a new name (PathImporter as Barry wants or MetaPathEntryFinder?). But FileFinder is new, so that can be renamed to FilePathEntryFinder (or FileEntryFinder). And of course WindowsRegistryImporter should be changed to *Finder. I think if we harp on the "meta path" vs. "path entry" and use it consistently and extensively then the finder/loader/importer terminology does not need to change meaning and we don't have to misuse it for path importer to classify PathFinder while keeping the rest of Barry's terms intact which make it obvious when something is directly because of how import works vs. just a single meta path finder that happens to be what nearly everyone ends up using to handle their imports. |
@barry: I'm fine with what you've said. My big concern is that we be really consistent here because of the complexity of the import system. Things are at a point that I think we're pretty close in the regard, so I'm mostly fine if we stay put. My only real regret would likely be "path importer" vs. PathFinder, but I can live with it. :) @brett: I'll update my patch for WindowsRegistryImporter and FileFinder. @anyone: do the doc-based deprecations in the patch look okay? The inheritance model? |
The patch will have Finder break all subclasses that don't define find_loader which is backwards-incompatible. And if you really want to deprecate Finder you should have a warning in an __init__ method (as one would hope people are calling super() as necessary). Otherwise the patch looks fine (from a cursory glance). |
As a final attempt at giving the path import machinery the emphasis it
"path_importer_cache" would still be a bit of a misnomer, but that ship |
On Aug 01, 2012, at 10:03 PM, Nick Coghlan wrote:
-0. To me, this doesn't make things vastly more clear for the additional
Yeah, we all agree that one's unfortunate! ;) |
The problem with "path importer" is it's just plain *wrong*. That object is |
On Aug 01, 2012, at 10:44 PM, Nick Coghlan wrote:
The term "importer" is sufficiently squishy though, and I think we could just
The problem is that "import finder" isn't correct either, because this thing It is unquestionably a finder, so probably the most accurate term would be Keep trying, but so far "path importer" is the least crappy of the lot, IMHO. |
How so? Finder is registered against PathEntryFinder. It doesn't inherit, so it doesn't get the abc.ABCMeta as a metaclass and thereby there is no instantiation-time checks for find_loader(). Attached is a patch that makes the two names changes (FileFinder was all over the place). |
Regarding deprecating importlib.abc.Finder, I'll add a DeprecationWarning to the patch later. |
What about "Path Scanner"? Came to me in a dream, so it has to be perfect, |
Regarding the ABCs, I'm okay with it as long as invalidate_caches() is meant to be a part of meta path finders. I had considered this while writing the patch, so how I had it wasn't a fluke. The fluke was that I didn't bring it up for discussion like I had meant to do. Anyway, my rationale was that importlib.invalidate_caches() is specific to path entry finders. My patch reflected this. If invalidates_caches() were meaningful for meta path finders then we could address that later without losing backward compatibility. As to pulling find_module() off Finder, I felt that having a distinct signature, while semantically unimportant, made a clearer distinction between the old and the new. |
On Aug 02, 2012, at 11:33 AM, Nick Coghlan wrote:
One problem I have, which I'm not sure how to solve, is that the protocol You'd like for the ABC to be able to capture both the support of these methods Maybe we need an @abc.optionalabstractmethod decorator? ;) |
I wish "path subsystem finder" weren't so long. Then PathFinder would probably still be appropriate for the class name (as FileFinder is appropriate for FilePathEntryFinder :). |
Perhaps you are kidding, but I've had a similar thought on a number of occasions. For kicks, I'll float this by python-ideas. :) |
In terms of invalidate_caches(), it could easily be changed to walk sys.meta_path and to call the method on the meta path finders. Not sure if that generalization is reasonable or not, though. |
Not to detract from the fun <wink> terminology discussion, but I have lingering concerns with the ABC inheritance model here. Looking over changeset 184700df5b6a, I'm still fine with the change if invalidate_caches() is appropriate for both. If not or if we don't know yet, I think we should hold off on moving invalidate_caches() up into Finder. If it is okay then we should add meta path finder traversal to importlib.invalidate_caches(). Either way there's a (not substantial) loose end. |
I guess the question is how extensive do we want this cache validation to go? Do we think that only path entry finders stored in sys.path_importer_cache will have stuff to be cleared, or do we think stuff on sys.meta_path or sys.path_hooks will have things that need to get cleared as well? The more I think about it, the more I want to generalize this to the point of sys.meta_path and then having PathFinder handle sys.path_hooks and sys.path_importer_cache. That stays in line with import not caring about sys.path, etc. and it really just being some cruft left over from Python 2. Regardless, though, I would not define invalidate_caches() on Finder and instead define it separately on PathEntryFinder and MetaPathFinder. It's new in 3.3 so it doesn't need to be on Finder for backwards-compatibility, and it isn't inherent to finders. Plus if Finder goes away we will have to push it up to the other ABCs anyway. |
I think the cache invalidation should be moved to the sys.meta_path level. |
Good points, Brett. While I still favor my posted patch, wouldn't Nick's commit still be okay? Or would it be confusing to people to have this "Finder" class that no one is actually supposed to use? |
Nick's current patch is fine, although I would add back in invalidate_caches() into MetaPathFinder and PathEntryFinder where they do nothing but return NotImplemented as a form of no-op. I know you have a python-ideas discussion going about optional abstractmethod, but until that's settled I would rather have something there to document in code what is supported as part of the API. |
As for Finder containing nothing significant, that's a side-effect of the cleanup and why Finder should get deprecated, especially since the contract that a Finder implements find_module() is no longer being enforced. Tying MetaPathFinder and PathEntryFinder together because they are conceptually finders is just not useful from a programming perspective since their APIs have no overlap except for invalidate_caches(), which if generalized, is just happenstance and not by explicit design. |
Per Nick's request that I redirect this here instead of bpo-15295... (w/already-fixed things deleted): Hope I'm not too late to the bikeshed painting party; just wanted to chip in with the suggestion of "self-contained package" for non-namespace packages. (i.e., a self-contained package is one that cannot be split across different sys.path entries due to its use of an __init__ module). Also, technically, namespace portions do not only contribute subpackages; they can contribute modules as well. (The trunk doc says they contribute subpackages ATM.) Another point of possible confusion: meta finders and path finders are both described as "hooks", but this terminology seems in conflict with the use of "hook" as describing a callable on path_hooks. Perhaps we could drop the term "hook" from this section, and retitle it "Import System Extensions" and say you can extend the system by writing meta finders and path entry finders. This would let the term "hook" be the exclusive property of path_hooks, which is how you extend the import system to use your custom finders. With respect to the current discussion about "importer", "handler", "finder", etc., I'd like to propose a slight tweak that I think can clear up the remaining confusing points:
Rationale: Despite the similarity in interface between the two types of finders, they perform different sorts of functions. A object in sys.meta_path can (and MUST) implement a broad module-finding policy, whereas a path entry finder is much more limited in scope - it only gets to override later path entry finders, and can't, for example, decide to use some other sort of __path__ attribute. (I'm not married to the term "handler" here, either - I just think "importer" is too generic, "finder" conflates with the more-commonly used kind of finder. "Policy" could be another possibility. Both imply a broader scope of responsibility than "finder", and don't impinge on the special meaning of "importer".) With a bigger distinction drawn between these two types of finders, we can say that you... """Extend the import system by adding "import handlers" to sys.meta_path, and that one of the default handlers is the "sys.path import handler", which processes imports using sys.path and module __path__ attributes. The sys.path import handler can of course in turn be extended by adding "path hooks" to sys.path_hooks, which are then used to create "module finder" objects for the path entry strings found in sys.path and module __path__ attributes. A path hook must return a finder object, which implements similar methods to those of an import handler, but with some important differences. Of course, if you make that "sys.path import policy" instead of "sys.path import handler", it still works. Thoughts? |
On the "self-contained" packages point, I was going to suggest that name, but the ability to set __path__ from __init__.py messes with it. "Regular packages" isn't great, but I'm willing to tolerate it (mainly because it was the terminology used in the approved PEP-420). The reason I still prefer "initialized" package though, is because the real difference between the two is the presence or absence of __init__.py. That means arbitrary code execution can occur as a side effect of import for initialized packages, in contrast to namespace packages which are comparatively "pure" (that is, no user provided code is executed as a side effect of importing them). +1 for renaming the current "hooks" section to "extensions" (with only the callables on path_hooks being referred to as hooks). On the topic of "import handlers" vs "meta path finders", the reason I'm happy with keeping the common suffix for "meta path finders" and "path entry finders" is that while there are substantial differences in *how* they do what they do, they are fundamentally the same in terms of *what* they do: given a module name, find a loader for it (or indicate that you can't find one). Meta path finders search more broadly than path entry finders do, but they're still doing much the same job. I've come to the conclusion that PEP-302 actually did a reasonable job with the finder vs loader vs importer classification scheme, and it's not worth trying to change that terminology too much after it's already been in the wild for 10+ years. Clarify and refine by introducing additional distinctions, yes, but change, no. |
The more I think about it the more I think there needs to be a harder separation between the import system proper and the path-based subsystem. I hadn't really thought of it this way until the discussion of the last few days, but it has really clicked in my mind. The new "import system" in the language reference already does a pretty good job of this. I'll try to get a patch up in the next few days that shows what I mean. |
See my comment above about explicitly defining the "path import subsystem" and renaming the "path importer" to be the "path import finder". I think the subsystem is well worth calling out explicitly, as when most people think of Python's import system, they're far more likely to be thinking about sys.path and filesystem imports in general than they are the meta path machinery. |
This patch implements most of Brett's recommendation. I've held off on actually deprecating Finder just yet. However, it would probably entail a message in the docs like this: The API signatures for meta path finders and path entry finders |
Eric's latest patch LGTM except for the missing patch for test.test_importlib.test_abc to fix the inheritance checks. Nick, Eric Smith, or Barry: can you double-check the patch is good? About the only thing that might be questionable is having MetaPathFinder and PathEntryFinder not inherit from Finder itself, but that does make sense and we do have multiple inheritance for a reason in Python. |
I still need a core committer to double-check me on Eric's patch to make sure the ABC changes are acceptable. I can handle the commit if someone will just sign off. |
Done. Only substantive comment was that Finder should be registered with both of the more specific metaclasses, but there were a couple of docs comments as well. |
Oh, and +1 for Eric's "path-based finder". States clearly that it is a finder that *uses* import paths, rather than finding them. |
Thanks for the review, Nick! I will get this patch committed sometime today with your comments integrated. |
Actually there is one issue with PathEntryFinder inheriting from Finder; it doesn't need to implement find_module() if it doesn't want to in Python 3.3. So I will need to stick in a dummy find_module() on the class that calls find_loader() and then returns any found loader and tosses the namespace paths. |
Sounds good. Perhaps steal the impl from FileFinder.find_module (i.e. the one with the ImportWarning). |
New changeset 0a75ce232f56 by Brett Cannon in branch 'default': |
I will go back and steal the FileFinder.find_module implementation after I finish the current patch I'm working on. |
New changeset e7a67f1bf604 by Brett Cannon in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: