Add a keyword-only spec argument to types.ModuleType #64582
Comments
Would allow for the name attribute to be optional since it can be grabbed from the spec. Since having module.__spec__ set is now expected we should help ensure that by supporting it in the constructor. |
I envision making this happen would also allow for importlib to be updated to rely on __spec__ when possible, with the idea that sometime in the future we can deprecate pulling attributes from a module directly and shift to always working from __spec__. |
Sounds good to me. |
Is there any chance that this would ever happen? |
I made roughly the same point in the current import-sig thread that relates here: https://mail.python.org/pipermail/import-sig/2014-April/000805.html Basically, I agree we should be careful with both __name__ and __file__. |
No, the attribute level arguments won't go away - __name__ deliberately differs from __spec__.name in some cases (notably in __main__), __path__ may be manipulated after the module is loaded, and __name__ and __file__ are both used too heavily within module code for it to be worth the hassle of deprecating them in favour of something else. I think Brett's push to simplify things as much as possible is good though - that's the main brake on creeping API complexity in the overall import system as we try to make the internals easier to comprehend and manipulate. |
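The __main__ case mentioned above can be observed with the standard library's runpy module. This is only an illustrative sketch: the choice of the stdlib module colorsys is an assumption made for the example, picked because it is small and has no side effects when executed.

```python
import runpy

# Execute the stdlib module "colorsys" the way "python -m colorsys" would,
# i.e. with its code running under the name "__main__".
namespace = runpy.run_module("colorsys", run_name="__main__")

# The name the module's own code sees while it runs.
print(namespace["__name__"])

# The name the import system located the module under.
print(namespace["__spec__"].name)
```

The two printed names should differ, which is why code cannot simply treat __spec__.name as a drop-in replacement for __name__.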
I can dream about getting rid of the attributes, but I doubt it would happen any time soon, if at all. But we do need to make it easier to set __spec__ on a new module than it currently is to help promote its use. |
My current thinking on this is to introduce in importlib.util:
def module_from_spec(spec, module=None):
    """Create/initialize a module based on the provided spec."""
This serves two purposes. One is that it abstracts the loader.create_module() dance out so that's no longer a worry. But more crucially it also means that if you have the function create the module for you then it will be returned with all of its attributes set without having to worry about forgetting that step. The module argument is just for convenience in those instances where you truly only want to override the module creation dance for some reason and really just want the attribute setting bit. |
Here is an implementation of importlib.util.module_from_spec(). With this it makes PEP-451 conceptually work like:
spec = importlib.util.find_spec(name)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
About the only other thing I can think of that people might still want is something like |
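For readers who want to try the three-step flow, here is a minimal runnable sketch using the public importlib.util functions as they exist in Python 3.5 and later. The helper name load_fresh and the use of the stdlib module colorsys are assumptions made for illustration; they are not part of the patch discussed in this thread.

```python
import importlib.util


def load_fresh(name):
    """Hypothetical helper: build and execute a new module object for `name`.

    The result is deliberately not stored in sys.modules, so it is
    independent of whatever a regular "import name" would return.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"no module named {name!r}")
    # Creates the module and sets __spec__, __loader__, __name__, etc.
    module = importlib.util.module_from_spec(spec)
    # Runs the module's code inside the freshly created namespace.
    spec.loader.exec_module(module)
    return module


colorsys_copy = load_fresh("colorsys")
print(colorsys_copy.__spec__.name)
```

Because the helper never touches sys.modules, it suits one-off inspection rather than acting as a replacement for the import statement.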
Nope, I commented where I meant to. I wanted a way to promote people to **always** create modules with properly initialized attributes while also dealing with the module creation dance at the same time. Otherwise it will require expanding the API of types.ModuleType() to accommodate specs while also needing to expose a function to do the proper thing to get a module for a loader and **still** have a function to set up a module properly based on what was potentially returned from create_module(). Rolling it all into a single function that just gets you a module ready for use seems like the most practical solution from an API perspective. |
I'd ask "Why not a class method?", but I already know the answer (types.ModuleType is implemented in C, so it would be unnecessarily painful to implement it that way). Given that, the utility function approach sounds good to me. |
Okay, I didn't read closely enough. :) It may be worth updating the title. FWIW, the name "module_from_spec" confused me at first because my brain interpreted that as "load_from_spec". Keeping the name and purpose more focused might be helpful. I have comments below to that effect. If your proposed change still makes sense, could we keep it simpler for now? Something like this:
# in importlib.util
def module_from_spec(spec, module=None):
    """..."""
    methods = _bootstrap._SpecMethods(spec)
    if module is None:
        return methods.create()
    else:
        # Initialize the attributes of the module that was passed in.
        methods.init_module_attrs(module)
        return module
Keeping the 2 methods on _SpecMethods is helpful for the PEP-406 (ImportEngine) superseder that I still want to get back to for 3.5. :) ----------------------------------
I'm not sure what dance you mean here and what the worry is.
So we would discourage calling ModuleType directly and encourage the use of a function in importlib.util that is equivalent to _SpecMethods.create(). That sounds good to me. The use case for creating module objects directly is a pretty advanced one, but having a public API for that would still be good. From this point of view I'd expect it to just look like this:
def new_module(spec):
    return _SpecMethods(spec).create()
or given what I expect is the common use case currently:
def new_module(name, loader):
    spec = spec_from_loader(name, loader)
    return _SpecMethods(spec).create()
or together:
def new_module(spec_or_name, /, loader=None):
    if isinstance(spec_or_name, str):
        name = spec_or_name
        if loader is None:
            raise TypeError('missing loader')
        spec = spec_from_loader(name, loader)
    else:
        if loader is not None:
            raise TypeError('got unexpected keyword argument "loader"')
        spec = spec_or_name
    return _SpecMethods(spec).create()
To kill 2 birds with 1 stone, you could even make that the new signature of ModuleType(), which would just do the equivalent under the hood. That way people keep using the same API that they already are (no need to communicate a new one to them), but they still get the appropriate attributes set properly.
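The (name, loader) variant can be approximated today with public functions only. The sketch below is an assumption-laden illustration, not the API proposed in this thread: StringLoader is a hypothetical toy loader, and new_module is built on importlib.util.spec_from_loader() and importlib.util.module_from_spec() rather than on the private _SpecMethods class.

```python
import importlib.util


class StringLoader:
    """Hypothetical toy loader that executes a fixed source string."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        # Returning None asks the import machinery for a default module.
        return None

    def exec_module(self, module):
        exec(self.source, module.__dict__)


def new_module(name, loader):
    """Sketch of the (name, loader) convenience form using public APIs."""
    spec = importlib.util.spec_from_loader(name, loader)
    return importlib.util.module_from_spec(spec)


loader = StringLoader("x = 1")
mod = new_module("demo_mod", loader)
# The module is created and initialized, but not yet executed.
mod.__spec__.loader.exec_module(mod)
print(mod.__name__, mod.x)
```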
Perhaps it would be better to have a separate function for that (equivalent to just _SpecMethods.init_module_attrs()). However, isn't that an even more uncommon use case outside of the import system itself?
I agree it's somewhat orthogonal. I'll address comments to issue bpo-21235. |
First, about breaking up _SpecMethods: that was entirely on purpose. =) I honestly have found _SpecMethods a bit of a pain to work with because at every place where I have a spec object and I need to operate on it I end up having to wrap it and then call a method on it instead of simply calling a function (it also doesn't help that spec methods is structured as a collection of methods which can just operate as functions since they almost all have
Second, the dance that an advanced user has to worry about is "does create_module exist, and if so did it not return None and if any of that is not true then return what types.ModuleType would give you" is not exactly one line of code ATM. There is no reason to not abstract that out to do the right thing in the face of a loader.
Third, yes this would be to encourage people not to directly call types.ModuleType but call the utility function instead.
Fourth, I purposefully didn't bifurcate the code of types.ModuleType based on the type of what the first argument was. At best you could change it to take an optional spec as a second argument and use that, but if you did that and the name doesn't match the spec then what? I'm not going to promote passing in a loader because spec_from_loader() cannot necessarily glean all of the same information a hand-crafted spec could if the loader doesn't have every possible introspection method defined (I'm calling "explicit is better than implicit" to back that up). I also don't think any type should depend on importlib having been previously loaded to properly function if it isn't strictly required, so the code would have to be written entirely in C which I just don't want to write. =) Since it's going beyond simply constructing a new module but also initializing it I think it's fine to keep it in importlib.util, which also makes it all more discoverable for people than having to realize that types.ModuleType is the place to go to create a module and get its attributes initialized.
Fifth, fair enough on not needing the module argument. I will refactor the code for internal use of attribute initialization in testing and leave it at that. |
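The "dance" described in the second point can be written out as a short sketch. This is an illustration of the logic as stated in the comment, not the importlib implementation; the function name create_module_for and the two toy loader classes are hypothetical.

```python
import types
from importlib.machinery import ModuleSpec


def create_module_for(spec):
    """Sketch: ask the loader for a module, else fall back to ModuleType."""
    module = None
    create = getattr(spec.loader, "create_module", None)
    if create is not None:
        # A loader may return None to request the default module type.
        module = create(spec)
    if module is None:
        module = types.ModuleType(spec.name)
    return module


class PlainLoader:
    """Hypothetical loader with no create_module() at all."""


class CustomModule(types.ModuleType):
    """Hypothetical module subclass returned by a loader."""


class CustomLoader:
    """Hypothetical loader that supplies its own module object."""

    def create_module(self, spec):
        return CustomModule(spec.name)


default_mod = create_module_for(ModuleSpec("plain", PlainLoader()))
custom_mod = create_module_for(ModuleSpec("custom", CustomLoader()))
print(type(default_mod).__name__, type(custom_mod).__name__)
```

Note that this sketch only creates the module; it does not set __spec__ or the other import-related attributes, which is the second half of what the utility function is meant to cover.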
tl;dr I'm okay with pulling the functions out of _SpecMethods (and
Keep in mind that _SpecMethods is basically a proxy for ModuleSpec SimpleModuleSpec # What currently is ModuleSpec. Regardless, _SpecMethods made sense at the time and I still find it
That's what _SpecMethods.create() already does for you.
I'm totally on board. :)
I see your point. I just see the trade-offs a little differently. :)
If every module must have a spec, then I'd expect that to be part of Regarding the second point, with a separate factory function, people Backward compatibility for an updated signature shouldn't be too hard: currently: ModuleType(name, doc=None)
Regardless of new signature or new factory, I still think the mod = new_module(name, loader=loader) vs. spec = spec_from_loader(name, loader=loader)
mod = new_module(spec) I'll argue that we can accommodate the common case and if that doesn't
If it's in _bootstrap then it's already available. No C required. :)
It seems to me that people haven't been thinking about initializing a |
I think we view the fundamentals of built-in types differently as well. =) A module instance can exist without a spec no problem. As long as you don't pass that module instance through some chunk of code that expects __spec__ -- or any other attribute for that matter -- then the instance is fine and operates as expected. There is nothing fundamental to module objects which dictates any real attribute is set (not even __name__). If I wanted to construct a module from scratch and just wanted a blank module object which I assign attributes to manually then I can do that, and that's what types.ModuleType provides (sans its module name requirement in its constructor which really isn't necessary either).
This also means that tying the functionality of the module type to importlib is the wrong direction to have the dependency. That means I refuse to add code to the module type which requires that importlib have been imported and set up. Just think of someone who embeds Python and has no use for imports; why should the module type then be broken because of that choice? That means anything done to the module type needs to be done entirely in C.
But luckily for you bytes broke the glass ceiling of pivoting on a type's constructor based on its input type (it is the only one, though). So I'll let go of arguing that one. =)
Anyway, I'll think about changing types.ModuleType, but having to implement init_module_attrs() in pure C and then exposing it in _imp just doesn't sound like much fun. And as for your preference that "the distinct functions continue to exist as they are", are you saying you want the code duplicated or that you just don't like me merging the create() and init_module_attrs() functions? |
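The point that a module object works without a spec is easy to check. A minimal sketch, where the module name "scratch" and the attribute answer are arbitrary choices for the example:

```python
import types

# A blank module built directly from the type, with no import machinery.
blank = types.ModuleType("scratch")

# Attributes can be assigned by hand, as with any namespace object.
blank.answer = 42

# No spec was supplied, so the attribute holds a placeholder value.
print(blank.__spec__)
print(blank.answer)
```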
I give. :) You've made good points about builtins and C implementations. Also, thinking about issue bpo-21235 has changed my perspective a bit. As to _SpecMethods, I mean just drop the class and turn the methods into functions:
And then importlib.util.new_module:
def new_module(spec):
    return _bootstrap._spec_create(spec) |
Why do you want a one-liner wrapper for the functions for the public API when they are exactly the same? |
You're right that it doesn't have to be a one-line wrapper or anything more than an import-from in importlib.util. :) |
Giving Eric his polymorphic first argument to types.ModuleType() is going to be tricky thanks to the fact that it is not hard-coded anywhere in the C code nor documented that the first argument must be a string (the only way it might come up is from PyModule_GetName() and even then that's not required to work as you would expect). And since we purposefully kept specs type-agnostic, you can't do a type check for a SimpleNamespace. I think the only way for it to work is by attribute check (e.g. is 'name' or 'loader' defined on the object, regardless of value). |
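The attribute check suggested above might look like the following sketch. The helper name looks_like_spec is hypothetical, and the check is written in Python for illustration even though the discussion concerns C code.

```python
from types import SimpleNamespace


def looks_like_spec(obj):
    """Hypothetical duck-type test: attribute presence, regardless of value."""
    return hasattr(obj, "name") or hasattr(obj, "loader")


# Specs are type-agnostic, so a plain namespace can stand in for one.
spec_like = SimpleNamespace(name="demo", loader=None)

print(looks_like_spec(spec_like))
print(looks_like_spec("demo"))
```

A string has neither attribute, so a plain module name would fall through to the existing code path.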
Another issue with the polymorphic argument is that the module type is one of those rare things written in C with keyword parameter support, so renaming the 'name' argument to 'name_or_spec' could potentially break code. |
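The keyword-parameter concern can be seen from Python. In this sketch the name_or_spec keyword is the hypothetical rename from the comment above; it is not accepted by current versions of types.ModuleType.

```python
import types

# Existing code may already pass the name by keyword.
mod = types.ModuleType(name="demo", doc="A demo module.")
print(mod.__name__, mod.__doc__)

# A renamed parameter would not be understood by the current constructor,
# and after a rename the call above would be the one that fails instead.
try:
    types.ModuleType(name_or_spec="demo")
    renamed_keyword_accepted = True
except TypeError:
    renamed_keyword_accepted = False

print(renamed_keyword_accepted)
```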
Yeah, it just looks too complicated to take the ModuleType signature approach, as much as I prefer it. :) I appreciate you taking a look though. |
And another complication is the compatibility hack to set loader when submodule_search_locations is set but loader is not since _NamespaceLoader is not exposed in C without importlib which gets us back into the whole question of whether types should function entirely in isolation from other subsystems of Python. |
But that part is less of a concern since we don't need namespace packages before or during bootstrapping, and afterward we have access to interp->importlib. Or am I missing something?
That is a great question, regardless. In part I imagine it depends on the subsystem and how intrinsic it is to the interpreter, which seems like a relatively implementation-agnostic point from a high-level view. |
New changeset b26d021081d2 by Brett Cannon in branch 'default': |
After all the various revelations today about how much of a hassle and murky it would be to get types.ModuleType to do what we were after, I went ahead and kept importlib.util.module_from_spec(), but dropped the module argument bit. I also deconstructed _SpecMethods along the way while keeping the abstractions as Eric requested. |
Thanks for doing that Brett and for accommodating me. :) Also, the various little cleanups are much appreciated. |
Another thing to keep in the back of your minds: one of my goals for PEP-432 is to make it possible to have a fully functional embedded interpreter |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.