New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Add basic extension loader. #4665
base: main
Are you sure you want to change the base?
Conversation
toofar writes:
### Error handling
I assume if we fail to load a third party extension we should try to ignore it and continue on. That currently isn't the case, even the `_on_walk_error()` method that I am re-using raises an error which is never handled. I am not sure how to best proceed though, just log an error?
* when `os.mkdir()` fails, because we have seen that before,
* and when importing the extensions, because I assume we don't want third party extensions to crash the browser when there dependancies aren't met or something,
* and when running the module init hooks for the same reason.
configfiles.py has the logic for handling errors in config.py, I'm guessing we
should do something almost exactly the same so that normally, an error in an
extension will just pop an error (and ideally keep loading the rest of the
extensions)
### Experimental feature warning thing
Shall I just log an all caps thing? Or I could make it only be active if the `--debug` flag is passed.
This should probably log always when it loads a module for now.
## Future work
### Support loading folders
If I remove the `if ispkg: continue` check the finder line actually succeeds in loading the directory (if it has a `__init__.py`) but after we yield it drops back into the `walk_packages` implemention after its own yield where it calls `__import__(name)` which fails because it presumably tries to load the parent package which isn't real. It even fails with the current implementation if there is a package folder in the extensions folder ever though we aren't importing it.
It may be as simple as ignoring errors for packages, we could check in sys.modules if the package itself got imported right.
I'm not sure how to handle this. I'm not very familiar with the package
walker, but how does it handle alternate top level modules in packages? If two
packages define a 'tests' module, what will happen? Unfortunately making the
tests folder a module is needed to get pytest to run properly without args.
I'm guessing right now, it only loads single files (that would be a bummer for
larger extensions, so it would be nice to have it fixed).
### Unloading/Re-loading modules
The `config-source` command has been useful for developing external things so far. Both when developing and when adding things developed in a separate basedir into my main browser instance when I have got something stable without having to restart the whole instance. Currently loaded modules are stored in the `_module_infos` global list. Since they are modules though they do have unique names so we could support reloading (we might have to delete them out of `sys.modules` too though). Reloading modules does have the obvious possible problem of the outgoing copy of those modules having done things like register listeners. Although the config change filter in the extension API does avoid this so if continue with that pattern it would be less of a problem. We could also add a `hook.unload()` decorator where the extension could clean up.
The alternative is to just load new extensions although I would prefer to be able to reload them if it doesn't turn out to be too fraught.
Since we're using the python import system, couldn't we use importlib.reload
to reload modules? Since the hooks work, people should be using hooks for all
their initialization, and make their modules properly reloadable. This isn't
flawless, for example, if someone registers a interceptor function, it won't
be updated (I think?).
|
qutebrowser/extensions/loader.py
Outdated
if ispkg: | ||
continue | ||
if name not in sys.modules: | ||
_module = finder.find_module(name).load_module(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this line do anything? If so, maybe document it, because it does not look like it does anything...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it imports the module and saves it into sys.modules
as a side effect. I will add a comment and remove the left hand side.
Yep. I think we should get bigger ones working too but I would rather do that as a separate PR. For jmatrix for now I put the import time init stuff into an
Ah, I was looking for that. It looks like it throws an error if the module's parent isn't in
It uses fully specified dotted names. Currently prefixed by the prefix we pass. So |
FWIW reloading Python modules is a can of worms - that's definitely something I'd rather defer to some later stage of extension support. |
Oh also, @The-Compiler, on #30 it says plugins should be able to
Is the leading captial letter something you still want to mandate? |
I'm not sure. As long as extensions aren't official I don't have to decide yet, though 😉 |
dcfd829
to
c711025
Compare
That's all the stuff I wanted to get done for this pass (mainly support loading directories). I added some exception handling to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks good overall, just some small suggestions.
qutebrowser/extensions/loader.py
Outdated
pass | ||
return | ||
|
||
if any(os.listdir(ext_dir)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any()
takes a list of values and checks whether any of those is boolean-True. It happens to work here because any([]) == False
and any(['foo', 'bar']) == any([True, True]) == True
, but just
if any(os.listdir(ext_dir)): | |
if os.listdir(ext_dir): |
(relying on the list to be either empty or not) is far less confusing 😉
qutebrowser/extensions/loader.py
Outdated
if name not in sys.modules: | ||
try: | ||
# Import the module with the finder so that it is in | ||
# sys.modules readif for _load_component. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# sys.modules readif for _load_component. | |
# sys.modules ready for _load_component. |
I think?
qutebrowser/extensions/loader.py
Outdated
finder.find_module(name).load_module(name) | ||
except Exception: | ||
log.extensions.exception( | ||
"Failed importing extension: {}".format(name[len(prefix):]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Failed importing extension: {}".format(name[len(prefix):]) | |
"Exception while importing extension: {}".format(name[len(prefix):]) |
(or "Failed to import")
qutebrowser/extensions/loader.py
Outdated
mod_info.init_hook(_get_init_context()) | ||
except Exception: | ||
log.extensions.exception( | ||
"Failed initializing extension: {}".format(mod.__file__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Failed initializing extension: {}".format(mod.__file__) | |
"Exception while initializing extension: {}".format(mod.__file__) |
qutebrowser/extensions/loader.py
Outdated
@@ -180,7 +227,15 @@ def _on_config_changed(changed_name: str) -> None: | |||
cfilter = config.change_filter(option) | |||
cfilter.validate() | |||
if cfilter.check_match(changed_name): | |||
hook() | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't catch exceptions in the if option is None:
part above - I guess the whole block should be in the try:
instead?
qutebrowser/extensions/loader.py
Outdated
log.extensions.exception( | ||
"Failed running config change hook for " | ||
"item {} in extension: {}".format( | ||
changed_name, hook.__code__.co_filename, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit hacky for getting the name - why not add that information to ModuleInfo
or so?
df40f33
to
2c6b610
Compare
I made this extension to reload extensions to make hacking easier. Previously I was developing them in config.py because that supports reloading and then moving them to extensions. I also made one to reload autoconfig because it makes sharing config between basedirs easier. I tested what would happen if a naughty extension connects to a PyQt signal by connecting to config.instance.changed and printing the curried module ID and they just keep getting printed, even after a gc run, how suspicious, ah, that is just if the connectee isn't a bound method it seems. If they are on an object they stopped getting called. from qutebrowser.api import cmdutils
@cmdutils.register()
def reload_extensions():
"""Attempt to reload any third party extensions."""
mod_prefix = "qutebrowser.extensions.third_party"
# registered commands, including their args
from qutebrowser.misc import objects
todel = [
cmd.name for cmd in objects.commands.values()
if cmd.handler.__module__.startswith(mod_prefix)
]
for name in todel:
del objects.commands[name]
# interceptor functions, could probably del by value too
from qutebrowser.extensions.interceptors import _interceptors
todel = [
idx for idx, func in enumerate(_interceptors)
if func.__module__.startswith(mod_prefix)
]
for idx in reversed(todel):
del _interceptors[idx]
# extension infos with hooks
from qutebrowser.extensions.loader import _module_infos, load_extensions
todel = [
idx for idx, modinfo in enumerate(_module_infos)
if modinfo.module.__name__.startswith(mod_prefix)
]
for idx in reversed(todel):
del _module_infos[idx]
# remove from sys.modules to make sure walk_extensions reloads them
# I didn't test importlib.reload(), it might not work because these aren't real modules.
import sys
todel = [name for name in sys.modules.keys() if name.startswith(mod_prefix)]
for name in todel:
del sys.modules[name]
load_extensions() |
2c6b610
to
3a5c4f4
Compare
de76937
to
b9ab92f
Compare
qutebrowser/extensions/loader.py
Outdated
spec = finder.find_spec(name, None) | ||
if not spec or not spec.loader: | ||
raise ImportError(f"pkgutil couldn't find loader for {name}") | ||
spec.loader.load_module(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO:
Deprecated since version 3.4: The recommended API for loading a module is exec_module() (and create_module()). Loaders should implement it instead of load_module(). The import machinery takes care of all the other responsibilities of load_module() when exec_module() is implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed a commit that should work with python 3.12. Although I'm not convinced it's the right way to do dynamic imports like this. Mainly because I have to add stuff to sys.modules myself which feels wrong. Although there are other examples on the internet doing the same thing. See the commit message for more context and ideas.
I forgot to put it in the commit message but this was the stack trace before I was adding it to sys.modules myself:
File "/home/user/projects/qutebrowser/qutebrowser/app.py", line 141, in init
loader.load_extensions()
File "/home/user/projects/qutebrowser/qutebrowser/extensions/loader.py", line 88, in load_extensions
_load_component(info, skip_hooks=skip_hooks)
File "/home/user/projects/qutebrowser/qutebrowser/extensions/loader.py", line 198, in _load_component
info.spec.loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/tmp/qute-basdir-extensions/data/extensions/jmatrix.py", line 6, in <module>
@hook.init()
^^^^^^^^^^^
File "/home/user/projects/qutebrowser/qutebrowser/api/hook.py", line 37, in __call__
info = _add_module_info(func)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/projects/qutebrowser/qutebrowser/api/hook.py", line 18, in _add_module_info
module = importlib.import_module(func.__module__)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1140, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'qutebrowser.extensions.third_party'
This is similar to `_walk_normal()` but since the extensions aren't in a properly impartable package we have to actually use the finder returned by `walk_packages()` to load the source files. That means the module is actually imported at that point and when `_load_component()` uses importlib it finds the module alreay loaded and everything works. TODO: error handling * when `os.mkdir()` fails, because we have seen that before, * and when importing the extensions, because I assume we don't want third party extensions to crash the browser when there dependancies aren't met or something, * and when running the module init hooks for the same reason. TODO: support loading folders If I remove the `if ispkg: continue` check the finder line actually succeeds in loading the directory (if it has a `__init__.py`) but after we yield it drops back into the `walk_packages` implemention after its yield where it calls `__import__(name)` which fails because it presumably tries to load the parent package which isn't real. It may be as simple as ignoring errors for packages.
If you have a folder in your extensions folder with a `__init__.py` file in it it will now be loaded as an extension. So will the modules under it. Extensions can't currently be imported from `qutebrowser.extensions.third_party` but if we added a `third_party` folder in qutebrowser/extentions (even an empty one) we would be able to import them using the "from ... import <extension>" syntax. The issue was that when pkgutil whent to import the package it was failing because the top level module ("extensions") wasn't real. Now it is under a real package.
Handle errors importing extensions and running their initialization and config changed hooks by logging the error (via a stack trace) and continuing. I put the name/filename of the extensions in the log messages but now I realise the filename is going to be in the stack trace. If lint complains I will just take them out.
The case where they wanted to be called for any config item wasn't covered by the broad except. I don't want the config change filter validation to be in the broad except but I am going to move that out in a future PR anyway. It is curretly a memory leak (when change filters are instantiated they are put into a module scope list, here we instantiate them whenever a config item is set) and we can fail early on that.
Because the user might care.
I wanted to log the filename for some reason. Probably because the extension name would have the `qutebrowser.third_party` prefix. Otherwise I could just pass the extension name to `add_module_info()`. I could go either way really.
The vulture and docs scripts both call load_components befer standadir._init_data() is called, so they were getting a KeyError when the extension loader went to find the extension directory. There shouldn't be any extensions installed in those cases anyway, and if there were they shouldn't be included in the docs. Also the docstring on load_components was lying.
FileFinder.find_module(name) doesn't comply with whatever stubs mypy is looking at. The source mentions it is deprecated this one. Still had to add a couple of unnecessary things to make mypy happy, the `None` argument to find_spec() is defined in the source as `path=None` but mypy thinks it is neither optional nor usable as a keyword argument. The (fullname) argument to load_module() appears to be unnecesary in practice but mypyp insists.
Importlib says this about load_module: > Deprecated since version 3.4: The recommended API for loading a module > is exec_module() (and create_module()). Loaders should implement it > instead of load_module(). The import machinery takes care of all the > other responsibilities of load_module() when exec_module() is > implemented. I'm not 100% sure how to implement the new way of doing things with this setup, I've got something working now but I'm not sure it is the best way of doing it. To recap, what I'm trying to do here is load extensions from the extension folder (eg `$XDG_DATA_DIR/qutebrowser/extensions/`) under a namespace we control (eg `qutebrowser.extensions.third_party`). I don't remember why I set it up like that but there is probably a good reason. What I was doing previously was creating a module and not loading it (I think?) in walk_extensions, by extensions putting it in sys.modules, then later load_component would call the normal import mechanism that would find it already loaded. Now I can't find a way to create a module without importing it. Eg `spec.loader.create_module(spec)` just returns None, which the docs say means "use the standard import mechanism". Not a very helpful replacement for `load_module()`! But the loader returned by the finder returned by walk_packages does now how to import the module. So for now I'm saving the whole spec for `walk_component()` so it can re-use that loader. And then after creating the module I have to add it to sys.modules manually and then exec it. Having to manipulate sys.modules manually is problematic and the biggest sign that I'm doing the wrong thing here. I suspect that the proper fix is to register a finder in `sys.meta_path` that'll handler that fake "third_part" package so that you can just to `import qute.third_party.my_extension` and it'll map the module name to a path and create a module from a file somehow. The finder I see here is `FileFinder('/real/path/to/extensions/')` so that might be a good starting spot. Links for how to create Finders: https://docs.python.org/3/library/importlib.html#setting-up-an-importer https://stackoverflow.com/questions/57126085/how-to-use-sys-meta-path-with-python-3-x
b9ab92f
to
91c58e9
Compare
I wanted to see if I could move all my config.py snippets into extensions. I could, because I am not doing any configurating in config.py, just registering commands. This works (for me) but it could use some more attention, particularly around error handling. I don't know how to best handle probably non-fatal errors at init time and I would like to be able to proceed if we run across them. I opened this PR to get some discussion on what we want out of this.
This implementation is similar to
_walk_normal()
but since the extensions aren't in a properly importable package we have to actually use the finder returned bywalk_packages()
to load the source files. That means the module is actually imported at that point and when_load_component()
uses importlib it finds the module already loaded and everything works.I wanted to keep this PR small but there are other features that could be added to it I bring up below.
Error handling
I assume if we fail to load a third party extension we should try to ignore it and continue on. That currently isn't the case, even the
_on_walk_error()
method that I am re-using raises an error which is never handled. I am not sure how to best proceed though, just log an error?os.mkdir()
fails, because we have seen that before, ✓Experimental feature warning thing ✓
Shall I just log an all caps thing? Or I could make it only be active if the
--debug
flag is passed. It'll be easily missed if you have--debug
on though. I supposed I could also just not autocreate the extensions folder.Future work
Support loading folders ✓
If I remove the
if ispkg: continue
check the finder line actually succeeds in loading the directory (if it has a__init__.py
) but after we yield it drops back into thewalk_packages
implemention after its own yield where it calls__import__(name)
which fails because it presumably tries to load the parent package which isn't real. It even fails with the current implementation if there is a package folder in the extensions folder ever though we aren't importing it.It may be as simple as ignoring errors for packages, we could check in sys.modules if the package itself got imported right.
It was failing with "No module named 'extensions'" because I had set the import prefix to "extensions.". Once I changed it to something properly namespaced under a real module (
qutebrowser.extensions.third_party.
) it just works.Having a non-existant module (like
third_party
) in the (fake) module path means you cannot import the extension modules (which I don't see a reason to support). You can still get them fromsys.modules
though (and importlib). Even making that a package in the source tree doesn't let you import them (it seems to look for child modules as attributes of their parents rather than just looking for the whole heirarchy in sys.modules). I think I'll just leave it out and not support importing third party modules. They are in_module_infos
anyway. Actually you can still dofrom qutebrowser.extensions.third_party import whatever
, cool.Unloading/Re-loading modules
The
config-source
command has been useful for developing external things so far. Both when developing and when adding things developed in a separate basedir into my main browser instance when I have got something stable without having to restart the whole instance. Currently loaded modules are stored in the_module_infos
global list. Since they are modules though they do have unique names so we could support reloading (we might have to delete them out ofsys.modules
too though). Reloading modules does have the obvious possible problem of the outgoing copy of those modules having done things like register listeners. Although the config change filter in the extension API does avoid this so if continue with that pattern it would be less of a problem. We could also add ahook.unload()
decorator where the extension could clean up.The alternative is to just load new extensions although I would prefer to be able to reload them if it doesn't turn out to be too fraught.
Put behind a config item (added 2024-02-06)
And surface that this is turned on in :version and :report.
Consider how we could make versioning a part of this (added 2024-02-06)
Ideally extensions would declare a version they were compatible with and if they were running against a too new qutebrowser version they would complain/fail/require extra work to enable.
This change is