Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transform plugin not called when running with multiple jobs #4874

Closed
skogsvik opened this issue Aug 19, 2021 · 9 comments · Fixed by #8683 or #9093
Closed

Transform plugin not called when running with multiple jobs #4874

skogsvik opened this issue Aug 19, 2021 · 9 comments · Fixed by #8683 or #9093
Labels
Bug 🪲 multiprocessing Needs PR This issue is accepted, sufficiently specified and now needs an implementation
Projects
Milestone

Comments

@skogsvik
Copy link

Summary

When adding a plugin which registers a transform it registers and calls the transform as expected, but when enabling multiple jobs, the transform doesn't get called resulting in a different result than expected.

The plugin seems to be loaded fine, but the transform is not triggered.

The expected behavior is to have the linting result be the same regardless of the number of jobs.

Pylint version

> python -m pylint --version
pylint 2.8.3
astroid 2.5.6
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]

MWE

With the content of mwe.py being:

# PLUGIN
import astroid


def register(_):
    # Needed for registering the plugin.
    pass


def has_extra_arg_decorator(node):
    return node.decorators and any(d.name == 'class_wrapper' for d in node.decorators.nodes)


def add_extra_arg(node):
    print('transform triggered!')
    init_function = node.getattr('__init__')[0]
    new_arg = astroid.nodes.AssignName('new_arg', parent=init_function.args)
    init_function.args.args.insert(1, new_arg)


# MRE
def class_wrapper(cls):
    """
    Class decorator which adds an extra argument to __init__
    """
    init_func = cls.__init__

    def wrapped_init(self, new_arg, *args, **kwargs):
        print(new_arg)
        init_func(self, *args, **kwargs)
    cls.__init__ = wrapped_init
    return cls


@class_wrapper
class CoolClass():
    def __init__(self):
        print('__init__')


if __name__ == '__main__':
    instance = CoolClass('123')
else:
    astroid.MANAGER.register_transform(astroid.ClassDef, add_extra_arg, has_extra_arg_decorator)
    print('plugin registered!')

You get the following error when running pylint without plugins:

> python -m  pylint -E mwe.py
************* Module mwe
mwe.py:42:15: E1121: Too many positional arguments for constructor call (too-many-function-args)

This is most likely since it can't figure out what the decorator is doing (similar to #259 but for the class decorator).

If we enable the plugin which adds the arg back using the transform (since we can't use signature-mutators as it doesn't seem to work as expected here) we see that the error disappears

> python -m  pylint -E --init-hook="import sys; sys.path.insert(0, '.')" --load-plugins mwe  mwe.py
plugin registered!
transform triggered!

If we however increase the number of jobs, we see that plugin is registered, but the transform not triggered, resulting in the error returning

> python -m  pylint -j2 -E --init-hook="import sys; sys.path.insert(0, '.')" --load-plugins mwe  mwe.py
plugin registered!
************* Module mwe
mwe.py:42:15: E1121: Too many positional arguments for constructor call (too-many-function-args)

Related issues

@skogsvik
Copy link
Author

Upgraded pylint using pip and can confirm the same behavior for 2.9.6

> python -m  pylint --version
pylint 2.9.6
astroid 2.6.6
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]

@jfly
Copy link
Contributor

jfly commented Dec 7, 2021

For the record, this bug specifically happens on Pylint >= 2.5 on a platform for which multiprocessing.get_start_method != 'fork' (such as macOS on Python 3.8+).

$ python -c 'import multiprocessing as mp, platform; print("mp start method: " + mp.get_start_method()); print("platform: " + platform.system()); print("python: " + platform.python_version())'
mp start method: spawn
platform: Darwin
python: 3.8.10
$ pip install 'pylint==2.12.2'
$ pylint -j2 -E --init-hook="import sys; sys.path.insert(0, '.')" --load-plugins mwe mwe.py
plugin registered!
************* Module mwe
mwe.py:42:15: E1121: Too many positional arguments for constructor call (too-many-function-args)

Note that this works fine on versions of Pylint < 2.5. With the example above:

$ python -c 'import multiprocessing as mp, platform; print("mp start method: " + mp.get_start_method()); print("platform: " + platform.system()); print("python: " + platform.python_version())'
mp start method: spawn
platform: Darwin
python: 3.8.10
$ pip install 'pylint==2.4.4'
$ pylint -j2 -E --init-hook="import sys; sys.path.insert(0, '.')" --load-plugins mwe mwe.py
plugin registered!
plugin registered!
transform triggered!

Also note that things work fine on a platform such as Linux:

$ python -c 'import multiprocessing as mp, platform; print("mp start method: " + mp.get_start_method()); print("platform: " + platform.system()); print("python: " + platform.python_version())'
mp start method: fork
platform: Linux
python: 3.8.10
$ pip install 'pylint==2.12.2'
$ pylint -j2 -E --init-hook="import sys; sys.path.insert(0, '.')" --load-plugins mwe mwe.py
plugin registered!
transform triggered!

It looks like this bug was introduced by this big refactor: #3016 (you can see some plugin loading code getting removed here), which I believe went out in Pylint 2.5. That change relies upon the (IMO, incorrect) assumption that the only side effects of Pylint plugins are to alter the PyLinter object (which could get pickled and then unpickled in the subprocess), but some Pylint plugins actually register astroid transforms, which get lost when spawning a subprocess (but they are preserved when forking). I personally ran into this in a different way: we've been working around PyCQA/astroid#786 with a custom plugin that calls sys.setrecursionlimit(...), and that change is no longer affecting subprocesses.

For the record, it was really hard for me to tell if this is a known bug or not.

https://pylint.pycqa.org/en/latest/user_guide/run.html#parallel-execution says:

There are some limitations in running checks in parallel in the current implementation. It is not possible to use custom plugins (i.e. --load-plugins option), nor it is not possible to use initialization hooks (i.e. the --init-hook option).

So it sort of sounds like this is a known issue, but then I went digging to see if there's a corresponding issue, and I learned that the history is a little more interesting than that:

  1. (2014-10) The original implementation of --jobs adds that documentation saying --load-plugins and --init-hook don't work: c03fefd
  2. (2015-05) Support for --load-plugins with --jobs is added! 0cf4153. The docs are not updated, though, so there's still a message saying --load-plugins does not work.
  3. (2019-10) There's this big refactor of --jobs: Refactor file checking to allow parallel linting in Prospector #3016, which introduces the issue I've described above. So ironically, the docs are sort of up to date again.

@jfly
Copy link
Contributor

jfly commented Dec 7, 2021

Some ideas:

  1. Actually call each plugin's register method from within the worker subprocesses. Basically, this would give us back the behavior we had before Refactor file checking to allow parallel linting in Prospector #3016.
  2. Have each worker subprocess load the --init-hook. This wouldn't fix transformation plugins like in the original issue description, but it would at least give people a mechanism for increasing the stack limit as a workaround for MultiLineBlockMixin causes stack overflow from recursive _get_assign_nodes() astroid#786 (comment).
  3. Provide some other, new mechanism for running code specifically in the context of worker subprocesses. Perhaps --init-worker-hook? This also wouldn't fix transformation plugins like in the original issue description.
  1. seems like the best solution to me. I don't know if it could cause issues with the pickle/unpickled PyLinter object.

@Pierre-Sassoulas any thoughts?

@Pierre-Sassoulas
Copy link
Member

Thank you for analyzing the problem in detail like that, much appreciated.

I think you're solution sounds great. Why not do 1 and 2 though ? It seems we want both register and init-hook options to works ?

Baring that it seems to me that the hard part (now that you identified the problem properly, because clearly THAT was the hard part) would be to create a proper automated tests for it. We have some automated tests with multiprocessing, but not using optional plugin yet. Maybe it's as simple as adding an optional plugin that would be easily testable to the existing tests. How would you test this ?

@janneronkko
Copy link

janneronkko commented Dec 7, 2021

When I made the #3016 I first tried to separate configuration and the actual linting (see #3016 (comment)) but that basically would have resulted in changing pretty much everything related to launching pylint, i.e. you would have had to basically rewrite everything for launching pylint.

I did not have the changes I made when trying this out in my git clone anymore but I still remember that it started to look like a complete rewrite.

@jfly
Copy link
Contributor

jfly commented Dec 7, 2021

@Pierre-Sassoulas unfortunately I'm pretty busy, so I don't think I'll have time to actually put together a PR for this, sorry.

For anyone else showing up from the internet, we're working around this by using --init-hook to change multiprocessing's start method from spawn to fork:

# (Setup)
python -c 'lines = ["if True:", "    pass"] + ["elif True:", "    pass"]*300; print("\n".join(lines))' > lib.py

## Broken on macOS ##
pylint -j 2 lib.py

## This works on macOS ##
pylint -j 2 --init-hook='import multiprocessing as mp; mp.set_start_method("fork"); import sys; sys.setrecursionlimit(3000000)' lib.py

I don't know if we're opening ourselves up to any bugs here. https://bugs.python.org/issue33725 talks about this at length, and it sounds like CPython changed the default start_method on macOS for a good reason. Oh well. 🤞 🤷

@janneronkko
Copy link

Also fork method is not available on Windows.

I also considered changing the start method to fork but that does not work on Windows and like you mentioned, the fork method should not be used on MacOS:

Changed in version 3.8: On macOS, the spawn start method is now the default. The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725.

https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

I think that if you want to really fix this, the whole implementation for running pylint parallel needs to be rewritten. As far as I can remember, the code starting Pylint has a lot of dependencies and is very complex. Like I said, I tried to refactor it in such way that configuration and starting would be separated but that ended up looking like a complete rewrite and I'm not sure what existing features the change would break. Of course, it could also be that I spent only something like 20 - 30 hours digging into the implementation.

@Pierre-Sassoulas
Copy link
Member

I think that if you want to really fix this, the whole implementation for running pylint parallel needs to be rewritten. As far as I can remember, the code starting Pylint has a lot of dependencies and is very complex

You're right, Pylinter is a god class that does too much. And it inherit from Mixins so it's even bigger than you could think at first. I had a similar experience trying to refactor it. It's hard to even find a small first step to start the refactor.

We did some refactor since though and it now "only" inherit from OptionsManagerMixIn, ReportsHandlerMixIn, and BaseTokenChecker. (Down from being also a MessagesHandlerMixIn,)

There is also a need to separate the option parsing for #5392. One of the issue is that the each checker is parsing the option itself. So, maybe it got a little better and #5392 will help

@Pierre-Sassoulas
Copy link
Member

Update: The PyLinter got simpler following the argparse refactor, this is not something as hard as when we were first discussing it.

@Pierre-Sassoulas Pierre-Sassoulas added the Needs PR This issue is accepted, sufficiently specified and now needs an implementation label Jul 1, 2022
@jacobtylerwalls jacobtylerwalls modified the milestones: 2.15.0, 2.16.0 Jul 3, 2022
@jacobtylerwalls jacobtylerwalls modified the milestones: 2.16.0, 3.0.0 Nov 13, 2022
@jacobtylerwalls jacobtylerwalls self-assigned this May 13, 2023
@jacobtylerwalls jacobtylerwalls modified the milestones: 3.0.0, 3.0.0a7 May 13, 2023
@jacobtylerwalls jacobtylerwalls added this to Done in Pylint 3.0 May 17, 2023
@jacobtylerwalls jacobtylerwalls removed their assignment May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug 🪲 multiprocessing Needs PR This issue is accepted, sufficiently specified and now needs an implementation
Projects
5 participants