Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Primer tests "à la mypy" #5173

Merged
merged 10 commits into from Nov 24, 2021
Merged

Primer tests "à la mypy" #5173

merged 10 commits into from Nov 24, 2021

Conversation

Pierre-Sassoulas
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas commented Oct 17, 2021

Type of Changes

Type
✨ New feature

Description

This is a draft for adding primer tests, and discuss about it.

Closes #4413

@Pierre-Sassoulas Pierre-Sassoulas added the Maintenance Discussion or action around maintaining pylint or the dev workflow label Oct 17, 2021
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.12.0 milestone Oct 17, 2021
@Pierre-Sassoulas Pierre-Sassoulas changed the title Draft: Primet tests a la mypy Draft: Primer tests a la mypy Oct 17, 2021
@coveralls
Copy link

coveralls commented Oct 17, 2021

Pull Request Test Coverage Report for Build 1499050922

  • 38 of 38 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 93.477%

Totals Coverage Status
Change from base Build 1498611038: 0.02%
Covered Lines: 13957
Relevant Lines: 14931

💛 - Coveralls

requirements_test.txt Outdated Show resolved Hide resolved
tests/acceptance/test_primer.py Outdated Show resolved Hide resolved
tests/acceptance/test_primer.py Outdated Show resolved Hide resolved
tests/acceptance/test_primer.py Outdated Show resolved Hide resolved
tests/acceptance/test_primer.py Outdated Show resolved Hide resolved
@DanielNoord
Copy link
Collaborator

@Pierre-Sassoulas I took the liberty to work on this some more. I have created a working prototype at DanielNoord#29

If you want to go ahead with what I did there I can commit to this branch directly or create a new PR.

There are some additional discussions to be had (as well as a bug identified by the primer tests to be fixed) but I think these commits should help progress this effort along!

@Pierre-Sassoulas
Copy link
Member Author

Thank you @DanielNoord much appreciated. I'll take a look before the end of the week. This is definitely helping release 2.12 faster :)

@Pierre-Sassoulas
Copy link
Member Author

@DanielNoord it seems like you removed the primer test where we launch without the --error-only in order to check that there is no pylint crash, is that a mistake ? I think it would permit to catch crashes for checker that do not emit any errors, what do you think ?

@DanielNoord
Copy link
Collaborator

@DanielNoord it seems like you removed the primer test where we launch without the --error-only in order to check that there is no pylint crash, is that a mistake ? I think it would permit to catch crashes for checker that do not emit any errors, what do you think ?

See #5173 (comment)
I think we might be able to remove --error-only and focus on the error code of Fatal and Crash. I don't think the --error-only is actually necessary here.

@Pierre-Sassoulas
Copy link
Member Author

Pierre-Sassoulas commented Nov 13, 2021

I think we might be able to remove --error-only and focus on the error code of Fatal and Crash.

Yes, I agree, we can use %2 on the exit code.

tests/primer/test_primer_external.py Outdated Show resolved Hide resolved
tests/primer/test_primer_stdlib.py Outdated Show resolved Hide resolved
tests/primer/test_primer_stdlib.py Outdated Show resolved Hide resolved
@DanielNoord
Copy link
Collaborator

I have also added the check for Fatal errors to the stdlib test.

@DanielNoord
Copy link
Collaborator

I guess we found another issue.. Good to see this actually works :)

@DanielNoord
Copy link
Collaborator

Reposting this from the other PR so we don't lose it:

Some points I found while working on this:

  1. The initial approach of asserting ex.code == 0 doesn't work since many packages will return error messages. Even the stdlib packages. We should therefore only look for crashes (ex.code == 32) or fatal messages (ex.code == 1) I think.

  2. I have created a new CI job to do the primer tests on Linux on every push or pull requests. We don't want to run them only when bumping a version (as we did with the previous acceptance tests), but I think it is good to separate them from the other tests. Especially since they will probably take much longer to complete (more on that later) and sometimes early fails on the tests are helpful in finalising a PR (especially after GitHub review comments break a test).

  3. The lazy loading of repo's checks for the SHA1 hashes of the local commit and remote commit and reclones when it finds a difference.

  4. tests/primer/primer_external_packages.py is used to store a list of packages to download. We currently include numpy but I would argue against including it. numpy has its tests included in the sources files. Normally we don't want to run pylint over tests as this can create problems when tests use frameworks that use non-normal python code. Besides, running over the tests also really inflates the time it takes to run the primer tests. I would think we could come up with 2/3 other projects that might be better. Note that these projects do not need to use pylint, we just need to be able to assume that their code is "normal" and therefore shouldn't crash pylint or astroid.

  5. We might also want to improve the message that gets raised when the test fails. Currently it is not immediately clear where pylint crashed. Improving the message might help expedite the process of fixing it.
    Perhaps we should add --only-fatal? To make the output only print fatal errors?

@Pierre-Sassoulas
Copy link
Member Author

I upgraded the description to add a todo list.

many packages will return error messages. Even the stdlib packages.

I think at first we could only look for crashes or fatal error message. But using project that have a pylint configuration, installing dependencies and using their configuration should help, because if they have a pylint conf it means that errors are really (new) false positives. Maybe something to do after 2.12 release ?

@DanielNoord
Copy link
Collaborator

@Pierre-Sassoulas Thanks for the todo list!

I thought about this a little and I think we might want to change the command (again).
We want to use the primer tests to make sure that any new check or change does not create crashes or fatal errors. It is not necessarily about catching new false positives or negatives: if a new check creates new warnings on a project already using pylint we wouldn't immediately know if those are incorrect. Catching false positive/negative remains something to be done in review and via issues.
To make sure we catch any new crashes and fatal errors we would want to enable all messages and extensions. Even if a project has disabled some of them. This will make sure all of pylint's code is run against the project and would return any of the crashes/fatal errors that the new commits might have introduced.

To summarise:

  • I don't think we want to look for Error-level messages, as those do necessarily break pylint and are just regular false positives/negatives.
  • I think we want to run pylint against projects with all messages and extensions enabled and check for any error code 32 and % 2. We might want to create a command --use-all-extensions
  • We will want 2/3 projects which use different areas of code. I'm thinking of a project using asynchronous code often, using multi threading, etc. They do not necessarily need to use pylint. No project should ever create crashes or Fatal errors.

@Pierre-Sassoulas
Copy link
Member Author

if a new check creates new warnings on a project already using pylint we wouldn't immediately know if those are incorrect.

The reasoning for checking for error is that if there is an "error" in a well tested largely used package main branch, it means pylint created a false positive. We don't need to use the --error-only option as we can bit decode the exit code to check if there were errors or not. But it does force us to install the project dependencies.

We might want to create a command --use-all-extensions

I agree with your three point, but this one is really good as this can be useful for those who want to use all extensions. I'm also thinking about making is possible to disable some extensions because preventing while loop is really opinionated for example. Maybe a disable-plugin option to mirror the load-plugin one ?

@DanielNoord
Copy link
Collaborator

  • Stop github actions at the first job not passing

Doesn't this already happen? When one primer fails, all primers are cancelled.

The reasoning for checking for error is that if there is an "error" in a well tested largely used package main branch, it means pylint created a false positive. We don't need to use the --error-only option as we can bit decode the exit code to check if there were errors or not. But it does force us to install the project dependencies.

I'm not sure this is the case. Take undefined-variable for example:
https://github.com/PyCQA/pylint/blob/e5082d3f9401dbcf65b40ce6a819d2a09beccb5c/pylint/checkers/variables.py#L393-L397

We know there are issues with this message. There are false positives and negatives and due to the lack of control-flow we are not able to solve this (now).
If we make it so the primer tests fail whenever an Error-level message is found in a project there cannot be any undefined-variable messages. This is (currently) unfeasible. There are many more Error-level messages where we know there are issues, which we cannot always fix so easily.

We can (sort of) avoid this by only including projects that use pylint as they are likely to have already disabled current false positives, but that doesn't fully help us.
What if a commit changes the way undefined-variable behaves and the project emits 10 new warnings. We would need to investigate whether any of these are correct or false positive. For larger projects this number might be much higher. Even if we identify 1 false positive and fix it, the primer will still fail because of 9 undefined-variable messages. We know these messages are correct (the project just hasn't updated to the WIP-pylint version) and can merge the commit, but from now on every primer CI job will fail because of those 9 messages.

By only checking for Fatal and Crash we make sure that pylint can parse any type of code (pylint-enforced and non-enforced) without breaking/crashing.
For reference the only Fatal messages are: method-check-failed, fatal, astroid-error and parse-error.

We might want to create a command --use-all-extensions

I agree with your three point, but this one is really good as this can be useful for those who want to use all extensions. I'm also thinking about making is possible to disable some extensions because preventing while loop is really opinionated for example. Maybe a disable-plugin option to mirror the load-plugin one ?

So you would want to allow the use of --use-all-extensions and disable-plugin at the same time?

@Pierre-Sassoulas
Copy link
Member Author

What if a commit changes the way undefined-variable behaves and the project emits 10 new warnings. We would need to investigate whether any of these are correct or false positive.

Yes, that's the idea behind it, being able to handle the false positive during the checker creation and not being flooded by having lot of issues opened right after we release. Sometime the false positive would be unavoidable (need to be disabled in the repo), so it can't be run in github action. But we can add examples to functional tests once we know we need to do them. Right now Marc is checking that kind of thing in its own repository. Let's do that later so 2.12 scope does not increase again though.

So you would want to allow the use of --use-all-extensions and disable-plugin at the same time?

This could work the same way than for messages with --enable=all and --disable=all.

@Pierre-Sassoulas
Copy link
Member Author

Doesn't this already happen? When one primer fails, all primers are cancelled.

Maybe pylnt tests are too long but in pycodestyle if a single job fail all jobs are canceled. This is not directly linked to primer but as primer would take a lot of time we might have to look into it.

@DanielNoord
Copy link
Collaborator

Yes, that's the idea behind it, being able to handle the false positive during the checker creation and not being flooded by having lot of issues opened right after we release. Sometime the false positive would be unavoidable (need to be disabled in the repo), so it can't be run in github action. But we can add examples to functional tests once we know we need to do them. Right now Marc is checking that kind of thing in its own repository. Let's do that later so 2.12 scope does not increase again though.

Ah my bad, I thought we wanted to do primer because of the crashes reported after 2.11. Those can very easily be avoided with the primer tests in its current form.

This could work the same way than for messages with --enable=all and --disable=all.

Do we want to include creating that option for 2.12? Or do we want to put this into 2.12 and then add extensions priming and error priming in 2.13?

@Pierre-Sassoulas
Copy link
Member Author

we wanted to do primer because of the crashes reported after 2.11

It's the main reason clearly, detecting false positive is a lot harder to do, would take time and verification once it's implemented.

Do we want to include creating that option for 2.12? Or do we want to put this into 2.12 and then add extensions priming and error priming in 2.13?

I think we should try to release 2.12 asap so probably 2.13 at least.

@DanielNoord
Copy link
Collaborator

I have a fix ready for the crash which I will push tomorrow morning. Don't have a laptop with internet connection now..

@DanielNoord
Copy link
Collaborator

Btw, before pushing this to 2.12 I really think we should reconsider adding numpy and try to find a package that does not mix code and tests in the same directory. That really inflates the time these primer tests take.

@Pierre-Sassoulas
Copy link
Member Author

I added psycopg and pytest to the list 0ff2f45.

Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly recommend to move the workflows here already. #5173 (comment)
It isn't too difficult either, mostly copy and paste.

ChangeLog Show resolved Hide resolved
pylint/testutils/primer.py Outdated Show resolved Hide resolved
pylint/testutils/primer.py Outdated Show resolved Hide resolved
pylint/testutils/primer.py Outdated Show resolved Hide resolved
@cdce8p
Copy link
Member

cdce8p commented Nov 22, 2021

@doublethefish Caching was been an ongoing discussion in astroid, just to link one: pylint-dev/astroid#1145. There are multiple issues that make it not trivial to implement. Just know that it's still on our minds but not priority at the moment.

command = ["pylint"] + enables + disables + package.pylint_args
logging.info("Launching primer:\n%s", " ".join(command))
if PY37_PLUS:
subprocess.run(command, check=True, capture_output=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdce8p I just realized that capture_output prevent us from seeing the current problem in the CI, I'm going to remove it as it's easier to just not capture and let pytest handle it.

@Pierre-Sassoulas
Copy link
Member Author

Pierre-Sassoulas commented Nov 22, 2021

@doublethefish thank you for suggesting pyscopg :) We caught yet another crash !

https://github.com/psycopg/psycopg/blob/master/psycopg/psycopg/_queries.py#L281

Exception on node <If l.281 at 0x7f2f9141a550>

pylint crashed with a IndexError and with the following stacktrace:

Traceback (most recent call last):
  File "/home/pierre/pylint/pylint/lint/pylinter.py", line 1035, in _check_files
    self._check_file(get_ast, check_astroid_module, file)
  File "/home/pierre/pylint/pylint/lint/pylinter.py", line 1070, in _check_file
    check_astroid_module(ast_node)
  File "/home/pierre/pylint/pylint/lint/pylinter.py", line 1204, in check_astroid_module
    retval = self._check_astroid_module(
  File "/home/pierre/pylint/pylint/lint/pylinter.py", line 1251, in _check_astroid_module
    walker.walk(node)
  File "/home/pierre/pylint/pylint/utils/ast_walker.py", line 75, in walk
    self.walk(child)
  File "/home/pierre/pylint/pylint/utils/ast_walker.py", line 75, in walk
    self.walk(child)
  File "/home/pierre/pylint/pylint/utils/ast_walker.py", line 75, in walk
    self.walk(child)
  [Previous line repeated 1 more time]
  File "/home/pierre/pylint/pylint/utils/ast_walker.py", line 72, in walk
    callback(astroid)
  File "/home/pierre/pylint/pylint/extensions/check_elif.py", line 69, in visit_if
    if not self._elifs[self._if_counter]:
IndexError: list index out of range

@cdce8p
Copy link
Member

cdce8p commented Nov 22, 2021

thank you for suggesting pyscopg :) We caught yet another crash !

https://github.com/psycopg/psycopg/blob/master/psycopg/psycopg/_queries.py#L281

Exception on node <If l.281 at 0x7f2f9141a550>

pylint crashed with a IndexError and with the following stacktrace:

...

@Pierre-Sassoulas Where did you find the traceback? I only found the error here: https://github.com/PyCQA/pylint/runs/4286552306?check_suite_focus=true#step:6:38. Am I missing something?

Btw. I would suggest to add __tracebackhide__ = True to some of the test methods. Might help a bit.
https://doc.pytest.org/en/latest/example/simple.html#writing-well-integrated-assertion-helpers
We did so here already:
https://github.com/PyCQA/pylint/blob/d288ec3544fae6a154b3e5fd34fe0f5d10e74e53/pylint/testutils/lint_module_test.py#L189

@Pierre-Sassoulas
Copy link
Member Author

Where did you find the traceback?

Because of #5173 (comment) I had to reproduce locally

I opened #5369

@Pierre-Sassoulas
Copy link
Member Author

I would suggest to add tracebackhide = True

43924fd

@Pierre-Sassoulas Pierre-Sassoulas force-pushed the add-primer-tests branch 2 times, most recently from 297a4fb to c387e12 Compare November 23, 2021 19:06
@Pierre-Sassoulas
Copy link
Member Author

Should be good to go once #5369 is merged and rebased upon.

In order to anticipate crash/fatal messages, false positives are harder
to anticipate.

Add '__tracebackhide__ = True' so the traceback is manageable
pylint/testutils/primer.py Outdated Show resolved Hide resolved
directories: str
"""Directories within the repository to run pylint over"""

pylint_additional_args: List[str] = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pylint_additional_args: List[str] = []
pylint_additional_args: List[str]

pylint/testutils/primer.py Outdated Show resolved Hide resolved
pylint/testutils/primer.py Outdated Show resolved Hide resolved
pylint/testutils/primer.py Outdated Show resolved Hide resolved
tests/primer/test_primer_stdlib.py Outdated Show resolved Hide resolved
directories: str
"""Directories within the repository to run pylint over"""

pylint_additional_args: List[str] = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pierre-Sassoulas I think you missed this comment (the previous conversation was resolved).

tests/testutils/test_package_to_lint.py Show resolved Hide resolved
@Pierre-Sassoulas
Copy link
Member Author

This MR will make pylint's release so much safer ! Thanks to everyone who participated and reviewed. It's still far from perfect, the next step will be #5359 and later on #5364 🚀

assert package_to_lint.pylint_additional_args == ["--ignore-pattern='re*'"]
assert package_to_lint.paths_to_lint == [str(Path(expected_path_to_lint))]
assert package_to_lint.clone_directory == Path(str(Path(expected_dir_path)))
assert package_to_lint.pylintrc == Path(str(Path(expected_pylintrc_path)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these Path by default due to the refactor?
Anyway, Path(str(Path seems a bit redundant 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Harg, you're right, and the 21mn for the pipeline really hurt for small thing like this.. 😓 Let's release 2.12 and focus on 2 primer batch right after that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to the end_line reporter first. We could make this blocked until that is merged?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge and do another speed up MR before 2.12.0. This MR is already too big.

@Pierre-Sassoulas Pierre-Sassoulas merged commit 5e9d20d into main Nov 24, 2021
@Pierre-Sassoulas Pierre-Sassoulas deleted the add-primer-tests branch November 24, 2021 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Maintenance Discussion or action around maintaining pylint or the dev workflow primer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Primer acceptance tests like in black or mypy
5 participants