Switch to pytest from our homegrown test runner #1673

Closed
refi64 opened this issue Jun 7, 2016 · 23 comments

Comments

@refi64
Contributor

refi64 commented Jun 7, 2016

Ok, so when I saw the PR a while back that "parallelized" the test runner, I was super hyped. But the hype wore off when I realized that the unit tests were still run sequentially.

In particular, it continually bugs me that, when running the full unit test suite, there's no progress indicator of any kind. To make things worse, runtests.py is largely undocumented, leaving me to wonder why stuff like ./runtests.py unit-test -a '*something*' works but ./runtests.py -a '*something*' doesn't.

In addition, -a is borderline useless: because the naming convention for tests is so inconsistent, it's rarely actually helpful.

Also, test fixtures are a cruel mess. It took me 30 minutes of debugging to figure out why the tests were giving out obscure KeyErrors that I couldn't reproduce in my own code. Turns out, the fixture I was using didn't define list. But defining list caused other tests to brutally fail, so then I just created a new fixture. IMO this whole thing should either be reworked or (better yet) completely gutted and thrown into a fire pit. A very hot fire pit. With sulfur.

So I see two options:

  • Move to a different test runner. Some have suggested pytest, which I personally really like! Pytest has the xdist plugin, which supports running multiple tests in parallel.
  • Revamp the current test runner. Ideally, this would include fixing fixtures (oh, the irony...), improving runtests.py, and making the actual unit tests run in parallel, preferably also with a progress bar of some sort.

Thoughts? I could try and help out here a bit if you guys come to a consensus of some sort.

@gnprice
Collaborator

gnprice commented Jun 7, 2016

Yep -- I think we all agree that the current test runner is pretty frustrating. Last week at the PyCon sprints where we had a bunch of new contributors getting involved, I found myself apologizing every time I explained how to filter tests (with the positional argument and with -a).

We were actually just discussing this also on #1668, and so far everyone agrees that the right answer will be to move to something like pytest. If you want to try putting together a PR, that would be very useful!

Pytest is (mostly for good reasons) kind of a complicated beast with a lot of choices in how to set it up, so I recommend discussing early and often, either on this thread or on a work-in-progress PR.

gnprice changed the title from "Either fix up the current test runner or move to a new one" to "Switch to pytest from our homegrown test runner" on Jun 7, 2016
@ddfisher
Collaborator

ddfisher commented Jun 7, 2016

I also hate hate hate the builtins test fixtures. +1 to killing them with fire. They only exist for perf reasons, and I'm not at all sure they're worth it.

@gvanrossum
Member

gvanrossum commented Jun 7, 2016 via email

@gnprice
Collaborator

gnprice commented Jun 7, 2016

Yeah, one thing we'll definitely want is for pytest to see the individual test cases inside the data files. I think it should be possible to do this pretty nicely with appropriate use of pytest.mark.parametrize (https://pytest.org/latest/parametrize.html), but I am not a pytest expert, and this is one of the things I had in mind when I said that pytest is something of a complicated beast with a lot of choices in how to set it up. So that will be for Ryan or whoever takes this on to figure out. :-)

On Tue, Jun 7, 2016 at 3:29 PM, Guido van Rossum notifications@github.com wrote:

A big problem here seems to be that (for reasons having to do with mypy's early history as a separate language, probably) the non-data-driven tests use a set of classes that is custom to mypy's "myunit" test runner (which is wrapped by runtests.py). I don't know enough about pytest to understand whether we'd have to redo all that infrastructure. I also don't know how easy it is to get pytest to understand our data-driven tests. It would be pretty frustrating if pytest treated each file of test data as a single test.
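
To make the parametrize idea concrete, here is a self-contained toy sketch, not mypy's actual test glue; the DATA string, parse_cases helper, and test names are invented for illustration. Each [case ...] block becomes its own pytest test, selectable with -k and reported individually, instead of the whole file counting as one test.

import pytest

DATA = """\
[case testAddInts]
1 + 1 == 2
[case testAddStrs]
'a' + 'b' == 'ab'
"""

def parse_cases(text):
    # Split a myunit-style data string into (name, lines) pairs.
    cases, name, lines = [], None, []
    for line in text.splitlines():
        if line.startswith("[case ") and line.endswith("]"):
            if name is not None:
                cases.append((name, lines))
            name, lines = line[len("[case "):-1], []
        elif name is not None:
            lines.append(line)
    if name is not None:
        cases.append((name, lines))
    return cases

# Each parsed case becomes a separate pytest test with its own id.
@pytest.mark.parametrize("name,lines", parse_cases(DATA),
                         ids=[name for name, _ in parse_cases(DATA)])
def test_data_case(name, lines):
    for expression in lines:
        assert eval(expression)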



@refi64
Contributor Author

refi64 commented Jun 7, 2016

I'll try to see if I can tackle this over the weekend!

@JukkaL
Collaborator

JukkaL commented Jun 8, 2016

Cool, switching to a standard test runner would have many potential benefits. (Note that this has been discussed before, see #380. That issue also explains why things are as they are.)

The last time the test runner was rewritten we had a bunch of issues and regressions; let's try to avoid them this time. Here are some things to consider:

  • Running a single test case shouldn't be (significantly) slower than it is now, as this is critical to productivity.
  • Generally try to support current workflows and try not to make them less convenient or slower. Otherwise the switch could even hurt productivity instead of improving it, at least in the short term. I don't have a full list, but here are some:
    • Running a single unit test or a bunch of related unit tests (based on keywords). The py.test -k option seems to do the trick.
    • Running all "unit tests" (the idea is to make it easy to run only fast test cases locally).
    • Running all tests in parallel locally.
    • Running all tests in parallel on Travis CI.
    • Reporting test failures and file locations clearly. This includes locations of test cases in .test data files and generally whatever (useful) output the current runner generates.
  • Try to only change one thing at a time. For example, don't try to deal with test fixtures at the same time as switching to py.test. Also, try to avoid touching the .test data files when switching the test runner.

Test fixtures are mostly a separate issue. I'll file a new issue.

@gvanrossum
Member

FWIW the fast way to run a single test case is currently to use the myunit incantation given in README.md, ignoring runtests.py. It would be much nicer if pytest could handle all of these.

@gnprice
Collaborator

gnprice commented Jun 8, 2016

Yeah, "change one thing at a time" is definitely good advice for this kind of migration.

I'd recommend taking that even farther, and trying to find a way to migrate one section of things at a time. For example, one good strategy could be to take just one of the larger "tasks" in runtests.py, like the testcheck driver, and rewrite that driver using pytest, so that

  • the tests can be run directly with a nice py.test command, and also
  • runtests.py continues to run it, now through that py.test command.

You'd want to preserve the ability to run just one or a few related unit tests at a time (perhaps by having runtests.py pass any -a argument it gets straight through to py.test -k, if it's running the task you migrate). And I think if you reuse DataDrivenTestCase and parse_test_cases from mypy.test.data, perhaps with some adaptations, it should be possible to keep parsing the *.data files unmodified and pass the results to a mechanism like pytest.mark.parametrize so that pytest knows about all the separate test cases.

A PR that does that for one driver without making a ton of changes in total would be a really solid step forward. There'll probably be discussion on the details (made possible by having the code up in a PR), and then once it's merged it should be pretty smooth sailing to do the same for most of the other drivers. Once they're all converted, runtests.py can be dropped.
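
As a rough illustration of that -a passthrough, something along these lines could work; this is a hypothetical sketch, not actual runtests.py code, and run_pytest_task and its arguments are invented names:

import subprocess

def run_pytest_task(test_module, filters):
    # Build a py.test command for one migrated task, forwarding any -a
    # patterns runtests.py received as a single -k expression.
    cmd = ["py.test", test_module]
    if filters:
        # pytest accepts one -k expression, so join patterns with "or";
        # -k matches substrings, so the '*' wildcards can simply be dropped.
        cmd += ["-k", " or ".join(pattern.strip("*") for pattern in filters)]
    return subprocess.call(cmd)

# e.g. ./runtests.py testcheck -a '*TupleJoin*' could then end up running:
#   py.test mypy/test/testcheck.py -k TupleJoin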

gnprice added a commit that referenced this issue Jul 27, 2016
…over to it (#1944)

This is a step toward #1673 (switching entirely to pytest from myunit
and runtests.py), using some of the ideas developed in @kirbyfan64's
PR #1723.

Both `py.test` with no arguments and `py.test mypy/test/testcheck.py`
work just as you'd hope, while `./runtests.py` continues to run all
the tests.

The output is very similar to the myunit output.  It doesn't spew
voluminous stack traces or other verbose data, and it continues using
`assert_string_arrays_equal` to produce nicely-formatted comparisons
when e.g. type-checker error messages differ.  On error it even
includes the filename and line number for the test case itself, which
I've long wanted, and I think pytest's separator lines and coloration
make the output slightly easier to read when there are multiple
failures.

The `-i` option from myunit is straightforwardly ported over as
`--update-data`, giving it a longer name because it feels like the
kind of heavyweight and uncommon operation that deserves such a name.
It'd be equally straightforward to port over `-u`, but in the presence
of source control I think `--update-data` does the job on its own.

One small annoyance is that if there's a failure early in a test run,
pytest doesn't print the detailed report on that failure until the
whole run is over.  This has often annoyed me in using pytest on other
projects; useful workarounds include passing `-x` to make it stop at
the first failure, `-k` to filter the set of tests to be run, or
(especially with our tests where errors often go through
`assert_string_arrays_equal`) `-s` to let stdout and stderr pass
through immediately.  For interactive use I think it'd nearly always
be preferable to do what myunit does by immediately printing the
detailed information, so I may come back to this later to try to get
pytest to do that.

We don't yet take advantage of `xdist` to parallelize within a
`py.test` run (though `xdist` works for me out of the box in initial
cursory testing).  For now we just stick with the `runtests.py`
parallelization, so we set up a separate `py.test` command for each
test module.
@gnprice
Collaborator

gnprice commented Jul 27, 2016

With #1944 merged, the migration is begun!

Some next steps, not necessarily in a particular order:

  • Migrate testcmdline, which is the other DataDrivenTestCase-using driver. This should be a straightforward re-use of what's in Set up to use pytest for our data-driven tests, and switch testcheck over to it #1944.
  • Migrate the other parse_test_cases-using drivers that don't use DataDrivenTestCase -- testparse, testpythoneval, testsemanal, teststubgen, testtransform, testtypegen. This should generally also be a pretty straightforward re-use of what's in Set up to use pytest for our data-driven tests, and switch testcheck over to it #1944, with a little more massaging of the test code.
    • testpythoneval is especially appealing because it's the long pole and in combination with xdist (below) will make a full test run faster.
  • Migrate the non-parse_test_cases-using drivers: testargs, testdocstring, testgraph, testinfer, testlex, testmoduleinfo, testsolve, testsubtypes, testtypes. These probably become plain old bog-standard pytest tests. Probably will require loosening up python_functions and/or python_classes in pytest.ini, which in turn may require some tweaking elsewhere to avoid spuriously picking up functions back in parse_test_cases-land that are meant as helpers and not as test cases themselves.
  • Add the xdist plugin, do some testing that it's working correctly, and set up pytest.ini so that when someone just types py.test they get the benefit of it by default (see the pytest.ini sketch below). Then adjust runtests.py to fit with that. Initially this could just be runtests.py passing -n 1 to knock out xdist and carrying on with its own concurrency model... but the real prize is when it doesn't do that. Say, run pytest alone first, so it can have all the cores to itself, then everything else with the runtests/waiter concurrency mechanism.
  • Cut out myunit from the code and documentation, simplifying things.
  • Migrate everything else in runtests.py: flake8, type-checking, and importing.
  • Cut out runtests.py and mypy.waiter from code and documentation, simplifying things again.

The big prizes here are the simplification from cutting things out at the end; and the speedups from getting effective parallelism, especially on testpythoneval and other individual slow drivers.
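
To make the pytest.ini bullets above concrete, the file might end up shaped roughly like this; this is a hypothetical sketch only, and the exact patterns (plus whether xdist belongs in addopts) are precisely the open questions listed above:

# Hypothetical pytest.ini sketch -- not the actual configuration.
[pytest]
# Collect plain test functions, but keep collection narrow enough that
# helpers over in parse_test_cases-land aren't picked up as tests.
python_files = test*.py
python_classes =
python_functions = test
# Once xdist is known to work reliably, parallelism could become the default:
# addopts = -n auto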

@gnprice
Collaborator

gnprice commented Jul 27, 2016

Oh, also:

  • Incorporate the test data file (like check-classes.test) into what py.test -k can match against. It's already in the reporting, which is something I've wanted for a while and I know at least some other people have too.

@refi64
Contributor Author

refi64 commented Jul 27, 2016

Uhhh...I think I kind of screwed this one up... So sorry! Glad @gnprice was able to actually, y'know, finish it!

@gnprice
Collaborator

gnprice commented Jul 28, 2016

@kirbyfan64 Nothing to apologize for! Thanks for your work in #1723 -- I drew on that for #1944 and it's now landed. Want to take on some of the further steps of the migration? ;-)

In particular I think you mentioned in a comment on #1723 that you'd already done some testing of xdist, and gotten it to work after a bugfix upstream. Setting things up so a plain py.test command runs things in parallel by default would be awfully nice...

@ambv
Contributor

ambv commented Jan 5, 2017

The biggest issue with test speed now is not the runner anymore but excessive subprocessing and disk thrashing. If we switch to py.test but still have to subprocess for every myunit invocation and for every mypy test or flake8 run, it's not going to win us much, especially with so many cases generating temporary throw-away *.py files.

As far as I can tell, the myunit *.test format is very appealing because of its minimal boilerplate. There are no plans to move away from that syntax, right?

Are there plans to make myunit use the API so it doesn't have to fire a fresh mypy subprocess for every check? mypy --help takes 0.5s on my laptop. Since myunit is already subprocessed from runtests.py and doesn't itself do any parallelization, there's little point in invoking mypy via the command line. We could start simplifying myunit so that it just becomes a set of helpers for pytest to handle *.test as parameters, like Greg is suggesting above. Then making those tests use the API instead of command-line mypy would give us the biggest efficiency win.
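
As a hedged sketch of that last point, here is what an API-driven check could look like from a pytest test, assuming mypy's in-process entry point mypy.api.run; the helper and test names are invented:

# Type-check a generated snippet in-process via mypy's API instead of
# spawning a mypy subprocess per test case.  mypy.api.run returns a tuple
# of (stdout, stderr, exit_status).
from mypy import api

def check_source(tmp_path, source):
    test_file = tmp_path / "prog.py"
    test_file.write_text(source)
    stdout, stderr, status = api.run([str(test_file)])
    return stdout.splitlines(), status

def test_reports_bad_assignment(tmp_path):  # tmp_path is a built-in pytest fixture
    messages, status = check_source(tmp_path, "x = 1\nx = 'not an int'\n")
    assert status != 0
    assert any("Incompatible types in assignment" in m for m in messages)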

gvanrossum pushed a commit that referenced this issue Jan 6, 2017
This change is a step towards removing `runtests.py` (see #1673).

The exclusion list in the flake8 configuration in `setup.cfg` has been updated
to enable running the linter from the root of the project by simply invoking
`flake8`.  This enables it to leverage its own file discovery and its own
multiprocessing queue without excessive subprocessing for linting every file.
This gives a minor speedup in local test runs. Before:

  total time in lint: 130.914682

After:

  total time in lint: 20.379915

There's an additional speedup on Travis because linting is now only performed
on Python 3.6.

More importantly, this means flake8 is now running over all files unless
explicitly excluded in `setup.cfg`. This will help avoid unintentional
omissions in the future (see comments on #2637).

Note: running `flake8` as a single lazy subprocess in `runtests.py` doesn't
sacrifice any parallelism because the linter has its own process pool.

Minimal whitespace changes were required to `mypy_extensions.py` but in return
flake8 will now check it exactly like it checks the rest of the `mypy/*`
codebase.  Those are also done on #2637 but that hasn't landed yet.

Finally, flake8-bugbear and flake8-pyi were added to test requirements to make
the linter configuration consistent with typeshed.  I hope the additional
checks will speed up future pull requests by automating bigger parts of the
code review.  The pyi plugin enables forward reference support when linting
.pyi files.  That means it's now possible to run `flake8` inside the typeshed
submodule or on arbitrary .pyi files during development (which your editor
could do for you), for example on fixtures.  See discussion on #2629 on checks
that are disabled and why.
@elazarg
Contributor

elazarg commented Aug 26, 2017

The way I see it, moving the rest of the tests (i.e. the non-data-driven tests) to pytest requires using @pytest.fixture: see the POC implementation of the fixture file, with functions like

@pytest.fixture
def fx_contra() -> TypeFixture:
    return TypeFixture(CONTRAVARIANT)

And a test, with test cases such as

def test_is_proper_subtype_contravariance(fx_contra) -> None:
    assert_true(is_proper_subtype(fx_contra.gsab, fx_contra.gb))

Type checking the consistency of pytest's fixtures will require a nontrivial mypy plugin, at the least. Is it OK to leave it unchecked and open a separate issue for that?

@JukkaL
Collaborator

JukkaL commented Aug 30, 2017

Pytest can also run tests defined using the stdlib unittest module. unittest is closer to myunit, and using it would make the migration easier for non-data-driven tests. In particular, the fixtures should be easy to migrate. Type checking test cases would also be simpler.
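
A minimal sketch of that shape: a myunit-style suite written as a stdlib unittest.TestCase, which pytest collects and runs directly. The mypy-internal import paths and fixture attributes below are assumptions based on the snippet earlier in this thread:

import unittest

from mypy.nodes import CONTRAVARIANT
from mypy.subtypes import is_proper_subtype
from mypy.test.typefixture import TypeFixture

class ProperSubtypingSuite(unittest.TestCase):
    def setUp(self) -> None:
        # setUp plays the role of the pytest fixture from the earlier example.
        self.fx_contra = TypeFixture(CONTRAVARIANT)

    def test_is_proper_subtype_contravariance(self) -> None:
        self.assertTrue(is_proper_subtype(self.fx_contra.gsab, self.fx_contra.gb))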

@elazarg
Contributor

elazarg commented Sep 20, 2017

#3880 completed the migration of the data-driven tests to pytest.

@gvanrossum
Member

Great! Thanks for working through the various test modules.

What should be the criteria for closing this issue? The retirement of runtests.py? Why would we still need that script if we can run all tests through pytest?

@JukkaL
Collaborator

JukkaL commented Sep 20, 2017

runtests.py also type checks various things, verifies that modules can be imported, and runs lint. These would need to be migrated to pytest as well, in addition to the remaining non-data-driven myunit tests.

A reasonable next milestone would be the removal of myunit.

@ethanhs
Collaborator

ethanhs commented Dec 15, 2017

After #4369, I think we should replace some of the simpler assert* functions, as pytest works better without them and using bare assert statements should lead to nicer output on errors.

This would entail e.g. replacing assert_false calls with just assert not ..., but leaving assert_join and other non-straightforward assert helpers.
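
For illustration, with stand-in names rather than actual mypy test code:

# Stand-in illustration of the suggested cleanup.
def assert_false(value):
    # myunit-era style helper: on failure pytest only reports this line,
    # hiding the interesting values at the call site.
    assert not value

def test_old_style():
    assert_false(isinstance(1, str))

def test_new_style():
    # Bare assert: pytest's assertion rewriting shows the operands on failure.
    assert not isinstance(1, str)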

JukkaL pushed a commit that referenced this issue Feb 6, 2018
Migrate the remaining myunit test cases to pytest. This uses 
unittest.TestCase as a shortcut, as suggested by @JukkaL. It 
is aliased as Suite.

The transition is crude and aims at functionality; I intend to clean it
up in later PRs.

Work towards #1673.
@JukkaL
Collaborator

JukkaL commented Feb 6, 2018

Myunit is finally gone. Thanks @elazarg!

@ilevkivskyi
Member

We are still not quite there; the current status is:

$ ./runtests.py -l
0:lint
1:pytest
2:check file runtests.py
3:check file waiter.py
4:check package mypy
5:check stubs
6:check stdlibsamples (3.2)
7:check file test-data/samples/bottles.py
8:check file test-data/samples/class.py
9:check file test-data/samples/cmdline.py
10:check file test-data/samples/crawl.py
11:check file test-data/samples/crawl2.py
12:check file test-data/samples/dict.py
13:check file test-data/samples/fib.py
14:check file test-data/samples/files.py
15:check file test-data/samples/for.py
16:check file test-data/samples/generators.py
17:check file test-data/samples/greet.py
18:check file test-data/samples/guess.py
19:check file test-data/samples/hello.py
20:check file test-data/samples/input.py
21:check file test-data/samples/itertool.py
22:check file test-data/samples/regexp.py

@elazarg As I understand it, you are working on this, right?

gvanrossum pushed a commit that referenced this issue Jun 23, 2018
Using the API, similarly to #5087.

stdlibsamples is not included in this PR since it requires special handling of the working directory and paths.

This is another item for #1673.
@ilevkivskyi
Member

This is now finally fixed with #5274 🎉
