Multithreaded Building and Testing #540
Conversation
Nice
Can we close this or do you think it will happen in Q2?
This can happen. The build should be safe, though it's going to interact badly with my winbuild branch (n^2 threads). But we can either default it to off or add a switch to turn it off. I can take another look at the testing and see if I can isolate it.
Good stuff: this latest branch build took 15 minutes, compared to 29 minutes for the latest master! Python 2/3 builds went from ~4 minutes to ~3 minutes, and PyPy builds from 13 minutes to 4 minutes. This branch: https://travis-ci.org/python-imaging/Pillow/builds/22731569
Except... reported coverage is down from 65% to 35%. Both runs execute the same number of tests, so it's probably a problem of the parallel processes overwriting each other's coverage reports. This branch, 35%: https://coveralls.io/builds/673134

I think this should solve it: in tester.py, the coverage object needs an extra parameter so that each process writes its own data file (see http://nedbatchelder.com/code/coverage/api.html#api). Then the per-process data files need to be merged before reporting; see "Combining data files" at http://nedbatchelder.com/code/coverage/cmd.html
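A minimal sketch of what that might look like, assuming tester.py constructs its coverage object directly (`data_suffix=True` and `combine()` are from the coverage.py API linked above; `run_tests` is a stand-in for whatever tester.py actually drives):

```python
import coverage

# data_suffix=True makes coverage write to a uniquely named data file
# (.coverage.<host>.<pid>.<random>), so parallel test processes stop
# overwriting each other's .coverage file.
cov = coverage.coverage(data_suffix=True)
cov.start()

run_tests()  # stand-in for the actual test driver

cov.stop()
cov.save()

# One combining pass after everything finishes merges the per-process
# data files into a single report.
combined = coverage.coverage()
combined.combine()
combined.report()
```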
I'm sure that's the problem; we're going to need to either put the coverage file in a tempdir per process or key it to a process id.
Well, that 'fixed' the coverage. It's still a little wonky, since only the last completing run will actually get accurate coverage. If we could get one coverage run after all of the tests on all of the pythons had run, we'd be golden. As for the test failures, there's still weird stuff happening.
It's possible to edit .travis.yml to move the coverage reporting stuff out of the main script section (http://docs.travis-ci.com/user/build-configuration/). In fact, we could just run that whole section as after_script, as that data would still be useful for failing builds. (If we had a --failfast option to abort tests after the first failure, submitting coverage wouldn't be so useful.)
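For illustration, something along these lines in .travis.yml (the exact commands here are assumptions, not the project's actual configuration):

```yaml
script:
  - python selftest.py
  - python Tests/run.py

# after_script runs whether the build passed or failed, so coverage
# from failing builds would still get submitted.
after_script:
  - coverage combine
  - coveralls
```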
Is after_script run once per Python, or once per commit? Ideally all that stuff would be run once in common for all of the Pythons, and the code coverage would be the intersection of all of it. And things like the pep8 and pylint checks aren't going to be any different for different Pythons (or at least, they shouldn't be).

It looks like I'm getting a 32-wide build, so perhaps I'll check to see if I'm running on Travis and back that down a bit. I remember reading somewhere that there's something like 1.5 vCPUs per Travis run, so it's probably worth not hitting it with that much contention.
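A rough sketch of backing it down, assuming we just check Travis's environment (Travis sets `TRAVIS=true` in its build environment; the cap of 4 is an arbitrary illustration, not a tuned value):

```python
import os
import multiprocessing

def worker_count():
    # Travis VMs advertise many cores but only provide ~1.5 virtual
    # CPUs, so a 32-wide pool there is just contention.
    if os.environ.get('TRAVIS') == 'true':
        return 4  # arbitrary illustrative cap
    return multiprocessing.cpu_count()
```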
Also, I'm wondering if the builds that hang are an artifact of using os.popen and read, instead of something that explicitly buffers like subprocess.Popen and communicate.
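For comparison, a sketch of the two approaches (`some_test.py` is a stand-in command): `communicate()` drains stdout and stderr concurrently until the child exits, whereas a plain `read()` on an os.popen pipe can block forever if the child fills an undrained buffer.

```python
import os
import subprocess

# os.popen style: read() only drains stdout, so the child can hang
# if it blocks writing to another pipe nobody is reading.
out = os.popen('python some_test.py').read()

# subprocess style: communicate() buffers both streams to completion
# in a deadlock-safe way.
proc = subprocess.Popen(
    ['python', 'some_test.py'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
stdout, stderr = proc.communicate()
```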
after_script is run once per Python version, not once per commit. Each Travis sub-build ("job") is independent, in a fresh virtual machine. However, Coveralls does combine the coverage from the jobs into its final report, so you get an overall coverage number for each commit. For example, https://coveralls.io/builds/692289 has per-job coverage ranging from 69.01% to 69.36%, and combined coverage of 69.82%. (Individual files show just one job -- eg https://coveralls.io/files/180021092 -- and the dropdown lets you select another job or the combined build. Unfortunately Coveralls has been showing a 500 error for combined reports for a while -- eg https://coveralls.io/builds/692289/source?filename=PIL%2FImageCms.py. I've asked them about this.)

pep8 and pyflakes will be slightly different for different Python versions because of Python 2/3-specific code.

Yep, "Travis CI VMs run on 1.5 virtual cores."

Rather than using and maintaining a homemade test runner, another option would be to use some other tool to run the tests. https://stackoverflow.com/questions/2074074/how-to-speedup-python-unittest-on-muticore-machines suggests py.test, nose (for non-Windows), and unittest/testtools.
See also #632 for some experiments. |
I think I'm going to try a few things.
Closing. Tests are superseded by the unittest and test runner changes. The build has been split out into a separate PR.
EXPERIMENTAL.
I've seen false negatives in the testing, mainly from race conditions on files. Also, the test results/running messages and errors aren't exactly in sync anymore.
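One plausible source and fix, assuming some tests share fixed scratch filenames: give each process uniquely named temp files instead.

```python
import os
import tempfile

def scratch_file(suffix='.png'):
    # mkstemp guarantees a unique path per call, so concurrent test
    # processes can't race on a shared hard-coded name like 'temp.png'.
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    return path
```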
The build portion is a horrendous monkey-patched hack. It works here, YMMV.
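For the flavor of the hack, a sketch of the usual distutils monkey patch (not the branch's actual code): replace the serial per-file loop in `CCompiler.compile` with one that farms the compiles out to a thread pool.

```python
import multiprocessing.pool
from distutils.ccompiler import CCompiler

def parallel_compile(self, sources, output_dir=None, macros=None,
                     include_dirs=None, debug=0, extra_preargs=None,
                     extra_postargs=None, depends=None):
    macros, objects, extra_postargs, pp_opts, build = self._setup_compile(
        output_dir, macros, include_dirs, sources, depends, extra_postargs)
    cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)

    def compile_one(obj):
        try:
            src, ext = build[obj]
        except KeyError:
            return  # already up to date
        self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)

    # Compile every object file in parallel instead of one at a time.
    multiprocessing.pool.ThreadPool().map(compile_one, objects)
    return objects

# The monkey patch: every compiler distutils constructs now compiles
# in parallel.
CCompiler.compile = parallel_compile
```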
Speedup is roughly a factor of 2.5-3x on my 4-core/8-thread machine using all the processors, and 2x on my ancient dual-core laptop.