Multithreaded Building and Testing #540
Conversation
Nice
Can we close this or do you think it will happen in Q2?
This can happen. The build should be safe, though it's going to interact badly with my winbuild branch (n^2 threads). But we can either default it to off or add a switch to turn it off. I can take another look at the testing and see if I can isolate it.
Good stuff: this latest branch build took 15 minutes, compared to 29 minutes for the latest master! Python 2/3 builds went from ~4 minutes to ~3 minutes, and PyPy builds from 13 minutes to 4 minutes. This branch: https://travis-ci.org/python-imaging/Pillow/builds/22731569
Except... reported coverage is down from 65% to 35%. Both runs execute the same number of tests, so it's probably a problem of the parallel processes overwriting each other's coverage reports. This branch, 35%: https://coveralls.io/builds/673134

I think this should solve it: in tester.py, the coverage object needs an extra parameter so that each process writes its own data file (see http://nedbatchelder.com/code/coverage/api.html#api). Then the per-process data files need to be merged before reporting; see "Combining data files" at http://nedbatchelder.com/code/coverage/cmd.html
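A minimal sketch of what that might look like, assuming tester.py constructs its coverage object directly (`data_suffix=True` and `combine()` are from the coverage.py API linked above; `run_tests` is a stand-in for whatever tester.py actually drives):

```python
import coverage

# data_suffix=True makes coverage write to a uniquely named data file
# (.coverage.<host>.<pid>.<random>), so parallel test processes stop
# overwriting each other's .coverage file.
cov = coverage.coverage(data_suffix=True)
cov.start()

run_tests()  # stand-in for the actual test driver

cov.stop()
cov.save()

# One combining pass after everything finishes merges the per-process
# data files into a single report.
combined = coverage.coverage()
combined.combine()
combined.report()
```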
I'm sure that's the problem; we're going to need to either put the coverage file in a tempdir per process or key it to a process id.
Well, that 'fixed' the coverage. It's still a little wonky, since only the last completing run will actually get accurate coverage. If we could get one coverage run after all of the tests on all of the pythons had run, we'd be golden. As for the test failures, there's still weird stuff happening.
It's possible to edit .travis.yml to move the coverage reporting stuff out of the main script section (http://docs.travis-ci.com/user/build-configuration/). In fact, we could just run that whole section as after_script, as that data would still be useful for failing builds. (If we had a --failfast option to abort tests after the first failure, submitting coverage wouldn't be so useful.)
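For illustration, something along these lines in .travis.yml (the exact commands here are assumptions, not the project's actual configuration):

```yaml
script:
  - python selftest.py
  - python Tests/run.py

# after_script runs whether the build passed or failed, so coverage
# from failing builds would still get submitted.
after_script:
  - coverage combine
  - coveralls
```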
Is after_script run once per Python, or once per commit? Ideally all that stuff would be run once in common for all of the Pythons, and the code coverage would be the intersection of all of it. And things like the pep8 and pylint checks aren't going to be any different for different Pythons (or at least, they shouldn't be).

It looks like I'm getting a 32-wide build, so perhaps I'll check to see if I'm running on Travis and back that down a bit. I remember reading somewhere that there's something like 1.5 vCPUs per Travis run, so it's probably worth not hitting it with that much contention.
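A rough sketch of backing it down, assuming we just check Travis's environment (Travis sets `TRAVIS=true` in its build environment; the cap of 4 is an arbitrary illustration, not a tuned value):

```python
import os
import multiprocessing

def worker_count():
    # Travis VMs advertise many cores but only provide ~1.5 virtual
    # CPUs, so a 32-wide pool there is just contention.
    if os.environ.get('TRAVIS') == 'true':
        return 4  # arbitrary illustrative cap
    return multiprocessing.cpu_count()
```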
Also, I'm wondering if the builds that hang are an artifact of using os.popen and read, instead of something that explicitly buffers like subprocess.Popen and communicate.
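For comparison, a sketch of the two approaches (`some_test.py` is a stand-in command): `communicate()` drains stdout and stderr concurrently until the child exits, whereas a plain `read()` on an os.popen pipe can block forever if the child fills an undrained buffer.

```python
import os
import subprocess

# os.popen style: read() only drains stdout, so the child can hang
# if it blocks writing to another pipe nobody is reading.
out = os.popen('python some_test.py').read()

# subprocess style: communicate() buffers both streams to completion
# in a deadlock-safe way.
proc = subprocess.Popen(
    ['python', 'some_test.py'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
stdout, stderr = proc.communicate()
```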
after_script is run once per Python version, not once per commit. Each Travis sub-build ("job") is independent, in a fresh virtual machine. However, Coveralls does combine the coverage from the jobs into its final report, so you get an overall coverage number for each commit. For example, https://coveralls.io/builds/692289 has per-job coverage ranging from 69.01% to 69.36%, and combined coverage of 69.82%. (Individual files show just one job -- eg https://coveralls.io/files/180021092 -- and the dropdown lets you select another job or the combined build. Unfortunately Coveralls has been showing a 500 error for combined reports for a while -- eg https://coveralls.io/builds/692289/source?filename=PIL%2FImageCms.py. I've asked them about this.)

pep8 and pyflakes will be slightly different for different Python versions because of Python 2/3-specific code.

Yep, "Travis CI VMs run on 1.5 virtual cores."

Rather than using and maintaining a homemade test runner, another option would be to use some other tool to run the tests. https://stackoverflow.com/questions/2074074/how-to-speedup-python-unittest-on-muticore-machines suggests py.test, nose (for non-Windows), and unittest/testtools.
See also #632 for some experiments. |
I think I'm going to try a few things.
Closing. Tests are superseded by the unittest and test runner changes. The build has been split out into a separate PR.
EXPERIMENTAL.
I've seen false negatives in the testing, mainly from race conditions on files. Also, the test results/running messages and errors aren't exactly in sync anymore.
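One plausible source and fix, assuming some tests share fixed scratch filenames: give each process uniquely named temp files instead.

```python
import os
import tempfile

def scratch_file(suffix='.png'):
    # mkstemp guarantees a unique path per call, so concurrent test
    # processes can't race on a shared hard-coded name like 'temp.png'.
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    return path
```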
The build portion is a horrendous monkey-patched hack. It works here, YMMV.
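For the flavor of the hack, a sketch of the usual distutils monkey patch (not the branch's actual code): replace the serial per-file loop in `CCompiler.compile` with one that farms the compiles out to a thread pool.

```python
import multiprocessing.pool
from distutils.ccompiler import CCompiler

def parallel_compile(self, sources, output_dir=None, macros=None,
                     include_dirs=None, debug=0, extra_preargs=None,
                     extra_postargs=None, depends=None):
    macros, objects, extra_postargs, pp_opts, build = self._setup_compile(
        output_dir, macros, include_dirs, sources, depends, extra_postargs)
    cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)

    def compile_one(obj):
        try:
            src, ext = build[obj]
        except KeyError:
            return  # already up to date
        self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)

    # Compile every object file in parallel instead of one at a time.
    multiprocessing.pool.ThreadPool().map(compile_one, objects)
    return objects

# The monkey patch: every compiler distutils constructs now compiles
# in parallel.
CCompiler.compile = parallel_compile
```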
Speedup is roughly a factor of 2.5-3x on my 4-core/8-thread machine using all the processors, and 2x on my ancient dual-core laptop.