
BUG: distutils, add compatibility python parallelization #9050

Merged: 2 commits merged into numpy:master on May 10, 2017

Conversation

@juliantaylor (Contributor) commented on May 4, 2017:

BUG: distutils, place fortranobject files in subfolder

Placing them all under the same name in the top level folder breaks when
using the parallel extension compilation option of python 3.5.

BUG: distutils, add compatibility python parallelization

Python 3.5 also added build parallelization at the extension level
instead of the file level numpy uses.
This causes two problems:

  • numpy.distutils is not thread-safe with duplicated source files.
    When source files are duplicated across multiple extensions, the output
    objects are overwritten, which can truncate them in a parallel context.
    This is fixed by keeping track of the files being worked on and waiting
    for completion if another thread is already using the object name.

  • The parallelization on two nested levels causes oversubscription.
    When building multiple extensions with multiple source files each, the
    number of running jobs is multiplied.
    This is fixed by adding a semaphore that limits the number of jobs numpy
    starts to the configured amount. (A sketch of both mechanisms follows
    this list.)
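As a rough sketch of the two mechanisms described above — the names (`_in_progress`, `_done`, `compile_object`) and the semaphore limit are illustrative assumptions, not the exact identifiers or values used in the patch:

```python
import threading

# Hypothetical module-level state; names are illustrative only.
_lock = threading.Lock()
_done = threading.Condition(_lock)
_in_progress = set()                      # object files currently being built
_job_semaphore = threading.Semaphore(4)   # caps concurrent compile jobs globally

def compile_object(obj, do_compile):
    """Build `obj` at most once at a time, even across compile threads."""
    with _lock:
        if obj in _in_progress:
            # Another thread is already producing this object file;
            # wait for it instead of truncating its output mid-write.
            while obj in _in_progress:
                _done.wait()
            return
        _in_progress.add(obj)
    try:
        # Take a slot from the shared semaphore so that the nested
        # extension-level and file-level parallelism cannot oversubscribe.
        with _job_semaphore:
            do_compile(obj)
    finally:
        with _lock:
            _in_progress.discard(obj)
            _done.notify_all()
```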

closes gh-7139

Commit 1 (juliantaylor): BUG: distutils, place fortranobject files in subfolder
@juliantaylor (Contributor, Author) commented on May 4, 2017:

Should fix the scipy part of gh-7139.
Shared source files, as in numpy itself, are still an issue; we could fix that by just doing our compilation under a lock.
That would solve the running-thread multiplication we currently have and would keep our in-extension parallelization, but would lose some efficiency for small extensions with large files, like our mtrand.

@juliantaylor changed the title from "BUG: distutils, place fortranobject files in subfolder" to "BUG: distutils, add compatiblity python parallelization" on May 4, 2017
Commit 2 (juliantaylor): BUG: distutils, add compatibility python parallelization (full message as in the description above; closes numpygh-7139)
@juliantaylor (Contributor, Author) commented:
Added a commit that fixes the numpy problem without losing the benefits; a bit more complex than the sledgehammer lock solution, but IMO reasonable.

@juliantaylor (Contributor, Author) commented:

Technically the second commit makes the first obsolete, but using subfolders for fortranobject is probably a good idea in any case (an illustration follows).
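For illustration, with hypothetical extension names ext_a and ext_b, the collision and the fix look roughly like this:

```
# before: one shared top-level object file, overwritten by
# concurrently building extensions
build/temp.linux-x86_64-3.5/fortranobject.o

# after: a per-extension subfolder, so no collision
build/temp.linux-x86_64-3.5/ext_a/fortranobject.o
build/temp.linux-x86_64-3.5/ext_b/fortranobject.o
```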

```python
try:
    # retrieve slot from our #job semaphore and build
    with _job_semaphore:
        self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
```
Member commented on this diff:
Is there any situation when this semaphore wouldn't be acquired? Aren't there the same number of pool workers as available semaphore slots?

Member:
Never mind, I get it now: CCompiler_compile itself can be called from multiple threads simultaneously, resulting in multiple pools.

@juliantaylor (Contributor, Author):

Correct. An alternative implementation could have one global pool that each compile thread shares (see the sketch below).
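A minimal sketch of that alternative, assuming a lazily created module-level pool — all names here are hypothetical, not the numpy code:

```python
import threading
from multiprocessing.pool import ThreadPool

_pool_lock = threading.Lock()
_global_pool = None   # one pool shared by every CCompiler_compile call

def _get_global_pool(njobs):
    """Create the shared thread pool on first use and return it."""
    global _global_pool
    with _pool_lock:
        if _global_pool is None:
            _global_pool = ThreadPool(njobs)
        return _global_pool

# Each compile call would then submit its files to the shared pool,
# e.g. _get_global_pool(njobs).map(single_compile, build_items),
# instead of creating a private pool per extension.
```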

Member:

Right now, am I right in thinking that we end up allocating N pools of N workers, but then sharing an N-slot semaphore between all N^2 of them? That seems pretty wasteful.

@juliantaylor (Contributor, Author):

It is a few hundred threads in the worst case, so not too bad.

With a global pool we couldn't close it again, but that shouldn't be an issue; I also don't think map is guaranteed to be thread-safe (though it probably is).
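To make the numbers concrete, here is a self-contained toy (not the numpy code) showing that N pools of N workers spawn N^2 worker threads, while a shared N-slot semaphore still bounds the number of jobs actually running to N:

```python
import threading
import time
from multiprocessing.pool import ThreadPool

N = 4
sem = threading.Semaphore(N)   # shared across all pools, like _job_semaphore
running, peak = 0, 0
count_lock = threading.Lock()

def job(_):
    global running, peak
    with sem:                           # only N jobs may run at once
        with count_lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)                # stand-in for a compile job
        with count_lock:
            running -= 1

# N pools of N workers each: N * N = 16 worker threads in total
pools = [ThreadPool(N) for _ in range(N)]
results = [p.map_async(job, range(N)) for p in pools]
for r in results:
    r.wait()
print("peak concurrent jobs:", peak)    # <= N despite N^2 threads
for p in pools:
    p.close()
    p.join()
```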

@eric-wieser (Member) left a review comment:

Implementation LGTM

@juliantaylor (Contributor, Author) commented:

Should probably still go into 1.13. Any other opinions on semaphore vs. global pool?

@charris merged commit bbb2b81 into numpy:master on May 10, 2017
@charris (Member) commented on May 10, 2017:

Thanks Julian.

@juliantaylor deleted the fortranobj-path branch on May 10, 2017 14:42
@charris changed the title from "BUG: distutils, add compatiblity python parallelization" to "BUG: distutils, add compatibility python parallelization" on Dec 12, 2017
Linked issue closed by this pull request: parallel build fails sometimes with python3.5 (gh-7139)