Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: don't rebuild unchanged files #8897

Merged
merged 4 commits into from
Apr 19, 2017
Merged

Conversation

juliantaylor
Copy link
Contributor

Only rebuild files that are newer than their corresponding object or if
older than their dependencies.

@juliantaylor
Copy link
Contributor Author

I could have sworn distutils already did that and it stopped working at some point, am I remembering wrong?

@rgommers
Copy link
Member

rgommers commented Apr 5, 2017

It has been doing that semi-reliably for a long time. Are you seeing complete rebuilds? What Python and setuptools versions?

@juliantaylor
Copy link
Contributor Author

both python 2.7.13 and 3.5.3

@rgommers
Copy link
Member

rgommers commented Apr 5, 2017

Hmm, I'll check tonight with my currently installed versions.

return 0
newest = 0
for f in files:
newest = max(os.stat(f)[ST_MTIME], newest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just os.stat(f).st_mtime, and skip the stat import. Same below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, return max(files, key=lambda f: os.stat(f).st_mtime) could replace the last four lines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this returns the newest filename I guess the function should be called newest_file_time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case, return max(os.stat(f).st_mtime for f in files) is better. Yep, rename sounds sensible

@rgommers
Copy link
Member

rgommers commented Apr 6, 2017

distutils does check file timestamps, you should see that if you rebuild the exact same commit twice in a row, then the second build does nothing (takes 1 sec). The issue (or feature?) you're likely seeing is that all files are rebuilt that together make up a single .so even if only one file was touched. This is due to the newer_group check here (I think - just reading the code, haven't verified) which is implemented here

@juliantaylor
Copy link
Contributor Author

juliantaylor commented Apr 6, 2017

distutils does not rebuild if nothing changed, but it rebuilds everything if one file changed. I thought it didn't do that in the past, but maybe I am imagining that as every other buildsystem works like that ...
This fixes this, but it has the drawback that the depends need to be set correctly as there is no automatic determination.

@juliantaylor
Copy link
Contributor Author

should we add automatic dependency determination? its not difficult on unix platforms

@juliantaylor
Copy link
Contributor Author

This is how it would look like.
If we are paranoid we could add a check for compiler support. I guess CCompiler_customize would be the right place for that?

@rgommers
Copy link
Member

distutils does not rebuild if nothing changed, but it rebuilds everything if one file changed. I thought it didn't do that in the past, but maybe I am imagining that as every other buildsystem works like that ...

If with "everything" you mean all files for a given extension module, then we are saying the same thing.

You can verify that different modules are independent

time python setup.py build_ext -i   # ~1min 30s
time python setup.py build_ext -i  # ~1s
touch numpy/core/src/npysort/binsearch.c
time python setup.py build_ext -i  # ~18s

This PR still looks fairly straightforward with the automatic dependency checking, so +1 to give it a go.

@rgommers
Copy link
Member

How did you test this?

@@ -44,8 +44,9 @@ def UnixCCompiler__compile(self, obj, src, ext, cc_args, extra_postargs, pp_opts
self.linker_so = llink_s.split() + opt.split()

display = '%s: %s' % (os.path.basename(self.compiler_so[0]), src)
deps = ['-MMD', '-MF', obj + '.d']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this deserves a comment explaining what's going on here. I could figure it out, but those are not flags one uses every day

@rgommers
Copy link
Member

If we are paranoid we could add a check for compiler support.

You mean checking that the rebuild is using the same compiler as the previous build?

@juliantaylor
Copy link
Contributor Author

no, I meant checking if the compiler supports the commandline flags

@rgommers
Copy link
Member

Ah, that's probably a healthy idea. We have people building on AIX, Raspberry Pi, and other assorted exotic platforms. CCompiler_customize should be the right place.

@juliantaylor
Copy link
Contributor Author

Updated with more comments and a flag support check. It is skipped for gcc and clang which always support it as the customize is run on every extension so it actually costs measurable time in a single file recompile.

@juliantaylor juliantaylor force-pushed the no-rebuild branch 3 times, most recently from 1226555 to 8f400b5 Compare April 10, 2017 11:25
@juliantaylor
Copy link
Contributor Author

probably release note worthy if we still want it in the next release

Use these dependencies to avoid unnecessary recompilations of unchanged
files.
@rgommers
Copy link
Member

CI is failing

probably release note worthy if we still want it in the next release

agreed

@juliantaylor
Copy link
Contributor Author

ci looks good, it failed earlier due to msvc in distutils being stupid. Fixed that and I removed even trying on non posix.

@rgommers rgommers merged commit 3b2a7a7 into numpy:master Apr 19, 2017
@rgommers
Copy link
Member

Okay, let's give this a try. In it goes. Thanks Julian!

@rgommers
Copy link
Member

Want to send a follow up PR for release notes edit?

@charris
Copy link
Member

charris commented May 1, 2017

This broke SciPy testing, see scipy/scipy#7321. I don't know if this is a problem with the SciPy setup, but in any case i'm inclined to revert it until we can figure out what is going on.

charris added a commit to charris/numpy that referenced this pull request May 1, 2017
This reverts commit 3b2a7a7, it was
causing errors in Scipy testing.

Closes numpy#9035.
@@ -191,7 +230,8 @@ def CCompiler_compile(self, sources, output_dir=None, macros=None,

def single_compile(args):
obj, (src, ext) = args
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
if _needs_build(obj):
self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue vs. #9035 is probably that _needs_build should also check if any compilation flags differ from the previous compilation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could well be. I need to delete the SciPy build directory in order to run tests that show the problem if it was working in the previous build.

Copy link
Member

@pv pv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting might not help, because the same file is built two times with different options during the same build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just referring to my local build. Another problem I ran into was that in order to cythonize the source I needed to do a Python2 build first. I have cython installed for Python 3 but it goes under the name cython3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm I though I tested scipy with this, apparently I forgot to run the tests

checking the arguments is simple enough, would be nicer to add the extension name to the dependency filename instead but it is probably clunky to get that from the compiler object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants