Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pythonPackages.Theano: set NIX_ENFORCE_PURITY=0 for invocation of wrapped compiler #99516

Closed
wants to merge 1 commit into from

Conversation

@risicle
Copy link
Contributor

@risicle risicle commented Oct 3, 2020

Motivation for this change

This should unblock some ZHF (#97479) targets.

Theano currently builds but doesn't work properly as the nix compiler wrapper gets upset with the path of the temporary file that Theano generates - it considers it an impure path. Issues discussing this: #91431, #93560, #73051

This may be an extremely naive solution from me, but it does appear to work just making sure NIX_ENFORCE_PURITY is set to zero before the compiler-wrapper is invoked.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
@risicle risicle requested review from FRidh and jonringer as code owners Oct 3, 2020
@risicle risicle force-pushed the risicle:ris-theano-fix branch 2 times, most recently from 12ab88f to 8081c35 Oct 3, 2020
@ofborg ofborg bot requested a review from bcdarwin Oct 3, 2020
…pped compiler

this should prevent the wrapper from objecting to the path of the object
file
@risicle risicle force-pushed the risicle:ris-theano-fix branch from 8081c35 to 2d9839d Oct 4, 2020
@twhitehead
Copy link
Contributor

@twhitehead twhitehead commented Oct 4, 2020

The settings on the temporary variables should be fully addressed with #97597.

That isn't to say though that not setting NIX_ENFORCE_PURITY is or isn't a good idea. That will depend on whether someone is more likely to having impure paths sneak in or have pure paths not resolve correctly.

In the past one person ran into issues due to making their home directory a symlink. This resulted in their being multiple paths to the same libraries, only one of which was recognized as being pure. Not settings NIX_ENFORCE_PURITY would have avoided this. Possible also if the wrapper script had set NIX_BUILD_TOP to $(realpath $HOME/.theano) instead of just $HOME/.theano things would also have been okay.

All I did though was advice them not to use symlinks for their home as it is generally a bad idea in my experience. For example, if /home/me is a symlink to /mnt/external/home, then /home/me/.. is /mnt/external and not /home, which I've seen cause issues before.

It is likely also worth mentioning, now that I'm more familiar with cross compilation, technically we should have the runtime version of the buildtime compiler. This hasn't been an issue yet though as I don't think anytime is cross compiling theano. 😁

@risicle
Copy link
Contributor Author

@risicle risicle commented Oct 4, 2020

I'm proposing this PR because it isn't fixed. Theano on master will build fine, but as soon as you try to import it, it will give the same error. Do try it.

In particular this means that import theano will fail when done from inside a sandbox, so the pythonImportsCheck added here would otherwise fail and it also causes the tests of downstream packages such as pymc3 to fail.

@twhitehead
Copy link
Contributor

@twhitehead twhitehead commented Oct 4, 2020

That is because #97597 was just recently merged into staging, so it hasn't yet landed in master

https://github.com/NixOS/nixpkgs/compare/staging

@risicle
Copy link
Contributor Author

@risicle risicle commented Oct 4, 2020

Ahh staging - I didn't notice that. 👍

I'm looking at something that will be safely backportable to 20.09 though, and I'm not sure how people will feel about a significant compiler-wrapper change.

@twhitehead
Copy link
Contributor

@twhitehead twhitehead commented Oct 4, 2020

I'm not opposed to this patch. I just don't know if it will be a net win or not.

From a purity standpoint, I prefer the purity filtering. From a practical standpoint though, it if we keep running into corner cases, it is better to forgo it. What we don't know is if we are going to discover corner cases from disabling it because the purity enforcement isn't just about checking, it also actively strips possibly problematic paths from the users environment (IIRC, this was an issue on our CentOS based super computer clusters).

If we do merge this, my suggestion would be to pass through the user's NIX_ENFORCE_PURITY rather than unsetting it

"$(declare -xp | sed -e '/^[^=]\+="\('"''${NIX_STORE//\//\\/}"'\|[^\/]\)/!d' \
                     -e '/declare -x \(NIX_ENFORCE_PURITY\)/d'))"

so, if there are people who require it, or we think it might help them, there will be a way to re-enable it by manual exporting NIX_ENFORCE_PURITY=1 before running Theano.

Another, possibly better, alternative would be to combine this with clearing the user environment before invoking the compiler

#!$(type -P env) -i $(type -P bash)

as I expect this may also be sufficient to resolve the bad user's environment issues that the purity checking took care of.

Assuming this last one works, I would suggest merging that into 20.09. It would probably be fine for master too, if people really wanted it, although my slight personal preference would be to find out first where things sit with hydra once staging lands.

@risicle
Copy link
Contributor Author

@risicle risicle commented Oct 4, 2020

Ok let's wait for staging to land. Though I'm sceptical that even if everything is happy, the staging solution will be appropriate for 20.09 given how close we're getting. Patching things here can only ever break theano-related packages, which is why I like it.

I'm also running the implications of fully stripping env for the compiler wrapper through my mind...

@twhitehead
Copy link
Contributor

@twhitehead twhitehead commented Oct 4, 2020

Well, the compiler environment wouldn't be fully stripped. It would be the build environment variables stripped of any variables set to absolute paths other than those under $NIX_STORE. That is, the command

declare -xp

prints all the bash declare command required to restore the exported environment. The sed line then removes exported environment variables from that that aren't of the form

  • <absolute path starting with $NIX_STORE> or
  • <not an absolute path>

which is what gets put at the start of the gcc wrapper script.

Still, it is a bit of an approximation though. From the gcc environment variables page, you can see there are both environment variables you wouldn't want leaking to the compiler (e.g., LIBRARY_PATH, CPATH, C_INCLUDE_PATH, etc.) and some that should (e.g., TMPDIR). And that is just gcc. You have to also consider the compiler could be clang.

This is what makes hijacking the existing nix purity filtering system seem so tempting.

@risicle
Copy link
Contributor Author

@risicle risicle commented Oct 17, 2020

I think the problem with respecting the user's NIX_ENFORCE_PURITY setting in the wrapper is it will cause failures when invoked from within another package's build, because of course, NIX_ENFORCE_PURITY will be set. So it will e.g. cause downstream packages' tests to fail.

@vcunat
Copy link
Member

@vcunat vcunat commented Oct 18, 2020

NIX_ENFORCE_PURITY now shouldn't be causing problems on master or 20.09, right?

@risicle
Copy link
Contributor Author

@risicle risicle commented Oct 18, 2020

It's still causing problems on 20.09 because it lacks #97597 (and its predecessor), and I think it's probably too significant a change to introduce to 20.09 at this stage.

However I have figured out a workaround I can use for 20.09, which I'm about to open a PR for shortly. (edit: this is #100955)

Certainly this PR can be closed.

@risicle risicle closed this Oct 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants