Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CIBW_ENVIRONMENT #21

Merged
merged 19 commits into from
Sep 7, 2017
Merged

CIBW_ENVIRONMENT #21

merged 19 commits into from
Sep 7, 2017

Conversation

joerick
Copy link
Contributor

@joerick joerick commented Aug 8, 2017

This is a PR to track a branch I've been working on regarding #16.

@joerick
Copy link
Contributor Author

joerick commented Aug 8, 2017

Currently only linux is working. I'm splitting the environment variables into keyvalue pairs and storing them in a dictionary at startup, then injecting them into the build environment inside the Docker.

@joerick
Copy link
Contributor Author

joerick commented Aug 8, 2017

I've come across a problem with the escaping rules - for PATH=$PATH:/usr/local/bin to work I should be escaping the environment variable value ($PATH:/usr/local/bin) without escaping the $PATH bit. The standard shlex.quote function uses single quotes ', so the $PATH isn't expanded in the shell.

Possible solutions-

  • Quote using double-quotes, rather than single-quotes
  • Split the value into envvar names and string bits ['$PATH', ':/usr/local/bin'], and only quote the string bits

I'm thinking the latter will also pave the way to nice macOS and windows solutions, since in those scenarios we'll be merging two dictionaries to form the environment that's passed to the subprocesses.

Maybe I'm overthinking - anyone see an easier way to accomplish this?

@YannickJadoul
Copy link
Member

Cool, nice job!

A few comments (cfr. what I encountered in #19):

  • Why do you want to quote these values? If they need to be quoted, won't they already be like that in the CIBW_ENVIRONMENT. E.g., 'VAR=val OTHER_VAR="some string with spaces and possibly a = sign". If this is the desired input, your dictionary should already be {'VAR': 'val', 'OTHER_VAR': '"some string with spaces and possibly a = sign\"'}, and you don't need to extra quote things anymore, no?
  • Next to that, if you're shlex.quote'ing, you lose the possibility of e.g. having backticks that will be evaluated at assignment time. Silly example: SOME_FILE="\`which <some_exectuable> | dirname\`/<relative_path>"
  • Is it worth the extra effort of trying to make these environment variables be set every execution of the loop over all Python version (to allow for something slightly more customized). This would probably be more general and allow for a few extra, rare use cases. This would also allow for the variables to be set dependent upon the things being done during CIBW_BEFORE_BUILD?

@joerick
Copy link
Contributor Author

joerick commented Aug 13, 2017

Thanks for your feedback @YannickJadoul !

Why do you want to quote these values?

Gah, this is a fair point. I suppose I'm trying to shield users from the crazy world of shell syntax. The Python way of having environment just be a dict is so much simpler than the crazy shell world.

Also, part of me is thinking about how this might look in a cibuildwheel.yml file...

environment:
  PATH: $PATH:/usr/local/bin
  SOME_VAR: 1234
  COMPUTED_VAR: $(pwd)/tests
  ANOTHER_VAR: "a string with spaces in"

Maybe the solution is to let the shell handle it on Linux. But on Mac and Windows...

  • Mac, bash is available, so we could somehow run those environment setting lines in that
  • Windows, I have no idea how env vars work there!

Tricky. To be honest, ideally, this would be in Python. Maybe I'll just write a parser for these strings. How hard can it be :D

@joerick joerick changed the title Environment CIBW_ENVIRONMENT Aug 13, 2017
@YannickJadoul
Copy link
Member

Gah, this is a fair point. I suppose I'm trying to shield users from the crazy world of shell syntax. The Python way of having environment just be a dict is so much simpler than the crazy shell world.

What about replacing $VAR and %VAR% (Windows) by just having {VAR} in the input? Then you could do something like echo {environment} | python -c "import os; import sys; sys.stdout.write(sys.stdin.read().format(**os.environ))". Looks fairly horrible, but the syntax to the user would be nicer, more pythonic, and more cross-platform compatible?

The one thing I'm still 'afraid' of is that by quoting these environment variables and losing backtick-tricks, the environment variables would be completely static. For example, if you set a path that contains a certain version number, you cannot figure out that version dynamically and have to adapt it in multiple places. Then again, maybe I'm over-thinking that part and maybe it would already be good to have this basic example working.

Also, part of me is thinking about how this might look in a cibuildwheel.yml file...

I do like the idea of being able to use a single yaml/json file instead of only the CIBW environment variables!

Tricky. To be honest, ideally, this would be in Python. Maybe I'll just write a parser for these strings. How hard can it be :D

Eeeeeerm, rather you than me, in that case ;-)

@tgarc
Copy link
Contributor

tgarc commented Aug 15, 2017

Also, part of me is thinking about how this might look in a cibuildwheel.yml file...

I do like the idea of being able to use a single yaml/json file instead of only the CIBW environment variables!

I'd be opposed to this, at least in this stage of the game..users already need to have separate appveyor and travis files. I don't see the value in having variables in a separate file unless cibuildwheel really starts becoming more feature rich (and therefore complex).

As for CIBW_ENVIRONMENT in general...is there actually any advantage to having this option for Mac or Windows? Isn't this feature only really needed for Linux where Docker is being used?

@joerick
Copy link
Contributor Author

joerick commented Aug 16, 2017 via email

@tgarc
Copy link
Contributor

tgarc commented Aug 23, 2017

  • in the 'happy path', users don't have to know how other platforms work (the Docker considerations, windows cmd syntax), since cibuildwheel abstracts over that stuff

I get where you're coming from. My trepidation is that if cibuildwheel gets intertwined with appveyor/circle/travis configuration then we have to start 'keeping up' with their changes, features, and issues. Thinking along these lines I was also skeptical of doing things like #24 which in my view is fixing an issue that is due to the continuous integration service build environment (IOW any person using appveyor with python will have this issue - not just cibuildwheel users); maybe in a month this issue is not even present in appveyor!

Anyway, I guess my two cents is that trying to abstract CI services and OSes for users seems like the wrong way to go - at least for a relatively small project like this one. If you have a rough sketch of what you're thinking on the single configuration file idea, I'd be interested to see it though.

Back to the point at hand...I'm not sure what the best solution for this issue is. I'd be tempted to add a CIBW_DOCKER_ENVIRONMENT since that's the only platform for which this feature is really needed but I don't want to just end up getting rid of it soon after that for something better...

@joerick
Copy link
Contributor Author

joerick commented Aug 29, 2017

Thanks for your thoughts on this @tgarc!

Anyway, I guess my two cents is that trying to abstract CI services and OSes for users seems like the wrong way to go - at least for a relatively small project like this one. If you have a rough sketch of what you're thinking on the single configuration file idea, I'd be interested to see it though.

You are giving me pause here... I think there's a lot to be said for keeping things simple, maybe another way of saying your point is that cibuildwheel will always be a leaky abstraction, so why try to paper over how it works.

Back to the point at hand...I'm not sure what the best solution for this issue is. I'd be tempted to add a CIBW_DOCKER_ENVIRONMENT since that's the only platform for which this feature is really needed but I don't want to just end up getting rid of it soon after that for something better...

But when I think about CIBW_DOCKER_ENVIRONMENT, I'm thinking:

  • The DOCKER part is leaking implementation into interface - what if better containerisation comes or maybe we don't even need Docker to build on linux in a couple years. So better to rename to CIBW_ENVIRONMENT
  • Then, if it's called CIBW_ENVIRONMENT, it might as well work on all platforms, since that could be useful for people / confusing if it only works on Linux.

So I'm coming back to the current approach, even without the single config file idea.

@joerick
Copy link
Contributor Author

joerick commented Aug 29, 2017

p.s. the single config file might look like:

cibuildwheel.yml

environment:
  - $PATH:/usr/local/bin
before_build: "pip install ."
test_requires:
  - pytest
test_command: "pytest {project}/tests"

(but it probably needs more thought... e.g. platform-specific options)

@joerick
Copy link
Contributor Author

joerick commented Sep 2, 2017

So... this should be ready to merge now (CI pending)! I found a really handy module called bashlex which has really helped with the environment string parsing.

If you have time @tgarc @YannickJadoul would you mind giving a quick review?

@YannickJadoul
Copy link
Member

YannickJadoul commented Sep 2, 2017

Sounds pretty good! I'll make some time for having a look at this tomorrow.

@YannickJadoul
Copy link
Member

@joerick Wow, this is the closest we've come to having something that's acutally working! :-D
I do like the bashlex project you've found, which makes you've just improved Windows by adding a Bash interpreter ;-)

I was surprised by how many lines you've needed to add to make this work (even using that bashlex library), but then I guess that it's part of the choice to have portable/unified environment evaluation (or almost portable, looking at the minor CI errors) ?
Would it be easy to also use this code to evaluate the expressions in the Linux shell, btw, just for the sake of having as much consistency as possible (and to get rid of the as_shell_commands)?

For the rest, it seems as if the escaping will sometimes be hell, as your example >>> split_env_items('VAR="dont \\\\"forget\\\\" about backslashes"') shows, but I cannot see an easy way to resolve that immediately.

Finally, I'd just like to bring up the consideration if this should be executed once, or if we would allow the user to execute it for every Python executable (possibly doing something with the value of which python or python --version). I don't think I would need this urgently, but I'm thinking of making your hard work as flexibly usable as possible :-)

@YannickJadoul
Copy link
Member

By the way, I cannot say I've looked at every single detail, but I cannot find any concrete things to remark in a code review!

I'll try putting this PR to use tomorrow, if I find the time to set up wheel building for manylinux1 for my project, using this.

@YannickJadoul
Copy link
Member

On the AppVeyor error:

>>> x = json.load(open('environment.json', 'r'))
>>> x["CIBW_ENVIRONMENT_WINDOWS"]
u'CIBW_TEST_VAR="a b c" CIBW_TEST_VAR_2=1 CIBW_TEST_VAR_3="$(echo \'test string 3\')" PATH="$PATH;/opt/cibw_test_path"'
>>> list(bashlex.split(x["CIBW_ENVIRONMENT_WINDOWS"]))
[u'CIBW_TEST_VAR="a b c"', u'CIBW_TEST_VAR_2=1', u"CIBW_TEST_VAR_3=$(echo 'test string 3')", u'PATH=$PATH;/opt/cibw_test_path'

Isn't this weird/wrong, that bashlex seems to be losing the quotes around $(echo 'test string 3') and $PATH;/opt/cibw_test_path? That's at least where the error comes from, because now the value of PATH contains 3 parts, again.

@joerick
Copy link
Contributor Author

joerick commented Sep 3, 2017

Isn't this weird/wrong, that bashlex seems to be losing the quotes around $(echo 'test string 3') and $PATH;/opt/cibw_test_path?

Yes it's pretty strange. I've got an idea how to resolve it though, using bashlex.parse instead

@YannickJadoul
Copy link
Member

Yes it's pretty strange.

If we're sure this is wrong, we could of course also log an issue on the bashlex project?

Two more things, btw:

  • Just curious, do backticks work instead of $(...)?
  • It seems there's quite a bit of overlap between run_test.py and run_tests.py? Is there a good reason to not try merging them and reducing code duplication? (If you don't feel like it, I wouldn't mind taking a stab at it after this PR is merged, though.)

@YannickJadoul
Copy link
Member

Slightly ahead of you on the bashlex bug

Perfect, let's see what that brings :-)

I'd be very happy if you wanted to take a look at fixing that

Done. Is there a way to make a PR on a PR? :-D I could just attach the 2 changed files here, too, if that's better?

@joerick
Copy link
Contributor Author

joerick commented Sep 6, 2017

Just push to this branch, I'll review it as part of this 👍

@YannickJadoul
Copy link
Member

Well, my git complains I don't have write access, but I've just pushed it to my 'environment' branch: YannickJadoul@a7c7a58

@joerick
Copy link
Contributor Author

joerick commented Sep 6, 2017

Accept the invite I just sent you and try again ;)

@YannickJadoul
Copy link
Member

That also works, of course :-)

bin/run_tests.py Outdated
@@ -4,7 +4,7 @@
import os, sys, subprocess, shutil, json
from glob import glob

from . import run_test
import run_test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there is a better way of doing this 'import from the same folder'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there isn't. Also since these are utility scripts I'd prefer using the shell interface instead, something like subprocess.check_call([sys.executable, 'bin/run_test.py', arg1, arg2])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There you go, then. I'd chosen for this such that the whole argument parsing (and extra output 'Project built successfully.') was not executed every time.

Have a look, maybe the if __name__ == '__main__': construction was also unnecessary, then.

YannickJadoul and others added 4 commits September 6, 2017 16:14
…vironment

* 'environment' of github.com:joerick/cibuildwheel:
  Making call from run_tests to run_test subprocess.check_call
  Fixing 'ValueError: Attempted relative import in non-package'
  Reducing code duplication in bin/run_test.py and bin/run_tests.py
@joerick
Copy link
Contributor Author

joerick commented Sep 6, 2017

Great, thanks Yannick. I've put the fix for the string splits in so this should be working now. I'm gonna write some documentation.

@YannickJadoul
Copy link
Member

Brilliant, seems this is almost done! (Whenever Travis CI is starts running OS X jobs, but it seems they are having troubles today.)

Minor suggestion, maybe: what about a quick list (+ maybe short description) of all environment variables, before going over them in all detail? That way, the potential user can get an overview of what options there are, before reading the exact implementation of them.

@joerick joerick merged commit 4457a12 into master Sep 7, 2017
@joerick
Copy link
Contributor Author

joerick commented Sep 7, 2017

🎉

Merged!

Your list sounds useful @YannickJadoul, but I'm not sure what would go in it... are there some common environment variables that are useful for building wheels?

@YannickJadoul
Copy link
Member

YannickJadoul commented Sep 7, 2017

Awesome!

but I'm not sure what would go in it

I actually just mean listing CIBW_PLATFORM, CIBW_SKIP, CIBW_ENVIRONMENT, CIBW_BEFORE_BUILD, CIBW_TEST_COMMAND, and CIBW_TEST_REQUIRES.
I know they're all documented, so if you'd just see, then a new potential user can in one view spot what's possible with cibuildwheel?

I was just thinking of something like this, at the start of the options:

  • CIBW_PLATFORM - Override the platform, when auto-detection does not suffice.
  • CIBW_SKIP - Skip certain Python versions
    etc...

(but maybe I'm overthinking stuff, but I know that I'm failing to get the overview of all options, especially when I'm just trying to remember the exact name, instead of the whole explanation ;-) )

@joerick
Copy link
Contributor Author

joerick commented Sep 7, 2017

That's a good idea! I'm kinda blind to it since I've seen those options grow over time but I can definitely see that would be useful to a newcomer.

@YannickJadoul
Copy link
Member

Once I've fixed my own Windows build, I can have a look, if you want. If you aren't doing it yourself, now, ofc...

@joerick
Copy link
Contributor Author

joerick commented Sep 7, 2017

Yes, please do, that would be great.

@YannickJadoul
Copy link
Member

Seems my build is now successfully using CIBW_ENVIRONMENT 🎉 🙂
https://travis-ci.org/YannickJadoul/Parselmouth/builds/273942976

One minor thing I've encountered: I need to misuse the evaluation in assignments to execute code. Specific case: I want to compile a recent version of ccache (the old in the manylinux1 image is not good enough).

Normally this would be a thing for CIBW_BEFORE_BUILD (knowing make will not recompile, so executing it for each python executable isn't an issue). But then I also need to have the environment variables CC and CXX set for my builds (because they need to be ccache cc and ccache c++). However, that's a problem, because if I already do that in CIBW_ENVIRONMENT, I cannot compile ccache anymore, since my CC and CXX compilers do not work!

Just thinking out loud: would it be interesting to be able to have some commands that would be executed once, before CIBW_ENVIRONMENT gets set?

@joerick
Copy link
Contributor Author

joerick commented Sep 10, 2017

hmmm! In your case, maybe could could set CC and CXX in CIBW_ENVIRONMENT, and then have CIBW_BEFORE_BUILD="unset CC CXX && mkdir ccache/$ARCH && cd ccache/$ARCH && ../configure && make install"

I'm also wondering if there's a need for a CIBW_BEFORE command, that would do some work before any builds started, so only running once before the version builds start. What do you think?

@YannickJadoul
Copy link
Member

Hmmmm, nice idea, with the unset. Still feels a bit like a hack, though. Yet a better hack than the one I have, currently, I agree on that. ;-)

The CIBW_BEFORE was indeed what I was thinking about/suggesting. To be fair, the 'do some things, then set some environment variables' seems like a not-so-unlikely combination to me. (Especially in the ooooold CentOS5 docker images.)

@YannickJadoul
Copy link
Member

YannickJadoul commented Sep 10, 2017

By the way: because CIBW_BEFORE_BUILD is executed each time, the mkdir fails, the second time. So I'm now trying with [ -d ccache/$ARCH ] || (mkdir etc etc...)

Maybe that kind of construction is an extra argument for CIBW_BEFORE_ENVIRONMENT

@joerick
Copy link
Contributor Author

joerick commented Sep 10, 2017

mkdir -p won't fail if the dir is already there.

@joerick
Copy link
Contributor Author

joerick commented Sep 10, 2017

but yes, CIBW_BEFORE looks like it has a few use cases

@joerick joerick deleted the environment branch September 10, 2017 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants