
From Linux bash script to Python control flow #386

Merged
merged 25 commits into from
Jul 11, 2020

Conversation

joerick
Contributor

@joerick joerick commented Jun 23, 2020

We've been talking for a while about moving more of the Linux control flow into Python, instead of bash. This PR explores that.

This is a first implementation of that idea. It has a build script that resembles that of macOS or Windows, but runs every command through docker exec.

Code-wise, I think this is a nice improvement. It avoids lots of bash quirks and fiddling with strings. Hopefully, it should unlock lots of improvements down the line.

The downside, currently, is performance. Those docker exec calls are pretty slow, compared to local subprocess calls. Each docker exec takes around 150ms on my machine. We'll see how it shakes out in our CI providers, but in my testing it looks like roughly +33% run time on the CI suite.

To be clear, the effects will be much worse on our CI than in any user scenario, because we run hundreds of tiny builds, and users probably have a handful of more substantial ones. But the CI is already pretty tedious, so I'd like to explore options to speed it up before committing to this approach.

My current thought to improve perf is to keep a long-running shell process running in the Docker container, and relay commands to it via open file descriptors. There are a couple of details to figure out, but I hope it will be fairly transparent.
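(For illustration, a minimal sketch of that idea; this isn't the PR's actual code, and the container name is a placeholder.)

```python
import subprocess

# Minimal sketch: one long-lived bash inside the container, fed commands over
# stdin, so each command avoids the ~150ms startup cost of a fresh `docker exec`.
shell = subprocess.Popen(
    ['docker', 'exec', '-i', 'my-container', 'bash'],  # 'my-container' is a placeholder
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
assert shell.stdin is not None

def relay(command: str) -> None:
    # bash executes each newline-terminated line it reads from stdin
    shell.stdin.write(command.encode() + b'\n')
    shell.stdin.flush()

relay('echo hello from inside the container')
```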

Comments welcome!

@Czaki
Contributor

Czaki commented Jun 23, 2020

What do you think about creating a Python script which would be copied into the docker container and then executed in one command?

@joerick
Contributor Author

joerick commented Jun 23, 2020

What do you think about creating a Python script which would be copied into the docker container and then executed in one command?

I think it could work, but there are loads of details to figure out. What state/configuration and code to pass over? How is passed state serialised/deserialised? How to interpret the output of the script? Which sections of the codebase can be called from that remote script? Worst case, it'd be the same as the current bash implementation, with huge amounts of string interpolation, writing Python code inside Python strings. I was also keen to avoid the mental overhead of having to think about which sections of the codebase could be called by different files.

@joerick
Contributor Author

joerick commented Jun 24, 2020

Alright, that should fix the performance problems. Instead of docker exec I'm using a remote shell running inside the Docker container, relaying commands to it and streaming back the stdout. Works pretty well! As a bonus, that means we can use ENTRYPOINT for that shell, so we don't need to specify linux32 anymore.

@joerick joerick changed the title from "From Linux bash script to 'docker exec'" to "From Linux bash script to Python control flow" on Jun 24, 2020
@joerick
Contributor Author

joerick commented Jun 25, 2020

However, the dockcross images in the tests aren't so happy about using ENTRYPOINT after all; their default ENTRYPOINT prints this:

################################################################################
#
# This image is not intended to be run manually.
#
# To create a dockcross helper script for the
# dockcross/manylinux2010-x64:latest image, run:
#
# docker run --rm dockcross/manylinux2010-x64:latest > dockcross-manylinux2010-x64-latest
# chmod +x dockcross-manylinux2010-x64-latest
#
# You may then wish to move the dockcross script to your PATH.
#
################################################################################

It does sorta make sense that we ensure it's bash that we're talking to, to be honest. Anyway, I've reverted that change.

@joerick joerick marked this pull request as ready for review June 25, 2020 11:50
Member

@YannickJadoul YannickJadoul left a comment

Wow, quite an impressive PR, @joerick!

I do very much like how it looks from the user side of linux.py. It's... well, one piece of code again, rather than bash nested inside Python. The implementation is quite impressive as well, but also quite heavy somehow, especially the output-retrieving part of call. I'm not entirely convinced there isn't a better solution, and it does make me wonder if it is worth it (I can't immediately see how or where, but if there are problems or errors in there, they will potentially be very obscure?).
But at the same time, all these docker-specific things are now separate from the rest of the code, more general and reusable, and not really worse than what we had before, so why not go for it, actually, and see how it feels once we start using it? :-)

cibuildwheel/linux.py (resolved)
cibuildwheel/docker_container.py (resolved)
cibuildwheel/docker_container.py (outdated, resolved)
cibuildwheel/docker_container.py (outdated, resolved)
cibuildwheel/docker_container.py (resolved)
env_assignments = ' '.join(f'{shlex.quote(k)}={shlex.quote(v)}'
                           for k, v in env.items())
command = ' '.join(shlex.quote(str(a)) for a in args)
end_of_message = str(uuid.uuid4())
Member

Could we somehow use \0 to mark the end of a message? To be fair, I don't know how bash handles this.

At any rate, do we need to generate a UUID each time? I'd think we could just take a fixed string marker?

Contributor Author

There's nothing stopping a child process from outputting \0 and causing havoc.

Normally, when doing this kind of protocol, there'd be a header that specifies the length of the following message; that makes reads really easy. But I wanted to avoid buffering the output on the container side, for memory reasons but mostly so that output still streams back to the console; during a long build this is very reassuring, or helpful if a command stalls. This solution was inspired by SMTP multipart boundaries: basically, a sufficiently long random string is unique enough to delimit parts of the data. A UUID is just an easy way to make a unique string.

I could add some of this context as a comment I guess?
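(A rough sketch of that scheme, for illustration only; this isn't the PR's DockerContainer code, and the container name is a placeholder.)

```python
import subprocess
import uuid

# Send a command to a long-lived bash, then ask bash to echo a freshly generated
# UUID marker. Reading stdout line-by-line until that marker appears lets output
# stream straight to the console without buffering it all in memory.
shell = subprocess.Popen(
    ['docker', 'exec', '-i', 'my-container', 'bash'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
assert shell.stdin is not None and shell.stdout is not None

end_of_message = str(uuid.uuid4())
shell.stdin.write(f'ls /tmp\nprintf "%s\\n" {end_of_message}\n'.encode())
shell.stdin.flush()

for raw_line in shell.stdout:
    line = raw_line.decode(errors='replace')
    if line.strip() == end_of_message:
        break  # the random marker delimits the end of this command's output
    print(line, end='')  # stream output to the console as it arrives
```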

Member

But I wanted to avoid buffering the output on the container side, for memory reasons but mostly so that output still streams back to the console; during a long build this is very reassuring, or helpful if a command stalls.

Alright, yes. Good point! I hadn't considered that yet, but this is an important feature, then! :-)

Member

We could still just define our own constant magic end-of-message UUID, no? Is there a reason to introduce randomness and regenerate a new one every time?

Contributor Author

I think there's something comforting about using a new UUID each time; it's a string that's never existed in the world before. But practically, err, the most concrete I can get with it is this: if the marker were static and written in code, and somehow this code file were sent down the pipe, then the terminator would appear before the end of the message. Far-fetched, but that's the kind of thing I'm thinking about. A fresh UUID has never existed before, so that kind of thing couldn't happen.

Member

That's a bit far-fetched, but it hadn't crossed my mind at all! Better safe than sorry, then :-)

cibuildwheel/docker_container.py (resolved)
cibuildwheel/linux.py (resolved)
cibuildwheel/linux.py (outdated, resolved)
cibuildwheel/linux.py (outdated, resolved)
@joerick
Contributor Author

joerick commented Jun 25, 2020

The implementation is quite impressive as well, but also quite heavy somehow, especially the output-retrieving part of call. I'm not entirely convinced there isn't a better solution, and it does make me wonder if it is worth it (I can't immediately see how or where, but if there are problems or errors in there, they will potentially be very obscure?).

Agree that the docker part is kinda... low-level protocolish. I wonder if a few unit tests on that piece in isolation might make us feel better about using it?

@YannickJadoul
Member

I wonder if a few unit tests on that piece in isolation might make us feel better about using it?

Ah yes, that would probably already make quite a difference :-)
Though maybe we should not download/run a full docker image by default, but somehow make it optional to run? (something like this perhaps?)
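(One possible way to make the docker-backed unit tests opt-in; purely a sketch with an assumed marker name, not necessarily what this PR ends up doing.)

```python
import pytest

# Tests that need a real docker daemon get a custom marker (registered in
# setup.cfg or pytest.ini), so the default run can deselect them and CI can
# opt in with `pytest -m docker`.
@pytest.mark.docker
def test_docker_container_roundtrip():
    ...  # would start a container, run a command, and check the output
```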

@joerick
Contributor Author

joerick commented Jul 10, 2020

Okie doke, this is ready for another review, if you have time, @YannickJadoul!

I've fixed some of the issues you raised above, and added unit tests for DockerContainer.

It got a little... complicated... when I realised that one of the pre-existing tests, test_overridden_path, wasn't failing for the correct reason. So I've fixed that; it involved some changes to bashlex_eval to support command1; command2 syntax inside command substitutions (command1 && command2 isn't supported due to a bug in bashlex, idank/bashlex#54). In the end, I switched to using BEFORE_ALL for that setup anyway, but there's now a unit test to cover command1; command2 in bashlex.
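(For reference, the sort of construct this covers, fed through the bashlex library that bashlex_eval builds on; the command string here is made up.)

```python
import bashlex

# Two ';'-separated commands inside a command substitution. `cmd1 && cmd2`
# would still trip over the bashlex bug mentioned above (idank/bashlex#54).
for tree in bashlex.parse('echo $(echo one; echo two)'):
    print(tree.dump())  # inspect the parse tree bashlex produces
```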

Anyway, let me know what you think!

Member

@YannickJadoul YannickJadoul left a comment

Alright! I can't say I've looked at every little detail, because there are a lot of changes and I'm a bit overwhelmed, but I see quite extensive tests. So I think we should just merge it, get used to the new way of doing things, and make smaller sets of changes later, if we notice something that can be improved?

bin/run_tests.py (resolved)
cibuildwheel/bashlex_eval.py (resolved)
env_assignments = ' '.join(f'{shlex.quote(k)}={shlex.quote(v)}'
                           for k, v in env.items())
command = ' '.join(shlex.quote(str(a)) for a in args)
end_of_message = str(uuid.uuid4())
Member

We could still just define our own constant magic end-of-message UUID, no? Is there a reason to introduce randomness and regenerate a new one every time?

test/test_environment.py (resolved)
if from_path.is_dir():
    self.call(['mkdir', '-p', to_path])
    # pipe a tarball of the source directory into the container and extract it at to_path
    subprocess.run(
        f'tar cf - . | docker exec -i {self.name} tar -xC {shell_quote(to_path)} -f -',
Member

Just checking, since you were disappointed by the speed of docker exec: now that we're using byte streams, is it still not possible to do this with self.call?

Contributor Author

Oops, I just saw this after the merge. It would be possible, but pretty fiddly to implement. The easiest implementation would involve reading an entire tarball into memory within the host process before streaming it into the container. With a large codebase that might introduce memory issues, and could well be slower. A better implementation would stream between active tar processes on either side, but I'm guessing that would involve some pretty gnarly non-blocking I/O code to get it to work right.

I was curious, so here's a profile of the basic test project build:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 19:54:25  Samples:  373
 /_//_/// /_\ / //_// / //_'/ //     Duration: 35.644    CPU time: 0.144
/   _/                      v3.1.3

Program: /Users/joerick/Projects/cibuildwheel/cibuildwheel/__main__.py --platform linux

35.642 <module>  __main__.py:1
└─ 35.599 main  __main__.py:56
   └─ 35.597 build  linux.py:78
      ├─ 25.233 call  docker_container.py:117
      ├─ 3.552 copy_into  docker_container.py:76
      │  └─ 3.541 run  subprocess.py:447
      │        [11 frames hidden]  subprocess
      ├─ 3.184 __enter__  docker_container.py:34
      │  ├─ 2.655 call  docker_container.py:117
      │  └─ 0.520 run  subprocess.py:447
      │        [11 frames hidden]  subprocess
      ├─ 1.854 __exit__  docker_container.py:68
      │  └─ 1.575 wait  subprocess.py:1014
      │        [4 frames hidden]  subprocess
      ├─ 0.682 glob  docker_container.py:106
      │  └─ 0.682 call  docker_container.py:117
      ├─ 0.536 copy_out  docker_container.py:95
      │  └─ 0.536 run  subprocess.py:447
      │        [9 frames hidden]  subprocess
      └─ 0.518 get_environment  docker_container.py:176
         └─ 0.518 call  docker_container.py:117

copy_into is 10% of the time here. Maybe not worth the hassle.

Member

Indeed, it might not be. It just looks out of place; that's why I asked.

@joerick
Contributor Author

joerick commented Jul 11, 2020

Thanks for looking it over, @YannickJadoul!
