
From Linux bash script to Python control flow #386

Merged
merged 25 commits into from
Jul 11, 2020

Conversation

joerick
Contributor

@joerick joerick commented Jun 23, 2020

We've been talking for a while about moving more of the Linux control flow into Python, instead of bash. This PR explores that.

This is a first implementation of that idea. It has a build script that resembles that of macOS or Windows, but runs every command through docker exec.

Code-wise, I think this is a nice improvement. It avoids lots of bash quirks and fiddling with strings. Hopefully, it should unlock lots of improvements down the line.

The downside, currently, is performance. Those docker exec calls are pretty slow, compared to local subprocess calls. Each docker exec takes around 150ms on my machine. We'll see how it shakes out in our CI providers, but in my testing it looks like roughly +33% run time on the CI suite.

To be clear, the effects will be much worse on our CI than in any user scenario, because we run hundreds of tiny builds, and users probably have a handful of more substantial ones. But the CI is already pretty tedious, so I'd like to explore options to speed it up before committing to this approach.

My current thought to improve perf is to keep a long-running shell process running in the Docker container, and relay commands to it via open file descriptors. There are a couple of details to figure out, but I hope it will be fairly transparent.
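(For illustration, a minimal sketch of that idea; this isn't the PR's actual code, and the container name is a placeholder.)

```python
import subprocess

# Minimal sketch: one long-lived bash inside the container, fed commands over
# stdin, so each command avoids the ~150ms startup cost of a fresh `docker exec`.
shell = subprocess.Popen(
    ['docker', 'exec', '-i', 'my-container', 'bash'],  # 'my-container' is a placeholder
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
assert shell.stdin is not None

def relay(command: str) -> None:
    # bash executes each newline-terminated line it reads from stdin
    shell.stdin.write(command.encode() + b'\n')
    shell.stdin.flush()

relay('echo hello from inside the container')
```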

Comments welcome!

@Czaki
Contributor

Czaki commented Jun 23, 2020

What do you think about creating a Python script which would be copied into the docker container and then executed in one command?

@joerick
Contributor Author

joerick commented Jun 23, 2020

What do you think about creating a Python script which would be copied into the docker container and then executed in one command?

I think it could work, but there are loads of details to figure out. What state/configuration and code to pass over? How is passed state serialised/deserialised? How to interpret the output of the script? Which sections of the codebase can be called from that remote script? Worst case, it'd be the same as the current bash implementation, with huge amounts of string interpolation, writing Python code inside Python strings. I was also keen to avoid the mental overhead of having to think about which sections of the codebase could be called by different files.

@joerick
Contributor Author

joerick commented Jun 24, 2020

Alright, that should fix the performance problems. Instead of docker exec I'm using a remote shell running inside the Docker container, relaying commands to it and streaming back the stdout. Works pretty well! As a bonus, that means we can use ENTRYPOINT for that shell, so we don't need to specify linux32 anymore.

@joerick joerick changed the title from "From Linux bash script to 'docker exec'" to "From Linux bash script to Python control flow" on Jun 24, 2020
@joerick
Contributor Author

joerick commented Jun 25, 2020

However, the dockcross images in the tests aren't so happy about using ENTRYPOINT after all; their default ENTRYPOINT prints this:

################################################################################
#
# This image is not intended to be run manually.
#
# To create a dockcross helper script for the
# dockcross/manylinux2010-x64:latest image, run:
#
# docker run --rm dockcross/manylinux2010-x64:latest > dockcross-manylinux2010-x64-latest
# chmod +x dockcross-manylinux2010-x64-latest
#
# You may then wish to move the dockcross script to your PATH.
#
################################################################################

It does sorta make sense that we ensure it's bash that we're talking to, to be honest. Anyway, I've reverted that change.

@joerick joerick marked this pull request as ready for review June 25, 2020 11:50
Member

@YannickJadoul YannickJadoul left a comment

Wow, quite an impressive PR, @joerick!

I do very much like how it looks from the user side of linux.py. It's... well, one piece of code again, rather than bash nested inside Python. The implementation is quite impressive as well, but also quite heavy somehow, especially the output-retrieving part of call. I'm not entirely convinced there isn't a better solution, and it does make me wonder if it is worth it (I can't immediately see how or where, but if there are problems or errors in there, they will potentially be very obscure?).
But at the same time, all these docker-specific things are now separate from the rest of the code, more general and reusable, and not really worse than what we had before, so why not go for it, actually, and see how it feels once we start using it? :-)

cibuildwheel/linux.py (resolved)
cibuildwheel/docker_container.py (resolved)
cibuildwheel/docker_container.py (outdated, resolved)
cibuildwheel/docker_container.py (outdated, resolved)
cibuildwheel/docker_container.py (resolved)
env_assignments = ' '.join(f'{shlex.quote(k)}={shlex.quote(v)}'
                           for k, v in env.items())
command = ' '.join(shlex.quote(str(a)) for a in args)
end_of_message = str(uuid.uuid4())
Member

Could we somehow use \0 to mark the end of a message? To be fair, I don't know how bash handles this.

At any rate, do we need to generate a UUID each time? I'd think we could just take a fixed string marker?

Contributor Author

There's nothing stopping a child process from outputting \0 and causing havoc.

Normally, when doing this kind of protocol, there'd be a header that specifies the length of the following message; that makes reads really easy. But I wanted to avoid buffering the output on the container side, for memory reasons but mostly so that output still streams back to the console; during a long build this is very reassuring, or helpful if a command stalls. This solution was inspired by SMTP multipart boundaries: basically, a sufficiently long random string is unique enough to delimit parts of the data. A UUID is just an easy way to make a unique string.

I could add some of this context as a comment I guess?
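(A rough sketch of that scheme, for illustration only; this isn't the PR's DockerContainer code, and the container name is a placeholder.)

```python
import subprocess
import uuid

# Send a command to a long-lived bash, then ask bash to echo a freshly generated
# UUID marker. Reading stdout line-by-line until that marker appears lets output
# stream straight to the console without buffering it all in memory.
shell = subprocess.Popen(
    ['docker', 'exec', '-i', 'my-container', 'bash'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
assert shell.stdin is not None and shell.stdout is not None

end_of_message = str(uuid.uuid4())
shell.stdin.write(f'ls /tmp\nprintf "%s\\n" {end_of_message}\n'.encode())
shell.stdin.flush()

for raw_line in shell.stdout:
    line = raw_line.decode(errors='replace')
    if line.strip() == end_of_message:
        break  # the random marker delimits the end of this command's output
    print(line, end='')  # stream output to the console as it arrives
```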

Member

But I wanted to avoid buffering the output on the container side, for memory reasons but mostly so that output still streams back to the console; during a long build this is very reassuring, or helpful if a command stalls.

Alright, yes. Good point! I hadn't considered that yet, but this is an important feature, then! :-)

Member

We could still just define our own constant magic end-of-message UUID, no? Is there a reason to introduce randomness and regenerate a new one every time?

Contributor Author

I think there's something comforting about using a new UUID each time; it's a string that's never existed in the world before. But practically, err, the most concrete I can get with it is this: if the marker were static and written in code, and somehow this code file were sent down the pipe, then the terminator would appear before the end of the message. Far-fetched, but that's the kind of thing I'm thinking about. A fresh UUID has never existed before, so that kind of thing couldn't happen.

Member

That's a bit far-fetched, but it hadn't crossed my mind at all! Better safe than sorry, then :-)

cibuildwheel/docker_container.py (resolved)
cibuildwheel/linux.py (resolved)
cibuildwheel/linux.py (outdated, resolved)
cibuildwheel/linux.py (outdated, resolved)
@joerick
Contributor Author

joerick commented Jun 25, 2020

The implementation is quite impressive as well, but also quite heavy somehow, especially the output-retrieving part of call. I'm not entirely convinced there isn't a better solution, and it does make me wonder if it is worth it (I can't immediately see how or where, but if there are problems or errors in there, they will potentially be very obscure?).

Agree that the docker part is kinda... low-level protocolish. I wonder if a few unit tests on that piece in isolation might make us feel better about using it?

@YannickJadoul
Member

I wonder if a few unit tests on that piece in isolation might make us feel better about using it?

Ah yes, that would probably already make quite a difference :-)
Though maybe we should not download/run a full docker image by default, but somehow make it optional to run? (something like this perhaps?)
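(One possible way to make the docker-backed unit tests opt-in; purely a sketch with an assumed marker name, not necessarily what this PR ends up doing.)

```python
import pytest

# Tests that need a real docker daemon get a custom marker (registered in
# setup.cfg or pytest.ini), so the default run can deselect them and CI can
# opt in with `pytest -m docker`.
@pytest.mark.docker
def test_docker_container_roundtrip():
    ...  # would start a container, run a command, and check the output
```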

@joerick
Contributor Author

joerick commented Jul 10, 2020

Okie doke, this is ready for another review, if you have time, @YannickJadoul!

I've fixed some of the issues you raised above, and added unit tests for DockerContainer.

It got a little... complicated... when I realised that one of the pre-existing tests, test_overridden_path, wasn't failing for the correct reason. So I've fixed that; it involved some changes to bashlex_eval to support command1; command2 syntax inside command substitutions (command1 && command2 isn't supported due to a bug in bashlex, idank/bashlex#54). In the end, I switched to using BEFORE_ALL for that setup anyway, but there's now a unit test to cover command1; command2 in bashlex.
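(For reference, the sort of construct this covers, fed through the bashlex library that bashlex_eval builds on; the command string here is made up.)

```python
import bashlex

# Two ';'-separated commands inside a command substitution. `cmd1 && cmd2`
# would still trip over the bashlex bug mentioned above (idank/bashlex#54).
for tree in bashlex.parse('echo $(echo one; echo two)'):
    print(tree.dump())  # inspect the parse tree bashlex produces
```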

Anyway, let me know what you think!

Member

@YannickJadoul YannickJadoul left a comment

Alright! I can't say I've looked at every little detail, because there are a lot of changes and I'm a bit overwhelmed, but I see quite extensive tests. So I think we should just merge it, get used to the new way of doing things, and make smaller sets of changes later, if we notice something that can be improved?

bin/run_tests.py (resolved)
cibuildwheel/bashlex_eval.py (resolved)
env_assignments = ' '.join(f'{shlex.quote(k)}={shlex.quote(v)}'
                           for k, v in env.items())
command = ' '.join(shlex.quote(str(a)) for a in args)
end_of_message = str(uuid.uuid4())
Member

We could still just define our own constant magic end-of-message UUID, no? Is there a reason to introduce randomness and regenerate a new one every time?

test/test_environment.py (resolved)
if from_path.is_dir():
    self.call(['mkdir', '-p', to_path])
    # pipe a tarball of the source directory into the container and extract it at to_path
    subprocess.run(
        f'tar cf - . | docker exec -i {self.name} tar -xC {shell_quote(to_path)} -f -',
Member

Just checking, since you were disappointed by the speed of docker exec: now that we're using byte streams, is it still not possible to do this with self.call?

Contributor Author

Oops, I just saw this after the merge. It would be possible, but pretty fiddly to implement. The easiest implementation would involve reading an entire tarball into memory within the host process before streaming it into the container. With a large codebase that might introduce memory issues, and could well be slower. A better implementation would stream between active tar processes on either side, but I'm guessing that would involve some pretty gnarly non-blocking I/O code to get it to work right.

I was curious, so here's a profile of the basic test project build:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 19:54:25  Samples:  373
 /_//_/// /_\ / //_// / //_'/ //     Duration: 35.644    CPU time: 0.144
/   _/                      v3.1.3

Program: /Users/joerick/Projects/cibuildwheel/cibuildwheel/__main__.py --platform linux

35.642 <module>  __main__.py:1
└─ 35.599 main  __main__.py:56
   └─ 35.597 build  linux.py:78
      ├─ 25.233 call  docker_container.py:117
      ├─ 3.552 copy_into  docker_container.py:76
      │  └─ 3.541 run  subprocess.py:447
      │        [11 frames hidden]  subprocess
      ├─ 3.184 __enter__  docker_container.py:34
      │  ├─ 2.655 call  docker_container.py:117
      │  └─ 0.520 run  subprocess.py:447
      │        [11 frames hidden]  subprocess
      ├─ 1.854 __exit__  docker_container.py:68
      │  └─ 1.575 wait  subprocess.py:1014
      │        [4 frames hidden]  subprocess
      ├─ 0.682 glob  docker_container.py:106
      │  └─ 0.682 call  docker_container.py:117
      ├─ 0.536 copy_out  docker_container.py:95
      │  └─ 0.536 run  subprocess.py:447
      │        [9 frames hidden]  subprocess
      └─ 0.518 get_environment  docker_container.py:176
         └─ 0.518 call  docker_container.py:117

copy_into is 10% of the time here. Maybe not worth the hassle.

Member

Indeed, it might not be. It just looks out of place; that's why I asked.

@joerick
Contributor Author

joerick commented Jul 11, 2020

Thanks for looking it over, @YannickJadoul!
