Allow running cibuildwheel inside of a previously configured docker image. #676

Closed
Erotemic opened this issue May 14, 2021 · 12 comments

@Erotemic
Contributor

On my internal GitLab CI runners, docker-in-docker is disabled. This seems to be causing cibuildwheel to fail.

As a workaround, I would like to be able to run cibuildwheel inside a base manylinux image. That is, I want to tell it that it is already inside a manylinux image like quay.io/pypa/manylinux2010_x86_64, and then have it do its thing.

I do something similar in this script: if you are inside the docker image (you have to give it this information), it executes the build directly, but if you tell it you want to run in docker, it executes itself inside the docker image.
https://gitlab.kitware.com/computer-vision/kwimage/-/blob/master/run_manylinux_build.sh

And my GitLab YAML has to explicitly call out when I use the base quay.io/pypa/manylinux2010_x86_64 image:
https://gitlab.kitware.com/computer-vision/kwimage/-/blob/master/.gitlab-ci.yml

I was poking around in cibuildwheel.linux, and I see that it loops over several configurations and then executes a block of code in a DockerContainer context manager. If I were to write a PR that refactored that inner part into a function the CLI could invoke directly (where the user would likely have to provide some of the information the looped settings currently provide), would that be of interest to the maintainers?

(A lot of the inner loop actually looks like a more sophisticated version of what I'm doing in that shell script; I think it would be nice to have that as a callable function.)

@henryiii
Contributor

Do you really need to refactor it in this way? It seems like we could just write a new context manager that simply wraps subprocess.run and otherwise does pretty much nothing; using that instead would perform the work in-process rather than in a Docker container. Then you could have some escape hatch that uses that instead if cibuildwheel is running inside the manylinux image - say, maybe even with pipx! :) (See pypa/manylinux#1055 (comment).)

I actually would really like to have a run_all and run function inside each of the linux/macOS/windows modules, see #560. But there are challenges to doing it properly, I think, and the interface might not be that "nice" for outside users. Longer term, an API might be of interest?
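
For illustration, a minimal sketch of what such a wrapping context manager might look like. The class name LocalRunner and the method signatures are assumptions loosely modelled on cibuildwheel's DockerContainer interface, not actual project code:

    import subprocess
    from pathlib import Path, PurePath


    class LocalRunner:
        # Hypothetical stand-in for DockerContainer that runs commands directly
        # on the host, which is assumed to already be a manylinux container.
        def __enter__(self) -> "LocalRunner":
            return self

        def __exit__(self, *exc_info) -> None:
            pass

        def call(self, args, env=None, cwd=None) -> str:
            # Run in-process on the host instead of via `docker exec`.
            result = subprocess.run(
                args, env=env, cwd=cwd, check=True,
                capture_output=True, text=True,
            )
            return result.stdout

        def copy_out(self, from_path: PurePath, to_path: Path) -> None:
            # No container boundary, so a plain copy suffices.
            subprocess.run(["cp", "-r", f"{from_path}/.", str(to_path)], check=True)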

@Erotemic
Contributor Author

I started hacking on this at https://github.com/Erotemic/cibuildwheel/tree/dev/flow

I'm not sure what the best way to refactor my stuff to use cibuildwheel is yet. I'm not tied to any one solution.

In my branch I'm also checking to see whether replacing "docker" with "podman" works (which seems to be as easy as making a docker_exe variable that is set to either "docker" or "podman" and passing it to subprocess.run). If that works, that might be the simpler route.
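
Roughly, the idea is something like the following hypothetical sketch (run_container_command is an illustrative helper, not code from the branch):

    import subprocess

    def run_container_command(args, docker_exe="docker"):
        # docker_exe can be "docker" or "podman"; every invocation goes through
        # the same variable, so swapping engines is (in theory) a one-line change.
        subprocess.run([docker_exe, *args], check=True)

    # e.g.:
    # run_container_command(["pull", "quay.io/pypa/manylinux2010_x86_64"], docker_exe="podman")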

@mayeut
Member

mayeut commented May 24, 2021

@Erotemic,

Podman or a rootless docker daemon should probably work. If you've been able to try either of those, it would be interesting to know what works and what doesn't.
For a simple tryout with podman, I'd probably go for a symlink named docker on the PATH so as not to have to modify cibuildwheel at first.
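
For illustration, a minimal Python sketch of that symlink approach (the podman location /usr/bin/podman and ~/.local/bin being on PATH are assumptions):

    from pathlib import Path

    # Expose podman under the name `docker` without modifying cibuildwheel.
    bin_dir = Path.home() / ".local" / "bin"  # assumed to be on PATH
    bin_dir.mkdir(parents=True, exist_ok=True)
    link = bin_dir / "docker"
    if not link.exists():
        link.symlink_to("/usr/bin/podman")  # assumed podman location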

@henryiii,

Refactoring the run command might be challenging for Linux, but I agree this would probably be the best way to handle it.

@Erotemic
Contributor Author

@mayeut I've actually been able to use podman with some success, but it did require modifications, which are currently in my branch:
master...Erotemic:dev/flow

I had to add these "common docker flags"

        self.common_docker_flags = [
            # https://stackoverflow.com/questions/30984569/error-error-creating-aufs-mount-to-when-building-dockerfile
            "--cgroup-manager=cgroupfs",
            "--storage-driver=vfs",
            # https://github.com/containers/podman/issues/2347
            f"--root={os.environ['HOME']}/.local/share/containers/vfs-storage/",
        ]

The storage driver needed to change to vfs for whatever reason, which also meant I needed to change the root, and I also think I needed to add --cgroup-manager=cgroupfs, although I'm not 100% sure what effect that last option has.

I added :Z to the end of the volume mount and removed the cwd_args from docker create because that broke on podman. (That also required doing an initial mkinit and executing a cd before every command to simulate a workdir.)

There was a weird issue in the __exit__ method when shutting down the podman containers. Adding a small sleep seemed to mitigate it.

The workaround for docker cp did not work on podman because it seems to cut off stdout, so the tar files kept coming back with unexpected EOFs. I just used podman cp, and that did work to copy wheels out of the container.

The diff is definitely bigger than it needs to be because I played around with it a lot, but ultimately podman did work (and probably should be added as an option to mainline cibuildwheel).

Note, I'm also not sure whether all of the changes I made were 100% necessary. I know the cwd_args change was, but I'm not sure about the :Z. I was trying things to get around errors like:

time="2021-05-19T19:28:17Z" level=warning msg="Failed to add conmon to systemd sandbox cgroup: dial unix /run/systemd/private: connect: no such file or directory"
Error: OCI runtime error: unable to start container 3aaeddd49abd4f8736532af93080ffc109ba48617738bbc25d1d76be78f166d4: systemd cgroup flag passed, but systemd support for managing cgroups is not available
time="2021-05-19T16:37:07Z" level=warning msg="Couldn't run auplink before unmount /var/lib/containers/storage/aufs/mnt/a5366127266b78186d0c49c41deb56c6913d0b571daa78322eae26093f9e06c6: exec: \"auplink\": executable file not found in $PATH"
time="2021-05-19T16:37:07Z" level=error msg="Error adding network: failed to locate iptables: exec: \"iptables\": executable file not found in $PATH"
time="2021-05-19T16:37:07Z" level=error msg="Error while adding pod to CNI network \"podman\": failed to locate iptables: exec: \"iptables\": executable file not found in $PATH"
time="2021-05-19T16:37:07Z" level=error msg="Error preparing container 762a698cc4d6ca307caa297ef328e13fb7c1d545e229080975d73987062e5667: error configuring network namespace for container 762a698cc4d6ca307caa297ef328e13fb7c1d545e229080975d73987062e5667: failed to locate iptables: exec: \"iptables\": executable file not found in $PATH"
Error: unable to start container 762a698cc4d6ca307caa297ef328e13fb7c1d545e229080975d73987062e5667: error mounting storage for container 762a698cc4d6ca307caa297ef328e13fb7c1d545e229080975d73987062e5667: error creating aufs mount to /var/lib/containers/storage/aufs/mnt/a5366127266b78186d0c49c41deb56c6913d0b571daa78322eae26093f9e06c6: invalid argument

But again, with these modifications podman does work.

@mayeut
Member

mayeut commented May 24, 2021

@Erotemic,

Thanks for the detailed feedback. I'll probably have a look at your fork, given that I'm currently trying to debug some test issues on Travis CI using podman and, as your patches suggest, it's not as easy as "replace docker with podman".

@Erotemic
Contributor Author

Erotemic commented Oct 4, 2021

As an update, I've been using my patched version at https://github.com/Erotemic/cibuildwheel/tree/dev/flow to build all of my wheels on machines where podman is available but docker isn't. If there is interest, I can clean it up and submit it as a PR.

@fedelibre

> For a simple tryout with podman, I'd probably go for a symlink named docker on the PATH so as not to have to modify cibuildwheel at first.

Is an alias not enough?

[fede@fedora python-poppler-qt5]$ which docker
alias docker='podman'
	/usr/bin/podman
[fede@fedora python-poppler-qt5]$ docker --version
podman version 3.4.0
[fede@fedora python-poppler-qt5]$ cibuildwheel --platform linux

     _ _       _ _   _       _           _
 ___|_| |_ _ _|_| |_| |_ _ _| |_ ___ ___| |
|  _| | . | | | | | . | | | |   | -_| -_| |
|___|_|___|___|_|_|___|_____|_|_|___|___|_|

cibuildwheel version 2.1.3

Build options:
  platform: 'linux'
  architectures: {<Architecture.x86_64: 'x86_64'>, <Architecture.i686: 'i686'>}
  before_all: ''
  before_build: ''
  before_test: ''
  build_frontend: 'pip'
  build_selector: BuildSelector(build_config='*')
  build_verbosity: 0
  dependency_constraints: DependencyConstraintsPosixPath('/var/home/fede/.local/lib/python3.10/site-packages/cibuildwheel/resources/constraints.txt'))
  environment: ParsedEnvironment([])
  manylinux_images: {'x86_64': 'quay.io/pypa/manylinux2010_x86_64:2021-10-06-94da8f1', 'i686': 'quay.io/pypa/manylinux2010_i686:2021-10-06-94da8f1', 'pypy_x86_64': 'quay.io/pypa/manylinux2010_x86_64:2021-10-06-94da8f1', 'aarch64': 'quay.io/pypa/manylinux2014_aarch64:2021-10-06-94da8f1', 'ppc64le': 'quay.io/pypa/manylinux2014_ppc64le:2021-10-06-94da8f1', 's390x': 'quay.io/pypa/manylinux2014_s390x:2021-10-06-94da8f1', 'pypy_aarch64': 'quay.io/pypa/manylinux2014_aarch64:2021-10-06-94da8f1', 'pypy_i686': 'quay.io/pypa/manylinux2010_i686:2021-10-06-94da8f1'}
  output_dir: PosixPath('wheelhouse')
  package_dir: PosixPath('.')
  repair_command: 'auditwheel repair -w {dest_dir} {wheel}'
  test_command: ''
  test_extras: ''
  test_requires: []
  test_selector: TestSelector(build_config='*')

Here we go!

cibuildwheel: Docker not found. Docker is required to run Linux builds. If you're building on Travis CI, add `services: [docker]` to your .travis.yml.If you're building on Circle CI in Linux, add a `setup_remote_docker` step to your .circleci/config.yml

@Erotemic
Contributor Author

@fedelibre An alias is not enough on the GitLab CI machines I use; there are specific args I have to add in order to make podman work.

In my fork the differences are in cibuildwheel/docker_container.py

In __enter__ I have the following tweaks:

        if self.oci_exe == 'podman':
            self.common_oci_args += [
                # https://stackoverflow.com/questions/30984569/error-error-creating-aufs-mount-to-when-building-dockerfile
                "--cgroup-manager=cgroupfs",
                "--storage-driver=vfs",
            ]
            if self.oci_root == "":
                # https://github.com/containers/podman/issues/2347
                self.common_oci_args += [
                    f"--root={os.environ['HOME']}/.local/share/containers/vfs-storage/",
                ]
            else:
                self.common_oci_args += [
                    f"--root={self.oci_root}",
                ]
        if self.oci_exe == 'podman':
            oci_create_args.extend([
                #https://github.com/containers/podman/issues/4325
                "--events-backend=file",
                "--privileged",
            ])
            oci_start_args.extend([
                "--events-backend=file",
            ])

I also have to add some hacky sleeps in __exit__; I'm not sure why.

        if self.oci_exe == 'podman':
            time.sleep(0.01)

The copy_out process is a bit different:

        if self.oci_exe == 'podman':
            # podman's stream-based copy truncates stdout, so first create a tar
            # inside the container, then copy the tar file out and extract it.
            command = f"{self.oci_exe} exec {self.common_oci_args_join} -i {self.name} tar -cC {shell_quote(from_path)} -f /tmp/output-{self.name}.tar ."
            subprocess.run(
                command,
                shell=True,
                check=True,
                cwd=to_path,
            )

            command = f"{self.oci_exe} cp {self.common_oci_args_join} {self.name}:/tmp/output-{self.name}.tar output-{self.name}.tar"
            subprocess.run(
                command,
                shell=True,
                check=True,
                cwd=to_path,
            )

            command = f"tar -xvf output-{self.name}.tar"
            subprocess.run(
                command,
                shell=True,
                check=True,
                cwd=to_path,
            )

            os.unlink(to_path / f"output-{self.name}.tar")
        elif self.oci_exe == 'docker':
            command = f"{self.oci_exe} exec {self.common_oci_args_join} -i {self.name} tar -cC {shell_quote(from_path)} -f - . | cat > output-{self.name}.tar"
            subprocess.run(
                command,
                shell=True,
                check=True,
                cwd=to_path,
            )
        else:
            raise KeyError(self.oci_exe)

@henryiii
Contributor

henryiii commented Oct 15, 2021

Personally, I'd be fine adding podman support; it doesn't look too hard. I don't know much about podman, though, and we'd want some way to test it. Is it available on public CI, like GitLab CI? It could be CIBW_<something>=podman, probably as a non-options option, like... actually, I thought we had a CIBW_FOLD_PATTERN, but apparently not. @joerick, thoughts about podman?

We could also support "native", which would run on the host system directly and would ignore images - you would be expected to run cibuildwheel from inside the manylinux image. Might be harder to support, though. (That's the original idea of this issue.)

@fedelibre

OK, I see.
I was following the manual and hoped I could test my first attempts locally with podman. I guess I'll use GitHub Actions then.

@Erotemic
Contributor Author

FYI: I've updated my fork of cibuildwheel with a breaking change. I needed finer-grained control over the extra options I pass to podman, so instead of detecting whether podman is the OCI driver and then adding extra flags to the commands, I'm currently forcing the user (me) to explicitly set the extra flags podman needs. To get the previous behavior, the environment variables that need setting are now:

          export CIBW_OCI_EXE="podman"
          export CIBW_OCI_EXTRA_ARGS_CREATE="--events-backend=file --privileged"
          export CIBW_OCI_EXTRA_ARGS_COMMON="--cgroup-manager=cgroupfs --storage-driver=vfs --root=$HOME/.local/share/containers/vfs-storage/"
          export CIBW_OCI_EXTRA_ARGS_START="--events-backend=file --cgroup-manager=cgroupfs --storage-driver=vfs"

Also note that if running podman inside of docker, it is important that the parent docker run command is given the --privileged flag.

Lastly, it seemed important to update from podman 3.0.1 to 3.2.1 in order for my CI scripts to work on newer Linux kernels (5.4 worked, but 5.8 and 5.11 failed).
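
For context, a hypothetical sketch of how such variables could be consumed when assembling the container commands; the actual handling lives in the fork and may differ:

    import os
    import shlex

    oci_exe = os.environ.get("CIBW_OCI_EXE", "docker")
    common_args = shlex.split(os.environ.get("CIBW_OCI_EXTRA_ARGS_COMMON", ""))
    create_args = shlex.split(os.environ.get("CIBW_OCI_EXTRA_ARGS_CREATE", ""))
    start_args = shlex.split(os.environ.get("CIBW_OCI_EXTRA_ARGS_START", ""))

    # Example: assembling the `create` invocation from the configured engine and flags.
    create_cmd = [oci_exe, *common_args, "create", *create_args,
                  "quay.io/pypa/manylinux2010_x86_64"]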

@joerick
Contributor

joerick commented Apr 1, 2023

Podman support was merged a year ago, so I think the motivation for the initial request of invoking cibuildwheel inside the container is gone. Podman is the solution for environments where root isn't available.

@joerick closed this as not planned on Apr 1, 2023