Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

containers create by Github workflow have wrong dockerMTU #1046

Open
erichorwath opened this issue Jan 11, 2022 · 16 comments
Open

containers create by Github workflow have wrong dockerMTU #1046

erichorwath opened this issue Jan 11, 2022 · 16 comments
Labels

Comments

@erichorwath
Copy link

erichorwath commented Jan 11, 2022

Describe the bug
No network possible from Docker containers created by Github workflows (dind - Docker in Docker)

To Reproduce
Steps to reproduce the behavior:

  1. create workflow with docker image, e.g.
name: Setup Go

on:
  workflow_dispatch:

jobs:
  setupgo:
    runs-on: [kubernetes]
    container:
      image: 'ubuntu:latest'
    steps:
    - name: Set up Go
      uses: actions/setup-go@v2
      with:
        go-version: 1.17
    
    - name: sleep
      shell: bash
      run: sleep 300
  1. exec into the dind sidecar container and confirm that mtu parameter is propagated properly from runner deployment spec:
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: my-runner
spec:
  template:
    spec:
      organization: abc
      ephemeral: true
      dockerMTU: 1440
$ kubectl exec -it my-runner-xyz -c docker -- /bin/sh 
$ ps auxww
    1 root      0:00 docker-init -- dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2376 --tlsverify --tlscacert /certs/server/ca.pem --tlscert /certs/server/cert.pem --tlskey /certs/server/key.pem --mtu 1440
   53 root      0:03 dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2376 --tlsverify --tlscacert /certs/server/ca.pem --tlscert /certs/server/cert.pem --tlskey /certs/server/key.pem --mtu 1440
   61 root      0:04 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
   ...
$ ip a
  ...
  4: eth0@if700: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1440 qdisc noqueue state UP
      link/ether d2:ee:d5:2f:04:ae brd ff:ff:ff:ff:ff:ff
      inet 100.96.2.200/32 brd 100.96.2.200 scope global eth0
         valid_lft forever preferred_lft forever
      inet6 fe80::d0ee:d5ff:fe2f:4ae/64 scope link
         valid_lft forever preferred_lft forever
  1. run the workflow and exec into the workflow docker container
$ kubectl exec -it my-runner-xyz -c docker -- /bin/sh
$ docker ps
$ docker exec -it <containerID> bash
$ apt update
$ apt install curl -y
$ curl -vOL https://github.com/actions/go-versions/releases/download/1.17.6-1668090892/go-1.17.6-linux-x64.tar.gz
  0     0    0     0    0     0      0      0 --:--:--  0:00:54 --:--:--     0
  -> nothing is happening...
$ apt install iproute2 -y
$ ip a 
  ...
  eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
  -> wrong MTU

Expected behavior
I would expect that downloading files should work from docker containers.
If I manually start a docker container from the dind sidecar container -> no network problems:

$ kubectl exec -it my-runner-xyz -c docker -- /bin/sh
$ docker run -it --rm ubuntu bash
$ apt update
$ apt install curl -y
$ curl -vOL https://github.com/actions/go-versions/releases/download/1.17.6-1668090892/go-1.17.6-linux-x64.tar.gz
   100  128M  100  128M    0     0  22.9M      0  0:00:05  0:00:05 --:--:-- 23.7M
   --> download completed
$ apt install iproute2 -y
$ ip a 
  ...
  eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440
  -> correct MTU

Environment:

  • Helm Chart Version 0.14
  • Controller Version: 0.20.2
  • Deployment Method: Helm + Kustomize + fluxV2
  • Kubernetes: ip-in-ip Calico on AWS
@mumoshu
Copy link
Collaborator

mumoshu commented Jan 11, 2022

@erichorwath Hey. Please set dockerMTU appropriately as documented in https://github.com/actions-runner-controller/actions-runner-controller#additional-tweaks. Reading through all the docker-mtu related issues we had experienced https://github.com/actions-runner-controller/actions-runner-controller/issues?q=label%3A%22docker+mtu+issue%22+is%3Aclosed would also help.

@mumoshu mumoshu closed this as completed Jan 11, 2022
@erichorwath
Copy link
Author

@mumoshu thanks for your quick reply! I have set the dockerMTU based on the documentation, which works perfectly fine EXCEPT for workflows where "container" is specified in the jobs section as described above.
Unfortunately our infrastructure does not allow us to have 1500 as the host network MTU, so we have to rely on a proper dockerMTU propagation...

@mumoshu
Copy link
Collaborator

mumoshu commented Jan 11, 2022

@erichorwath Hey. Thanks for your confirmation. For the remaining MTU issue for the service containers, the only rescue I'm aware of is the create a wrapper script for docker network create which is used for setting up docker networks for those service containers. It seems like it's a docker issue that it doesn't inherit MTU setting from the daemon to the custom docker networks...

#848 (comment)

@erichorwath
Copy link
Author

@mumoshu oh, that explains the problem!

Which part of the coding is able to parse the correct MTU to the docker network create command? Is this part of Github Runner behaviour or the action-runner-controller go code?

@mumoshu
Copy link
Collaborator

mumoshu commented Jan 12, 2022

@erichorwath The problem is that neither actions runner nor docker sets the proper MTU on docker network create. As a workaround, you'd need to create a custom script named docker and put it in $PATH within runner containers, so that it is called by the actions runner process.

In the custom script, you auto-add MTU if the sub-command is network create, and just proxy all the args as-is to the original docker command for other cases.
That's what's the script introduced at #848 (comment) is doing.

Our runner image is build from https://github.com/actions-runner-controller/actions-runner-controller/blob/master/runner/Dockerfile. You'd need to add the custom script to the Dockerfile and build your own runner image out of it. A custom runner image can be specified via controller-manager's --runner-image flag or
RunnerDeployment.Spec.Template.Image. Probably the latter is easier for you.

@tiagoblackcode
Copy link
Contributor

tiagoblackcode commented Feb 16, 2022

Hello,

I've come across this issue as well since the Kubernetes network I'm setting up uses a smaller MTU (1420 instead of the usual 1500).

Despite the MTU being configurable with the dockerMTU setting on the Runner spec, it only affects the default docker network and does not propagate to networks created by the Github runner. This poses a problem when running workflows that use containers such as the actions/checkout@v2 one.

So, I followed-up on the workaround proposed by @FalconerTC here and changed it to support the sidecar dind container since /etc/docker/daemon.json is not available:

#!/usr/bin/env bash

# Inspired by https://github.com/actions-runner-controller/actions-runner-controller/issues/848#issuecomment-929394653
# Inspired by https://github.com/actions/runner/issues/775#issuecomment-927826684

if [[ $1 = "network" ]] && [[ $2 = "create" ]] ; then
    shift; shift #pop 2 first parameters

    MTU=$(docker network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null); 
    if [[ ! -z "$MTU" ]]; then 
        /usr/local/bin/docker.bin network create --opt com.docker.network.driver.mtu=$MTU "${@}"  
    else
        /usr/local/bin/docker.bin network create "${@}"
    fi
else
    #just call docker as normal if not network create
    /usr/local/bin/docker.bin "${@}"
fi

This has the added benefit of propagating whatever MTU is set on the default docker bridge, which works in both dind and docker in runner situations.

I've pushed a docker image with the change to tiagomelo/github-actions-runner which basically places the script shim the docker container and propagate the docker MTU to newly created networks.

However, since using containers in Github Actions is pretty common and - apparently - having Kubernetes overlay networks with MTUs smaller than 1500, I think this should be integrated in the official summerwind/actions-runner image. I'm willing to open a PR with the change if you wish.

Meanwhile, this can be tested by creating a runner with the custom runner image:

apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
metadata:
  name: test-runner
spec:
  dockerEnabled: true
  dockerMTU: 1420
  image: tiagomelo/docker-actions-runner
  repository: <github-repository>

You can validate this is working with:

kubectl exec -it -c runner test-runner -- /bin/bash

within the runner:

docker network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null
# returns 1420 (or whatever MTU is set)

docker run --rm rancher/curl https://github.com >/dev/null
# should complete

docker network create test

docker network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>/dev/null
# returns 1420 (or whatever MTU is set)

docker run --rm --network test rancher/curl https://github.com >/dev/null
# should complete

docker network rm test

Edit: Published a repository with the image under tiagoblackcode/github-actions-runner.

@alexdepalex
Copy link

Can this be reopened since the issue still persist? We're running into the same issue where we use dind and buildah to create container images.

@tiagoblackcode Were you able to create a PR for this?

@mumoshu
Copy link
Collaborator

mumoshu commented Feb 17, 2022

@alexdepalex We aren't very eager to fix this on ARC side because we consider this as an issue in actions/runner, but happy to review any PR related to this and see if it's feasible for us to maintain it.

@NicklasWallgren
Copy link

NicklasWallgren commented Feb 18, 2022

@tiagoblackcode Thanks for publishing a docker image containing the fix!

I agree with @mumoshu, but at the same time I think it's a good idea to include the fix directly in summerwind/actions-runner, since the maintainers of actions/runner doesn't acknowledge the issue neither in actions/runner#775 nor the open PR actions/runner#1650

@tiagoblackcode
Copy link
Contributor

tiagoblackcode commented Feb 18, 2022

I agree with the general sentiment that the issue is not in ARC but instead in actions/runner or even in docker, honestly. Since docker supports the --mtu flag, it should use its MTU value as default to any networks created unless specified otherwise.

However, since it seems we're kind of stuck on where to implement this, I suggest the following:

  • Document in the Troubleshooting section the issue, and the workaround.
  • Optionally, create a secondary image in ARC with the fix, and a USE AT YOUR OWN RISK advert. Alternatively, point to my image, but I'd rather have this in the official repository since it has more visibility. I suggest a secondary image because this can have unintended consequences we're not aware of at the moment.
  • Bring visibility to the actions/runner#775 issue.

This way ARC users have a documented way to fix the issue while we wait for actions/runner to support this.

@mumoshu what are your thoughts on this?

@NicklasWallgren regarding actions/runner#1650, instead of having an env variable I think it's better to inspect the bridge network, figure out its MTU and use it to create the network.

Edit:

There's an open PR in moby/moby#43197 to propagate the MTU set in --mtu to networks created without an explicit MTU.

@NicklasWallgren
Copy link

@tiagoblackcode Yeah, you are probably right. I did suggest that as an alternative solution in the PR.

@mumoshu
Copy link
Collaborator

mumoshu commented Feb 19, 2022

I suggest a secondary image because this can have unintended consequences we're not aware of at the moment.

This might be the most viable option 👍

Also, I'd greatly appreciate it if you folks mind helping me provide user support around this workaround in the future.

Maybe- include some of you into CODEOWNERS file so that it's more clear that there are multiple folks not only me but also you who is known to have some knowledge about the alternative runner dockerfile (with the said workaround)?

@tiagoblackcode
Copy link
Contributor

I'm willing to help you with support.

I'm gonna prepare a PR with the secondary image and regarding it in the next couple of days.

@mumoshu mumoshu reopened this Feb 21, 2022
tiagoblackcode added a commit to tiagoblackcode/actions-runner-controller that referenced this issue Mar 10, 2022
Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(actions#1046)[actions#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.
tiagoblackcode added a commit to tiagoblackcode/actions-runner-controller that referenced this issue Mar 10, 2022
Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(actions#1046)[actions#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.
tiagoblackcode added a commit to tiagoblackcode/actions-runner-controller that referenced this issue Mar 10, 2022
Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(actions#1046)[actions#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.
@tiagoblackcode
Copy link
Contributor

Hello,

I finally had some time to work on this. Please check out #1201 which adds a new docker image to the repository that propagates the MTU setting to networks created in Github workflows.

@stale
Copy link

stale bot commented Apr 9, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 9, 2022
@toast-gear toast-gear added enhancement New feature or request and removed stale labels Apr 11, 2022
tiagoblackcode added a commit to tiagoblackcode/actions-runner-controller that referenced this issue May 18, 2022
Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(actions#1046)[actions#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.
tiagoblackcode added a commit to tiagoblackcode/actions-runner-controller that referenced this issue Jul 18, 2022
Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(actions#1046)[actions#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.
tiagoblackcode added a commit to tiagoblackcode/actions-runner-controller that referenced this issue Jul 19, 2022
Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(actions#1046)[actions#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.
mumoshu added a commit that referenced this issue Sep 23, 2022
* feat: Add container to propagate host network MTU

Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(#1046)[#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.

* fixup! feat: Add container to propagate host network MTU

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
neggles pushed a commit to neggles/actions-runners that referenced this issue Nov 10, 2022
* feat: Add container to propagate host network MTU

Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(#1046)[actions/actions-runner-controller#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.

* fixup! feat: Add container to propagate host network MTU

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
neggles pushed a commit to neggles/actions-runners that referenced this issue Nov 10, 2022
* feat: Add container to propagate host network MTU

Some network environments use non-standard MTU values. In these
situations, the `DockerMTU` setting might be used to specify the MTU
setting for the `bridge` network created by Docker. However, when the
Github Actions workflow creates networks, it doesn't propagate the
`bridge` network MTU which can lead to `connection reset by peer`
messages.

To overcome this, I've created a new docker image called
`summerwind/actions-runner-mtu` that shims the docker binary in order to
propagate the MTU setting to networks created by Github workflows.

This is a follow-up on the discussion in
(#1046)[actions/actions-runner-controller#1046]
and uses a separate image since there might be some unintended
side-effects with this approach.

* fixup! feat: Add container to propagate host network MTU

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
@na4ma4
Copy link

na4ma4 commented Feb 15, 2024

With the introduction of --default-network-opt mapmap as an argument for the docker daemon, is it possible to add an option to specify custom arguments for the docker sidecar image ?

If we could specify dockerArgs: [ '--default-network-opt=bridge=com.docker.network.driver.mtu=1400' ] to RunnerSet.spec or RunnerDeployment.spec.template.spec then that might solve this issue generally without wrappers ?

The wrapper doesn't help for applications that use the docker API directly (like the dependabot-action network create)

I'd be happy to do a PR if that seems like a valid option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants