Support "native" podman as backend #85

Open
AkiraNorthstar opened this issue Nov 10, 2019 · 24 comments · May be fixed by #305
Labels
agent backend new backend feature add new functionality
Milestone

Comments

@AkiraNorthstar

AkiraNorthstar commented Nov 10, 2019

Hello Laszlo!
Is it possible to integrate podman in woodpecker?

Podman does not run as a daemon (like docker : /var/run/docker.sock) but is fully compatible with the docker command line.

Another point is that podman can also mount secrets via mounts.conf, which might also solve Global Secrets.

Also, cgroups v2 is natively supported by podman.

Further information about podman:

@laszlocph
Member

laszlocph commented Nov 10, 2019

Technically it is a matter of implementing this interface: https://github.com/laszlocph/woodpecker/blob/master/cncd/pipeline/pipeline/backend/backend.go and this is how it is implemented for Docker: https://github.com/laszlocph/woodpecker/blob/master/cncd/pipeline/pipeline/backend/docker/docker.go

So technically it is probably possible. Whether the project should focus on it is a different question.

In my surroundings I don't see many companies wanting to differentiate on which container engine they run; they just default to Docker. Even though the drawbacks of Docker's architecture are known to some of them, adoption of alternatives has not reached a high level. Or they don't talk about it, or I don't listen :) Also these companies are not Fedora/RedHat/Centos shops, so I have limited visibility on the adoption of Podman. But I do see the benefits of Podman over Docker. I haven't used it, but I might try.

All in all:

  • technically it's possible
  • I don't plan to immediately implement it
  • I accept pull requests even in early forms, and am happy to team up on full delivery
  • I need more data on the adoption of Podman.
  • I want to be convinced as I feel something great going on with Podman. Please help me see the light.

@mscherer
Contributor

mscherer commented Sep 1, 2021

So I tried to use podman with woodpecker. It failed in several interesting ways. I used podman on f34, either the package (quite recent 3.3.0), or the latest git devel version (self compiled).

It failed first because podman didn't pull the plugins/git image, and I do not understand why, as it works if I do it manually. Once the plugin is in the local registry, the build proceeds.

Then it failed because the workingDir directory is not created automatically. I see the volume creation, and the code should create it on disk in /var/lib/something, but it doesn't seem to mount it correctly. Again, it seems to be dependent on the backend; I tried with crun and runc, no luck. My plan was to debug that, but I haven't looked further yet.

However, I also looked at what it would require to write a backend. Podman has a set of bindings, so this shouldn't have been too hard for a Go beginner like me. However, since woodpecker has switched to vendoring in 7551357, my understanding is that we also have to vendor podman, which in turn pulls in a rather large number of dependencies, some of which require C headers (btrfs, devicemapper). I do think this would put an undue burden on woodpecker CI and deployment.

So for now, the solution for using podman is either to find out why woodpecker works with docker but not with the docker API exposed by podman (likely an issue on the podman side), or to add a backend in a way that does not bloat the build (again, likely an issue with the podman bindings, or maybe I am doing it wrong).

@anbraten
Member

anbraten commented Sep 1, 2021

That sounds quite interesting. I never used podman, but as I am totally interested in a Kubernetes agent, it is probably a good discussion about how backends for agent should be handled in the long term. Currently I see two options for it:

  • separate agents per backend
  • one agent which includes every supported backend

Meanwhile if you need help debugging woodpecker feel free to write me on Discord.

@mfulz mfulz linked a pull request Sep 14, 2021 that will close this issue
@mfulz
Contributor

mfulz commented Sep 14, 2021

> That sounds quite interesting. I never used podman, but as I am totally interested in a Kubernetes agent, it is probably a good discussion about how backends for agent should be handled in the long term. Currently I see two options for it:
>
>   • separate agents per backend
>   • one agent which includes every supported backend
>
> Meanwhile if you need help debugging woodpecker feel free to write me on Discord.

I've just used a --use-podman boolean flag for the agent for now.
But I think in the actual implementation it would make sense to use one agent with multiple backends and to implement a flag like --backend with a simple switch...case inside the agent.

With separate agents, most of the code would either just be duplicated for every agent, or it would need a bigger rewrite, I guess.
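
The single-agent idea above could look roughly like the following Go sketch. Everything here is hypothetical and only for illustration: the `Backend` interface is a one-method stand-in (not the real woodpecker backend interface), and `parseBackendFlag` and `selectBackend` are made-up helper names.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Backend is a hypothetical, stripped-down stand-in for the real
// pipeline backend interface; it only exposes a name.
type Backend interface {
	Name() string
}

type dockerBackend struct{}

func (dockerBackend) Name() string { return "docker" }

type podmanBackend struct{}

func (podmanBackend) Name() string { return "podman" }

// parseBackendFlag reads a --backend flag from args, defaulting to docker.
func parseBackendFlag(args []string) (string, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	name := fs.String("backend", "docker", "container backend to use (docker or podman)")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *name, nil
}

// selectBackend is the "simple switch...case" that maps a name to a backend.
func selectBackend(name string) (Backend, error) {
	switch name {
	case "docker":
		return dockerBackend{}, nil
	case "podman":
		return podmanBackend{}, nil
	default:
		return nil, fmt.Errorf("unknown backend %q", name)
	}
}

func main() {
	name, err := parseBackendFlag(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	backend, err := selectBackend(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("using backend:", backend.Name())
}
```

With this shape, adding another backend means one new type and one new `case`, while the rest of the agent code stays shared.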

@anbraten anbraten changed the title podman as feature ? Support podman as backend Sep 18, 2021
@anbraten anbraten added the feature add new functionality label Sep 18, 2021
@mscherer
Contributor

mscherer commented Oct 2, 2021

So I just tried again with podman and it now works... almost.

I was able to use on a self hosted gitea the following config:

clone:
  git:
    image: docker.io/a6543/test_git_plugin:latest
pipeline:
  prepare-build:
    image: quay.io/fedora/fedora:latest
    commands:
      - sudo dnf install -y zola
      - zola build --drafts

The server runs on Fedora 34 with podman-3.3.1-1.fc34.x86_64, with woodpecker 34cfabb.
So the issue of not downloading the image is still there, but unlike the working dir one, it has a workaround.

@mscherer
Contributor

mscherer commented Oct 2, 2021

Ok so it worked because I patched podman myself, and forgot about it. Upgrading to 3.4.0 resulted in the same workingDir issue

@mscherer
Contributor

Status update: I managed to get podman working (for real this time), and the patch was merged upstream and will be in the next release. The only remaining issue is one around the registry. It seems that, for some reason, the docker compat API of podman will not download remote images automatically, but if you download them manually beforehand (using podman pull on the agent), it works without problem, so this one can be worked around more easily.

@6543
Member

6543 commented Nov 14, 2021

@mscherer nice to hear - so do we need #305, or does podman work through the compatibility layer with podman >= v3.4.3 (upcoming release)?

@mscherer
Contributor

I think it will work with newer podman, but I would appreciate it if someone could do a test, since I already managed to get my test wrong.
I also wrote mscherer/podman@bf4a6b9 to fix the 2nd problem I faced, but I need to research the docker code a bit more to make sure I replicated the behavior correctly.

@6543
Member

6543 commented Nov 16, 2021

Looking forward to seeing a pull request for the 2nd issue :)

I think we should still add podman as a "native" option too - but whatever gets released first will do it for the majority, I think.

@mscherer
Contributor

I am sure upstream podman would also prefer a native API, but it seemed to pull in a rather high amount of dependencies and code, even by Go vendoring standards, due to the web of imports in the podman code base (the PR changed 2430 files, and the current vendor directory is 2794 files). In turn, this will increase the compilation time and the binary size too (like, it almost doubled when I tried to make a mock podman backend before #305 was proposed).

Nothing unfixable, but this looks like a rather tedious problem to solve. Last time I looked, it was because the specgen package from podman pulls in the whole world, so that's an issue to be solved on the podman side and, afaik, one that hasn't been reported yet.

@mscherer
Contributor

Here is the PR for the last problem I encountered:
containers/podman#12315

@mscherer
Contributor

So the PR was wrong, but this is being worked on containers/podman#12318 and containers/podman#12317 .

In the meantime, I also found another incompatibility in containers/podman#12320 (which might be trickier to solve).

@mscherer
Contributor

So, after discussion with the podman devs, we reached a consensus that the compatibility issue is tricky to solve, but it should be fixed. However, my fix is not the right one, and as I do not think there is an easy fix, it might take some time before it gets done.

In the meantime, there are a few ways that can be explored on the woodpecker side:

  • convince moby devs to not shorten the name when a full name is given. E.g., change https://github.com/moby/moby/blob/master/client/image_pull.go#L22-L29
  • do not use docker.io for the image requests. I stumbled on the bug because I used the custom image made by @6543 on Maintain a forked plugins/git #303 (comment), but if the image had been on quay.io, the bug would not have been triggered. Hence my question regarding self hosting the plugin images or having a vhost dedicated to them in Maintain a forked plugins/git #303
  • instruct people wishing to use podman to change the configuration to only search docker.io (registries.search in /etc/containers/registries.conf). On Fedora/Centos/RHEL, there are at least Quay and Docker.io, so podman will not allow short names without asking, but maybe it works on Debian without any specific configuration
  • ask those users to add a shortname alias for the plugins they use. Bonus: this allows saying "git" and doing the right thing
  • instruct people to download the image manually before using it
  • add a little http proxy between podman and woodpecker-agent to change the parameter

(to be clear, the last one is not a serious proposal)
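
To make the short-name distinction in the list above concrete, here is a small Go sketch that guesses whether an image reference would reach podman as a short name. It relies only on the behavior described in this comment (references on docker.io get shortened by the client, references on other registries such as quay.io are passed through). The helper names `registryOf` and `arrivesAsShortName` are hypothetical, and the host detection is a simplified heuristic, not the actual moby or podman parsing code.

```go
package main

import (
	"fmt"
	"strings"
)

// registryOf returns the registry host of an image reference, or "" when
// the reference has none. Simplified heuristic (an assumption): the first
// path component counts as a host if it contains '.' or ':' or is "localhost".
func registryOf(ref string) string {
	i := strings.IndexRune(ref, '/')
	if i == -1 {
		return ""
	}
	first := ref[:i]
	if strings.ContainsAny(first, ".:") || first == "localhost" {
		return first
	}
	return ""
}

// arrivesAsShortName reports whether, per the description above, the
// reference would be sent to podman without a registry host.
func arrivesAsShortName(ref string) bool {
	switch registryOf(ref) {
	case "", "docker.io":
		return true
	}
	return false
}

func main() {
	refs := []string{
		"plugins/git",
		"docker.io/a6543/test_git_plugin:latest",
		"quay.io/fedora/fedora:latest",
	}
	for _, ref := range refs {
		fmt.Printf("%-45s short name at podman: %v\n", ref, arrivesAsShortName(ref))
	}
}
```

A check like this could be used to warn users that an image needs a shortname alias or a manual podman pull, without changing how images are requested.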

@6543 6543 added the agent label Dec 1, 2021
@9p4
Contributor

9p4 commented Feb 6, 2022

Ok, it seems like using Podman rootless works just fine for me. What I did was manually link docker.sock to the user's podman socket and use a version of Podman>=3.4.3 (because of a bug that would not create the paths in storage containers/podman#11842).

@6543 6543 added this to the 1.0.0 milestone Feb 6, 2022
@mscherer
Contributor

Podman 4.0 was released yesterday (the announcement is not yet published, and will likely not be until next week), and it should now fix the short name issue I faced, so it should work out of the box. I will do a quick test later.

@9p4
Contributor

9p4 commented Feb 18, 2022

With PR #763, setting the DOCKER_SOCK variable to a Podman socket works now (so no more linking required).

@mscherer
Contributor

mscherer commented May 6, 2022

As Fedora 36 will be out very soon, I upgraded my CI, and I can confirm that podman 4.0 works out of the box if the socket is correctly set (e.g., either with the variable, or by using the rpm podman-docker or equivalent).

I guess we can close this issue after adding some documentation?

@6543 6543 changed the title Support podman as backend Support "native" podman as backend May 7, 2022
@6543
Member

6543 commented May 7, 2022

docker compatibility mode works now ... -> #901 - so renamed the issue

@NewRedsquare

Can someone send an example? I can't get it working with either of these commands:

  • podman container run --privileged --rm --tty -v /run/user/1000/podman/podman.sock:/var/user/1000/podman/podman.sock -e WOODPECKER_SERVER=IP -e WOODPECKER_AGENT_SECRET=xxxxxxxxxx -e WOODPECKER_BACKEND=docker -e DOCKER_HOST=unix:///run/user/1000/podman/podman.sock --network=host docker.io/woodpeckerci/woodpecker-agent:latest
{"level":"error","error":"Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"could not kill container '0_333181564417296953_clone'"}
{"level":"error","error":"Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"could not remove container '0_333181564417296953_clone'"}
{"level":"error","error":"Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"could not kill container '0_333181564417296953_stage_0'"}
{"level":"error","error":"Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"could not remove container '0_333181564417296953_stage_0'"}
{"level":"error","error":"Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"could not remove volume '0_333181564417296953_default'"}
{"level":"error","error":"Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"could not remove network '0_333181564417296953_default'"}
{"level":"error","error":"rpc error: code = Unknown desc = Proc finished with exitcode 1, Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"grpc error: wait(): code: Unknown: rpc error: code = Unknown desc = Proc finished with exitcode 1, Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?"}
{"level":"warn","repo":"romain/cv","build":"23","id":"73","error":"rpc error: code = Unknown desc = Proc finished with exitcode 1, Cannot connect to the Docker daemon at unix:///run/user/1000/podman/podman.sock. Is the docker daemon running?","time":"2022-06-05T17:31:42Z","message":"cancel signal received"}

but : curl -s --unix-socket /run/user/1000/podman/podman.sock http://d/v1.0.0/libpod/info | jq gives me normal output

  • another error when running : podman container run --privileged --rm --tty -v /run/user/1000/podman/podman.sock:/var/run/docker.sock -e WOODPECKER_SERVER=IP -e WOODPECKER_AGENT_SECRET=xxxxxxxxxx -e WOODPECKER_BACKEND=docker docker.io/woodpeckerci/woodpecker-agent:latest
"level":"error","error":"Error response from daemon: can only kill running containers. xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is in state created: container state improper","time":"2022-06-05T17:35:10Z","message":"could not kill container '0_5049337694948908527_clone'"}
{"level":"error","error":"Error response from daemon: no container with name or ID \"0_5049337694948908527_step_0\" found: no such container","time":"2022-06-05T17:35:10Z","message":"could not kill container '0_5049337694948908527_stage_0'"}
{"level":"error","error":"Error: No such container: 0_5049337694948908527_step_0","time":"2022-06-05T17:35:10Z","message":"could not remove container '0_5049337694948908527_stage_0'"}
{"level":"error","error":"rpc error: code = Unknown desc = Proc finished with exitcode 1, Error response from daemon: \"slirp4netns\" is not supported: invalid network mode","time":"2022-06-05T17:35:10Z","message":"grpc error: wait(): code: Unknown: rpc error: code = Unknown desc = Proc finished with exitcode 1, Error response from daemon: \"slirp4netns\" is not supported: invalid network mode"}
{"level":"warn","repo":"romain/cv","build":"25","id":"79","error":"rpc error: code = Unknown desc = Proc finished with exitcode 1, Error response from daemon: \"slirp4netns\" is not supported: invalid network mode","time":"2022-06-05T17:35:10Z","message":"cancel signal received"}

i'm kinda lost

@9p4
Contributor

9p4 commented Jun 6, 2022

I think your DOCKER_HOST has to be unix:///var/user/1000/podman/podman.sock

@6543 6543 modified the milestones: 1.0.0, 1.1.0 Dec 25, 2022
@major137

major137 commented May 9, 2023

> Can someone send an example ? i can't get it working using both commands :
>
>   • podman container run --privileged --rm --tty -v /run/user/1000/podman/podman.sock:/var/user/1000/podman/podman.sock -e WOODPECKER_SERVER=IP -e WOODPECKER_AGENT_SECRET=xxxxxxxxxx -e WOODPECKER_BACKEND=docker -e DOCKER_HOST=unix:///run/user/1000/podman/podman.sock --network=host docker.io/woodpeckerci/woodpecker-agent:latest

If you mount -v /run/user/1000/podman/podman.sock:/var/user/1000/podman/podman.sock,
DOCKER_HOST should point to /var/user/1000/podman/podman.sock (the path inside the container), not /run/user/1000/podman/podman.sock

I managed to make it work on Fedora 38 with the latest version of podman in rootless mode:

  • First install podman-remote and activate the socket: systemctl enable --now podman.socket
  • Only then you can mount a volume: --volume=/run/user/1000/podman/podman.sock:/var/run/docker.sock
  • I used /var/run/docker.sock as it's the default value woodpecker is looking for ; no need to set DOCKER_HOST
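
Since the failures above come down to the agent dialing a socket path that does not exist inside its container, a quick check like the following Go sketch can help when debugging. It is only an illustration under stated assumptions: `socketPath` and `checkSocket` are hypothetical helpers, only `unix://` addresses are handled, and the fallback to /var/run/docker.sock mirrors the default mentioned in the steps above.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
	"time"
)

// defaultSocket is the path used when DOCKER_HOST is not set.
const defaultSocket = "/var/run/docker.sock"

// socketPath turns a DOCKER_HOST value into the filesystem path to dial.
// Only unix:// addresses are handled in this sketch.
func socketPath(dockerHost string) (string, error) {
	if dockerHost == "" {
		return defaultSocket, nil
	}
	const prefix = "unix://"
	if !strings.HasPrefix(dockerHost, prefix) {
		return "", fmt.Errorf("not a unix socket address: %q", dockerHost)
	}
	return strings.TrimPrefix(dockerHost, prefix), nil
}

// checkSocket tries to connect to the unix socket and closes it again.
func checkSocket(path string) error {
	conn, err := net.DialTimeout("unix", path, 2*time.Second)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	path, err := socketPath(os.Getenv("DOCKER_HOST"))
	if err != nil {
		fmt.Println("cannot check:", err)
		return
	}
	if err := checkSocket(path); err != nil {
		fmt.Printf("socket %s is not reachable: %v\n", path, err)
		return
	}
	fmt.Printf("socket %s is reachable\n", path)
}
```

Run inside the agent container, this has to use the container-side path of the volume mount, which is exactly the mismatch pointed out above.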

@Taywee

Taywee commented Sep 4, 2023

In case anybody else has issues with the clone step never finishing when using podman as a backend, check this containers/podman#19581

Effectively, set the contents of your containers.conf (in my case, it's ~/.config/containers/containers.conf) to

[containers]
log_driver="json-file"

[engine]
events_logger="file"

I guess it was reporting events to a different location than the podman service expected to find them. Now I have woodpecker running entirely through podman! This is important to me, because I need to use CI to run podman build, which I couldn't get working in a docker container, no matter what I tried. I also am really happy to not have to run a rootful container service to run my woodpecker agent.

With the above config, I was able to run woodpecker with a Podman backend, but in order to run podman build in podman, I needed also:

[containers]
label=false
devices=["/dev/fuse"]
default_capabilities = [
    "CHOWN",
    "DAC_OVERRIDE",
    "FOWNER",
    "FSETID",
    "KILL",
    "NET_BIND_SERVICE",
    "SETFCAP",
    "SETGID",
    "SETPCAP",
    "SETUID",
    "SYS_CHROOT",
    "SYS_ADMIN",
    "MKNOD",
]

This allows podman (buildah, I guess?) to do the stuff that it needs to do to mount and run rootful containers in the podman rootless container. I couldn't get rootless-in-rootless working, even when following the information in this guide, but the issues I was running into might have been specific to the container in question, and I might have been able to fix it with some hackery.

edit: And now socketed podman won't run for me at all, due to some debug logging from conmon. Oh well; I tried.

@pat-s pat-s modified the milestones: 2.0.0, 2.x.x Oct 13, 2023
@kaylynb
Contributor

kaylynb commented Dec 1, 2023

It looks like v2 broke woodpecker-agent running in a container via podman.

In v1.0.5 I regularly get these errors when running pipelines but everything still works fine, including buildx, etc:

Nov 19 12:35:54 alzirr systemd-woodpecker-agent[5908]: {"level":"error","error":"Error response from daemon: can only kill running containers. d4dba47c6b966c981d03ff1cc89f6679fc7bac975c38e8cf7b252e55bb8b225d is in state exited: container state improper","time":"2023-11-19T20:35:54Z","message":"could not kill container 'wp_01hfmmqea1258g2a1c93tz73h2_0_stage_0'"}

But in v2.0.0 & latest HEAD (237b225 at time of this post) it seems to actually fail on this error now:

Dec 01 11:51:17 alzirr systemd-woodpecker-agent2[8964]: {"level":"error","error":"rpc error: code = Unknown desc = Step finished with exit code 1, Error response from daemon: can only kill running containers. 6231ebb6daff67daebbe56144ea658041a087a08d1aa8d9c185901d894b63c35 is in state exited: container state improper","time":"2023-12-01T11:51:17-08:00","message":"grpc error: wait(): code: Unknown: rpc error: code = Unknown desc = Step finished with exit code 1, Error response from daemon: can only kill running containers. 6231ebb6daff67daebbe56144ea658041a087a08d1aa8d9c185901d894b63c35 is in state exited: container state improper"}
Dec 01 11:51:17 alzirr systemd-woodpecker-agent2[8964]: {"level":"warn","repo":"jaam/skeetcrawl","pipeline":"0","id":"40","error":"rpc error: code = Unknown desc = Step finished with exit code 1, Error response from daemon: can only kill running containers. 6231ebb6daff67daebbe56144ea658041a087a08d1aa8d9c185901d894b63c35 is in state exited: container state improper","time":"2023-12-01T11:51:17-08:00","message":"cancel signal received"}

I didn't have the time to look through the large diff between v1 & v2 but it does appear that this error should actually be caught here:

func isErrContainerNotFoundOrNotRunning(err error) bool {
	// Error response from daemon: Cannot kill container: ...: No such container: ...
	// Error response from daemon: Cannot kill container: ...: Container ... is not running"
	// Error: No such container: ...
	return err != nil && (strings.Contains(err.Error(), "No such container") || strings.Contains(err.Error(), "is not running"))
}
