
Support podman in k8s #1531

Merged: 37 commits merged into jupyterhub:main on Dec 14, 2022

Conversation

@sgaist (Contributor) commented Sep 16, 2022

This pull request is a refactor that allows alternatives to Docker to be used as the image builder, as requested in #1513. In this case, support for Podman is implemented.

Podman is a daemonless open source alternative to Docker that can be used as a drop-in replacement in many cases. Aside from working directly on the command line, it can run as a service (only on Linux at the time of writing) offering a Docker-compatible API that can be used in a similar fashion. This service mode is what this PR uses to implement image build support.

The PR rewrites the DaemonSet that provides the Docker image build service.
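
For reference, Podman's service mode boils down to a single command, the same one used in the new daemonset's args (the socket path here is the chart's default; treat it as a placeholder):

  # run podman's Docker-compatible API service on a unix socket, with no inactivity timeout
  podman system service --time=0 unix:///var/run/pink/docker.sock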

The newly introduced imageBuilderType parameter lets users select which image builder to use:

  • local: same as what setting dind.enabled to false used to do
  • dind: same as what setting dind.enabled to true used to do
  • pink: new builder using Podman in place of Docker

The daemonset name pink is an acronym for "podman in kubernetes".

Configuration example:

service:
  type: ClusterIP

config:
  BinderHub:
    debug: true
    hub_url: "https://my.host.com/jupyter"

    use_registry: true
    image_prefix: "my.registry.com/my-project/my-image-"

    build_docker_host: "unix:///local/fake_var/run/pink/docker.sock"
    build_image: "docker.io/sgaist/repo2podman:latest"

imageBuilderType: pink

pink:
  hostStorageDir: /local/fake_var/lib/pink/storage/
  hostSocketDir: /local/fake_var/run/pink/

imageCleaner:
  host:
    enabled: false

ingress:
  enabled: true
  ingressClassName: nginx
  hosts:
    - my.host.com

One advantage of this new implementation is that it should allow the addition of new build services without requiring core modifications.

With Podman added as a builder option, BinderHub administrators can either keep using the already provided repo2docker (since Podman offers a Docker-compatible API) or use repo2podman, a repo2docker plugin that relies on the Podman client. Depending on the use case, some environment variables might be needed to configure it, so this patch adds support for them through the new extra_envs trait added to KubernetesBuildExecutor.

As an example of its use:

03-Repo2Podman: |
    class Repo2PodmanBuild(KubernetesBuildExecutor):
        extra_envs = {
            "CONTAINER_HOST": "unix:///path/to/podman.sock",
            "REGISTRY_AUTH_FILE": "/path/to/.docker/config.json",
        }

        def get_r2d_cmd_options(self):
            return ["--engine=podman"] + super().get_r2d_cmd_options()

    c.BinderHub.build_class = Repo2PodmanBuild

This example shows a configuration that enables the --remote option for Podman as well as provides the path to the credentials to be used when pushing to the registry.
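
As a rough sketch of the mechanism described above (this is not BinderHub's actual implementation; the class name, the get_env helper, and the placeholder variable are all illustrative), the extra_envs trait can be pictured as a configurable dictionary merged into the build container's environment:

  from traitlets import Dict
  from traitlets.config import Configurable

  class BuildExecutorSketch(Configurable):
      """Illustrative only: a configurable extra_envs trait feeding a pod's env."""

      extra_envs = Dict(
          help="Extra environment variables to set in the build container.",
          config=True,
      )

      def get_env(self):
          # environment the builder always sets (placeholder value)
          env = {"BUILDER_PLACEHOLDER": "1"}
          # user-supplied variables take precedence over the defaults
          env.update(self.extra_envs)
          # return in the Kubernetes container "env" list format
          return [{"name": k, "value": v} for k, v in env.items()]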

@sgaist (Contributor, Author) commented Sep 16, 2022

Small note: as this is preliminary work, I have not yet written the documentation related to it.

@sgaist (Contributor, Author) commented Sep 21, 2022

Following this month's JupyterHub meeting, @manics and I continued a bit on the podman front, discussing the podman engine for repo2docker as well as the content of this patch. One thing I was not aware of is that, besides podman being a drop-in replacement for docker in terms of command-line interface, it now also implements a docker-compatible API when run as a service.
Therefore I ran a BinderHub deployment test with the pink daemonset enabled, the standard Build class, and the image cleaner configured to watch podman's storage folder.

All these together worked quite nicely.

This means that the environment variable addition is not strictly required for the goal of using podman in place of docker for the pod building the images. However, it does allow people to choose whether they want to go full podman, full docker, or a mix of both (which would be the default).
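
For completeness, a sketch of that tested combination in chart values (paths are placeholders, and the image cleaner wiring for podman's storage directory is left out here):

  imageBuilderType: pink

  pink:
    hostStorageDir: /var/lib/pink/storage/   # the directory the image cleaner watches
    hostSocketDir: /var/run/pink/

  config:
    BinderHub:
      use_registry: true
      # the standard Build class talking to podman's Docker-compatible socket
      build_docker_host: "unix:///var/run/pink/docker.sock"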

This is not a Kubernetes Service but podman running as a service, which
allows for a client/server setup like the one we currently have with docker.

This service can be used in place of the one defined by dind so podman
can be used as a build engine while keeping the cache of images
available. This is a privileged pod in the same fashion as the dind one
but it runs as the podman user. The builder pods can then connect to the
service to do the actual work.

The current Build class makes some legitimate assumptions about docker
use, which is not an issue except for some naming mismatches that could be
a bit misleading when setting up podman as a replacement. It also does
some affinity setup which we currently have to lie to in order to create
the builder pods on the same machine as the service pod.

Finally, there are two environment variables that must be set in order
for the client to be properly configured. The first one is
CONTAINER_HOST, which triggers the "remote" mode of podman, and the
second is REGISTRY_AUTH_FILE, to ensure podman grabs the proper Docker
credential file. Currently these require re-implementing the submit
method of the builder class, as there is no way to pass them as
configuration options.
Up until now the build pod did not need any custom environment variables.
However, in order for podman to work as we would like, this is a simple
and effective way to do it. The other option would be to modify the podman
engine and add support for additional arguments there that could pass
similar values.

For example the equivalent of
  CONTAINER_HOST="unix:///var/run/docker.sock"
as arguments would be:
  --remote --url unix:///var/run/docker.sock

This would require a bit of redesign of the repo2podman engine in order
to pass these parameters everywhere a command is called. The benefit of
the environment variables is that they allow changing the behaviour
without code changes.
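
To make the two approaches concrete, the same remote connection can be expressed either way on the podman command line (a sketch; the socket path is a placeholder):

  # environment-variable form, which the extra_envs approach relies on
  CONTAINER_HOST="unix:///var/run/docker.sock" podman info

  # equivalent command-line form, which would have to be threaded through repo2podman
  podman --remote --url "unix:///var/run/docker.sock" info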
@manics (Member) commented Sep 23, 2022

The diff between the dind and pink K8s templates is fairly small

diff --color -Nur helm-chart/binderhub/templates/dind/daemonset.yaml helm-chart/binderhub/templates/pink/daemonset.yaml
--- helm-chart/binderhub/templates/dind/daemonset.yaml	2021-09-18 20:57:59.971717855 +0100
+++ helm-chart/binderhub/templates/pink/daemonset.yaml	2022-09-23 13:29:38.484322020 +0100
@@ -1,20 +1,20 @@
-{{- if .Values.dind.enabled -}}
+{{- if .Values.pink.enabled -}}
 apiVersion: apps/v1
 kind: DaemonSet
 metadata:
-  name: {{ .Release.Name }}-dind
+  name: {{ .Release.Name }}-pink
 spec:
   updateStrategy:
     type: RollingUpdate
   selector:
     matchLabels:
-      name:  {{ .Release.Name }}-dind
+      name:  {{ .Release.Name }}-pink
   template:
     metadata:
       labels:
-        name: {{ .Release.Name }}-dind
+        name: {{ .Release.Name }}-pink
         app: binder
-        component: dind
+        component: dind  # Lying to build so current affinity rules can work
         release: {{ .Release.Name }}
         heritage: {{ .Release.Service }}
     spec:
@@ -28,47 +28,59 @@
         operator: Equal
         value: user
       nodeSelector: {{ .Values.config.BinderHub.build_node_selector | default dict | toJson }}
-      {{- with .Values.dind.initContainers }}
+      {{- with .Values.pink.initContainers }}
       initContainers:
         {{- . | toYaml | nindent 8 }}
       {{- end }}
+      {{- with .Values.pink.daemonset.image.pullSecrets }}
+      imagePullSecrets:
+        {{- . | toYaml | nindent 8 }}
+      {{- end }}
+      securityContext:
+        fsGroup: 1000
+
       containers:
-      - name: dind
-        image: {{ .Values.dind.daemonset.image.name }}:{{ .Values.dind.daemonset.image.tag }}
+      - name: pink
+        image: {{ .Values.pink.daemonset.image.name }}:{{ .Values.pink.daemonset.image.tag }}
+        imagePullPolicy: {{ .Values.pink.daemonset.image.pullPolicy }}
+
         resources:
-          {{- .Values.dind.resources | toYaml | nindent 10 }}
+          {{- .Values.pink.resources | toYaml | nindent 10 }}
         args:
-          - dockerd
-          - --storage-driver={{ .Values.dind.storageDriver }}
-          - -H unix://{{ .Values.dind.hostSocketDir }}/docker.sock
-          {{- with .Values.dind.daemonset.extraArgs }}
-          {{- . | toYaml | nindent 10 }}
-          {{- end }}
+          - podman
+          - system
+          - service
+          - --time=0
+          - unix:///var/run/pink/docker.sock
         securityContext:
           privileged: true
+          runAsUser: 1000  # podman default user id
+
         volumeMounts:
-        - name: dockerlib-dind
-          mountPath: /var/lib/docker
-        - name: rundind
-          mountPath: {{ .Values.dind.hostSocketDir }}
-        {{- with .Values.dind.daemonset.extraVolumeMounts }}
+        - mountPath: /home/podman/.local/share/containers/storage/
+          name: podman-containers
+        - mountPath: /var/run/pink/
+          name: podman-socket
+
+        {{- with .Values.pink.daemonset.extraVolumeMounts }}
         {{- . | toYaml | nindent 8 }}
         {{- end }}
-        {{- with .Values.dind.daemonset.lifecycle }}
+        {{- with .Values.pink.daemonset.lifecycle }}
         lifecycle:
           {{- . | toYaml | nindent 10 }}
         {{- end }}
       terminationGracePeriodSeconds: 30
       volumes:
-      - name: dockerlib-dind
+      - name: podman-containers
         hostPath:
-          path: {{ .Values.dind.hostLibDir }}
-          type: DirectoryOrCreate
-      - name: rundind
+          path: {{ .Values.pink.hostStorageDir }}
+          type: Directory
+      - name: podman-socket
         hostPath:
-          path: {{ .Values.dind.hostSocketDir }}
-          type: DirectoryOrCreate
-      {{- with .Values.dind.daemonset.extraVolumes }}
+          path: {{ .Values.pink.hostSocketDir }}
+          type: Directory
+
+      {{- with .Values.pink.daemonset.extraVolumes }}
       {{- . | toYaml | nindent 6 }}
       {{- end }}
 {{- end }}

so a couple of alternative options are:

Parameterise the dind templates:

  • Make .Values.dind.securityContext fully configurable
  • Allow the dind volumeMounts to be overridden (or perhaps have an enable/disable switch, and fully configure podman through .Values.dind.daemonset.extraVolumeMounts and .Values.dind.daemonset.extraVolumes)

Add conditionals to the dind templates, e.g.

        args:
       {{- if eq .Values.containerBuilderPod "dind" }}
          - dockerd
          - --storage-driver={{ .Values.dind.storageDriver }}
          - -H unix://{{ .Values.dind.hostSocketDir }}/docker.sock
       {{- end }}
       {{- if eq .Values.containerBuilderPod "pink" }}
          - podman
          - system
          - service
          - --time=0
          - unix:///var/run/pink/docker.sock
       {{- end }}
        {{- with .Values.dind.daemonset.extraArgs }}
        {{- . | toYaml | nindent 10 }}
        {{- end }}

If we decide on this route I think deprecating .Values.dind.enabled and replacing it with something like .Values.containerBuilderPod will be more extensible in the future.
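
As a rough sketch of how that deprecation could look in the templates (illustrative Helm logic only, not the final chart code):

  {{- /* sketch: honour the deprecated flag while preferring the new value */ -}}
  {{- $builder := .Values.containerBuilderPod | default "local" -}}
  {{- if and .Values.dind.enabled (not .Values.containerBuilderPod) -}}
    {{- $builder = "dind" -}}
  {{- end -}}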

@sgaist (Contributor, Author) commented Sep 27, 2022

I like the idea of a configurable containerBuilderPod

Should we also consider modifying the dind label used to keep build pods on the same node as the builder pod itself?

@manics (Member) commented Oct 9, 2022

Maybe we could do this in two PRs? This one for getting podman to work (so a configurable containerBuilderPod with no breaking changes), then a separate PR for renaming the dind label to something more generic, since it's an internal breaking change. In fact, a PR to change the label could be opened now, since it'll be a fairly minor conflict to resolve, and it means others can agree/disagree!

sgaist added a commit to sgaist/binderhub that referenced this pull request Oct 10, 2022
In parallel with the work from jupyterhub#1531, this change will allow the use of
builders other than Docker without being tied to Docker's nomenclature, which
might make the whole setup a bit confusing when deploying and debugging.
In a similar fashion to what the dind daemonset does, create the pink
folder under /var/run if it does not exist.
The new architecture should make it easier to integrate other container
build systems.
This is now handled through the containerBuilderPod enumeration.
@sgaist (Contributor, Author) commented Oct 28, 2022

As per your excellent idea I merged the two templates into a "new one" that covers both cases and should be easy to extend further.

I haven't yet touched the securityContext part, as I wanted to first ensure that the new template would fit the bill.

@sgaist force-pushed the podman_in_k8s branch 2 times, most recently from 18d33a4 to 289d2b3 on November 10, 2022 14:01
@manics (Member) commented Dec 14, 2022

Sounds fine to me as well, but given the length of this PR how does everyone feel about merging as is, and doing changes in a follow-up?

@consideRatio (Member) left a comment
Amazing work @sgaist!!!! Thank you for being so thorough with this!

@manics, this LGTM - agree to go for a merge?

@consideRatio (Member) commented

(Not to be blocking a merge as this can be done after merge)

  • @sgaist could you review the PR description a final time so that it doesn't include outdated descriptions of what this PR ended up doing?

@manics (Member) commented Dec 14, 2022

Merging!

@manics merged commit c64d689 into jupyterhub:main on Dec 14, 2022
@manics changed the title from Podman in k8s to Breaking: Podman in k8s on Dec 14, 2022
consideRatio pushed a commit to jupyterhub/helm-chart that referenced this pull request Dec 14, 2022
@manics (Member) commented Dec 14, 2022

This broke the BinderHub CI upgrade test on main:

- name: "(Upgrade) Install ${{ matrix.upgrade-from }} chart"

Run . ./ci/common

Installing already released binderhub version 1.0.0-0.dev.git.2920.hc64d689
using old config
Error: INSTALLATION FAILED: execution error at (binderhub/templates/NOTES.txt:194:4): 

#################################################################################
######   BREAKING: The config values passed contained no longer accepted    #####
######             options. See the messages below for more details.        #####
######                                                                      #####
######             To verify your updated config is accepted, you can use   #####
######             the `helm template` command.                             #####
#################################################################################

CHANGED:

dind:
  enabled: true

must as of version 0.3.0 be replace by

imageBuilderType: dind

On the bright side we've now tested the upgrade message 😃

@consideRatio (Member) commented

Thanks for following up @manics!!

@sgaist deleted the podman_in_k8s branch on December 15, 2022 07:36
@sgaist (Contributor, Author) commented Dec 15, 2022

Thanks for the reviews, guys!

@consideRatio I have updated the description and I think it's now complete.

@manics can you double-check it in case I missed something?

@consideRatio (Member) commented

@sgaist thanks! Can you provide a leading paragraph, one or two sentences, to introduce what podman is in the PR description as well?

Relevant for the paragraph:

@sgaist (Contributor, Author) commented Dec 15, 2022

@consideRatio Sure thing!
Done in three sentences, but I can rework it if needed.

@consideRatio (Member) commented Dec 15, 2022

Perfect @sgaist thanks, looks great!

sgaist added a commit to sgaist/binderhub that referenced this pull request Dec 15, 2022
This is a follow-up to jupyterhub#1531.

This patch refactors the creation and handling of the image cleaner daemonset.

The current behaviour makes it so that if one doesn't disable the host cleaner
explicitly when using dind (or pink), they would end up with two cleaners
instead of one.

The new behaviour avoids this issue by ensuring that only the container matching
the selected imageBuilderType is created. The same goes for the volumes handling.

In order to have consistent naming, the "local" imageBuilderType enumeration
value is replaced with "host".

It also deprecates the imageCleaner.host.enabled parameter, as you either
want image cleanup or you don't.
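
A minimal sketch of that conditional (the container names, image value keys, and omitted mounts are illustrative, not the chart's actual spec):

      containers:
        {{- if eq .Values.imageBuilderType "dind" }}
        - name: image-cleaner-dind
          image: {{ .Values.imageCleaner.image.name }}:{{ .Values.imageCleaner.image.tag }}
          # dind-specific volume mounts for docker's storage would go here
        {{- else if eq .Values.imageBuilderType "pink" }}
        - name: image-cleaner-pink
          image: {{ .Values.imageCleaner.image.name }}:{{ .Values.imageCleaner.image.tag }}
          # podman-storage-specific volume mounts would go here
        {{- else }}
        - name: image-cleaner-host
          image: {{ .Values.imageCleaner.image.name }}:{{ .Values.imageCleaner.image.tag }}
          # the host's docker storage volume mounts would go here
        {{- end }}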
manics added a commit to manics/binderhub that referenced this pull request Dec 17, 2022
@minrk changed the title from Breaking: Podman in k8s to Support podman in k8s on May 11, 2023