Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add youki runtime support #8411

Merged
merged 1 commit into from
Jan 21, 2022

Conversation

electrocucaracha
Copy link
Contributor

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test

/kind feature

/kind flake

What this PR does / why we need it:
These changes enable youki runtime support for CRI-O

Which issue(s) this PR fixes:

Special notes for your reviewer:
In order to test this, it's required to set container_manager to crio and enable youki runtime via youki_enabled var.

Does this PR introduce a user-facing change?:

Add youki runtime support to CRI-O

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 11, 2022
@cristicalin
Copy link
Contributor

@electrocucaracha could you add a docs page explaining how to enable the feature and why a deployer would consider it?

Additionally, to avoid shipping broken releases, I think new CRI's should have some minimal molecule coverage and at least a CI job covering them, is youki mature enough to pass tests ?

@electrocucaracha
Copy link
Contributor Author

@electrocucaracha could you add a docs page explaining how to enable the feature and why a deployer would consider it?

Additionally, to avoid shipping broken releases, I think new CRI's should have some minimal molecule coverage and at least a CI job covering them, is youki mature enough to pass tests ?

@cristicalin I have requested more information about the state of the project. Regarding adding more molecule test, I'm in favor of this idea so maybe another PRs need to be created for covering crun, gvisor and kata-qemu runtimes before this one.

@gattytto
Copy link

@electrocucaracha could you add a docs page explaining how to enable the feature and why a deployer would consider it?

It will bring the size of the containers down just like Knative containers do for golang, avoiding the need of a base image that contains a distro, and letting place for a "from scratch" docker image that runs RUST built binaries natively.

@cristicalin
Copy link
Contributor

Regarding adding more molecule test, I'm in favor of this idea so maybe another PRs need to be created for covering crun, gvisor and kata-qemu runtimes before this one.

gvisor and kata-containers both have molecule coverage for the default container manager (containerd), crun is actually on my radar to add since I've not seen anyone keen on taking the time to do it. This does not mean that this new feature should go in without test coverage. As a best practice we should implement testing for new features as they go into the project to ease future maintenance burden.

It will bring the size of the containers down just like Knative containers do for golang, avoiding the need of a base image that contains a distro, and letting place for a "from scratch" docker image that runs RUST built binaries natively.

From the description it seems this is just a CRI implementation so how is this special for RUST_built_binaries_natively than crun or runc?

BTW, I'm not arguing to describe the use-case in this PR but to add a documentation entry in this PR to explain the new feature and how to use it and when.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 13, 2022
@electrocucaracha
Copy link
Contributor Author

@cristicalin I have implemented most of the suggestions but I'm going to need some help from @gattytto to solve the molecule issues given that I'm getting the following error in both distros (almalinux8 and ubuntu20):

    host = <testinfra.host.Host ansible://almalinux8>
    
        def test_run_pod(host):
            runtime = "youki"
    
            run_command = "/usr/local/bin/crictl run --with-pull --runtime {} /tmp/container.json /tmp/sandbox.json".format(runtime)
            with host.sudo():
                cmd = host.command(run_command)
    >       assert cmd.rc == 0
    E       assert 1 == 0
    E        +  where 1 = CommandResult(command=b"sudo /bin/sh -c '/usr/local/bin/crictl run --with-pull --runtime youki /tmp/container.json /tm...ner: run pod sandbox: rpc error: code = Unknown desc = error reading container (probably exited) json message: EOF"\n').rc
    
    tests/test_default.py:23: AssertionError
    _______________________ test_run_pod[ansible://ubuntu20] _______________________
    

@cristicalin
Copy link
Contributor

@electrocucaracha you can run the molecule test with --destroy never when doing development to be able to log in to the VM and inspect and try to run the tests manually.

@cristicalin
Copy link
Contributor

Looking at the discussion over on slack I think there is a need for further explanation of this runtime and how it should be used. Why does it need a special image to test? If it is not a general purpose CRI the documentation (part of our docs/ tree) should explain clearly the use-case and how to use it or link to documents on the CRI's own docs page explaining why and how to use it.

@electrocucaracha
Copy link
Contributor Author

electrocucaracha commented Jan 14, 2022

Looking at the discussion over on slack I think there is a need for further explanation of this runtime and how it should be used. Why does it need a special image to test? If it is not a general purpose CRI the documentation (part of our docs/ tree) should explain clearly the use-case and how to use it or link to documents on the CRI's own docs page explaining why and how to use it.

I'm concerned about the current state of this container runtime, it seems like there are few things to be implemented for kubernetes. Maybe this PR needs to be changed to draft.

@gattytto can help here to clarify this.

@gattytto
Copy link

Looking at the discussion over on slack I think there is a need for further explanation of this runtime and how it should be used. Why does it need a special image to test? If it is not a general purpose CRI the documentation (part of our docs/ tree) should explain clearly the use-case and how to use it or link to documents on the CRI's own docs page explaining why and how to use it.

I'm concerned about the current state of this container runtime, it seems like there are few things to be implemented for kubernetes. Maybe this PR needs to be changed to draft.

@gattytto can help here to clarify this.

I don't get the error you get when running hello-world with youki:

root@optimum-mayfly:~# kubectl get pod
NAME      READY   STATUS              RESTARTS   AGE
rustest   0/1     ContainerCreating   0          14s
root@optimum-mayfly:~# kubectl get pod
NAME      READY   STATUS      RESTARTS   AGE
rustest   0/1     Completed   0          15s
apiVersion: v1
kind: Pod
metadata:
  name: rustest
  labels:
    name: rust
spec:
  runtimeClassName: youki
  containers:
  - name: rust
    image: hello-world:latest

@gattytto
Copy link

I'm just new to rust so my container is a mess but hello world works in youki.

@electrocucaracha
Copy link
Contributor Author

Looking at the discussion over on slack I think there is a need for further explanation of this runtime and how it should be used. Why does it need a special image to test? If it is not a general purpose CRI the documentation (part of our docs/ tree) should explain clearly the use-case and how to use it or link to documents on the CRI's own docs page explaining why and how to use it.

I'm concerned about the current state of this container runtime, it seems like there are few things to be implemented for kubernetes. Maybe this PR needs to be changed to draft.
@gattytto can help here to clarify this.

I don't get the error you get when running hello-world with youki:

root@optimum-mayfly:~# kubectl get pod
NAME      READY   STATUS              RESTARTS   AGE
rustest   0/1     ContainerCreating   0          14s
root@optimum-mayfly:~# kubectl get pod
NAME      READY   STATUS      RESTARTS   AGE
rustest   0/1     Completed   0          15s
apiVersion: v1
kind: Pod
metadata:
  name: rustest
  labels:
    name: rust
spec:
  runtimeClassName: youki
  containers:
  - name: rust
    image: hello-world:latest

@gattytto I have tested youki runtime in a Kubernetes Cluster deployed with this changes, unfortunately the molecule tests (ansible unit tests) are using crictl to validate its operation. I'm sure that the error that I'm getting is a missing piece in the molecule tests

@electrocucaracha
Copy link
Contributor Author

electrocucaracha commented Jan 19, 2022

@electrocucaracha you can run the molecule test with --destroy never when doing development to be able to log in to the VM and inspect and try to run the tests manually.

@cristicalin apparently I'm not getting a RunPodSandboxResponse

vagrant@ubuntu20:/tmp$ sudo crictl config --set debug=true
vagrant@ubuntu20:/tmp$ sudo /usr/local/bin/crictl run --with-pull --runtime youki /tmp/container.json /tmp/sandbox.json 
DEBU[0000] get runtime connection                       
DEBU[0000] connect using endpoint 'unix:///var/run/crio/crio.sock' with '30s' timeout 
DEBU[0000] connected successfully using endpoint: unix:///var/run/crio/crio.sock 
DEBU[0000] get image connection                         
DEBU[0000] connect using endpoint 'unix:///var/run/crio/crio.sock' with '30s' timeout 
DEBU[0000] connected successfully using endpoint: unix:///var/run/crio/crio.sock 
DEBU[0000] RunPodSandboxRequest: &RunPodSandboxRequest{Config:&PodSandboxConfig{Metadata:&PodSandboxMetadata{Name:youki1,Uid:hdishd83djaidwnduwk28bcsb,Namespace:default,Attempt:1,},Hostname:,LogDirectory:/tmp,DnsConfig:nil,PortMappings:[]*PortMapping{},Labels:map[string]string{},Annotations:map[string]string{},Linux:&LinuxPodSandboxConfig{CgroupParent:,SecurityContext:nil,Sysctls:map[string]string{},},Windows:nil,},RuntimeHandler:youki,} 
DEBU[0000] RunPodSandboxResponse: nil                   
FATA[0000] running container: run pod sandbox: rpc error: code = Unknown desc = error reading container (probably exited) json message: EOF 

But using runc as runtime handler

vagrant@ubuntu20:/tmp$ sudo /usr/local/bin/crictl run --with-pull --runtime runc /tmp/container.json /tmp/sandbox.json 
DEBU[0000] get runtime connection                       
DEBU[0000] connect using endpoint 'unix:///var/run/crio/crio.sock' with '30s' timeout 
DEBU[0000] connected successfully using endpoint: unix:///var/run/crio/crio.sock 
DEBU[0000] get image connection                         
DEBU[0000] connect using endpoint 'unix:///var/run/crio/crio.sock' with '30s' timeout 
DEBU[0000] connected successfully using endpoint: unix:///var/run/crio/crio.sock 
DEBU[0000] RunPodSandboxRequest: &RunPodSandboxRequest{Config:&PodSandboxConfig{Metadata:&PodSandboxMetadata{Name:youki1,Uid:hdishd83djaidwnduwk28bcsb,Namespace:default,Attempt:1,},Hostname:,LogDirectory:/tmp,DnsConfig:nil,PortMappings:[]*PortMapping{},Labels:map[string]string{},Annotations:map[string]string{},Linux:&LinuxPodSandboxConfig{CgroupParent:,SecurityContext:nil,Sysctls:map[string]string{},},Windows:nil,},RuntimeHandler:runc,} 
DEBU[0000] RunPodSandboxResponse: &RunPodSandboxResponse{PodSandboxId:1d02b9c68531a1f61e8313d0c6e96af60ee7e103bdbbc4f9b762026e8923f31b,} 
DEBU[0000] PullImageRequest: &PullImageRequest{Image:&ImageSpec{Image:quay.io/kubespray/hello-world:latest,Annotations:map[string]string{},},Auth:nil,SandboxConfig:&PodSandboxConfig{Metadata:&PodSandboxMetadata{Name:youki1,Uid:hdishd83djaidwnduwk28bcsb,Namespace:default,Attempt:1,},Hostname:,LogDirectory:/tmp,DnsConfig:nil,PortMappings:[]*PortMapping{},Labels:map[string]string{},Annotations:map[string]string{},Linux:&LinuxPodSandboxConfig{CgroupParent:,SecurityContext:nil,Sysctls:map[string]string{},},Windows:nil,},} 
DEBU[0001] PullImageResponse: &PullImageResponse{ImageRef:quay.io/kubespray/hello-world@sha256:f54a58bc1aac5ea1a25d796ae155dc228b3f0e11d046ae276b39c4bf2f13d8c4,} 
DEBU[0001] CreateContainerRequest: &CreateContainerRequest{PodSandboxId:1d02b9c68531a1f61e8313d0c6e96af60ee7e103bdbbc4f9b762026e8923f31b,Config:&ContainerConfig{Metadata:&ContainerMetadata{Name:youki1,Attempt:0,},Image:&ImageSpec{Image:quay.io/kubespray/hello-world:latest,Annotations:map[string]string{},},Command:[],Args:[],WorkingDir:,Envs:[]*KeyValue{},Mounts:[]*Mount{},Devices:[]*Device{},Labels:map[string]string{},Annotations:map[string]string{},LogPath:youki1.0.log,Stdin:false,StdinOnce:false,Tty:false,Linux:&LinuxContainerConfig{Resources:nil,SecurityContext:nil,},Windows:nil,},SandboxConfig:&PodSandboxConfig{Metadata:&PodSandboxMetadata{Name:youki1,Uid:hdishd83djaidwnduwk28bcsb,Namespace:default,Attempt:1,},Hostname:,LogDirectory:/tmp,DnsConfig:nil,PortMappings:[]*PortMapping{},Labels:map[string]string{},Annotations:map[string]string{},Linux:&LinuxPodSandboxConfig{CgroupParent:,SecurityContext:nil,Sysctls:map[string]string{},},Windows:nil,},} 
DEBU[0001] CreateContainerResponse: &CreateContainerResponse{ContainerId:979f5e8f811aa5651722e9c670f1d131c5f1cf4bc1bf51b9fb10c1b85f57b8d7,} 
DEBU[0001] StartContainerRequest: &StartContainerRequest{ContainerId:979f5e8f811aa5651722e9c670f1d131c5f1cf4bc1bf51b9fb10c1b85f57b8d7,} 
DEBU[0001] StartContainerResponse: &StartContainerResponse{} 
979f5e8f811aa5651722e9c670f1d131c5f1cf4bc1bf51b9fb10c1b85f57b8d7
vagrant@ubuntu20:/tmp$ head -291 /etc/crio/crio.conf | tail -33
# The "crio.runtime.runtimes" table defines a list of OCI compatible runtimes.
# The runtime to use is picked based on the runtime_handler provided by the CRI.
# If no runtime_handler is provided, the runtime will be picked based on the level
# of trust of the workload. Each entry in the table should follow the format:
#
#[crio.runtime.runtimes.runtime-handler]
#  runtime_path = "/path/to/the/executable"
#  runtime_type = "oci"
#  runtime_root = "/path/to/the/root"
#
# Where:
# - runtime-handler: name used to identify the runtime
# - runtime_path (optional, string): absolute path to the runtime executable in
#   the host filesystem. If omitted, the runtime-handler identifier should match
#   the runtime executable name, and the runtime executable should be placed
#   in $PATH.
# - runtime_type (optional, string): type of runtime, one of: "oci", "vm". If
#   omitted, an "oci" runtime is assumed.
# - runtime_root (optional, string): root directory for storage of containers
#   state.

[crio.runtime.runtimes.runc]
runtime_path = "/usr/sbin/runc"
runtime_type = "oci"
runtime_root = "/run/runc"
privileged_without_host_devices = false
allowed_annotations = []
[crio.runtime.runtimes.youki]
runtime_path = "/usr/local/bin"
runtime_type = "oci"
runtime_root = "/run/youki"
privileged_without_host_devices = false
allowed_annotations = []

@electrocucaracha
Copy link
Contributor Author

@cristicalin @gattytto I have fixed the crictl issue, the crio.conf file was pointing to a directory instead of the binary in its runtime_path value, so now molecule tests are passing.

@cristicalin
Copy link
Contributor

There are still some errors preventing the molecule test from succeeding: https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/1989279436#L4816

@electrocucaracha
Copy link
Contributor Author

There are still some errors preventing the molecule test from succeeding: https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/jobs/1989279436#L4816

Apparently the cni config file was ignored but after include it, it seems to be working properly.

@cristicalin
Copy link
Contributor

Nice work @electrocucaracha , thanks for following through with this feature!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 21, 2022
@oomichi
Copy link
Contributor

oomichi commented Jan 21, 2022

Nice work @electrocucaracha

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: electrocucaracha, oomichi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 21, 2022
@k8s-ci-robot k8s-ci-robot merged commit e88aa7c into kubernetes-sigs:master Jan 21, 2022
@floryut floryut added kind/container-managers Containers section in the release note and removed kind/feature Categorizes issue or PR as related to a new feature. labels Jan 24, 2022
sakuraiyuta pushed a commit to sakuraiyuta/kubespray that referenced this pull request Apr 16, 2022
@oomichi oomichi mentioned this pull request May 28, 2022
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Jun 29, 2023
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Oct 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/container-managers Containers section in the release note lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants