krunkit: Use unix socket for --restful-uri #20900

Draft
wants to merge 7 commits into
base: master

Conversation

nirs
Contributor

@nirs nirs commented Jun 7, 2025

This removes the single-machine limit and the need to manage TCP
ports. We use:

--restful-uri unix://$MINIKUBE_HOME/machines/NAME/krunkit.sock

The socket is created and deleted by krunkit.
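
As a rough illustration of the client side, here is a minimal Go sketch that builds the per-machine socket path and sends an HTTP request over it. The helper names (`socketPath`, `restfulURI`, `newUnixClient`), the machine name `krunkit`, and the vfkit-style `GET /vm/state` endpoint are assumptions for illustration, not code from this PR.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"os"
	"path/filepath"
	"time"
)

// socketPath returns $MINIKUBE_HOME/machines/NAME/krunkit.sock.
// (Hypothetical helper; the driver may build the path differently.)
func socketPath(minikubeHome, name string) string {
	return filepath.Join(minikubeHome, "machines", name, "krunkit.sock")
}

// restfulURI formats the value passed to krunkit --restful-uri.
func restfulURI(sock string) string {
	return "unix://" + sock
}

// newUnixClient returns an http.Client that always dials the unix
// socket, ignoring the host part of the request URL.
func newUnixClient(sock string) *http.Client {
	return &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", sock)
			},
		},
	}
}

func main() {
	sock := socketPath(os.Getenv("MINIKUBE_HOME"), "krunkit")
	fmt.Println("--restful-uri", restfulURI(sock))

	// Assumption: krunkit serves a vfkit-style GET /vm/state endpoint.
	// The host "krunkit" is a placeholder; only the socket is dialed.
	resp, err := newUnixClient(sock).Get("http://krunkit/vm/state")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Because each machine has its own directory, each socket path is unique and no port allocation is needed.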

NOTES:

Status:

  • The API call works with krunkit using a unix socket.
  • krunkit does not terminate, because graceful shutdown does not work with the minikube VM running kernel 6.6.

Based on #20826 for testing.

nirs and others added 7 commits June 7, 2025 22:21
- Update to longterm kernel 6.6.92[1]
- aarch64: Enable Virtio GPU, needed for krunkit driver

Generated by running:

    make iso-menuconfig-aarch64
    make linux-menuconfig-aarch64

    make iso-menuconfig-x86_64
    make linux-menuconfig-x86_64

This generated many changes in the configs; maybe they were previously
updated manually.

With this change we can boot krunkit with the built iso:

    % minikube start -p krunkit --driver krunkit --container-runtime containerd --iso-url file://$PWD/minikube-arm64-vgpu.iso
    😄  [krunkit] minikube v1.36.0 on Darwin 15.5 (arm64)
    ✨  Using the krunkit (experimental) driver based on user configuration
    👍  Starting "krunkit" primary control-plane node in "krunkit" cluster
    🔥  Creating krunkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
    📦  Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
        ▪ Generating certificates and keys ...
        ▪ Booting up control plane ...
        ▪ Configuring RBAC rules ...
    🔗  Configuring bridge CNI (Container Networking Interface) ...
    🔎  Verifying Kubernetes components...
        ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    🌟  Enabled addons: storage-provisioner, default-storageclass
    🏄  Done! kubectl is now configured to use "krunkit" cluster and "default" namespace by default

And now we have an accelerated GPU:

    $ tree /dev/dri
    /dev/dri
    |-- by-path
    |   |-- platform-a007000.virtio_mmio-card -> ../card0
    |   `-- platform-a007000.virtio_mmio-render -> ../renderD128
    |-- card0
    `-- renderD128

For example usage of the accelerated GPU see:
https://github.com/medyagh/ai-playground-minikube/tree/main/macos

[1] https://www.kernel.org/

libkrun virtio-net driver enables TSO offloading and checksum
offloading by default, so we must use vmnet-helper --enable-tso and
--enable-checksum-offload with krunkit. These options do not work with
vfkit.

krunkit is a tool to launch configurable virtual machines using the
libkrun platform, optimized for GPU accelerated virtual machines and AI
workloads on Apple silicon.

It is mostly compatible with vfkit; the driver is a simplified copy of
the vfkit driver. Unlike vfkit, krunkit is available only on Apple
silicon.

Changes compared to vfkit driver:
- krunkit requires a unix socket for networking, so we must use
  vmnet-helper.
- krunkit can be controlled only via HTTP, not via unix socket.
- krunkit does not support HardStop so we need to use SIGKILL.
- krunkit does not support --kernel-cmdline
- We must enable vmnet offloading, required for krunkit.
- The code was simplified since vmnet-helper is always used
- Code was cleaned up to use .ResolveStorePath()
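
The SIGKILL fallback mentioned in the list above can be sketched roughly as follows. This is a minimal illustration for Unix-like systems, not the driver's code: `forceKill` and `demo` are hypothetical names, and a `sleep` child process stands in for krunkit.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// forceKill sends SIGKILL to pid. A process that has already exited
// is treated as success.
func forceKill(pid int) error {
	p, err := os.FindProcess(pid)
	if err != nil {
		return err
	}
	err = p.Signal(syscall.SIGKILL)
	if errors.Is(err, os.ErrProcessDone) {
		return nil
	}
	return err
}

// demo starts a stand-in child process, kills it, and reports whether
// it was terminated by SIGKILL.
func demo() (bool, error) {
	cmd := exec.Command("sleep", "60")
	if err := cmd.Start(); err != nil {
		return false, err
	}
	if err := forceKill(cmd.Process.Pid); err != nil {
		return false, err
	}
	// Wait reaps the child; it returns an error for a signaled process,
	// so inspect the wait status instead.
	_ = cmd.Wait()
	ws, ok := cmd.ProcessState.Sys().(syscall.WaitStatus)
	return ok && ws.Signaled() && ws.Signal() == syscall.SIGKILL, nil
}

func main() {
	killed, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("terminated by SIGKILL:", killed)
}
```

SIGKILL cannot be caught, so the VM process gets no chance to clean up; files it owns, such as a socket, may need to be removed by the caller.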

Limitations:
- Only one machine can be created since we use the same port for krunkit
  --restful-uri. This should be fixed to use an unused port, or use a
  unix socket when unix socket is supported[1].
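
One way to pick an unused port, as the limitation above suggests, is to let the kernel choose one by listening on port 0. This is only a sketch: `freeTCPPort` and `restfulURI` are hypothetical helpers, and the `tcp://localhost:PORT` URI form is an assumption about what krunkit accepts.

```go
package main

import (
	"fmt"
	"net"
)

// freeTCPPort asks the kernel for an unused port by listening on
// port 0, then releases it. Another process can take the port before
// the caller binds it, so callers should be ready to retry.
func freeTCPPort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

// restfulURI formats a TCP --restful-uri value (assumed format).
func restfulURI(port int) string {
	return fmt.Sprintf("tcp://localhost:%d", port)
}

func main() {
	port, err := freeTCPPort()
	if err != nil {
		fmt.Println("cannot find a free port:", err)
		return
	}
	fmt.Println("--restful-uri", restfulURI(port))
}
```

The window between releasing the port and the VM binding it is inherently racy, which is one reason a per-machine unix socket is the simpler option.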

[1] containers/krunkit#47

Previously it was used only for vfkit, so we suggested falling back to
the `nat` network. This advice is not relevant to krunkit or to qemu
(which can also use vmnet-helper).

Change the error to recommend installing vmnet-helper. We need to think
about how we can recommend other networks for vfkit and qemu. Another
solution is to create an error for every driver+network combination, but
this seems hard to manage.

This is the same way that we test vfkit. This test is not running in the
CI.

Issues:
- Need to install and configure vmnet-helper (requires root).

This removes the single-machine limit and the need to manage TCP
ports. We use:

    --restful-uri unix://$MINIKUBE_HOME/machines/NAME/krunkit.sock

The socket is created and deleted by krunkit.

NOTES:
- Depends on containers/krunkit#51.
  To test this change, build and install krunkit from that PR.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress label Jun 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nirs
Once this PR has been reviewed and has the lgtm label, please assign spowelljr for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from prezha and spowelljr June 7, 2025 21:19
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes and needs-ok-to-test labels Jun 7, 2025
@k8s-ci-robot
Contributor

Hi @nirs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL label Jun 7, 2025
@medyagh
Member

medyagh commented Jun 8, 2025

@nirs would you like ISO to be built for this PR ? (for you to test it ?)

@nirs
Contributor Author

nirs commented Jun 8, 2025

> @nirs would you like ISO to be built for this PR ? (for you to test it ?)

Not needed, this adds one commit to #20826 depending on krunkit PR, and it does not work.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
3 participants