Add support for RISC-V #7778

Open
wants to merge 8 commits into
base: master

@chazapis chazapis commented Jun 14, 2023

Proposed Changes

Add support for the RISC-V architecture

Types of Changes

Add the architecture to the scripts involved in building and packaging K3s; no code changes.
At the moment, this is only really possible through cross-compilation, so an extra script takes care of preparing the cross-compile environment. The build was tested from inside an Ubuntu Docker image on an M1 MacBook (edit: updated to reflect latest changes).

Verification

Starting from a clean checkout of the code, one can cross-compile for RISC-V in an Ubuntu 22.04 container, by using the following command (edit: updated to reflect latest changes):

ARCH=riscv64 SKIP_IMAGE=true SKIP_VALIDATE=true SKIP_AIRGAP=true make

This will produce the dist/artifacts/k3s-riscv64 binary, which can be installed in a QEMU VM for verification.
I provide a pre-compiled binary and instructions here.
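
Before copying the artifact into a VM, a quick sanity check of its target architecture can save a boot cycle. The following is a minimal sketch, not part of this PR: the function name is_riscv_elf is hypothetical, and it only inspects the ELF e_machine field (243 is the value assigned to RISC-V), assuming a little-endian ELF file. It does not validate the ELF magic or anything else.

```shell
#!/bin/sh
# Sketch: check whether a file's ELF header declares RISC-V as the machine type.
# e_machine is the 2-byte field at offset 18 of the ELF header; EM_RISCV is 243.
# Assumes a little-endian ELF file, so the bytes read are "243 0".
is_riscv_elf() {
    # Read the two e_machine bytes as unsigned decimal numbers.
    bytes=$(od -An -tu1 -j18 -N2 "$1" 2>/dev/null) || return 1
    # Word-split the od output into the positional parameters.
    set -- $bytes
    [ "${1:-}" = "243" ] && [ "${2:-}" = "0" ]
}
```

For example, `is_riscv_elf dist/artifacts/k3s-riscv64 && echo "riscv64 binary"` would print the message only if the header field matches.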

Sample output from a RISC-V QEMU VM, to show what to expect:

root@ubuntu:~# uname -a
Linux ubuntu 5.19.0-1012-generic #13~22.04.1-Ubuntu SMP Thu Jan 12 15:34:31 UTC 2023 riscv64 riscv64 riscv64 GNU/Linux
root@ubuntu:~# systemctl --no-pager status k3s
● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2023-06-14 11:54:03 UTC; 21min ago
       Docs: https://k3s.io
    Process: 1636 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
    Process: 1638 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 1639 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 1640 (k3s-server)
      Tasks: 28
     Memory: 576.8M
        CPU: 13min 43.205s
     CGroup: /system.slice/k3s.service
             ├─1640 "/usr/local/bin/k3s server" "" "" "" "" "" "" "" "" "" "" "
             └─1671 "containerd " "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""

Jun 14 12:15:39 ubuntu k3s[1640]: E0614 12:15:39.202967    1640 kuberuntime_man…
Jun 14 12:15:39 ubuntu k3s[1640]: E0614 12:15:39.204361    1640 pod_workers…2m_k
Jun 14 12:15:47 ubuntu k3s[1640]: E0614 12:15:47.230229    1640 remote_runtime.…
Jun 14 12:15:47 ubuntu k3s[1640]: E0614 12:15:47.231011    1640 kuberuntime_san…
Jun 14 12:15:47 ubuntu k3s[1640]: E0614 12:15:47.231508    1640 kuberuntime_man…
Jun 14 12:15:47 ubuntu k3s[1640]: E0614 12:15:47.233205    1640 pod_workers…th-p
Jun 14 12:15:53 ubuntu k3s[1640]: E0614 12:15:53.273022    1640 remote_runtime.…
Jun 14 12:15:53 ubuntu k3s[1640]: E0614 12:15:53.273533    1640 kuberuntime_san…
Jun 14 12:15:53 ubuntu k3s[1640]: E0614 12:15:53.273851    1640 kuberuntime_man…
Jun 14 12:15:53 ubuntu k3s[1640]: E0614 12:15:53.274418    1640 pod_workers…2m_k
Hint: Some lines were ellipsized, use -l to show in full.
root@ubuntu:~# kubectl get nodes -o wide
NAME     STATUS   ROLES                  AGE   VERSION                      INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION        CONTAINER-RUNTIME
ubuntu   Ready    control-plane,master   22m   v1.27.2+k3s-b66a1183-dirty   10.0.2.15     <none>        Ubuntu 22.04.2 LTS   5.19.0-1012-generic   containerd://1.7.1-k3s1

Testing

This change is not covered by tests, and I am not sure how it could be, given the limited support for the required utilities on this architecture. RISC-V could initially be marked as an "experimental" architecture until the tools needed for testing are available.

Linked Issues

This PR is directly related to #7151.

User-Facing Change

NONE

Further Comments

This should ideally be integrated with the Makefile and the automated build system, but I am not sure how these can support cross-compiling K3s - hence the build script. If someone can provide some hints on how to integrate, please do, as it would be really helpful to have an automated K3s release for RISC-V. I will be happy to help. (edit: The changes are now integrated with the K3s automated build system).

Additionally, this PR is really the first step in the direction of RISC-V support. Utility images should also be made available for the architecture, starting with an image for a pause container. I have been unsuccessful in finding the sources and process of building rancher/mirrored-pause:3.6. Any pointers on that would also be very useful. (edit: Required images have been ported to the architecture, changes have been submitted as PRs to relevant projects, read on for details.)

@chazapis chazapis requested a review from a team as a code owner June 14, 2023 12:29
@chazapis
Author

chazapis commented Jun 14, 2023

Also note that this uses a prepackaged k3s-root that we compiled manually and made available on GitHub (edit: removed, as the PR has been merged in k3s-root). Once the relevant PR is accepted by k3s-root, the download script can use the "official" binaries.

@brandond
Contributor

I'm not sure we're ready to do this until we have some hardware on our side to support native building and testing for the CI runners. There are some talks occurring between SUSE and a RISC-V ISV to acquire development hardware.

We will also need to begin mirroring images for this platform; the rancher image mirror scripts currently filter it out, even if upstream does offer them - which many do not.
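
As a rough illustration of checking whether upstream offers an image for this platform, here is a minimal sketch. It is not taken from the rancher image mirror scripts; the function name has_riscv64 is hypothetical, and it assumes the input is manifest list JSON such as the output of `docker manifest inspect` for a multi-arch image.

```shell
#!/bin/sh
# Sketch: succeed if the manifest list JSON on stdin contains a riscv64 entry.
# Uses a plain pattern match on the "architecture" key instead of a JSON parser,
# so it is only a heuristic.
has_riscv64() {
    grep -Eq '"architecture"[[:space:]]*:[[:space:]]*"riscv64"'
}
```

A possible use, with "$image" standing in for any image reference: `docker manifest inspect "$image" | has_riscv64 && echo "riscv64 available"`.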

@chazapis
Author

Updated with a new commit that uses k3s-root v0.13.0 with RISC-V support.

I also had to use a patched version of runc so it allows images built for riscv64 to run (I submitted a PR, so these changes will be merged in the 1.1.x branch). With this change, and a RISC-V compatible pause container, I can now start Pods using K3s.

Next, I will look into integrating the build script (which has been moved into the scripts folder) into Dockerfile.dapper.

Contributor

@brandond brandond left a comment

I'd love to see what it looks like to add a drone pipeline to cross-build riscv64; are you up for taking that on?

@chazapis
Author

I'd love to see what it looks like to add a drone pipeline to cross-build riscv64; are you up for taking that on?

Yes, I'm looking into this. It looks like the only practical issue is that Dockerfile.dapper is based on Alpine that does not have all the requirements for cross-compiling available. My plan is to start with another Dockerfile, based on Debian (I have only tried Ubuntu up to now).

@brandond
Contributor

Alpine that does not have all the requirements for cross-compiling available

Hmm, that might be a challenge, as we intentionally build with alpine and musl libc. We wouldn't want to ship binaries built on a different platform for just one arch.

I think alpine has some helper scripts that are used to set up a cross build environment, see https://github.com/alpinelinux/aports/blob/master/scripts/bootstrap.sh

@chazapis
Author

Alpine that does not have all the requirements for cross-compiling available

Hmm, that might be a challenge, as we intentionally build with alpine and musl libc. We wouldn't want to ship binaries built on a different platform for just one arch.

I think alpine has some helper scripts that are used to set up a cross build environment, see https://github.com/alpinelinux/aports/blob/master/scripts/bootstrap.sh

Ok. I will try using that. Thanks.

@chazapis
Author

chazapis commented Jun 23, 2023

I just committed a version that uses Alpine to cross-build for RISC-V instead of Ubuntu. I implemented a scripts/prepare-cross that uses bootstrap.sh only to build the cross-compiler and then installs binary packages for the target architecture from upstream Alpine. The script is generic enough to be tailored for other architectures as well (for RISC-V, the riscv64 identifier is also used by Alpine; this may need to change for other platforms).
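
To illustrate the point about identifiers, here is a minimal sketch of the kind of mapping another platform might need. It is not the content of scripts/prepare-cross: the function name alpine_arch is hypothetical, and every row other than riscv64 is an assumption added for illustration.

```shell
#!/bin/sh
# Sketch: map a Go-style ARCH value to the architecture name used by Alpine.
# Only the riscv64 row comes from the discussion; the others are assumptions.
alpine_arch() {
    case "$1" in
        riscv64) echo riscv64 ;;   # same identifier on both sides
        amd64)   echo x86_64 ;;
        arm64)   echo aarch64 ;;
        arm)     echo armv7 ;;
        s390x)   echo s390x ;;
        *)       echo "unsupported ARCH: $1" >&2; return 1 ;;
    esac
}
```

With such a helper, a script could derive the Alpine name once (for example `target=$(alpine_arch "$ARCH")`) and fail early for architectures it does not know about.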

I have also integrated cross-building with the Dapper environment, meaning that I successfully built a binary on my laptop, by just running:

ARCH=riscv64 SKIP_IMAGE=true SKIP_VALIDATE=true SKIP_AIRGAP=true make

SKIP_IMAGE is a new flag. ARCH was there - I just added it to the flags that Dapper passes over to the build container.
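
For readers unfamiliar with this style of flag, the following is a minimal sketch of how a SKIP_* variable can gate a build stage. It is an assumption about the general pattern, not the actual logic in the K3s Makefile or scripts; the helper name run_unless_skipped is hypothetical.

```shell
#!/bin/sh
# Sketch: run a command unless the named SKIP_* variable is set to "true".
# Usage: run_unless_skipped FLAG_NAME command [args...]
run_unless_skipped() {
    flag_name=$1
    shift
    # Reject anything that is not a plain variable name before using eval.
    case "$flag_name" in
        ''|*[!A-Za-z0-9_]*) echo "invalid flag name" >&2; return 2 ;;
    esac
    # Look up the value of the variable whose name is in flag_name.
    eval "flag_value=\${$flag_name:-}"
    if [ "$flag_value" = "true" ]; then
        echo "skipping (${flag_name}=true)"
        return 0
    fi
    "$@"
}
```

For example, `run_unless_skipped SKIP_IMAGE echo "building image"` prints the skip message when SKIP_IMAGE=true is set in the environment and runs the command otherwise.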

@codecov

codecov bot commented Jun 23, 2023

Codecov Report

Patch coverage has no change and project coverage change: +4.31% 🎉

Comparison is base (d8ae6ef) 47.05% compared to head (301b356) 51.37%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7778      +/-   ##
==========================================
+ Coverage   47.05%   51.37%   +4.31%     
==========================================
  Files         143      143              
  Lines       14561    14561              
==========================================
+ Hits         6852     7481     +629     
+ Misses       6616     5890     -726     
- Partials     1093     1190      +97     
Flag Coverage Δ
e2etests 49.23% <ø> (?)
inttests 44.37% <ø> (-0.06%) ⬇️
unittests 19.86% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

see 41 files with indirect coverage changes


@chazapis
Author

The pipeline produces a working RISC-V binary.

Running in QEMU:

root@ubuntu:~# wget https://k3s-ci-builds.s3.us-east-1.amazonaws.com/k3s-riscv64-604a9b056440205ad253ac7eb7c50133e7c380f7
...
root@ubuntu:~# cp k3s-riscv64-* /usr/local/bin/k3s
root@ubuntu:~# chmod +x /usr/local/bin/k3s
root@ubuntu:~# curl -sfL https:/INSTALL_K3S_EXEC="server --disable traefik,metrics-server --pause-image carvicsforth/pause-riscv:v3.9-v1.27.2" INSTALL_K3S_SKIP_DOWNLOAD="true" bash k3s-install.sh
[INFO]  Skipping k3s download and verify
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s
root@ubuntu:~# kubectl get nodes -o wide
NAME     STATUS   ROLES                  AGE   VERSION                INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION        CONTAINER-RUNTIME
ubuntu   Ready    control-plane,master   50s   v1.27.3+k3s-604a9b05   10.0.2.15     <none>        Ubuntu 22.04.2 LTS   5.19.0-1012-generic   containerd://1.7.1-k3s1
root@ubuntu:~# kubectl apply -f https://raw.githubusercontent.com/CARV-ICS-FORTH/kubernetes-riscv64/main/ubuntu.yaml
pod/ubuntu created
root@ubuntu:~# kubectl get pods -o wide
NAME     READY   STATUS    RESTARTS   AGE   IP          NODE     NOMINATED NODE   READINESS GATES
ubuntu   1/1     Running   0          61s   10.42.0.4   ubuntu   <none>           <none>
root@ubuntu:~# kubectl exec -it ubuntu -- cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
root@ubuntu:~# 

@chazapis
Author

Rebased.

Signed-off-by: Antony Chazapis <chazapis@ics.forth.gr>
…d script in scripts folder

@chazapis
Author

chazapis commented Jul 21, 2023

Rebased, updated runc with the upstream version that supports RISC-V, and resolved all discussions.

The current status is:

  • Both K3s dependencies (k3s-root and runc) have released versions that support RISC-V.
  • This PR produces a K3s RISC-V binary using the default CI pipeline (with Dapper and Alpine Linux).
  • All required containers for running Pods and the default K3s services have been ported to RISC-V and changes have been pushed upstream (I reference this PR in other projects, so the links should be visible here).

If there was a way to define the required container images per architecture, the RISC-V version would work out of the box. Now, you can either patch running Pods to use the correct images or use a prebuilt binary with the riscv64 images already set. These images are (compare with the default image list):

docker.io/carvicsforth/klipper-helm:v0.8.0-build20230716
docker.io/carvicsforth/klipper-lb:v0.4.4
docker.io/carvicsforth/local-path-provisioner:master-head
docker.io/carvicsforth/coredns:1.10.1
docker.io/riscv64/busybox:1.34.1
docker.io/carvicsforth/traefik:2.10.3
docker.io/carvicsforth/metrics-server:v0.6.3
docker.io/carvicsforth/pause:v3.9-v1.27.2
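
As a minimal sketch of the "patch running Pods" option, the helper below prints a `kubectl set image` command for a few of the images above. It makes assumptions that are not confirmed here: that each component runs as a Deployment in kube-system and that its container name matches the Deployment name. The function names are hypothetical, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: look up the riscv64 image for a component (values from the list above).
riscv64_image() {
    case "$1" in
        coredns)                echo docker.io/carvicsforth/coredns:1.10.1 ;;
        traefik)                echo docker.io/carvicsforth/traefik:2.10.3 ;;
        metrics-server)         echo docker.io/carvicsforth/metrics-server:v0.6.3 ;;
        local-path-provisioner) echo docker.io/carvicsforth/local-path-provisioner:master-head ;;
        *)                      return 1 ;;
    esac
}

# Sketch: print (not run) the command that would switch a Deployment's image.
# Assumes the container name equals the Deployment name.
print_patch_command() {
    image=$(riscv64_image "$1") || {
        echo "no riscv64 image known for $1" >&2
        return 1
    }
    echo "kubectl -n kube-system set image deployment/$1 $1=$image"
}
```

Printing the command first makes it easy to review before running it against a cluster.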

Sample output from QEMU:

root@ubuntu:~# kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
ubuntu   Ready    control-plane,master   9h    v1.27.3+k3s-9d376dfb-dirty
root@ubuntu:~# kubectl get pods -A
NAMESPACE     NAME                                      READY   STATUS      RESTARTS      AGE
kube-system   helm-install-traefik-crd-49rhb            0/1     Completed   0             9h
kube-system   helm-install-traefik-psnrr                0/1     Completed   2             9h
kube-system   svclb-traefik-c0171ed3-9fjr9              2/2     Running     2 (20m ago)   9h
kube-system   traefik-8657d6b9f4-j2znf                  1/1     Running     1 (20m ago)   9h
kube-system   coredns-97b598894-nf45g                   1/1     Running     1 (20m ago)   9h
kube-system   metrics-server-7c55d89d5d-nppqs           1/1     Running     0             9h
kube-system   local-path-provisioner-6d44f4f9d7-gvmv6   1/1     Running     2 (19m ago)   9h
root@ubuntu:~# kubectl top pods -A
NAMESPACE     NAME                                      CPU(cores)   MEMORY(bytes)   
kube-system   coredns-97b598894-nf45g                   27m          14Mi            
kube-system   local-path-provisioner-6d44f4f9d7-gvmv6   5m           6Mi             
kube-system   metrics-server-7c55d89d5d-nppqs           41m          17Mi            
kube-system   svclb-traefik-c0171ed3-9fjr9              0m           2Mi             
kube-system   traefik-8657d6b9f4-j2znf                  23m          27Mi            

Signed-off-by: Antony Chazapis <chazapis@ics.forth.gr>
@brandond
Contributor

I was going to try to rebase/squash this onto current master, but you didn't set the PR to allow maintainers to contribute to the branch.

@brandond
Contributor

brandond commented Sep 11, 2023

Rebased/squashed PR looks good on my cluster of Unmatched boards. I just can't run anything on it because none of our images currently support riscv64; I'm working on building images for that.

unmatched01:~ # kubectl get node -o wide
NAME                    STATUS   ROLES                       AGE   VERSION                INTERNAL-IP   EXTERNAL-IP   OS-IMAGE              KERNEL-VERSION     CONTAINER-RUNTIME
unmatched01.lan.khaus   Ready    control-plane,etcd,master   27m   v1.28.1+k3s-b44521db   10.0.1.182    <none>        openSUSE Tumbleweed   6.4.12-1-default   containerd://1.7.3-k3s2
unmatched02.lan.khaus   Ready    control-plane,etcd,master   19m   v1.28.1+k3s-b44521db   10.0.1.212    <none>        openSUSE Tumbleweed   6.4.12-1-default   containerd://1.7.3-k3s2
unmatched03.lan.khaus   Ready    control-plane,etcd,master   14m   v1.28.1+k3s-b44521db   10.0.1.188    <none>        openSUSE Tumbleweed   6.4.12-1-default   containerd://1.7.3-k3s2

unmatched01:~ # kubectl describe node unmatched01.lan.khaus
Name:               unmatched01.lan.khaus
Roles:              control-plane,etcd,master
Labels:             beta.kubernetes.io/arch=riscv64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=riscv64
                    kubernetes.io/hostname=unmatched01.lan.khaus
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/etcd=true
                    node-role.kubernetes.io/master=true
                    node.kubernetes.io/instance-type=k3s
Annotations:        etcd.k3s.cattle.io/node-address: 10.0.1.182
                    etcd.k3s.cattle.io/node-name: unmatched01.lan.khaus-eaad4846
                    flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"2a:55:42:6d:2c:da"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.0.1.182
                    k3s.io/hostname: unmatched01.lan.khaus
                    k3s.io/internal-ip: 10.0.1.182
                    k3s.io/node-args:
                      ["server","--token","********","--cluster-init","true","--disable-servicelb","true","--disable","coredns","--disable","traefik","--disable...
                    k3s.io/node-config-hash: WSHDY5WIYDGUS2ZMBAHTM43P2NZGQLSFI2O6OJYZJ4UFH5TL2VAQ====
                    k3s.io/node-env: {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/80d685a8843169e5974bea73582cc22a05f19746bdb4ecc381eeb29d1c691e2e"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 11 Sep 2023 18:45:03 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  unmatched01.lan.khaus
  AcquireTime:     <unset>
  RenewTime:       Mon, 11 Sep 2023 19:12:55 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 11 Sep 2023 19:12:47 +0000   Mon, 11 Sep 2023 19:02:34 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 11 Sep 2023 19:12:47 +0000   Mon, 11 Sep 2023 19:02:34 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 11 Sep 2023 19:12:47 +0000   Mon, 11 Sep 2023 19:02:34 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Mon, 11 Sep 2023 19:12:47 +0000   Mon, 11 Sep 2023 19:02:34 +0000   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.0.1.182
  Hostname:    unmatched01.lan.khaus
Capacity:
  cpu:                4
  ephemeral-storage:  961544732Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16359916Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  935390714556
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16359916Ki
  pods:               110
System Info:
  Machine ID:                 e655aba1732d4799bd8bbcbaf8a3f4ab
  System UUID:                e655aba1732d4799bd8bbcbaf8a3f4ab
  Boot ID:                    0ddbcbc0-1f30-4b4c-ab4d-240eedba6496
  Kernel Version:             6.4.12-1-default
  OS Image:                   openSUSE Tumbleweed
  Operating System:           linux
  Architecture:               riscv64
  Container Runtime Version:  containerd://1.7.3-k3s2
  Kubelet Version:            v1.28.1+k3s-b44521db
  Kube-Proxy Version:         v1.28.1+k3s-b44521db
PodCIDR:                      10.42.0.0/24
PodCIDRs:                     10.42.0.0/24
ProviderID:                   k3s://unmatched01.lan.khaus
Non-terminated Pods:          (0 in total)
  Namespace                   Name    CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----    ------------  ----------  ---------------  -------------  ---
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
  hugepages-1Gi      0 (0%)    0 (0%)
  hugepages-2Mi      0 (0%)    0 (0%)

@chazapis
Author

chazapis commented Sep 12, 2023

I was going to try to rebase/squash this onto current master, but you didn't set the PR to allow maintainers to contribute to the branch.

@brandond, I'm searching for a way to change this, but it looks like it is not supported, since the fork is under an organization account... I see however that you created another branch with the code.

Regarding the images, I have made progress in changing all required images to support RISC-V. I track the current status here. I have submitted PRs to several related projects, of which only CoreDNS has already accepted the changes and released a RISC-V image. You can find ready-made images under our Docker Hub account. I also maintain another branch in our K3s fork with updated manifests, until the changes are merged upstream.

@brandond
Contributor

Yeah, the image thing will be interesting. Getting upstream to add support for it is part of it, but we'll also need to make some changes to https://github.com/rancher/image-mirror to add support for mirroring the risc-v architecture, without changing existing manifest lists.

For the CI config, I'd like to discuss with the team perhaps making risc-v only run on nightly and tag, instead of PR, as it ties up the already over-utilized amd64 runners for a significant chunk of time.
