Support arm64 #166

Closed
vielmetti opened this issue Dec 11, 2018 · 82 comments

Comments


@vielmetti vielmetti commented Dec 11, 2018

Device under test is a Packet c1.large.arm 96-core arm64 machine running Ubuntu 18.04.

ed@ed-2a-bcc-llvm:~$ go version
go version go1.11.2 linux/arm64
ed@ed-2a-bcc-llvm:~$ go get sigs.k8s.io/kind
ed@ed-2a-bcc-llvm:~$ go/bin/kind create cluster
Creating cluster 'kind-1' ...
 ✓ Ensuring node image (kindest/node:v1.12.2)  
 ✓ [kind-1-control-plane] Creating node container 📦 
 ✗ [kind-1-control-plane] Fixing mounts 🗻 
Error: failed to create cluster: exit status 1
Usage:  
  kind create cluster [flags]
        
Flags:  
      --config string   path to a kind config file
  -h, --help            help for cluster
      --image string    node docker image to use for booting the cluster
      --name string     cluster context name (default "1")
      --retain          retain nodes for debugging when cluster creation fails
      --wait duration   Wait for control plane node to be ready (default 0s)

Global Flags:
      --loglevel string   logrus log level [panic, fatal, error, warning, info, debug] (default "warning")

failed to create cluster: exit status 1
ed@ed-2a-bcc-llvm:~$ docker version
Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov  7 00:52:41 2018
 OS/Arch:           linux/arm64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       4d60db4
  Built:            Wed Nov  7 00:17:01 2018
  OS/Arch:          linux/arm64
  Experimental:     false
Author

@vielmetti vielmetti commented Dec 11, 2018

Looking at https://github.com/kubernetes-sigs/kind/blob/master/pkg/cluster/context.go#L196-L206 where the code errors out, there's a TODO for logging from @BenTheElder .

Author

@vielmetti vielmetti commented Dec 11, 2018

Looks like the underlying reason is that kindest/node is not multiarchitecture.

ed@ed-2a-bcc-llvm:~$ docker run --rm mplatform/mquery kindest/node:v1.12.2
Image: kindest/node:v1.12.2
 * Manifest List: No
 * Supports: amd64/linux
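
For readers who want to reproduce this check without the mquery image, here is a minimal sketch. It assumes you already have the JSON that `docker manifest inspect <image>` prints (or the manifest fetched from the registry some other way). The function names `supported_platforms` and `is_multi_arch` and the sample document are hypothetical; the field names (`manifests`, `platform.os`, `platform.architecture`) follow the Docker manifest list / OCI image index layout.

```python
import json
from typing import List


def supported_platforms(manifest_json: str) -> List[str]:
    """Return "os/arch" strings from a manifest list.

    A plain single-image manifest has no "manifests" key, so it yields [].
    """
    doc = json.loads(manifest_json)
    platforms = []
    for entry in doc.get("manifests", []):
        plat = entry.get("platform", {})
        os_name = plat.get("os")
        arch = plat.get("architecture")
        if os_name and arch:
            platforms.append(f"{os_name}/{arch}")
    return platforms


def is_multi_arch(manifest_json: str) -> bool:
    """True when the manifest list names more than one distinct platform."""
    return len(set(supported_platforms(manifest_json))) > 1


# Illustrative input only: a made-up manifest list, not real registry output.
EXAMPLE_LIST = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"platform": {"os": "linux", "architecture": "amd64"}},
        {"platform": {"os": "linux", "architecture": "arm64"}},
    ],
})

print(supported_platforms(EXAMPLE_LIST))
```

An image that is not a manifest list, like the one checked above, yields an empty list here; its platform has to be read from the image config instead.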

Member

@tao12345666333 tao12345666333 commented Dec 11, 2018

Member

@neolit123 neolit123 commented Dec 12, 2018

/priority important-longterm
/kind feature

@BenTheElder BenTheElder added this to the 2019 goals milestone Dec 12, 2018
@BenTheElder BenTheElder changed the title from "kind create cluster fails on arm64 in fixing mounts" to "Support arm64" Dec 13, 2018
@BenTheElder BenTheElder mentioned this issue Dec 19, 2018
@BenTheElder BenTheElder removed this from the 2019 goals milestone Dec 19, 2018
@BenTheElder BenTheElder added this to the 1.0 milestone Dec 19, 2018
@BenTheElder BenTheElder added this to To do in 1.0 via automation Dec 19, 2018
Member

@BenTheElder BenTheElder commented Dec 19, 2018

@dims hacked up a working version of this: https://paste.fedoraproject.org/paste/gdlF9fqXeSADK-aPN-sEbw/raw 🎉
I think we might want to put a little more thought into how to do this well like #188, but this should be doable. Tentatively tracking this for kind 1.0 in Q1 2019.


@pbnj pbnj commented Apr 25, 2019

We're yet to see details around this announcement, but it ought to simplify the process of building ARM images significantly: https://techcrunch.com/2019/04/24/docker-partners-with-arm/

Member

@BenTheElder BenTheElder commented Apr 25, 2019

yes!

Also, I somehow forgot to update this issue: we have arm64 support, just no published images yet (that will need some more thinking...). If you build images yourself, kind should work on arm64 today 😅

Member

@BenTheElder BenTheElder commented Apr 25, 2019

Author

@vielmetti vielmetti commented Apr 25, 2019

My fervent hope is that the new Docker tooling will make multi-architecture manifests much easier to produce. The current setup for most projects is just a bit complex.

Member

@BenTheElder BenTheElder commented May 3, 2019

So FWIW, I think I did figure out how to work manifest-tool. This definitely seems feasible this year, if a bit clunky. The trickiest part now is that we need to write some tooling to cross-compile the kind image (or coordinate with kicking off a build on Packet or ... 🤔).

I think I'd like to get this into GCB-based publishing with a cross-compile so we can start automating the publishing of multi-arch node images. I punted on looking further into that while working on the breaking image changes in #461, but those are in now :-)

Contributor

@aojea aojea commented May 3, 2019

@BenTheElder we can use CircleCI or Travis to do that, I see that another project under kubernetes-sig has the integrations enabled https://github.com/kubernetes-sigs/kubeadm-dind-cluster

I have experience building pipelines on those; it would be relatively easy to create a pipeline there that automatically publishes the images per PR or per tag.

Member

@BenTheElder BenTheElder commented May 3, 2019

Contributor

@aojea aojea commented May 6, 2019

hmm, I think I have to read more about prow https://github.com/kubernetes/test-infra/blob/master/prow/jobs.md

Member

@BenTheElder BenTheElder commented May 15, 2019

Minor update: I've ensured the ip-masq-agent pushes multi-arch (manifest) images upstream before we adopted it, and our own tiny networking daemon is cross compiled and pushing multi-arch images.

These were simpler than Kubernetes node images, but at least we have some more multi-arch samples and we continue to nominally work on arm64.

I think the next step is to support building node images from Kubernetes release tarballs so we can consume Kubernetes's upstream cross-compilation output and save time building & publishing. Then we start building manifest list images from those.
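
To make the "manifest list from per-architecture images" step concrete, here is a rough sketch of the command sequence. Note the assumptions: it uses the `docker manifest` subcommands (experimental in Docker CLI releases of that era) rather than the manifest-tool mentioned earlier in the thread, the helper name `manifest_list_commands` is hypothetical, and the image names are placeholders, not kind's real publishing targets. It only builds the argument lists; nothing is executed.

```python
from typing import Dict, List


def manifest_list_commands(target: str, per_arch_images: Dict[str, str]) -> List[List[str]]:
    """Build the docker CLI invocations that combine per-arch images into one manifest list.

    per_arch_images maps a Docker architecture name (e.g. "arm64") to an
    image reference that has already been pushed for that architecture.
    """
    arches = sorted(per_arch_images)
    images = [per_arch_images[arch] for arch in arches]

    # 1. Create the local manifest list that references every per-arch image.
    commands = [["docker", "manifest", "create", target] + images]

    # 2. Record os/arch for each entry explicitly.
    for arch in arches:
        commands.append([
            "docker", "manifest", "annotate",
            "--os", "linux", "--arch", arch,
            target, per_arch_images[arch],
        ])

    # 3. Push the manifest list to the registry.
    commands.append(["docker", "manifest", "push", target])
    return commands


# Placeholder image names, for illustration only.
for cmd in manifest_list_commands(
    "example.com/node:v1",
    {"amd64": "example.com/node-amd64:v1", "arm64": "example.com/node-arm64:v1"},
):
    print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine with a Docker CLI that has the manifest subcommands enabled.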

Author

@vielmetti vielmetti commented Mar 4, 2021

Latest try, following the install instructions:

root@altra-x:~# go version
go version go1.16 linux/arm64
root@altra-x:~# go/bin/kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.20.2) 🖼
 ✓ Preparing nodes 📦
 ✗ Writing configuration 📜
ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get kubernetes version from node: failed to get file: command "docker exec --privileged kind-control-plane cat /kind/version" failed with error: exit status 1
Command Output: Error response from daemon: Container a10eec8ba8ca354ef32ad893ce4ff0c0aa029124a2f45247ab53721f0e0e8e26 is not running

Device under test is a 160-core Ampere Altra.

Member

@BenTheElder BenTheElder commented Mar 4, 2021

@vielmetti the default node image is not multi-arch. please see discussion above.

Author

@vielmetti vielmetti commented Mar 4, 2021

Thanks @BenTheElder - per the discussion above I found https://hub.docker.com/r/rossgeorgiev/kind-node-arm64 and will continue with that for my testing.
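
What this workaround amounts to can be sketched as follows: on an arm64 host, pass an arm64 node image explicitly through the `--image` flag listed in the usage output earlier in this thread. The helper name `create_cluster_command` and the architecture table are hypothetical, and the image reference is simply the one named in the comment above, with no tag specified. This reflects the situation at the time of the comment, when the default node image was amd64-only.

```python
from typing import List

# Map `uname -m` style machine names to Docker/Go architecture names.
MACHINE_TO_DOCKER_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def create_cluster_command(machine: str, arm64_image: str) -> List[str]:
    """Build a `kind create cluster` invocation suited to the host architecture."""
    arch = MACHINE_TO_DOCKER_ARCH.get(machine)
    if arch is None:
        raise ValueError(f"unrecognized machine type: {machine!r}")

    cmd = ["kind", "create", "cluster"]
    if arch == "arm64":
        # The default image is assumed to be amd64-only, so override it.
        cmd += ["--image", arm64_image]
    return cmd


print(" ".join(create_cluster_command("aarch64", "rossgeorgiev/kind-node-arm64")))
```

On a real host, `platform.machine()` from the standard library can supply the first argument.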

Member

@BenTheElder BenTheElder commented Apr 2, 2021

This is still intended to ship in the next release. I've put up a PR for the current iteration #2176

Member

@BenTheElder BenTheElder commented May 14, 2021

kind @ HEAD should "just work" on arm64, but we need verification. Then we're ~prepared to release.


@lasdolphin lasdolphin commented May 14, 2021

> kind @ HEAD should "just work" on arm64, but we need verification. Then we're ~prepared to release.

Hi all. It should, but it doesn't. I'm getting "Failed to run kubelet" err="failed to run Kubelet: invalid configuration: cgroup-root [\"kubelet\"] doesn't exist" when using the kindest/node:v1.21.1 image.

Member

@BenTheElder BenTheElder commented May 14, 2021

@lasdolphin can you share more about the host OS etc? That error isn't related to architecture.

EDIT: we've seen that issue on AMD64 but don't have enough info yet, the thread for that is #2236 (comment)

EDIT2: @lasdolphin please add those details to #2236 instead of this thread, the fact that kubelet ran and crashed suggests that the arm64 aspect is working as intended, as far as we could test before hitting the other, unrelated issue.

Member

@BenTheElder BenTheElder commented May 15, 2021

https://twitter.com/JimEwald/status/1393411027329445890 we have a report for arm64 working here: Docker Desktop 3.3.3, macOS 11.3.1, Mac Mini/M1 Apple Silicon.


@lasdolphin lasdolphin commented May 15, 2021

> @lasdolphin can you share more about the host OS etc? That error isn't related to architecture.
>
> EDIT: we've seen that issue on AMD64 but don't have enough info yet, the thread for that is #2236 (comment)
>
> EDIT2: @lasdolphin please add those details to #2236 instead of this thread, the fact that kubelet ran and crashed suggests that the arm64 aspect is working as intended, as far as we could test before hitting the other, unrelated issue.

Docker version 20.10.6, build 370c289
macOS Big Sur 11.3
MacBook Air M1
Everything works with the alternative node image rossgeorgiev/kind-node-arm64.
I assume the issue is in k8s version 1.21.1.


@shuuji3 shuuji3 commented May 15, 2021

Hello 🙂 I have a similar environment to @lasdolphin's, but the current HEAD seems to work well for me. I created a Pod, Service, and Ingress without issues. I also deployed Contour following https://kind.sigs.k8s.io/docs/user/ingress and can access the Service via the Ingress from localhost.

Environment:

  • kind version: (use kind version):
❯ kind version
kind v0.11.0-alpha go1.16.3 darwin/arm64
  • Kubernetes version: (use kubectl version):
❯ k version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T21:10:45Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-14T10:11:15Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/arm64"}
  • Docker version: (use docker info):
❯ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
  compose: Docker Compose (Docker Inc., 2.0.0-beta.1)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 11
  Running: 2
  Paused: 0
  Stopped: 9
 Images: 25
 Server Version: 20.10.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.25-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 1.942GiB
 Name: docker-desktop
 ID: 5Y2M:UAVW:DGCD:DHOB:QFRH:DHKY:Y4FL:OZXY:4IRG:RSNQ:S6Z7:STMB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
❯ docker version
Client:
 Cloud integration: 1.0.14
 Version:           20.10.6
 API version:       1.41
 Go version:        go1.16.3
 Git commit:        370c289
 Built:             Fri Apr  9 22:46:57 2021
 OS/Arch:           darwin/arm64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.6
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8728dd2
  Built:            Fri Apr  9 22:44:13 2021
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • OS (e.g. from /etc/os-release):
    • MacBook Air (M1, 2020)
    • macOS Big Sur 11.3.1 (20E241)


@lasdolphin lasdolphin commented May 15, 2021


@nickolaev nickolaev commented May 15, 2021

Posting here too. Confirmed to work on Ubuntu Server 20.04 aarch64, in Parallels Desktop virtualisation on MacBook Air with M1:

nickolaev@u64:~/bin$ uname -a
Linux u64 5.4.0-73-generic #82-Ubuntu SMP Wed Apr 14 17:29:58 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
nickolaev@u64:~/bin$ ./kind create cluster --retain
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂
nickolaev@u64:~/bin$ kubectl cluster-info --context kind-kind
Kubernetes control plane is running at https://127.0.0.1:38131
CoreDNS is running at https://127.0.0.1:38131/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
nickolaev@u64:~/bin$
nickolaev@u64:~/bin$ docker version
Client: Docker Engine - Community
 Version:           20.10.6
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        370c289
 Built:             Fri Apr  9 22:45:59 2021
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.6
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8728dd2
  Built:            Fri Apr  9 22:44:09 2021
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Author

@vielmetti vielmetti commented May 15, 2021

Success on an Equinix Metal c1.large.arm (96 core ThunderX), the original machine that I opened this issue on.

Built with GO111MODULE="on" go get sigs.k8s.io/kind@HEAD && go/bin/kind create cluster

root@ny-c1-large-arm-01:~# uname -a
Linux ny-c1-large-arm-01 5.4.0-40-generic #44-Ubuntu SMP Mon Jun 22 23:59:48 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
root@ny-c1-large-arm-01:~# kubectl cluster-info --context kind-kind
Kubernetes control plane is running at https://127.0.0.1:43733
CoreDNS is running at https://127.0.0.1:43733/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
root@ny-c1-large-arm-01:~# docker version
Client:
 Version:           20.10.2
 API version:       1.41
 Go version:        go1.13.8
 Git commit:        20.10.2-0ubuntu1~20.04.2
 Built:             Tue Mar 30 21:33:43 2021
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.2
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.8
  Git commit:       20.10.2-0ubuntu1~20.04.2
  Built:            Mon Mar 29 19:10:09 2021
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.4.4-0ubuntu1~20.04.2
  GitCommit:        
 runc:
  Version:          spec: 1.0.2-dev
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        

Member

@BenTheElder BenTheElder commented May 15, 2021

Excellent, thank you all!

Yes -- it's not necessarily expected to work with kind v0.10. In particular, Kubernetes 1.21 had a breaking change, so the new test image is meant to be run with a newer kind binary.

We'll be releasing with this soon, probably sometime Monday. This is the last thing we really wanted to make sure we got into v0.11 😅
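
The compatibility note above can be turned into a quick pre-flight check. This is a rough sketch that assumes the rule of thumb implied by the comment: node images for Kubernetes 1.21 or newer want kind v0.11 or newer. That threshold is a reading of this comment, not a documented compatibility matrix, and the helper names `parse_version` and `kind_too_old` are hypothetical. The input format matches the `kind version` output and node image tags quoted elsewhere in this thread.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first vMAJOR.MINOR.PATCH triple from a string."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def kind_too_old(kind_version_output: str, node_image: str) -> bool:
    """Apply the assumed rule: Kubernetes >= 1.21 node images need kind >= 0.11."""
    kind_version = parse_version(kind_version_output)
    k8s_version = parse_version(node_image)
    return k8s_version[:2] >= (1, 21) and kind_version[:2] < (0, 11)


# Version string and image tag as they appear in this thread.
print(kind_too_old("kind v0.11.0-alpha go1.16.3 darwin/arm64", "kindest/node:v1.21.1"))
```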

Member

@BenTheElder BenTheElder commented May 18, 2021

I had a few other things going on today myself, but the key reason we've not released today is kind-ci/containerd-nightlies#19

runc rc93 has a regression that we're currently shipping @ HEAD. We thought we'd already moved past it by upgrading containerd, but we didn't actually pick up the fix due to the bug above.

That should be resolved after re-running the build with the linked patch there, and then updating the kind images. After that we should be pretty clear to release.

#2236 is concerning for non-systemd hosts, but there's no actionable root cause yet and we expect most container hosts will be running systemd, so no plan to block on that.

Member

@BenTheElder BenTheElder commented May 18, 2021

runc rc94 fixes are in. @amwat is cutting the release now.

Member

@BenTheElder BenTheElder commented May 18, 2021

https://github.com/kubernetes-sigs/kind/releases/tag/v0.11.0#contributors arm64 is out now.

SirNiggo added a commit to SirNiggo/kind that referenced this issue Jun 25, 2021