Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cpu architecture detection issue on linux/arm #7636

Merged

Conversation

fangn2
Copy link
Contributor

@fangn2 fangn2 commented Nov 7, 2022

This PR is to fix cpu architecture detection issues #4109, #21, #1146 when images/containers in ARM arch were built/executed on x86 host. The root cause of the issue is the getCPUVariant function looks for "Cpu architecture" field in /proc/cpuinfo to determine the actual ARM variant. The function will fail if the ARM containers/images ran in a user-mode emulated environment(like qemu) on a x86 host whose /proc/cpuinfo doesn't have "Cpu architecture" field and whose content is shared with ARM containers.
The PR adds a fall back func to get cpu variant from uname machine field through a system call if the previous method failed.

This PR is inspired by the closed PR.
This PR also splits different platforms(Linux and ! Linux) into different go files.

Test method: local unit test(we don't have the test infrastructure(x86 host with ARM qemu) required in CI
Test environment setup:

  1. On x86_64 host
$ uname -a
Linux compute.internal 5.10.135-122.509.amzn2.x86_64 SMP Thu Aug 11 22:41:16 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  1. create an arm64 arch container with qemu
$ docker run --rm -v ~/containerd:/home/containerd -it --platform arm64 golang:1.19-buster
root@c2cedcd6f6e8:/go# uname -a
Linux c2cedcd6f6e8 5.10.135-122.509.amzn2.x86_64 #1 SMP Thu Aug 11 22:41:16 UTC 2022 aarch64 GNU/Linux
  1. Run unit test for TestCPUVariant prior to the PR change
root@a2670a5d75ca:/home/containerd/platforms# go test -v -run TestCPUVariant
=== RUN   TestCPUVariant
ERRO[0000] failure getting variant                       error="getCPUInfo for pattern: Cpu architecture: not found"
    cpuinfo_test.go:39: could not get valid variant as expected: [v8 v7 v6 v5 v4 v3]
--- FAIL: TestCPUVariant (0.03s)
FAIL
exit status 1
FAIL	github.com/containerd/containerd/platforms	0.130s
  1. Run unit test after the PR change
root@a2670a5d75ca:/home/containerd/platforms# go test -v -run TestCPUVariant
=== RUN   TestCPUVariant
    cpuinfo_test.go:34: got valid variant as expected: "v8" = "v8"
--- PASS: TestCPUVariant (0.03s)
PASS
ok  	github.com/containerd/containerd/platforms	0.128s

@k8s-ci-robot
Copy link

Hi @fangn2. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch from 0a3ea31 to df1b4bc Compare November 7, 2022 20:59
platforms/cpuinfo.go Outdated Show resolved Hide resolved
platforms/cpuinfo.go Outdated Show resolved Hide resolved
platforms/cpuinfo.go Outdated Show resolved Hide resolved
platforms/cpuinfo.go Outdated Show resolved Hide resolved
Copy link
Member

@kzys kzys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow https://github.com/containerd/project/blob/main/CONTRIBUTING.md#commit-messages regarding commit messages. Some of the information you put in the PR should be in the commit message too.

@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch 2 times, most recently from 913825a to c024ef7 Compare November 16, 2022 00:12
platforms/cpuinfo.go Outdated Show resolved Hide resolved
platforms/cpuinfo.go Outdated Show resolved Hide resolved
@henry118
Copy link
Member

A more generic question:

Isn't the proc filesystem supposed to provide an interface to kernel data structures? Even under QEMU isn't it expected to expose the cpuinfo of the emulated guest OS?

Just wonder if we are actually trying to work around an issue that should be fixed at some lower level. Or did I misunderstand something?

platforms/cpuinfo_linux.go Outdated Show resolved Hide resolved
platforms/cpuinfo_linux.go Outdated Show resolved Hide resolved
platforms/cpuinfo_linux.go Outdated Show resolved Hide resolved
platforms/cpuinfo_linux.go Outdated Show resolved Hide resolved
@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch 9 times, most recently from d9d02fe to 3454bc2 Compare November 28, 2022 16:52
@fangn2 fangn2 marked this pull request as draft November 29, 2022 03:59
@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch 2 times, most recently from 9fc133b to bec21d5 Compare November 29, 2022 16:52
@fangn2 fangn2 marked this pull request as ready for review November 29, 2022 18:02
@fangn2 fangn2 requested review from henry118 and kzys and removed request for henry118 and kzys November 29, 2022 18:10
@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch 2 times, most recently from c69ad08 to 9bfec67 Compare December 19, 2022 22:26
@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch 4 times, most recently from 8c7ff37 to 12169f3 Compare December 20, 2022 01:22
@fangn2 fangn2 marked this pull request as ready for review December 20, 2022 01:35
Copy link
Member

@kzys kzys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks almost good!

platforms/cpuinfo.go Outdated Show resolved Hide resolved
platforms/cpuinfo_linux_test.go Outdated Show resolved Hide resolved
@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch 3 times, most recently from 5c58cc4 to 01f1598 Compare December 20, 2022 19:54
Copy link
Member

@kzys kzys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good. Can you clean up the commit history?

When images/containers in ARM arch were built/executed on x86 host,
getCPUVariant will fail as it tries to look for /proc/cpuinfo, whose
content is from the host. Adding a new method as fallback to check uname
machine when it happens.

Signed-off-by: Tony Fang <nenghui.fang@gmail.com>
Add unit test to function GetCPUVariantFromArch
Fix import issue on non-linux platforms
Fix some style issue

Signed-off-by: Tony Fang <nenghui.fang@gmail.com>
@fangn2 fangn2 force-pushed the fix-cpu-architecture-detection-issue-on-arm branch from 01f1598 to 6e55234 Compare December 20, 2022 22:00
@fangn2
Copy link
Contributor Author

fangn2 commented Dec 20, 2022

Done, commit history cleaned up.

Copy link
Member

@kzys kzys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'm fine merging the two commits. If we revert the first commit, we have to revert the second as well, so they are not logically separated. But not a blocker.

@dmcgowan dmcgowan merged commit c0c3546 into containerd:main Dec 21, 2022
@fangn2 fangn2 deleted the fix-cpu-architecture-detection-issue-on-arm branch December 21, 2022 20:02
@roshanr95
Copy link

Any (rough) idea when this will make it into a release? Seems like it didn't make 1.6.15...

@fangn2
Copy link
Contributor Author

fangn2 commented Jan 27, 2023

It's part of the coming 1.7 release.

I am not sure if maintainers are ok to backport this change to 1.6. Open for suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

9 participants