New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1789260: Disable cpu info metric on non x86 arches #52
Bug 1789260: Disable cpu info metric on non x86 arches #52
Conversation
On an IBM Z system it was observed that the parsecpuinfo function was crashing because the parsing of the CPU info is specific to x86 today - https://bugzilla.redhat.com/show_bug.cgi?id=1789260. For now, the short term fix is to disable cpu info metrics on non x86 arches and in the meanwhile we can work on the fix for supporting cpu info metric reporting for multi-arches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pgier the issue would need to be reported/fixed in procfs upstream AFAICT. scratch that, it seems that the bug is in the code that's been added to collect cpu_info metrics so it should better to fix this line (and possibly others that do the same thing):
cpuinfo[i].VendorID = field[1] |
@@ -77,7 +78,7 @@ func NewCPUCollector() (Collector, error) { | |||
|
|||
// Update implements Collector and exposes cpu related metrics from /proc/stat and /sys/.../cpu/. | |||
func (c *cpuCollector) Update(ch chan<- prometheus.Metric) error { | |||
if *enableCPUInfo { | |||
if *enableCPUInfo && (runtime.GOARCH == "amd64") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do runtime.GOARCH != "s309x"
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a problem for PPC too. And i am not sure if ARM or other arches will have the same problem...so i just enabled it only for x86
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the parsecpuinfo function is x86 specific which is also why I excluded for other arches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can tell on PPC and other platforms it shouldn't crash, but it won't have all the same info available. I think the suggested change to only activate on amd64 makes sense for now. We can add back other platforms in the future as needed with better testing (possibly upstream).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PPC doesn't crash, but it doesn't parse correctly either. s390x crashes because the format is different..
/test e2e-aws |
/approve |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: crawford, pgier, Prashanth684 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick release-4.3 |
@pgier: new pull request created: #53 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cherry-pick release-4.2 |
@pgier: new pull request created: #54 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@Prashanth684: All pull requests linked via external trackers have merged. Bugzilla bug 1789260 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
On an IBM Z system it was observed that the parsecpuinfo function was crashing because the parsing of the CPU info is specific to x86 today - https://bugzilla.redhat.com/show_bug.cgi?id=1789260. For now, the short term fix is to disable cpu info metrics on non x86 arches and in the meanwhile we can work on the fix for supporting cpu info metric reporting for multi-arches.