CPU frequency is inconsistently collected and persisted #550

Open
jyundt opened this issue May 23, 2017 · 2 comments

jyundt (Contributor) commented May 23, 2017

During new asset inductions, only the CPU information for the first socket is persisted. CPU information for sockets 2 through N is discarded.

As an example, given the following CPU information from lshw, only CPU id 0 will be saved to the database. Note the difference in CPU speed between the two sockets:

| Id | Cores | Threads | Speed (GHz) | Description |
|----|-------|---------|-------------|-------------|
| 0 | 8 | 16 | 1.560070 | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz Intel Corp. |
| 1 | 8 | 16 | 1.480089 | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz Intel Corp. |

As a result of this behavior, collins drops the second socket's own values and reports both sockets with the same speed:

| Id | Cores | Threads | Speed (GHz) | Description |
|----|-------|---------|-------------|-------------|
| 0 | 8 | 16 | 1.560070 | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz Intel Corp. |
| 1 | 8 | 16 | 1.560070 | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz Intel Corp. |

This problem originally manifested itself while troubleshooting failing lshw XML tests: #537 (comment). We noticed that our servers were reporting different CPU speeds on different sockets as a result of dynamic frequency scaling. This caused tests in LshwHelperSpec to consistently fail.
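To check whether a given lshw dump is affected, a small script along these lines could list the per-socket speeds and flag a mismatch. This is only a sketch: the XML fragment is a hypothetical, heavily trimmed stand-in for a real dump, it assumes lshw reports the current speed as a `<size units="Hz">` element under each `class="processor"` node, and the function names are made up for illustration. It is not how LshwHelperSpec or collins parses the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed lshw XML fragment (assumed layout);
# a real dump carries many more nodes and fields.
SAMPLE_LSHW_XML = """
<node id="core" class="bus">
  <node id="cpu:0" class="processor">
    <product>Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz</product>
    <vendor>Intel Corp.</vendor>
    <size units="Hz">1560070000</size>
  </node>
  <node id="cpu:1" class="processor">
    <product>Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz</product>
    <vendor>Intel Corp.</vendor>
    <size units="Hz">1480089000</size>
  </node>
</node>
"""


def cpu_speeds_ghz(lshw_xml):
    """Return {node id: speed in GHz} for every processor node that has a <size>."""
    root = ET.fromstring(lshw_xml.strip())
    speeds = {}
    for node in root.iter("node"):
        if node.get("class") != "processor":
            continue
        size = node.find("size")
        if size is None or not size.text:
            continue  # e.g. an empty socket with no reported speed
        speeds[node.get("id")] = int(size.text) / 1e9
    return speeds


def speeds_differ(speeds, tolerance_ghz=0.0):
    """True when the spread between fastest and slowest socket exceeds the tolerance."""
    values = list(speeds.values())
    return bool(values) and max(values) - min(values) > tolerance_ghz


if __name__ == "__main__":
    speeds = cpu_speeds_ghz(SAMPLE_LSHW_XML)
    for cpu_id, ghz in speeds.items():
        print(f"{cpu_id}: {ghz:.6f} GHz")
    print("speeds differ:", speeds_differ(speeds))
```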

As pointed out during the discussion in #537, a more appropriate fix would probably involve disabling dynamic frequency scaling in genesis to avoid different CPU speeds on different sockets.

@byxorna @michaeljs1990

@jyundt jyundt mentioned this issue May 23, 2017
byxorna (Contributor) commented May 23, 2017

A linked issue should be created against Genesis to add or update a task that disables speed stepping before lshw collection.
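One possible shape for such a step, sketched in Python rather than as an actual Genesis task: pin every core's cpufreq governor through the Linux sysfs interface before lshw runs. The function name and the `sysfs_root` parameter are hypothetical, writing to the real sysfs path requires root, and whether pinning the governor alone makes lshw report identical speeds on every socket is an assumption this thread does not confirm.

```python
from pathlib import Path


def set_cpufreq_governor(governor="performance", sysfs_root="/sys/devices/system/cpu"):
    """Write `governor` to each cpuN/cpufreq/scaling_governor file under sysfs_root.

    Returns the list of files that were updated. `sysfs_root` is a parameter
    only so the function can be exercised against a scratch directory.
    """
    updated = []
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"  # matches cpu0, cpu1, ... but not cpufreq/ or cpuidle/
    for gov_file in sorted(Path(sysfs_root).glob(pattern)):
        gov_file.write_text(governor + "\n")
        updated.append(str(gov_file))
    return updated


if __name__ == "__main__":
    # Needs root on a real machine; run this before collecting lshw output.
    for path in set_cpufreq_governor("performance"):
        print("updated", path)
```

Machines without a cpufreq driver expose no `scaling_governor` files, in which case the function simply returns an empty list.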

So collins only stores the CPU speed for the first socket, and only in one dimension (i.e. CPU_SPEED_GHZ[0])? I wonder if there would be a benefit in using the dimensionality of tags to represent these values.

jyundt (Contributor, Author) commented May 23, 2017

> A linked issue should be created against Genesis to add or update a task that disables speed stepping before lshw collection.

Will do, I can probably get a PR submitted for this as well.

> So collins only stores the CPU speed for the first socket, and only in one dimension (i.e. CPU_SPEED_GHZ[0])? I wonder if there would be a benefit in using the dimensionality of tags to represent these values.

Ugh, I think I had this flipped: it's CPU[N] (the last CPU) that gets stored, not CPU[0]. Sorry for the confusion. I just verified this by modifying an lshw XML dump with different product/vendor/speed information and inserting the node into collins.
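To make the difference concrete, here is a toy model of the two storage strategies. It is not collins code: the function names and the dictionary layout are invented for illustration, and only the tag name CPU_SPEED_GHZ comes from this thread. Writing every socket to the same key reproduces the "last CPU wins" behavior, while keying on (tag, dimension) keeps one value per socket.

```python
def flatten_last_wins(cpus):
    """One value per tag: each socket overwrites the previous one, so the last CPU wins."""
    tags = {}
    for cpu in cpus:
        tags["CPU_SPEED_GHZ"] = cpu["speed_ghz"]
    return tags


def flatten_by_dimension(cpus):
    """One value per (tag, dimension): every socket keeps its own entry."""
    tags = {}
    for dimension, cpu in enumerate(cpus):
        tags[("CPU_SPEED_GHZ", dimension)] = cpu["speed_ghz"]
    return tags


if __name__ == "__main__":
    # Speeds taken from the example table in the issue description.
    cpus = [{"speed_ghz": 1.560070}, {"speed_ghz": 1.480089}]
    print(flatten_last_wins(cpus))
    print(flatten_by_dimension(cpus))
```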

I don't really have a strong preference on the dimensionality either way. Ideally all CPUs would be identical; however, this speed stepping tripped us up.
