Add cpuid features integration tests #1052
6d036de to 2ae053d
This is a wrapper function for calling lscpu and checking if the
command returns the expected cpu topology.
"""
def _check_guest_cmd_output(test_microvm, guest_cmd, expected_header,
nit: I'd love to see this helper function in a utils file. There are several similar implementations scattered throughout our CI code-base.
Just moving it in a utils file is not enough. We should also change the old code in order to call the utils method.
I created #1068 in order to deal with this issue as "technical debt"
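A shared helper of this kind could look roughly like the sketch below. It is only an illustration: the names `parse_key_value_output` and `check_cmd_output` are hypothetical and are not the functions from this PR or from #1068, and it assumes the guest command prints `key: value` lines the way lscpu does.

```python
def parse_key_value_output(stdout_text):
    """Parse `key: value` lines (lscpu-style output) into a dict."""
    parsed = {}
    for line in stdout_text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            parsed[key.strip()] = value.strip()
    return parsed


def check_cmd_output(stdout_text, expected):
    """Assert that every expected key is present with the expected value."""
    actual = parse_key_value_output(stdout_text)
    for key, value in expected.items():
        assert key in actual, "missing key: {}".format(key)
        assert actual[key] == value, "{}: expected {!r}, got {!r}".format(
            key, value, actual[key]
        )
```

Keeping the parsing separate from the assertion would let the existing tests switch to the shared parser one at a time.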
}
_check_cpu_topology(test_microvm, expected_cpu_topology)


def _setup_microvm(test_microvm_with_ssh, network_config, vcpu_count,
I can't see this helper fn being used anywhere. Is it?
Nope. I forgot it there. Removed.
_check_cpu_topology(test_microvm_with_ssh, 16, 2, "0-15")
_check_cpu_features(test_microvm_with_ssh, 16, "true")
_check_cache_topology(test_microvm_with_ssh, 1, 15)
All the tests above seem to start the microvm and then immediately attempt to run a command through SSH. Are they always succeeding? It looks like they would sometimes fail, if the guest network was up, but sshd hadn't started.
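One way to rule out that race would be to poll the guest's SSH port before running the first command. The sketch below is an assumption about how that could be done, not code from this PR: `wait_for_port` is a hypothetical helper, and the host and port would come from the test's network configuration.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the service is not up yet, so retry.
            time.sleep(interval)
    return False
```

A test could call `wait_for_port(guest_ip, 22)` and fail early with a clear message when it returns False, rather than failing inside the first SSH command.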
Yes, they always pass, but I noticed something interesting here. If, instead of
_tap, _, _ = test_microvm_with_ssh.ssh_network_config(network_config, '1')
I use
test_microvm_with_ssh.ssh_network_config(network_config, '1')
the tests sometimes fail, even though _tap isn't used anywhere.
Maybe that's the problem you mentioned. Is there any way to fix it?
What you are describing looks like a design issue we have within our host_tools code: ssh_network_config() returns a self-cleaning Tap object. I.e. when that object gets dropped, it will also delete its underlying host tap interface.
So, discarding the return value of ssh_network_config() means holding no references to the returned Tap object, causing it to get dropped and take the tap interface with it. If, on the other hand, you assign it to a dummy var _tap, that var will hold a reference to the Tap object, and keep it alive until it goes out of scope (in this case, until the end of the test function).
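The lifetime behaviour described above can be reproduced in isolation with a small sketch. `FakeTap` and `fake_network_config` are hypothetical stand-ins, not the host_tools implementation, and the immediate cleanup shown here relies on CPython's reference counting.

```python
class FakeTap:
    """Hypothetical stand-in for a self-cleaning Tap object."""

    # Names of the "host interfaces" that currently exist.
    live_interfaces = set()

    def __init__(self, name):
        self.name = name
        FakeTap.live_interfaces.add(name)

    def __del__(self):
        # Dropping the object also removes its interface.
        FakeTap.live_interfaces.discard(self.name)


def fake_network_config(name):
    """Mimic a call that returns (tap, host_ip, guest_ip)."""
    return FakeTap(name), "192.168.0.1", "192.168.0.2"


def discard_return_value():
    # Nothing holds the returned tuple, so the tap is dropped right away.
    fake_network_config("tap0")
    return "tap0" in FakeTap.live_interfaces


def keep_reference():
    # _tap keeps the object alive until this function returns.
    _tap, _, _ = fake_network_config("tap1")
    return "tap1" in FakeTap.live_interfaces
```

Under CPython, `discard_return_value()` reports that the interface is already gone, while `keep_reference()` still sees it. Storing the tap on a longer-lived object, such as the microvm itself, would make that lifetime explicit instead of depending on a dummy variable.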
Could you add some more comments to the code and the commit messages, please? I find this part of the code base especially difficult to follow, even with the code in one hand and the CPU vendor manual in the other. More code comments, with enough detail on the CPU features / architecture and their respective CPUID leaves, would save us time sifting through the vendor manuals if we have to revisit this code later.
2ae053d to c4a9740
per AMD EPYC PPR specifications
Signed-off-by: Serban Iorga <seriorga@amazon.com>

KVM sets the largest extended function to 0x80000000. We have to change it to 0x8000001f since we also use the leaf 0x8000001d.
Signed-off-by: Serban Iorga <seriorga@amazon.com>

We need to enable it since we use the Extended Cache Topology leaf.
Signed-off-by: Serban Iorga <seriorga@amazon.com>

We don't support more than 64 threads right now. It's safe to put them all on the same processor.
Signed-off-by: Serban Iorga <seriorga@amazon.com>

- set the extended apic id
- fix the number of threads per core
- set the number of nodes per processor
Signed-off-by: Serban Iorga <seriorga@amazon.com>
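
The commit about the largest extended function can be pictured with a small model. CPUID leaf 0x80000000 reports the largest supported extended leaf in EAX, so a guest will not query 0x8000001d unless that value is at least that high. The sketch below is an illustration in Python, not the actual change: the dict-based entry layout and the function names are assumptions.

```python
LARGEST_EXTENDED_FN = 0x8000001F  # value named in the commit message
EXTENDED_BASE_LEAF = 0x80000000   # EAX of this leaf holds the largest extended leaf


def update_largest_extended_fn(entries):
    """Set EAX of leaf 0x80000000 to LARGEST_EXTENDED_FN in a list of entries."""
    for entry in entries:
        if entry["function"] == EXTENDED_BASE_LEAF:
            entry["eax"] = LARGEST_EXTENDED_FN
    return entries


def extended_leaf_is_reachable(entries, leaf):
    """Return True if `leaf` is at or below the advertised largest extended leaf."""
    for entry in entries:
        if entry["function"] == EXTENDED_BASE_LEAF:
            return leaf <= entry["eax"]
    return False
```

With EAX left at 0x80000000, `extended_leaf_is_reachable(entries, 0x8000001D)` is False in this model. After the update it becomes True.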
c4a9740 to 6502826
Yes, sorry about that. I definitely agree. I added some more comments in the code and in the commit messages.
Signed-off-by: Serban Iorga <seriorga@amazon.com>
6502826 to b8550cc
@dianpopa @dhr @acatangiu I addressed all the comments. Please take another look.
Issue #, if available: #1010
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.