Skip to content

Conversation

@serban300
Copy link
Contributor

Issue #, if available: #1010

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@serban300 serban300 self-assigned this Apr 9, 2019
@serban300 serban300 force-pushed the serban300-cpuid-integration-test branch 2 times, most recently from 6d036de to 2ae053d Compare April 11, 2019 09:35
acatangiu
acatangiu previously approved these changes Apr 22, 2019
This is a wrapper function for calling lscpu and checking if the
command returns the expected cpu topology.
"""
def _check_guest_cmd_output(test_microvm, guest_cmd, expected_header,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd love to see this helper function in a utils file. There are several similar implementations scattered throughout our CI code-base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moving it in a utils file is not enough. We should also change the old code in order to call the utils method.

I created #1068 in order to deal with this issue as "technical debt"

}
_check_cpu_topology(test_microvm, expected_cpu_topology)

def _setup_microvm(test_microvm_with_ssh, network_config, vcpu_count,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see this helper fn being used anywhere. Is it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I forgot it there. Removed.

_check_cpu_topology(test_microvm_with_ssh, 16, 2, "0-15")
_check_cpu_features(test_microvm_with_ssh, 16, "true")
_check_cache_topology(test_microvm_with_ssh, 1, 15)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the tests above seem to start the microvm and then immediately attempt to run a command through SSH. Are they always succeeding? It looks like they would sometimes fail, if the guest network was up, but sshd hadn't started.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they always work, but there was an interesting thing here. If instead of
_tap, _, _ = test_microvm_with_ssh.ssh_network_config(network_config, '1')
I would use
test_microvm_with_ssh.ssh_network_config(network_config, '1')

They would sometimes fail. Even though _tap isn't needed anywhere.

Maybe it's the problem that you mentioned. Is there any way to fix it ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you are describing looks like a design issue we have within our host_tools code: ssh_network_config() returns a self-cleaning Tap object. I.e. when that object gets dropped, it will also delete its underlying host tap interface.

So, discarding the return value of ssh_network_config() means holding no references to the returned Tap object, causing it to get dropped and take the tap interface with it. If, on the other hand, you assign it to a dummy var _tap, that var will hold a reference to the Tap object, and keep it alive until it goes out of scope (in this case, until the end of the test function).

@dhrgit
Copy link
Contributor

dhrgit commented Apr 22, 2019

Could you add some more comments through the code and the commit messages, please? I find this particular part of the code base especially difficult to follow, if one is to read this with the code in one hand and the CPU vendor manual in the other.

I think we could use some more code comments, with enough details on CPU features / architecture and their respective CPUID leaves, such that, if we have to revisit this code later, they'll save us some time sifting through the vendor manuals.

Serban Iorga added 5 commits April 22, 2019 17:01
per AMD EPYC PPR specifications

Signed-off-by: Serban Iorga <seriorga@amazon.com>
KVM sets the largest extended function to 0x80000000.
We have to change it to 0x8000001f since we also use the leaf 0x8000001d.

Signed-off-by: Serban Iorga <seriorga@amazon.com>
We need to enable it since we use the Extended Cache Topology leaf.

Signed-off-by: Serban Iorga <seriorga@amazon.com>
We don't support more then 64 threads right now.
It's safe to put them all on the same processor.

Signed-off-by: Serban Iorga <seriorga@amazon.com>
- set the extended apic id
- fix the number of threads per core
- set the number of nodes per processor

Signed-off-by: Serban Iorga <seriorga@amazon.com>
@serban300 serban300 force-pushed the serban300-cpuid-integration-test branch from c4a9740 to 6502826 Compare April 22, 2019 14:13
@serban300
Copy link
Contributor Author

Could you add some more comments through the code and the commit messages, please? I find this particular part of the code base especially difficult to follow, if one is to read this with the code in one hand and the CPU vendor manual in the other.

I think we could use some more code comments, with enough details on CPU features / architecture and their respective CPUID leaves, such that, if we have to revisit this code later, they'll save us some time sifting through the vendor manuals.

Yes, sorry about that. I definitely agree. I added some more comments in the code. and in the commits.

Signed-off-by: Serban Iorga <seriorga@amazon.com>
@serban300 serban300 force-pushed the serban300-cpuid-integration-test branch from 6502826 to b8550cc Compare April 22, 2019 14:44
@serban300
Copy link
Contributor Author

@dianpopa @dhr @acatangiu I addressed all the comments. Please take another look.

@acatangiu acatangiu merged commit d066dd3 into firecracker-microvm:master Apr 23, 2019
@serban300 serban300 deleted the serban300-cpuid-integration-test branch May 10, 2021 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants