Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE-1563: Fix logic to detect when running under a Xen Hypervisor #2746

Closed
wants to merge 1 commit into from

Conversation

skreuzer
Copy link
Contributor

On Linux the Xen_Hv_Check() function can never be called because the stat on
/proc/self/status will always return true which results in the OpenVZ_Detect()
function always being called. Move the call to Xen_Hv_Check() to happen if the
stat on /proc/xen/capabilities returns false since this is an indication that
the host is acting as dom0.

Change the Xen_Hv_Check() function to scan the CPUID range from
0x40000000 to 0x40010000 looking for Xen leaves.

@skreuzer
Copy link
Contributor Author

When running cfengine with this change on a Xen guest the output looks like this:

cf-promises --show-classes --verbose | grep -i xen
 verbose: Found Xen CPUID Leaf. eax=40000004 at base 40000000
 verbose: This appears to be a xen hv system.
 verbose: C: discovered hard class xen
 verbose: C: discovered hard class xen_domu_hv
xen                                                          inventory,attribute_name=Virtual host,source=agent,hardclass
xen_domu_hv                                                  source=agent,hardclass

skreuzer referenced this pull request in bsd-hacker/freebsd-ports Dec 7, 2016
Reviewed by:	cy, gjb
Approved by:	cy
Differential Revision:	D8607
{
if ((eax - base) < 2)
{
Log(LOG_LEVEL_ERR, "Insufficient Xen CPUID Leaves. eax=%x at base %x\n", eax, base);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the odds that this error occurs? If Xen is not detected we wouldn't like to log errors in the console. Maybe downgrade verbosity to LOG_LEVEL_VERBOSE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the odds that this error occurs? If Xen is not detected we wouldn't like to log errors in the console. Maybe downgrade verbosity to LOG_LEVEL_VERBOSE?

The odds of this ever happening are very slim. For that code to execute it means that the call to get the CPU ID returned XenVMMXenVMM meaning that the host is running on a hypervisor but someone was mucking around with an invalid CPUID mask.

@jimis
Copy link
Contributor

jimis commented Feb 23, 2017

@skreuzer It's good for merging after we figure out the logging issue, sorry I couldn't review earlier.

Xen_Cpuid(0x40000000, &eax, &sig.u[0], &sig.u[1], &sig.u[2]);

if (strcmp("XenVMMXenVMM", sig.s) || (eax < 0x40000002))
for (base = 0x40000000; base < 0x40010000; base += 0x100)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the logic in going from 0x40000000 to 0x40010000? I don't see this value mentioned in the link in the comment, and it's not very well explained in the commit message why this is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the logic in going from 0x40000000 to 0x40010000?

Because hypervisor CPUID leaves don't have a standard Xen can be found at the first unused 0x100 aligned offset between 0x40000000 and 0x40010000. While most of the time the vendor id will be found at address 0x40000000, to properly detect if Xen is the hypervisor it is necessary to scan this whole range.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unfortunately is causing segfaults on our older platforms (RHEL4 in particular). I still don't fully understand the choice of this value, since I can't see it mentioned anywhere, but admittedly I don't know much about Xen detection.

Did you just choose a value that you thought was "high enough"? Do you think it might fix the segfaults if we reduced this value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To properly detect Xen you need to scan the leaves between 0x40000000 to 0x40010000. Take a look at xen-detect.c for some additional details.

I am not quite sure why this is segfaulting and since I don't have access to a box running RHEL 4, per your suggestion on #cfengine-dev, the new pull request will just return false if the redhat_4 or centos_4 class is defined.

A quick test:

# /var/cfengine/bin/cf-promises --show-classes --debug | grep -i Xen
 verbose: Found Xen CPUID Leaf. eax=40000002 at base 40000000
 verbose: This appears to be a xen hv system.
   debug: Setting hard class: default:xen
   debug: Setting hard class: default:xen_domu_hv
 verbose: C: discovered hard class xen
 verbose: C: discovered hard class xen_domu_hv
xen                                                          inventory,attribute_name=Virtual host,source=agent,hardclass
xen_domu_hv                                                  source=agent,hardclass

# /var/cfengine/bin/cf-promises --show-classes -Dredhat_4 --debug | grep -i Xen
   debug: Skipping Xen_Hv_Check() to avoid a segfault on RHEL 4

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @skreuzer! Retrying...

@ghost ghost self-assigned this Apr 5, 2017
@ghost ghost mentioned this pull request Apr 7, 2017
On Linux the Xen_Hv_Check() function can never be called because the stat on
/proc/self/status will always return true which results in the OpenVZ_Detect()
function always being called. Move the call to Xen_Hv_Check() to happen if the
stat on /proc/xen/capabilities returns false since this is an indication that
the host is acting as dom0.

Change the Xen_Hv_Check() function to scan the CPUID range from
0x40000000 to 0x40010000 looking for Xen leaves.

Return false if the class redhat_4 or centos_4 is defined to avoid a segfault
on these platforms.
ghost pushed a commit that referenced this pull request Apr 18, 2017
* skreuzer/xen:
  CFE-1563: Fix logic to detect when running under a Xen Hypervisor
@ghost
Copy link

ghost commented Apr 18, 2017

Thanks @skreuzer, that did the trick! Merged manually after adding changelog and a reference to the xen-detect.c for future readers.

@ghost ghost closed this Apr 18, 2017
@ghost
Copy link

ghost commented Apr 18, 2017

Cherry-picked to 3.7.x and 3.10.x as well.

@skreuzer skreuzer deleted the xen branch August 2, 2017 01:47
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants