Fix LXC virtualization facts #58881

Open: @silverwind wants to merge 1 commit into devel from silverwind:lxc-detect

@silverwind
Contributor

commented Jul 9, 2019

SUMMARY

LXD guests were wrongly detected when running as non-root (making the check for /proc/1/environ fail) and when systemd was absent (making the check for /run/systemd/container fail).

Fixed this by adding a check for the LXD host <-> guest communication socket /dev/lxd/sock, which is almost guaranteed to exist, though it can theoretically be disabled by configuration.

Ref: lxc/lxd#5923 (comment)
Ref: https://lxd.readthedocs.io/en/latest/dev-lxd/
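
For illustration, a guest check of this kind could look roughly like the sketch below. It is not the code from this PR: the function names `looks_like_lxd_guest` and `virtualization_facts` are hypothetical, and the only detail taken from the description is the socket path /dev/lxd/sock. The sketch assumes that a plain `stat()` on the socket is enough, which would need neither root nor systemd.

```python
import os
import stat

# Path taken from the PR description; everything else here is an assumption.
LXD_GUEST_SOCKET = "/dev/lxd/sock"


def looks_like_lxd_guest(path=LXD_GUEST_SOCKET):
    """Return True if a Unix socket exists at `path`.

    Hypothetical helper: it only calls stat(), so it does not need to read
    /proc/1/environ or rely on /run/systemd/container.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or inaccessible parent directory: treat as "not a guest".
        return False
    return stat.S_ISSOCK(mode)


def virtualization_facts(path=LXD_GUEST_SOCKET):
    """Return the two virtualization facts if the socket is present, else {}."""
    facts = {}
    if looks_like_lxd_guest(path):
        facts["virtualization_type"] = "lxc"
        facts["virtualization_role"] = "guest"
    return facts
```

Because the socket can be disabled by configuration, a check like this can only add a positive signal; an absent socket does not prove the system is not an LXD guest.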

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

facts

ADDITIONAL INFORMATION

Before

"ansible_virtualization_role": "host",
"ansible_virtualization_type": "kvm",

After

"ansible_virtualization_role": "guest",
"ansible_virtualization_type": "lxc",
@silverwind

Contributor Author

commented Jul 9, 2019

There's an option to add another check for the existence of /var/lib/lxd/devlxd for the host role, but I think hosts generally do run systemd, so they'd be covered by the /run/systemd/container check, and I opted not to add it. If we want to cover ancient LXD systems without systemd, I guess it should be added, though.

@silverwind

Contributor Author

commented Jul 9, 2019

Decided to add the host role check as well. Testing against an Ubuntu 18.04 LXD node running various LXC guests showed it as

"ansible_virtualization_role": "host",
"ansible_virtualization_type": "kvm",

which is just wrong. Checking on the system, I see /run/systemd/container is absent so that check won't work there as I had initially assumed. After the fix, it now correctly shows the host role:

"ansible_virtualization_role": "host",
"ansible_virtualization_type": "lxc",

@silverwind silverwind force-pushed the silverwind:lxc-detect branch from b56abdc to 318ed04 Jul 9, 2019

@silverwind silverwind changed the title from "Fix LXD guest detection for non-root/non-systemd" to "Facts: Fix LXD detection for non-root/non-systemd" Jul 9, 2019

@ansibot ansibot added needs_revision and removed core_review labels Jul 9, 2019

@samdoran

Member

commented Jul 11, 2019

Please create unit tests and a changelog fragment. See this fragment as an example.

There are tests in test/units/module_utils/facts/test_ansible_collector.py but it may be easier to create newer pytest-style tests for testing this specific case.

@silverwind

Contributor Author

commented Jul 11, 2019

Any pointers for writing tests which mock a remote file system and then assert the facts for those file systems? Do such tests exist yet?

@samdoran

Member

commented Jul 11, 2019

@silverwind I do not believe we have any tests that do this currently. Go ahead and add the changelog fragment and I'll see if I can come up with some tests.

@silverwind

Contributor Author

commented Jul 11, 2019

Will do. Maybe a rudimentary mock like this could suffice for some basic testing:

{
  "/proc/1/cgroup":  "file content",
  "/dev/lxd/sock": "", # no content
  "/var/lib/lxd/devlxd": "" # no content
}

Though generally, I think the whole concept of ansible_virtualization_role/type is flawed because a system can host multiple virtualization techs, e.g. LXD, Docker and KVM can all co-exist on the same machine, but we can only return one.

The thing with LXD is that if someone has it installed, it's rather likely that it's their virtualization of choice, so that's why I added the check near the top of the file.

@silverwind

Contributor Author

commented Jul 11, 2019

Made a small change to the host detection. Because the LXD docs were apparently wrong, I now check only for the directory containing the socket, not the socket file itself. Filed lxc/lxd#5941 to fix their docs.
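
A minimal sketch of how the guest and host checks might fit together, assuming the two paths mentioned in this thread (/dev/lxd/sock for guests, the directory /var/lib/lxd/devlxd for hosts). The function name `detect_lxc_role` and the ordering (guest check first) are assumptions for illustration, not the PR's actual code.

```python
import os

# Both paths come from this thread; the defaults are overridable for testing.
LXD_GUEST_SOCKET = "/dev/lxd/sock"
LXD_HOST_DIR = "/var/lib/lxd/devlxd"


def detect_lxc_role(guest_socket=LXD_GUEST_SOCKET, host_dir=LXD_HOST_DIR):
    """Return ("lxc", "guest"), ("lxc", "host"), or None if neither path exists.

    Hypothetical helper. The host check looks only at the directory that
    contains the socket, not at the socket file itself.
    """
    if os.path.exists(guest_socket):
        return ("lxc", "guest")
    if os.path.isdir(host_dir):
        return ("lxc", "host")
    return None
```

Checking the guest path first is a design choice made for this sketch: a nested setup could match both paths, and only one type/role pair can be returned.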

Fix LXC virtualization facts
Previously, LXC detection did not work reliably for both the host and
guest roles. For guests, the check required root privileges and for
hosts, there was no code path that ended in an lxc host role at all.

Added two new checks that look for the existence of the LXD communication
socket which is very likely to be present on both host and guest.

Ref: lxc/lxd#5923 (comment)
Ref: https://lxd.readthedocs.io/en/latest/dev-lxd/

@silverwind silverwind force-pushed the silverwind:lxc-detect branch from 0a20512 to d649d24 Jul 11, 2019

@silverwind silverwind changed the title from "Facts: Fix LXD detection for non-root/non-systemd" to "Fix LXC virtualization facts" Jul 11, 2019

@silverwind

Contributor Author

commented Jul 11, 2019

Force-pushed with changelog fragment and a new commit message.

@samdoran

Member

commented Jul 11, 2019

Will do. Maybe a rudimentary mock like this could suffice for some basic testing:

That's what I was thinking as well. Just patch things to behave like files exist.
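
One way to "patch things to behave like files exist" is sketched below with `unittest.mock.patch` and a dict acting as the fake file system. The function under test, `detect_lxc`, is a hypothetical stand-in and not the real fact collector; the helper `run_with_fake_fs` is likewise only an assumption about how such a test could be arranged.

```python
import os
from unittest import mock


def detect_lxc():
    """Hypothetical stand-in for the detection logic discussed in this PR."""
    if os.path.exists("/dev/lxd/sock"):
        return {"virtualization_type": "lxc", "virtualization_role": "guest"}
    if os.path.exists("/var/lib/lxd/devlxd"):
        return {"virtualization_type": "lxc", "virtualization_role": "host"}
    return {}


def run_with_fake_fs(fake_fs):
    """Run detect_lxc() while os.path.exists answers from the fake_fs dict.

    Keys are paths, values are file contents (unused by this sketch).
    """
    with mock.patch("os.path.exists", side_effect=lambda path: path in fake_fs):
        return detect_lxc()


# Example: a guest is simulated by making only the guest socket "exist".
guest_facts = run_with_fake_fs({"/proc/1/cgroup": "file content", "/dev/lxd/sock": ""})
```

A real test for the collector would likely also need to patch whatever file-reading helpers the collector uses, which this sketch does not attempt.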

Though generally, I think the whole concept of ansible_virtualization_role/type is flawed because a system can host multiple virtualization techs, e.g. LXD, Docker and KVM can all co-exist on the same machine, but we can only return one.

Also true. We were discussing this today and it would make more sense for this to be a list since there can be a combination of virtualization going on. This has compatibility implications, but your point is valid.

@silverwind

Contributor Author

commented Jul 12, 2019

This has compatibility implications

Yes, of course. One way I see it done could be:

  • Introduce a new list-based fact virtualization_roles in the format ["lxc-host", "docker-host"], which can also be empty. The first item in the list must correspond to what the module currently returns.
  • Deprecate virtualization_role and virtualization_type and set them to use the first item from the new fact.
  • A few versions later, remove the old facts variables.
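
The first two steps could be sketched as follows. This is only an illustration of the proposal: `split_role` and `build_virtualization_facts` are hypothetical names, and the empty strings used when nothing is detected are a placeholder assumption, not what the module actually returns today.

```python
def split_role(entry):
    """Split an entry such as "lxc-host" into ("lxc", "host")."""
    vtype, _, role = entry.rpartition("-")
    return vtype, role


def build_virtualization_facts(roles):
    """Build the new list-based fact and derive the two legacy facts from it.

    `roles` is a list like ["lxc-host", "docker-host"] and may be empty.
    The legacy facts mirror the first entry, as the proposal requires.
    """
    facts = {"virtualization_roles": list(roles)}
    if roles:
        vtype, role = split_role(roles[0])
    else:
        # Placeholder values for "nothing detected" (an assumption).
        vtype, role = "", ""
    facts["virtualization_type"] = vtype
    facts["virtualization_role"] = role
    return facts
```

Keeping the legacy keys as a derived copy of the first list entry means existing playbooks would continue to see the same values while new ones can inspect the full list.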
@bcoca

Member

commented Jul 12, 2019

That has mostly been my plan for a while, but we were waiting for a facility to 'deprecate specific variables', which we have not been able to add yet. Until that happens, we could just go with the alternate keys and document that these are 'more precise' than the old ones.

@silverwind

Contributor Author

commented Jul 12, 2019

Agree, let's not do deprecation/removal.

I generally came to like software that never does breaking changes, because those are always painful to users and in a case like facts, very unwarranted because the cost of keeping a copied fact around is pretty much zero.

@ansibot ansibot added the stale_ci label Jul 20, 2019
