Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

interfaces: add microstack-support interface #8926

Merged
merged 20 commits into from
Aug 31, 2021

Conversation

dshcherb
Copy link
Contributor

@dshcherb dshcherb commented Jun 25, 2020

Add an interface to enable MicroStack to work in a confined environment.

@dshcherb dshcherb marked this pull request as draft June 25, 2020 14:26
@anonymouse64 anonymouse64 added the Needs security review Can only be merged once security gave a :+1: label Jun 25, 2020
Copy link
Collaborator

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the policy yet. Some quick comments:

  • there needs to be a new plug/slot added to a spread test that checks each kind of interface - the test failures usually indicate which
  • many tests need changes to the tests in the interfaces/policy package

Copy link

@jdstrand jdstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this. Lots of things to change, but much is just removing things that exist in other interfaces. Please do see various questions inline.

interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@dshcherb dshcherb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented where things needed changing or clarifying and submitted a follow-up commit.

I also added most of the rules needed for LVM and loop device interaction but I need to test a few more things to complete that.

Other basic functionality worked the same way with the updated interface (booting a VM and connecting to it via ssh over the networking layout created for/via OVN).

interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
@dshcherb dshcherb changed the title Add microstack-support interface interfaces: add microstack-support interface Jul 8, 2020
@dshcherb dshcherb requested a review from jdstrand July 8, 2020 10:26
@dshcherb dshcherb force-pushed the 2020-04-28-microstack-interface branch from 60402cc to 987b286 Compare August 5, 2020 11:24
@dshcherb
Copy link
Contributor Author

dshcherb commented Aug 5, 2020

Rebased on top of recent snapd changes and added rules for the LVM support.

One more thing besides the LVM support that's needed for MicroStack to have an LVM-backed volume backend is running tgtd. This has to do with the way Cinder exposes access to LVM volumes to VMs - it is done via iSCSI. There are multiple reasons why that's the case:

  • a VM might be live migrated from a host with the LVM volume but the volume will stay at the old node;
  • LVM volumes may be created on a storage node so they need to be accessed over a network.

In the absence of a better volume backend, LVM provides a way to test volume creation which is what is needed to improve MicroStack gate tests since certain tempest tests require volume-backed instances to be created.

I am going to check which rules will be needed for running tgtd and if the interface will need to be modified further to make that work.

@dshcherb dshcherb force-pushed the 2020-04-28-microstack-interface branch from 987b286 to 4db6c1a Compare August 24, 2020 18:02
@dshcherb
Copy link
Contributor Author

Added the iSCSI bits in 4db6c1a.

Unfortunately, upstream OpenStack does not currently allow one to leverage the QEMU's built-in iscsi support, therefore, MicroStack has to include iscsid in addition to tgtd.

Using iscsid is certainly more complicated as it uses the kernel data plane via the iscsi_tcp kernel module which means that snap removal becomes a more complex exercise involving the iSCSI session cleanup and that the microstack_support interfaces needs to be much more privileged in terms of sysfs access.

@dshcherb
Copy link
Contributor Author

For reference, when tested with snapd built with this PR applied and a test-time workaround for LP #1892895 applied, I managed to get 0 failures of Refstack tests.

https://review.opendev.org/#/c/738242/9
https://zuul.opendev.org/t/openstack/build/19f8f05e192b471cbdb24c8d48467c47/log/job-output.txt#31891

2020-08-26 12:05:37.874765 \| ubuntu-bionic \| Ran: 104 tests in 629.866 sec.
2020-08-26 12:05:37.874779 \| ubuntu-bionic \|  - Success: 102
2020-08-26 12:05:37.874818 | ubuntu-bionic |  - Failures: 0

This is to show the applicability of the interface code to the target workload.

Dmitrii Shcherbakov added 6 commits August 28, 2020 12:53
Add an interface to enable MicroStack to work in a confined environment.
OpenStack relies on the following components when working with LVM
volumes:

* tgtd - a daemon that exposes block devices via iSCSI;
* scsid - a control plane daemon for the iSCSI initiator data plane
implemented in the iscsi_tcp module;
* iscsi_tcp kernel module.

Working with the iSCSI kernel stack requires a more privileged access to
sysfs to that iscsi-adm and iscsid can do their job.
After several tempest test runs, based on the kernel logs, it became
apparent that libvirt requires a wider rw access to the hierarchy under
/sys/fs/cgroup/*/machine.
@dshcherb dshcherb force-pushed the 2020-04-28-microstack-interface branch from 4db6c1a to 3584a4b Compare August 28, 2020 12:57
@dshcherb dshcherb marked this pull request as ready for review August 28, 2020 15:53
@dshcherb dshcherb requested a review from zyga August 28, 2020 18:53
peteyan pushed a commit to peteyan/microstack that referenced this pull request Jan 7, 2021
Major changes:

* Plumbing necessary for strict confinement with
  the microstack-support interface
  snapcore/snapd#8926
  * Until the interface is merged, devmode will be used and kernel
    modules will be loaded via an auxiliary service.
* upgraded OpenStack components to Focal (20.04) and OpenStack Ussuri;
  * reworked the old patches;
  * added the Placement service since it is now separate;
  * addressed various build issues due to changes in snapcraft and
    built dependencies:
    * e.g. libvirt requires the build directory to be separate from the
      source directory) and LP: #1882255;
    * LP: #1882535 and pypa/pip#8414
    * LP: #1882839
    * LP: #1885294
    * https://storyboard.openstack.org/#!/story/2007806
    * LP: #1864589
    * LP: #1777121
    * LP: #1881590
* ML2/OVS replated with ML2/OVN;
  * dnsmasq is not used anymore;
  * neutron l3 and DHCP agents are not used anymore;
  * Linux network namespaces are only used for
    neutron-ovn-metadata-agent.
  * ML2 DNS support is done via native OVN mechanisms;
  * OVN-related database services (southbound and northbound dbs);
  * OVN-related control plane services (ovn-controller, ovn-northd);
* core20 base support (bionic hosts are supported);
* the removal procedure now relies on the "remove" hook since `snap
remove` cannot be used from the confined environment anymore;
* prerequisites to enabling AppArmor confinement for QEMU processes
  created by the confined libvirtd.
* Added the Spice html5 console proxy service to enable clients to
  retrieve and use it via
  `microstack.openstack console url show --spice <servername>`.
* Added missing Cinder templates and DB migrations for the Cinder DB.
* Added experimental support for a loop device-based LVM backend for
  Cinder. Due to LP: #1892895 this is not recommended to be used in
  production except for tempest testing with an applied workaround;
  * includes iscsid and iscsi-tcp kernel module loading;
  * includes LIO and loading of relevant kernel modules;
  * An LVM PV is created on top of a loop device with a backing file
  present in $SNAP_COMMON/cinder-lvm.img;
  * A VG is created on top of the PV;
  * LVs are created by Cinder and exported via LIO over iscsi to iscsid
  which hot-plugs new SCSI devices. Those SCSI devices are then
  propagated by Nova to libvirt and QEMU during volume attachment;
* Added post-deployment testing via rally and tempest (via the
  microstack-test snap). A set of tests included into Refstack 2018.02
  is executed (except for object storage tests due to the lack of object
  storage support).

Change-Id: Ic70770095860a57d5e0a55a8a9451f9db6be7448
Copy link
Contributor

@mardy mardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've written some comments inline.

interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
@mardy
Copy link
Contributor

mardy commented May 12, 2021

Hi again :-) I've been thinking a bit about this interface, and it seems to me that, no matter how carefully we write it, we are always leaving the door open for a process running inside the snap to break out of the confinement, if we allow libvirt to create apparmor profiles and run new processes with the newly created profiles. That's because the process running libvirt could be compromised, or there might be bugs that make it so that the generated apparmor profiles are way more permissive than what we'd like them to be.

This would not be an issue if we could rely on the NoNewPrivileges flag, because then we would be sure that only profile transitions reducing the permission set would be allowed; unfortunately, though, for reasons that I've yet to fully grasp, we are not setting that flag in snap-confine.

So, I wonder if it wouldn't be better, until we have the NoNewPrivileges flag set, to disable apparmor in libvirt and have all processes inside this snap running under the same AppArmor profile. It's a suboptimal solution, but at least it would guarantee us that no process from inside this snap can escape the confinement (as long as we don't have cap_sys_admin enabled).

@jdstrand
Copy link

jdstrand commented May 12, 2021

Hi again :-) I've been thinking a bit about this interface, and it seems to me that, no matter how carefully we write it, we are always leaving the door open for a process running inside the snap to break out of the confinement, if we allow libvirt to create apparmor profiles and run new processes with the newly created profiles. That's because the process running libvirt could be compromised, or there might be bugs that make it so that the generated apparmor profiles are way more permissive than what we'd like them to be.

This would not be an issue if we could rely on the NoNewPrivileges flag, because then we would be sure that only profile transitions reducing the permission set would be allowed; unfortunately, though, for reasons that I've yet to fully grasp, we are not setting that flag in snap-confine.

So, I wonder if it wouldn't be better, until we have the NoNewPrivileges flag set, to disable apparmor in libvirt and have all processes inside this snap running under the same AppArmor profile. It's a suboptimal solution, but at least it would guarantee us that no process from inside this snap can escape the confinement (as long as we don't have cap_sys_admin enabled).

NoNewPrivileges interferes with profile transitions by snap-confine which is why we do not set it there. There has been some work in this area upstream but it is not yet available everywhere. As for setting it after snap-confine has transitioned, because of the interactions between historic apparmor and NNP, this can alter the behavior of the applications that try to use NNP themselves, etc. We've felt that the combination of the existing sandbox technologies (apparmor, seccomp, etc) achieved many of the same goals as NNP but with added flexibility for (privileged) interfaces that need it.

As for the observation about libvirt, you are correct. If an application can load apparmor policy and transition into it, it can trivially break out of confinement (note, libvirt (and docker, containerd, lxd, etc, etc) have many more ways to break out (consider mounting a raw disk that would allow writes that changed policy, apparmor_parser/snapd itself, etc)), so how we've decided to handle that is to called these 'superprivileged' interfaces, where the interfaces allow the application to do its job but we disallow installation/connection/etc through snapd interface policy in the snap declarations. In this manner, we might grant microk8s-support to Canonical since it is the upstream for microk8s, but disallow it to everyone else. We then document that the interface is advisory and grants ownership to the system.

In general, IMHO approaching interfaces that require lots of privilege can be done in phases. The first phase is like I describe above (lenient policy that let's them get the job done, with some guardrails (eg, let them only transition to policy with a certain naming convention, etc)). As technologies mature, later phases could further restrict (where the interface could have attributes that allow choosing the 'flavor' of the interface policy, and these attributes are mediatable in the snap declaration) and take advantage of improvements to sandboxing technologies. It's also possible to make adjustments to the internal components of the snap to understand snap sandboxing (eg, in the case of libvirt, perhaps there is a snap helper that libvirtd could call out to do policy writes/transitions or libvirt is modified to have reduced functionality when running in a snap sandbox (ie, it can't or can only attach a subset of devices, etc)).

CC @alexmurray for awareness

@anonymouse64 anonymouse64 self-requested a review May 19, 2021 12:19
@pedronis
Copy link
Collaborator

this is a high-level remark but it seems some of the accesses here like LVM or iscsi (maybe) and maybe libvirt (not sure, Jamie made that remark a while ago) could be extracted into separate -control or -support interfaces. This makes sense only if the sets are cohesive and not too specific to what microstack does vs what a generic consumer of those (kernel) interfaces would do.

@pedronis
Copy link
Collaborator

... could be extracted into separate -control or -support interfaces

there is something going in that direction here #9585

Copy link
Contributor Author

@dshcherb dshcherb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the long delay in response. Answered some in-line comments in the review.

interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
Dmitrii Shcherbakov added 2 commits July 9, 2021 16:18
* refactoring;
* added Delegate=true since libvirt manages its own cgroup subtree.
@pedronis pedronis self-requested a review July 19, 2021 15:55
@dshcherb
Copy link
Contributor Author

I just did another functional test run (Tempest, refstack subset) with locally built snapd that had microstack-support included.

All volume tests failed due to https://bugs.launchpad.net/snapd/+bug/1892895 which is not something I can fix in this interface (see the details in the bug, including comment #4).

When I manually allowed the necessary devices via wildcard rules to avoid the race condition like this

sudo bash -c 'echo b 253:* rwm > /sys/fs/cgroup/devices/snap.microstack.<service-name>/devices.allow'

I was able to make to get a full pass: https://paste.ubuntu.com/p/TB6VtxThSz/

In summary: I do not see any new rules that need to be added to this interface in order for us to get our functional tests to pass (except for volume tests for experimental/optional volume functionality).

Dmitrii Shcherbakov added 2 commits August 3, 2021 23:57
The snapd team requested a less permissive rule to be used for block
device access. While MicroStack considers volume support experimental
due to https://bugs.launchpad.net/snapd/+bug/1892895, it seems
acceptable to limit the set of VG names to the ones prefixed with
"microstack-" to avoid blocking the whole review on this. Lifting this
naming restriction will be a consideration for the future versions of
this interface.
Copy link
Contributor

@mardy mardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, just minor comments. Given that the interface is relatively trivial in logic (it's complex as far as AppArmor rules are concerned, but it's a very typical interface as far as snapd is concerned) I wouldn't insist on a spread test, but let's see what other people think.

interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support_test.go Show resolved Hide resolved
Copy link
Contributor

@mardy mardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! Let's wait, though, for the security team review (as well for some more expert team-mates of mine). :-)

Copy link
Collaborator

@alexmurray alexmurray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks pretty solid - the only glaring bit which stood out was the iptables related change to the network-control interface - this doesn't seem right to me, but otherwise this seems fine as is (whilst I like the idea of a libvirt-support or similar interface which could encapsulate some of this, I don't think it is fair to block this PR on waiting for that, especially given we don't seem to have other users for such an interface at the moment).

interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
interfaces/builtin/network_control.go Outdated Show resolved Hide resolved
@dshcherb
Copy link
Contributor Author

I did another functional test run after ebc2f4f (without volume testing since those rules were not changed). Results: https://paste.ubuntu.com/p/rPtZFv6DVg/.

Unit tests are passing and most of the spread tests too (the failures are not related to this change).

Copy link
Member

@anonymouse64 anonymouse64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, one question about non-systemd layouts and a test suggestion

interfaces/builtin/microstack_support.go Show resolved Hide resolved
interfaces/builtin/microstack_support_test.go Outdated Show resolved Hide resolved
interfaces/builtin/microstack_support.go Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Aug 23, 2021

Codecov Report

Merging #8926 (dfdb5c1) into master (8170654) will decrease coverage by 2.38%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8926      +/-   ##
==========================================
- Coverage   80.71%   78.32%   -2.39%     
==========================================
  Files         727      883     +156     
  Lines       58158    99322   +41164     
==========================================
+ Hits        46940    77797   +30857     
- Misses       7544    16634    +9090     
- Partials     3674     4891    +1217     
Flag Coverage Δ
unittests 78.32% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
daemon/api_users.go
interfaces/apparmor/apparmor.go
cmd/snap/color.go
overlord/ifacestate/udevmonitor/udevmon.go
cmd/snap/cmd_sign_build.go
interfaces/ifacetest/spec.go
errtracker/errtracker.go
cmd/snap/cmd_find.go
overlord/healthstate/healthstate.go
interfaces/builtin/gpio_control.go
... and 1516 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc6a45f...dfdb5c1. Read the comment docs.

@dshcherb
Copy link
Contributor Author

Hey @jdstrand, any chance you could approve this or dismiss the change request? I think what was requested originally has been addressed but the change request is still active.

Thanks in advance!

@anonymouse64 anonymouse64 dismissed jdstrand’s stale review August 23, 2021 12:34

pr has changed significantly

Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks alright

@alexmurray alexmurray removed the Needs security review Can only be merged once security gave a :+1: label Aug 31, 2021
@mvo5 mvo5 added the Squash-merge Please squash this PR when merging. label Aug 31, 2021
@mvo5 mvo5 added this to the 2.52 milestone Aug 31, 2021
@mvo5 mvo5 merged commit 9e4cd5f into snapcore:master Aug 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Squash-merge Please squash this PR when merging.
Projects
None yet
9 participants