Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

many: use transient scope for tracking apps and hooks #7825

Merged
merged 199 commits into from
Aug 24, 2020

Conversation

zyga
Copy link
Contributor

@zyga zyga commented Nov 29, 2019

This patch entirely changes how snapd tracks programs for the purpose
of refresh-app-awareness feature. The old method was using cgroup-v1
only behavior that was stealing processes from systemd's point of view.
The new behavior relies on a DBus API call to systemd, to create
a transient scope with a name that allows snapd to derive snap security
tag from. Due to the requirements of the used API the scope name
must be unique. This is enforced by systemd and handled by creating
a scope with the name "snap.${uuid}.${package}.${app}.scope" for apps
and "snap.${uuid}.${package}.hook.${hook}.scope" for hooks. Services
have a .service unit that is giving us equivalent information so they
do not have or need a scope. The UUID is generated by the kernel.

The old approach relied on privileged snap-confine to alter cgroup
configuration. The new approach requires an unprivileged DBus call and
was thus moved to snap-run. The associated C code and apparmor permissions
have been removed.

The new approach does not support systemd used on Ubuntu 14.04.
The feature is not effective there and application tracking fails
silently.

The code has several TODOs that I plan to address as follow-ups,
given that the feature is behind a feature flag anyway:

  • add additional spread test for tracking services
  • add additional spread test for tracking non-root applications

Signed-off-by: Zygmunt Krynicki me@zygoon.pl

This patch entirely changes how snapd tracks programs for the purpose
of refresh-app-awareness feature. The old method was using cgroup-v1
only behavior that was stealing processes from systemd's point of view.
The new behavior relies on a DBus API call to systemd, to create
a transient scope with a name that allows snapd to derive snap security
tag from. Due to the requirements of the used API the scope name
must be unique. This is enforced by systemd and handled by creating
a scope with the name "snap.${uuid}.${package}.${app}.scope" for apps
and "snap.${uuid}.${package}.hook.${hook}.scope" for hooks. Services
have a .service unit that is giving us equivalent information so they
do not have or need a scope. The UUID is generated by the kernel.

The old approach relied on privileged snap-confine to alter cgroup
configuration. The new approach requires an unprivileged DBus call and
was thus moved to snap-run. The associated C code and apparmor permissions
have been removed.

The new approach does not support systemd used on Ubuntu 14.04.
The feature is not effective there and application tracking fails
silently.

The code has several TODOs that I plan to address as follow-ups,
given that the feature is behind a feature flag anyway:

 - add additional spread test for tracking services
 - add additional spread test for tracking non-root applications

In addition I plan to make some organization changes so that large parts
of what is in overlord/snapstate/refresh is moved to sandbox/cgroup
instead. I decided to do this as a follow-up as it is easier to handle
with regards to conflicts.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
@pedronis
Copy link
Collaborator

@zyga should #7722 and #7726 be closed in the light of this?

@zyga
Copy link
Contributor Author

zyga commented Nov 29, 2019

@pedronis yes, totally. I just was focused on more testing and didn't pay attention to pending/draft PRs.

Copy link
Contributor

@bboozzoo bboozzoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a pass, some comments inline. Need to play with this locally a little.

overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh_test.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
sandbox/cgroup/cgroup.go Outdated Show resolved Hide resolved
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Reused some tests for PidsInFile

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Copy link
Contributor

@anonymouse64 anonymouse64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking good, I want to test it out a bit and understand what the performance impact of doing this is, but before that's done, I have a number of wording/typo suggestions, with the following outstanding questions:

  • why do we still test for /run/snapd/cgroup in the spread tests? I thought we abandoned that idea before it was merged?
  • a couple more comments about the locking situation would be appreciated
  • a comment on securityTagFromCgroupPath about expecting the path to not end in "/" would be good to have

cmd/snap/cmd_run.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
overlord/snapstate/refresh.go Outdated Show resolved Hide resolved
tests/main/cgroup-tracking/task.yaml Outdated Show resolved Hide resolved
tests/main/cgroup-tracking/task.yaml Outdated Show resolved Hide resolved
tests/main/cgroup-tracking/task.yaml Outdated Show resolved Hide resolved
tests/main/cgroup-tracking/task.yaml Outdated Show resolved Hide resolved
tests/main/lxd/task.yaml Outdated Show resolved Hide resolved
zyga and others added 8 commits December 2, 2019 20:31
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Co-Authored-By: Ian Johnson <person.uwsome@gmail.com>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
It used to require a clean path but it is more robust and simple to
clean internally. This way the function is less brittle.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
zyga and others added 9 commits July 6, 2020 09:17
The workaround was required because both snap run and snap-confine were
writing to cgroups (indirectly via systemd and directly, respectively).
With a single writer we can restore the simpler code.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
…wareness-v2

Remove the sleep workaround added in the branch feature/track-launched-apps
The corresponding code is now in sandbox/cgroup.

Signed-off-by: Zygmunt Krynicki <me@zygoon.pl>
@zyga
Copy link
Contributor Author

zyga commented Jul 24, 2020

I've broken out #9057 with the last non-remove, non-noise changes.

@zyga
Copy link
Contributor Author

zyga commented Aug 6, 2020

This now contains almost entirely just the removal of the pid cgroup that used to be used for tracking. There's a a few extra changes like exec sh instead of sh and the use of dbusutil for snap run document portal. I can further reduce those if required.

Copy link
Contributor

@anonymouse64 anonymouse64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New changes LGTM! :shipit:

@zyga
Copy link
Contributor Author

zyga commented Aug 7, 2020

I've broken out the removal to #9119

@zyga
Copy link
Contributor Author

zyga commented Aug 10, 2020

This now contains the removal of the workaround for systemd's "slow" application of the cgroup transition as well as three one-liner changes to tests and "snap run" code.

@zyga
Copy link
Contributor Author

zyga commented Aug 10, 2020

Workaround removal is now in #9129

@mvo5
Copy link
Contributor

mvo5 commented Aug 11, 2020

This looks good, I think the description now needs an update to make sure it reflects what is part of this PR but otherwise I think we can merge it.

@zyga
Copy link
Contributor Author

zyga commented Aug 13, 2020

This no longer has anything complex that's not merged separately. I'd like to merge it when green.

@pedronis pedronis added Squash-merge Please squash this PR when merging. and removed ⛔ Blocked labels Aug 13, 2020
@pedronis
Copy link
Collaborator

pedronis commented Aug 13, 2020

@mvo5 @zyga as discussed this is fine to land as a squash-merge

Copy link

@jdstrand jdstrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was marked as needing security review, but all that is left are routine changes. LGTM

@zyga - thanks so much for sticking with this!! :)

@jdstrand jdstrand removed the Needs security review Can only be merged once security gave a :+1: label Aug 20, 2020
@zyga zyga merged commit bc21b77 into canonical:master Aug 24, 2020
@zyga zyga deleted the feature/refresh-app-awareness-v2 branch August 24, 2020 06:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Security-High Squash-merge Please squash this PR when merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants