many: snapctl outside hooks #3340

Merged
merged 34 commits into from Jun 9, 2017

Conversation

Projects
None yet
5 participants
Contributor

stolowski commented May 17, 2017

Support running snapctl outside of hooks by the apps from the snaps. This is realized by generating a unique SNAP_CONTEXT cookie-file on per-snap basis when the snap is installed, and passing it via the env in snap-confine.

NOTE: this branch was first proposed a few months, but was reworked a lot since then and I tried to address all previous comments. For old comments see #2644.

@pedronis pedronis changed the title from Smany: snapctl outside hooks to many: snapctl outside hooks May 17, 2017

Some quick feedback on the C code

cmd/Makefile.am
@@ -214,6 +214,8 @@ snap_confine_snap_confine_SOURCES = \
snap-confine/snap-confine-args.c \
snap-confine/snap-confine-args.h \
snap-confine/snap-confine.c \
+ snap-confine/context-support.c \
+ snap-confine/context-support.h \
@zyga

zyga May 17, 2017

Contributor

Can you please sort those and add unit tests.

cmd/libsnap-confine-private/snap.c
+ return (status == 0);
+}
+
+bool verify_snap_name(const char *name)
@zyga

zyga May 17, 2017

Contributor

Can you please use the sc_ prefix?

cmd/snap-confine/snap-confine.c
@@ -84,6 +85,20 @@ int main(int argc, char **argv)
die("need to run as root or suid");
}
#endif
+
+ char *snap_context __attribute__ ((cleanup(sc_cleanup_string))) = NULL;
+ // Do no get snap context value if running a hook (we don't want to overwrite hook's SNAP_CONTEXT)
@zyga

zyga May 17, 2017

Contributor

Can you make fmt this please?

stolowski added some commits May 17, 2017

Some more comments on the C parts. Thanks for addressing earlier feedback.

cmd/Makefile.am
@@ -294,6 +296,7 @@ snap_confine_unit_tests_SOURCES = \
libsnap-confine-private/unit-tests.h \
snap-confine/apparmor-support.c \
snap-confine/apparmor-support.h \
+ snap-confine/context-support-test.c \
@zyga

zyga May 22, 2017

Contributor

Thank you :-)

cmd/snap-confine/context-support-test.c
+
+static void test_maybe_set_context_environment__null()
+{
+ setenv("SNAP_CONTEXT", "bar", 1);
@zyga

zyga May 22, 2017

Contributor

This clobbers SNAP_CONTEXT during testing. This might affect other places. Can you please move this to a sub-process or add cleanup code?

@stolowski

stolowski May 23, 2017

Contributor

Done

cmd/snap-confine/context-support.c
+
+/**
+ * Effective value of CONTEXT_DIR
+ *
@zyga

zyga May 22, 2017

Contributor

Looks like stray newline.

cmd/snap-confine/context-support.c
+ err =
+ sc_error_init(SC_ERRNO_DOMAIN, 0,
+ "cannot open context file %s, SNAP_CONTEXT will not be set: %s",
+ context_path, strerror(errno));
@zyga

zyga May 22, 2017

Contributor

Please skip the strerror part, this is implicitly added if we die.

cmd/snap-confine/context-support.c
+ goto out;
+ }
+ // context is a 32 bytes, base64-encoding makes it 44.
+ context_val = calloc(1, 45);
@zyga

zyga May 22, 2017

Contributor

I cannot help to be upset about this internally. Could we do this instead?

struct sc_context_cookie {
  char cookie[45];
}
cmd/snap-confine/context-support.c
+ err =
+ sc_error_init(SC_ERRNO_DOMAIN, 0,
+ "failed to read context file %s: %s",
+ context_path, strerror(errno));
@zyga

zyga May 22, 2017

Contributor

Ditto about strerror

cmd/snap-confine/context-support.c
+void sc_maybe_set_context_environment(const char *context)
+{
+ if (context != NULL) {
+ // Overwrite context env value.
@zyga

zyga May 22, 2017

Contributor

I remember we had a discussion about this the last time. Are we OK with overwriting the context variable in hooks now? Can you please add a justification as a comment.

@stolowski

stolowski May 23, 2017

Contributor

This function will override the env when called, however we prevent this explicitely in snap-confine.c, we want SNAP_CONTEXT of hooks to take precedence if present (see the comment there) and avoid overwriting it.

cmd/snap-confine/snap-confine.c
+
+ char *snap_context __attribute__ ((cleanup(sc_cleanup_string))) = NULL;
+ // Do no get snap context value if running a hook (we don't want to overwrite hook's SNAP_CONTEXT)
+ if (!sc_verify_hook_security_tag_name(security_tag)) {
@zyga

zyga May 22, 2017

Contributor

This is somewhat magic. Can you please add sc_is_hook_security_tag()?

cmd/snap-confine/snap-confine.c
@@ -162,12 +175,9 @@ int main(int argc, char **argv)
debug
("resetting PATH to values in sync with core snap");
setenv("PATH",
- "/usr/local/sbin:"
@zyga

zyga May 22, 2017

Contributor

Interesting that indent would change this. Can you try to not alter this part?

@stolowski

stolowski May 23, 2017

Contributor

Ok, I'm not sure how that happened.. Should be fixed now.

cmd/snap-confine/snap-confine.c
@@ -201,7 +211,7 @@ int main(int argc, char **argv)
#if 0
setup_user_xdg_runtime_dir();
#endif
-
+ sc_maybe_set_context_environment(snap_context);
@zyga

zyga May 22, 2017

Contributor

Could you please move this closer to the exec call. Should we ever re-work it to pass environment explicitly that part should be close to change at the same time.

stolowski added some commits May 22, 2017

codecov-io commented May 23, 2017

Codecov Report

Merging #3340 into master will decrease coverage by 0.39%.
The diff coverage is 59.8%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #3340     +/-   ##
=========================================
- Coverage   77.55%   77.15%   -0.4%     
=========================================
  Files         371      373      +2     
  Lines       25519    25793    +274     
=========================================
+ Hits        19790    19900    +110     
- Misses       3978     4135    +157     
- Partials     1751     1758      +7
Impacted Files Coverage Δ
overlord/snapstate/snapmgr.go 81.45% <0%> (-0.55%) ⬇️
cmd/snapctl/main.go 50% <100%> (+10%) ⬆️
dirs/dirs.go 96.82% <100%> (+0.05%) ⬆️
overlord/hookstate/hookmgr.go 68.06% <15%> (-5.24%) ⬇️
daemon/api.go 73.19% <50%> (+0.06%) ⬆️
overlord/snapstate/handlers.go 69.82% <62.79%> (-0.24%) ⬇️
overlord/hookstate/context.go 87.09% <88.46%> (+4.38%) ⬆️
interfaces/builtin/ppp.go 45% <0%> (-16.12%) ⬇️
interfaces/builtin/joystick.go 73.07% <0%> (-14.43%) ⬇️
interfaces/builtin/framebuffer.go 67.85% <0%> (-12.92%) ⬇️
... and 104 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a020a0f...35d11ab. Read the comment docs.

The review is a bit extensive, sorry, but nothing pointed out is a big departure from what the PR does. It's actually pretty close, and this is mostly polishing.

cmd/snap-confine/context-support.c
+ goto out;
+ }
+ // context is a 32 bytes, base64-encoding makes it 44.
+ context_val = calloc(1, sizeof(sc_context_cookie));
@niemeyer

niemeyer May 25, 2017

Contributor

Feels a bit strange to be reading it so exactly. Puts unnecessary constraints on the other end. Can we try to read something like 512 bytes and trim at whatever length was read instead?

Then we can also drop the comment above it. Here we don't really care about its exact size or how the other end chooses to encode it.

Finally, can we move over to hexing instead of base64ing it? The trivial space savings in this case isn't worth the ugliness of the non-homogeneous encoding, IMO.

cmd/snap-confine/context-support.c
+ if (context_val == NULL) {
+ die("failed to allocate memory for snap context");
+ }
+ if (read(fd, context_val, sizeof(sc_context_cookie) - 1) < 0) {
@niemeyer

niemeyer May 25, 2017

Contributor

Probably not worth worrying about in this particular review as this is most likely already a problem in snap-confine in general, but reads being interrupted and the program failing when it should be retried is a common source of pain in C applications. We should fix all these cases at once at some point soon.

cmd/snap-confine/context-support.c
+ return context_val;
+}
+
+void sc_maybe_set_context_environment(const char *context)
@niemeyer

niemeyer May 25, 2017

Contributor

This function seems like unnecessary weight. The trivial code inside it is much more clear than the whole boilerplate it introduces and the documentation one must now read to figure out why it exists.

@@ -264,6 +264,9 @@
# Allow snap-confine to move to the void
/var/lib/snapd/void/ r,
+ # Allow snap-confine to read snap contexts
+ /var/lib/snapd/contexts/snap.* r,
@niemeyer

niemeyer May 25, 2017

Contributor

We should probably change the context term while we still have time. So far we have it as a hook state context which seems fine, but this is turning it into a snap context, which seems way too general and will probably hurt conversations.

Also, fixing the code is easy but once it hits the disk it's a whole new level of pain to fix it. We can tune only the new piece without touching the snap hook context for the time being.

Perhaps something like snap cookie (and SNAP_COOKIE, etc) which we don't have yet and seems to match pretty well the generally accepted meaning of the term and the sort of use we're making here. It's a good hint that the implementation is already using the term without even us discussing it, for example.

Let's please quickly talk about it elsewhere before we conclude that move.

overlord/hookstate/context.go
@@ -30,11 +28,13 @@ import (
"github.com/snapcore/snapd/overlord/state"
"github.com/snapcore/snapd/snap"
+ "github.com/snapcore/snapd/strutil"
)
// Context represents the context under which a given hook is running.
@niemeyer

niemeyer May 25, 2017

Contributor
// Context represents the context under which the snap is calling back into snapd.
// It is associated with a task when the callback is happening from within a hook,
// or otherwise considered an ephemeral context in that its associated data will
// be discarded once that individual call is finished.
overlord/hookstate/context.go
- _, err := rand.Read(idBytes)
- if err != nil {
- return nil, fmt.Errorf("cannot generate context ID: %s", err)
+// The task is optional, if nil then context becomes ephemeral. If contextID is empty, then a random ID will be generated.
@niemeyer

niemeyer May 25, 2017

Contributor
// NewContext returns a new context associated with the provided task or
// an ephemeral context if task is nil.
//
// A random ID is generated if contextID is empty.
overlord/hookstate/context.go
@@ -98,14 +97,14 @@ func (c *Context) Handler() Handler {
// and OnDone/Done).
func (c *Context) Lock() {
c.mutex.Lock()
- c.task.State().Lock()
+ c.State().Lock()
@niemeyer

niemeyer May 25, 2017

Contributor

c.state.Lock(), same as Unlock.

overlord/hookstate/context.go
- if err := c.task.Get("hook-context", &data); err != nil && err != state.ErrNoState {
- panic(fmt.Sprintf("internal error: cannot unmarshal context: %v", err))
+ if c.IsEphemeral() {
+ if val, ok := c.cache["ephemeral-context"]; ok {
@niemeyer

niemeyer May 25, 2017

Contributor

Both at once:

data, _ := c.cache["ephemeral-context"].(map[string]*json.RawMessage)
overlord/hookstate/context.go
- if err := c.task.Get("hook-context", &data); err != nil {
- return err
+ if c.IsEphemeral() {
+ if val, ok := c.cache["ephemeral-context"]; ok {
@niemeyer

niemeyer May 25, 2017

Contributor

Both at once:

data, _ := c.cache["ephemeral-context"].(map[string]*json.RawMessage)
overlord/hookstate/context.go
@@ -217,3 +235,8 @@ func (c *Context) Done() error {
return firstErr
}
+
+// IsEphemeral returns true if the context is an ephemeral context (only snap contexts are at the moment).
@niemeyer

niemeyer May 25, 2017

Contributor
// IsEphemeral returns whether the context is ephemeral.

Let's please not try to explain what that means here, as it'll be duplicating a better explanation that is necessary at the top level.

overlord/hookstate/hookmgr.go
@@ -143,14 +143,37 @@ func (m *HookManager) Stop() {
m.runner.Stop()
}
+func (m *HookManager) snapContext(contextID string) (context *Context, err error) {
@niemeyer

niemeyer May 25, 2017

Contributor

s/snap/ephemeral/, will help us understanding the flow better once we forget about it.

overlord/hookstate/hookmgr.go
@@ -143,14 +143,37 @@ func (m *HookManager) Stop() {
m.runner.Stop()
}
+func (m *HookManager) snapContext(contextID string) (context *Context, err error) {
+ var contexts map[string]string
+ err = m.state.Get("snap-contexts", &contexts)
@niemeyer

niemeyer May 25, 2017

Contributor

That's probably "snap-cookies" if we end up moving ahead with that change, which does sound much nicer.

overlord/hookstate/hookmgr.go
+ err = m.state.Get("snap-contexts", &contexts)
+ if err != nil {
+ if err != state.ErrNoState {
+ return nil, fmt.Errorf("failed to get snap contexts: %v", err)
@niemeyer

niemeyer May 25, 2017

Contributor

failed to => cannot

overlord/hookstate/hookmgr.go
+ context, err = NewContext(nil, m.state, &HookSetup{Snap: snapName}, nil, contextID)
+ return context, err
+ }
+ return nil, fmt.Errorf("unknown snap context requested")
@niemeyer

niemeyer May 25, 2017

Contributor

This would become something like "invalid snap cookie provided", I think, which is also nicer.

overlord/hookstate/hookmgr.go
context, ok := m.contexts[contextID]
if !ok {
- return nil, fmt.Errorf("no context for ID: %q", contextID)
+ m.state.Lock()
@niemeyer

niemeyer May 25, 2017

Contributor

A bit strange to see lock happening here, as it assumes an understanding of the internals of the function. Should be done inside ephemeralContext itself I think.

@stolowski

stolowski May 29, 2017

Contributor

Good point, fixed.

overlord/hookstate/hookmgr.go
+ context, err = m.snapContext(contextID)
+ m.state.Unlock()
+ if err != nil {
+ return nil, fmt.Errorf("request performed under an unknown context (%q)", contextID)
@niemeyer

niemeyer May 25, 2017

Contributor

Why is this dropping the original error instead of simply forwarding it?

overlord/snapstate/handlers.go
@@ -460,6 +468,52 @@ func (m *SnapManager) cleanupCopySnapData(t *state.Task, _ *tomb.Tomb) error {
return nil
}
+func (m *SnapManager) createSnapContext(st *state.State, snapName string) error {
@niemeyer

niemeyer May 25, 2017

Contributor

s/Context/Cookie/, again, assuming it flies.

overlord/snapstate/handlers.go
@@ -460,6 +468,52 @@ func (m *SnapManager) cleanupCopySnapData(t *state.Task, _ *tomb.Tomb) error {
return nil
}
+func (m *SnapManager) createSnapContext(st *state.State, snapName string) error {
+ contextID := strutil.MakeRandomString(40)
@niemeyer

niemeyer May 25, 2017

Contributor

Hah.. this isn't base64, and it's not 44 chars either, right? Is integration testing still missing? Let's please have it before we commit this.

@stolowski

stolowski May 29, 2017

Contributor

Right. The discrepancy arose after I switched from random bytes + base64 encoding to MakeRandomString and forgot to update it in snap-confine, and the code and tests are only making sure we don't exceed 44 characters.
Added a spread test to check we have the exact length.

overlord/snapstate/handlers.go
+ err := st.Get("snap-contexts", &contexts)
+ if err != nil {
+ if err != state.ErrNoState {
+ return fmt.Errorf("failed to get snap contexts: %v", err)
@niemeyer

niemeyer May 25, 2017

Contributor

failed to => cannot

overlord/snapstate/handlers.go
@@ -470,6 +524,10 @@ func (m *SnapManager) doLinkSnap(t *state.Task, _ *tomb.Tomb) error {
return err
}
+ if err := m.createSnapContext(st, snapsup.Name()); err != nil {
@niemeyer

niemeyer May 25, 2017

Contributor

Instead of using EnsureState above, that can be a more trivial atomic write and this call can be made conditional on there being no other revision of this snap at the moment (snapst.Current).

Also, this call should be closer to the end of the function, after LinkSnap itself has succeeded, otherwise we'll write down the file and keep leftover state around in case something else here fails. Talk to @pedronis for details on this.

overlord/snapstate/handlers.go
@@ -646,6 +704,10 @@ func (m *SnapManager) undoLinkSnap(t *state.Task, _ *tomb.Tomb) error {
return err
}
+ if err := m.removeSnapContext(st, snapsup.Name()); err != nil {
@niemeyer

niemeyer May 25, 2017

Contributor

Similarly, we want to do this only if what is being undone is the linking of the very first snap, instead of every revision. This is actually a bug rather than just an optimization.. as it is undoing an installation will kill the previous snap's context. Probably worth a spread test.

overlord/snapstate/handlers.go
@@ -745,7 +807,11 @@ func (m *SnapManager) doUnlinkSnap(t *state.Task, _ *tomb.Tomb) error {
// mark as inactive
snapst.Active = false
Set(st, snapsup.Name(), snapst)
- return nil
+
+ if err := m.removeSnapContext(st, snapsup.Name()); err != nil {
@niemeyer

niemeyer May 25, 2017

Contributor

Besides the above concern, this one should also be moved to doDiscardSnap which is the place where we actually kill the left overs of the snap similar to that one.

overlord/snapstate/snapmgr.go
@@ -301,6 +303,11 @@ func Manager(st *state.State) (*SnapManager, error) {
runner: runner,
}
+ // create snap context directory
@niemeyer

niemeyer May 25, 2017

Contributor

Comment is restating the code.

stolowski added some commits May 25, 2017

Contributor

pedronis commented Jun 1, 2017

if I understand snapctl always come from the core snap, but snapd doesn't, so in places where we don't do reexec things will break (because of s/CONTEXT/COOKIE/ ) until they upgrade snapd as well

Contributor

stolowski commented Jun 1, 2017

@pedronis You're right. I didn't know we allow re-exec to be disabled. In this case yes, snapctl with have a newer version in the core and will not work. So yes, snapctl need to look for both variables to mitigate this.

stolowski added some commits Jun 2, 2017

Contributor

stolowski commented Jun 7, 2017

The branch is ready for another review. I've addressed compatiblity issus by having both environment variables set for the time being.

Some extra comments on the C part

cmd/snap-confine/cookie-support-test.c
+static void create_dumy_cookie_file(const char *snap_name,
+ const char *dummy_cookie)
+{
+ char path[256];
@zyga

zyga Jun 8, 2017

Contributor

Can you please use PATH_MAX here

@stolowski

stolowski Jun 9, 2017

Contributor

Done

+ struct sc_error *err = NULL;
+ char *cookie;
+
+ char *dummy = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijmnopqrst";
@zyga

zyga Jun 8, 2017

Contributor

It would be nice if we also checked that the size of this string matches the cookie size.

(Which would be done by the compiler had cookie been a distinct structure type with a fixed-size field inside)

@stolowski

stolowski Jun 9, 2017

Contributor

Technically snap-confine shouldn't care what the exact size is, it just needs to expose the value from cookie file via env variable (with sane limits of course, which is dictated by read, see also the comment below).

cmd/snap-confine/cookie-support.c
+ sc_error_init(SC_ERRNO_DOMAIN, 0,
+ "cannot open cookie file %s, SNAP_COOKIE will not be set",
+ context_path);
+ sc_error_forward(errorp, err);
@zyga

zyga Jun 8, 2017

Contributor

Can you please move this to the bottom of the function and make it execute unconditionally like it is done in other places. This ensures that the error is set to NULL if nothing goes wrong. In the current implementation this is not the case.

@stolowski

stolowski Jun 9, 2017

Contributor

Done

cmd/snap-confine/cookie-support.c
+ return NULL;
+ }
+
+ char context_val[255];
@zyga

zyga Jun 8, 2017

Contributor

Where is the size of the context coming from?

@stolowski

stolowski Jun 9, 2017

Contributor

As per @niemeyer earlier comment, I've changed this not be so precise. The read below will read the value up to the context_val buffer size.

cmd/snap-confine/cookie-support.c
+ sc_error_forward(errorp, err);
+ return NULL;
+ }
+ context_val[n] = 0;
@zyga

zyga Jun 8, 2017

Contributor

Can you instead please use strndup() below?

@stolowski

stolowski Jun 9, 2017

Contributor

Done

cmd/snap-confine/cookie-support.c
+ return NULL;
+ }
+ context_val[n] = 0;
+ return strdup(context_val);
@zyga

zyga Jun 8, 2017

Contributor

Can you please check for NULL result of strdup and return an error if this happens?

@stolowski

stolowski Jun 9, 2017

Contributor

Done

A few small changes and +1

cmd/snap-confine/cookie-support.c
+ fd = open(context_path, O_RDONLY | O_NOFOLLOW | O_CLOEXEC);
+ if (fd < 0) {
+ err =
+ sc_error_init(SC_ERRNO_DOMAIN, 0,
@zyga

zyga Jun 9, 2017

Contributor

I should have noticed earlier:

err = sc_error_init_from_errno(errno, "cannot open cookie file %s", context_path);
cmd/snap-confine/cookie-support.c
+ goto out;
+ }
+
+ char context_val[255];
@zyga

zyga Jun 9, 2017

Contributor

Can you add a comment explaining that this is an arbitrary "large enough" buffer for an opaque cookie.

cmd/snap-confine/cookie-support.c
+ }
+
+ char context_val[255];
+ int n = read(fd, context_val, sizeof(context_val) - 1);
@zyga

zyga Jun 9, 2017

Contributor
ssize_t n
cmd/snap-confine/cookie-support.c
+ int n = read(fd, context_val, sizeof(context_val) - 1);
+ if (n < 0) {
+ err =
+ sc_error_init(SC_ERRNO_DOMAIN, 0,
@zyga

zyga Jun 9, 2017

Contributor
err = sc_error_init_from_errno(errno, "cannot read cookie file %s", context_path)
cmd/snap-confine/snap-confine.c
+ __attribute__ ((cleanup(sc_cleanup_error))) = NULL;
+ snap_context = sc_cookie_get_from_snapd(snap_name, &err);
+ if (err != NULL) {
+ error("cannot get cookie: %s", sc_error_msg(err));
@zyga

zyga Jun 9, 2017

Contributor

I think it is fine to just print the message, errors already talk about the cookie.

Contributor

stolowski commented Jun 9, 2017

@zyga Done

zyga approved these changes Jun 9, 2017

+1

@stolowski stolowski merged commit 2afccb5 into snapcore:master Jun 9, 2017

7 checks passed

artful-amd64 autopkgtest finished (success)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
xenial-amd64 autopkgtest finished (success)
Details
xenial-i386 autopkgtest finished (success)
Details
xenial-ppc64el autopkgtest finished (success)
Details
yakkety-amd64 autopkgtest finished (success)
Details
zesty-amd64 autopkgtest finished (success)
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment