Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/snap-update-ns: add actual implementation #3225

Merged
merged 35 commits into from May 15, 2017

Conversation

zyga
Copy link
Collaborator

@zyga zyga commented Apr 24, 2017

This patch adds a non-dummy implementation of snap-update-ns. There are
still three pieces missing. There's no locking so concurrently running
snap-confine is not synchronized. The function that determines if a
mount change is needed is dummy and always returns true. The mount
changes are not really performed yet as the Perform function is just a
stub. The stubs will be addressed with separate PRs.

All that the tool now does is to print what should be done instead of
actually doing it.

Signed-off-by: Zygmunt Krynicki zygmunt.krynicki@canonical.com

@zyga
Copy link
Collaborator Author

zyga commented Apr 24, 2017

I found a small bug that required changes to ChangesNeeded. I'll update the branch shortly.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
This assists in computing the effective current profile as all
the kept and mounted things are in in.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
This patch adds a non-dummy implementation of snap-update-ns. There are
still two pieces missing. The function that determines if a mount change
is needed is dummy and always returns true. The mount changes are not
really performed yet as the Perform function is just a stub. The stubs
will be addressed with separate PRs.

All that the tool now does is to print, to stdout, what should be done
instead of actually doing it. Stderr is a bit more noisy, but
essentially explains the same thing with more detail.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Copy link
Contributor

@stolowski stolowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just made 1st quick pass over these changes. Looks good, some nitpicks, see individual comments, will do 2nd pass later.

It seems that we will be keeping .lock file around forever, which is fine... Just curious if there are strong reason s to do that, instead of creating them with O_EXCL and removing when done?


// lockFileName returns the name of the lock file for the given snap.
func lockFileName(snapName string) string {
return filepath.Join(dirs.SnapRunLockDir, fmt.Sprintf("%s.lock", snapName))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we will ever want more lock files for other non-conflicting operations, in which case it would make sense to give this lock a more specific name, e.g. snap.mount-lock?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far all locking is either global (all namespaces) or scoped to a specific snap. The lock file protects the $SNAP_NAME.mnt file from concurrent modification.

changesNeeded := mount.NeededChanges(current, desired)
fmt.Fprintf(os.Stderr, "CHANGES NEEDED:\n")
for _, change := range changesNeeded {
fmt.Fprintf(os.Stderr, " - %s\n", change)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a small lambda to avoid the repetitions of fmt.Fprintf(os.Stderr, " - %s\n".... above and below? The lambda could possibly replace the entire loop, but I'm not sure of that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'll just drop those. I don't think we need them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped now.

# current mount namespace.
/usr/lib/snapd/snap-discard-ns $PLUG_SNAP
echo "Check that snap-update-ns fails after discarding the mount namespace"
/usr/lib/snapd/snap-update-ns $PLUG_SNAP 2>snap-update-ns.log | MATCH "cannot update snap namespace: cannot switch mount namespace: invalid argument"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks for all these tests!

Copy link
Collaborator Author

@zyga zyga Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have way more coming. This works but I have also the full-blown version that does everything automatically and I'll be just adding more tests now.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
@zyga
Copy link
Collaborator Author

zyga commented Apr 25, 2017

We cannot remove the lock files as this would make them useless. If we open them with exclusive flag then only one process can succeed and ... what then? What does the 2nd guy do? Try again? The trick is that nobody removes them (maybe snapd could when the snap is purged entirely) so that anyone can open them and then the real race is around the only primitive that is sensible, flock itself.

// There is some C code that runs before main() is started.
// That code always runs and sets an error condition if it fails.
// Here we just check for the error.
if err := BootstrapError(); err != nil {
// If there is no mount namespace to transition to let's just quit
// instantly without any errors as there is nothing to do anymore.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please bear with me and excuse me my ignorance... Can you explain why not having a mount ns to transition to is ok here and can be silently ignored? Perhaps extending this comment to explain what is the typical scenario for this to happen would be good for anyone not familiar with namespaces :}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of the tool is to update a mount namespace. If no mount namespace exists there is nothing to do

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This essentially allows snapd to just use this tool without having to coordinate

Copy link
Contributor

@niemeyer niemeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, glad to see the feature almost there.

"fmt"
"syscall"
"unsafe"
)

var (
ErrNoNS = errors.New("no namespace")
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a single line, and it'd be nice to have a still terse message but slightly more clear one so that if it ever leaks we know where to look at:

var ErrNoNS = errors.New("cannot find namespace to update")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, will change

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// of snap-confine are synchronized and will see consistent state.
lock, err := mount.OpenLock(snapName)
if err != nil {
return fmt.Errorf("cannot open mount namespace lock file: %s", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the descriptive errors here and below!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, can we please add the snap name to all of these errors? This will definitely be helpful when debugging.

"cannot open mount namespace lock file for snap %q: %s"

etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done,

if err := lock.Lock(); err != nil {
return fmt.Errorf("cannot lock mount namespace: %s", err)
}
defer lock.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be before the branch above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

changesMade = append(changesMade, change)
continue
}
// Read mount info each time as our operations may have unexpected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems awkward. Doing that when something errors is perhaps justifiable since we don't know whether it worked or not, but loading it every single time because we have no idea seems very suspect.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is ok to err on the safe side. The alternative is to say the we know exactly how the kernel (including bugs) performs mount and unmount operations so that we can simulate them here. I'm not sure I like that assumption.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not comfortable with that. It's akin to rebooting the system because one has absolutely no clue of what is going on. Yes, it tends to work, but it demonstrates lack of understanding of the system, and problems that are being ignored.

If we need to reload this on every iteration, we very much need to know why we're doing that. What is changing between each of these iterations that could modify something that will affect follow up iterations? If the answer is we don't know, we need to think harder about what this tool is doing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is now gone, along with Change.Needed

if err != nil {
return fmt.Errorf("cannot read mount-info table: %s", err)
}
if !change.Needed(mounted) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this consider prefixes as well? I don't recall seeing that logic in Needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on this? I think one thing we need to handle better here is when an operation fails we should abort all the changes to the sub-tree (e.g. don't try to mount something when earlier unmount in the same sub-tree failed). Is that what you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if mounted is a prefix of the modification described in change, and what should happen?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, interesting! I think that the algorithm that computes the needed changes already handles prefix changes. Since I removed the Change.Needed code entirely I think this is okay now. We just do exactly what we computed and we always keep track of what we did.

changesMade = append(changesMade, change)
continue
}
fmt.Printf("%s\n", change)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this version it is used for trivial testing. It gets removed when the Change.Perform branch is combined with a more extensive tests that measures actual mounts being changed, not just this being printed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, since Change.Perform branch has been merged I can iterate on this. Let me update the tests to do real stuff now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's still TODO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's done now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still in the PR. We shouldn't be printing random output like this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now gone :)


// Compute the new current profile so that it contains only changes that were made
// and save it back for next runs.
current = &mount.Profile{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var current mount.Profile

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is re-setting the existing variable, am I missing anything?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed current to currentBefore and currentAfter so that there's no confusion about this. Also applied the suggestion you made.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Copy link
Contributor

@stolowski stolowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just two comments regarding tests.

// of snap-confine are synchronized and will see consistent state.
lock, err := mount.OpenLock(snapName)
if err != nil {
return fmt.Errorf("cannot open lock file for mount namespace of snap %q: %s", snapName, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have a test for this error case, can you add one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on a branch with unit tests for all of the code here.

changesMade = append(changesMade, change)
continue
}
fmt.Printf("%s\n", change)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's still TODO?

@zyga
Copy link
Collaborator Author

zyga commented Apr 28, 2017

@stolowski it is not a todo, it is used by tests (the printf)

as for missing tests I think that testing the locking error is possible but as you see there are no unit tests at all here, just integration tests. I will be iterating on this (primarily on testing) but I'd love to see this land so that we can start testing it the hard way to discover the more interesting bugs.

Copy link
Contributor

@stolowski stolowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sure. Looking forward for the upcoming branches then. +1

zyga added 4 commits May 3, 2017 09:43
This patch adds a non-dummy implementation of snap-update-ns. There are
still two pieces missing. The function that determines if a mount change
is needed is dummy and always returns true. The mount changes are not
really performed yet as the Perform function is just a stub. The stubs
will be addressed with separate PRs.

All that the tool now does is to print, to stdout, what should be done
instead of actually doing it. Stderr is a bit more noisy, but
essentially explains the same thing with more detail.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
zyga added 14 commits May 15, 2017 13:01
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
The snap-update-ns tool used to print things so that initial tests could
measure that something was going on. As the tool does everything now and
runs automatically tests can be simplified to look for real side-effects
instead.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
This patch removes the unimplemented `Change.Needed` method. The method
was designed to inspect the mount namespace, as exposed by the kernel
mountinfo interface, and look for signs that a change has already
occured but was not recorded (e.g. it was constructed by version of
snap-confine earlier than 2.25). In retrospective this feature is very
complex and not really needed as we know exactly what was mounted so we
don't need to guess (using the much more complex kernel interface).

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Copy link
Contributor

@chipaca chipaca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM; just a couple of minor things

# Check that the shared content is not mounted.
snap run --shell $PLUG_SNAP.content-plug -c 'test ! -e $SNAP/import/shared-content'

# Run snap-update-ns to see that setns part worked and we got did nothing at all.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we got did nothing at all

this sentence has uses too many verbs

// Error returns error (if any) encountered in pre-main C code.
var (
// ErrNoNS is a distinct error returned when a snap namespace does not exist.
ErrNoNS = errors.New("cannot update mount namespace that was not created yet")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're trying to have all errors called FooError, not ErrFoo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I was wrong, as it's a variable and not a type)

zyga added 2 commits May 15, 2017 15:49
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
@zyga zyga dismissed niemeyer’s stale review May 15, 2017 15:17

Changes applied as requested. Gustavo is off for two days and I'd like to iterate. Chipaca approved

@zyga zyga merged commit 295dfb6 into snapcore:master May 15, 2017
@zyga zyga deleted the feature/update-ns/tool branch May 15, 2017 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants