cmd/snap-update-ns: add actual implementation #3225
I found a small bug that required changes to |
zyga
added some commits
Apr 24, 2017
stolowski
requested changes
Apr 25, 2017
Just made 1st quick pass over these changes. Looks good, some nitpicks, see individual comments, will do 2nd pass later.
It seems that we will be keeping the .lock files around forever, which is fine... Just curious whether there are strong reasons to do that, instead of creating them with O_EXCL and removing them when done?
| + changesNeeded := mount.NeededChanges(current, desired) | ||
| + fmt.Fprintf(os.Stderr, "CHANGES NEEDED:\n") | ||
| + for _, change := range changesNeeded { | ||
| + fmt.Fprintf(os.Stderr, " - %s\n", change) |
stolowski
Apr 25, 2017
Contributor
How about a small lambda to avoid the repetitions of fmt.Fprintf(os.Stderr, " - %s\n".... above and below? The lambda could possibly replace the entire loop, but I'm not sure of that.
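A minimal sketch of the kind of helper suggested here, assuming the items can be rendered as strings. The names `printList` and `renderList` are hypothetical and not part of the PR; the real loop iterates over the changes returned by `mount.NeededChanges`.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// printList writes a heading followed by one " - item" line per entry.
// Hypothetical helper: it replaces the repeated Fprintf loops in the diff.
func printList(w io.Writer, heading string, items []string) {
	fmt.Fprintf(w, "%s:\n", heading)
	for _, item := range items {
		fmt.Fprintf(w, " - %s\n", item)
	}
}

// renderList returns the same output as a string, which is handy in tests.
func renderList(heading string, items []string) string {
	var sb strings.Builder
	printList(&sb, heading, items)
	return sb.String()
}

func main() {
	// The entry below is a placeholder, not real snap-update-ns output.
	printList(os.Stderr, "CHANGES NEEDED", []string{"example change"})
}
```

Taking an `io.Writer` rather than hard-coding `os.Stderr` keeps the helper testable without capturing the process's standard error.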
| + | ||
| +// lockFileName returns the name of the lock file for the given snap. | ||
| +func lockFileName(snapName string) string { | ||
| + return filepath.Join(dirs.SnapRunLockDir, fmt.Sprintf("%s.lock", snapName)) |
stolowski
Apr 25, 2017
Contributor
I wonder if we will ever want more lock files for other non-conflicting operations, in which case it would make sense to give this lock a more specific name, e.g. snap.mount-lock?
zyga
Apr 25, 2017
Contributor
So far all locking is either global (all namespaces) or scoped to a specific snap. The lock file protects the $SNAP_NAME.mnt file from concurrent modification.
| + # current mount namespace. | ||
| + /usr/lib/snapd/snap-discard-ns $PLUG_SNAP | ||
| + echo "Check that snap-update-ns fails after discarding the mount namespace" | ||
| + /usr/lib/snapd/snap-update-ns $PLUG_SNAP 2>snap-update-ns.log | MATCH "cannot update snap namespace: cannot switch mount namespace: invalid argument" |
zyga
Apr 25, 2017
Contributor
I have way more coming. This works but I have also the full-blown version that does everything automatically and I'll be just adding more tests now.
zyga
added some commits
Apr 25, 2017
We cannot remove the lock files as this would make them useless. If we open them with exclusive flag then only one process can succeed and ... what then? What does the 2nd guy do? Try again? The trick is that nobody removes them (maybe snapd could when the snap is purged entirely) so that anyone can open them and then the real race is around the only primitive that is sensible, flock itself.
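The pattern described above can be sketched as follows. This is an illustration only, assuming a Linux system; `withLock` and `lockDemo` are hypothetical names, and the real code goes through `mount.OpenLock`, whose implementation is not shown in this conversation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// withLock opens the lock file (creating it if needed, never with O_EXCL),
// takes an exclusive flock, runs fn, and releases the lock by closing the
// descriptor. The file itself is deliberately left on disk.
func withLock(path string, fn func() error) error {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0600)
	if err != nil {
		return fmt.Errorf("cannot open lock file %q: %s", path, err)
	}
	defer f.Close() // closing the descriptor drops the flock
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		return fmt.Errorf("cannot lock %q: %s", path, err)
	}
	return fn()
}

// lockDemo runs withLock in a temporary directory and reports whether the
// callback ran and whether the lock file still exists afterwards.
func lockDemo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "lock-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "example-snap.lock")
	ran := false
	if err := withLock(path, func() error { ran = true; return nil }); err != nil {
		return ran, false, err
	}
	_, statErr := os.Stat(path)
	return ran, statErr == nil, nil
}

func main() {
	ran, exists, err := lockDemo()
	fmt.Println(ran, exists, err)
}
```

Because every process opens the same long-lived file and then blocks in `flock`, a second caller simply waits instead of having to decide what to do after a failed exclusive create.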
| // There is some C code that runs before main() is started. | ||
| // That code always runs and sets an error condition if it fails. | ||
| // Here we just check for the error. | ||
| if err := BootstrapError(); err != nil { | ||
| + // If there is no mount namespace to transition to let's just quit | ||
| + // instantly without any errors as there is nothing to do anymore. |
stolowski
Apr 25, 2017
Contributor
Please bear with me and excuse my ignorance... Can you explain why not having a mount ns to transition to is OK here and can be silently ignored? Perhaps extending this comment to explain the typical scenario in which this happens would help anyone not familiar with namespaces :}
zyga
Apr 25, 2017
Contributor
The goal of the tool is to update a mount namespace. If no mount namespace exists there is nothing to do
zyga
Apr 25, 2017
Contributor
This essentially allows snapd to just use this tool without having to coordinate
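A rough sketch of that behaviour, assuming the sentinel error `ErrNoNS` from this PR and a hypothetical `exitCode` helper (the real tool checks `BootstrapError()` in `main`). The `errSwitchFailed` value is a stand-in for any other bootstrap failure.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// ErrNoNS mirrors the sentinel from the PR: the snap has no mount namespace.
var ErrNoNS = errors.New("cannot update mount namespace that was not created yet")

// errSwitchFailed is a hypothetical example of a real bootstrap failure.
var errSwitchFailed = errors.New("cannot switch mount namespace: invalid argument")

// exitCode maps a bootstrap error to a process exit status. A missing
// namespace is not a failure: there is nothing to update, so the tool
// exits successfully and snapd can call it unconditionally.
func exitCode(bootstrapErr error) int {
	if bootstrapErr == nil || bootstrapErr == ErrNoNS {
		return 0
	}
	fmt.Fprintf(os.Stderr, "cannot update snap namespace: %s\n", bootstrapErr)
	return 1
}

func main() {
	os.Exit(exitCode(ErrNoNS))
}
```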
| "fmt" | ||
| "syscall" | ||
| "unsafe" | ||
| ) | ||
| +var ( | ||
| + ErrNoNS = errors.New("no namespace") | ||
| +) |
niemeyer
Apr 25, 2017
Contributor
This can be a single line, and it'd be nice to have a message that is still terse but slightly clearer, so that if it ever leaks we know where to look:
var ErrNoNS = errors.New("cannot find namespace to update")
| + // of snap-confine are synchronized and will see consistent state. | ||
| + lock, err := mount.OpenLock(snapName) | ||
| + if err != nil { | ||
| + return fmt.Errorf("cannot open mount namespace lock file: %s", err) |
niemeyer
Apr 25, 2017
Contributor
Oh, can we please add the snap name to all of these errors? This will definitely be helpful when debugging.
"cannot open mount namespace lock file for snap %q: %s"
etc.
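For illustration, the suggested format could be wrapped like this. `lockOpenError` and `errDenied` are hypothetical names used only for this sketch; the PR formats the error inline.

```go
package main

import (
	"errors"
	"fmt"
)

// errDenied is a hypothetical underlying failure used for the example.
var errDenied = errors.New("permission denied")

// lockOpenError adds the snap name to the error so that logs from
// different snaps can be told apart when debugging.
func lockOpenError(snapName string, err error) error {
	return fmt.Errorf("cannot open mount namespace lock file for snap %q: %s", snapName, err)
}

func main() {
	fmt.Println(lockOpenError("example-snap", errDenied))
}
```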
| + if err := lock.Lock(); err != nil { | ||
| + return fmt.Errorf("cannot lock mount namespace: %s", err) | ||
| + } | ||
| + defer lock.Close() |
| + changesMade = append(changesMade, change) | ||
| + continue | ||
| + } | ||
| + // Read mount info each time as our operations may have unexpected |
niemeyer
Apr 25, 2017
Contributor
That seems awkward. Doing that when something errors is perhaps justifiable since we don't know whether it worked or not, but loading it every single time because we have no idea seems very suspect.
zyga
Apr 25, 2017
Contributor
I think it is OK to err on the safe side. The alternative is to say that we know exactly how the kernel (including bugs) performs mount and unmount operations so that we can simulate them here. I'm not sure I like that assumption.
niemeyer
May 10, 2017
Contributor
I'm still not comfortable with that. It's akin to rebooting the system because one has absolutely no clue of what is going on. Yes, it tends to work, but it demonstrates lack of understanding of the system, and problems that are being ignored.
If we need to reload this on every iteration, we very much need to know why we're doing that. What is changing between each of these iterations that could modify something that will affect follow up iterations? If the answer is we don't know, we need to think harder about what this tool is doing.
| + if err != nil { | ||
| + return fmt.Errorf("cannot read mount-info table: %s", err) | ||
| + } | ||
| + if !change.Needed(mounted) { |
niemeyer
Apr 25, 2017
Contributor
Shouldn't this consider prefixes as well? I don't recall seeing that logic in Needed.
zyga
Apr 25, 2017
Contributor
Can you expand on this? I think one thing we need to handle better here is when an operation fails we should abort all the changes to the sub-tree (e.g. don't try to mount something when earlier unmount in the same sub-tree failed). Is that what you mean?
niemeyer
May 10, 2017
Contributor
What happens if mounted is a prefix of the modification described in change, and what should happen?
zyga
May 15, 2017
Contributor
Aha, interesting! I think that the algorithm that computes the needed changes already handles prefix changes. Since I removed the Change.Needed code entirely I think this is okay now. We just do exactly what we computed and we always keep track of what we did.
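The "do exactly what we computed and keep track of what we did" approach might look roughly like the sketch below. The `change` type, `applyChanges`, and `errPerform` are simplified, hypothetical stand-ins for `mount.Change` and `Change.Perform`, and skipping a failed change while continuing with the rest is an assumption, not confirmed behaviour of the PR.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// change is a simplified stand-in for mount.Change.
type change struct {
	action string // "mount" or "unmount"
	dir    string
}

// errPerform is a hypothetical failure used in the example.
var errPerform = errors.New("hypothetical mount failure")

// applyChanges performs each precomputed change in order and returns only
// those that succeeded, so the saved profile reflects what was really done.
func applyChanges(needed []change, perform func(change) error) []change {
	var made []change
	for _, c := range needed {
		if err := perform(c); err != nil {
			fmt.Fprintf(os.Stderr, "cannot %s %s: %s\n", c.action, c.dir, err)
			continue
		}
		made = append(made, c)
	}
	return made
}

func main() {
	needed := []change{{"unmount", "/a"}, {"mount", "/b"}}
	made := applyChanges(needed, func(c change) error {
		if c.dir == "/b" {
			return errPerform
		}
		return nil
	})
	fmt.Println(len(made))
}
```

With this shape there is no need to re-read the mount table on every iteration: the list of changes made is the record of what happened.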
| + changesMade = append(changesMade, change) | ||
| + continue | ||
| + } | ||
| + fmt.Printf("%s\n", change) |
zyga
Apr 25, 2017
Contributor
In this version it is used for trivial testing. It gets removed when the Change.Perform branch is combined with more extensive tests that measure actual mounts being changed, not just this being printed.
zyga
Apr 27, 2017
Contributor
Oh, since Change.Perform branch has been merged I can iterate on this. Let me update the tests to do real stuff now.
niemeyer
May 10, 2017
Contributor
It's still in the PR. We shouldn't be printing random output like this.
| + | ||
| + // Compute the new current profile so that it contains only changes that were made | ||
| + // and save it back for next runs. | ||
| + current = &mount.Profile{} |
zyga
Apr 27, 2017
Contributor
I renamed current to currentBefore and currentAfter so that there's no confusion about this. Also applied the suggestion you made.
zyga
added some commits
Apr 27, 2017
stolowski
requested changes
Apr 28, 2017
Looks good, just two comments regarding tests.
| + // of snap-confine are synchronized and will see consistent state. | ||
| + lock, err := mount.OpenLock(snapName) | ||
| + if err != nil { | ||
| + return fmt.Errorf("cannot open lock file for mount namespace of snap %q: %s", snapName, err) |
stolowski
Apr 28, 2017
Contributor
It would be good to have a test for this error case, can you add one?
@stolowski it is not a TODO; the printf is used by tests. As for the missing tests, I think that testing the locking error is possible, but as you can see there are no unit tests at all here, just integration tests. I will be iterating on this (primarily on testing) but I'd love to see this land so that we can start testing it the hard way to discover the more interesting bugs.
stolowski
approved these changes
Apr 28, 2017
OK, sure. Looking forward to the upcoming branches then. +1
zyga
added some commits
May 3, 2017
| -// Error returns error (if any) encountered in pre-main C code. | ||
| +var ( | ||
| + // ErrNoNS is a distinct error returned when a snap namespace does not exist. | ||
| + ErrNoNS = errors.New("cannot update mount namespace that was not created yet") |
| + # Check that the shared content is not mounted. | ||
| + snap run --shell $PLUG_SNAP.content-plug -c 'test ! -e $SNAP/import/shared-content' | ||
| + | ||
| + # Run snap-update-ns to see that the setns part worked and that it did nothing at all. |
zyga
added some commits
May 15, 2017
zyga
dismissed
niemeyer’s
stale review
May 15, 2017
Changes applied as requested. Gustavo is off for two days and I'd like to iterate. Chipaca approved
zyga commented Apr 24, 2017
This patch adds a non-dummy implementation of snap-update-ns. There are
still three pieces missing. There's no locking so concurrently running
snap-confine is not synchronized. The function that determines if a
mount change is needed is dummy and always returns true. The mount
changes are not really performed yet as the Perform function is just a
stub. The stubs will be addressed with separate PRs.
All that the tool now does is to print what should be done instead of
actually doing it.
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>