api, ifacestate: resolve disconnect early #3212
Conversation
stolowski
added some commits
Apr 19, 2017
stolowski
changed the title from
api, ifacemgr: resolve disconnect early
to
api, ifacestate: resolve disconnect early
Apr 19, 2017
zyga
reviewed
Apr 20, 2017
Looks good except for one detail. I don't know what to do about it that isn't terrible though :)
+ if err == nil {
+     for _, connRef := range conns {
+         var ts *state.TaskSet
+         ts, err = ifacestate.Disconnect(st, connRef.PlugRef.Snap, connRef.PlugRef.Name, connRef.SlotRef.Snap, connRef.SlotRef.Name)
zyga
Apr 20, 2017
Contributor
The only potential downside I can think of is that the disconnects will run in parallel, but they need to synchronize on the common snap and I'm not sure that happens already. It will also set up security for one of the snaps multiple times (typically core, so I think that's okay, though it will send N requests for udev to reload).
stolowski
Apr 20, 2017
Contributor
Wasn't CheckChangeConflict supposed to achieve that in the old implementation (except of course it wouldn't really work because of the aforementioned bug)? Yes, this looks like something I need to address.
pedronis
Apr 26, 2017
Contributor
As we discussed in a standup, at most one doDisconnect task runs at a time (not just per snap, but globally).
stolowski
Apr 27, 2017
Contributor
Yes, thanks for the reminder; this is achieved with the SetBlocked lambda in ifacemgr, so it's not a problem at all.
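A minimal sketch of how such global serialization can be expressed via the TaskRunner's SetBlocked predicate (runner stands for the manager's *state.TaskRunner from github.com/snapcore/snapd/overlord/state; the task kinds checked here are illustrative, and ifacemgr's actual predicate may be written differently):

// Sketch only: block any connect/disconnect task while another one is
// already running, so at most one runs at a time, globally.
func setupInterfaceTaskSerialization(runner *state.TaskRunner) {
    runner.SetBlocked(func(t *state.Task, running []*state.Task) bool {
        kind := t.Kind()
        if kind != "connect" && kind != "disconnect" {
            return false
        }
        for _, other := range running {
            k := other.Kind()
            if k == "connect" || k == "disconnect" {
                return true // wait until the other interface task finishes
            }
        }
        return false
    })
}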
pedronis
Apr 28, 2017
Contributor
What we said is true, and this is older than the PR, but there is something odd about the CheckChangeConflict call in ifacestate.Disconnect, given that CheckChangeConflict itself does not take connect/disconnect tasks into account. We need to rethink that and decide whether to drop those checks.
@@ -3473,30 +3485,18 @@ func (s *apiSuite) TestDisconnectPlugFailureNoSuchPlug(c *check.C) {
      c.Assert(err, check.IsNil)
      rec := httptest.NewRecorder()
      interfacesCmd.POST(interfacesCmd, req, nil).ServeHTTP(rec, req)
-     c.Check(rec.Code, check.Equals, 202)
+     c.Check(rec.Code, check.Equals, 400)
@@ -458,13 +444,9 @@ func (m *InterfaceManager) doDisconnect(task *state.Task, _ *tomb.Tomb) error {
      return err
  }
- affectedConns, err := m.repo.ResolveDisconnect(plugRef.Snap, plugRef.Name, slotRef.Snap, slotRef.Name)
zyga
Apr 20, 2017
Contributor
One thing that we might see (this is probably very unlikely) is a new snapd handling a task that was created by an old snapd. If that were to happen, this code would misbehave.
stolowski
Apr 20, 2017
Contributor
Very good point. I think we can either keep this logic here for such cases, or add a patch to convert old tasks to the new approach. Let's discuss in the HO/forum.
stolowski
Apr 24, 2017
Contributor
Right. That sounds better than keeping logic that cannot be properly exercised under normal circumstances and is effectively dead most of the time. Thanks, I'll look into patching.
stolowski
Apr 24, 2017
Contributor
OK, as discussed in the standup this is a non-issue after all; we will let disconnect fail in such rare cases. No need for patching.
niemeyer
Apr 26, 2017
Contributor
Can we please just ensure that it will error properly instead of arbitrarily? In other words, check that the refs look properly resolved and if not return an error such as "snapd changed, please retry the operation".
stolowski
Apr 27, 2017
Contributor
OK, fixed to error out with that message on the Disconnect call. I also realized it's better to call repo.Disconnect() than repo.DisconnectAll(); fixed that as well.
+     if err != nil {
+         break
+     }
+     ts.JoinLane(st.NewLane())
zyga
Apr 20, 2017
Contributor
This is also very nice. Each disconnect will be undone separately if something fails, and once we introduce the disconnect hook this will become essential. Thanks!
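For context, a small generic sketch of the lane behaviour being praised here (not code from this PR; ts1 and ts2 stand for any two *state.TaskSet, st for the locked state):

// Two tasksets in one change, each joined to its own lane: if a task in
// lane 2 fails, only lane 2's tasks are undone; lane 1 is unaffected.
chg := st.NewChange("example", "independent undo via lanes")
ts1.JoinLane(st.NewLane())
chg.AddAll(ts1)
ts2.JoinLane(st.NewLane())
chg.AddAll(ts2)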
- if err != nil {
-     return err
- }
+ affectedConns := []interfaces.ConnRef{{PlugRef: plugRef, SlotRef: slotRef}}
  m.repo.DisconnectAll(affectedConns)
pedronis
Apr 26, 2017
Contributor
So if I understand correctly, this never fails; at most it's a no-op?
+ var conns []interfaces.ConnRef
+ repo := c.d.overlord.InterfaceManager().Repository()
+ // note: disconnect can affect multiple connections, so unlike with 'connect' above we cannot give a good summary -
+ // just include the snaps/plug/slot names given to the API, before they are resolved.
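Putting the hunks above together, the API-side flow is roughly the following (a sketch with request decoding, locking and error handling abridged; plugSnap/plugName/slotSnap/slotName stand for the names taken from the request, st is the locked state, and the change kind string is illustrative):

// Resolve the affected connections up front, before any tasks are created,
// so every snap involved is known when (future) hook tasks are built.
repo := c.d.overlord.InterfaceManager().Repository()
conns, err := repo.ResolveDisconnect(plugSnap, plugName, slotSnap, slotName)
if err != nil {
    return BadRequest("%v", err) // e.g. no such plug/slot -> 400, as the test change above expects
}
// The summary uses the names as given to the API: once resolved, a
// disconnect can touch multiple connections, so there is no single pair to name.
summary := fmt.Sprintf("Disconnect %s:%s from %s:%s", plugSnap, plugName, slotSnap, slotName)
chg := st.NewChange("disconnect-snap", summary)
for _, connRef := range conns {
    ts, err := ifacestate.Disconnect(st, connRef.PlugRef.Snap, connRef.PlugRef.Name, connRef.SlotRef.Snap, connRef.SlotRef.Name)
    if err != nil {
        return BadRequest("%v", err)
    }
    // Each connection gets its own lane so failures are undone independently.
    ts.JoinLane(st.NewLane())
    chg.AddAll(ts)
}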
stolowski
added some commits
Apr 27, 2017
pedronis
requested changes
Apr 28, 2017
There is an issue I fear: repo.Disconnect can fail not only because the ref is not resolved (changed snapd), but also because one or both of the snaps have gone away since we created the change.
  if err != nil {
-     return err
+     return fmt.Errorf("snapd changed, please retry the operation: %v", err)
pedronis
Apr 28, 2017
Contributor
I fear this is a bit more complicated. Given that we don't use conflicts, we can get this error when plugRef.Snap or slotRef.Snap don't exist anymore at all, in which case the old code would just skip (see below). I think we need to change the order of things: do the snapstate.Gets first, and then this, otherwise we would produce this error misleadingly in some corner cases.
stolowski
May 4, 2017
Contributor
OK, I think I addressed this problem by refactoring/reordering things there.
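A rough sketch of the reordering discussed here (not the exact final code; it assumes a vanished snap should turn the task into a no-op, matching the old skip behaviour, and that repo.Disconnect reports an unresolved connection as an error):

// Check first whether both snaps still exist; if one is gone there is
// nothing left to disconnect, so skip rather than error.
var plugSnapst, slotSnapst snapstate.SnapState
if err := snapstate.Get(st, plugRef.Snap, &plugSnapst); err == state.ErrNoState {
    return nil
} else if err != nil {
    return err
}
if err := snapstate.Get(st, slotRef.Snap, &slotSnapst); err == state.ErrNoState {
    return nil
} else if err != nil {
    return err
}
// Only now treat a failure to disconnect as a "snapd changed" situation
// (e.g. a task created by an older snapd whose refs no longer resolve).
if err := m.repo.Disconnect(plugRef.Snap, plugRef.Name, slotRef.Snap, slotRef.Name); err != nil {
    return fmt.Errorf("snapd changed, please retry the operation: %v", err)
}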
stolowski
added some commits
May 2, 2017
@pedronis Ready for another look, I think.
@stolowski Tests broken, it seems.
stolowski
added some commits
May 11, 2017
pedronis
approved these changes
May 11, 2017
Looks good. As we discussed, we should fix CheckChangeConflict to detect conflicts with connect/disconnect as well, but the problem pre-exists this branch and appears only if people use interface hooks, so it can be done in a follow-up.
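For that follow-up, a rough sketch of how a conflict check could take connect/disconnect tasks into account (a hypothetical helper, not snapd's actual CheckChangeConflict; it assumes the tasks store their refs under the "plug" and "slot" keys, as doConnect/doDisconnect do):

// Returns an error if any not-yet-finished connect/disconnect task touches
// the given snap.
func checkInterfaceTaskConflict(st *state.State, snapName string) error {
    for _, t := range st.Tasks() {
        kind := t.Kind()
        if kind != "connect" && kind != "disconnect" {
            continue
        }
        if t.Status().Ready() {
            continue // already done/undone, no conflict
        }
        var plugRef interfaces.PlugRef
        var slotRef interfaces.SlotRef
        if t.Get("plug", &plugRef) != nil || t.Get("slot", &slotRef) != nil {
            continue
        }
        if plugRef.Snap == snapName || slotRef.Snap == snapName {
            return fmt.Errorf("snap %q has a %s operation in progress", snapName, kind)
        }
    }
    return nil
}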
stolowski commented Apr 19, 2017
Resolve the connections affected by "disconnect" early in the API, before calling Disconnect, instead of delaying it until doDisconnect. This is required for disconnect hooks, as all affected snaps need to be known upfront when hook tasks are created (the hooks will be introduced in follow-up PRs).
With this change, multiple "disconnect" tasksets are created for the affected snaps, instead of a single taskset. This also fixes the bug described at https://forum.snapcraft.io/t/interface-hooks-disconnect-hooks-wip/179/3