[store] Change the Restore action on objects to update instead of delete/create #2281

Conversation

@cyli (Contributor) commented Jun 21, 2017

If the object already exists, the event produced should be an update rather than a delete/create.

This is probably the cause of moby/moby#33541, where an unlock key change is sometimes not picked up by a snapshot restore from another node.

cc @aaronlehmann

Also, :( generics

@cyli cyli force-pushed the ensure-cluster-updates-when-restoring-from-snapshot branch from 2fdfc46 to e866b7e on June 21, 2017 23:11
codecov bot commented Jun 21, 2017

Codecov Report

Merging #2281 into master will increase coverage by 0.58%.
The diff coverage is 82.6%.

@@            Coverage Diff             @@
##           master    #2281      +/-   ##
==========================================
+ Coverage   60.38%   60.96%   +0.58%     
==========================================
  Files         125      126       +1     
  Lines       20412    20373      -39     
==========================================
+ Hits        12326    12421      +95     
+ Misses       6699     6581     -118     
+ Partials     1387     1371      -16

@cyli cyli force-pushed the ensure-cluster-updates-when-restoring-from-snapshot branch from e866b7e to a8e0adf on June 21, 2017 23:40
@aaronlehmann (Collaborator) commented:

I tried to simplify the Restore function. This is what I came up with:

                Restore: func(tx Tx, snapshot *api.StoreSnapshot) error {
                        clusters, err := FindClusters(tx, All)
                        if err != nil {
                                return err
                        }
                        updated := make(map[string]struct{})
                        for _, n := range snapshot.Clusters {
                                if err := UpdateCluster(tx, n); err == ErrNotExist {
                                        if err := CreateCluster(tx, n); err != nil {
                                                return err
                                        }
                                } else if err != nil {
                                        return err
                                } else {
                                        updated[n.ID] = struct{}{}
                                }
                        }
                        for _, n := range clusters {
                                if _, ok := updated[n.ID]; !ok {
                                        if err := DeleteCluster(tx, n.ID); err != nil {
                                                return err
                                        }
                                }
                        }
                        return nil
                },

I'm not sure it's really better, to be honest. What do you think?

Another idea I have is to have the type-specific Restore functions convert the slice to []api.StoreObject and call a type-independent helper function that uses tx methods:

func RestoreTable(tx Tx, table string, newObjects []api.StoreObject) error {
        // The helper only queries with All, so the selector check is a no-op.
        checkType := func(by By) error {
                return nil
        }
        var oldObjects []api.StoreObject
        appendResult := func(o api.StoreObject) {
                oldObjects = append(oldObjects, o)
        }

        err := tx.find(table, All, checkType, appendResult)
        if err != nil {
                return err
        }

        updated := make(map[string]struct{})

        for _, o := range newObjects {
                if existing := tx.lookup(table, indexID, o.GetID()); existing != nil {
                        if err := tx.update(table, o); err != nil {
                                return err
                        }
                        updated[o.GetID()] = struct{}{}
                } else {
                        if err := tx.create(table, o); err != nil {
                                return err
                        }
                }
        }
        for _, o := range oldObjects {
                if _, ok := updated[o.GetID()]; !ok {
                        if err := tx.delete(table, o.GetID()); err != nil {
                                return err
                        }
                }
        }
        return nil
}

I used my algorithm here as a proof of concept, but I'd be fine with using yours instead as long as it's only implemented in one place. Yours might be a bit more memory-efficient.

The downsides of this approach are that it needs an extra slice copy, and that it skips the validation in the type-specific Create* / Update* wrappers. But any objects inside a snapshot should already have passed validation. Ultimately I like this solution because it removes the bulk of the duplicated code.
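
For illustration only, a type-specific Restore could then shrink to a thin wrapper that copies its typed slice into a []api.StoreObject and delegates to the helper. This is just a sketch; tableCluster stands in for whatever table-name constant the store package already defines for clusters:

                Restore: func(tx Tx, snapshot *api.StoreSnapshot) error {
                        // Copy the typed slice into the generic interface slice;
                        // this is the extra slice copy mentioned above.
                        objects := make([]api.StoreObject, len(snapshot.Clusters))
                        for i, c := range snapshot.Clusters {
                                objects[i] = c
                        }
                        // All of the update-or-create / delete logic now lives
                        // in the shared helper.
                        return RestoreTable(tx, tableCluster, objects)
                },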

@cyli (Contributor, Author) commented Jun 22, 2017

@aaronlehmann I really like the idea of de-duplication: if we get more objects (such as generic resources), it will make the store code much easier to maintain. I like your algorithm better; I think it's actually the more memory-efficient one, since it only creates one map, and it's a bit easier to read. Thanks!

@cyli cyli force-pushed the ensure-cluster-updates-when-restoring-from-snapshot branch 2 times, most recently from 5488fbb to a2ce9dd, on June 22, 2017 21:55
        }
        for _, o := range oldObjects {
                if _, ok := updated[o.GetID()]; !ok {
                        if err := tx.delete(table, o.GetID()); err != nil {
@aaronlehmann (Collaborator):

Let's call GetID a single time for each loop iteration. I was sloppy when I wrote the PoC.
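
A minimal sketch of the suggested change in the delete loop, with the ID hoisted into a local:

        for _, o := range oldObjects {
                id := o.GetID()
                if _, ok := updated[id]; !ok {
                        if err := tx.delete(table, id); err != nil {
                                return err
                        }
                }
        }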

@cyli (Contributor, Author):

Happy to make this change, but out of curiosity, is the function call particularly expensive? All the objects just seem to return the ID field, and there doesn't seem to be much indirection?

@aaronlehmann (Collaborator):

No, it's not expensive.

@cyli (Contributor, Author):

Why is it better practice to only make the function call once? I don't particularly have an opinion either way, I was just wondering if it was a style reason or some other reason?

@aaronlehmann (Collaborator):

I think it makes the code a little easier to read and it has a very marginal performance impact because it avoids a redundant function call, but that's so insignificant I hesitate to bring it up.

I guess I don't care either. Just disagreeing with myself, reviewing my own code.

                        if err := tx.update(table, o); err != nil {
                                return err
                        }
                        updated[o.GetID()] = struct{}{}
@aaronlehmann (Collaborator):

Let's call GetID a single time for each loop iteration. I was sloppy when I wrote the PoC.

@aaronlehmann (Collaborator) commented:

Almost LGTM. extensions.go and resources.go also need these changes.

@cyli (Contributor, Author) commented Jun 23, 2017

@aaronlehmann Question about adding this to extensions: currently extensions are immutable, so if we restore, we'd emit an EventUpdateExtension. Technically I guess we aren't updating anything, we are just restoring from a snapshot, but I just wanted to check that this is ok?

…eady

exist are updated, rather than everything being deleted and re-created.

Signed-off-by: Ying Li <ying.li@docker.com>
@cyli cyli force-pushed the ensure-cluster-updates-when-restoring-from-snapshot branch from a2ce9dd to cfff7d1 on June 23, 2017 00:38
@aaronlehmann (Collaborator) commented:

LGTM

@aaronlehmann aaronlehmann merged commit 7d0a128 into moby:master Jun 23, 2017
@cyli cyli deleted the ensure-cluster-updates-when-restoring-from-snapshot branch June 23, 2017 07:08
andrewhsu pushed a commit to docker-archive/docker-ce that referenced this pull request Jul 14, 2017
- moby/swarmkit#2266 (support for templating Node.Hostname in docker executor)
- moby/swarmkit#2281 (change restore action on objects to be update, not delete/create)
- moby/swarmkit#2285 (extend watch queue with timeout and size limit)
- moby/swarmkit#2253 (version-aware failure tracking in the scheduler)
- moby/swarmkit#2275 (update containerd and port executor to container client library)
- moby/swarmkit#2292 (rename some generic resources)
- moby/swarmkit#2300 (limit the size of the external CA response)
- moby/swarmkit#2301 (delete global tasks when the node running them is deleted)

Minor cleanups, dependency bumps, and vendoring:
- moby/swarmkit#2271
- moby/swarmkit#2279
- moby/swarmkit#2283
- moby/swarmkit#2282
- moby/swarmkit#2274
- moby/swarmkit#2296 (dependency bump of etcd, go-winio)

Signed-off-by: Ying Li <ying.li@docker.com>
Upstream-commit: 4509a00
Component: engine
silvin-lubecki pushed a commit to silvin-lubecki/docker-ce that referenced this pull request Feb 3, 2020
- moby/swarmkit#2281 - fixes an issue where some cluster updates
  could be missed if a manager receives a catch-up snapshot from another manager
- moby/swarmkit#2300 - fixes a possible memory issue if an
  external CA sends an overlarge response

Signed-off-by: Ying <ying.li@docker.com>