Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcd3/watcher: Event.Object should have the same rev as etcd delete #25396

Merged
merged 2 commits into from
May 10, 2016

Conversation

hongchaodeng
Copy link
Contributor

What's the problem?

When a delete is watched, the revision should be larger than any previous to guarantee ordering. However, currently etcd3 decodes the previous rev into returned object:

oldObj, err = decodeObj(codec, versioner, getResp.Kvs[0].Value, getResp.Kvs[0].ModRevision)

This will break, for example, cacher's assumption here if it re-watch.

if event.ResourceVersion > resourceVersion {
c.sendWatchCacheEvent(event)
}

The etcd2 impl. also takes the current ModifiedIndex to ensure it's a larger number:

if res.Node != nil {
// Note that this sends the *old* object with the etcd index for the time at
// which it gets deleted. This will allow users to restart the watch at the right
// index.
node.ModifiedIndex = res.Node.ModifiedIndex
}

What's this PR?

It fixes above problem by using etcd's delete revision.

@hongchaodeng
Copy link
Contributor Author

/cc @xiang90 @wojtek-t @timothysc

It's a problem I found with test failure. After fixing it, the test failure never happen.

// which it gets deleted. Users use that revision to tell ordering.
rev = e.rev
}
oldObj, err = decodeObj(codec, versioner, getResp.Kvs[0].Value, rev)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is strange... why shall we change the revision of the old object? for the watch issue, can you show me the code for etcd2? i would assume there are two objs in the event curObj and oldObj and e.Rev should be the rev of curObj. So the rev of the oldObj should not matter at all for ordering purpose. No?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The etcd2 code is written in first comment too:

if res.Node != nil {
// Note that this sends the *old* object with the etcd index for the time at
// which it gets deleted. This will allow users to restart the watch at the right
// index.
node.ModifiedIndex = res.Node.ModifiedIndex
}

watch.Event contains only one object.

So the rev of the oldObj should not matter at all for ordering purpose. No?

I have the same concern. The object is actually returned via Informer too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK... This seems to be fragile...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They use the event number to signal the last known resource version in the informer in case there is a watch break.

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels May 10, 2016
@xiang90
Copy link
Contributor

xiang90 commented May 10, 2016

LGTM

@xiang90 xiang90 added release-note-none Denotes a PR that doesn't merit a release note. e2e-not-required lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed release-note-label-needed labels May 10, 2016
@k8s-github-robot k8s-github-robot added needs-ok-to-merge and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels May 10, 2016
@@ -123,7 +123,7 @@ func TestDeleteTriggerWatch(t *testing.T) {
if err := store.Delete(ctx, key, &api.Pod{}, nil); err != nil {
t.Fatalf("Delete failed: %v", err)
}
testCheckResult(t, 0, watch.Deleted, w, storedObj)
testCheckResult(t, 0, watch.Deleted, w, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we still check the expected storeObject? we probably need to change the rev of the storeObj? Or if we do not care about the triggered result, we should explicitly express that when calling the testCheckResult func.

@@ -227,14 +265,17 @@ func (c *testCodec) Decode(data []byte, defaults *unversioned.GroupVersionKind,
return nil, nil, errors.New("Expected decoding failure")
}

func testCheckResult(t *testing.T, i int, expectEventType watch.EventType, w watch.Interface, expectObj *api.Pod) {
func testCheckResult(t *testing.T, i int, expectEventType watch.EventType, w watch.Interface, expectObj *api.Pod, ignoreResult bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually... how about adding another function called testExpectReceivingEvent, which only tries to receive event from the given chan? I dislike the bool arg.

@@ -227,14 +265,25 @@ func (c *testCodec) Decode(data []byte, defaults *unversioned.GroupVersionKind,
return nil, nil, errors.New("Expected decoding failure")
}

func testCheckEventType(t *testing.T, expectEventType watch.EventType, w watch.Interface) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiang90
Your suggestion sounds actually a good idea.
I have split another function to check event type specifically.

@xiang90 xiang90 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 10, 2016
@k8s-bot
Copy link

k8s-bot commented May 10, 2016

GCE e2e build/test passed for commit 30f2a64ca8386b5c9b1184d4b5c374fddc07cc2f.

@hongchaodeng
Copy link
Contributor Author

@k8s-bot unit test this please issue: #25037

log link

@k8s-github-robot
Copy link

Automatic merge from submit-queue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants