manager: Always run the watch server #2323
Conversation
The watch server was wrongly changed to only run on the leader node. It needs to run on all managers because this is one of the RPC services that is not proxied to the leader (since all nodes receive events through Raft).

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Codecov Report
```diff
@@            Coverage Diff             @@
##           master    #2323      +/-   ##
==========================================
- Coverage   60.27%   60.23%   -0.04%
==========================================
  Files         128      128
  Lines       25972    25972
==========================================
- Hits        15654    15645       -9
- Misses       8918     8946      +28
+ Partials     1400     1381      -19
==========================================
```
```diff
@@ -491,6 +491,10 @@ func (m *Manager) Run(parent context.Context) error {
 	healthServer.SetServingStatus("Raft", api.HealthCheckResponse_NOT_SERVING)
 	localHealthServer.SetServingStatus("ControlAPI", api.HealthCheckResponse_NOT_SERVING)

+	if err := m.watchServer.Start(ctx); err != nil {
```
Non-blocking: probably doesn't matter much, but would it make sense to start this after the raft node has started up and joined (when the other watches are set)? Otherwise if something starts watching right away, would they get the events from the raft node starting up and loading all of its state from disk?
Hmm, where in the code would you suggest? I don't think the end of this startup process is signaled to higher-level code, but I may be forgetting something.
You're right, there's no guarantee or anything. But we start the other watches at https://github.com/docker/swarmkit/pull/2323/files#diff-8077df928eb040c7c69eea83f15e3c9dL542, after we've finished loading raft state from disk and we've observed a leader and cluster state - possibly starting the watch server can also happen at that time?
I like the suggestion, but I'm worried it would trigger the error here, which is not recoverable:
I'd prefer to just fix the regression in this PR, and a future change that's coordinated with some added retry/backoff in moby/moby could delay starting the watch server.
Ah ok, sounds good.
LGTM!
- moby/swarmkit#2323 (fix for watch server being run only on leader) Signed-off-by: Ying <ying.li@docker.com>
- moby/swarmkit#2309 (updating the service spec version when rolling back)
- moby/swarmkit#2310 (fix for slow swarm shutdown)
- moby/swarmkit#2323 (run watchapi server on all managers)

Signed-off-by: Ying <ying.li@docker.com>
The watch server was wrongly changed to only run on the leader node. It needs to run on all managers because this is one of the RPC services that is not proxied to the leader (since all nodes receive events through Raft).
This is a regression introduced by #2310 (sorry). It never made it into the moby tree, and it is covered by tests in that tree.
cc @cyli @aluzzardi