MB-47254 (7.1.0 1910) Avoid log flooding from watcherServer connect fail
This is the indexing repo piece of this fix. There is also a gometa repo piece.

In a cluster with a large number of Index nodes, a long-lived network
partition flooded the logs with error messages from watcherServer.go
runOnce(). Although the retries back off to 30 seconds apart for each
peer, they continue for every other Index node, so the message rate is
multiplied by the number of such nodes. This made indexer.log wrap in
less than an hour at one customer.

The retry timers all ticked at whole numbers of seconds from their
start times, and in a network partition those start times are themselves
one second apart, because the outer caller, metadata_provider.go
WatchMetadata(), waits 1 second in the foreground for success before
switching to waiting in the background indefinitely.
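
The alignment can be checked with simple modular arithmetic. The sketch below is an idealized model, not code from either repo: it assumes watcher k starts exactly k*staggerMs after launch and that every retry interval is a whole number of seconds, so a timer's sub-second offset never changes. The function name and the count of 40 watchers are hypothetical.

```go
package main

import "fmt"

// distinctPhases returns how many different sub-second offsets (in ms) the
// retry timers of `peers` watchers occupy when watcher k starts k*staggerMs
// after launch and all retry intervals are whole seconds.
func distinctPhases(peers, staggerMs int) int {
	seen := make(map[int]bool)
	for k := 0; k < peers; k++ {
		seen[(k*staggerMs)%1000] = true
	}
	return len(seen)
}

func main() {
	fmt.Println(distinctPhases(40, 1000)) // every timer shares the same offset
	fmt.Println(distinctPhases(40, 971))  // offsets are spread across the second
}
```

With a 1000 ms stagger all timers share offset 0 and fire together. A stagger that shares no factor with 1000, such as 971, gives each of the 40 watchers its own offset in this model.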

The fix is:

1. Only log the connection failure messages for each peer on first
   failure and then every 100 retries thereafter. I also added the try
   number and the hostname with which contact failed to the logging.

2. (Minor:) In the case of an explicit kill, the old Timer needs to be
   stopped and its channel potentially drained before returning, else it
   can never be garbage collected.

3. Change the 1000 ms foreground wait in WatchMetadata() to 971 ms,
   a prime number, to prevent the retry Timers from all waking up on
   1-second harmonics of the start of launch if the network is in fact
   partitioned.

Change-Id: Ic88fe91cc18cd806901042443dca171e074a16ec
cherkauer-couchbase committed Dec 20, 2021
1 parent 7d0d14c commit 451687a
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions secondary/manager/client/metadata_provider.go
@@ -293,8 +293,10 @@ func (o *MetadataProvider) WatchMetadata(indexAdminPort string, callback watcher
 	// start a watcher to the indexer admin
 	watcher, readych := o.startWatcher(indexAdminPort)
 
-	// wait for indexer to connect
-	success, _ := watcher.waitForReady(readych, 1000, nil)
+	// Wait for indexer to connect for a prime number of ms to prevent retry Timers in watcherServer
+	// from all being aligned on harmonics of 1 sec if the network is partitioned. (This used to
+	// foreground wait for 1,000 ms which led to "thundering herd" retries.)
+	success, _ := watcher.waitForReady(readych, 971, nil)
 	if success {
 		// if successfully connected, retrieve indexerId
 		success, _ = watcher.notifyReady(indexAdminPort, 0, nil)