Commit
raft: Avoid busy loop during leader election.
When a server doesn't see a leader yet, e.g. during leader re-election,
a transaction coming from a client causes a 100% CPU busy loop. With
debug logging enabled it looks like this:

    2020-02-28T04:04:35.631Z|00059|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    2020-02-28T04:04:35.631Z|00062|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    2020-02-28T04:04:35.631Z|00065|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    2020-02-28T04:04:35.631Z|00068|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    2020-02-28T04:04:35.631Z|00071|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    2020-02-28T04:04:35.631Z|00074|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    2020-02-28T04:04:35.631Z|00077|poll_loop|DBG|wakeup due to 0-ms timeout at ../ovsdb/trigger.c:164
    ...

The problem is that in ovsdb_trigger_try(), all cluster errors are
treated as temporary errors and retried immediately. This patch fixes
it by introducing 'run_triggers_now', which tells whether an immediate
retry is needed. When the cluster error has the detail 'not leader', we
don't retry immediately, but wait for the next poll event to trigger
the retry. When the 'not leader' status changes, there must be an
event, i.e. a raft RPC that changes the status, so the trigger is
guaranteed to run again, without a busy loop.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
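The retry decision described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the patch itself: `struct sim_db`, `enum try_result`, `classify_cluster_error()`, `trigger_try()`, and `trigger_wait_timeout()` are hypothetical names invented for the example. Only the 'run_triggers_now' flag and the 'not leader' error detail come from the commit message, and the 'run_triggers' flag is an assumption about how a deferred retry might be recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of one transaction attempt (not an OVSDB type). */
enum try_result {
    TRY_DONE,           /* Transaction completed. */
    TRY_RETRY_NOW,      /* Temporary cluster error: retry immediately. */
    TRY_WAIT_FOR_EVENT  /* "not leader": retry on the next poll event. */
};

/* Hypothetical stand-in for the per-database trigger state. */
struct sim_db {
    bool run_triggers;      /* Assumed: re-run triggers on the next loop pass. */
    bool run_triggers_now;  /* Also wake the poll loop with a 0-ms timeout. */
};

/* Classifies a cluster error by its detail string; NULL means no error. */
static enum try_result
classify_cluster_error(const char *detail)
{
    if (!detail) {
        return TRY_DONE;
    }
    return strcmp(detail, "not leader") ? TRY_RETRY_NOW : TRY_WAIT_FOR_EVENT;
}

/* Records how the trigger should be retried.  Returns true if the
 * transaction finished, false if it has to be retried later. */
static bool
trigger_try(struct sim_db *db, const char *error_detail)
{
    assert(db != NULL);

    switch (classify_cluster_error(error_detail)) {
    case TRY_DONE:
        return true;
    case TRY_RETRY_NOW:
        db->run_triggers = true;
        db->run_triggers_now = true;
        return false;
    case TRY_WAIT_FOR_EVENT:
        /* Leave run_triggers_now alone: a later event, such as a raft RPC
         * that changes the leader status, wakes the loop instead. */
        db->run_triggers = true;
        return false;
    }
    return false;
}

/* Poll timeout in ms: 0 wakes immediately, -1 blocks until some event. */
static int
trigger_wait_timeout(const struct sim_db *db)
{
    return db->run_triggers_now ? 0 : -1;
}
```

In this sketch, a 'not leader' error leaves the timeout at -1, so the loop sleeps until an event arrives instead of producing the repeated "wakeup due to 0-ms timeout" messages shown in the log. Any other cluster error still requests an immediate wakeup.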