Log reason(s) for reroute if exceptional #58259

DaveCTurner · 2020-06-17T14:16:19Z

Sometimes we call reroute() because something went wrong, and often that "something" is a problem common to multiple shards. For instance, a CircuitBreakingException during shard fetching will likely affect all the shard fetches on that node at once. We want to know about such exceptions in the logs, but there's little value in logging the exception for every single shard.

Today we discussed this as a team (in relation to #57804 (comment)) and agreed that the natural approach is to use the batching facility built into the BatchedRerouteService to record examples of these failures in the logs, on a reroute-by-reroute basis, without having to record them all.
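To illustrate the idea, here is a minimal sketch of how a batch could keep a few example failures and emit one summary per reroute. It is not the real BatchedRerouteService API: the class `RerouteReasonBatch`, its methods, and the `maxExamples` limit are all hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, NOT the real BatchedRerouteService API.
 * Collects the reasons for the reroute requests in one batch and keeps
 * only a bounded number of example failures for logging.
 */
final class RerouteReasonBatch {
    private final int maxExamples; // assumed cap on examples kept per batch
    private final List<String> exampleFailures = new ArrayList<>();
    private int failureCount = 0;
    private int requestCount = 0;

    RerouteReasonBatch(int maxExamples) {
        this.maxExamples = maxExamples;
    }

    /** Records one reroute request; failure is null when nothing went wrong. */
    synchronized void add(String reason, Exception failure) {
        requestCount++;
        if (failure != null) {
            failureCount++;
            // Count every failure, but retain only the first few as examples.
            if (exampleFailures.size() < maxExamples) {
                exampleFailures.add(reason + ": " + failure);
            }
        }
    }

    /**
     * Builds a single message for the whole batch and resets the state.
     * Returns null if no request in the batch was exceptional.
     */
    synchronized String drainSummary() {
        String summary = null;
        if (failureCount > 0) {
            summary = "reroute batch of " + requestCount + " request(s) included "
                + failureCount + " failure(s); examples: " + exampleFailures;
        }
        requestCount = 0;
        failureCount = 0;
        exampleFailures.clear();
        return summary;
    }
}
```

In this sketch the caller would log the result of `drainSummary()` once per executed reroute when it is non-null, so a node-wide problem such as a CircuitBreakingException shows up as one message with a couple of examples rather than one message per shard.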


elasticmachine · 2020-06-17T14:16:21Z

Pinging @elastic/es-distributed (:Distributed/Allocation)

DaveCTurner added the >enhancement and :Distributed/Allocation labels Jun 17, 2020

elasticmachine added the Team:Distributed label Jun 17, 2020

DaveCTurner mentioned this issue Jun 17, 2020

verbose logging in Master creating log volume #57804

Closed
