When the read_timeout is reached in Elastomer::Client, an exception is raised and the Ruby code aborts the request handling process. The Elasticsearch cluster will continue processing the search request and will eventually send a response - the HTTP connection is still open. Subsequent calls using this same connection can return search results for a previous request that timed out. This scenario is the reason that the OpaqueId middleware exists.
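The detection idea behind the middleware can be sketched in a few lines. This is a hypothetical illustration (class and method names are invented, not the actual Elastomer::Client middleware): tag each outgoing request with a unique `X-Opaque-Id` header, and since Elasticsearch echoes that header back, a mismatch means the response belongs to an earlier request that timed out.

```ruby
require "securerandom"

# Hypothetical sketch of the OpaqueId check. Each request gets a fresh
# UUID in the X-Opaque-Id header; Elasticsearch echoes the header back,
# so a mismatched id identifies a stale response left over from a
# request that previously timed out on this connection.
class OpaqueIdCheck
  StaleResponse = Class.new(StandardError)

  # `perform` stands in for the HTTP call; it receives the request
  # headers and returns the response headers.
  def request(perform)
    id = SecureRandom.uuid
    response_headers = perform.call("X-Opaque-Id" => id)
    unless response_headers["X-Opaque-Id"] == id
      raise StaleResponse, "response does not match request id #{id}"
    end
    response_headers
  end
end

checker = OpaqueIdCheck.new

# A well-behaved connection echoes the id back and the call succeeds.
checker.request(->(h) { { "X-Opaque-Id" => h["X-Opaque-Id"] } })

# A connection delivering a stale response raises StaleResponse.
begin
  checker.request(->(_h) { { "X-Opaque-Id" => "some-earlier-id" } })
rescue OpaqueIdCheck::StaleResponse
  puts "stale response detected"
end
```

Note that this only detects the stale response after the fact; it cannot recover the answer to the current request.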
A more proactive solution would be to not reuse the connection when a read timeout is reached. Instead, the connection should be discarded and a new connection established. This will prevent subsequent search requests from triggering the OpaqueId error condition.
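That proactive approach might look something like the following. This is a hedged sketch with invented names (`DiscardingConnection`, `reset!`), not actual Elastomer::Client code: when a read timeout fires, the connection discards its socket instead of returning it for reuse, so the next request starts on a fresh connection and can never read a stale response.

```ruby
require "timeout"

# Hypothetical sketch of the proposed fix: discard the connection when a
# read timeout is reached, rather than reusing it for the next request.
class DiscardingConnection
  attr_reader :resets

  def initialize
    @resets = 0
  end

  # Run the block with a read timeout. On timeout, reset the connection
  # before re-raising so the caller still sees the timeout error.
  def request(read_timeout, &work)
    Timeout.timeout(read_timeout, &work)
  rescue Timeout::Error
    reset!
    raise
  end

  private

  def reset!
    @resets += 1
    # In a real adapter this would close the socket; a new one would be
    # opened lazily on the next request.
  end
end
```

Usage: a timed-out request bumps the reset count and still raises, while a successful request leaves the connection alone.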
After doing a little more digging, I think the read timeout theory is a red herring. In production we are using a persistent Excon adapter, and Excon closes the current socket when it hits a read or write timeout. Because of this, a connection is never reused after a timeout, so timeouts alone cannot produce an OpaqueId error.
The root cause of the OpaqueId errors is due to something else.
The original purpose of the OpaqueId middleware was to work around a (now fixed) bug in excon that didn't reset the connection properly. We still see OpaqueId errors occasionally, so there's something else in the stack that can cause the issue.
/cc @grantr @tmm1