We use Chewy and import documents in a custom Sidekiq worker. We also use Sidekiq for other purposes, and it is expected that some unrelated jobs will sometimes fail.
This means a single Sidekiq worker process will regularly close threads and start new ones (as Sidekiq does on job failure), therefore resulting in new Chewy.client instances.
One thing we noticed, though, is that the underlying Elasticsearch connections are only closed when Ruby's garbage collector collects the dead thread's Chewy.client instance. This seems to be the cause of a file descriptor leak in our application.
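To illustrate the lifecycle described above, here is a minimal sketch of a per-thread memoized client. It does not use Chewy's actual code: `FakeClient` and `thread_local_client` are hypothetical stand-ins, and the sketch assumes the client is cached in thread-local storage, which is what the behaviour we observe suggests.

```ruby
# Illustration only: FakeClient stands in for the object returned by
# Chewy.client. It is not part of Chewy.
class FakeClient; end

# Hypothetical helper that memoizes one client per thread, an assumption
# about how a per-thread client could be cached.
def thread_local_client
  Thread.current[:fake_client] ||= FakeClient.new
end

main_client = thread_local_client
same_thread_again = thread_local_client
other_thread_client = Thread.new { thread_local_client }.value

puts main_client.equal?(same_thread_again)   # true: reused within a thread
puts main_client.equal?(other_thread_client) # false: a new thread gets a new client
```

Under this assumption, every replacement thread builds a fresh client, and the previous one stays around (with its open sockets) until the garbage collector reclaims it.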
We believe we have found a way to close these connections by adding the following code to the error handler in a custom Sidekiq middleware:
    Chewy.client.transport.transport.connections.each do |connection|
      # This bit of code is tailored for the HTTPClient Faraday adapter
      connection.connection.app.instance_variable_get(:@client)&.reset_all
    end
However, this piece of code breaks through multiple layers of abstraction, going through chewy, elasticsearch, elasticsearch-transport, faraday and faraday-httpclient, and even reads an otherwise unexposed instance variable at one point.
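For context, the error-handler placement could be sketched roughly as follows. This is not our exact middleware: the class name `CloseSearchConnections` and the injected `closer` callable are hypothetical, chosen so the connection-closing logic stays in one replaceable place and the sketch runs without Sidekiq or Chewy loaded.

```ruby
# Hypothetical Sidekiq server middleware. The class name and the injected
# `closer` callable are assumptions made for this sketch.
class CloseSearchConnections
  def initialize(closer)
    @closer = closer
  end

  # Sidekiq server middleware receives the job instance, the job payload
  # hash and the queue name, and yields to run the job.
  def call(_job_instance, _job_payload, _queue)
    yield
  rescue Exception => error # rubocop:disable Lint/RescueException
    # The job failed, so this thread is about to be replaced: release the
    # connections now instead of waiting for the garbage collector.
    begin
      @closer.call
    rescue StandardError
      # A failure while closing must not mask the original job error.
    end
    raise error
  end
end

# Stand-alone demonstration with a fake closer and a failing job.
closed = []
middleware = CloseSearchConnections.new(-> { closed << :closed })
begin
  middleware.call(nil, {}, "default") { raise "job failed" }
rescue RuntimeError => e
  puts "re-raised: #{e.message}"
end
puts closed.inspect
```

In a real application the `closer` lambda would wrap the connection-resetting snippet above, and the middleware would be registered through `config.server_middleware` inside `Sidekiq.configure_server`.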
Is there a better way to manually close Chewy's connections to Elasticsearch? Are we missing something obvious about their lifecycle?
Digging into it, my understanding of the issue is that none of chewy, elasticsearch, or elasticsearch-transport provides a method to close connections.
It looks like faraday has Faraday::Connection#close, but that does not appear to be implemented in most adapters, and in particular not in the faraday-httpclient adapter that ends up being used in our app.
I opened a similar issue for the elasticsearch gem: elastic/elasticsearch-ruby#2389

Of course, I may have missed something, and would be glad to know what if that's the case!