Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/use thread locals #77

Closed
wants to merge 4 commits into from
Closed

Conversation

ph
Copy link
Contributor

@ph ph commented Sep 13, 2017

The elasticsearch ruby client isn't thread safe
Since the ES Ruby client isn't thread safe, we are currently using
Thread local to create a client per workers to make the connection pool
thread safe.

closes: #76

@ph
Copy link
Contributor Author

ph commented Sep 14, 2017

@andrewvc Could you take a look? This will unblock @gingerwizard

@gingerwizard
Copy link

Is it possible to control the number of connections used by each client? Can we pass this as an option? This would atleast allow us to have some control, if somewhat abstracted, over the number of connections to ES. @ph

@gingerwizard
Copy link

It seems it uses a connection per host so this seems unlikely without a custom connection strategy.

@ph
Copy link
Contributor Author

ph commented Sep 14, 2017

Is it possible to control the number of connections used by each client? Can we pass this as an option? This would atleast allow us to have some control, if somewhat abstracted, over the number of connections to ES.

Well IIRC how the ruby client handle his pool, this PR change will make 1 connection per worker? Every call to the ES host are sync in the current context.

@andrewvc
Copy link
Contributor

I'm not too worried about conn usage. ES is pretty about handling high numbers of conns. Yes, in some cases it could be a problem. Unfortunately, we have no short-term options to fix the concurrency bug and limit connection usage.

Fixing this stuff is definitely on our roadmap however.

:logger => @logger
}
@client = LogStash::Filters::ElasticsearchClient.new(@user, @password, options)
@clients_pool = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't used. We should, however, use this as opposed to Thread.current[:filter_elasticsearch_client] since we'll need to close all the clients on pipeline shutdown.

I recommend making this a java.util.concurrent.ConcurrentHashMap indexed by Thread.current, and using that to handle the close behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny, It was my first draft yesterday. I've figured since we already have Thread.current's hash it would have been enough.

Since the ES Ruby client isn't thread safe, we are currently using
Thread local to create a client per workers to make the connection pool
thread safe.

closes: logstash-plugins#76
@ph
Copy link
Contributor Author

ph commented Sep 15, 2017

@andrewvc updated with your review, travis still seems to be on the fritz.

@andrewvc
Copy link
Contributor

andrewvc commented Sep 15, 2017 via email

@ph
Copy link
Contributor Author

ph commented Sep 15, 2017

@andrewvc When I did the PR, I've looked at the elasticsearch-ruby documentation and the code. The client and the connection class doesn't expose any way to close the connection.

By looking at the factory https://github.com/logstash-plugins/logstash-filter-elasticsearch/blob/master/lib/logstash/filters/elasticsearch/client.rb#L28 We are using the default transport which should be net/http with Faraday. Complete adapter code is at https://github.com/lostisland/faraday/blob/master/lib/faraday/adapter/net_http.rb, the library is using the block syntax, so It should open/close on every request.

Even if we were using the manticore adapter that we have developed it doesn't have a close implementation.

Next steps?

  • I think its to use our own thin wrapper?
  • or the low level rest client.

@andrewvc
Copy link
Contributor

@ph thanks for the investigation.

That's really bad news about the implementation, but I guess there's nothing to do here. We'll fix that when we move to using the ES output's client

Copy link
Contributor

@andrewvc andrewvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want ruby 2.0 syntax here? I don't think that's the case.

end

def get_client
@clients_pool.computeIfAbsent(Thread.current, ->(k) { new_client })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ruby 2.0 syntax, so it won't work with logstash < 6. Is that intentional?

Copy link
Contributor Author

@ph ph Sep 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't good catch, I am already used to the new koolaid.

@elasticsearch-bot
Copy link

Pier-Hugues Pellerin merged this into the following branches!

Branch Commits
master cb6d07d, 4b6261a, dd1f1dd, d1f5ab3

elasticsearch-bot pushed a commit that referenced this pull request Sep 22, 2017
elasticsearch-bot pushed a commit that referenced this pull request Sep 22, 2017
elasticsearch-bot pushed a commit that referenced this pull request Sep 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot get Connection from Pool
4 participants