New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RestClient improvements #19055

Closed
javanna opened this Issue Jun 24, 2016 · 23 comments

Comments

Projects
None yet
@javanna
Member

javanna commented Jun 24, 2016

This is a meta issue to track improvements that need to be made to the recently added RestClient (#18735).

  • add async variant of performRequest method (#19301)
  • add docs page on how to use it and configuration for common scenarios like ssl, proxy, basic auth etc. although we can refer to apache http client documentation (#19618)
  • higher level client that exposes api specific methods and helps with providing the right parameters. Bonus points if we can keep backwards compatibility with the current java api by ether implementing the Client interface or at least reusing the request and response objects that we already have, or something along those lines to ease migration.
@nik9000

This comment has been minimized.

Show comment
Hide comment
@nik9000

nik9000 Jun 24, 2016

Contributor

Bonus points if we can keep backwards compatibility with the current java api by implementing the Client interface and reusing the RequestBuilder objects that we already have, or something along those lines to ease migration.

I think that'd be super useful but if we're going to do it I'd want to do it in yet another project. So, like :client:high-level for a useful high level client and :client:transport-adapter or something. That way the high level client doesn't have to pull in all of Elasticsearch's dependencies and we don't have to figure out a way to slice the transport client's interface out of Elasticsearch.

Contributor

nik9000 commented Jun 24, 2016

Bonus points if we can keep backwards compatibility with the current java api by implementing the Client interface and reusing the RequestBuilder objects that we already have, or something along those lines to ease migration.

I think that'd be super useful but if we're going to do it I'd want to do it in yet another project. So, like :client:high-level for a useful high level client and :client:transport-adapter or something. That way the high level client doesn't have to pull in all of Elasticsearch's dependencies and we don't have to figure out a way to slice the transport client's interface out of Elasticsearch.

javanna added a commit to javanna/elasticsearch that referenced this issue Jul 20, 2016

Rest client: introduce async performRequest method and use async clie…
…nt under the hood for sync requests too

The new method accepts the usual parameters (method, endpoint, params, entity and headers) plus a response listener and an async response consumer. Shortcut methods are also added that don't require params, entity and the async response consumer optional.

There are a few relevant api changes as a consequence of the move to async client that affect sync methods:
- Response doesn't implement Closeable anymore, responses don't need to be closed
- performRequest throws Exception rather than just IOException, as that is the the exception that we get from the FutureCallback#failed method in the async http client
- ssl configuration is a bit simpler, one only needs to call setSSLStrategy from a custom HttpClientConfigCallback, that doesn't end up overridng any other default around connection pooling (it used to happen with the sync client and make ssl configuration more complex)

Relates to elastic#19055
@mkw

This comment has been minimized.

Show comment
Hide comment
@mkw

mkw Aug 31, 2016

I have configured an anonymous user for shield with limited monitoring privileges. Consequently, I get back a 403, not a 401, when making anonymous requests. This prevents the commons http client from responding with the correct credentials.

There should be a way to force pre-emptive authentication, either by setting credentials in the configuration manually (and letting the RestClient manage basic auth by itself), or by exposing the the HttpClientContext object used with the request. See: https://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientPreemptiveBasicAuthentication.java

mkw commented Aug 31, 2016

I have configured an anonymous user for shield with limited monitoring privileges. Consequently, I get back a 403, not a 401, when making anonymous requests. This prevents the commons http client from responding with the correct credentials.

There should be a way to force pre-emptive authentication, either by setting credentials in the configuration manually (and letting the RestClient manage basic auth by itself), or by exposing the the HttpClientContext object used with the request. See: https://hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/apache/http/examples/client/ClientPreemptiveBasicAuthentication.java

@jaymode

This comment has been minimized.

Show comment
Hide comment
@jaymode

jaymode Aug 31, 2016

Member

@mkw you can also set shield.authc.anonymous.authz_exception: false in your elasticsearch.yml to have the response be a 401 instead of the 403. This should allow you to use the client as is today.

Member

jaymode commented Aug 31, 2016

@mkw you can also set shield.authc.anonymous.authz_exception: false in your elasticsearch.yml to have the response be a 401 instead of the 403. This should allow you to use the client as is today.

@mkw

This comment has been minimized.

Show comment
Hide comment
@mkw

mkw Aug 31, 2016

@jaymode Thanks! That is very helpful and vasty superior to manually adding a default "Authentication" header.

mkw commented Aug 31, 2016

@jaymode Thanks! That is very helpful and vasty superior to manually adding a default "Authentication" header.

@mkw

This comment has been minimized.

Show comment
Hide comment
@mkw

mkw Aug 31, 2016

I may have spoken too soon: While using the challenge/response mechanism is clearly more secure than pre-emptive authentication, it forces a double round-trip for every request because Apache HC will only cache authentication for the duration of an HttpClientContext. Simple searches are quick enough that the extra request really adds up.

mkw commented Aug 31, 2016

I may have spoken too soon: While using the challenge/response mechanism is clearly more secure than pre-emptive authentication, it forces a double round-trip for every request because Apache HC will only cache authentication for the duration of an HttpClientContext. Simple searches are quick enough that the extra request really adds up.

@brandonkearby

This comment has been minimized.

Show comment
Hide comment
@brandonkearby

brandonkearby Sep 13, 2016

Contributor

Hi Guys, I'm a little late to the party, but started working on a high level rest client. My company has been stuck on 1.4 due to the size of our codebase and number of clients in production. This low level REST client sparked the idea... [https://github.com/brandonkearby/elasticsearch/tree/rest-client]
What I did was backport the low level rest client into the 1.4 branch and renamed it to InternalRestClient. Then started supporting the various api calls. Everything was going pretty smooth until I started working on Aggregations as they need to provide the type. Anyway I'm interested in working on the high level rest client.

Contributor

brandonkearby commented Sep 13, 2016

Hi Guys, I'm a little late to the party, but started working on a high level rest client. My company has been stuck on 1.4 due to the size of our codebase and number of clients in production. This low level REST client sparked the idea... [https://github.com/brandonkearby/elasticsearch/tree/rest-client]
What I did was backport the low level rest client into the 1.4 branch and renamed it to InternalRestClient. Then started supporting the various api calls. Everything was going pretty smooth until I started working on Aggregations as they need to provide the type. Anyway I'm interested in working on the high level rest client.

@clintongormley clintongormley added v5.0.0 and removed v5.0.0-beta1 labels Sep 14, 2016

@clintongormley clintongormley removed the v5.0.0 label Oct 18, 2016

@dhay

This comment has been minimized.

Show comment
Hide comment
@dhay

dhay Oct 31, 2016

I like the promise of the new 5.0 rest client. However, in it's current form, it isn't terribly useful. We would have to rewrite all of our query and index manipulation code to generate JSON. Having a Client implementation based on the rest-client would be ideal, but doesn't seem like a straightforward thing to build.

dhay commented Oct 31, 2016

I like the promise of the new 5.0 rest client. However, in it's current form, it isn't terribly useful. We would have to rewrite all of our query and index manipulation code to generate JSON. Having a Client implementation based on the rest-client would be ideal, but doesn't seem like a straightforward thing to build.

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Nov 28, 2016

Member

Quick update: we are not going to implement the Client interface in the upcoming high level REST client. We'd rather to start with a clean design and focus on a better API for the time being. We will reuse as much as possible from the current java API though, meaning requests, responses and builders. We are experimenting on this. We are considering to extract the current java api into a common library, or just make the high level client depend on elasticsearch core, at least in the beginning.

Member

javanna commented Nov 28, 2016

Quick update: we are not going to implement the Client interface in the upcoming high level REST client. We'd rather to start with a clean design and focus on a better API for the time being. We will reuse as much as possible from the current java API though, meaning requests, responses and builders. We are experimenting on this. We are considering to extract the current java api into a common library, or just make the high level client depend on elasticsearch core, at least in the beginning.

@brandonkearby

This comment has been minimized.

Show comment
Hide comment
@brandonkearby

brandonkearby Dec 7, 2016

Contributor

@javanna Not sure you took a peek, but I have a pretty much complete high level REST client already written for 1.4.1. It works with the existing Client and Admin interfaces

https://github.com/brandonkearby/elasticsearch/commits/rest-client-1.4.1

Contributor

brandonkearby commented Dec 7, 2016

@javanna Not sure you took a peek, but I have a pretty much complete high level REST client already written for 1.4.1. It works with the existing Client and Admin interfaces

https://github.com/brandonkearby/elasticsearch/commits/rest-client-1.4.1

@jettro

This comment has been minimized.

Show comment
Hide comment
@jettro

jettro Dec 7, 2016

@javanna If at all possible I would not make a dependency on core if that would mean you also have to pull in all the Lucene jars. I personally really like the idea of having as little dependencies as possible.

jettro commented Dec 7, 2016

@javanna If at all possible I would not make a dependency on core if that would mean you also have to pull in all the Lucene jars. I personally really like the idea of having as little dependencies as possible.

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Dec 7, 2016

Member

I personally really like the idea of having as little dependencies as possible.

@jettro that makes sense and everybody would like to have no deps of course, but we need to focus first on having something to release rather sooner than later (for which the dependency on es core, hence lucene, is acceptable). After that we can focus on a long term solution, where we'll iterate and hopefully remove the dependency on es core. Other important point is trying to ease the migration from the Java API, for which reusing requests and responses is very important, I think more than removing the dependency on core at this point.

Member

javanna commented Dec 7, 2016

I personally really like the idea of having as little dependencies as possible.

@jettro that makes sense and everybody would like to have no deps of course, but we need to focus first on having something to release rather sooner than later (for which the dependency on es core, hence lucene, is acceptable). After that we can focus on a long term solution, where we'll iterate and hopefully remove the dependency on es core. Other important point is trying to ease the migration from the Java API, for which reusing requests and responses is very important, I think more than removing the dependency on core at this point.

@brandonkearby

This comment has been minimized.

Show comment
Hide comment
@brandonkearby

brandonkearby Dec 8, 2016

Contributor

Yeah @javanna ease of migration was very important for us. It's the primary reason for adding it in 1.4.1 since that's the version we are on. That way we don't have to maintain to ways of doing things while in the transition state. This allows us to continue to use the transport client and once a customer's cluster has been reindex into a 5.0 cluster, we then switch them to the rest client. Same code using transport and rest in the same jvm.

Contributor

brandonkearby commented Dec 8, 2016

Yeah @javanna ease of migration was very important for us. It's the primary reason for adding it in 1.4.1 since that's the version we are on. That way we don't have to maintain to ways of doing things while in the transition state. This allows us to continue to use the transport client and once a customer's cluster has been reindex into a 5.0 cluster, we then switch them to the rest client. Same code using transport and rest in the same jvm.

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Dec 8, 2016

Member

hey @brandonkearby thanks for the heads up I had a look at your branch. It is similar to what we intend to do, but we will most likely add the parsing code as fromXContent methods to all the classes that implement ToXContent, one step at a time. Also, ease of migration doesn't mean 1:1 replacement in our plan. We don't plan on implementing the Client interface like you did. We have been unhappy with the Client interface for a long time, see #9201. We would like to take the chance and clean that up, have a sync method and an async method, rather than a mix of the two. The compatibility will be on the request, builders and response side of things. The call itself will need some slight updating, which should be acceptable.

Member

javanna commented Dec 8, 2016

hey @brandonkearby thanks for the heads up I had a look at your branch. It is similar to what we intend to do, but we will most likely add the parsing code as fromXContent methods to all the classes that implement ToXContent, one step at a time. Also, ease of migration doesn't mean 1:1 replacement in our plan. We don't plan on implementing the Client interface like you did. We have been unhappy with the Client interface for a long time, see #9201. We would like to take the chance and clean that up, have a sync method and an async method, rather than a mix of the two. The compatibility will be on the request, builders and response side of things. The call itself will need some slight updating, which should be acceptable.

@brandonkearby

This comment has been minimized.

Show comment
Hide comment
@brandonkearby

brandonkearby Dec 8, 2016

Contributor

@javanna, I started down the path of doing fromXContent, but there were places where the object graph didn't match the JSON. This made it difficult without a DOM approach. As far as the changes in #9201, looks a lot cleaner!

Contributor

brandonkearby commented Dec 8, 2016

@javanna, I started down the path of doing fromXContent, but there were places where the object graph didn't match the JSON. This made it difficult without a DOM approach. As far as the changes in #9201, looks a lot cleaner!

@dhay

This comment has been minimized.

Show comment
Hide comment
@dhay

dhay Dec 8, 2016

Has there been any thought to collaborating on improving the existing REST client that is out there, Jest?

https://github.com/searchbox-io/Jest

dhay commented Dec 8, 2016

Has there been any thought to collaborating on improving the existing REST client that is out there, Jest?

https://github.com/searchbox-io/Jest

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Dec 14, 2016

Member

We just published a blogpost on the current state of the Java clients which should answer most of the questions around the REST client and its future: https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients .

@dhay as explained in the blog we considered different approaches but we decided to start from our own Java API and take it from there.

Member

javanna commented Dec 14, 2016

We just published a blogpost on the current state of the Java clients which should answer most of the questions around the REST client and its future: https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients .

@dhay as explained in the blog we considered different approaches but we decided to start from our own Java API and take it from there.

@brusic

This comment has been minimized.

Show comment
Hide comment
@brusic

brusic Dec 14, 2016

Contributor

From the blog: "Benchmarks, however, show that the performance of the HTTP client is close enough to that of the Transport client"

I do not read the results in the same way. :) Would love to see the benchmarks for multi-threaded indexing, not single-threaded. But this issue is regarding the RestClient, so I will rest my defense of the pure Java API for another time.

Contributor

brusic commented Dec 14, 2016

From the blog: "Benchmarks, however, show that the performance of the HTTP client is close enough to that of the Transport client"

I do not read the results in the same way. :) Would love to see the benchmarks for multi-threaded indexing, not single-threaded. But this issue is regarding the RestClient, so I will rest my defense of the pure Java API for another time.

@danielmitterdorfer

This comment has been minimized.

Show comment
Hide comment
@danielmitterdorfer

danielmitterdorfer Dec 15, 2016

Member

I do not read the results in the same way. :)

@brusic Can you please elaborate? I did the original benchmarks. Maybe something I did not explain something clearly enough in the blog post or I made a mistake (benchmarking is hard and even with great care something can go wrong :)).

Would love to see the benchmarks for multi-threaded indexing, not single-threaded.

That's one aspect that I thought about originally but I disregarded the idea because we needed to get an initial idea about the performance characteristics of REST vs. transport. Moreover, multi-threaded performance should be dominated by contention in the connection pool and not the client. Nevertheless, we could extend the benchmarks at a later time.

Member

danielmitterdorfer commented Dec 15, 2016

I do not read the results in the same way. :)

@brusic Can you please elaborate? I did the original benchmarks. Maybe something I did not explain something clearly enough in the blog post or I made a mistake (benchmarking is hard and even with great care something can go wrong :)).

Would love to see the benchmarks for multi-threaded indexing, not single-threaded.

That's one aspect that I thought about originally but I disregarded the idea because we needed to get an initial idea about the performance characteristics of REST vs. transport. Moreover, multi-threaded performance should be dominated by contention in the connection pool and not the client. Nevertheless, we could extend the benchmarks at a later time.

@brusic

This comment has been minimized.

Show comment
Hide comment
@brusic

brusic Dec 15, 2016

Contributor

According to the original blog post "The HTTP client has between 4% and 7% smaller bulk indexing throughput than the transport client." In my world, those numbers are not close enough. :) I want to ask my boss for a 4%-7% raise. Hey, it's close enough!

The blog continues to say that those benchmarks are under lab conditions, but the underlying problem remains. Then again, without multi-threaded benchmarks, I am disregarding the results. :) I use the binary API for indexing and REST for searching since I need all the performance I can get from indexing. The only reason why I use Jest is because I have a dependency on an older version of Lucene in my code and therefore cannot use the elasticsearch jar.

Contributor

brusic commented Dec 15, 2016

According to the original blog post "The HTTP client has between 4% and 7% smaller bulk indexing throughput than the transport client." In my world, those numbers are not close enough. :) I want to ask my boss for a 4%-7% raise. Hey, it's close enough!

The blog continues to say that those benchmarks are under lab conditions, but the underlying problem remains. Then again, without multi-threaded benchmarks, I am disregarding the results. :) I use the binary API for indexing and REST for searching since I need all the performance I can get from indexing. The only reason why I use Jest is because I have a dependency on an older version of Lucene in my code and therefore cannot use the elasticsearch jar.

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Dec 15, 2016

Member

heya @brusic I think numbers outside of context can be misleading. Using the Java API has so many downsides at this point, like explained in the blogpost, that ~ 5% performance loss is reasonable. We will also work to make the REST client as performant as we can (and do more benchmarks like you are suggesting). That's the right trade-off. The main reason why we had the Java API in the first place wasn't performance, rather that it came for free :) If we started today from scratch we wouldn't expose the transport layer to users anymore, which is why we decided to go down this path.

Member

javanna commented Dec 15, 2016

heya @brusic I think numbers outside of context can be misleading. Using the Java API has so many downsides at this point, like explained in the blogpost, that ~ 5% performance loss is reasonable. We will also work to make the REST client as performant as we can (and do more benchmarks like you are suggesting). That's the right trade-off. The main reason why we had the Java API in the first place wasn't performance, rather that it came for free :) If we started today from scratch we wouldn't expose the transport layer to users anymore, which is why we decided to go down this path.

@godiedelrio

This comment has been minimized.

Show comment
Hide comment
@godiedelrio

godiedelrio Feb 3, 2017

If the request / response objects are meant to be reused in the Java Rest Client, care should be taken to not impose java 8 on the client side, IMHO. That way many codebases that can't easly upgrade to Java 8 can still continue to use Elasticsearch 5 through the Java Rest Client.

godiedelrio commented Feb 3, 2017

If the request / response objects are meant to be reused in the Java Rest Client, care should be taken to not impose java 8 on the client side, IMHO. That way many codebases that can't easly upgrade to Java 8 can still continue to use Elasticsearch 5 through the Java Rest Client.

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Feb 23, 2017

Member

hi @godiedelrio , we high level REST client will still depend on Elasticsearch at least in the beginning. That means it will also require java 8. People that can't upgrade to java 8 or don't want Elasticsearch as a dependency can use the low level REST client, and take care of writing requests and reading back responses in their own application.

Member

javanna commented Feb 23, 2017

hi @godiedelrio , we high level REST client will still depend on Elasticsearch at least in the beginning. That means it will also require java 8. People that can't upgrade to java 8 or don't want Elasticsearch as a dependency can use the low level REST client, and take care of writing requests and reading back responses in their own application.

@javanna

This comment has been minimized.

Show comment
Hide comment
@javanna

javanna Feb 23, 2017

Member

Closing this in favour of #23331 that's more specific to the high level REST client.

Member

javanna commented Feb 23, 2017

Closing this in favour of #23331 that's more specific to the high level REST client.

@javanna javanna closed this Feb 23, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment