Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Elasticsearch Client API to be less trappy #9201

Closed
s1monw opened this issue Jan 8, 2015 · 33 comments

Comments

Projects
None yet
@s1monw
Copy link
Contributor

commented Jan 8, 2015

Today we have 3 flavors for each endpoint (for demonstration purpose I only use the _health API)

public interface ClusterAdminClient extends ElasticsearchClient<ClusterAdminClient> {

    ActionFuture<ClusterHealthResponse> health(ClusterHealthRequest request);

    void health(ClusterHealthRequest request, ActionListener<ClusterHealthResponse> listener);

    ClusterHealthRequestBuilder prepareHealth(String... indices);

At least one of them is redundant here already but lemme first explain what is trappy here. If you use the prepareHealth().setWaitForGreenStatus() basically nothing happens. you need to call .get() or .execute().get() or .actionGet() or any of the X flavors. This is not so much of a problem for health because you are waiting for a response, but for endpoints that don't return info that is generally consumed ie. update settings this has in the past lead to so many test bugs that I consider this evil.

Yet, I think we should remove this method entirely and detach the Builders entirely from the Client. Today once a builder is created like this:

  ClusterHealthRequestBuilder builder = client.prepareHealth().setWaitForGreenStatus()

it's tied to a client and if you close that client the builder can't be executed. To me if we even keep the builder in the future it should only have a build() method that returns the corresponding request. Then you can call:

  ClusterHealthRequestBuilder builder = AdminBuilders.health().setWaitForGreenStatus();
  client.health(builder.build());

which is less trappy... oh wait... client.health(builder.build()); returns an ActionFuture<ClusterHealthResponse> which is also redundant IMO. We already have a non-blocking API using a listener which is equally powerful.

I think what we should move to is this:

public interface ClusterAdminClient extends ElasticsearchClient<ClusterAdminClient> {

    ClusterHealthResponse health(ClusterHealthRequest request);

    void health(ClusterHealthRequest request, ActionListener<ClusterHealthResponse> listener);

that way it's clearly a blocking API and it's easy to use. We still keep the builder (for now ;) ) but remove it's coupling to the client which is super trappy.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Jan 8, 2015

This looks much simpler to me.

@mikemccand

This comment has been minimized.

Copy link
Contributor

commented Jan 8, 2015

+1

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jan 8, 2015

Big +1.

@s1monw

This comment has been minimized.

Copy link
Contributor Author

commented Jan 8, 2015

the one thing that we are loosing here is request timeouts here that we where able to put into the get(TimeValue) and friends calls. I think we can just put this on the request itself. Also if users want to have Future like classes that they can just collect I think we can provide a utility listener that allows you do to this.

@rjernst

This comment has been minimized.

Copy link
Member

commented Jan 8, 2015

Sounds great to me! I've been bitten by not calling .get() more than once...

@bleskes

This comment has been minimized.

Copy link
Member

commented Jan 8, 2015

+1. Looks like a good compromise between the alternatives.

@kimchy

This comment has been minimized.

Copy link
Member

commented Jan 9, 2015

I like it!. I think that adding the timeout to the request itself will be misleading, since there users might think its a timeout on the call itself (like search timeout, that still returns values), compared to just giving up on the call completely (which is what the future does). In that context, I like the notion of future that also implements an ActionListener that someone can pass to the call and then work with it.

@s1monw

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2015

ok cool it seems like we are sold here. I took a quick look at this and it's going to create a boat load of compile errors so I think the easiest path here is to do this over time and add one API a time and / or deprecate them? I'd like to start with this rather sooner than later since that way everybody can help when folks are idling etc?

@rjernst

This comment has been minimized.

Copy link
Member

commented Jan 9, 2015

I'd like to start with this rather sooner than later since that way everybody can help when folks are idling etc?

+1

@jprante

This comment has been minimized.

Copy link
Contributor

commented Jan 10, 2015

The API can be a bit confusing when mixed with blocking/non-blocking methods. It could even be more simplified with Observables, like in Couchbase http://docs.couchbase.com/developer/java-2.0/observables.html explained in this blog post http://blog.couchbase.com/why-couchbase-chose-rxjava-new-java-sdk

public interface ClusterAdminService ... {
    Observable<ClusterHealthResponse> health(ClusterHealthRequest request);

but that is for the time when ES is ready for Java 8 :)

@s1monw

This comment has been minimized.

Copy link
Contributor Author

commented Jan 11, 2015

I actually wonder if it makes sense to seperate blocking and non-blocking API interface wise and just make one call another. that would mean 1 method per endpoint and clear distinction.

@uboness

This comment has been minimized.

Copy link
Contributor

commented Jan 11, 2015

we could actually just have a single version of the method that returns the already available listenable action future :

public interface ClusterAdminClient extends ElasticsearchClient<ClusterAdminClient> {

    ListenableActionFuture<ClusterHealthResponse> health(ClusterHealthRequest request);

then, if you'd like blocking:

ClusterHealthResponse response = client.health(request).actionGet();

blocking with timeout:

ClusterHealthResponse response = client.health(request).actionGet(5, TimeUnit.SECONDS);

non-blocking:

client.health(request).addListener(new ActionListener<ClusterHealthResponse>() {
    void onResponse(Response response) {
        ...
    }

    void onFailure(Throwable e) {
        ...
    }
}

I'm not a big fan of the actionGet method. Would've been nice if we could just use the get method of the Future... but that would come at the price of giving up the elasticsearch exception translation.. that said, the interface will be much clearer (instead of having both actionGet and get... we'll only have get)... we could consider defining new constructs to fix/clean this.

Also, I'd say that with any blocking version of the method, you'd want the option of specifying a timeout which will clutter the blocking API IMO. Encapsulating all this logic (blocking/non-blocking/timeouts) in a single object, keeps the API clean. (just something to consider)

@ppurang

This comment has been minimized.

Copy link

commented Jan 11, 2015

+1 for uboness with reservations.

Generally any get, post, put etc. should be a sum type: \/[Throwable, HttpResponse] which is somehow encoded by the ActionListener callbacks, but callbacks don't compose. Then you need a transformation (think map) from the HttpResponse to the target type - noting that a Http response represents failures too (example: a 400 when a 201 was expected). Having this infrastructure in place makes other things rather easy.

Timeouts SHOULD be enforced by the consumer of the asynchronous calls and they are different from tcp timeouts (connection and read timeouts) which are configured for the underlying http clients.

Providing an API like Future.get that blocks forever should be illegal and is a crime against API consumers if not humanity :)

@rjernst

This comment has been minimized.

Copy link
Member

commented Jan 11, 2015

Having the version that returns a Future is exactly what makes the current api trappy. It is very easy to not realize you need to call get/actionGet/whatever. The original proposed version that takes a request and returns a response is what we need 90% of the time. Whether the async version is via an additional optional listener argument, or through a wrapper, I don't care, but the common case should be simple and obvious how to use.

@s1monw

This comment has been minimized.

Copy link
Contributor Author

commented Jan 11, 2015

I'm not a big fan of the actionGet method. Would've been nice if we could just use the get method of the Future... but that would come at the price of giving up the elasticsearch exception translation.. that said, the interface will be much clearer (instead of having both actionGet and get... we'll only have get)... we could consider defining new constructs to fix/clean this.

sorry but this is kind of over-engineering is the problem here and in many other places of elasticsearch. We have to make APIs simple and only be useable in a single way. you have async and sync you have all you need. We are trying to fix things and not making them worse. the .get(). needs to go and those API need to be simple to call. if you need timeouts etc. just use async you can impl that on top of this and or expose it on a different level ie. lightweight clients. But first fix this API that essentially can cause critical issues ie. tests not finding problems because of a missing get.

@uboness

This comment has been minimized.

Copy link
Contributor

commented Jan 11, 2015

Having the version that returns a Future is exactly what makes the current api trappy. It is very easy to not realize you need to call get/actionGet/whatever. The original proposed version that takes a request and returns a response is what we need 90% of the time. Whether the async version is via an additional optional listener argument, or through a wrapper, I don't care, but the common case should be simple and obvious how to use.

sorry, I disagree with this argument. Futures are core Java construct since 1.5 and it's the basic construct in java when it comes to async calls. I also don't buy the argument that it's trappy... I don't think that in 2015 we need to explain developers what a Future is, but for argument sake, if you need response X and you get a Future<X> back, you'll need to get the X out of the returned Future (simply because there'll be no other way to get it). Moreover, I'd argue that returning a Future solidifies the API as it forces the user to "think" async and be aware of the consequences of blocking vs. non-blocking... we then leave the decision what to use when to the user, based on its use case.

sorry but this is kind of over-engineering is the problem here and in many other places of elasticsearch. We have to make APIs simple and only be useable in a single way. you have async and sync you have all you need. We are trying to fix things and not making them worse. the .get(). needs to go and those API need to be simple to call. if you need timeouts etc. just use async you can impl that on top of this and or expose it on a different level ie. lightweight clients.

I don't see the over-engineering when it coming to using Futures for a remote API.

We have to make APIs simple and only be useable in a single way. you have async and sync you have all you need. We are trying to fix things and not making them worse. the .get(). needs to go and those API need to be simple to call. if you need timeouts etc. just use async you can impl that on top of this and or expose it on a different level ie. lightweight clients.

not exposing timeouts in the API is throwing a load on the users. The client interfaces are the Java clients, why would you want to ask users to build additional clients on top of those? Furthermore, as stated above, IMO we should encourage users to think async and timeouts...

But first fix this API that essentially can cause critical issues ie. tests not finding problems because of a missing get.

If we have those, then these are bugs in our tests, not an API issue... IMO at least

All said above, obviously this is turning into one of those discussions where ppl have different strong opinions on. I have no intention here in starting a religious war :)... just to state an opinion (at the end, the approach the most ppl vote for should be used)

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Jan 11, 2015

sorry, I disagree with this argument. Futures are core Java construct since 1.5 and it's the basic construct in java when it comes to async calls.

@uboness I think, the confusion is not around returning Future, but the fact that Future is still not a "free pass" as far as making things intuitive.

In my opinion, the APIs have to try to make it obvious that the call is asynchronous. You can still return Future, if you take other precautions such as naming the method submitXXX(), or if the whole class has Async in the name, or other possibilities. Alternatively, asynchronous methods can return void and take a CompletionHandler-like parameter, thats also obvious.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Jan 11, 2015

If we have those, then these are bugs in our tests, not an API issue... IMO at least

Well, it depends what the tests are doing. If the tests are using the API the same way the user would, and frequently making these mistakes, then it means we should think about the API? Its a sign that users might make the same exact mistakes in their code.

On the other hand, if the test bugs in question are doing crazy expert stuff that only tests should be doing, then it should carry much less weight.

@pickypg

This comment has been minimized.

Copy link
Member

commented Jan 11, 2015

I don't think that in 2015 we need to explain developers what a Future is, but for argument sake, if you need response X and you get a Future back, you'll need to get the X out of the returned Future (simply because there'll be no other way to get it).

I do occasionally see people calling things and not blocking on them. This is usually trouble around bulk indexing that results in them filling up their queues. I really wish that it weren't true, but it is.

I personally wish that we would go with the Futures as well because I think that the listener approach is more effort. However, I think the that complication will force people to use the synchronous approach more often then not, which will lead to less unexpected issues from them.

Whether the async version is via an additional optional listener argument, or through a wrapper, I don't care, but the common case should be simple and obvious how to use.

I wonder if we should add healthAsync to be more obvious about what's going on when the two functions live side-by-side. I feel like the .NET Framework does this to great effect.

@rjernst

This comment has been minimized.

Copy link
Member

commented Jan 11, 2015

Whether the async version is via an additional optional listener argument, or through a wrapper, I don't care, but the common case should be simple and obvious how to use.

I wonder if we should add healthAsync to be more obvious about what's going on when the two functions live side-by-side.

This is what Robert suggested, and I'm fine with that. What I would like to see from this issue is a clearer separation between sync and async, and the sync should be easy to call (since we use it the vast majority of the time). I was not suggesting before that async was trappy, but that having the common way to make a request (through prepareX right now) is trappy for calls that I don't care about the result (which is how I've been bitten by this in the past).

@rjernst

This comment has been minimized.

Copy link
Member

commented Jan 11, 2015

And +1 to Simon's sentiment that there should be exactly one way to make a request (one way for a sync request, and one way for an async request).

@pickypg

This comment has been minimized.

Copy link
Member

commented Jan 11, 2015

+1 to that

@uboness

This comment has been minimized.

Copy link
Contributor

commented Jan 12, 2015

I like the xxxxAsync version as long as both methods (sync & async) are under the same interface. If you split the interfaces (per one of @s1monw's suggestions) I think it's redundant.

I still don't think we should push the timeout logic to the users. it should be easy to provide a timeout to the blocking call.

@uboness

This comment has been minimized.

Copy link
Contributor

commented Jan 12, 2015

personally wish that we would go with the Futures as well because I think that the listener approach is more effort.

The futures will need to be listenable futures anyway (futures you can't listen to are useless IMO)... so effort-wise there's not much different between the pure listener/callback approach to a listenable future one. I also don't think providing a listener (either to the future or to the method itself) really frightens ppl...

@uboness

This comment has been minimized.

Copy link
Contributor

commented Jan 12, 2015

Well, it depends what the tests are doing. If the tests are using the API the same way the user would, and frequently making these mistakes, then it means we should think about the API? Its a sign that users might make the same exact mistakes in their code.

so it's the question indeed, but since all our APIs return a response, not "getting" the response and asserting it should be considered a bug.. no? (I can't think of any "fire & forget" API that we expose)

@rmuir

This comment has been minimized.

Copy link
Contributor

commented Jan 12, 2015

Its a bug in the API if it makes this mistake easy to happen. That is all I am saying. I recently encountered this bug myself, on #9146.

@clintongormley

This comment has been minimized.

Copy link
Member

commented Nov 26, 2016

Is this something we still want to do?

@javanna javanna referenced this issue Dec 8, 2016

Closed

RestClient improvements #19055

2 of 3 tasks complete
@javanna

This comment has been minimized.

Copy link
Member

commented Dec 8, 2016

We will have a high level Rest Client, which will not implement the Client interface. We will take all this discussion into account there. Not sure what the future of the current Client interface is. If it's removed, this issue can be closed straightaway, otherwise it may become internal only in which case we will still want to clean things up there.

@clintongormley clintongormley added v6.0.0 and removed v6.0.0-alpha1 labels May 3, 2017

javanna added a commit to javanna/elasticsearch that referenced this issue May 3, 2017

Java api: ActionRequestBuilder#execute to return a PlainActionFuture
This change makes the request builder codepath same as `Client#execute`. The request builder used to return a `ListenableActionFuture` when calling execute, which allows to associate listeners with the returned future. For async execution though it is recommended to use the `execute` method that accepts an `ActionListener`, like users would do when using `Client#execute`.

Relates to elastic#24412
Relates to elastic#9201

javanna added a commit that referenced this issue May 3, 2017

Java api: ActionRequestBuilder#execute to return a PlainActionFuture (#…
…24415)

This change makes the request builder code-path same as `Client#execute`. The request builder used to return a `ListenableActionFuture` when calling execute, which allows to associate listeners with the returned future. For async execution though it is recommended to use the `execute` method that accepts an `ActionListener`, like users would do when using `Client#execute`.

Relates to #24412
Relates to #9201
@javanna

This comment has been minimized.

Copy link
Member

commented Jul 21, 2017

I am leaning towards closing this issue. We took all this into account when working on the high level REST client. It supports two methods:

search(SearchRequest)

searchAsync(SearchRequest, ActionListener<SearchResponse>)

Not sure what the future of the Client interface is, but once the transport client is removed there probably will be no need for the Client interface anymore. Anyways it will be internal only hence we are free to change it in any way we want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.