Low level Rest Client #18735

Merged
merged 109 commits on Jun 22, 2016

@javanna
Member
javanna commented Jun 3, 2016 edited

The low level RestClient can be created by simply providing the hosts that it should point to: `RestClient.builder().setHosts(new HttpHost("localhost", 9200)).build()`.
It also allows providing a CloseableHttpClient instance to be used; otherwise one with default settings is created.
It exposes a performRequest method that accepts all the arguments needed to send an http request: method, endpoint, parameters, body, and headers.
It supports trace logging, like all the other language clients, to log each request and response in curl format.
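
For illustration, here is a minimal usage sketch based on the snippets above (builder and performRequest signatures follow this description and may differ slightly from the merged code):

```java
import org.apache.http.HttpHost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.ElasticsearchResponse;
import org.elasticsearch.client.RestClient;

import java.util.Collections;

public class RestClientUsage {
    public static void main(String[] args) throws Exception {
        // build a client pointing to a single host; a default CloseableHttpClient is created under the hood
        try (RestClient client = RestClient.builder()
                .setHosts(new HttpHost("localhost", 9200))
                .build()) {
            StringEntity body = new StringEntity(
                    "{\"query\":{\"match_all\":{}}}", ContentType.APPLICATION_JSON);
            // method, endpoint, query string params, body, optional headers
            try (ElasticsearchResponse response = client.performRequest(
                    "GET", "/_search", Collections.<String, String>emptyMap(), body)) {
                System.out.println(EntityUtils.toString(response.getEntity()));
            }
        }
    }
}
```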

client is a new gradle submodule that depends on apache http client and only a few other transitive deps.
client-sniffer is an additional gradle submodule that depends on client and jackson, as it's able to parse the nodes info api response and inject hosts into an existing RestClient instance. The sniffer also comes with a listener that gets notified on failure and can optionally sniff hosts on failure.
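
To show how that hook fits together: the sniffer registers itself as the client's FailureListener (as discussed later in this thread), and a custom listener can be set the same way. A sketch, with class and method names taken from the diff in this PR (exact signatures may differ):

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;

public class FailureListenerExample {
    public static void main(String[] args) {
        RestClient client = RestClient.builder()
                .setHosts(new HttpHost("localhost", 9200))
                .build();
        // The default FailureListener is a no-op; the Sniffer sets itself here so it can
        // re-sniff hosts when one fails. A custom listener could e.g. log or record metrics.
        client.setFailureListener(new RestClient.FailureListener() {
            @Override
            public void onFailure(HttpHost host) {
                System.err.println("host failed and was blacklisted: " + host);
            }
        });
    }
}
```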

Both new modules are java 7 compatible and get compiled with source 1.7 & target 1.7.

This PR also makes our Rest tests use the new RestClient, and it replaces any usage of apache http client in tests with the new low level RestClient.

What's left, which can probably be worked on after this PR gets in:

  • write some more tests, especially around custom http clients, concurrency and the sniffing task.
  • javadocs should be ok, but we need a docs page on how to use it and on configuration for common scenarios like ssl, proxy, basic auth etc. (although we can refer to the apache http client documentation)
  • add async variant of performRequest method

Closes #7743

s1monw and others added some commits Feb 22, 2016
s1monw RestClient prototype 3efbe95
javanna Fix checkstyle issues, setup thirdPartyAudit checks and enable forbiddenApisMain c9c33a7
javanna clarify why jarhell and dependency license check are disabled, add okhttp mockwebserver dependency

Also enable forbiddenApisTest (will fail till we get rid of the sun http web server in RestClientTests that will be replaced with mockwebserver)
5970723
javanna Introduce notion of Transport, Connection and ConnectionPool
There are two implementations of connection pool: a static one that allows enabling/disabling pings, and a sniffing one that sniffs nodes from the nodes info api.

Transport retrieves a stream of connections from the connection pool for each request and calls onSuccess or onFailure depending on the result of the request.

Transport also supports a max retry timeout to control the overall timeout for the request retries.
6ae8be5
javanna remove Verb enum 2589235
javanna get rid of retry timeout exception 94cf843
javanna get rid of Node abstraction f6a5a0a
javanna get rid of connection selector predicate ce663e9
javanna move sniff related stuff to sniff package a472544
javanna merge Connection and StatefulConnection into one class, remove generic type from Transport 062a216
javanna Remove Transport class, move all of it within RestClient class bd29dc1
javanna remove Scheme enum 9ffdea9
javanna add some javadocs to connection pool classes d7c4176
javanna remove pinging from static connection pool, can be replaced by a low connect timeout on each request e85ed3e
javanna return the response as part of ElasticsearchResponseException e77ab87
javanna prevent unclosed response entities in ElasticsearchResponseException, eagerly read response string in case of error status code 530ad22
javanna add missing parentheses e7fe397
javanna add curl format trace logging for requests and responses 17a21f0
javanna remove ConnectionPool interface e040d2f
javanna add builders for simple creation of RestClient and SniffingConnectionPool instances

We have quite a few constructor parameters and some defaults should be applied; builders help simplify creation of objects for users.
9569ebc
javanna remove streams and java 8 only api, build with source and target 1.7 599dad5
javanna remove streams leftover
Given that we don't use streams anymore, we can check straightaway if the connection iterator is empty before returning it and resurrect a connection when needed directly in the connection pool, no lastResortConnection method required.
b38ef34
javanna remove notion of connection pool, turn around dependency between RestClient and Sniffer

RestClient exposes a setNodes method, which Sniffer can call to update its nodes.
cdffc3d
javanna simplify Connection class
The connection class can be greatly simplified now that we don't ping anymore. Pings required a special initial state (UNKNOWN) for connections, to indicate that they require pinging although they are not dead. At this point we don't need the State enum anymore, as connections can only be dead or alive based on the number of failed attempts. markResurrected is also not needed, as it was again a way to make pings required. RestClient can simply pick a dead connection now and use it, no need to change its state when picking the connection.
85a7721
javanna add support for headers: default ones and per request c0a72c1
javanna fix usage of deprecated apis 2d7a781
javanna use versions from versions.properties 2735897
javanna add project substitution for org.elasticsearch:client so that we can add it as a dep in other projects 3890842
javanna adapt params argument, they can only be Strings in performRequest method 1d06916
javanna add getFirstHeader method to ElasticsearchResponse 16ab491
javanna Expose whole ElasticsearchResponse within ElasticsearchResponseException
The only small problem is that the response gets closed straightaway and its body read immediately into a string. Should be ok to load it all into memory eagerly though in case of errors. Otherwise it becomes cumbersome to have an exception implement Closeable...
9a38d81
javanna support missing OPTIONS method, it is supported in elasticsearch core 6d3f6c7
javanna [TEST] add rest client test dependency and replace usage of HttpRequestBuilder with RestClient in integration tests 325b723
javanna Replace rest test client with low level RestClient
We still have a wrapper called RestTestClient that is very specific to Rest tests, as well as RestTestResponse etc. but all the low level bits around http connections etc. are now handled by RestClient.
eae914a
javanna include response body in ElasticsearchResponseException error message c70e08c
javanna remove usage of deprecated api e81aad9
javanna make host state immutable
Instead of having a Connection mutable object that holds the state of the connection to each host, we now have immutable objects only. We keep two sets, one with all the hosts, one with the blacklisted ones. Once we blacklist a host we associate it with a DeadHostState which keeps track of the number of failed attempts and when the host should be retried. A new state object is created each and every time the state of the host needs to be updated.
6490355
javanna make some classes and methods package private
ElasticsearchResponseException, as well as ElasticsearchResponse, should only be created from o.e.client package.
RequestLogger should only be used from this package too.
29b1f8d
javanna add some javadocs 7f4807b
javanna move client sniffer to its own project
Create a new subproject called client-sniffer that contains the o.e.client.sniff package. Since it is going to go to a separate jar, due to its additional functionalities and dependency on jackson, it makes sense to have it as a separate project that depends on client. This way we make sure that client doesn't depend on it etc.
044a97c
javanna [TEST] be more specific around http method used for sniffing 3745305
javanna [TEST] remove okhttp test dependency
Use sun HttpServer instead and disable forbidden-apis for test classes. It turns out to be more flexible than okhttp as it allows get & delete with body.
51e487f
javanna add support for PATCH and TRACE methods
Although elasticsearch doesn't support these methods (RestController doesn't even allow registering handlers for them), the RestClient should allow sending requests using them.
83c6e73
javanna add javadocs on closing responses c9db111
javanna add toString to DeadHostState class 6d66fbd
javanna check hosts is not null nor empty earlier, remove check from nextHost
if we check at set time, we don't need to check each single time in nextHost
35dbdee
javanna [TEST] add setHosts test and rename RestClientBuilderTests to RestClientTests 47e5204
javanna don't use setDefaultHeaders from HttpClient
Store default headers ourselves instead, otherwise default ones cannot be replaced. Don't allow for multiple headers with same key, last one wins and replaces previous ones with same key.

Also fail with null params or headers.
24ea585
javanna [TEST] rename restClientTests back to RestClientBuilderTests 9ed2d61
javanna [TEST] add RestClient unit tests
Unit tests rely on mockito to mock the internal HttpClient instance. No http request is performed, we only simulate interaction between RestClient and its internal HttpClient.
4572b69
javanna [TEST] add RestClient integration test
Relies on real HttpServer to test the actual interaction between RestClient and HttpClient, and how requests get sent in real life.
13a27a3
javanna [TEST] add a lot of forgotten try with resources to wrap ElasticsearchResponses 4fa824f
javanna [TEST] restore needed LogManager.reset to prevent logging to sysout f5825b8
javanna [TEST] create standard RestClient at first request and reuse it
A RestClient instance is now created whenever EsIntegTestCase#getRestClient is invoked for the first time. It is then kept until the cluster is cleared (depending on the cluster scope of the test).

Renamed the other two restClient methods to createRestClient, as that instance needs to be closed and managed in the tests.
23a94bb
javanna [TEST] create standard RestClient at first request and reuse it
A RestClient instance is now created whenever EsIntegTestCase#getRestClient is invoked for the first time. It is then kept until the cluster is cleared (depending on the cluster scope of the test).

Renamed the other two restClient methods to createRestClient, as that instance needs to be closed and managed in the tests.
c48b7a7
javanna [TEST] expand RequestLoggerTests to all the supported http methods 29eee32
javanna rename ElasticsearchResponse#getFirstHeader to getHeader f17f0f9
javanna [TEST] remove status matcher and hasStatus assertion
All it does is checking the status code of a response, which can be done with a single line in each test
b891c46
javanna Allow to pass socket factory registry to createDefaultHttpClient method b15279b
javanna Build: add missing licenses, SHAs and enable dependency licenses check f2318ed
javanna [TEST] remove unused method 56e689e
javanna raise default socket timeout to 10s and default connect timeout to 1s 05e26a4
@nik9000 nik9000 and 1 other commented on an outdated diff Jun 4, 2016
client-sniffer/build.gradle
+apply plugin: 'elasticsearch.build'
+
+targetCompatibility = JavaVersion.VERSION_1_7
+sourceCompatibility = JavaVersion.VERSION_1_7
+
+dependencies {
+ compile "org.elasticsearch:client:${version}"
+ compile "org.apache.httpcomponents:httpclient:${versions.httpclient}"
+ compile "org.apache.httpcomponents:httpcore:${versions.httpcore}"
+ compile "commons-codec:commons-codec:${versions.commonscodec}"
+ compile "commons-logging:commons-logging:${versions.commonslogging}"
+ compile "com.fasterxml.jackson.core:jackson-core:${versions.jackson}"
+
+ testCompile "com.carrotsearch.randomizedtesting:randomizedtesting-runner:${versions.randomizedrunner}"
+ testCompile "junit:junit:${versions.junit}"
+ testCompile "org.hamcrest:hamcrest-all:1.3"
@nik9000
nik9000 Jun 4, 2016 Contributor

May as well stick this in version.properties too?

@nik9000 nik9000 commented on an outdated diff Jun 4, 2016
...java/org/elasticsearch/client/sniff/HostsSniffer.java
+/**
+ * Class responsible for sniffing the http hosts from elasticsearch through the nodes info api and returning them back
+ */
+//TODO This could potentially be using _cat/nodes which wouldn't require jackson as a dependency, but we'd have bw comp problems with 2.x
+public class HostsSniffer {
+
+ private static final Log logger = LogFactory.getLog(HostsSniffer.class);
+
+ private final RestClient restClient;
+ private final Map<String, String> sniffRequestParams;
+ private final String scheme;
+ private final JsonFactory jsonFactory;
+
+ public HostsSniffer(RestClient restClient, int sniffRequestTimeout, String scheme) {
+ this.restClient = restClient;
+ this.sniffRequestParams = Collections.<String, String>singletonMap("timeout", sniffRequestTimeout + "ms");
@nik9000
nik9000 Jun 4, 2016 Contributor

Oh java 1.7. I haven't missed you!

@nik9000 nik9000 and 2 others commented on an outdated diff Jun 4, 2016
...main/java/org/elasticsearch/client/sniff/Sniffer.java
+ scheduledExecutorService.shutdownNow();
+ }
+ }
+
+ /**
+ * Returns a new {@link Builder} to help with {@link Sniffer} creation.
+ */
+ public static Builder builder() {
+ return new Builder();
+ }
+
+ /**
+ * Sniffer builder. Helps creating a new {@link Sniffer}.
+ */
+ public static final class Builder {
+ public static final int DEFAULT_SNIFF_INTERVAL = 60000 * 5; //5 minutes
@nik9000
nik9000 Jun 4, 2016 Contributor

TimeUnit.MINUTES.toSeconds(5)?

@jasontedor
jasontedor Jun 4, 2016 Contributor

I think you mean TimeUnit.MINUTES.toMillis(5)? I also think the field name should include that the units are milliseconds.

@nik9000
Contributor
nik9000 commented Jun 4, 2016

forbidden-apis for test classes is currently disabled, find a way to make excludes work so that we can re-enable it

We'll probably want a different list for the clients, but I think you can make SuppressForbidden work just by making your own annotation named that.
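
For reference, a client-local annotation with that name could be as small as this (a sketch; the reason element mirrors how @SuppressForbidden is used on the test classes later in this PR, and forbidden-apis would have to be configured to recognize the new annotation):

```java
package org.elasticsearch.client;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/**
 * Sketch of a project-local suppression annotation. Class retention so that
 * forbidden-apis, which inspects class files, can see it on annotated elements.
 */
@Retention(RetentionPolicy.CLASS)
public @interface SuppressForbidden {
    String reason();
}
```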

@nik9000
Contributor
nik9000 commented Jun 4, 2016

client-sniffer requires license and SHA for client dependency, not sure that's correct

We are supposed to be skipping our own libraries in that check....

Yeah, this stuff can wait until after merge I think...

@nik9000
Contributor
nik9000 commented Jun 4, 2016

@javanna this makes me so happy!

@nik9000
Contributor
nik9000 commented Jun 4, 2016

Maybe instead of :client and :client-sniffer it should be :client:core and :client:sniffer?

@nik9000
Contributor
nik9000 commented Jun 4, 2016

What lengths do we plan to go to to get 1.7 compatibility? How far do we feel like we have to go before we can merge vs before we can release?

@nik9000 nik9000 and 1 other commented on an outdated diff Jun 4, 2016
...main/java/org/elasticsearch/client/RequestLogger.java
+/**
+ * Helper class that exposes static methods to unify the way requests are logged.
+ * Includes trace logging to log complete requests and responses in curl format.
+ */
+final class RequestLogger {
+
+ private static final Log tracer = LogFactory.getLog("tracer");
+
+ private RequestLogger() {
+ }
+
+ /**
+ * Logs a request that yielded a response
+ */
+ static void log(Log logger, String message, HttpUriRequest request, HttpHost host, HttpResponse httpResponse) {
+ logger.debug(message + " [" + request.getMethod() + " " + host + request.getRequestLine().getUri() +
@nik9000
nik9000 Jun 4, 2016 Contributor

Maybe also a if (logger.isDebugEnabled()) around this?

@nik9000 nik9000 and 1 other commented on an outdated diff Jun 4, 2016
...main/java/org/elasticsearch/client/RequestLogger.java
+ traceRequest = "";
+ }
+ tracer.trace(traceRequest);
+ }
+ }
+
+ /**
+ * Creates curl output for given request
+ */
+ static String buildTraceRequest(HttpUriRequest request, HttpHost host) throws IOException {
+ String requestLine = "curl -iX " + request.getMethod() + " '" + host + request.getRequestLine().getUri() + "'";
+ if (request instanceof HttpEntityEnclosingRequest) {
+ HttpEntityEnclosingRequest enclosingRequest = (HttpEntityEnclosingRequest) request;
+ if (enclosingRequest.getEntity() != null) {
+ requestLine += " -d '";
+ HttpEntity entity = new BufferedHttpEntity(enclosingRequest.getEntity());
@nik9000
nik9000 Jun 4, 2016 Contributor

It is so weird to mutate the request to log it! I understand why you do it, but so weird!

@javanna
javanna Jun 4, 2016 Member

right, this is annoying... I could check if it is repeatable or not (some are) but if the entity is not repeatable there is not much I can do about it.

@javanna
javanna Jun 6, 2016 Member

I added an if (entity.isRepeatable() == false) check, plus tests for both types of entities, to make sure that logging doesn't break things.
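
The resulting entity handling then looks roughly like this (a condensed sketch of buildTraceRequest combining the diff above with the isRepeatable check, not the exact merged method):

```java
import org.apache.http.HttpEntity;
import org.apache.http.HttpEntityEnclosingRequest;
import org.apache.http.HttpHost;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.entity.BufferedHttpEntity;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class TraceRequestSketch {
    // Wrap non-repeatable entities in a BufferedHttpEntity so that logging the body
    // does not consume the stream that still has to be sent over the wire.
    static String buildTraceRequest(HttpUriRequest request, HttpHost host) throws IOException {
        String requestLine = "curl -iX " + request.getMethod() + " '" + host + request.getRequestLine().getUri() + "'";
        if (request instanceof HttpEntityEnclosingRequest) {
            HttpEntityEnclosingRequest enclosingRequest = (HttpEntityEnclosingRequest) request;
            HttpEntity entity = enclosingRequest.getEntity();
            if (entity != null) {
                if (entity.isRepeatable() == false) {
                    entity = new BufferedHttpEntity(entity);
                    enclosingRequest.setEntity(entity); // hand the buffered copy back so it can still be sent
                }
                requestLine += " -d '" + EntityUtils.toString(entity, StandardCharsets.UTF_8) + "'";
            }
        }
        return requestLine;
    }
}
```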

@nik9000 nik9000 commented on the diff Jun 4, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ connectionManager.setDefaultMaxPerRoute(10);
+ connectionManager.setMaxTotal(30);
+
+ //default timeouts are all infinite
+ RequestConfig requestConfig = RequestConfig.custom().setConnectTimeout(DEFAULT_CONNECT_TIMEOUT)
+ .setSocketTimeout(DEFAULT_SOCKET_TIMEOUT)
+ .setConnectionRequestTimeout(DEFAULT_CONNECTION_REQUEST_TIMEOUT).build();
+ return HttpClientBuilder.create().setConnectionManager(connectionManager).setDefaultRequestConfig(requestConfig).build();
+ }
+ }
+
+ /**
+ * Listener that allows to be notified whenever a failure happens. Useful when sniffing is enabled, so that we can sniff on failure.
+ * The default implementation is a no-op.
+ */
+ public static class FailureListener {
@nik9000
nik9000 Jun 4, 2016 Contributor

It feels weird to have it as a class instead of an interface.

@javanna
javanna Jun 4, 2016 Member

I tend to agree, I just didn't want to have an interface and a default empty impl. default methods would help :(

@nik9000 nik9000 commented on the diff Jun 4, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ Objects.requireNonNull(host, "host cannot be null");
+ httpHosts.add(host);
+ }
+ this.hosts = Collections.unmodifiableSet(httpHosts);
+ this.blacklist.clear();
+ }
+
+ /**
+ * Sends a request to the elasticsearch cluster that the current client points to.
+ * Selects a host out of the provided ones in a round-robin fashion. Failing hosts are marked dead and retried after a certain
+ * amount of time (minimum 1 minute, maximum 30 minutes), depending on how many times they previously failed (the more failures,
+ * the later they will be retried). In case of failures all of the alive nodes (or dead nodes that deserve a retry) are retried
+ * till one responds or none of them does, in which case an {@link IOException} will be thrown.
+ *
+ * @param method the http method
+ * @param endpoint the path of the request (without host and port)
@nik9000
nik9000 Jun 4, 2016 Contributor

Maybe just call it path?

It'd be nice to have some path building help so we're not just slamming strings together like animals. Vulnerable animals.

@javanna
javanna Jun 4, 2016 Member

this path doesn't contain query_string params though, that is why I called it endpoint. params come in as a map. What did you have in mind?

@nik9000
nik9000 Jun 5, 2016 Contributor

I dunno! I guess we don't have a thing to build the body either so maybe we just say "that is what low level client means, we'll cover that later". Which is fine.

@javanna
javanna Jun 6, 2016 Member

yes this is what the low level client is about, no builders for requests, only the raw performRequest method. We will later add api-specific classes to do things like client.index(index, type, id, body). Still not sure what the body will look like, might still be quite raw to leave room for external serialization libraries etc.

@rmuir rmuir added the das awesome label Jun 4, 2016
@javanna
Member
javanna commented Jun 4, 2016

thanks for taking a look @nik9000 :

We'll probably want a different list for the clients, but I think you can make SuppressForbidden work just by making your own annotation named that.

The only problem is the HttpServer we use for testing. I tried with @SuppressForbidden and excludes but neither of them worked for me, no idea why. help! :)

What lengths do we plan to go to to get 1.7 compatibility? How far do we feel like we have to go before we can merge vs before we can release?

I don't know. I don't feel safe enough compiling against java 8, even with target and source 1.7. But we could solve it by having CI compile and run against these modules with a real java 7? Not sure it will work though. I guess we require 1.8 for the build itself?

@costin
Member
costin commented Jun 4, 2016

A big +1 for Java 7 compatibility from my side. To keep things in check and make sure we don't end up using Java 8 APIs by accident, we could use something like Animal Sniffer to double check they are not leaking in.
This way the rest client could simply compile on Java8 like the rest of ES.

Another idea though is to make the rest client a separate project from ES mainly because their lifecycles are different. A big advantage of using REST is that the client is not tightly coupled to ES itself and thus it is not forced to upgrade on dot releases.

@dadoonet
Member
dadoonet commented Jun 4, 2016

Another idea though is to make the rest client a separate project from ES

Was thinking the same.

It should be treated like any other client IMO.

@djschny
Contributor
djschny commented Jun 5, 2016

+++ for separate from ES code base; it's the proper way

@nik9000
Contributor
nik9000 commented Jun 5, 2016

This proposal puts it in the Elasticsearch project because Elasticsearch is required to test it and because it is required to test Elasticsearch. Running the REST tests does that for both of them. And the http client is used to test a few places inside Elasticsearch. And it'll be used in the reindex module for reindex-from-remote.

I agree the two projects probably deserve different release cycles though. I agree it doesn't make a lot of sense to version the initial release of this 5.0.0-beta3. Nor does it make sense to release versions of it every time we release an Elasticsearch version. There just might not be any changes. And sometimes we might want to release the client without Elasticsearch. Releasing Elasticsearch is a big effort but I expect we'll want to release the client more frequently at first.

Speaking of releases, we'll probably also want to have a version matrix for the java client too one day.

I don't know how to make sense of any of this. The second half of my comment is all "it should be its own project" stuff but the first half makes it pretty clear that "it isn't separate". I dunno what the solution is. Gradle should let us technically organize it and release it how we like. I think we'll just have to pick the least painful way for all involved.

@s1monw
Contributor
s1monw commented Jun 5, 2016 edited

+++ for separate from ES code base; it's the proper way

we will have to share code eventually - moving to a different code base now will only complicate things. Don't put statements like this in such an issue; they are just distraction and bike-shedding. We get a client in now to solve a ton of problems. I don't know yet if we will have different release cycles; we can, but we don't have to, and there is nothing forcing us except potential scenarios. In reality clients are released to:

  • fix bugs (we do that too in ES very very frequently)
  • add support for APIs (ES has to be released for this)

there is nothing that couples the REST client to an ES release, so there is zero argument to put it into a different codebase. We can have this discussion later if there are good reasons, not just bike-shedding.

I agree the two projects probably deserve different release cycles though. I agree it doesn't make a lot of sense to version the initial release of this 5.0.0-beta3. Nor does it make sense to release versions of it every time we release an Elasticsearch version. There just might not be any changes. And sometimes we might want to release the client without Elasticsearch. Releasing Elasticsearch is a big effort but I expect we'll want to release the client more frequently at first.

this discussion can happen once we have something committed and working. Our major target here is a replacement for the transport client: remove the dependency on the binary protocol and in turn only depend on the REST interface. My reasoning for going with java 7 was to allow the server to move faster if we need to and to make it interesting for users that have a java 7 requirement. We will stick with core and its release cycle; we don't need to rush anything. Stuff will be experimental, we will move as fast as we can just as we do with core.

I don't know how to make sense of any of this. The second half of my comment is all "it should be its own project" stuff but the first half makes it pretty clear that "it isn't separate". I dunno what the solution is. Gradle should let us technically organize it and release it how we like. I think we'll just have to pick the least painful way for all involved.

we don't need to be perfect on the first shot. It's experimental, we can move things out later, but at this stage we need to move fast and get stuff out there to gain feedback; let's not complicate things because the code is not as well organized as it could be.

@djschny
Contributor
djschny commented Jun 5, 2016

Don't put statements like this in such an issue; they are just distraction and bike-shedding.

It was not meant to be a distraction; it was to register feedback for that direction. My statement did not imply that it had to be done as part of this pull request, and that was not the intention. How else would the team recommend the community engage in discussion of this nature if not here?

@nik9000
Contributor
nik9000 commented Jun 5, 2016

I'm happy with this direction and think we should do it. We can always move things around later. I vote @javanna crams some NORELEASEs in it and merges it and we open issues for things like the version number (#18741), and whatever else we see. We can resolve them as we can resolve them. Getting this in soon lets more folks work in parallel to improve stuff and lets me start using it for things like reindex from remote. We're still early enough in the 5.0 release cycle that this should be safe.

I'm happy to work on the build issues if @rjernst is willing to review my horrible gradle hackery.

@nik9000
Contributor
nik9000 commented Jun 5, 2016

So I just looked into running animal sniffer. I think that might be enough. I mean, we seem to be doing it ok in 2.x. Right?

Using this (so far very easy) gradle plugin, running animal sniffer is like a 3 line build change:
https://gist.github.com/nik9000/0b150b8a186db8216d52aeb7a9e09fa7

It can wait until after the merge but I've added this here so if we want to use it we can.

@KodrAus KodrAus referenced this pull request in elastic-rs/elastic Jun 5, 2016
Closed

Add Client Sniffer #122

@nik9000
Contributor
nik9000 commented Jun 6, 2016

It can wait until after the merge

It should wait until after the merge because it doesn't pass. But that is fine. We can fix it!

@javanna
Member
javanna commented Jun 6, 2016 edited

I addressed some comments, but before we discuss NORELEASEs and we shove this PR in, I think it needs some serious cycles of review, too few comments till now :)

The SHA problem with the client submodule is a blocker, as it needs a new SHA at each change which is not feasible. I am looking into it and reached out to Nik for help. Same problem with licenses.

What do others think about animal-sniffer? shall I integrate it into the build?

@s1monw s1monw was assigned by javanna Jun 6, 2016
@nik9000
Contributor
nik9000 commented Jun 6, 2016

What do others think about animal-sniffer? shall I integrate it into the build?

I'd do it after, I think. It saw quite a few small things that needed to change, but nothing jumped out at me as scary.

@s1monw s1monw and 1 other commented on an outdated diff Jun 7, 2016
...main/java/org/elasticsearch/client/DeadHostState.java
+ this.failedAttempts = 1;
+ this.deadUntil = System.nanoTime() + MIN_CONNECTION_TIMEOUT_NANOS;
+ }
+
+ DeadHostState(DeadHostState previousDeadHostState) {
+ long timeoutNanos = (long)Math.min(MIN_CONNECTION_TIMEOUT_NANOS * 2 * Math.pow(2, previousDeadHostState.failedAttempts * 0.5 - 1),
+ MAX_CONNECTION_TIMEOUT_NANOS);
+ this.deadUntil = System.nanoTime() + timeoutNanos;
+ this.failedAttempts = previousDeadHostState.failedAttempts + 1;
+ }
+
+ /**
+ * Returns the timestamp (nanos) till the host is supposed to stay dead without being retried.
+ * After that the host should be retried.
+ */
+ long getDeadUntil() {
@s1monw
s1monw Jun 7, 2016 Contributor

should we put the unit in the method name?

@s1monw s1monw and 1 other commented on an outdated diff Jun 7, 2016
...sticsearch/client/ElasticsearchResponseException.java
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.elasticsearch.client;
+
+import java.io.IOException;
+
+/**
+ * Exception thrown when an elasticsearch node responds to a request with a status code that indicates an error.
+ * Note that the response body gets passed in as a string and read eagerly, which means that the ElasticsearchResponse object
+ * is expected to be closed and available only to read metadata like status line, request line, response headers.
+ */
+public class ElasticsearchResponseException extends IOException {
@s1monw
s1monw Jun 7, 2016 Contributor

I really wonder if we need Elasticsearch in the name? Can't we just use ResponseException? The package is implicit.

@javanna
javanna Jun 9, 2016 Member

Agreed, renaming it and also ElasticsearchResponse.

@s1monw s1monw and 1 other commented on an outdated diff Jun 7, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ * Requests can be traced by enabling trace logging for "tracer". The trace logger outputs requests and responses in curl format.
+ */
+public final class RestClient implements Closeable {
+
+ private static final Log logger = LogFactory.getLog(RestClient.class);
+ public static ContentType JSON_CONTENT_TYPE = ContentType.create("application/json", Consts.UTF_8);
+
+ private final CloseableHttpClient client;
+ //we don't rely on default headers supported by HttpClient as those cannot be replaced, plus it would get hairy
+ //when we create the HttpClient instance on our own as there would be two different ways to set the default headers.
+ private final Header[] defaultHeaders;
+ private final long maxRetryTimeout;
+ private final AtomicInteger lastHostIndex = new AtomicInteger(0);
+ private volatile Set<HttpHost> hosts;
+ private final ConcurrentMap<HttpHost, DeadHostState> blacklist = new ConcurrentHashMap<>();
+ private volatile FailureListener failureListener = new FailureListener();
@s1monw
s1monw Jun 7, 2016 Contributor

failureListener should not be modifiable? We only accept one and that's it. If somebody wants to make it updateable, that is simple via delegates?

@javanna
javanna Jun 9, 2016 Member

you mean adding it as a constructor argument? I think that doesn't work given that Sniffer (which implements FailureListener) needs the RestClient at construction time?

@s1monw s1monw and 1 other commented on an outdated diff Jun 7, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ * @param headers the optional request headers
+ * @return the response returned by elasticsearch
+ * @throws IOException in case of a problem or the connection was aborted
+ * @throws ClientProtocolException in case of an http protocol error
+ * @throws ElasticsearchResponseException in case elasticsearch responded with a status code that indicated an error
+ */
+ public ElasticsearchResponse performRequest(String method, String endpoint, Map<String, String> params,
+ HttpEntity entity, Header... headers) throws IOException {
+ URI uri = buildUri(endpoint, params);
+ HttpRequestBase request = createHttpRequest(method, uri, entity);
+ setHeaders(request, headers);
+ //we apply a soft margin so that e.g. if a request took 59 seconds and timeout is set to 60 we don't do another attempt
+ long retryTimeout = Math.round(this.maxRetryTimeout / (float)100 * 98);
+ IOException lastSeenException = null;
+ long startTime = System.nanoTime();
+ Iterator<HttpHost> hostIterator = nextHost();
@s1monw
s1monw Jun 7, 2016 Contributor

can this be a `for (HttpHost host : nextHost())` if `nextHost()` would return `Iterable`?

@s1monw s1monw and 1 other commented on an outdated diff Jun 7, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ * @throws ElasticsearchResponseException in case elasticsearch responded with a status code that indicated an error
+ */
+ public ElasticsearchResponse performRequest(String method, String endpoint, Map<String, String> params,
+ HttpEntity entity, Header... headers) throws IOException {
+ URI uri = buildUri(endpoint, params);
+ HttpRequestBase request = createHttpRequest(method, uri, entity);
+ setHeaders(request, headers);
+ //we apply a soft margin so that e.g. if a request took 59 seconds and timeout is set to 60 we don't do another attempt
+ long retryTimeout = Math.round(this.maxRetryTimeout / (float)100 * 98);
+ IOException lastSeenException = null;
+ long startTime = System.nanoTime();
+ Iterator<HttpHost> hostIterator = nextHost();
+ while (hostIterator.hasNext()) {
+ HttpHost host = hostIterator.next();
+
+ if (lastSeenException != null) {
@s1monw
s1monw Jun 7, 2016 Contributor

this is the retry case? can you leave a comment or factor this into a method that has a good name?

@s1monw s1monw and 1 other commented on an outdated diff Jun 7, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ RequestLogger.log(logger, "request failed", request, host, response);
+ String responseBody;
+ try {
+ if (elasticsearchResponse.getEntity() == null) {
+ responseBody = null;
+ } else {
+ responseBody = EntityUtils.toString(elasticsearchResponse.getEntity());
+ }
+ } finally {
+ elasticsearchResponse.close();
+ }
+ ElasticsearchResponseException elasticsearchResponseException = new ElasticsearchResponseException(
+ elasticsearchResponse, responseBody);
+ lastSeenException = addSuppressedException(lastSeenException, elasticsearchResponseException);
+ //clients don't retry on 500 because elasticsearch still misuses it instead of 400 in some places
+ if (statusCode == 502 || statusCode == 503 || statusCode == 504) {
@s1monw
s1monw Jun 7, 2016 Contributor

would switch / case look nicer here? just a nit...

@javanna
javanna Jun 9, 2016 Member

will do that

javanna remove TODO around using /_cat/nodes rather than /_nodes, test compatibility with 2.x

_cat/nodes returns the http address only in 5.x. Would be nice to use as we could drop the jackson dependency, but we care more about being compatible with 2.x.
Not compatible with previous versions as the format of returned http addresses has changed since 2.0.

Also fixed a test bug that caused sporadic failures.
791db1f
@tstibbs
Contributor
tstibbs commented Jun 8, 2016

I'm really excited about a java rest client, but not sure if I am missing something here - it doesn't seem to add very much above what you get from just using apache http client?

I realise this is referred to as a "low level" client - long term are we going to see something that implements the org.elasticsearch.client.Client interface and uses the IndexRequest and IndexResponse classes? This would make it much easier to migrate.

@s1monw
Contributor
s1monw commented Jun 8, 2016

I'm really excited about a java rest client, but not sure if I am missing something here - it doesn't seem to add very much above what you get from just using apache http client?

you are right, it will only provide sniffing etc. It will also be tested and have defaults etc.

I realise this is referred to as a "low level" client - long term are we going to see something that implements the org.elasticsearch.client.Client interface and uses the IndexRequest and IndexResponse classes? This would make it much easier to migrate.

we can't do this at this point, and it will never implement that interface. It would pull in all ES dependencies and you would have no win whatsoever. We will build sugar on top of the low level client, especially for CRUD actions (search etc.), but not in this issue. This is purely the infrastructure to build on top of. Imagine all the API work: this PR would be unreviewable!

Trust me, we will invest a lot into this client; it will remove the transport client etc. and make REST the primary interface. We will go very incremental and wise on this development. We might change our mind X times down the road, but this is a feature for users and it should have good APIs; they will just come later.

@javanna
Member
javanna commented Jun 9, 2016

I have addressed all comments I got up until now. Here is what's left:

  • ignore our own libraries in dependency licenses check and sha check (in progress)
  • make @SuppressForbidden work for forbiddenApisTest (in progress)
  • move tests to RandomizedRunner rather than LuceneTestCase. Are we sold on having a client-test project shared by client and client-sniffer? I wish there were better options than yet another module.
  • deeper java7 compatibility check. Shall we go for animal sniffer? I didn't get much feedback on that yet.
@s1monw
Contributor
s1monw commented Jun 9, 2016 edited

deeper java7 compatibility check. Shall we go for animal sniffer? I didn't get much feedback on that yet.

++

move tests to RandomizedRunner rather than LuceneTestCase. Are we sold on having a client-test project shared by client and client-sniffer? I wish there were better options than yet another module.

is there a way to have it all in a single module and have dedicated builds to build the individual jars? maybe @rjernst has ideas

@s1monw
Contributor
s1monw commented Jun 9, 2016

@javanna I wonder if we can merge all in one module and make the dependencies on jackson etc optional?

@nik9000
Contributor
nik9000 commented Jun 9, 2016

@javanna I wonder if we can merge all in one module and make the dependencies on jackson etc optional?

I'm not sure it is worth the effort to have fewer modules. The cost of a module isn't super high compared to the complexity of optional dependencies, building multiple jars from the same code, and making sure the dependencies are right for all of those jars. Those problems are the things that having multiple modules is meant to solve.

@s1monw s1monw and 1 other commented on an outdated diff Jun 10, 2016
...main/java/org/elasticsearch/client/sniff/Sniffer.java
+ this.sniffIntervalMillis = sniffIntervalMillis;
+ this.sniffAfterFailureDelayMillis = sniffAfterFailureDelayMillis;
+ this.scheduledExecutorService = Executors.newScheduledThreadPool(1);
+ scheduleNextRun(0);
+ }
+
+ synchronized void scheduleNextRun(long delayMillis) {
+ if (scheduledExecutorService.isShutdown() == false) {
+ try {
+ if (scheduledFuture != null) {
+ //regardless of when the next sniff is scheduled, cancel it and schedule a new one with updated delay
+ this.scheduledFuture.cancel(false);
+ }
+ logger.debug("scheduling next sniff in " + delayMillis + " ms");
+ this.scheduledFuture = this.scheduledExecutorService.schedule(this, delayMillis, TimeUnit.MILLISECONDS);
+ } catch(Throwable t) {
@s1monw
s1monw Jun 10, 2016 Contributor

maybe just catch Exception (let's not start with Throwable in a new project :)

@s1monw s1monw and 1 other commented on an outdated diff Jun 10, 2016
...main/java/org/elasticsearch/client/sniff/Sniffer.java
+ }
+
+ void sniffOnFailure(HttpHost failedHost) {
+ sniff(failedHost, sniffAfterFailureDelayMillis);
+ }
+
+ void sniff(HttpHost excludeHost, long nextSniffDelayMillis) {
+ if (running.compareAndSet(false, true)) {
+ try {
+ List<HttpHost> sniffedNodes = hostsSniffer.sniffHosts();
+ if (excludeHost != null) {
+ sniffedNodes.remove(excludeHost);
+ }
+ logger.debug("sniffed nodes: " + sniffedNodes);
+ this.restClient.setHosts(sniffedNodes.toArray(new HttpHost[sniffedNodes.size()]));
+ } catch (Throwable t) {
@s1monw
s1monw Jun 10, 2016 Contributor

same here

@s1monw s1monw and 1 other commented on an outdated diff Jun 10, 2016
...main/java/org/elasticsearch/client/sniff/Sniffer.java
+ scheduleNextRun(nextSniffDelayMillis);
+ running.set(false);
+ }
+ }
+ }
+
+ synchronized void shutdown() {
+ scheduledExecutorService.shutdown();
+ try {
+ if (scheduledExecutorService.awaitTermination(1000, TimeUnit.MILLISECONDS)) {
+ return;
+ }
+ } catch (InterruptedException e) {
+ Thread.currentThread().interrupt();
+ }
+ scheduledExecutorService.shutdownNow();
@s1monw
s1monw Jun 10, 2016 Contributor

I think we should move the shutdownNow into the try block, since if we are interrupted we don't wanna run that? I also wonder if we'd rather want to declare throws InterruptedException and let the caller deal with it. I think this is wise since we are a client lib here

@javanna
javanna Jun 10, 2016 Member

About shutdown throwing InterruptedException: the method is called as part of close, so we will have to deal with it ourselves anyway I'm afraid? Unless we require people to call shutdown separately. WDYT?

@s1monw s1monw commented on the diff Jun 10, 2016
...org/elasticsearch/client/sniff/HostsSnifferTests.java
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.hamcrest.CoreMatchers.containsString;
+import static org.hamcrest.CoreMatchers.equalTo;
+
+@SuppressForbidden(reason = "uses sun HttpServer")
+public class HostsSnifferTests extends LuceneTestCase {
+
+ private int sniffRequestTimeout;
+ private HostsSniffer.Scheme scheme;
+ private SniffResponse sniffResponse;
+ private HttpServer httpServer;
@s1monw
s1monw Jun 10, 2016 Contributor

I wonder what the deal is with this sun http server. It is annotated with @Exported which means it will be around for at least 2 releases of the JDK so 4 years? I think it should be a general exception in our test policy instead of @SuppressForbidden

https://docs.oracle.com/javase/8/docs/jdk/api/javac/tree/jdk/Exported.html

@javanna
javanna Jun 10, 2016 Member

fine with me...but I can only use jdk-signatures (rather than any es-* signature), which is for both test and prod. Doesn't seem like the exclusion should go there. Maybe these few suppress are better than introducing a new jdk signatures file for tests only?

@s1monw
s1monw Jun 10, 2016 Contributor

I just tried this and added the HttpServer to a test in core and forbidden-apis doesn't fail. Not sure what's going on, but we allow this in core?

@s1monw
s1monw Jun 10, 2016 Contributor

it's also in none of the signature files

@javanna
javanna Jun 10, 2016 Member

What makes it forbidden is jdk-non-portable. This check is not based on a signature file but rather on a heuristic. Opened policeman-tools/forbidden-apis#103 .

@uschindler
uschindler Jun 10, 2016 edited Contributor

The reason is that it is in the jdk-non-portable signature. It is in fact not portable, because other JDKs may not have that class. @Exported just says that it's non-private on Oracle JDKs, but it is not part of any Java standard.

You have 2 ways to solve:

  • Use jdk-internal as signature (subset) instead of jdk-non-portable
  • Add @SuppressForbidden (I'd recommend that)

Here are the contents of the Oracle blacklist; see also the comment at the top of the file:
https://github.com/policeman-tools/forbidden-apis/blob/master/src/main/resources/de/thetaphi/forbiddenapis/signatures/jdk-internal-1.8.txt

@s1monw s1monw and 1 other commented on an outdated diff Jun 10, 2016
...main/java/org/elasticsearch/client/DeadHostState.java
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.elasticsearch.client;
+
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Holds the state of a dead connection to a host. Keeps track of how many failed attempts were performed and
+ * when the host should be retried (based on number of previous failed attempts).
+ * Class is immutable, a new copy of it should be created each time the state has to be changed.
+ */
+class DeadHostState {
@s1monw
s1monw Jun 10, 2016 Contributor

can be final?

@javanna javanna referenced this pull request in policeman-tools/forbidden-apis Jun 10, 2016
Closed

com.sun.net.httpserver should not be forbidden #103

@uschindler uschindler commented on an outdated diff Jun 10, 2016
client-sniffer/build.gradle
+ testCompile "org.codehaus.mojo:animal-sniffer-annotations:1.15"
+ signature "org.codehaus.mojo.signature:java17:1.0@signature"
+}
+
+compileJava.options.compilerArgs << '-target' << '1.7' << '-source' << '1.7' << '-Xlint:all,-path,-serial,-options'
+compileTestJava.options.compilerArgs << '-target' << '1.7' << '-source' << '1.7'
+
+forbiddenApisMain {
+ //client does not depend on core, so only jdk signatures should be checked
+ signaturesURLs = [PrecommitTasks.getResource('/forbidden/jdk-signatures.txt')]
+}
+
+forbiddenApisTest {
+ //we are excluding jdk-non-portable to allow for com.sun.net.httpserver.* usage
+ //TODO remove this line once https://github.com/policeman-tools/forbidden-apis/issues/103 gets solved
+ bundledSignatures = ['jdk-unsafe', 'jdk-deprecated', 'jdk-system-out']
@uschindler
uschindler Jun 10, 2016 Contributor

You should use jdk-internal instead of jdk-non-portable for this special case. But keep in mind that this whole module may not work on non-Oracle JDKs (and it's non-portable because of that).

@uschindler
uschindler Jun 10, 2016 Contributor

You should also use groovy logic to do this dynamically:

bundledSignatures -= 'jdk-non-portable'
bundledSignatures += 'jdk-internal'

@uschindler uschindler and 1 other commented on an outdated diff Jun 10, 2016
client-sniffer/build.gradle
+ compile "commons-codec:commons-codec:${versions.commonscodec}"
+ compile "commons-logging:commons-logging:${versions.commonslogging}"
+ compile "com.fasterxml.jackson.core:jackson-core:${versions.jackson}"
+
+ testCompile "com.carrotsearch.randomizedtesting:randomizedtesting-runner:${versions.randomizedrunner}"
+ testCompile "junit:junit:${versions.junit}"
+ testCompile "org.hamcrest:hamcrest-all:${versions.hamcrest}"
+ testCompile "org.apache.lucene:lucene-test-framework:${versions.lucene}"
+ testCompile "org.apache.lucene:lucene-core:${versions.lucene}"
+ testCompile "org.apache.lucene:lucene-codecs:${versions.lucene}"
+ testCompile "org.elasticsearch:securemock:${versions.securemock}"
+ testCompile "org.codehaus.mojo:animal-sniffer-annotations:1.15"
+ signature "org.codehaus.mojo.signature:java17:1.0@signature"
+}
+
+compileJava.options.compilerArgs << '-target' << '1.7' << '-source' << '1.7' << '-Xlint:all,-path,-serial,-options'
@uschindler
uschindler Jun 10, 2016 Contributor

Shouldn't forbiddenapis also use version 1.7 signatures? With Java 9 it might otherwise fail, because there could be signatures in 1.8 that don't resolve in Java 7.

targetCompatibility = 1.7
@javanna
javanna Jun 13, 2016 Member

yes you are right, thanks @uschindler !

@s1monw s1monw and 1 other commented on an outdated diff Jun 13, 2016
...main/java/org/elasticsearch/client/DeadHostState.java
+
+ private static final long MIN_CONNECTION_TIMEOUT_NANOS = TimeUnit.MINUTES.toNanos(1);
+ private static final long MAX_CONNECTION_TIMEOUT_NANOS = TimeUnit.MINUTES.toNanos(30);
+
+ static final DeadHostState INITIAL_DEAD_STATE = new DeadHostState();
+
+ private final int failedAttempts;
+ private final long deadUntilNanos;
+
+ private DeadHostState() {
+ this.failedAttempts = 1;
+ this.deadUntilNanos = System.nanoTime() + MIN_CONNECTION_TIMEOUT_NANOS;
+ }
+
+ DeadHostState(DeadHostState previousDeadHostState) {
+ long timeoutNanos = (long)Math.min(MIN_CONNECTION_TIMEOUT_NANOS * 2 * Math.pow(2, previousDeadHostState.failedAttempts * 0.5 - 1),
@s1monw
s1monw Jun 13, 2016 Contributor

what is the heuristic here and where does it come from?

@javanna
javanna Jun 13, 2016 Member

it's a back-off mechanism that all the language clients have. We keep track of how many times a certain node fails consecutively, and the higher that number is the longer we will wait to retry that node again. Minimum is 1 minute (for a node that only failed once), maximum is 30 minutes (for a node that failed many consecutive times).
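
For the record, the formula from the diff above grows by roughly a factor of sqrt(2) per consecutive failure, starting at about a minute and capped at 30 minutes (the very first failure uses the 1 minute minimum directly). A small sketch that prints the resulting schedule:

```java
import java.util.concurrent.TimeUnit;

public class DeadHostBackoff {
    private static final long MIN_CONNECTION_TIMEOUT_NANOS = TimeUnit.MINUTES.toNanos(1);
    private static final long MAX_CONNECTION_TIMEOUT_NANOS = TimeUnit.MINUTES.toNanos(30);

    // Same expression as in the DeadHostState constructor shown in the diff.
    static long timeoutNanos(int previousFailedAttempts) {
        return (long) Math.min(MIN_CONNECTION_TIMEOUT_NANOS * 2 * Math.pow(2, previousFailedAttempts * 0.5 - 1),
                MAX_CONNECTION_TIMEOUT_NANOS);
    }

    public static void main(String[] args) {
        for (int attempts = 1; attempts <= 12; attempts++) {
            System.out.println(attempts + " previous failed attempts -> retry after ~"
                    + TimeUnit.NANOSECONDS.toSeconds(timeoutNanos(attempts)) + "s");
        }
    }
}
```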

@s1monw
s1monw Jun 13, 2016 Contributor

can you explain it in the code like this?

@s1monw s1monw commented on the diff Jun 13, 2016
...main/java/org/elasticsearch/client/RequestLogger.java
+import org.apache.http.HttpHost;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.methods.HttpUriRequest;
+import org.apache.http.entity.BufferedHttpEntity;
+import org.apache.http.entity.ContentType;
+import org.apache.http.util.EntityUtils;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+
+/**
+ * Helper class that exposes static methods to unify the way requests are logged.
+ * Includes trace logging to log complete requests and responses in curl format.
@s1monw
s1monw Jun 13, 2016 Contributor

can you comment here why we do log in curl format? why is this useful and why is this class added at all

@javanna
javanna Jun 13, 2016 Member

It is useful for debugging. The curl format helps manually sending those requests via curl. trace logging is a feature that all the language clients provide.

@s1monw
s1monw Jun 13, 2016 Contributor

no I mean put this in for the future reader :)

@s1monw s1monw commented on the diff Jun 13, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ onSuccess(host);
+ return response;
+ }
+ RequestLogger.log(logger, "request failed", request, host, httpResponse);
+ String responseBody;
+ try {
+ if (response.getEntity() == null) {
+ responseBody = null;
+ } else {
+ responseBody = EntityUtils.toString(response.getEntity());
+ }
+ } finally {
+ response.close();
+ }
+ lastSeenException = addSuppressedException(lastSeenException, new ResponseException(response, responseBody));
+ switch(statusCode) {
@s1monw
s1monw Jun 13, 2016 Contributor

should we do this for all codes >= 500? I think so?

@javanna
javanna Jun 13, 2016 Member

It's a conscious decision among the client developers. None of the language clients retry on 500; see elastic/elasticsearch-net#2061 .

@s1monw
s1monw Jun 13, 2016 Contributor

why retry when we fail?

@javanna
javanna Jun 13, 2016 Member

we retry because codes like 502, 503 and 504 are seen as errors that could happen on one single node, not as request errors. It is then very likely that the request will succeed on the next node.

@s1monw
s1monw Jun 13, 2016 Contributor

we do not retry, we fail that node - that is what I am saying. I am not sure; I think we should fail that node and not call it a success if we see a 500?

@javanna
javanna Jun 13, 2016 Member

There are two scenarios in our clients:

  • request error: return error back, don't retry, call onSuccess
  • node failure: retry all nodes, call onFailure at each failure (can happen with IOExceptions or 502, 503, 504).

500 is not treated like the other 5xx codes such as 502, 503 or 504, but like a request error. With 500 we do not retry; "connection"-wise it's a success, we simply return the error and expect further action from users. The assumption is that retrying won't fix the problem. This is what I am told all the language clients do; maybe somebody from @elastic/es-clients would like to better explain this decision?
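
In code, the distinction boils down to something like this sketch (not the exact method from the diff; the real code also records the ResponseException and updates the blacklist):

```java
// 502/503/504 count as a node failure: blacklist the host and retry the next one.
// Every other error status, including 500, counts as a request error: the host
// stays alive and the error is surfaced to the caller.
final class RetryDecisionSketch {

    enum Decision { RETRY_NEXT_HOST, FAIL_REQUEST }

    static Decision onErrorStatus(int statusCode) {
        switch (statusCode) {
            case 502:
            case 503:
            case 504:
                return Decision.RETRY_NEXT_HOST; // mark the host dead and move on
            default:
                return Decision.FAIL_REQUEST;    // mark the host alive, throw the ResponseException
        }
    }
}
```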

@HonzaKral
HonzaKral Jun 13, 2016 Member

The reasoning behind the 500 behavior is that there are no guarantees and no information on which to base a decision. There is no place in elasticsearch (please correct me if I am wrong) that returns 500 willingly, it's just a catch-all for "something went wrong" (I think of it as RuntimeException) so we just ignore it.

@s1monw
s1monw Jun 14, 2016 Contributor

@javanna I think what causes the confusion here is the naming. It makes no sense to call onSuccess if we got a 500. I think we should rename onSuccess to something better. I think a method boolean onResponse(int httpCode, HttpHost host) would make it clear: if it returns true we retry and if false we fail. That way we have all the logic contained. Then onFailure and onSuccess can go away and I don't get confused?
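
Rendered as an interface, the proposal would look something like this (illustrative only; the interface name is made up here and the method follows the comment above):

```java
import org.apache.http.HttpHost;

// Single callback that is told about every response/failure and decides whether
// the host should be retried. Not part of the PR as it stands.
interface ResponseListener {
    /**
     * @return true if the request should be retried on another host, false to fail it
     */
    boolean onResponse(int httpCode, HttpHost host);
}
```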

@javanna
javanna Jun 21, 2016 Member

I agree on the confusion due to naming, fixing.

@s1monw s1monw and 1 other commented on an outdated diff Jun 13, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ //last resort: if there are no good hosts to use, return a single dead one, the one that's closest to being retried
+ List<Map.Entry<HttpHost, DeadHostState>> sortedHosts = new ArrayList<>(blacklist.entrySet());
+ Collections.sort(sortedHosts, new Comparator<Map.Entry<HttpHost, DeadHostState>>() {
+ @Override
+ public int compare(Map.Entry<HttpHost, DeadHostState> o1, Map.Entry<HttpHost, DeadHostState> o2) {
+ return Long.compare(o1.getValue().getDeadUntilNanos(), o2.getValue().getDeadUntilNanos());
+ }
+ });
+ HttpHost deadHost = sortedHosts.get(0).getKey();
+ logger.trace("resurrecting host [" + deadHost + "]");
+ return Collections.singleton(deadHost);
+ }
+
+ List<HttpHost> rotatedHosts = new ArrayList<>(filteredHosts);
+ //TODO is it possible to make this O(1)? (rotate is O(n))
+ Collections.rotate(rotatedHosts, rotatedHosts.size() - lastHostIndex.getAndIncrement());
@s1monw
s1monw Jun 13, 2016 Contributor

just use an array and iterate over it with a different start offset?

@javanna
javanna Jun 13, 2016 Member

wouldn't that still be O(n) though? I mean we would create a new array whose first element is at position lastHostIndex.getAndIncrement() in the original one. Something very similar to what rotate does, if not the same?

@s1monw
s1monw Jun 13, 2016 Contributor

rotate copies data; I think you don't need to do this if you use something like the RotatedList we have in core... maybe it's ok to copy it for this purpose? But I think we can just stick with the collection one we have. If you would not use a set it would be cheaper.

@javanna
javanna Jun 13, 2016 Member

ok I will keep it as-is and remove the TODO for now

@s1monw s1monw commented on the diff Jun 13, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ * Receives as an argument the host that was used for the successful request.
+ */
+ private void onSuccess(HttpHost host) {
+ DeadHostState removedHost = this.blacklist.remove(host);
+ if (logger.isDebugEnabled() && removedHost != null) {
+ logger.debug("removed host [" + host + "] from blacklist");
+ }
+ }
+
+ /**
+ * Called after each failed attempt.
+ * Receives as an argument the host that was used for the failed attempt.
+ */
+ private void onFailure(HttpHost host) throws IOException {
+ while(true) {
+ DeadHostState previousDeadHostState = blacklist.putIfAbsent(host, DeadHostState.INITIAL_DEAD_STATE);
@s1monw
s1monw Jun 13, 2016 Contributor

why do we have this blacklist here? Can we just take the node out of the rotation and wait until somebody calls setHosts to bring it back?

@javanna
javanna Jun 13, 2016 Member

In case there is no sniffer (and not everybody will use it), a node that has failed once would never be used again. We need to differentiate between nodes that have failed once or twice and nodes that are failing a lot. That is why we have a blacklist, and failing nodes have a state that tells us when they need to be retried again.

@s1monw
s1monw Jun 13, 2016 Contributor

hmm should we have a default "sniffer" then that does that?

@javanna
javanna Jun 13, 2016 Member

but the blacklist also works between consecutive setHosts calls (which by default happen every 5 minutes). You do not want to keep on sending requests to a node that has failed n consecutive times, but you do want to retry a node that has failed only once, I think?

@javanna
javanna Jun 13, 2016 edited Member

What I'm saying is that it makes sense to re-evaluate at each node failure, not at each sniff run.

@s1monw
s1monw Jun 13, 2016 Contributor

I am not sure, it feels backwards to have the blacklist in here and check it on every run?

@javanna
javanna Jun 13, 2016 Member

That's because we have the back-off. If we didn't check at every request, we would need a background thread that sets the hosts back depending on when each node failed last and how many consecutive times it failed. Plus, if all the nodes are blacklisted, we still resurrect one of them and try, while if we took them out completely that wouldn't be possible. I feel like this host handling should happen within the client, not externally.

@HonzaKral
HonzaKral Jun 13, 2016 Member

Also, a lot of people use a load balancer in front of http (either just to cross network boundaries or to add things like simple http auth or path/method filtering). They might have multiple of them for HA, but sniffing won't be available in that case, so we cannot rely on it being an option.

@s1monw
s1monw Jun 14, 2016 Contributor

@HonzaKral the failure listener is not bound to a sniffer, it can be anything. Nothing to do with HA or anything else you are mixing into this conversation.

@javanna I agree that background threads are not good. I think that is a design smell and we might wanna take a step back and rethink it for a moment. I would love it if the decision to retry a node or not could be detached from the client. Something like public boolean onFailure(int httpCode, HttpHost host), then internally we can keep the blacklist as it is? WDYT

@javanna
javanna Jun 21, 2016 Member

discussed this with Simon and we decided to merge as-is for now and see whether in the future we still want to take out the retry policy and make it pluggable.
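
To make the back-off being described a bit more concrete, below is a rough sketch of the kind of per-host dead state the thread refers to. It mirrors the DeadHostState usages visible in the diff above (an initial state, a "previous state" constructor and a getDeadUntilNanos() getter), but the concrete timeout values and the class name are assumptions for illustration only, not the merged code:

import java.util.concurrent.TimeUnit;

// Rough sketch of a per-host back-off state; timeouts are illustrative assumptions.
class DeadHostStateSketch {
    private final int failedAttempts;
    private final long deadUntilNanos;

    DeadHostStateSketch() {
        this.failedAttempts = 1;
        this.deadUntilNanos = System.nanoTime() + TimeUnit.MINUTES.toNanos(1);
    }

    DeadHostStateSketch(DeadHostStateSketch previous) {
        this.failedAttempts = previous.failedAttempts + 1;
        // back off exponentially with each consecutive failure, capped so a host
        // always becomes eligible for a retry within a reasonable amount of time
        long backoffMinutes = Math.min(30, 1L << Math.min(failedAttempts - 1, 5));
        this.deadUntilNanos = System.nanoTime() + TimeUnit.MINUTES.toNanos(backoffMinutes);
    }

    long getDeadUntilNanos() {
        return deadUntilNanos;
    }
}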

@s1monw s1monw and 1 other commented on an outdated diff Jun 13, 2016
...rc/main/java/org/elasticsearch/client/RestClient.java
+ if (previousDeadHostState == null) {
+ logger.debug("added host [" + host + "] to blacklist");
+ break;
+ }
+ if (blacklist.replace(host, previousDeadHostState, new DeadHostState(previousDeadHostState))) {
+ logger.debug("updated host [" + host + "] already in blacklist");
+ break;
+ }
+ }
+ failureListener.onFailure(host);
+ }
+
+ /**
+ * Sets a {@link FailureListener} to be notified each and every time a host fails
+ */
+ public synchronized void setFailureListener(FailureListener failureListener) {
@s1monw
s1monw Jun 13, 2016 Contributor

this should be passed as a constructor argument and should not be mutable

@s1monw
s1monw Jun 13, 2016 Contributor

it should also be required I guess?

@javanna
javanna Jun 13, 2016 Member

I replied to the same comment here. I am not sure how this would work given that Sniffer depends on RestClient and not the other way around. The Sniffer, when registered, has to set itself as a failure listener. How can it do that if the failure listener is required when creating the RestClient?

@s1monw
s1monw Jun 13, 2016 Contributor

so to me this is like a chicken-and-egg problem if you use a sniffer on the same rest client as you add hosts to. They should use different policies for handling failures, no? I mean you use one for sniffing and one for searching etc. If we want to use the same one, I think we should rather have the sniffer have a one-to-many relationship, so you can use one sniffer with multiple clients? I am ok with one-to-one, but it should be the sniffer that gets the client injected?

@javanna
javanna Jun 13, 2016 edited Member

I don't know, the FailureListener is really something that I don't expect users to extend, but we needed it so that RestClient doesn't depend on Sniffer. Sniffer is the only implementor, as it optionally does sniffing on failure. Besides that, RestClient already does the right things when it comes to handling failures, and I am not sure I would want users to play with failure listeners.
It is indeed a chicken-and-egg problem. Do we create the Sniffer first, or the RestClient first? I think we want to create the RestClient first, and then have it injected into the Sniffer. But then how do we register the FailureListener with the RestClient? It can't really be immutable, can it?

@s1monw
s1monw Jun 14, 2016 Contributor

ok, do we wanna make it SetOnce.java?
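
For context, a minimal hand-rolled sketch of the set-once semantics being suggested; the class below is an illustration of the idea only (Lucene's SetOnce is the actual utility being referenced), not the client's code:

import java.util.concurrent.atomic.AtomicReference;

// Sketch: a holder that starts empty and can be assigned exactly one time.
class SetOnceSketch<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();

    void set(T value) {
        if (value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        if (ref.compareAndSet(null, value) == false) {
            throw new IllegalStateException("value has already been set");
        }
    }

    T get() {
        return ref.get();
    }
}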

@s1monw s1monw and 1 other commented on an outdated diff Jun 13, 2016
...va/org/elasticsearch/client/RestClientIntegTests.java
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.elasticsearch.client.RestClientTestUtil.getAllStatusCodes;
+import static org.elasticsearch.client.RestClientTestUtil.getHttpMethods;
+import static org.elasticsearch.client.RestClientTestUtil.randomStatusCode;
+import static org.hamcrest.CoreMatchers.equalTo;
+
+/**
+ * Integration test to check interaction between {@link RestClient} and {@link org.apache.http.client.HttpClient}.
+ * Works against a real http server, one single host.
+ */
+@IgnoreJRERequirement
@s1monw
s1monw Jun 13, 2016 Contributor

can you just leave a comment explaining why we added this annotation?

@javanna
javanna Jun 13, 2016 Member

sure, we could potentially have our own annotation (that could require a reason), like SuppressForbidden, but I thought it would be overkill.

@s1monw
Contributor
s1monw commented Jun 13, 2016

I left some more comments - awesome job! I think we are close here!

I think from here we have 2 immediate followups:

  • work on adding an async version of the client for search etc.
  • work on adding a higher-level API to make it easier for users to use the client.

For the latter I think it should again be its own module, and we might even start by just depending on core for things like SearchSourceBuilder? I really think it could take us very far if we start redesigning a client interface just for the basic functions like CRUD, reusing the builders we have in core. We could even make this dependency optional and allow people to use it if they need to?

@javanna
Member
javanna commented Jun 14, 2016

I addressed or replied to all comments.

As for Java 1.7 compatibility, I think what we have now is as good as it gets: we compile with source and target 1.7, use animal-sniffer to detect usage of 1.8 APIs, and tests depend on the lucene-test version 5 line, which is compatible with 1.7. Not sure if this last bit can cause problems in our builds, looking for feedback on that.

There's also a couple of points that are still under discussion about blacklisting nodes and making FailureListener immutable in RestClient. Waiting for some resolution on that too.

@s1monw
Contributor
s1monw commented Jun 14, 2016

@javanna I left some comments to resolve the blockers. LGTM otherwise!!!

@akonczak akonczak referenced this pull request in spring-projects/spring-data-elasticsearch Jun 14, 2016
Open

DATAES-220 - First version of Jest Implementation #147

javanna added some commits Jun 14, 2016
@javanna @javanna javanna Rename RequestLogger#log methods to distinguish between the two
One method is to log a request that yielded a response, the other one for a failed request
1932f6b
@javanna @javanna javanna remove message parameter from RequestLogger methods
This prevents useless string allocation.
cf93e90
@s1monw s1monw commented on an outdated diff Jun 14, 2016
...main/java/org/elasticsearch/client/sniff/Sniffer.java
+public final class Sniffer extends RestClient.FailureListener implements Closeable {
+
+ private static final Log logger = LogFactory.getLog(Sniffer.class);
+
+ private final boolean sniffOnFailure;
+ private final Task task;
+
+ private Sniffer(RestClient restClient, HostsSniffer hostsSniffer, long sniffInterval,
+ boolean sniffOnFailure, long sniffAfterFailureDelay) {
+ this.task = new Task(hostsSniffer, restClient, sniffInterval, sniffAfterFailureDelay);
+ this.sniffOnFailure = sniffOnFailure;
+ restClient.setFailureListener(this);
+ }
+
+ @Override
+ public void onFailure(HttpHost host) throws IOException {
@s1monw
s1monw Jun 14, 2016 Contributor

I think this should be a third class that can even be implemented by a user. It would also allow us to make the FailureListener a ctor argument. All we need to do is add a public void sniff(HttpHost exclude) that the user can call. That way they can just have a FailureListener they register on client creation, which is called on failure and therefore delegates to the sniffer? WDYT
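
A short sketch of the kind of third class being suggested here, so the idea is easier to picture; the sniff(HttpHost exclude) method on Sniffer is the one proposed in the comment, and the remaining names are assumptions rather than the code that was merged:

import java.io.IOException;

import org.apache.http.HttpHost;

// Sketch: a failure listener registered at client creation that delegates to a sniffer.
class SniffOnFailureListenerSketch extends RestClient.FailureListener {

    private volatile Sniffer sniffer;

    void setSniffer(Sniffer sniffer) {
        this.sniffer = sniffer;
    }

    @Override
    public void onFailure(HttpHost host) throws IOException {
        Sniffer currentSniffer = sniffer;
        if (currentSniffer != null) {
            // re-sniff the cluster, excluding the host that just failed
            currentSniffer.sniff(host);
        }
    }
}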

@rjernst rjernst and 1 other commented on an outdated diff Jun 14, 2016
client-sniffer/build.gradle
+ compile "com.fasterxml.jackson.core:jackson-core:${versions.jackson}"
+
+ testCompile "com.carrotsearch.randomizedtesting:randomizedtesting-runner:${versions.randomizedrunner}"
+ testCompile "junit:junit:${versions.junit}"
+ testCompile "org.hamcrest:hamcrest-all:${versions.hamcrest}"
+ //we use the last lucene-test version that is compatible with java 1.7
+ testCompile "org.apache.lucene:lucene-test-framework:5.5.1"
+ testCompile "org.apache.lucene:lucene-core:5.5.1"
+ testCompile "org.apache.lucene:lucene-codecs:5.5.1"
+ testCompile "org.elasticsearch:securemock:${versions.securemock}"
+ testCompile "org.codehaus.mojo:animal-sniffer-annotations:1.15"
+ signature "org.codehaus.mojo.signature:java17:1.0@signature"
+}
+
+compileJava.options.compilerArgs << '-target' << '1.7' << '-source' << '1.7' << '-Xlint:all,-path,-serial,-options'
+compileTestJava.options.compilerArgs << '-target' << '1.7' << '-source' << '1.7'
@rjernst
rjernst Jun 14, 2016 Member

The target and source should already be added by setting project.targetCompatibility and project.sourceCompatibility?

@javanna
javanna Jun 14, 2016 Member

makes sense, I will remove the compilerArgs then

@rjernst
Member
rjernst commented Jun 14, 2016

and tests depend on lucene-test version 5 that is compatible with 1.7. Not sure if this last bit can cause problems in our builds, looking for feedback on that

This should be ok, as long as randomized runner doesn't break compat (ie so both 5 and 6 use the same randomized runner apis). The gradle changes look ok to me.

As a follow up I think we should do what Nik recommended, which was to move these under a client dir. I think then we can have the rest api spec in there as well. Remember that the project name in gradle does not need to match what we call it in maven, but we also need to be careful: I would not call it :client:core because I have seen issues with intellij, and even with gradle itself, getting confused when projects have the same name (even if their full path is different).

@javanna @javanna javanna Build: targetCompatibility and sourceCompatibility are enough to make…
… sure we compile client and client-sniffer with target and source 1.7
70fb67c
@nik9000
Contributor
nik9000 commented Jun 14, 2016

@javanna I just merged #18730 so you can have the naming conventions test when you pick up changes from master.

@nik9000
Contributor
nik9000 commented Jun 14, 2016

As a follow up I think we should do what Nik recommended

I'm awesome!

I would not call it :client:core

Oh...

intellij and even with gradle itself where it gets confused when projects have the same name

Lame! I fixed Eclipse to generate projects so they are named after their path a while back. It is sooooo much nicer that way. All the tests grouped together, all the modules, etc.

@rjernst
Member
rjernst commented Jun 15, 2016

I fixed Eclipse to generate projects so they are named their path a while back.

We can possibly do this (although intellij already has this ability, it is just that you have to "group" subprojects, and I don't know of a native way to do this in the gradle intellij setup). However, that doesn't solve the bugs inside gradle itself. One issue I saw was that when there were two different projects named core, project substitutions got messed up, creating havoc in the build. I tried to reproduce it in isolation, but was unsuccessful.

@uschindler uschindler and 1 other commented on an outdated diff Jun 15, 2016
client/build.gradle
@@ -44,9 +44,6 @@ dependencies {
signature "org.codehaus.mojo.signature:java17:1.0@signature"
}
-compileJava.options.compilerArgs << '-target' << '1.7' << '-source' << '1.7' << '-Xlint:all,-path,-serial,-options'
@uschindler
uschindler Jun 15, 2016 Contributor

This should also work with forbiddenapis, because it inherits targetCompatibility. Just do "gradle --info forbiddenApis" and check what's printed out while loading the signatures files.

@javanna
javanna Jun 15, 2016 Member

gotcha, I confirm it works; removing the additional targetCompatibility from the forbiddenapis config.

javanna added some commits Jun 15, 2016
@javanna @javanna javanna Build: remove explicit targetCompatibility from forbiddenapis config
targetCompatibility set to the project is enough.
7c8013f
@javanna @javanna javanna Merge branch 'master' into feature/http_client ace3a7b
@javanna @javanna javanna Build: do not load integ test class if --skip-integ-tests-in-disguise…
… is specified in NamingConventionsCheck

Projects that don't depend on elasticsearch-test fail otherwise because org.elasticsearch.test.EsIntegTestCase (the default integ test class) is not in the classpath. They should provide their own integ test base class, but having integration tests should not be mandatory. One can simply set skipIntegTestsInDisguise to true to prevent loading of the integ test class.
8c60374
@javanna @javanna javanna Merge branch 'master' into feature/http_client af93533
@javanna
Member
javanna commented Jun 17, 2016

@nik9000 I made the naming conventions check work; there was a problem because the default integ test class is not in the classpath. Can you check 8c60374 please? Not sure it deserves its own PR because the change makes sense only once this PR is in.

@nik9000 nik9000 commented on the diff Jun 17, 2016
...va/org/elasticsearch/test/NamingConventionsCheck.java
@@ -87,9 +92,9 @@ public static void main(String[] args) throws IOException {
assertNoViolations("Found inner classes that are tests, which are excluded from the test runner", check.innerClasses);
assertNoViolations("Pure Unit-Test found must subclass [" + check.testClass.getSimpleName() + "]", check.pureUnitTest);
assertNoViolations("Classes ending with [Tests] must subclass [" + check.testClass.getSimpleName() + "]", check.notImplementing);
- if (!skipIntegTestsInDisguise) {
- assertNoViolations("Subclasses of ESIntegTestCase should end with IT as they are integration tests",
- check.integTestsInDisguise);
+ if (skipIntegTestsInDisguise == false) {
@nik9000
nik9000 Jun 17, 2016 Contributor

Maybe add the integTestClass != null here too?

@javanna
javanna Jun 17, 2016 Member

it can't be null when skipIntegTestsInDisguise is false, no?

@nik9000 nik9000 commented on the diff Jun 17, 2016
...va/org/elasticsearch/test/NamingConventionsCheck.java
@@ -143,7 +148,7 @@ public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IO
String className = filename.substring(0, filename.length() - ".class".length());
Class<?> clazz = loadClassWithoutInitializing(packageName + className);
if (clazz.getName().endsWith("Tests")) {
- if (integTestClass.isAssignableFrom(clazz)) {
+ if (skipTestsInDisguised == false && integTestClass.isAssignableFrom(clazz)) {
@nik9000
nik9000 Jun 17, 2016 Contributor

I'd just check if integTestClass != null rather than make a new boolean.

@javanna
javanna Jun 17, 2016 Member

sorry, I disagree: I didn't add a new boolean but just made the existing one work, and there is no need to load the class if those checks are disabled.

@nik9000
nik9000 Jun 17, 2016 Contributor

I'm ok with merging as is and hacking on it later. I don't like that the boolean which means "I don't care if integ tests sneak into the regular test phase" also means "there aren't integ tests so don't bother looking" but there are worse problems in the world.

@javanna
javanna Jun 21, 2016 Member

sure... I see it differently though: we simply don't need to load the class if we don't do any check. We could maybe just rename the option to "skipIntegTestCheck" or something. Let's postpone this though.

@nik9000
nik9000 Jun 21, 2016 Contributor

++

@nik9000 nik9000 and 1 other commented on an outdated diff Jun 17, 2016
client-sniffer/build.gradle
@@ -61,8 +61,11 @@ forbiddenApisTest {
//JarHell is part of es core, which we don't want to pull in
jarHell.enabled=false
-//NamingConventionCheck is part of test-framework, which we don't want to pull in as it depends on es core
-namingConventions.enabled=false
+
+namingConventions {
+ //we don't have integration tests
+ skipIntegTestInDisguise = true
@nik9000
nik9000 Jun 17, 2016 Contributor

I'd prefer settings integTestClass to null or defaulting it to null and setting it our stuff somewhere else in the build.

@javanna
javanna Jun 17, 2016 Member

I tried defaulting it to null, but ended up having to set it in too many projects; it seems like we should rather disable it only for the client exception? I tried setting it to null here explicitly, but those null values become trickier to handle and understand, so I went for the boolean that we already had because it's more explicit, and I considered it a bug that we still load the class when the check is disabled.

@nik9000
nik9000 Contributor

Leftover? If not wrap in an if (log.isDebugEnabled())?

@nik9000
nik9000 Contributor

Ah. Just moved.

javanna added some commits Jun 21, 2016
@javanna @javanna javanna Merge branch 'master' into feature/http_client 886cb37
@javanna @javanna javanna rename onSuccess to onResponse
That makes it a bit clearer that it's about the response, and that we decide whether it was a good one or a failure based on the status code
f0b6abe
@javanna
Member
javanna commented Jun 21, 2016

@s1monw I pushed a couple more commits that addressed your last comments. Can you double check please?

@s1monw
Contributor
s1monw commented Jun 22, 2016

LGTM

@javanna javanna merged commit f0f4db0 into elastic:master Jun 22, 2016

1 check passed

CLA Commit author is a member of Elasticsearch