Adds Elasticsearch 5.x support via storage type: elasticsearch-http #1403

Merged
merged 1 commit into master from es-5 on Nov 15, 2016

@adriancole
Contributor
adriancole commented Nov 15, 2016 edited

This allows use of Elasticsearch 5.x and its notable ingest pipeline
feature when using zipkin-storage-elasticsearch-http.

It is important to note that zipkin-storage-elasticsearch remains
pinned to ES 2.x libraries, because the 2.x and 5.x libraries are not
compile-compatible. In other words, you must use http if you want to
use ES 5 (for now).

Version detection is implemented in order to choose the correct index
template format for the major version number. The detection is plain
string manipulation, as that was less work than making a new type.
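
To illustrate the string-manipulation approach, here is a minimal sketch, not the code in this PR. It assumes the version arrives as a string such as "5.0.0" (the `version.number` field Elasticsearch returns from `GET /`). The class and method names are hypothetical, and the "keyword" versus "string" choice only illustrates one mapping difference between 2.x and 5.x.

```java
// Hypothetical helper: names are illustrative, not taken from Zipkin.
final class EsVersionSketch {

  /** Returns the major version from a string such as "5.0.0" or "2.4.1". */
  static int majorVersion(String versionNumber) {
    int dot = versionNumber.indexOf('.');
    // Everything before the first dot is the major version.
    String major = dot == -1 ? versionNumber : versionNumber.substring(0, dot);
    return Integer.parseInt(major);
  }

  /**
   * Assumption for illustration: ES 5.x templates use "keyword" for
   * non-analyzed strings, while 2.x templates use "string".
   */
  static String stringFieldType(String versionNumber) {
    return majorVersion(versionNumber) >= 5 ? "keyword" : "string";
  }
}
```

For example, `EsVersionSketch.stringFieldType("2.4.1")` yields "string", while "5.0.0" yields "keyword".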

This adds a new parameter ES_PIPELINE which allows you to manipulate
the json sent by Zipkin collector before it is indexed. This could be
used for many things including cleaning service names or adding ingest
timestamps.
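
As a rough sketch of how a pipeline name could reach Elasticsearch: the 5.x bulk and index APIs accept a `pipeline` query parameter, so a client can append it to the request URL. The helper below is hypothetical and assumes spans are written through `_bulk`; it is not the code in this PR.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: names are illustrative, not taken from Zipkin.
final class PipelineUrlSketch {

  /** Builds the bulk URL, adding ?pipeline=... only when a pipeline is set. */
  static String bulkUrl(String baseUrl, String pipeline) {
    String url = baseUrl + "/_bulk";
    if (pipeline == null || pipeline.isEmpty()) return url;
    try {
      return url + "?pipeline=" + URLEncoder.encode(pipeline, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always supported by the JVM, so this should not happen.
      throw new IllegalStateException(e);
    }
  }
}
```

With `ES_PIPELINE=zipkin` and a base URL of `http://localhost:9200`, this sketch would produce `http://localhost:9200/_bulk?pipeline=zipkin`.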

Integration tests run version 2.x on CircleCI and 5.x on Travis.

Fixes #1312
Fixes #1318

@adriancole
Contributor

cc @openzipkin/elasticsearch and @dragontree101 @dan-tr @shakuzen and @mansu particularly (as they had a lot of interest here)

@adriancole
Contributor

CircleCI died trying to start Elasticsearch 5, so I swapped them: Travis (which has more resources) now runs it.

@adriancole
Contributor
adriancole commented Nov 15, 2016 edited

here's the playbook I used to test this.

Install elasticsearch 5 by downloading and running it

$ curl -SL https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.0.0.tar.gz | tar xz
$ elasticsearch-*/bin/elasticsearch

Add a custom pipeline to it

$ curl -X PUT -s localhost:9200/_ingest/pipeline/zipkin -d '{
  "description" : "add collector_timestamp_millis",
  "processors" : [
    {
      "set" : {
        "field": "collector_timestamp_millis",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}'

Start zipkin, pointed at that pipeline

$ STORAGE_TYPE=elasticsearch ES_HOSTS=http://localhost:9200 ES_PIPELINE=zipkin java -jar zipkin-server/target/zipkin-*exec.jar

Add some traces (e.g. using one of the examples), then check the resulting index

$ curl -s localhost:9200/zipkin-2016-11-15/_search|jq '.hits.hits[]._source.collector_timestamp_millis'
@@ -53,9 +53,14 @@ public static HttpClientBuilder create(OkHttpClient client) {
return this;
}
- /** Default true. true implies that spans will be gzipped before transport. */
@adriancole
adriancole Nov 15, 2016 Contributor

this was dead code!

try (ResponseBody responseBody = response.body()) {
if (response.isSuccessful()) {
set(convert(responseBody));
} else {
setException(new IllegalStateException("response failed: " + response));
}
+ } catch (Throwable t) {
@adriancole
adriancole Nov 15, 2016 edited Contributor

Running against mixed versions widens the range of exceptions we can get, so I tightened this and backfilled test cases.
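
The idea can be sketched as follows, using `CompletableFuture` in place of the future type in the diff. All names here are hypothetical. The point is that an exception thrown while converting the body (for example, an unexpected response shape from a different ES version) fails the future instead of leaving it incomplete.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical stand-in for the callback in the diff; not Zipkin's code.
final class CallbackSketch {

  /** Completes the future with the converted body, or fails it on any error. */
  static <T> CompletableFuture<T> onResponse(
      boolean successful, String body, Function<String, T> convert) {
    CompletableFuture<T> future = new CompletableFuture<>();
    try {
      if (successful) {
        future.complete(convert.apply(body));
      } else {
        future.completeExceptionally(new IllegalStateException("response failed"));
      }
    } catch (Throwable t) {
      // Conversion errors also fail the future, so callers never hang.
      future.completeExceptionally(t);
    }
    return future;
  }
}
```

Note that this sketch catches every `Throwable`; real code may prefer to rethrow fatal errors such as `OutOfMemoryError`.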

.travis.yml
@@ -41,7 +41,7 @@ before_install:
- echo "https://$GH_TOKEN:@github.com" > .git/credentials
# Manually install elasticsearch until https://github.com/travis-ci/apt-source-whitelist/issues/190
@shakuzen
shakuzen Nov 15, 2016 edited Contributor

I was curious and looked at that issue. It looks like ES 2.x is whitelisted now.

@adriancole
adriancole Nov 15, 2016 Contributor

I'll change that comment, since we've a different reason for using curl (to get v5)

@adriancole adriancole Adds Elasticsearch 5.x support via storage type: elasticsearch-http
11a6c48
@adriancole adriancole merged commit 1c39f51 into master Nov 15, 2016

@adriancole adriancole deleted the es-5 branch Nov 15, 2016