-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrates AWS Elasticsearch Storage to armeria/jackson #135
Conversation
self tracing is a little bit messed up, but I don't think that's related to this. I think main thing was sorting out the contents of the jar. While this works, we shouldn't merge until we are on a final version. There are a few dents to knock out upstream such as logging the health check. Plus we shouldn't release zipkin 2.16 until the next armeria release I think, as that could cleanup some code here also. |
by self-tracing messed up I mean that there are no child spans of the "accept-spans" call. There should be a bulk http POST request |
autoconfigure/pom.xml
Outdated
@@ -72,8 +72,15 @@ | |||
<configuration> | |||
<!-- https://github.com/spring-projects/spring-boot/issues/3426 transitive exclude doesn't work --> | |||
<excludeGroupIds> | |||
io.zipkin.zipkin2,io.zipkin.reporter2,org.springframework.boot,org.springframework,com.fasterxml.jackson.core,com.google.auto.value,com.google.guava,com.squareup.moshi,com.squareup.okio,com.squareup.okhttp | |||
com.github.openzipkin.zipkin,io.zipkin.reporter2,org.springframework.boot,org.springframework,com.fasterxml.jackson.core,com.google.auto.value,com.google.guava,org.reactivestreams,com.google.code.findbugs,javax.annotation,org.slf4j,io.netty,io.micrometer,org.hdrhistogram,org.latencyutils |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this first entry needs to change back to io.zipkin.zipkin2
before merge. basically a global replace back needs ot happen
...rc/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/ElasticsearchDomainEndpoint.java
Outdated
Show resolved
Hide resolved
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
|
||
@Override public HttpResponse execute(ClientRequestContext ctx, HttpRequest req) { | ||
return HttpResponse.from( | ||
req.aggregateWithPooledObjects(ctx.eventLoop(), ctx.alloc()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if there are issues with self-tracing, it's because we need to do our contextAwareEventLoop
trick here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't handled this one yet
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
@anuraaga I noticed that we made some utilities for moshi which aren't relevant when using jackson. This helped cleanup the code a good bit I think. Thanks for the other feedback as well. PTAL |
@@ -13,18 +13,16 @@ | |||
*/ | |||
package zipkin.autoconfigure.storage.elasticsearch.aws; | |||
|
|||
import zipkin2.internal.Nullable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got rid of all internal type usage in this PR
.url(baseUrl.newBuilder("2015-01-01/es/domain").addPathSegment(domain).build()) | ||
.tag("get-es-domain") | ||
.build(); | ||
AggregatedHttpRequest.of(HttpMethod.GET, "/2015-01-01/es/domain/" + domain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: name this "get-es-domain" like we did upstream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not worth it - IIUC, this is always going to be client-initiated and not sampled.
try { | ||
AggregatedHttpResponse res = client.execute(describeElasticsearchDomain).aggregate().join(); | ||
status = res.status(); | ||
// As the domain endpoint is read only once per startup. We don't worry about pooling etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anuraaga let me know if this is true or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true
ps you have karma so feel free to update the branch. I have to look at other things a bit |
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
.url(baseUrl.newBuilder("2015-01-01/es/domain").addPathSegment(domain).build()) | ||
.tag("get-es-domain") | ||
.build(); | ||
AggregatedHttpRequest.of(HttpMethod.GET, "/2015-01-01/es/domain/" + domain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not worth it - IIUC, this is always going to be client-initiated and not sampled.
try { | ||
AggregatedHttpResponse res = client.execute(describeElasticsearchDomain).aggregate().join(); | ||
status = res.status(); | ||
// As the domain endpoint is read only once per startup. We don't worry about pooling etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true
...rc/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/ElasticsearchDomainEndpoint.java
Outdated
Show resolved
Hide resolved
Endpoint endpoint = Endpoint.of( | ||
"search-zipkin-2rlyh66ibw43ftlk4342ceeewu.ap-southeast-1.es.amazonaws.com", | ||
server.httpPort()) | ||
.withIpAddr("127.0.0.1"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool
actually I think on first request it would be server initiated.. we can check. |
oops..
|
stil didn't work but I can't even read the screen anymore. good time to stop. thanks for chasing up |
actually it did work, just it got an exception.. I need to move the other fix here about the slacker resolution :P |
finally! ok this works, so we can now do cleanups etc. as long as we super-verify that this continues to work I'm also ok with re-enabling the health check stuff. thx again for the epic support @anuraaga |
During openzipkin/zipkin-aws#135 we found that Armeria will add a second `host` header even if one is already set in the request headers before sending them. This can break behavior and I suppose must break some HTTP spec too. I wasn't sure whether to respect or ignore the user's setting, but with this change, if a user themselves set an authority and host to different values, we'll leave the host as is. While this is a bad user, I guess we should trust they did it on purpose. I had to write the unit test in Jetty since during the conversion from HTTP/1 headers to Armeria headers, the duplicate hosts got normalized.
|
running wrk and trying to do health check I get 503..
|
sorry this is health check and also prometheus. I'll try to figure out what's up.. |
with logging on, you can see the health request go back and forth, but the health check itself is impatient I guess? request/response for domain endpoint
health check relating to the above call
|
hmm.. trying a different domain.. at last the prometheus endpoint can show the endpoints, which I can hit, but of course these are signed urls..
|
if I disable client-side health-check, it is back.. run ES_AWS_DOMAIN=zipkin ES_HTTP_LOGGING=BODY STORAGE_TYPE=elasticsearch java -Dloader.path='elasticsearch-aws.jar,elasticsearch-aws.jar!/lib' -Dspring.profiles.active=elasticsearch-aws -cp zipkin.jar org.springframework.boot.loader.PropertiesLauncher --zipkin.storage.elasticsearch.health-check.enabled=false check $ curl -s localhost:9411/health
{"status":"UP","zipkin":{"status":"UP","details":{"ElasticsearchStorage{initialEndpoints=aws://ap-southeast-1/zipkin, index=zipkin}":{"status":"UP"}}}} |
…aws into elasticsearch-armeria-adrian
...h-aws/src/main/java/zipkin/autoconfigure/storage/elasticsearch/aws/AWSSignatureVersion4.java
Outdated
Show resolved
Hide resolved
works, though on a bad network a ton of timeout messages and an interesting one about http/2 reset. I think we should consider adding
|
retested against released binaries. still works |
another epic thanks @anuraaga |
During openzipkin/zipkin-aws#135 we found that Armeria will add a second `host` header even if one is already set in the request headers before sending them. This can break behavior and I suppose must break some HTTP spec too. I wasn't sure whether to respect or ignore the user's setting, but with this change, if a user themselves set an authority and host to different values, we'll leave the host as is. While this is a bad user, I guess we should trust they did it on purpose. I had to write the unit test in Jetty since during the conversion from HTTP/1 headers to Armeria headers, the duplicate hosts got normalized.
Obviates #134 which I've lost write access to :P