Skip to content

Tracing#136

Merged
protocol7 merged 1 commit intomasterfrom
tracing
Jun 18, 2019
Merged

Tracing#136
protocol7 merged 1 commit intomasterfrom
tracing

Conversation

@protocol7
Copy link
Contributor

This PR is based on top of #135 so please review that first.

@coveralls
Copy link

coveralls commented Jun 15, 2019

Coverage Status

Coverage remained the same at ?% when pulling 4058b7a on tracing into 84a6104 on master.

@protocol7
Copy link
Contributor Author

@dmichel1 tracing PR as mentioned, would be awesome if you could take a look

@protocol7 protocol7 force-pushed the tracing branch 4 times, most recently from e8c5014 to 8f581a9 Compare June 15, 2019 12:10
@protocol7 protocol7 mentioned this pull request Jun 15, 2019
Copy link

@dmichel1 dmichel1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few comments.

can you add how to use this to the readme as well?

exciting!

requireNonNull(future);
requireNonNull(operation);

final io.opencensus.trace.Tracer tracer = Tracing.getTracer();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah cool, will fix.

ocSpan.putAttribute("component", stringAttributeValue("folsom"));
ocSpan.putAttribute("peer.service", stringAttributeValue("memcache"));
ocSpan.putAttribute("operation", stringAttributeValue(operation));
if (key != null) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should adding the key as an annotation be configurable? I'm thinking there might be scenarios where key is "sensitive."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this, which is why I opted not to include the value. But perhaps this is worth considering also with the key.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knowing what key is potentially causing an issue could definitely be valuable during troubleshooting so I can see the benefit of leaving it here/enabled by default...

what about adding another attribute for the size of the payload being sent to memcached? that could also be valuable during troubleshooting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestions! Pushed an update for addressing both of these. There's now a builder to configure if you want keys/values included in traces. Also added value size.


final io.opencensus.trace.Tracer tracer = Tracing.getTracer();

final io.opencensus.trace.Span ocSpan =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the trickiest part of tracing especially with async operations happening...

with this implicit context propagation traces spans might not be linked when they go from a say an apollo thread pool -> memcached thread pool.

if this is indeed the case here, the span can be added as parameter to these methods and passed in with the memcached operation. This would then become tracer.spanBuilderWithExplicitParent()

https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Span.md#how-span-interacts-with-context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this assumes that the context is propagated to the calling thread/gRPC Context. Once in Folsom-land, this should works as expected. But, this would be equally true for tracing in say Bigtable/Spanner, right (I've only looked at the Spanner client and it seems to use the same pattern as here)? If so, I believe we should make sure we propagate the tracing context correctly at least out of the box in Apollo, is this the case today?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be propagating the context in apollo correctly as the traces to any gRPC based service (bigtable/pubsub/spanner) are linked already. have you tried testing this code with an apollo services yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't yet, but got a service hopefully lined up

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this pattern is generally used for many libraries, Bigtable does it similarly. It may be useful to allow passing in the parent span instead of solely relying on getCurrentSpan - that way if the calling code is doing something unusual there's a way to keep the context correctly linked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hexedpackets do you have a reference handy to how this is done in Bigtable, would be curious as to how they opted to design it? I tried to find, but not sure I'm looking in the right place.

For now, I would not want to change the Folsom API in order to support tracing, but if we see more requests from users on setting explicit parent spans I think we should look closer at how to do this in the future.

Copy link

@hexedpackets hexedpackets Jun 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I think the use case of needing to pass in the parent span should be pretty rare.

Bigtable creates a span when the operation is instantiated and then closes it in a callback after the operation ends. I don't think this is passed around, but various functions in the class add tags and annotations to the span when they're run.

SetRequest request = SetRequest.casSet(byteKey, valueBytes, ttl, cas);
CompletionStage<MemcacheStatus> future = rawMemcacheClient.send(request);
metrics.measureSetFuture(future);
tracer.span("folsom.set", future, "set", key);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to do explicit propagation this would have to look something like...

public CompletionStage<MemcacheStatus> set(String key, V value, int ttl, long cas, Span span) {
...
tracer.span("folsom.set", future, "set", key, span);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid polluting the Folsom API if at all possible :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - agreed

@protocol7
Copy link
Contributor Author

Fix pushed which addresses comments so far, including an updated README

Copy link

@dmichel1 dmichel1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple more comments and then let's try this out.

README.md Outdated
</dependency>

If you want to use one of the metric libraries, make sure you use the same version as the main artifact.
<!-- optional if you want to expose folsom metrics with yammer -->

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/folsom metrics with yammer/traces with OpenCensus/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

if (includeValues) {
span.putAttribute("value_hex", stringAttributeValue(HEX.encode(value)));
}
span.putAttribute("value_size", longAttributeValue(value.length));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include the unit here. e.g. value_size_bytes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

public class OpenCensus {

private static final boolean DEFAULT_INCLUDE_KEYS = true;
private static final boolean DEFAULT_INCLUDE_VALUES = false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned about someone enabling the value attribute. It could be a really big payload. I think since folks have the ability to see the key if they really care about the value they can fetch it themselves.

I would be in favor of removing the value attribute entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as long as this is default to off, but I'll add a note about this in the documentation.

This add support for tracing memcache operations, including an
implementation for OpenCensus. This change opts to only trace high level
operations for the sake of simplicity. It will for example not trace
operations to individual nodes (e.g. for delete all operations). As we
learn more about tracing needs, we might change this to include more
fine-grained operations.

The OpenCensus implementation attempts to use the OpenTracing
conventions for attributes:
https://opentracing.io/specification/conventions/
@protocol7 protocol7 merged commit bb78316 into master Jun 18, 2019
@protocol7 protocol7 deleted the tracing branch June 18, 2019 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants