Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate an http.route tag #1874

Closed
codefromthecrypt opened this issue Jan 7, 2018 · 29 comments · Fixed by openzipkin/brave#602
Closed

Investigate an http.route tag #1874

codefromthecrypt opened this issue Jan 7, 2018 · 29 comments · Fixed by openzipkin/brave#602

Comments

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jan 7, 2018

I'm going to recommend we make a standard tag of http.route, a way to help people make good low-cardinality names or tags from http requests. Main idea is that instead of parsing http.url which can be trouble, allow frameworks to provide http.route which might not always be the same format, but would always be variable and parameter free. It also has a higher likelihood of matching 255 char indexing constraints, which are common on storage backends.

Here are some examples of http route values for the same endpoint:

  • vert.x web route "/items/:itemId"
  • Play "/items/$id<[0-9]+>"
  • Spring WebMVC request mapping "/items/{itemId}"

Until we have something like this, it is almost nonsense to answer some folks who want to push spans into metrics (see #1869). For example, in prometheus, counters etc will reset on each small deviation of tags, a problem far worse than ours wrt cardinality.

Example framework relationships:

In interop tests, we can't necessarily test for equality of this value as it will be framework specific. However, we can test for cardinality! Ex we make an endpoint with a path variable and execute it against two values of that. Then, we test http.path tag is different while http.template is the same. Ideally the http template is already "washed", but those derived from paths might need normalization. For example, you may need to ensure there's only one leading slash.

Besides a lookup tag, http.template can be used to create a better name for the span. For example, if a template is available, it could be a better choice than the http method. Implementation-wise, this implies a span rename feature, as templates are not always known before-the-fact. That said a primary use case is navigating to other systems like metrics, whether that's directly via the http.template tag or something derived from it.

For example, requests that don't match any template need to be eventually handled. For example, if the http.response_code tag is present and http.template is absent, you might have a heuristic to create a bucketed tag like 'REDIRECT' to share with metrics. Doing so allows bot-like behavior from poisoning metrics. This sort of logic exists in micrometer.

Thanks for input on this, notably @jkschneider and @mikewrighton. CC'ing the "firehose" crowd @bplotnick @kaisen @drolando and @narayaruna, and others interested in collaborating like @ivantopo

@codefromthecrypt
Copy link
Member Author

(PS as a clarification, this is not a propagated tag, rather one sent out-of-band and possibly used as an aggregating dimension for metrics)

@SergeyKanzhelev
Copy link

Do you expect span's name be the same as http.template in the most cases? Or they semantically different?

@bplotnick
Copy link

I'm not opposed to this, but 1) i'm not sure about the name and 2) i don't know if it's necessary

I'm not sure about the name, because I don't know if a template (assuming we're talking URI templates here) is sufficient. Here's an example:

We have this exact problem with our swagger RPC endpoints. Using the petstore example, GET /pet/{petId} will show up as /pet/1234 in the http.uri. So 1) it is difficult to know that that corresponds to the /pet/{petId} path template, and 2) it is impossible to know that it corresponds to the GET version of that vs the DELETE or POST versions.

We solve the second problem by making the span name equal to the method and the path in pyramid_zipkin. However, this doesn't solve the path parameter problem (the uri template).

The Correct Way to solve this would be to use any identifier guaranteed to uniquely identify the endpoint, which in the swagger case is operationId. I think that this should just be set as the span name as I think @SergeyKanzhelev is alluding to. This is why I'm saying that it is probably not necessary.

@jkschneider
Copy link

jkschneider commented Jan 8, 2018

The Correct Way to solve this would be to use any identifier guaranteed to uniquely identify the endpoint

Effectively I think that is what http.template is in this proposal. Making the parameterized URI the de facto identifier across many frameworks makes your traces/metrics look somewhat uniform across the org (i.e. those services using Swagger and those that are not).

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Jan 8, 2018 via email

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Jan 8, 2018 via email

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Jan 8, 2018 via email

@jkschneider
Copy link

Just want to make a note that Spring's HandlerMapping.BEST_MATCHING_PATTERN_ATTRIBUTE does not normalize away extraneous slashes.

@ivantopo
Copy link

ivantopo commented Jan 9, 2018

Hello there! Happy to see this moving 😄

Is the scope of this fixed to HTTP-related Spans? I can totally see a similar tag/mechanism working to gather metrics on JDBC call spans and anything else that looks like it (Cassandra driver, etc). This happens to be a pretty common topic when people are starting to instrument their apps.. they usually end up with either too little granularity or a crazy explosion of metrics unless some better naming is manually applied. Anything we can do to provide better default names out of the box is a instant win to all users!

PS: Play Framework does a really nice job at exposing the route templates through tags, that might be worth adding in the description.

@codefromthecrypt
Copy link
Member Author

updated with play info. thx @ivantopo cc @jcchavezs

@codefromthecrypt
Copy link
Member Author

just started on this openzipkin/brave#602

@codefromthecrypt
Copy link
Member Author

https://github.com/adriancole/play-zipkin-tracing/compare/integration-tests...adriancole:http-template is what I believe would be the way to attack this in play. However, the following always shows up empty..

  def apply(nextFilter: (RequestHeader) => Future[Result])(req: RequestHeader): Future[Result] = {
    var template = req.attrs.get(Router.Attrs.HandlerDef).map(_.path)

@codefromthecrypt
Copy link
Member Author

Had a chat about this in census, and @rakyll suggests the name http.route, which makes sense to me as almost always I'm using the word route to describe what this is. Anyone feel it shouldn't be this?

"http.route"              | Matched request URL route   | "/users/:userID"     

@jcchavezs
Copy link
Contributor

+1 to http.route

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Feb 15, 2018 via email

@jkschneider
Copy link

need to make constants like redirect or not_found

To elaborate a bit: not shunting 404s or 302s to a fixed tag name essentially becomes a DOS-style attack vector on your monitoring system. I guess this is true for any "user-provided" tag value. Probably more importantly, it's just easier to dimensionally drill down on auth failures and such.

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Feb 17, 2018 via email

@codefromthecrypt
Copy link
Member Author

maybe it is best to go ahead and map "redirection" and "not_found" (lowercase as span names are lowercase) by default as the http.route value. That way, we know the difference between unsupported and no route found (because if a template is possible, you always get a value, even if it is "redirection" or "not_found")

@codefromthecrypt codefromthecrypt changed the title Investigate an http.template tag Investigate an http.route tag Feb 17, 2018
@codefromthecrypt
Copy link
Member Author

Thinking about it, I don't like doing the redirection or not_found mapping as a part of the template tag. I would prefer to have the adapter know if it can parse a route (which is only possible when there is an api for that. For example, there's no way to do this in netty layer). Someone using this can then decide if they want to for example, overwrite the span name with one from a template, and skip overhead trying if there's no chance a route will be known.

@jkschneider
Copy link

Thinking about it, I don't like doing the redirection or not_found mapping as a part of the template tag

As long as we can reasonably achieve consistency, even if that's just a handshake.

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Feb 20, 2018

Chatting about the "http.route" tag signal. Some people do span adjustments after the fact, like with sparkstreaming.

Relying on code conventions to tell if http.route was processable or not can be a problem, if doing span renaming downstream. For example, we'd like to consistently apply a naming pattern, even if it is not found or redirected.

Ex zipkin, in a framework that doesn't support routes, it could be more sensible to default to the span name to http method vs have http method or not_found, for consistency's sake. Knowing a route is possible allows us to be consistent: use a template based span namer or another one, depending on what's supported.

The easiest path out is to use "" as a signal that there is route support, but a route didn't match (for any reason). Use "http.status_code" to clarify. This is similar to how we used the marker tag key "lc" permitting empty values.

Indexing impact is limited. For example, optimized backends write one index entry per trace each time "http.route" -> "" is stored. Of course, it isn't required to be stored ever: it isn't a MUST use empty string on no match, but SHOULD. If frameworks that know about routing write empty string, downstream decorators can use that as input to naming or bucketing policy.

Make sense?

@codefromthecrypt
Copy link
Member Author

offline feedback is that the whole point of "http.route" is both fixed cardinality and grouping. For grouping to work, we want routes that are effectively the same, to in fact be the same. Here are a couple things on that.

  • Require a leading slash - prefixing with slash allows you firstly to identify a root path as different than empty string. More importantly it allows you to group "users/:userId" with "/users/:userId" which some frameworks can be inconsistent about (due to user defined template expressions)
  • Try to normalize paths - For example, if a framework somehow can allow an accident of "//users/:userId" to strip out the extra slash. This is to help with grouping, and tests "should" be able to smoke this out.

Seems like only one leading slash can be a MUST as it is cheap to enforce. Path normalization is a SHOULD, as you could go forever on this one. The intent MUST :P be clear, which is that for this tag to be effective, things like this should happen.

@codefromthecrypt
Copy link
Member Author

revised docs etc here. tomorrow will merge after fixing up tests

openzipkin/brave@103e5ad

in a nutshell, this sets the default span name policy to use the http route where possible (coercing empty to not_found or redirected). Remember in zipkin span names are lowercase that's why not_found. This is taking lessons learned from micrometer. To make this work in practice, the http.method is now tagged by default (otherwise you wouldn't know the http method associated with /users/:userId

will also integration test it tomorrrow with sleuth and ratpack

codefromthecrypt pushed a commit to openzipkin/zipkin-api that referenced this issue Feb 22, 2018
See openzipkin/zipkin#1874
See also brave's recent pull request for more docs we might want to import:
openzipkin/brave#602
@codefromthecrypt
Copy link
Member Author

openzipkin/zipkin-api#42 for the thrift constant

codefromthecrypt pushed a commit that referenced this issue Feb 22, 2018
Especially when using http route based span names, we have to be careful
to not pick a bad name. In shared spans, for example, the UI prefers
server-side's name of the service. However, we have no name priority
logic, so it is arbitrary. This ensures the server's span name wins in a
race.

See #1874
codefromthecrypt pushed a commit that referenced this issue Feb 22, 2018
Especially when using http route based span names, we have to be careful
to not pick a bad name. In shared spans, for example, the UI prefers
server-side's name of the service. However, we have no name priority
logic, so it is arbitrary. This ensures the server's span name wins in a
race.

See #1874
codefromthecrypt pushed a commit to openzipkin/zipkin-api that referenced this issue Feb 22, 2018
See openzipkin/zipkin#1874
See census's definition census-instrumentation/opencensus-specs#49
See also brave's recent pull request for more docs we might want to import:
openzipkin/brave#602
codefromthecrypt pushed a commit to openzipkin/zipkin-api that referenced this issue Feb 22, 2018
See openzipkin/zipkin#1874
See census's definition census-instrumentation/opencensus-specs#49
See also brave's recent pull request for more docs we might want to import:
openzipkin/brave#602
@codefromthecrypt
Copy link
Member Author

@bplotnick digging around on something I dropped that you mentioned.. which is what addresses the request path templating. It seems like pyramid should be able to avail the route pattern somehow (even if it doesn't now). I'm looking at https://docs.pylonsproject.org/projects/pyramid/en/latest/narr/urldispatch.html#route-pattern-syntax. In spark web I've had to request the ability to see this, as they hadn't exposed it prior (eventhough it is reachable).

On your other points, I do think that http.route isn't necessary, as in necessary for all, but it is a nice choice that is still http abstraction.. and certainly helps some or else they wouldn't have done similar ad-hoc. As not necessary, I mean a symbolic route name or function name can also be a nice tag and/or span name. Just such are more pinned to the application abstraction vs http, which could be a pro or con depending on POV. For example, regardless of syntax.. /users/{userId} /users/:userId /users/* etc people who don't know the app probably can figure out what the route means in context of the http request. Similarly constants like not_found redirected are fairly intuitive and explain the http layer reasonably well. Importantly, presence of http.route does not mean don't add another identifier like a generated one or otherwise.. for example, in java code there are often other tags for the method the request dispatched to.

Does this make sense? Not trying to convince you, just trying to put forth rationale and why some might use http.route (either as a span name or aggregation tag), even if not everyone will choose to.

@codefromthecrypt
Copy link
Member Author

ps to address the differentiation problem that both @bplotnick and @jcchavezs (offline) mentioned, in brave I'm changing the default span name function to include essentially "${http.method} ${http.route}"

so, for example, the following are all valid. Note that the http.route requires a slash prefix, so naturally differentiates from the constants.

get /users/:userId
post redirected
post /users
post not_found

@codefromthecrypt
Copy link
Member Author

to clarify above.. a span naming convention is separate from the http.route value. For example, if you want to drill down by path or template, you should add the tags separately (especially knowing that span names are implicitly downcased)

codefromthecrypt pushed a commit to openzipkin/brave that referenced this issue Feb 23, 2018
Span names like "/users/:userId" now include the method like
"delete /users/:userId"

This changes the route-based naming convention to include the http
method and fixes some glitches not detected before. To support this,
we need to make the http method visible in response parsing phase.

See openzipkin/zipkin#1874 (comment)
@bplotnick
Copy link

@adriancole Apologies for not responding to your comment and thank you for continuing with this regardless. What you proposed and implemented is perfect. You're absolutely right that pyramid provides this and I'll add this into pyramid_zipkin asap. I can't wait to delete some hacks 😄

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Feb 26, 2018 via email

abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019
…nzipkin#1924)

Especially when using http route based span names, we have to be careful
to not pick a bad name. In shared spans, for example, the UI prefers
server-side's name of the service. However, we have no name priority
logic, so it is arbitrary. This ensures the server's span name wins in a
race.

See openzipkin#1874
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants