Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Makes Lens more tolerant of missing localEndpoint.serviceName or .name #2834

Merged
merged 2 commits into from
Oct 8, 2019

Conversation

codefromthecrypt
Copy link
Member

See #2829

@devinsba
Copy link
Member

devinsba commented Oct 7, 2019

Should we add the offending trace to the folder of sample data?

@codefromthecrypt
Copy link
Member Author

horrible-trace.json? :D

@codefromthecrypt
Copy link
Member Author

Now unknown instead of undefined in trace list

Screenshot 2019-10-07 at 11 35 06 AM

Now, no more white screen rendering this really bad trace:

Screenshot 2019-10-07 at 11 34 43 AM

@codefromthecrypt
Copy link
Member Author

PS it is tempting to make the span-cleaner fill empty names and such, but it actually would spread debt into other things like clock skew and dependency linker which looks for presence of service names. Instead, I made it a goal to make the white screen go away.

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Oct 7, 2019

PPS here, Kong should have used an annotation, not an empty span except for timestamp and name. While definitely bad data, I wasn't able to figure out how to get the warning about "no rows found" to go away.

Screenshot 2019-10-07 at 11 39 38 AM

Copy link
Member Author

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bungle @jeremyjpj0916 the comments on this part of the review are some hints about data problems for Kong/kong-plugin-zipkin#49 Hope they help.

"parentId": "d65495313f944801",
"id": "d5cb70daead478dc",
"name": "kong.header_filter",
"timestamp": 1570005203864000
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have been an annotation

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on this a little bit? The Zipkin docs make it very clear that the timestamp field exists, and also say:

The bottom-line is that choosing not to record Span.timestamp and duration will result in less accurate data and less functionality. Since it is very easy to record these authoritatively before reporting, all Zipkin instrumentation should do it or ask someone to help them do it.

Wouldn't removing the timestamp field making Zipkin less accurate?

Also: can I name an annotation timestamp?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what's missing here is that a span can include annotations.. annotations have a required timestamp. So instead of making a sub-span that only includes a name and a timestamp, make an annotation with that name and timestamp.

ex

{
    "traceId": "6cf5624e3452d1bc202a57dad35921c0",
    "parentId": "ed395cb866d02665",
    "id": "d65495313f944801",
    "kind": "CLIENT",
    "name": "get /users/{userId}",
    "timestamp": 1570005203829000,
    "duration": 35000,
--snip--
    "annotations": [
       {"name": "kong.header_filter", "timestamp": 1570005203864000}
     ]

"parentId": "d65495313f944801",
"id": "80b8efd52d66af27",
"name": "kong.body_filter",
"timestamp": 1570005203864000
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have been an annotation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are spans in Kong; I recall that when they execute in under a millisecond then they appear as a 0-length span in zipkin.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minimum duration is 1us, so prefer to use that vs rounding down to zero, especially if precision is millis (1000us) on the kong side. Zero implies incomplete.

The main thing is to differentiate annotations (timestamped events on a timeline) from spans (some wrapped thing with a duration).

"parentId": "ed395cb866d02665",
"id": "2a49858b1f1d506b",
"name": "kong.rewrite",
"timestamp": 1570005203829000
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have been an annotation

"traceId": "6cf5624e3452d1bc202a57dad35921c0",
"parentId": "d65495313f944801",
"id": "3e53779bdde7c111",
"name": "kong.access",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is missing localEndpoint.serviceName or this should be an annotation, and the duration moved to where it is missing..

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By looking at the examples in Zipkin's source code, I think this means that the span should look like so:

  {
    "traceId": "6cf5624e3452d1bc202a57dad35921c0",
    "parentId": "d65495313f944801",
    "id": "3e53779bdde7c111",
    "name": "kong.access",
    "localEndpoint": {
      "serviceName": "<name of the Kong Service matched for this Request>",
    },
    "timestamp": 1570005203829000,
    "duration": 16000
  },

I think I can try to add that. But I have questions:

  • Is localEndpoint.serviceName also needed in the other spans, or just on this one?
  • What does "the duration moved to where it is missing" mean? That the other spans should also have a duration? That the duration field on this particular span should be put in a different place on this span (where)? Or is it both?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the client span should be complete, that's the most important one, but all spans should be complete. The subspans, some of them have duration. It was unclear to me why these are modeled as spans vs events.

all spans should have the exact same localEndpoint value if coming from the same node. Most of the time folks just take the link local IP, the advertised port and service name and treat that as a constant.

"tags": {
"kong.route": "aaa20d62-813a-428f-aa40-447324cd30af",
"kong.service": "872eae11-4571-490b-86d2-59834b93f870",
"peer.hostname": ""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't add empty hostname

"traceId": "6cf5624e3452d1bc202a57dad35921c0",
"id": "ed395cb866d02665",
"kind": "SERVER",
"name": "kong.request",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad span name. prefer route-based ex get /users/{userId} or even get as kong is already tagged

"parentId": "ed395cb866d02665",
"id": "d65495313f944801",
"kind": "CLIENT",
"name": "kong.proxy",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad span name. prefer route-based ex get /users/{userId} or even get as kong isn't helpful

what you are describing here, seems to be better as a tag

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this apply to the other span names as well? (kong.access, kong.header_filter, kong.body_filter, kong.rewrite, kong.access) or is it just kong.proxy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the client span should be semantically correct for http. It is better to have a coherent client span than any of the other spans mentioned. As you noticed, they didn't render and actually made the diagram more confusing vs if they didn't exist at all. My advice is probably use one client span and render the rest with annotations if it is commonly the case you cannot record a duration of precision less than 1ms.

"component": "kong",
"http.method": "GET",
"http.status_code": "200",
"http.url": "",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't add empty url

"port": 3771
},
"tags": {
"component": "kong",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conventionally zipkin uses "lc" for "local component"

"parentId": "d65495313f944801",
"id": "862d0f978033cc53",
"name": "kong.balancer",
"timestamp": 1570005203845000,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing duration. probably better as an annotation named kong.balancer.try1 as that should be bounded, unless there's a chance of infinite tries. That or make this the client span if this is the actual wire request.. hard to tell what's going on (might need a wiki)

@codefromthecrypt
Copy link
Member Author

verified one of the envoy traces renders properly eventhough it is missing the local service name:

Screenshot 2019-10-08 at 8 01 50 AM

@codefromthecrypt codefromthecrypt merged commit 7838a33 into master Oct 8, 2019
@codefromthecrypt codefromthecrypt deleted the tolerate-bad-data branch October 8, 2019 13:42
Copy link

@kikito kikito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @adriancole, I am trying to fix the issues reported on the Kong plugin. Thanks for reporting the problems to us. I left some questions on this PR, do you think you could answer those?

"parentId": "d65495313f944801",
"id": "d5cb70daead478dc",
"name": "kong.header_filter",
"timestamp": 1570005203864000
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on this a little bit? The Zipkin docs make it very clear that the timestamp field exists, and also say:

The bottom-line is that choosing not to record Span.timestamp and duration will result in less accurate data and less functionality. Since it is very easy to record these authoritatively before reporting, all Zipkin instrumentation should do it or ask someone to help them do it.

Wouldn't removing the timestamp field making Zipkin less accurate?

Also: can I name an annotation timestamp?

"traceId": "6cf5624e3452d1bc202a57dad35921c0",
"parentId": "d65495313f944801",
"id": "3e53779bdde7c111",
"name": "kong.access",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By looking at the examples in Zipkin's source code, I think this means that the span should look like so:

  {
    "traceId": "6cf5624e3452d1bc202a57dad35921c0",
    "parentId": "d65495313f944801",
    "id": "3e53779bdde7c111",
    "name": "kong.access",
    "localEndpoint": {
      "serviceName": "<name of the Kong Service matched for this Request>",
    },
    "timestamp": 1570005203829000,
    "duration": 16000
  },

I think I can try to add that. But I have questions:

  • Is localEndpoint.serviceName also needed in the other spans, or just on this one?
  • What does "the duration moved to where it is missing" mean? That the other spans should also have a duration? That the duration field on this particular span should be put in a different place on this span (where)? Or is it both?

"parentId": "ed395cb866d02665",
"id": "d65495313f944801",
"kind": "CLIENT",
"name": "kong.proxy",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this apply to the other span names as well? (kong.access, kong.header_filter, kong.body_filter, kong.rewrite, kong.access) or is it just kong.proxy?

kikito added a commit to Kong/kong-plugin-zipkin that referenced this pull request Oct 16, 2019
kikito added a commit to Kong/kong-plugin-zipkin that referenced this pull request Oct 25, 2019
kikito added a commit to Kong/kong-plugin-zipkin that referenced this pull request Oct 25, 2019
kikito added a commit to Kong/kong-plugin-zipkin that referenced this pull request Oct 30, 2019
kikito added a commit to Kong/kong-plugin-zipkin that referenced this pull request Oct 31, 2019
kikito added a commit to Kong/kong-plugin-zipkin that referenced this pull request Nov 5, 2019
gszr pushed a commit to Kong/kong that referenced this pull request Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants