Service Topology view for a single trace #2622

Open
connectwithnara opened this issue Jun 3, 2019 · 5 comments

@connectwithnara

The waterfall diagram in the Zipkin UI is useful for understanding the latency distribution of a trace. However, the view does not make it easy to see which services are involved in that trace. It would be useful to have another tab, 'Topology', that renders the topology of the services involved in the trace.

At Netflix, we trace 100% of the requests that are enabled for failure injection. Users would find it useful to understand which services were involved in a failure-injected request.

@codefromthecrypt
Member

Thanks for raising this, Nara. I'm positive this has been mentioned for at least a year (including by me!), but never formally in an issue.

So the idea is to take the trace we already have (in fact, it is already assembled into a tree internally), then run a linker to aggregate the service dependencies. At that point, the only change is the data sent to the dependencies page, possibly labeled with the trace ID, similar to our "show trace" functionality.
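To make the idea concrete, here is a minimal sketch of linking a single trace into service-to-service edges. It is not the actual Zipkin dependency linker: it assumes Zipkin v2 style spans (`id`, `parentId`, `localEndpoint.serviceName`), assumes span IDs are unique within the trace (so it ignores shared client/server span IDs and span kinds), and the function name `linkTrace` is hypothetical.

```javascript
// Simplified sketch, NOT Zipkin's real DependencyLinker.
// Assumes each span has a unique `id` inside the trace.
function linkTrace(spans) {
  // Index spans by ID so each child can look up its parent.
  const byId = new Map(spans.map((span) => [span.id, span]));
  const links = new Map();

  for (const span of spans) {
    const parentSpan = byId.get(span.parentId);
    if (!parentSpan) continue; // root span, or parent missing from the trace

    const parent = (parentSpan.localEndpoint || {}).serviceName;
    const child = (span.localEndpoint || {}).serviceName;
    // Skip unnamed services and calls that stay inside one service.
    if (!parent || !child || parent === child) continue;

    const key = JSON.stringify([parent, child]);
    const link = links.get(key) || { parent, child, callCount: 0 };
    link.callCount += 1;
    links.set(key, link);
  }
  return [...links.values()];
}

// Hypothetical trace: frontend calls backend twice, backend calls db once.
const sampleTrace = [
  { id: 'a', localEndpoint: { serviceName: 'frontend' } },
  { id: 'b', parentId: 'a', localEndpoint: { serviceName: 'backend' } },
  { id: 'c', parentId: 'a', localEndpoint: { serviceName: 'backend' } },
  { id: 'd', parentId: 'b', localEndpoint: { serviceName: 'db' } },
];
console.log(linkTrace(sampleTrace));
```

The result has the same (parent, child, count) shape the dependencies page already consumes, which is why the UI change could be limited to swapping the data source.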

I'd be happy to port the dependency linker logic to JavaScript. @tacigar @zeagord could one of you take a stab at UI/UX on this?

cc also @bulicekj as you may have similar requests in Haystack.

@codefromthecrypt
Member

Naver Pinpoint also has this feature.

@codefromthecrypt
Member

Some notes from a discussion with @tacigar:

I have a service graph now, but I cannot tell whether there is a problem with the behavior of one node. If I want to understand the behavior of a single machine in a service, I cannot do this today. This is because the edges are aggregates (parent, child, count) based on service name, not on any other information in the endpoint, such as IP address.

This can make certain problems difficult to understand, such as if one node in a service is running the wrong code, or if a new version of code only deployed to one node is causing a problem.

So, if I have a way to classify by another means, I can identify this type of behavior. Take IP address as an example: the dependency link becomes ((parent, ip), (child, ip), count), with one more qualifier than before.

There are now many more nodes, because what was just service -> service before is now (service, ip) -> (service, ip). Inside one trace, this could be fine, because there may not be that many combinations.

IP is just an example; a user may want to explore by a site-specific tag such as cluster or department, or something else. The difference between this and the normal dependency graph is that we generate the links in JavaScript. This means we can aggregate by anything: not just service, but also a custom tag (like http.route).
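A rough sketch of what "aggregate by anything" could look like: the linker takes a key function, so the same loop yields service -> service, (service, ip) -> (service, ip), or tag -> tag links. This is an illustration only, not the ported linker. The names `linkTraceBy`, `byService`, `byServiceAndIp`, and `byTag` are hypothetical, and it assumes Zipkin v2 style spans with `localEndpoint.serviceName`, `localEndpoint.ipv4`, and `tags`, with unique span IDs in the trace.

```javascript
// Sketch only: groups parent/child edges by whatever keyOf(span) returns.
function linkTraceBy(spans, keyOf) {
  const byId = new Map(spans.map((span) => [span.id, span]));
  const links = new Map();

  for (const span of spans) {
    const parentSpan = byId.get(span.parentId);
    if (!parentSpan) continue; // root span, or parent not in this trace

    const parent = keyOf(parentSpan);
    const child = keyOf(span);
    // Skip spans without a key and edges that stay inside one group.
    if (parent === undefined || child === undefined || parent === child) continue;

    const mapKey = JSON.stringify([parent, child]);
    const link = links.get(mapKey) || { parent, child, callCount: 0 };
    link.callCount += 1;
    links.set(mapKey, link);
  }
  return [...links.values()];
}

// Example key functions (all hypothetical helpers).
const byService = (span) => (span.localEndpoint || {}).serviceName;
const byServiceAndIp = (span) => {
  const { serviceName, ipv4 } = span.localEndpoint || {};
  return serviceName ? `${serviceName}@${ipv4 || 'unknown'}` : undefined;
};
const byTag = (tagName) => (span) => (span.tags || {})[tagName];

// Hypothetical trace: one frontend node calls two different backend nodes.
const spans = [
  { id: 'a', localEndpoint: { serviceName: 'frontend', ipv4: '10.0.0.1' } },
  { id: 'b', parentId: 'a', localEndpoint: { serviceName: 'backend', ipv4: '10.0.0.2' } },
  { id: 'c', parentId: 'a', localEndpoint: { serviceName: 'backend', ipv4: '10.0.0.3' } },
];
console.log(linkTraceBy(spans, byService)); // one edge, both calls merged
console.log(linkTraceBy(spans, byServiceAndIp)); // one edge per backend node
```

Grouping by service merges both backend calls into a single edge, while grouping by (service, ip) keeps the two backend nodes apart, which is the case where one misbehaving node becomes visible.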

@RestfulBlue

I also hope this functionality will appear.
For example, Jaeger also has this feature:
https://miro.medium.com/max/2625/1*W6OGeCA1unSqQfPIZq7VGg.png

codefromthecrypt pushed a commit that referenced this issue Aug 3, 2019
This should allow integrations including:

* single trace aggregates
* aggregates across the results of the current query

Right now, this is a direct and complete port of the Java code. However,
we could do more later once this is integrated. Here are some options:

* include references to each span in the dependency link
  * Ex. trace/span id key so that you can jump directly to the row.
* full path tracing (future)
  * Ex. instead of parent/child, complete path from root to each leaf.

See #2701
See #2622
See #2230
@codefromthecrypt
Member

#2731 is a first step, as it ports the basic dependency linker used in Spark jobs to JavaScript.

codefromthecrypt pushed a commit that referenced this issue Aug 8, 2019
abesto pushed a commit to abesto/zipkin that referenced this issue Sep 10, 2019