-
Notifications
You must be signed in to change notification settings - Fork 16
fix for performance observed during enrichment process #100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
f0d87b2
c6a8118
4a07319
62d9fb0
499c1aa
d2fd150
6ae6b39
f2831e1
bd93b39
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| package org.hypertrace.traceenricher.trace.util; | ||
|
|
||
| import org.hypertrace.core.datamodel.StructuredTrace; | ||
| import org.hypertrace.core.datamodel.shared.StructuredTraceGraph; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| public class StructuredTraceGraphBuilder { | ||
| private static final Logger LOG = LoggerFactory.getLogger(StructuredTraceGraphBuilder.class); | ||
|
|
||
| private static ThreadLocal<StructuredTraceGraph> cachedGraph = new ThreadLocal<>(); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we want it specific to a Thread? Can we use Supplier instead?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently, StructureTraceGraph's state is changing while processing a trace, and we want to re-construct whenever the state affecting its changes. So, we choose thread-local to share the state in a given kstream thread.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @kotharironak We use Suppliers in a similar way to cache the results https://mkyong.com/java8/java-8-supplier-examples/ Though I am not sure, if you can change the entry in supplier based on a certain condition, if the graph structure can potentially change
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand that it is a quick optimisation. My question is eventually should we instead of caching the StructuredTraceGraph at thread level, pass around the instance of this graph between enrichers? Anyways the graph is going to be changed for each Trace. We can just create this graph once for every trace at the entry point and other enrichers can use this graph. Any enrichment task can then update the StructuredTraceGraph if there is a need.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently, it is another way around, the |
||
| private static ThreadLocal<StructuredTrace> cachedTrace = new ThreadLocal<>(); | ||
|
|
||
| public static StructuredTraceGraph buildGraph(StructuredTrace trace) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @kotharironak Basic question, Why buildGraph is called for every span. Can we call only once for Trace instead of every span ?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pavan-traceable Where does the buildGraph get called for every span?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thats my understanding from this slack chat https://hypertrace.slack.com/archives/G01BLU6M52N/p1609913769019700?thread_ts=1609841154.015200&cid=G01BLU6M52N.. I may be wrong
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pavan-traceable buildGraph is called at 3 places, |
||
| // trace doesn't exist | ||
| if (cachedTrace.get() == null) { | ||
| LOG.info("Building structured trace graph. Reason: no cached trace"); | ||
| StructuredTraceGraph graph = StructuredTraceGraph.createGraph(trace); | ||
| cachedTrace.set(StructuredTrace.newBuilder(trace).build()); | ||
| cachedGraph.set(graph); | ||
| return graph; | ||
| } | ||
|
|
||
| // is processed and cached are same trace? | ||
| if (!cachedTrace.get().getCustomerId().equals(trace.getCustomerId()) || | ||
| !cachedTrace.get().getTraceId().equals(trace.getTraceId())) { | ||
| LOG.info("Building structured trace graph. Reason: cached trace and current trace doesn't not match"); | ||
| StructuredTraceGraph graph = StructuredTraceGraph.createGraph(trace); | ||
| cachedTrace.set(StructuredTrace.newBuilder(trace).build()); | ||
| cachedGraph.set(graph); | ||
| return graph; | ||
| } | ||
|
|
||
| // trace internally changed | ||
| if(cachedTrace.get().getEntityList().size() != trace.getEntityList().size() || | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When this scenario will occur ?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we pull this out in a separate boolean function? say
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about the set of attributes and enriched attributes? If the attributes and enriched attributes change, the graph structure will stay the same, but will we get the latest set of attributes from the graph nodes? Basically, Does it create a copy of the Event, or just keeps the reference to it?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is keeping reference. ApiTraceGraph is using
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For |
||
| cachedTrace.get().getEventList().size() != trace.getEventList().size() || | ||
| cachedTrace.get().getEntityEdgeList().size() != trace.getEntityEdgeList().size() || | ||
| cachedTrace.get().getEntityEventEdgeList().size() != trace.getEntityEventEdgeList().size() || | ||
| cachedTrace.get().getEventEdgeList().size() != trace.getEventEdgeList().size()) { | ||
| LOG.info("Building structured trace graph. Reason: cached trace and current trace have different size"); | ||
| StructuredTraceGraph graph = StructuredTraceGraph.createGraph(trace); | ||
| cachedTrace.set(StructuredTrace.newBuilder(trace).build()); | ||
| cachedGraph.set(graph); | ||
| return graph; | ||
| } | ||
|
|
||
| return cachedGraph.get(); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add unit test for this class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a utility wrapper method with a set of ordered condition. I think, you are referring here to test around the structure trace graph itself. Can handle it as part of #105. Can you add more detail to that ticket too?