Initial Request Tracing Implementation #1093
- Clean up tests
- Remove unnecessary integration tests and replace with unit tests
- Clean up code formatting
- Other random things
Auto-wires in Brave tracing
Adds hooks for gRPC tracing in agent and server
…thin Genie server and agent
- Trace propagator to put trace information from server to agent and extract it
- Tag adapter for modifying tags applied to spans, which allows inheritors to adapt tags as needed (e.g. internal Netflix tag modifications)
- Trace cleanup class for synchronously flushing Brave spans before agent shutdown
- Auto-configuration to expose components as beans
- Extract trace information from the environment at startup if possible; otherwise a new trace is instantiated
- Close spans at agent termination
- Add annotations for agent state machine execution
- Add tag for the Genie job id when pertinent
- Add tag for the agent command that is being executed
- Propagate trace information from agent launchers into environment variables for agent processes
- Annotate job launch process to show phases
- Tag job launch span with job id
Exported as the GENIE_B3_TRACE_ID_HIGH, GENIE_B3_TRACE_ID_LOW, GENIE_B3_PARENT_SPAN_ID and GENIE_B3_SAMPLED environment variables.
Can be used by downstream user job clients to continue propagating the trace (e.g. to Spark or Presto).
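As a rough illustration of how a downstream job client might pick these variables up, here is a minimal sketch that reads the four values from an environment map and falls back to "no context" when any of them is missing or malformed. The class name `GenieTraceEnv` is hypothetical, and the value encoding (lower-case hex for the ids, `true`/`false` for the sampling flag) is an assumption; this PR page does not state how the values are serialized.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical holder for the trace fields exported to job processes. */
final class GenieTraceEnv {
    final long traceIdHigh;
    final long traceIdLow;
    final long parentSpanId;
    final boolean sampled;

    private GenieTraceEnv(long traceIdHigh, long traceIdLow, long parentSpanId, boolean sampled) {
        this.traceIdHigh = traceIdHigh;
        this.traceIdLow = traceIdLow;
        this.parentSpanId = parentSpanId;
        this.sampled = sampled;
    }

    /**
     * Returns the trace fields if all four variables are present and parseable,
     * otherwise empty so the caller can start a new trace instead.
     * ASSUMPTION: ids are hex strings and the sampled flag is "true"/"false".
     */
    static Optional<GenieTraceEnv> fromEnvironment(Map<String, String> env) {
        String high = env.get("GENIE_B3_TRACE_ID_HIGH");
        String low = env.get("GENIE_B3_TRACE_ID_LOW");
        String parent = env.get("GENIE_B3_PARENT_SPAN_ID");
        String sampled = env.get("GENIE_B3_SAMPLED");
        if (high == null || low == null || parent == null || sampled == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(new GenieTraceEnv(
                Long.parseUnsignedLong(high, 16),
                Long.parseUnsignedLong(low, 16),
                Long.parseUnsignedLong(parent, 16),
                Boolean.parseBoolean(sampled)));
        } catch (NumberFormatException e) {
            // Malformed values are treated the same as missing ones.
            return Optional.empty();
        }
    }
}
```

A client would typically call `GenieTraceEnv.fromEnvironment(System.getenv())` once at startup and hand the resulting fields to whatever tracing library it uses.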
Codecov Report
@@ Coverage Diff @@
## master #1093 +/- ##
============================================
- Coverage 90.88% 90.79% -0.10%
- Complexity 3646 3669 +23
============================================
Files 451 457 +6
Lines 14098 14230 +132
Branches 986 1000 +14
============================================
+ Hits 12813 12920 +107
- Misses 850 869 +19
- Partials 435 441 +6
 *
 * @see TraceContext.Builder#traceId(long)
 */
static final String GENIE_JOB_B3_TRACE_ID_LOW_KEY = "GENIE_B3_TRACE_ID_LOW";
Why call it 'trace_id_low' (instead of 'trace_id'), when B3 just calls it traceId?
I went back and forth on this a few times (and it's certainly not set in stone). Ideally I'd have just done GENIE_B3_TRACE_ID and used the hex string; however, the builder for TraceContext doesn't expose a method that would allow me to reconstitute the context from that single value. Hence I had to break it up, and it felt a little clearer to use LOW and HIGH rather than have the absence of LOW represent low. That felt to me like a historical thing from when they went from supporting only 64-bit identifiers to supporting both 64-bit and 128-bit.
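To make the HIGH/LOW split concrete, here is a minimal sketch of how a 128-bit hex trace id maps onto two 64-bit longs and back. The class `TraceIdParts` and its methods are hypothetical helpers written for illustration, not code from this PR; they use only the JDK.

```java
/** Hypothetical helper: splits a B3 hex trace id into 64-bit halves and joins them again. */
final class TraceIdParts {
    private TraceIdParts() {
    }

    /** Upper 64 bits of a 32-character (128-bit) id; 0 for a 16-character (64-bit) id. */
    static long high(String hex) {
        if (hex.length() <= 16) {
            return 0L;
        }
        return Long.parseUnsignedLong(hex.substring(0, hex.length() - 16), 16);
    }

    /** Lower 64 bits, i.e. the last 16 hex characters (or the whole string if shorter). */
    static long low(String hex) {
        int start = Math.max(0, hex.length() - 16);
        return Long.parseUnsignedLong(hex.substring(start), 16);
    }

    /** Rebuilds the hex form; a zero high half yields the 64-bit (16-character) form. */
    static String toHex(long high, long low) {
        String lowHex = String.format("%016x", low);
        return high == 0L ? lowHex : String.format("%016x", high) + lowHex;
    }
}
```

The two longs produced this way are the kind of values that could be passed to Brave's `TraceContext.Builder#traceIdHigh(long)` and `TraceContext.Builder#traceId(long)`, which is presumably why the exported variables mirror that split.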
process.waitFor(KILL_CHECK_INTERVAL_MS, TimeUnit.MILLISECONDS);
}
}
nit: these 2 method moves seem unnecessary
Yeah, IntelliJ auto-formatting. Oh well.
LGTM pending 2 minor comments. Thanks for the good work!
@@ -63,7 +63,7 @@
 @Bean
 @Lazy
 @ConditionalOnMissingBean(AgentHeartBeatService.class)
-public AgentHeartBeatService agentHeartBeatService(
+public GrpcAgentHeartBeatServiceImpl agentHeartBeatService(
Just curious: why change the return type to *Impl instead of the interface? Will this change tie it to a specific implementation?
In general, for Spring it is better practice to make the return type of a bean method as specific as possible, because that lets the bean be used in the most injection scenarios. For example, suppose someone put @ConditionalOnBean(GrpcAgentHeartBeatServiceImpl.class) on some other bean definition because they explicitly needed that implementation. Even though a bean of that type is technically available, Spring would only know it as AgentHeartBeatService, and thus that condition would fail.
It doesn't tie anything to a specific implementation; it just makes the type more concrete in the Spring context.
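The point about declared types can be modeled without Spring. The sketch below is a simplified stand-in, not Spring's actual condition evaluation: it assumes a type condition is checked against the factory method's declared return type before any bean instance exists. All class and method names in it are hypothetical.

```java
import java.lang.reflect.Method;

/** Simplified model of a type-based bean condition evaluated on declared return types. */
final class DeclaredTypeDemo {
    interface HeartBeatService {
    }

    static final class GrpcHeartBeatServiceImpl implements HeartBeatService {
    }

    /** Factory method declares only the interface. */
    static final class InterfaceTypedConfig {
        HeartBeatService heartBeatService() {
            return new GrpcHeartBeatServiceImpl();
        }
    }

    /** Factory method declares the concrete implementation. */
    static final class ConcreteTypedConfig {
        GrpcHeartBeatServiceImpl heartBeatService() {
            return new GrpcHeartBeatServiceImpl();
        }
    }

    /**
     * Mimics a condition that needs a bean of the required type. Only the declared
     * return type is visible here, since the factory method is never invoked.
     */
    static boolean conditionMatches(Class<?> config, String factoryMethod, Class<?> required) {
        try {
            Method method = config.getDeclaredMethod(factoryMethod);
            return required.isAssignableFrom(method.getReturnType());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    private DeclaredTypeDemo() {
    }
}
```

In this model a condition on `GrpcHeartBeatServiceImpl` fails against `InterfaceTypedConfig` and passes against `ConcreteTypedConfig`, while a condition on the `HeartBeatService` interface passes against both, which is why the more specific return type is the less restrictive choice.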
Initial effort into supporting request tracing in Genie
Part of larger internal platform tracing effort as PoC