Passing trace context to signalfx #509

Merged: 2 commits merged into master on Jul 16, 2018

@aubrey-stripe aubrey-stripe commented Jul 16, 2018

Summary

  • Preallocating buffers for global metrics
  • Using child contexts for more accurate grouping of trace spans
  • Minimizing contention in destination refresh for forwarding metrics
  • Switching forwarding batch (sent and received) metrics from gauge to counter for accurate representation of batches.
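As a rough illustration of the first bullet (this is not the code from the PR), preallocating a slice with a known capacity lets `append` fill it without growing the backing array. The `metric` type and `collect` function below are hypothetical names used only for this sketch.

```go
package main

import "fmt"

// metric is a hypothetical stand-in for a forwarded metric.
type metric struct {
	Name  string
	Value float64
}

// collect copies metrics from several batches into one slice.
// The capacity is computed up front so append never has to reallocate.
func collect(batches [][]metric) []metric {
	total := 0
	for _, b := range batches {
		total += len(b)
	}
	out := make([]metric, 0, total) // length 0, capacity total
	for _, b := range batches {
		out = append(out, b...)
	}
	return out
}

func main() {
	batches := [][]metric{
		{{Name: "a", Value: 1}, {Name: "b", Value: 2}},
		{{Name: "c", Value: 3}},
	}
	out := collect(batches)
	fmt.Println(len(out), cap(out))
}
```

The trade-off is one extra pass to count elements in exchange for a single allocation.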

Motivation

Attempting to get better visibility into the veneur-proxy forwarding process.

Test plan

No additional tests required. Will deploy to a subset in QA for tests when ready.

Rollout/monitoring/revert plan

After deployment, traces in the SignalFx sink will be reordered to reflect the parent/child relationships between spans.

@aubrey-stripe aubrey-stripe changed the title from "WIP - Passing trace context to signalfx" to "Passing trace context to signalfx" Jul 16, 2018
@stripe-ci

Gerald Rule: Copy Observability on Veneur, Unilog, Falconer pull requests

cc @stripe/observability
cc @stripe/observability-stripe

@@ -47,15 +47,16 @@ func (c *collection) submit(ctx context.Context, cl *trace.Client) error {
 	errorCh := make(chan error, len(c.pointsByKey)+1)

 	submitOne := func(client dpsink.Sink, points []*datapoint.Datapoint) {
-		span, ctx := trace.StartSpanFromContext(ctx, "")
+		span, childCtx := trace.StartSpanFromContext(ctx, "")
 		span.SetTag("datapoint_count", len(points))
Contributor Author

This might be best done as a Log...

Contributor

It seems fine to do it this way. I can imagine interesting correlations that we'd lose sight of if we didn't have the information available in tracing.

proxy.go Outdated
// We do this after we've fetched info so we don't hold the lock during long
// queries, timeouts or errors. The flusher can lock the mutex and prevent us
// from updating at the same time.
func updateRing(destinations []string, ring *consistent.Consistent, mtx *sync.Mutex) {
	mtx.Lock()
	defer mtx.Unlock()
Contributor

I realize this was here before, but if we care about performance enough to be talking about "At the last moment" in this segment, then it would be better not to use a defer.

Contributor Author

Fixed.

@aubrey-stripe
Contributor Author

I've found another change that I'd like to add here. The gauges tracking batch size in forwarding should be counters. Since these aren't being globalized, we can just use the rollup designation to get either an easy sum of all metrics forwarded or an average of the batch size. With a gauge, this just captures the last batch size of a gauge window.
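A small sketch of the difference, using a hypothetical in-memory `window` type rather than veneur's samplers: within one flush window a gauge keeps only the last value it saw, while a counter accumulates every batch, so both the sum and the mean batch size can be recovered from it.

```go
package main

import "fmt"

// window is a hypothetical flush window that aggregates the same samples
// the way a gauge and a counter would.
type window struct {
	gauge   float64 // last value wins
	counter float64 // values are summed
	samples int     // number of batches recorded
}

// record adds one batch size to the window.
func (w *window) record(batchSize float64) {
	w.gauge = batchSize
	w.counter += batchSize
	w.samples++
}

func main() {
	var w window
	for _, size := range []float64{100, 250, 50} {
		w.record(size)
	}
	fmt.Println("gauge:", w.gauge)     // only the last batch in the window
	fmt.Println("counter:", w.counter) // total metrics forwarded
	fmt.Println("mean batch:", w.counter/float64(w.samples))
}
```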

@aubrey-stripe aubrey-stripe merged commit d282e9d into master Jul 16, 2018
@aubrey-stripe aubrey-stripe deleted the aubrey/sfx-http-spans branch July 16, 2018 21:17
3 participants