tracer: enable stats flushing when Flush() is called #1661

Merged
merged 13 commits into from
Jan 19, 2023
1 change: 1 addition & 0 deletions ddtrace/tracer/metrics.go
@@ -23,6 +23,7 @@ type statsdClient interface {
Count(name string, value int64, tags []string, rate float64) error
Gauge(name string, value float64, tags []string, rate float64) error
Timing(name string, value time.Duration, tags []string, rate float64) error
Flush() error
Close() error
}

8 changes: 8 additions & 0 deletions ddtrace/tracer/metrics_test.go
@@ -35,6 +35,7 @@ type testStatsdClient struct {
waitCh chan struct{}
n int
closed bool
flushed int
}

type testStatsdCall struct {
@@ -122,6 +123,13 @@ func (tg *testStatsdClient) addMetric(ct callType, tags []string, c testStatsdCa
return nil
}

func (tg *testStatsdClient) Flush() error {
tg.mu.Lock()
defer tg.mu.Unlock()
tg.flushed++
return nil
}

func (tg *testStatsdClient) Close() error {
tg.closed = true
return nil
2 changes: 2 additions & 0 deletions ddtrace/tracer/tracer.go
@@ -311,6 +311,8 @@ func (t *tracer) worker(tick <-chan time.Time) {
case done := <-t.flush:
t.statsd.Incr("datadog.tracer.flush_triggered", []string{"reason:invoked"}, 1)
t.traceWriter.flush()
t.statsd.Flush()
t.stats.flushAndSend(time.Now(), withoutCurrentBucket)
// TODO(x): In reality, the traceWriter.flush() call is not synchronous
// when using the agent traceWriter. However, this functionality is used
// in Lambda so for that purpose this mechanism should suffice.
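The hunk above relies on a flush request carrying a `done` channel that the worker signals once it has flushed. A minimal sketch of that pattern follows; the `worker` type, its fields, and `newWorker` are hypothetical names for illustration and are not the tracer's actual implementation. The counter stands in for flushing traces, statsd metrics, and stats.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for a background goroutine that
// flushes on demand. It is not the tracer's real type.
type worker struct {
	flush   chan chan<- struct{} // each request carries its own done channel
	stop    chan struct{}
	wg      sync.WaitGroup
	mu      sync.Mutex
	flushed int
}

func newWorker() *worker {
	w := &worker{
		flush: make(chan chan<- struct{}),
		stop:  make(chan struct{}),
	}
	w.wg.Add(1)
	go w.run()
	return w
}

func (w *worker) run() {
	defer w.wg.Done()
	for {
		select {
		case done := <-w.flush:
			w.mu.Lock()
			w.flushed++ // placeholder for the real flush work
			w.mu.Unlock()
			close(done) // tell the caller the flush has completed
		case <-w.stop:
			return
		}
	}
}

// flushSync blocks until the worker has handled this flush request.
func (w *worker) flushSync() {
	done := make(chan struct{})
	w.flush <- done
	<-done
}

func (w *worker) count() int {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.flushed
}

func (w *worker) close() {
	close(w.stop)
	w.wg.Wait()
}

func main() {
	w := newWorker()
	defer w.close()
	w.flushSync()
	w.flushSync()
	fmt.Println(w.count())
}
```

Because the caller waits on `done`, anything the worker did before closing it is visible when `flushSync` returns, which is what lets a test assert on a flush counter immediately afterwards.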
27 changes: 26 additions & 1 deletion ddtrace/tracer/tracer_test.go
@@ -1932,9 +1932,20 @@ func (w *testTraceWriter) Flushed() []*span {

func TestFlush(t *testing.T) {
tr, _, _, stop := startTestTracer(t)
defer stop()

tw := newTestTraceWriter()
tr.traceWriter = tw
defer stop()

ts := &testStatsdClient{}
tr.statsd = ts

transport := newDummyTransport()
c := newConcentrator(&config{transport: transport}, 500000)
tr.stats = c
c.Start()
defer c.Stop()

tr.StartSpan("op").Finish()
timeout := time.After(time.Second)
loop:
@@ -1950,9 +1961,23 @@ loop:
time.Sleep(time.Millisecond)
}
}
as := &aggregableSpan{
key: aggregation{
Name: "http.request",
},
// Start must be older than latest bucket to get flushed
Start: time.Now().UnixNano() - 3*500000,
Contributor

Any particular reason to choose 3*500000?

Contributor Author (@lievan, Jan 17, 2023)

Because the bucketSize for the stats concentrator here is set to 500000, setting the start time for the span to be time.Now().UnixNano() - 3*500000 ensures that this span does not belong to the current bucket (which does not get flushed when t.stats.flushAndSend(time.Now(), withoutCurrentBucket) is called in tracer.go). If the start time for the span is set to time.Now().UnixNano() - 250000, for example, this test case will fail.

I chose a bucketSize of 500000 to match the unit tests in stats_test.go.

To increase code clarity I can set some variable bucketSize to 500000 and use that throughout! (Edit: or just use defaultStatsBucketSize as Katie suggested)
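The bucket arithmetic in this reply can be sketched as follows. The sketch assumes that a span is assigned to a bucket by truncating its end time (`Start + Duration`) down to a multiple of `bucketSize`, and that a flush excluding the current bucket skips any bucket whose start is later than `now - bucketSize`. Both rules, and the helper names `alignTs` and `flushable`, are assumptions made for illustration rather than the concentrator's confirmed code.

```go
package main

import "fmt"

// alignTs truncates ts down to the start of its bucket.
// Hypothetical helper, assumed bucketing rule.
func alignTs(ts, bucketSize int64) int64 {
	return ts - ts%bucketSize
}

// flushable reports whether a span ending at spanEnd lands in a bucket that
// would be sent at time now when the current bucket is excluded.
// Assumed rule: skip buckets that start later than now - bucketSize.
func flushable(spanEnd, now, bucketSize int64) bool {
	return alignTs(spanEnd, bucketSize) <= now-bucketSize
}

func main() {
	const bucketSize = int64(500000)
	// A fixed "now" that sits 300000ns into its bucket, so the result is
	// deterministic instead of depending on the wall clock.
	now := int64(10_000_300_000)

	old := now - 3*bucketSize + 1  // Start three buckets back, Duration 1
	recent := now - bucketSize/2 + 1 // Start half a bucket back, Duration 1

	fmt.Println(flushable(old, now, bucketSize))    // true
	fmt.Println(flushable(recent, now, bucketSize)) // false
}
```

Under these assumptions a span that starts three buckets back is flushable wherever `now` falls within its bucket, whereas a span only half a bucket old is flushable or not depending on that offset, which is why the larger margin makes the test stable.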

Duration: 1,
}
c.add(as)

assert.Len(t, tw.Flushed(), 0)
assert.Equal(t, ts.flushed, 0)
assert.Len(t, transport.Stats(), 0)
tr.flushSync()
assert.Len(t, tw.Flushed(), 1)
assert.Equal(t, ts.flushed, 1)
assert.NotZero(t, transport.Stats())
Contributor

Can we assert something stronger than just NotZero? Or is the result of transport.Stats() not consistent here?

Contributor Author

I think assert.Len(t, transport.Stats(), 1) also works! I can make that change.
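One way to see why a length check is the stronger assertion: a zero-value check on a slice can pass even when the slice is empty. The sketch below uses only the standard library and assumes the zero check is a `reflect.DeepEqual` comparison against the type's zero value; `isZero` is a hypothetical helper written for this illustration, not an API of the assertion library used in the test.

```go
package main

import (
	"fmt"
	"reflect"
)

// isZero reports whether v equals the zero value of its type.
// Hypothetical helper that approximates a "zero" assertion.
func isZero(v interface{}) bool {
	return v == nil || reflect.DeepEqual(v, reflect.Zero(reflect.TypeOf(v)).Interface())
}

func main() {
	var none []string          // nil slice: the zero value
	empty := []string{}        // non-nil but empty: not the zero value
	one := []string{"payload"} // exactly one element

	// A "not zero" check accepts both empty and one.
	fmt.Println(!isZero(none), !isZero(empty), !isZero(one)) // false true true

	// A length check accepts only the slice with exactly one element.
	fmt.Println(len(none) == 1, len(empty) == 1, len(one) == 1) // false false true
}
```

If the stats transport ever returned an empty, non-nil slice, a not-zero check in this style would still pass, while a check for exactly one payload would fail as intended.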

}

func TestTakeStackTrace(t *testing.T) {