Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestDisplayMetrics flaky #127

Open
elboulangero opened this issue Sep 12, 2021 · 2 comments
Open

TestDisplayMetrics flaky #127

elboulangero opened this issue Sep 12, 2021 · 2 comments

Comments

@elboulangero
Copy link

Hi!

During automated builds of go-metrics 0.3.4 by the Debian CI infrastructure, we noticed a test failure on one of our armhf builder (arm high-float). The issue disappeared after a retry.

All the logs are at https://ci.debian.net/packages/g/golang-github-armon-go-metrics/testing/armhf/, and the two logs of interest here are:

The lines of interest in the logs:

=== RUN   TestDisplayMetrics
    inmem_endpoint_test.go:29: bad: [0xd28240 0xd28400]
--- FAIL: TestDisplayMetrics (0.00s)

Looking deeper in the various logs, I noticed that this issue appeared also on the x86_64 architecture:

@dnephin
Copy link
Contributor

dnephin commented Sep 12, 2021

Thank you for the bug report!

Looking at this test, I believe what's happening is that we use https://pkg.go.dev/time#Time.Truncate to split time into intervals. If some of the test data is added before the interval boundary, and some of it is added after the boundary, we'll end up with two intervals for this test, and the test expects only one.

One way to fix this might be to increase the interval in that test slightly from 10ms to 20ms (to give us more buffer). Then when we get the first interval, wait for it to finish (the done chan was added recently in #125), and add all the metrics immediately after it finishes:

interval := 20 * time.Millisecond
inm := NewInmemSink(interval, 50*time.Millisecond)
<- inm.getInterval().done
// add all the sample metrics

data := inm.Data()
// now we expect 2 intervals in data instead of one.
...

As long as all the metrics can be added in 20ms and we call DisplayMetrics before the next interval, the test should not longer flake.

Edit: Ah but done is only closed when something calls getInterval, so we'll also need a goroutine to call that constantly to get the signal. Maybe using a time.After to wait on the interval boundary would be easier than using the done channel.

There are probably other options for fixing it as well.

@elboulangero
Copy link
Author

Edit: Ah but done is only closed when something calls getInterval, so we'll also need a goroutine to call that constantly to get the signal. Maybe using a time.After to wait on the interval boundary would be easier than using the done channel.

I can confirm that. I tried the change you suggested above, and the test gets stuck, and finally times out after 10 minutes.

Logs for reference:

=== RUN   TestDisplayMetrics
panic: test timed out after 10m0s

goroutine 7 [running]:
testing.(*M).startAlarm.func1()
	/usr/lib/go-1.16/src/testing/testing.go:1700 +0xe5
created by time.goFunc
	/usr/lib/go-1.16/src/time/sleep.go:180 +0x45

goroutine 1 [chan receive]:
testing.(*T).Run(0xc000001b00, 0x66fce6, 0x12, 0x6809f8, 0x48f0a6)
	/usr/lib/go-1.16/src/testing/testing.go:1239 +0x2da
testing.runTests.func1(0xc000001980)
	/usr/lib/go-1.16/src/testing/testing.go:1511 +0x78
testing.tRunner(0xc000001980, 0xc00010dde0)
	/usr/lib/go-1.16/src/testing/testing.go:1193 +0xef
testing.runTests(0xc00000e0d8, 0x7f0040, 0x2a, 0x2a, 0xc047e14ce3db0d3d, 0x8bb2e16fb0, 0x7f54c0, 0x66e426)
	/usr/lib/go-1.16/src/testing/testing.go:1509 +0x2fe
testing.(*M).Run(0xc000154000, 0x0)
	/usr/lib/go-1.16/src/testing/testing.go:1417 +0x1eb
main.main()
	_testmain.go:127 +0x138

goroutine 6 [chan receive]:
github.com/armon/go-metrics.TestDisplayMetrics(0xc000001b00)
	/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/src/github.com/armon/go-metrics/inmem_endpoint_test.go:18 +0xe5
testing.tRunner(0xc000001b00, 0x6809f8)
	/usr/lib/go-1.16/src/testing/testing.go:1193 +0xef
created by testing.(*T).Run
	/usr/lib/go-1.16/src/testing/testing.go:1238 +0x2b3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants