Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mock clock, unflake tests #813

Merged
merged 4 commits into from
Nov 19, 2021
Merged

Use mock clock, unflake tests #813

merged 4 commits into from
Nov 19, 2021

Conversation

abhinav
Copy link
Collaborator

@abhinav abhinav commented Nov 18, 2021

This PR resolves flakiness of some tests in Fx
by abstracting our use of time operations behind a Clock interface
and using a mock clock for the relevant tests.

This adds an internal/fxclock package that defines the Clock interface.
This interface is a compliant subset of "benbjohnson/clock".Clock.
This intentionally does not use benbjohnson/clock directly outside of tests
to keep it a test-only dependency.

The new internal-only WithClock option
accepts and plumbs an implementation of fxclock.Clock.
Note that this option is declared in app_internal_test.go.
It's not part of the public API,
but it's accessible to external tests in the top-level Fx directory.

Finally, this switches to using the Clock for time operations where relevant,
and updates tests that rely on sleep/time-based blocking to use a mock clock.
Now,

  • time.Now is only present in fxclock.System.
  • time.Since is only present in fxclock.System.
  • time.Sleep is only present in fxclock.System.
  • context.WithTimeout is present in fxclock.System,
    fxtest.App.Require{Start,Stop}, and example_test.go.
    The latter two because fxclock.Clock is not part of our public API.

Resolves #799
Ref: GO-1001

@abhinav abhinav requested a review from sywhang November 18, 2021 20:02
@abhinav
Copy link
Collaborator Author

abhinav commented Nov 18, 2021

Fixing data race in benbjohnson/clock#42.
Temporarily pinning to the fork while the fix is in review to validate the fix further.

@codecov
Copy link

codecov bot commented Nov 18, 2021

Codecov Report

Merging #813 (85a3278) into master (dbe9bdb) will decrease coverage by 0.20%.
The diff coverage is 100.00%.

❗ Current head 85a3278 differs from pull request most recent head cf09130. Consider uploading reports for the commit cf09130 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #813      +/-   ##
==========================================
- Coverage   99.25%   99.04%   -0.21%     
==========================================
  Files          24       25       +1     
  Lines         941      946       +5     
==========================================
+ Hits          934      937       +3     
- Misses          6        7       +1     
- Partials        1        2       +1     
Impacted Files Coverage Δ
app.go 99.00% <100.00%> (-0.66%) ⬇️
fxtest/lifecycle.go 100.00% <100.00%> (ø)
internal/fxclock/clock.go 100.00% <100.00%> (ø)
internal/lifecycle/lifecycle.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dbe9bdb...cf09130. Read the comment docs.

go.mod Outdated
github.com/stretchr/testify v1.7.0
go.uber.org/dig v1.12.0
go.uber.org/goleak v1.1.11
go.uber.org/multierr v1.5.0
go.uber.org/zap v1.16.0
golang.org/x/sys v0.0.0-20210903071746-97244b99971b
)

replace github.com/benbjohnson/clock => github.com/abhinav/clock v1.3.1-0.20211118204451-38de6f8a55b9
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably will not land this replace directive.
I'd like to wait a few days for the change to be merged upstream.

Copy link
Contributor

@sywhang sywhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code changes looks good to me, and yeah, waiting on merging this until the upstream change gets merged is probably better than pinning the fork :)

app_internal_test.go Outdated Show resolved Hide resolved
@abhinav
Copy link
Collaborator Author

abhinav commented Nov 18, 2021

This PR is now WIP while I'm digging further into this.

@abhinav abhinav changed the title Use mock clock, unflake tests WIP: Use mock clock, unflake tests Nov 18, 2021
@abhinav abhinav added the wip label Nov 18, 2021
Add an internal fxclock package that provides a minimal Clock interface.

We will use this to abstract away Fx's use of time operations to help
address test flakiness issues.
@abhinav abhinav force-pushed the abg/onstop-race branch 2 times, most recently from e54a0b3 to 00653cb Compare November 19, 2021 17:44
Add a WithClock option to fx.App to inject an fxclock
and use this clock for all time operations in App and Lifecycle.

This option is only available from tests inside the same directory.
We're intentionally not exposing this as part of our public API for now.
Use a mock clock for the tests in TestAppStart and TestAppStop
that rely on blocking or sleeping.

These tests were previously using a small timeout and blocking on
`ctx.Done` indefinitely, or using `time.Sleep` directly.

With a mock clock, we can use larger, more real timeouts
and simply advance time to an even larger value.

Besides making the tests more deterministic,
this also makes them faster.

Before:

```
$ go test -run TestAppStart -count 1
PASS
ok      go.uber.org/fx  1.066s

$ go test -run TestAppStart -count 1
PASS
ok      go.uber.org/fx  1.060s
```

After:

```
$ go test -run TestAppStart -count 1
PASS
ok      go.uber.org/fx  0.083s

$ go test -run TestAppStart -count 1
PASS
ok      go.uber.org/fx  0.064s
```

(TestAppStop was not affected similarly
because it used a very small timeout to begin with.)
@abhinav abhinav force-pushed the abg/onstop-race branch 2 times, most recently from 29082d1 to 85a3278 Compare November 19, 2021 17:55
The OnStop_timeout test was flaky
because of the very small StartTimeout.

We would sometimes hit a condition where:
App.Run calls App.Start with a timeout of 1 millisecond.
This takes just long enough that App.Start gives up.
As a result, App.Run does not call App.Stop,
which causes the test to fail
because we don't see the Stop event.

Resolve the flakiness with the use of a mock clock.
Now, time doesn't advance while the start hook is running,
so we can be certain that the really long OnStop hook is invoked
and has a chance to time out.

Note that the subtests for this test share hooks,
but they can't share mock clocks (`clock.Add` is not thread-safe)
so we instead use "hook builders" that build hooks
given the mock clock.
@abhinav abhinav changed the title WIP: Use mock clock, unflake tests Use mock clock, unflake tests Nov 19, 2021
@abhinav
Copy link
Collaborator Author

abhinav commented Nov 19, 2021

@sywhang @r-hang
This is good to go now.
We're also no longer pinned to the fork of clock—I was able to resolve that by not sharing the clock between subtests.

@abhinav abhinav removed the wip label Nov 19, 2021
@abhinav abhinav merged commit 9377304 into master Nov 19, 2021
@abhinav abhinav deleted the abg/onstop-race branch November 19, 2021 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

OnStop_timeout test occasionally fails
3 participants