-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fxtest: Enforceable Timeouts on Lifecycle #1203
Conversation
In Fx, if starting/stopping the application takes longer than the context's timeout, the fx.App will bail early, regardless of the status of the currently running hooks. This prevents stalling an application when hooks (accidentally) block forever. In order to test hook behavior, Fx provides fxtest.Lifecycle to interact with. This Lifecycle is a simple wrapper around the actual fx Lifecycle type, meaning it does not check for timeout and bail early like fx.App does. This is an issue because: * It allows for long-running hooks (which should be considered bugs) to pass tests. * It allows for tests to completely stall for hooks that block forever. See uber-go#1180 for more details. This PR adds an option that can be passed to `fxtest.NewLifecycle` to cause it to immediately fail when context expires, similar to fx.App. ``` lc := fxtest.NewLifecycle(fxtest.EnforceTimeout(true)) lc.Append(fx.StartHook(func() { for {} })) ctx, _ := context.WithTimeout(context.Background(), time.Second) err := lc.Start(ctx) // Will return deadline exceeded after a second ``` This PR doesn't provide a way to test timeouts using `RequireStart` and `RequireStop`. However, since those don't take contexts anyways, my assumption is that usage of those APIs represents an intentional decision to not care about timeouts by the test writer. However, if people feel differently, we can instead do something like expose two new APIs `RequireStartWithContext(ctx)` and `RequireStopWithContext(ctx)` or something (names obviously up for discussion).
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1203 +/- ##
==========================================
- Coverage 98.57% 98.51% -0.06%
==========================================
Files 34 34
Lines 2871 2900 +29
==========================================
+ Hits 2830 2857 +27
- Misses 35 36 +1
- Partials 6 7 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that it's a bit unexpected that RequireStart and RequireStop don't enforce the timeout, but I also understand, and agree with why—where would the timeout come from? We can expand those APIs in a backwards compatible manner in the future if necessary (e.g. with an option to control their timeouts, or the WithContext method variants which may be more obvious, if a bit verbose).
} | ||
|
||
var _ fx.Lifecycle = (*Lifecycle)(nil) | ||
|
||
// NewLifecycle creates a new test lifecycle. | ||
func NewLifecycle(t TB) *Lifecycle { | ||
func NewLifecycle(t TB, opts ...LifecycleOption) *Lifecycle { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking, this is considered a breaking change per Go compatibility guarantees.
Adding variadic options to a function that didn't previously accept them still changes the signature.
It breaks cases like passing a reference to the function, for example.
I'll defer to y'all about whether you'd like to take the risk, since passing fastest.NewLifecycle
would be a pretty niche use.
Alternatively, we can add one of NewLifecycleWith
, NewLifecycleWithOptions
, or NewStrictLifecycle
instead. (The last defaulting to enforceTimeout = true. In that case, we could drop the EnforceTimeout option completely, and instead just add Option
s for future expansion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - makes sense. I did consider that it is technically a breaking change, but I'm assuming that the impact radius is minimal/non-existent. To justify this assumption, I did run Uber's entire unit test suite with this PR and did not see any issues, so I'm personally comfortable with this change to avoid adding multiple NewLifecycleX
APIs. Especially since any issues that do exist will (or at least, should) be isolated to test code.
That said, I'm open to adding a new API instead if somebody feels strongly about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel strongly about this; will defer to other maintainers.
I think this is fine as-is. Passing fxtest.NewLifecycle
as a function parameter is a niche enough use case that I expect zero breakage.
This isn't authoritative, but just searching public code on GitHub, there are no references to fxtest.NewLifecycle
not followed by a (
, so they're always invocations, and won't be broken.
https://github.com/search?q=%2Ffxtest%5C.NewLifecycle%5B%5E%5C%28%5D%2F+path%3A%2F.go%24%2F&type=code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 @sywhang or @tchung1118 - do you have strong opinions on this? Otherwise I will merge this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine as-is, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion against this either but I do prefer making a slightly breaking change in this API rather than overloading the API surface with a variant of NewLifecycle
given how niche the set of use cases that break from this.
b194cdd
to
a0a2fcd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with one minor behavioral improvement suggested
go func() { | ||
c <- fn(ctx) | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last thing: Just for the sake of easier/faster cleanup, we should wrap the context with a WithCancel here so that when this function returns because of a timeout, the context that fn(ctx)
is operating on is also closed, and the function returns early (if possible).
go func() { | |
c <- fn(ctx) | |
}() | |
ctx, cancel := context.WithCancel(ctx) | |
defer cancel() | |
go func() { | |
c <- fn(ctx) | |
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, shouldn't the context that fn
is operating on already be closed if its deadline is exceeded? Or are you referring to a case where fn
is only returning early when canceled and not exceeded deadline for some reason? I can add this to cover this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 You're right. My mistake.
The only way this function returns early is when the context has expired.
Thank you for all your helpful review @abhinav! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
In Fx, if starting/stopping the application takes longer than the context's timeout, the fx.App will bail early, regardless of the status of the currently running hooks.
This prevents stalling an application when hooks (accidentally) block forever.
In order to test hook behavior, Fx provides fxtest.Lifecycle to interact with. This Lifecycle is a simple wrapper around the actual fx Lifecycle type, meaning it does not check for timeout and bail early like fx.App does. This is an issue because:
See #1180 for more details.
This PR adds an option that can be passed to
fxtest.NewLifecycle
to cause it to immediately fail when context expires, similar to fx.App.This PR doesn't provide a way to test timeouts using
RequireStart
andRequireStop
. However, since those don't take contexts anyways, my assumption is that usage of those APIs represents an intentional decision to not care about timeouts by the test writer.However, if people feel differently, we can instead do something like expose two new APIs
RequireStartWithContext(ctx)
andRequireStopWithContext(ctx)
or something (names obviously up for discussion).