Tune actor app health check to become healthy sooner #7022
Please, let's wait for release-1.12 to be merged into master before making more changes to pkg/actors: #7017
import "k8s.io/utils/clock"

// WithClock sets a custom clock (for mocking time).
func WithClock(clock clock.WithTicker) Option {
Let's keep this without the build tag. The overhead is basically none. We had to recently un-do a similar thing in dapr/kit as it made writing unit tests for other packages harder (dapr/kit#62)
No, we should not pollute implementation code with test code. The `unit` build tag is a standard test tag used throughout the Dapr project, so any testing code is able to make use of the `WithClock` option.
> We had to recently un-do a similar thing in dapr/kit as it made writing unit tests for other packages harder

Why was this the case? Surely the same argument can be made for interface mocks as well.
The linked PR says:
> There are use cases where having those methods available even outside of a unit test is helpful, such as when the objects are instantiated with a clock that could be mocked in the unit test for the parent method.

Why would you want to use a mocked clock outside of a unit test?
Codecov Report

Coverage diff:

| | master | #7022 | +/- |
|---|---|---|---|
| Coverage | 62.21% | 62.33% | +0.12% |
| Files | 240 | 240 | |
| Lines | 22059 | 22055 | -4 |
| Hits | 13723 | 13747 | +24 |
| Misses | 7191 | 7157 | -34 |
| Partials | 1145 | 1151 | +6 |
Signed-off-by: joshvanl <me@joshvanl.dev>
* Tune actor app health check to become healthy sooner
* Reset failure count to 0 on a healthy actor app health check
* Move wait group to just before go routine
* Change failure threshold `int32` -> `int`
* Fix actors health checker
* Returns `errors.Join` for actors `Close` procedure

Signed-off-by: joshvanl <me@joshvanl.dev>
Co-authored-by: Dapr Bot <56698301+dapr-bot@users.noreply.github.com>
Co-authored-by: Yaron Schneider <schneider.yaron@live.com>
…)" This reverts commit 9cd87e2.
Signed-off-by: Elena Kolevska <elena@kolevska.com>
# Conflicts:
# pkg/actors/actors.go
With the current implementation of the actor health check, it takes a significant amount of time (5s) before an app is checked for its healthy status and the actor subsystem becomes ready. This delay is not appropriate for actor apps, which are generally stateless in themselves and have few dependencies besides Dapr(!).
This PR updates the implementation to have two health check modes: an unhealthy status check and a healthy status check. On startup, after an initial delay of 1 second, the checker enters the unhealthy check mode, which checks the app health every half second. On success, the checker reports a healthy status and moves on to the healthy status mode. The healthy status mode checks the health every 3 seconds and requires 4 consecutive unhealthy status checks before reporting the unhealthy status and moving back to the unhealthy status mode.
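The two modes can be sketched as a small state machine. This is a minimal illustration, not the code in `pkg/actors`: the `checker` type, the `observe` method, and the constant names are hypothetical. Only the timings (1s initial delay, 500ms and 3s intervals) and the threshold of 4 consecutive failures come from the description above.

```go
package main

import "time"

// Timings and threshold as stated in the PR description.
// The constant names are hypothetical.
const (
	initialDelay      = 1 * time.Second        // wait before the first probe
	unhealthyInterval = 500 * time.Millisecond // probe period while unhealthy
	healthyInterval   = 3 * time.Second        // probe period while healthy
	failureThreshold  = 4                      // consecutive failures before flipping to unhealthy
)

// checker tracks which mode the health check is in (hypothetical type).
type checker struct {
	healthy  bool // false on startup: the checker begins in unhealthy mode
	failures int  // consecutive failed probes seen while in healthy mode
}

// observe records the result of one probe and returns how long to wait
// before the next probe.
func (c *checker) observe(ok bool) time.Duration {
	if ok {
		// Any success resets the failure count and selects healthy mode.
		c.failures = 0
		c.healthy = true
		return healthyInterval
	}
	if !c.healthy {
		// Still unhealthy: keep probing quickly.
		return unhealthyInterval
	}
	c.failures++
	if c.failures >= failureThreshold {
		// Enough consecutive failures: report unhealthy and probe quickly again.
		c.healthy = false
		c.failures = 0
		return unhealthyInterval
	}
	return healthyInterval
}
```

A caller would sleep for `initialDelay`, then loop: probe the app, pass the result to `observe`, and wait for the returned duration. A single success is enough to become healthy, while becoming unhealthy requires four failures in a row, which is what lets an app report ready sooner without flapping on one bad probe.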
This dramatically speeds up actor app startup times, as can be seen from the integration tests. There are more improvements we can make to the actor subsystem within placement, but those will come in follow-up PRs.
The actor placement client handler now responds immediately to a shutdown signal, rather than idling for a static `time.Sleep` duration, which speeds up shutdown times of actor apps.
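As a rough illustration of the difference, a wait that can be interrupted by shutdown might look like the following. This is a sketch under assumptions, not the placement client's code: `waitOrDone` and `shutdownInterruptsWait` are hypothetical names, and a cancelled `context.Context` stands in for the shutdown signal.

```go
package main

import (
	"context"
	"time"
)

// waitOrDone blocks for d, or until ctx is cancelled, whichever comes first.
// It returns true only if the full duration elapsed. Unlike time.Sleep(d),
// it returns as soon as the shutdown signal (ctx cancellation) arrives.
func waitOrDone(ctx context.Context, d time.Duration) bool {
	t := time.NewTimer(d)
	defer t.Stop() // release the timer if we return early
	select {
	case <-t.C:
		return true
	case <-ctx.Done():
		return false
	}
}

// shutdownInterruptsWait shows the behaviour: with an already-cancelled
// context, a one-hour wait returns immediately instead of blocking.
func shutdownInterruptsWait() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a shutdown signal
	return !waitOrDone(ctx, time.Hour)
}
```

A retry loop that previously called `time.Sleep` between attempts could call `waitOrDone` instead and exit the loop when it returns false.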