Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tune actor app health check to become healthy sooner #7022

Merged
merged 31 commits into from Jan 16, 2024

Conversation

JoshVanL
Copy link
Contributor

The current implementation of the actor health check means that it takes a significant amount of time before an app is checked for its healthy status (5s) and the actor sub system becomes ready. This amount of time is not appropriate for actor apps which are generally going to be stateless unto themselves will little dependencies- besides Dapr(!).

This PR updates the implementation to have two modes of health check- a unhealthy status check and a healthy status check. On startup and after the initial delay of 1 second, the checker enters the unhealthy check mode which checks the app health every half second. On success, the checker reports a healthy status and moves onto the healthy status mode. The healthy status mode checks the health every 3 seconds and requires 4 consecutive unhealthy status checks before reporting the unhealthy status and moving to the unhealthy status mode again.

This dramatically speeds up actor app startup times which can be seen from the integration tests. There are more improvements we can do to the actor subsystem within placemenet, but that will be followup PRs.

The actor placement client handler now responds immediately to a shutdown signal, rather than waited idling for a static time.Sleep duration, speeding up shutdown times of actor apps.

@JoshVanL JoshVanL requested review from a team as code owners October 11, 2023 11:06
@JoshVanL JoshVanL added the autoupdate DaprBot will keep the Pull Request up to date with master branch label Oct 11, 2023
@ItalyPaleAle
Copy link
Contributor

Please, let's wait for release-1.12 to be merged into master before making more changes into pkg/actors: #7017

pkg/actors/actors.go Show resolved Hide resolved
import "k8s.io/utils/clock"

// WithClock sets a custom clock (for mocking time).
func WithClock(clock clock.WithTicker) Option {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this without the build tag. The overhead is basically none. We had to recently un-do a similar thing in dapr/kit as it made writing unit tests for other packages harder (dapr/kit#62)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this we should not pollute implementation code with test code. The unit build tag is a standard test tag used throughout the Dapr project where any testing code is able to make use of the WithClock option.

We had to recently un-do a similar thing in dapr/kit as it made writing unit tests for other packages harder

Why was this is the case? Surely the same argument can be made for interface mocks as well.

Copy link
Contributor Author

@JoshVanL JoshVanL Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the linked PR is says:

There are use cases where having those methods available even outside of a unit test is helpful, such as when the objects are instantiated with a clock that could be mocked in the unit test for the parent method.

Why would you want to use a mocked clock outside of a unit test?

pkg/actors/health/health.go Show resolved Hide resolved
pkg/actors/health/health.go Outdated Show resolved Hide resolved
pkg/actors/health/health.go Show resolved Hide resolved
pkg/actors/health/health.go Show resolved Hide resolved
pkg/actors/health/health.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 12, 2023

Codecov Report

Attention: 33 lines in your changes are missing coverage. Please review.

Comparison is base (ae644a2) 62.21% compared to head (c1a3b86) 62.33%.

Files Patch % Lines
pkg/actors/actors.go 30.76% 13 Missing and 5 partials ⚠️
pkg/actors/health/health.go 88.59% 9 Missing and 4 partials ⚠️
pkg/actors/placement/placement.go 75.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7022      +/-   ##
==========================================
+ Coverage   62.21%   62.33%   +0.12%     
==========================================
  Files         240      240              
  Lines       22059    22055       -4     
==========================================
+ Hits        13723    13747      +24     
+ Misses       7191     7157      -34     
- Partials     1145     1151       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

artursouza
artursouza previously approved these changes Nov 15, 2023
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
pkg/actors/actors.go Outdated Show resolved Hide resolved
@yaron2 yaron2 merged commit 9cd87e2 into dapr:master Jan 16, 2024
19 of 22 checks passed
whytem pushed a commit to whytem/dapr that referenced this pull request Jan 22, 2024
* Tune actor app health check to become healthy sooner

Signed-off-by: joshvanl <me@joshvanl.dev>

* Reset failure count to 0 on a healthy actor app health check

Signed-off-by: joshvanl <me@joshvanl.dev>

* Move wait group to just before go routine

Signed-off-by: joshvanl <me@joshvanl.dev>

* Change failure threshold `int32` -> `int`

Signed-off-by: joshvanl <me@joshvanl.dev>

* Fix actors health checker

Signed-off-by: joshvanl <me@joshvanl.dev>

* Returns `errors.Join` for actors `Close` procedure

Signed-off-by: joshvanl <me@joshvanl.dev>

---------

Signed-off-by: joshvanl <me@joshvanl.dev>
Co-authored-by: Dapr Bot <56698301+dapr-bot@users.noreply.github.com>
Co-authored-by: Yaron Schneider <schneider.yaron@live.com>
elena-kolevska added a commit to elena-kolevska/dapr that referenced this pull request Jan 24, 2024
…)"

This reverts commit 9cd87e2.

Signed-off-by: Elena Kolevska <elena@kolevska.com>

# Conflicts:
#	pkg/actors/actors.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autoupdate DaprBot will keep the Pull Request up to date with master branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants