Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Healthz overhaul #7518

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Healthz overhaul #7518

wants to merge 9 commits into from

Conversation

JoshVanL
Copy link
Contributor

@JoshVanL JoshVanL commented Feb 9, 2024

Currently, the healthz endpoint on Dapr services do not represent the actual health of the service as they are mostly only spinning up a HTTP server and immediately returning a 200 OK. This results in services receiving traffic before they are ready to actually serve requests. This issue manifests in integration tests where, for example, no metrics are available on the metrics endpoint even though the service reports as healthy and therefore fails the test.

This PR overhauls the healthz package to allow registering health targets during the service New process chain. Each module marks their target as healthy and the healthz server reports the overall health of the service. Although not perfect, service healthz endpoints should now be more representative of the actual health of the program.

@JoshVanL JoshVanL requested review from a team as code owners February 9, 2024 01:43
Copy link

codecov bot commented Feb 9, 2024

Codecov Report

Attention: Patch coverage is 50.50000% with 99 lines in your changes are missing coverage. Please review.

Project coverage is 61.81%. Comparing base (091a204) to head (869ed2e).
Report is 6 commits behind head on master.

Current head 869ed2e differs from pull request most recent head 14bad27

Please upload reports for the commit 14bad27 to get more accurate results.

Files Patch % Lines
pkg/metrics/exporter.go 28.57% 20 Missing ⚠️
pkg/operator/operator.go 0.00% 17 Missing ⚠️
...g/sentry/server/validator/kubernetes/kubernetes.go 0.00% 12 Missing ⚠️
pkg/placement/raft/server.go 0.00% 10 Missing ⚠️
pkg/api/grpc/server.go 58.82% 7 Missing ⚠️
pkg/injector/service/injector.go 22.22% 7 Missing ⚠️
pkg/metrics/options.go 45.45% 6 Missing ⚠️
pkg/actors/actors.go 0.00% 2 Missing and 2 partials ⚠️
pkg/security/fake/fake.go 0.00% 3 Missing ⚠️
pkg/security/security.go 78.57% 2 Missing and 1 partial ⚠️
... and 6 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7518      +/-   ##
==========================================
+ Coverage   61.39%   61.81%   +0.42%     
==========================================
  Files         265      247      -18     
  Lines       22609    22477     -132     
==========================================
+ Hits        13880    13894      +14     
+ Misses       7579     7417     -162     
- Partials     1150     1166      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@JoshVanL JoshVanL marked this pull request as draft February 10, 2024 01:20
@JoshVanL JoshVanL added the autoupdate DaprBot will keep the Pull Request up to date with master branch label Feb 12, 2024
@JoshVanL JoshVanL marked this pull request as ready for review February 12, 2024 12:32
@JoshVanL JoshVanL added this to the v1.14 milestone Mar 5, 2024
@JoshVanL JoshVanL force-pushed the healthz-overhaul branch 2 times, most recently from d356488 to 8f2ef5f Compare March 12, 2024 16:46
@yaron2
Copy link
Member

yaron2 commented Mar 26, 2024

@JoshVanL please resolve conflicts

@JoshVanL
Copy link
Contributor Author

@yaron2 fixed

pkg/placement/tables.go Show resolved Hide resolved
pkg/operator/operator.go Outdated Show resolved Hide resolved
@JoshVanL JoshVanL force-pushed the healthz-overhaul branch 2 times, most recently from ca58813 to 92b8f10 Compare March 28, 2024 23:51
@yaron2
Copy link
Member

yaron2 commented Apr 15, 2024

/ok-to-test

1 similar comment
@yaron2
Copy link
Member

yaron2 commented Apr 15, 2024

/ok-to-test

@dapr-bot
Copy link
Collaborator

dapr-bot commented Apr 15, 2024

Dapr E2E test

🔗 Link to Action run

Commit ref: bf7d98b

✅ Build succeeded for linux/amd64

  • Image tag: dapre2e483c43facfl
  • Test image tag: dapre2e483c43facfl

✅ Infrastructure deployed

Cluster Resource group name Azure region
Linux Dapr-E2E-dapre2e483c43facfl westus3
Windows Dapr-E2E-dapre2e483c43facfw westus3
Linux/arm64 Dapr-E2E-dapre2e483c43facfla eastus

✅ Build succeeded for windows/amd64

  • Image tag: dapre2e483c43facfw
  • Test image tag: dapre2e483c43facfw

❌ Tests failed on windows/amd64

Please check the logs for details on the error.

❌ Tests failed on linux/amd64

Please check the logs for details on the error.

@dapr-bot
Copy link
Collaborator

dapr-bot commented Apr 15, 2024

Dapr E2E test

🔗 Link to Action run

Commit ref: bf7d98b

✅ Build succeeded for linux/amd64

  • Image tag: dapre2e9cf975a8e0l
  • Test image tag: dapre2e9cf975a8e0l

✅ Infrastructure deployed

Cluster Resource group name Azure region
Linux Dapr-E2E-dapre2e9cf975a8e0l westus3
Windows Dapr-E2E-dapre2e9cf975a8e0w westus3
Linux/arm64 Dapr-E2E-dapre2e9cf975a8e0la eastus

✅ Build succeeded for windows/amd64

  • Image tag: dapre2e9cf975a8e0w
  • Test image tag: dapre2e9cf975a8e0w

✅ Tests succeeded on windows/amd64

  • Image tag: dapre2e9cf975a8e0w
  • Test image tag: dapre2e9cf975a8e0w

✅ Tests succeeded on linux/amd64

  • Image tag: dapre2e9cf975a8e0l
  • Test image tag: dapre2e9cf975a8e0l

@artursouza
Copy link
Member

/ok-to-test

2 similar comments
@artursouza
Copy link
Member

/ok-to-test

@artursouza
Copy link
Member

/ok-to-test

@dapr-bot
Copy link
Collaborator

dapr-bot commented Apr 16, 2024

Dapr E2E test

🔗 Link to Action run

Commit ref: bf7d98b

✅ Build succeeded for linux/amd64

  • Image tag: dapre2e9b184e5841l
  • Test image tag: dapre2e9b184e5841l

✅ Infrastructure deployed

Cluster Resource group name Azure region
Linux Dapr-E2E-dapre2e9b184e5841l westus3
Windows Dapr-E2E-dapre2e9b184e5841w westus3
Linux/arm64 Dapr-E2E-dapre2e9b184e5841la eastus

✅ Build succeeded for windows/amd64

  • Image tag: dapre2e9b184e5841w
  • Test image tag: dapre2e9b184e5841w

❌ Tests failed on windows/amd64

Please check the logs for details on the error.

✅ Tests succeeded on linux/amd64

  • Image tag: dapre2e9b184e5841l
  • Test image tag: dapre2e9b184e5841l

@dapr-bot
Copy link
Collaborator

dapr-bot commented Apr 16, 2024

Dapr E2E test

🔗 Link to Action run

Commit ref: bf7d98b

✅ Build succeeded for linux/amd64

  • Image tag: dapre2e5c7a0f63e1l
  • Test image tag: dapre2e5c7a0f63e1l

✅ Infrastructure deployed

Cluster Resource group name Azure region
Linux Dapr-E2E-dapre2e5c7a0f63e1l westus3
Windows Dapr-E2E-dapre2e5c7a0f63e1w westus3
Linux/arm64 Dapr-E2E-dapre2e5c7a0f63e1la eastus

✅ Build succeeded for windows/amd64

  • Image tag: dapre2e5c7a0f63e1w
  • Test image tag: dapre2e5c7a0f63e1w

✅ Tests succeeded on windows/amd64

  • Image tag: dapre2e5c7a0f63e1w
  • Test image tag: dapre2e5c7a0f63e1w

✅ Tests succeeded on linux/amd64

  • Image tag: dapre2e5c7a0f63e1l
  • Test image tag: dapre2e5c7a0f63e1l

@dapr-bot
Copy link
Collaborator

dapr-bot commented Apr 16, 2024

Dapr E2E test

🔗 Link to Action run

Commit ref: bf7d98b

✅ Build succeeded for linux/amd64

  • Image tag: dapre2eba03c361f7l
  • Test image tag: dapre2eba03c361f7l

✅ Infrastructure deployed

Cluster Resource group name Azure region
Linux Dapr-E2E-dapre2eba03c361f7l westus3
Windows Dapr-E2E-dapre2eba03c361f7w westus3
Linux/arm64 Dapr-E2E-dapre2eba03c361f7la eastus

✅ Build succeeded for windows/amd64

  • Image tag: dapre2eba03c361f7w
  • Test image tag: dapre2eba03c361f7w

❌ Tests failed on windows/amd64

Please check the logs for details on the error.

✅ Tests succeeded on linux/amd64

  • Image tag: dapre2eba03c361f7l
  • Test image tag: dapre2eba03c361f7l

@JoshVanL JoshVanL force-pushed the healthz-overhaul branch 2 times, most recently from d491f2a to 85ea707 Compare May 6, 2024 22:36
Currently, the healthz endpoint on Dapr services do not represent the
actual health of the service as they are mostly only spinning up a HTTP
server and immediately returning a 200 OK. This results in services
receiving traffic before they are ready to actually serve requests. This
issue manifests in integration tests where, for example, no metrics are
available on the metrics endpoint even though the service reports as
healthy and therefore fails the test.

This PR overhauls the healthz package to allow registering health
targets during the service `New` process chain. Each module marks their
target as healthy and the healthz server reports the overall health of
the service. Although not perfect, service healthz endpoints should now
be more representative of the actual health of the program.

Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: joshvanl <me@joshvanl.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autoupdate DaprBot will keep the Pull Request up to date with master branch
Projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants