Add friendly error messages to health check failures #14072

Open
Copilot wants to merge 9 commits into main from copilot/display-friendly-error-messages

Copilot AI commented Jan 23, 2026

Description

Health checks were displaying raw exception stack traces instead of actionable error messages. Dashboard users saw multi-line exception details like System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection... with no clear indication of what failed.

The root cause was that the third-party UriHealthCheck from AspNetCore.HealthChecks.Uris returns unhealthy results with exception information in the Description field rather than the Exception property. The initial implementation only wrapped results when an exception object was present, allowing most unhealthy results to pass through with raw stack traces.
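The distinction described above can be sketched as a wrapper around an inner check. This is a minimal illustration, not the PR's actual StaticUriHealthCheck: the names DescriptionOnlyCheck and FriendlyMessageHealthCheck are hypothetical, and the stub only imitates a check that puts exception text in Description while leaving Exception null.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical stand-in for a third-party check that reports failure details
// only in Description and leaves the Exception property null.
public sealed class DescriptionOnlyCheck : IHealthCheck
{
    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default) =>
        Task.FromResult(HealthCheckResult.Unhealthy(
            "System.Threading.Tasks.TaskCanceledException: The operation was canceled."));
}

// Hypothetical wrapper that replaces the description of every unhealthy result.
public sealed class FriendlyMessageHealthCheck : IHealthCheck
{
    private readonly IHealthCheck _inner;
    private readonly Uri _url;

    public FriendlyMessageHealthCheck(IHealthCheck inner, Uri url)
    {
        _inner = inner;
        _url = url;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        var result = await _inner.CheckHealthAsync(context, cancellationToken)
            .ConfigureAwait(false);

        // Healthy and degraded results pass through unchanged.
        if (result.Status != HealthStatus.Unhealthy)
        {
            return result;
        }

        // No "result.Exception is not null" guard here: a guard like that is what
        // would let description-only failures through with raw stack trace text.
        // The exception (if any) and data stay attached for debugging.
        return HealthCheckResult.Unhealthy(
            $"Health check failed for {_url}.", result.Exception, result.Data);
    }
}
```

Wrapping DescriptionOnlyCheck with a URL of http://192.0.2.1 yields the description "Health check failed for http://192.0.2.1/." even though no exception object is present.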

Changes

HTTP endpoint health checks (ExternalServiceBuilderExtensions.cs):

  • Created StaticUriHealthCheck and enhanced ParameterUriHealthCheck to wrap UriHealthCheck from AspNetCore.HealthChecks.Uris
  • Fixed: Now wraps ALL unhealthy results from UriHealthCheck, not just those with exception objects
  • Provides friendly error messages for common failure scenarios:
    • Timeouts: "Request to {url} timed out."
    • Explicit cancellation: "Health check for {url} was canceled."
    • HTTP errors with status: "Request to {url} returned {code} {status}."
    • Connection failures: "Failed to connect to {url}."
    • Generic failures: "Health check failed for {url}."
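One way to picture the mapping in the list above is a small classifier from failure kind to message. This is a sketch under stated assumptions: the helper name FriendlyHealthMessages, its parameters, and the choice of exception types for each category are hypothetical and may differ from what the PR does. The status text is rendered from the HttpStatusCode enum name, which is also an assumption.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;

// Hypothetical helper; the classification rules are assumptions for illustration.
public static class FriendlyHealthMessages
{
    public static string Describe(
        Uri url,
        Exception? exception,
        bool callerCanceled,
        HttpStatusCode? statusCode = null)
    {
        // A known HTTP status wins over any exception detail.
        if (statusCode is { } code)
        {
            return $"Request to {url} returned {(int)code} {code}.";
        }

        return exception switch
        {
            // Cancellation requested by the caller's token is not a timeout.
            OperationCanceledException when callerCanceled
                => $"Health check for {url} was canceled.",
            // TaskCanceledException derives from OperationCanceledException.
            OperationCanceledException or TimeoutException
                => $"Request to {url} timed out.",
            HttpRequestException or SocketException
                => $"Failed to connect to {url}.",
            _ => $"Health check failed for {url}."
        };
    }
}
```

Passing callerCanceled as the state of the caller's CancellationToken is what separates "timed out" from "was canceled" in this sketch, since both cases surface as the same exception type.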

Component health checks - Added descriptive error messages for 6 components:

  • Seq: "Request to {url} returned {statusCode}" with proper URI construction and explicit cancellation handling
  • NATS: "Failed to connect to NATS server" / "Connecting to NATS server..." (only for unhealthy/degraded states)
  • Milvus: "Failed to connect to Milvus server"
  • Qdrant: "Failed to connect to Qdrant server"
  • Azure AI Search: "Failed to connect to Azure AI Search service"
  • Azure Web PubSub: "Failed to connect to Azure Web PubSub service"
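The component messages above follow a common shape: attempt a connection, and on failure return a fixed description while keeping the exception attached. A minimal sketch of that shape follows. ConnectionHealthCheck and its probe delegate are hypothetical names and do not correspond to any of the six component implementations.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical generic check; the real component checks call their own clients.
public sealed class ConnectionHealthCheck : IHealthCheck
{
    private readonly string _serviceName;
    private readonly Func<CancellationToken, Task> _probe;

    public ConnectionHealthCheck(string serviceName, Func<CancellationToken, Task> probe)
    {
        _serviceName = serviceName;
        _probe = probe;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        try
        {
            await _probe(cancellationToken).ConfigureAwait(false);

            // No description on the healthy path, so existing output is unchanged.
            return HealthCheckResult.Healthy();
        }
        catch (Exception ex)
        {
            // Friendly text in Description; full detail stays in Exception.
            return HealthCheckResult.Unhealthy($"Failed to connect to {_serviceName}", ex);
        }
    }
}
```

For example, constructing it with "Qdrant server" and a probe that throws produces an unhealthy result described as "Failed to connect to Qdrant server", with the original exception still available on the result.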

Additional improvements from PR review feedback:

  • Fixed ParameterUriHealthCheck to create fresh UriHealthCheckOptions per invocation to avoid duplicate URI accumulation
  • Simplified XML documentation for internal classes to follow Aspire conventions
  • Distinguished between timeout and explicit user cancellation in error messages
  • Removed descriptions from healthy states to maintain backward compatibility
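The duplicate-URI issue in the first bullet can be illustrated with a stand-in options type. AccumulatingOptions below is hypothetical and is not the real UriHealthCheckOptions. It only models the relevant behavior, namely that each AddUri call appends to an internal list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an options type whose AddUri call appends to a list.
public sealed class AccumulatingOptions
{
    private readonly List<Uri> _uris = new();

    public IReadOnlyList<Uri> Uris => _uris;

    public AccumulatingOptions AddUri(Uri uri)
    {
        _uris.Add(uri);
        return this;
    }
}

public static class OptionsLifetimeDemo
{
    // Problem shape: one options instance reused across invocations,
    // so the same URI is appended again on every check.
    private static readonly AccumulatingOptions s_shared = new();

    public static int CheckWithSharedOptions(Uri uri) =>
        s_shared.AddUri(uri).Uris.Count;

    // Fixed shape: a fresh options instance per invocation holds exactly one URI.
    public static int CheckWithFreshOptions(Uri uri) =>
        new AccumulatingOptions().AddUri(uri).Uris.Count;
}
```

Calling CheckWithSharedOptions twice with the same URI leaves two entries in the shared list, while CheckWithFreshOptions sees one entry on every call.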

Exception objects remain attached to HealthCheckResult for detailed debugging when needed.

Example

Before:

System.Threading.Tasks.TaskCanceledException: The operation was canceled.
---> System.IO.IOException: Unable to read data from the transport connection...

After:

Health check failed for http://192.0.2.1/.

or

Request to http://example.com/api/ timed out.

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
    • No
Original prompt

This section details the original issue you should resolve

<issue_title>Display more friendly error message for common health checks errors</issue_title>
<issue_description>### Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

This is what is displayed for a health check that validates whether an endpoint returns 200:

Image

No message is returned, so the health check defaults to displaying the exception message. This doesn't look user friendly.

The health check should have a failure message like "Request to xxx timed out" if there is a timeout. Or "Request to xxx returned yyy status code".

We should do the same for other built-in checks: polling an endpoint for 200, a database for a good response, a message queue, etc. The failure message should explain what has gone wrong in a friendly way. More detail is still available from the exception details if the user wants to view it.

Expected Behavior

No response

Steps To Reproduce

No response

Exceptions (if any)

No response

.NET Version info

No response

Anything else?

No response</issue_description>

Comments on the Issue (you are @copilot in this section)

@davidfowl Maybe we should do this [here](https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks). That way we don't need to do anything special for these health checks.

Development

Successfully merging this pull request may close these issues.

Display more friendly error message for common health checks errors
