Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAMEL-17426: Prevent MicroProfile Health conflicting health data #6779

Merged
merged 1 commit into from Jan 18, 2022

Conversation

jamesnetherton
Copy link
Contributor

I reworked MP health so that it now depends on SmallRye Health & registers camel checks with the relevant liveness or readiness registry. That way each check can write its own details content and not have the chance to overwrite detail coming from other checks.

To avoid generating a very verbose health output (E.g if there many routes configured), I made the results of the routes and consumers checks to be an aggregated one. And only report the first failure encountered on the occurrence of errors. E.g:

{
    "status": "DOWN",
    "checks": [
        {
            "name": "camel-routes",
            "status": "UP"
        },
        {
            "name": "context",
            "status": "UP",
            "data": {
                "context.name": "camel-example",
                "context.version": "3.15.0-SNAPSHOT",
                "context.status": "Started"
            }
        },
        {
            "name": "camel-consumers",
            "status": "DOWN"
            "data": {
                "failure.endpoint.uri": "foo:bar?option=value",
                "failure.error.count": "1",
                "error.message": "the error message",
                "error.stacktrace": "the stack trace"
            }
        },
    ]
}

* Introduce CamelMicroProfileHealthCheckRegistry
* Register individual Camel health check with SmallRye Health registries
* Aggregate routes & consumers health check results

'WIP - complete
@astefanutti
Copy link
Member

astefanutti commented Jan 18, 2022

Is it OK for users that only the first error gets reported? I could see the case where the user checks the error, fixes it, then expect her/his application to run, but realises another error gets reported?

I realise that may not be trivial to report the list of errors, but that's not impossible :)

@jamesnetherton
Copy link
Contributor Author

jamesnetherton commented Jan 18, 2022

I could see the case where the user checks the error, fixes it, then expect her/his application to run, but realises another error gets reported

True... and this is pretty much a continuation of what happened before where all Camel health check results were aggregated.

I realise that may not be trivial to report the list of errors, but that's not impossible :)

If SmallRye Health made it possible to structure the values of the data element, then you could report additional error elements pretty easily (E.g as maps or arrays). But that's not currently possible.

So I guess you're left with the options of either combining & delimiting all of the error data in a single string (not great for human readability) or not aggregating the results of the routes / consumers checks (which I'd prefer to avoid due to the potential verbosity of reporting them as individual items).

Or have additional data elements like <health-check-id>.error.message & <health-check-id>.error.stacktrace.

@astefanutti
Copy link
Member

So I guess you're left with the options of either combining & delimiting all of the error data in a single string (not great for human readability) or not aggregating the results of the routes / consumers checks (which I'd prefer to avoid due to the potential verbosity of reporting them as individual items).

I think the output is more meant to be processed by clients like a GUI or an operator, rather than consumed by humans. We, developers, consume it, but I think that's because we are developers, even if we also happen to be human beings :).

Or have additional data elements like <health-check-id>.error.message & <health-check-id>.error.stacktrace.

That is what I was thinking, or like error.message[health-check-id].

The first point we would need to address is whether we go for individual or aggregated results. Anyway, this PR is already a great improvement over the current format.

@jamesnetherton jamesnetherton merged commit 4865a1c into apache:main Jan 18, 2022
@jamesnetherton jamesnetherton deleted the CAMEL-17426 branch January 18, 2022 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants