Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove inconsistent, expensive, unprofessional-looking self-health-check function #37

Closed
blueedgenick opened this issue Jan 17, 2020 · 6 comments
Assignees

Comments

@blueedgenick
Copy link

PR #22, shortly prior to the CP5.4 release, added a new self-health-check function to the ksql-server image. This new function is invoked every 5 seconds by default (see https://github.com/confluentinc/ksql-images/blob/master/cp-ksql-server/Dockerfile.deb9#L41-L42) and works by posting a show topics; query to the running ksql server.
There are several problems with this:
1 - it is inconsistent with all the other CP docker images, with the possible exception of Connect which looks to have something similar added. This means that when you bring up a CP stack with docker-compose up and then check it's status with docker-compose up, only ksql-server (and ppossibly Connect) report their status as Up (healthy). This gives the misleading impression that the other platform components are NOT healthy - a check like this needs to be all or none of the images.
2 - running show topics; on a "real" cluster is potentially an expensive and time-consuming operation - certainly more than should be running every 5 seconds from every server.
3 - the "woohoo ksql is up!" message is less than professional looking
4 - the check runs too frequently (every 5 seconds by default) and spams the docker logs with verbose output at default logging levels
5 - this check is undocumented - you have to spelunk this private repo code to figure out how to adjust or disable it.

Example log output:

ksql-server        | [2020-01-17 01:46:29,759] INFO 127.0.0.1 - - [17/Jan/2020:01:46:29 +0000] "POST /ksql HTTP/1.1" 200 4000  81 (io.confluent.rest-utils.requests:62)
ksql-server        | [2020-01-17 01:46:34,982] INFO Received: KsqlRequest{ksql='SHOW TOPICS;', streamsProperties={}, commandSequenceNumber=Optional.empty} (io.confluent.ksql.rest.server.resources.KsqlResource:200)
ksql-server        | [2020-01-17 01:46:34,984] INFO AvroDataConfig values:
ksql-server        |    connect.meta.data = true
ksql-server        |    enhanced.avro.schema.support = false
ksql-server        |    schemas.cache.config = 1000
ksql-server        |  (io.confluent.connect.avro.AvroDataConfig:347)
@apurvam
Copy link
Contributor

apurvam commented Jan 24, 2020

@rmoff would you mind picking this up? I would recommend running the command 'show streams' every 10 seconds.

the message can be "KSQL is UP".

This way we test that the ksql server is up and don't need to put an unnecessary load on Kafka. Presumably kafka has it's own healthchecks so we don't need to test it through ksql.

@mikebin
Copy link
Contributor

mikebin commented Feb 9, 2020

Another issue with the current healthcheck is that it doesn’t appear to work in environments with RBAC which have basic auth enabled to the REST endpoint.

mikebin added a commit to mikebin/ksql-images that referenced this issue Apr 5, 2020
agavra pushed a commit that referenced this issue Apr 15, 2020
@strubbsel
Copy link

Another issue with the current healthcheck is that it doesn’t appear to work in environments with SSL client auth being enabled

@MichaelDrogalis
Copy link

Just to check: we can close this because we merged #45, right?

@mikebin
Copy link
Contributor

mikebin commented Oct 29, 2020

@MichaelDrogalis - yes, the Docker healthcheck should be gone now.

@apurvam
Copy link
Contributor

apurvam commented Oct 29, 2020

Great, thanks @mikebin

@apurvam apurvam closed this as completed Oct 29, 2020
agavra pushed a commit that referenced this issue Dec 2, 2021
agavra added a commit that referenced this issue Dec 2, 2021
Co-authored-by: Michael Bingham <michael.b.bingham@gmail.com>
agavra pushed a commit that referenced this issue Dec 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants