clickhouse healthcheck appears to be broken #1081

Closed
RazerM opened this issue Sep 1, 2021 · 1 comment

Comments

@RazerM

RazerM commented Sep 1, 2021

Version

21.8.0

Steps to Reproduce

I'm trying to upgrade from 21.6.3 to 21.8.0. It gets stuck at the 'Bootstrapping and migrating Snuba...' step. I debugged this to the clickhouse container remaining in the health: starting state.

The logs below include timestamped output from the clickhouse container indicating it started successfully, and a separate health log showing the healthcheck failed.

Expected Result

Clickhouse starts and the healthcheck works.

Actual Result

The relevant part of the install log:

▶ Bootstrapping and migrating Snuba ...
docker-compose --ansi never run --rm snuba-api bootstrap --no-migrate --force
Creating sentry_onpremise_clickhouse_1 ...
Creating sentry_onpremise_redis_1      ...
Creating sentry_onpremise_zookeeper_1  ...
Creating sentry_onpremise_zookeeper_1  ... done
Creating sentry_onpremise_clickhouse_1 ... done
Creating sentry_onpremise_redis_1      ... done
Creating sentry_onpremise_kafka_1      ...
Creating sentry_onpremise_kafka_1      ... done

ERROR: for snuba-api  Container "409b48d4711a" is unhealthy.
Encountered errors while bringing up the project.
An error occurred, caught SIGERR on line 4
Cleaning up...

docker logs <clickhouse container id>:

2021.09.01 12:28:45.761070 [ 1 ] {} <Information> Application: Listening for http://0.0.0.0:8123
2021.09.01 12:28:45.761345 [ 1 ] {} <Information> Application: Listening for connections with native protocol (tcp): 0.0.0.0:9000
2021.09.01 12:28:45.761598 [ 1 ] {} <Information> Application: Listening for replica communication (interserver): http://0.0.0.0:9009
2021.09.01 12:28:46.128102 [ 60 ] {} <Information> default.transactions_local (TTLBlockInputStream): Removed 53 rows with expired TTL from part 90-20210531_199522_260641_8755
2021.09.01 12:28:46.248275 [ 1 ] {} <Information> Application: Listening for MySQL compatibility protocol: 0.0.0.0:9004
2021.09.01 12:28:46.249096 [ 1 ] {} <Information> Application: Available RAM: 7.79 GiB; physical cores: 4; logical cores: 4.
2021.09.01 12:28:46.249140 [ 1 ] {} <Information> Application: Ready for connections.
Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression

docker inspect --format "{{json .State.Health }}" <clickhouse container id> | jq:

{
  "Status": "unhealthy",
  "FailingStreak": 201,
  "Log": [
    {
      "Start": "2021-09-01T12:38:59.377880793Z",
      "End": "2021-09-01T12:38:59.499099003Z",
      "ExitCode": 1,
      "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"
    },
    {
      "Start": "2021-09-01T12:39:02.504286009Z",
      "End": "2021-09-01T12:39:02.620415968Z",
      "ExitCode": 1,
      "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"
    },
    {
      "Start": "2021-09-01T12:39:05.625188692Z",
      "End": "2021-09-01T12:39:05.754310769Z",
      "ExitCode": 1,
      "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"
    },
    {
      "Start": "2021-09-01T12:39:08.758399383Z",
      "End": "2021-09-01T12:39:08.881333045Z",
      "ExitCode": 1,
      "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"
    },
    {
      "Start": "2021-09-01T12:39:11.886920294Z",
      "End": "2021-09-01T12:39:12.007478603Z",
      "ExitCode": 1,
      "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"
    }
  ]
}
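
For anyone triaging a similar failure, the health JSON above can be condensed to see at a glance whether every probe fails the same way. This is only a rough sketch: `summarize_health` and `SAMPLE` are hypothetical names, and the sample is a trimmed copy of the output above rather than a live `docker inspect` call.

```python
import json
from collections import Counter

# Trimmed copy of the health JSON shown above (two of the five log entries).
SAMPLE = json.dumps({
    "Status": "unhealthy",
    "FailingStreak": 201,
    "Log": [
        {"ExitCode": 1,
         "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"},
        {"ExitCode": 1,
         "Output": "http://localhost:9000/:\nRemote file does not exist -- broken link!!!\n"},
    ],
})


def summarize_health(raw: str) -> dict:
    """Condense the JSON printed by `docker inspect --format "{{json .State.Health }}"`."""
    health = json.loads(raw)
    log = health.get("Log") or []
    # Count identical probe outputs so repeated failures collapse into one entry.
    outputs = Counter(entry.get("Output", "").strip() for entry in log)
    return {
        "status": health.get("Status"),
        "failing_streak": health.get("FailingStreak", 0),
        "exit_codes": sorted({entry.get("ExitCode", -1) for entry in log}),
        "distinct_outputs": list(outputs),
    }


if __name__ == "__main__":
    print(json.dumps(summarize_health(SAMPLE), indent=2))
```

A single distinct output across the whole log, as here, suggests the probe itself is misconfigured rather than the service being intermittently unavailable.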

The only thing that jumps out at me is that the clickhouse log says "Listening for connections with native protocol (tcp): 0.0.0.0:9000", yet the health check is requesting http://localhost:9000/ over HTTP?
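
To make that suspicion checkable, here is a minimal sketch that reads the "Listening for" lines from the clickhouse log and tests whether a healthcheck URL points at a port that was announced with an `http://` scheme. The names `parse_listeners`, `healthcheck_matches` and `SAMPLE_LOG` are hypothetical, the regex is an assumption based only on the four log lines quoted above, and the "http interface" label for the unnamed listener is mine, not ClickHouse's.

```python
import re
from urllib.parse import urlsplit

# Log lines copied from the `docker logs` output above.
SAMPLE_LOG = """\
2021.09.01 12:28:45.761070 [ 1 ] {} <Information> Application: Listening for http://0.0.0.0:8123
2021.09.01 12:28:45.761345 [ 1 ] {} <Information> Application: Listening for connections with native protocol (tcp): 0.0.0.0:9000
2021.09.01 12:28:45.761598 [ 1 ] {} <Information> Application: Listening for replica communication (interserver): http://0.0.0.0:9009
2021.09.01 12:28:46.248275 [ 1 ] {} <Information> Application: Listening for MySQL compatibility protocol: 0.0.0.0:9004
"""

# Assumed line shape: "Listening for [description: ][scheme://]host:port"
LISTEN_RE = re.compile(
    r"Listening for (?P<what>.*?)(?::\s+)?"
    r"(?:(?P<scheme>\w+)://)?(?P<host>\d+\.\d+\.\d+\.\d+):(?P<port>\d+)\s*$"
)


def parse_listeners(log_text: str) -> dict:
    """Map each announced port to its description and whether it was logged as http://."""
    listeners = {}
    for line in log_text.splitlines():
        m = LISTEN_RE.search(line)
        if m:
            listeners[int(m["port"])] = {
                # The first log line has no description; this label is an assumption.
                "description": m["what"] or "http interface",
                "speaks_http": m["scheme"] == "http",
            }
    return listeners


def healthcheck_matches(url: str, listeners: dict) -> bool:
    """True if the URL is http and targets a port the log announced as http://.

    A URL without an explicit port is treated as a mismatch.
    """
    parts = urlsplit(url)
    listener = listeners.get(parts.port)
    return listener is not None and parts.scheme == "http" and listener["speaks_http"]


if __name__ == "__main__":
    found = parse_listeners(SAMPLE_LOG)
    for candidate in ("http://localhost:9000/", "http://localhost:8123/"):
        print(candidate, healthcheck_matches(candidate, found))
```

Against the log above, the check reports a mismatch for `http://localhost:9000/` and a match for port 8123. The interserver port 9009 is also logged with an `http://` scheme, so a match only indicates protocol compatibility, not that the port is the right one to probe.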

@RazerM
Author

RazerM commented Sep 1, 2021

Sorry, I've just seen the commits to master that fix this with a port change...

@RazerM RazerM closed this as completed Sep 1, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Sep 17, 2021