install.sh fails due to slow clickhouse startup process #1033

@pharindoko

Description

Version

2.1.70

Steps to Reproduce

  1. Go to the onpremise folder.
  2. Execute bash -x install.sh --no-user-prompt
  3. Wait until clickhouse is started together with kafka and snuba:
2021-07-18 11:45:33,696 Attempting to connect to Kafka (attempt 0)...
2021-07-18 11:45:35,701 Attempting to connect to Kafka (attempt 1)...
2021-07-18 11:45:37,704 Attempting to connect to Kafka (attempt 2)...
2021-07-18 11:45:37,718 Connected to Kafka on attempt 2
2021-07-18 11:45:37,718 Creating Kafka topics...
++ docker-compose --no-ansi run --rm snuba-api migrations migrate --force
--no-ansi option is deprecated and will be removed in future versions. Use `--ansi never` instead.
Creating sentry_onpremise_snuba-api_run ...
Creating sentry_onpremise_snuba-api_run ... done
Failed to connect to clickhouse:9000
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 260, in connect
    return self._init_connection(host, port)
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 226, in _init_connection
    self.socket = self._create_socket(host, port)
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 221, in _create_socket
    raise err
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 212, in _create_socket
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
Failed to connect to clickhouse:9000
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 260, in connect
    return self._init_connection(host, port)
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 226, in _init_connection
    self.socket = self._create_socket(host, port)
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 221, in _create_socket
    raise err
  File "/usr/local/lib/python3.8/site-packages/clickhouse_driver/connection.py", line 212, in _create_socket
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
Connection to Clickhouse cluster clickhouse:9000 failed (attempt 0)
  1. The connection attempts fail because clickhouse starts slowly: its entrypoint.sh
    recursively chowns the clickhouse folders.
    https://github.com/ClickHouse/ClickHouse/blob/9f529b0c25f1e092651355775355b5cee57c259f/docker/server/entrypoint.sh#L75

  2. Using
    tree -a <clickhouse-directory>
    I get 1078 directories and 76755 files, ca. 4.6 GB in total.

  3. Manually starting entrypoint.sh in the docker container takes at least 5 minutes, spent waiting for chown to finish.

  4. Interestingly, clickhouse is aware of this problem, which is why the following code sits at the top of the entrypoint.sh file:
    https://github.com/ClickHouse/ClickHouse/blob/9f529b0c25f1e092651355775355b5cee57c259f/docker/server/entrypoint.sh#L7

  5. What fixed this error immediately was adding the environment variable CLICKHOUSE_DO_NOT_CHOWN with value 1 to the clickhouse service in the docker-compose file.
    https://github.com/getsentry/onpremise/blob/34812ce837b29f6ff204638b7fd24caeaddb411c/docker-compose.yml#L172
    This means the long-running recursive chown is skipped, so no permissions are set at startup.
    Correct permissions could be handled in another task triggered by install.sh ...
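
The workaround from step 5 can be sketched as a small shell helper that writes a Compose override file instead of editing docker-compose.yml directly. This is only a sketch under assumptions: the function name write_clickhouse_override and the override file name are hypothetical, the service is assumed to be named clickhouse (as the clickhouse:9000 host in the logs suggests), and the version key is assumed to match the main compose file.

```shell
#!/usr/bin/env bash
# Sketch: write a Compose override that sets CLICKHOUSE_DO_NOT_CHOWN=1
# for the clickhouse service. Function and file names are assumptions.

write_clickhouse_override() {
  # $1: target path (defaults to the override file docker-compose merges automatically)
  local target="${1:-docker-compose.override.yml}"
  cat > "$target" <<'EOF'
# "version" is assumed to match docker-compose.yml; adjust if it differs.
version: "3.4"
services:
  clickhouse:
    environment:
      # Tells the ClickHouse entrypoint.sh to skip its recursive chown.
      CLICKHOUSE_DO_NOT_CHOWN: "1"
EOF
}
```

Calling `write_clickhouse_override` in the onpremise folder before running install.sh should make docker-compose pick up the variable, provided install.sh invokes docker-compose without explicit -f options. Ownership of the data directories is then no longer corrected on startup, so it would have to be handled separately, as noted above.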

Expected Result

clickhouse starts up immediately, as expected in container environments, without doing that much background work, and connections do not fail because clickhouse-server is slow to come up.

Actual Result

Traceback (most recent call last):
  File "/usr/src/snuba/snuba/migrations/connect.py", line 30, in check_clickhouse_connections
    check_clickhouse(clickhouse)
  File "/usr/src/snuba/snuba/migrations/connect.py", line 49, in check_clickhouse
    ver = clickhouse.execute("SELECT version()")[0][0]
  File "/usr/src/snuba/snuba/clickhouse/native.py", line 96, in execute
    raise ClickhouseError(e.code, e.message) from e
snuba.clickhouse.errors.ClickhouseError: [210] Connection refused (clickhouse:9000)
Failed to connect to clickhouse:9000
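
Since the failure is a refused TCP connection while clickhouse is still busy in its entrypoint, a stop-gap could be to wait for the port before running the migrations. The following is a minimal sketch, not something install.sh does today: retry_until and port_open are hypothetical helper names, and the probe relies on bash's /dev/tcp redirection, so it assumes bash is available wherever it runs.

```shell
#!/usr/bin/env bash
# Sketch: retry a probe command until it succeeds or attempts run out.
# Both helper names are assumptions, not part of install.sh.

# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 after MAX_ATTEMPTS failures.
retry_until() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while ! "$@"; do
    if (( attempt >= max_attempts )); then
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
  return 0
}

# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened (bash /dev/tcp).
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}
```

For example, `retry_until 60 5 port_open clickhouse 9000` would poll for up to roughly five minutes, assuming it runs somewhere the clickhouse host name resolves, such as a container on the same Compose network. This only hides the slow startup; skipping the chown addresses the cause.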
