Fix YTsaurus crash loop from non-idempotent admin user creation #96297

Merged
alexey-milovidov merged 1 commit into master from fix-ytsaurus-admin-crash-loop
Feb 8, 2026
@alexey-milovidov
Member

When the YTsaurus scheduler times out during startup (40s default), the container exits and Docker restarts it (`restart: always`). On restart, `--create-admin-user` fails because the admin user already exists from the first attempt (data persists via `--local-cypress-dir`). This creates an infinite crash loop where the `/ping` endpoint briefly responds between restarts, giving a false "ready" signal.

Fix by wrapping the entrypoint in bash and using a marker file (`/tmp/.yt_admin_created`) to only pass `--create-admin-user` on the first startup attempt.
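
The marker-file approach described above might look roughly like the following. This is a minimal sketch, not the actual diff from this PR: the function name `build_start_args`, the `MARKER` override variable, and the `YT_START_CMD` placeholder are hypothetical; only the marker path and the `--create-admin-user` flag come from the description. It also assumes the marker is written before the server is launched, which may differ from the merged change.

```shell
#!/usr/bin/env bash
# Sketch only: emit --create-admin-user on the first startup attempt
# and nothing on later attempts, using a marker file as the memory.

# Marker path from the PR description; overridable here for testing (assumption).
MARKER="${MARKER:-/tmp/.yt_admin_created}"

# Hypothetical helper: prints the flag once, then stays silent on every
# subsequent call for as long as the marker file exists.
build_start_args() {
    if [ ! -f "$MARKER" ]; then
        touch "$MARKER"
        printf '%s\n' "--create-admin-user"
    fi
}

# Usage inside a wrapped entrypoint (YT_START_CMD is a placeholder for the
# real start command, which is not shown in this PR description):
#   exec "$YT_START_CMD" $(build_start_args) "$@"
```

Because the restart policy restarts the same container, a file written to the container's filesystem on the first attempt should still be present on later attempts, so the flag is dropped exactly when the admin user may already exist.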

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh Bot commented Feb 7, 2026

Workflow [PR], commit [37843c1]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Integration tests (amd_binary, 5/5) | | failure | | |
| | test_restore_db_replica/test.py::test_query_after_restore_db_replica[alter table-with exists table-no restart] | FAIL | cidb, issue | |

clickhouse-gh Bot added the pr-ci label Feb 7, 2026
alexey-milovidov self-assigned this Feb 8, 2026
alexey-milovidov merged commit 1854a96 into master Feb 8, 2026
132 of 134 checks passed
alexey-milovidov deleted the fix-ytsaurus-admin-crash-loop branch February 8, 2026 01:53
robot-ch-test-poll added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Feb 8, 2026