Fix idempotent startup for Flink quickstart docker-compose #15340

Open
kevinjqliu wants to merge 1 commit into apache:main from kevinjqliu:kevinjqliu/add-docker-compose-improvements

@kevinjqliu
Contributor

Follow-up to #15124. I noticed an issue when rerunning the quickstart Docker Compose setup (docker compose -f docker/iceberg-flink-quickstart/docker-compose.yml up -d --build).

Repro

To reproduce, start the quickstart with the command above, then run the Flink SQL statements below using docker exec -it jobmanager ./bin/sql-client.sh.
Rerun the same docker compose command and then the same Flink SQL statements; without this PR, CREATE TABLE fails.
Flink SQL:

CREATE CATALOG iceberg WITH (
    'type' = 'iceberg',
    'catalog-impl' = 'org.apache.iceberg.rest.RESTCatalog',
    'uri' = 'http://iceberg-rest:8181',
    'io-impl' = 'org.apache.iceberg.aws.s3.S3FileIO',
    's3.endpoint' = 'http://minio:9000'
);

CREATE DATABASE IF NOT EXISTS `iceberg`.demo;

CREATE TABLE IF NOT EXISTS `iceberg`.`demo`.sample (
    id   BIGINT   COMMENT 'unique id',
    data STRING   COMMENT 'payload',
    ts   TIMESTAMP(3) COMMENT 'event time'
);

INSERT INTO `iceberg`.`demo`.sample VALUES
    (1, 'alpha',   TIMESTAMP '2026-02-16 10:00:00'),
    (2, 'bravo',   TIMESTAMP '2026-02-16 10:01:00'),
    (3, 'charlie', TIMESTAMP '2026-02-16 10:02:00');

SELECT * FROM `iceberg`.`demo`.sample;

Summary

Fix the Flink quickstart docker-compose.yml so that docker compose up -d --build is safe to rerun without breaking the Iceberg REST catalog.

Problem

The create-bucket init container ran mc rm -r --force minio/warehouse on every execution, wiping all S3 data (metadata JSON, Parquet files, Avro manifests). However, the Iceberg REST catalog's SQLite database persisted inside its running container, leaving it with stale references to deleted metadata files. Any subsequent table operation would fail with:

NotFoundException: Location does not exist: s3://warehouse/demo/sample/metadata/00001-....metadata.json

Changes

  • Idempotent bucket creation: Replace destructive mc rm -r --force + mc mb with mc mb --ignore-existing to create the bucket only if it doesn't exist
  • Prevent re-execution on rerun: Add tail -f /dev/null to keep the create-bucket container alive, so docker compose up treats it as already running
  • Healthcheck-gated startup: Add a healthcheck (mc ls minio/warehouse) to create-bucket and update iceberg-rest to depend on service_healthy, ensuring the bucket is verified to exist before the catalog starts
  • Fix deprecated CLI: Replace mc policy set with mc anonymous set to avoid deprecation warnings
  • Remove redundant retry loop: The until loop in create-bucket is no longer needed since it now depends on minio: service_healthy
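
The changes above could be sketched as the following compose fragment. This is a minimal illustration under stated assumptions, not the actual diff: the service names create-bucket, iceberg-rest, and minio and the endpoint http://minio:9000 come from this description, while the image names, the admin/password credentials, the public policy argument, and the healthcheck timings are placeholders.

```yaml
services:
  # Assumed: a `minio` service with its own healthcheck is defined elsewhere in this file.
  create-bucket:
    image: minio/mc                      # assumed image; must provide /bin/sh and mc
    depends_on:
      minio:
        condition: service_healthy       # replaces the old `until` retry loop
    # Idempotent init: no `mc rm -r --force`, and `mc mb --ignore-existing`
    # succeeds whether or not the bucket already exists.
    # `tail -f /dev/null` keeps the container running so that a second
    # `docker compose up` does not execute the init commands again.
    # Credentials and the `public` policy below are placeholders.
    entrypoint: >
      /bin/sh -c "
      mc alias set minio http://minio:9000 admin password &&
      mc mb --ignore-existing minio/warehouse &&
      mc anonymous set public minio/warehouse &&
      tail -f /dev/null
      "
    healthcheck:
      # Healthy only once the bucket can actually be listed.
      test: ["CMD", "mc", "ls", "minio/warehouse"]
      interval: 2s                       # assumed timings
      timeout: 5s
      retries: 15

  iceberg-rest:
    image: apache/iceberg-rest-fixture   # assumed image
    depends_on:
      create-bucket:
        condition: service_healthy       # catalog starts only after the bucket is verified
```

Because the entrypoint never exits on success, the dependency condition has to be service_healthy rather than service_completed_successfully, which is why the healthcheck and the tail -f /dev/null change go together.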

Behavior

| Command | Before | After |
| --- | --- | --- |
| docker compose up -d --build (first) | Works | Works |
| docker compose up -d --build (rerun) | Broken: S3 wiped, catalog has stale refs | Works: no-op, state preserved |
| docker compose down && up | Works (fresh start) | Works (fresh start) |

@kevinjqliu
Contributor Author

cc @rmoff weird edge case 😄
