Fixes for various systems by alexey-milovidov · Pull Request #903 · ClickHouse/ClickBench

alexey-milovidov · 2026-05-14T20:27:42Z

No description provided.

Several systems' load scripts do `sudo mv hits_*.parquet /var/lib/<engine>/user_files/` or `sudo cp hits.csv .../extern/` followed by `chown` to the daemon's user. The mv/cp copies 14-75 GB of data the daemon reads once during INSERT and we delete right after: a complete waste of bytes on disk and time on the wire.

Replace with `ln -s` + `chown -h` where the daemon's user-files dir is on a different filesystem from the dataset. `chown -h` chowns the symlink itself rather than following into the (often read-only) original; the underlying dataset is mode 644 anyway, so daemon processes can read through the symlink as their own user.

Systems updated: clickhouse, clickhouse-tencent, pg_clickhouse, kinetica, oxla, ursa, arc, cockroachdb.

Motivated by the ClickBench playground (Firecracker microVM service) where the dataset is mounted read-only and shared across all VMs; the copy step was the dominant cost on parquet/csv-format systems and pulled 14 GB into the per-VM snapshot golden disk unnecessarily. The change is also benign for the regular benchmark: daemons still read the same bytes, just through a symlink.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
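The load scripts themselves are shell, but the symlink-plus-`chown -h` idea can be sketched in Python for illustration. The function name `link_dataset` and its parameters are hypothetical and not part of the PR; `follow_symlinks=False` is the Python counterpart of `chown -h`.

```python
import os


def link_dataset(src: str, dest_dir: str, uid: int, gid: int) -> str:
    """Symlink src into dest_dir and chown the link itself (like `chown -h`)."""
    link = os.path.join(dest_dir, os.path.basename(src))
    if os.path.lexists(link):
        os.remove(link)  # replace a stale link left by an earlier run
    os.symlink(os.path.abspath(src), link)
    # follow_symlinks=False changes the owner of the link, not of the
    # (possibly read-only) file it points at.
    os.chown(link, uid, gid, follow_symlinks=False)
    return link
```

No dataset bytes are copied: the daemon reads the original file through the link, which is why a world-readable (mode 644) source is sufficient.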

ClickBench: fix elasticsearch load.py bytes/str mix

VM tweaks for the long tail of failures:
- chdb-dataframe / duckdb-dataframe materialize the full hits dataset in process memory and need >32 GB. Default to 48 GB.
- Druid / Pinot / similar JVM stacks take 5-10 min to come up (Zookeeper → Coordinator → Broker → Historical, in sequence). The agent's 300 s check-loop wasn't enough; widen to 900 s.

elasticsearch/load.py: gzip.open in mode='rt' returns str docs, but bulk_stream yields bytes for ACTION_META_BYTES and str for the doc. requests.adapters.send() calls sock.sendall() on the mixed iterable and crashes with `TypeError: a bytes-like object is required, not 'str'`. Open in 'rb' so docs are bytes, matching the rest of the generator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
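A minimal sketch of the bytes-only generator described above. The names `bulk_stream` and `ACTION_META_BYTES` come from the commit message, but the body and the constant's value here are assumptions, not the actual load.py.

```python
import gzip

# Hypothetical value; the real constant lives in elasticsearch/load.py.
ACTION_META_BYTES = b'{"index":{}}\n'


def bulk_stream(path):
    """Yield bulk-API lines as bytes only, never mixing in str."""
    # Mode "rb" makes every line a bytes object; "rt" would yield str and
    # produce the mixed iterable that sock.sendall() rejects.
    with gzip.open(path, "rb") as docs:
        for doc in docs:
            yield ACTION_META_BYTES
            yield doc if doc.endswith(b"\n") else doc + b"\n"
```

Because every chunk is bytes, the generator can be handed to an HTTP client as a streaming request body without any per-chunk encoding step.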

Snapshot pipeline:
- /opt/clickbench-playground reformatted as XFS so cp --reflink=always can clone golden->working in milliseconds.
- _snapshot_disks and _restore_disks switched to reflink (parallel, O(1) extent-list copies).
- snapshot.bin no longer compressed; Firecracker mmaps it on restore, pages fault in lazily.
- Snapshot is taken with the daemon running: pre-snapshot stop + fstrim + drop_caches is followed by start + check, so restore resumes a live daemon and the first query pays no cold-start cost.
- _snapshot_disks runs while VM is paused, before resume. Without this the daemon's post-snapshot kernel writes (journal commits, atime) leaked into the golden disk and surfaced as ext4 EBADMSG on restore.

Agent + host wiring:
- New /ready endpoint on the in-VM agent; _restore_snapshot waits for /ready (up to 10 min) before reporting state="ready" so slow JVMs like Doris/Druid don't time out on the user's first query.
- dockerd restart hook at agent boot; without it docker-using systems fail to launch containers after snapshot restore.
- Output streamed and capped at OUTPUT_LIMIT+1 bytes (default 64 KB) with head-style early-kill; default query timeout 600 -> 60 s.
- /api/query no longer triggers initial provisioning. Only restore. Initial provision requires explicit /api/admin/provision/<name>.
- /api/queries/<name> returns the system's example queries.
- _call_agent_provision: no aiohttp idle timeout, 7-day total cap.
- ClickHouse-family stays on the internet after snapshot (datalake variants need S3); rest stays offline.

Catalog:
- paradedb-partitioned (pg_lakehouse removed upstream) and pg_duckdb-motherduck (needs cloud creds) excluded.
- ClickHouse + chdb variants emit Pretty format.
- ClickBench: trino/presto-datalake javac classpath uses find for AWS SDK / Hadoop jars instead of pinning a stale jar filename.
- ClickBench: cedardb/cedardb-parquet/mongodb start scripts hardened (systemctl restart docker, longer wait windows, better diagnostics).
- ClickBench: duckdb start scripts scrub stale *.wal.
- ClickBench: arc start broadened admin-token regex.

UI:
- Catalog rendered as horizontal slabs, colored by state.
- Per-system result cache (output + timing) keyed by system name.
- Example-query selector populated from /api/queries/<name>.
- Down systems swap the query pane for a "Last error" pane.
- Stats row trimmed to time + truncated marker.
- monospace font, no rounded corners, black selected outline.
- Spellcheck / autocomplete / Grammarly opt-outs on the textarea.

Bootstrap:
- install-firecracker.sh: chown only the top-level state dirs, not recursively (a chown -R was descending into a base-rootfs build's loop mount and flipping /etc/sudoers to uid 1000).
- install-firecracker.sh checks the state dir supports reflink and exits with an XFS-format hint if not.
- download-datasets.sh fetches hits.json.gz (used by parseable).
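The "capped at OUTPUT_LIMIT+1 bytes with head-style early-kill" item can be illustrated with a small sketch. `run_capped` is a hypothetical helper, not the agent's actual code; it omits the query timeout and assumes the child's stdout and stderr are merged.

```python
import subprocess
import sys

OUTPUT_LIMIT = 64 * 1024  # default named in the commit message


def run_capped(argv, limit=OUTPUT_LIMIT):
    """Run argv and return (output, truncated), reading at most limit+1 bytes."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    chunks, total = [], 0
    while total <= limit:
        chunk = proc.stdout.read1(limit + 1 - total)
        if not chunk:  # EOF: the child finished within the limit
            break
        chunks.append(chunk)
        total += len(chunk)
    # Seeing byte limit+1 proves there was more output than we will keep.
    truncated = total > limit
    if truncated:
        proc.kill()  # like `head`: stop the producer instead of draining it
    proc.stdout.close()
    proc.wait()
    return b"".join(chunks)[:limit], truncated
```

Reading one byte past the limit is what distinguishes "exactly `limit` bytes of output" from "truncated", which is presumably why the cap is stated as OUTPUT_LIMIT+1.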

…uery The bench-vortex binaries (clickbench / query_bench) emit only gh-json timing blobs — no rows — which makes the system useless in the interactive playground. Replace ./query with the datafusion (Parquet) flow: datafusion-cli reads parquet directly via create.sql. - install: cargo install datafusion-cli 49.0.2 (vortex bench build stays for benchmark.sh continuity). - create.sql (new): same shape as datafusion's, points at hits.parquet / partitioned/. - load: symlink parquet files via the playground stub instead of invoking the vortex driver's warmup pass. - query: identical to datafusion / datafusion-partitioned.

Recent Arc debs (25.12+) link against GLIBC_2.38; Ubuntu 22.04 has 2.35 and the daemon dies with libm.so.6 'GLIBC_2.38 not found'. Extract Ubuntu noble's libc6 into /opt/glibc-noble and replace /usr/bin/arc with a wrapper that exec's the real binary via that ld-linux + --library-path. Leaves the rest of the system on 22.04's glibc.

opteryx.query() returns a Cursor / Relation; list(cursor) iterates arrow batches rather than rows in current versions, so the prior script ended up printing nothing for every query. Probe arrow() / to_arrow_table() / fetchall() in turn and render the result as TSV (header + rows) when it's a pyarrow Table, falling back to iteration.
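A hedged sketch of the probing strategy described above. The accessor names are the ones listed in the commit message; whether a given opteryx version exposes each one is not verified here, and `to_tsv` is a hypothetical name. A pyarrow Table is detected by duck typing (`column_names`, `to_pylist()`), so the sketch does not import pyarrow.

```python
def to_tsv(result) -> str:
    """Render a query result as TSV, probing known accessor names in turn."""
    for name in ("arrow", "to_arrow_table", "fetchall"):
        accessor = getattr(result, name, None)
        if callable(accessor):
            result = accessor()
            break
    # pyarrow.Table exposes column_names and to_pylist(); use them if present.
    if hasattr(result, "column_names") and hasattr(result, "to_pylist"):
        cols = list(result.column_names)
        lines = ["\t".join(cols)]  # header row
        for row in result.to_pylist():  # one {column: value} dict per row
            lines.append("\t".join(str(row[c]) for c in cols))
        return "\n".join(lines)
    # Fallback: treat whatever we have as an iterable of row tuples.
    return "\n".join("\t".join(str(v) for v in row) for row in result)
```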

docker compose down --volumes deleted the anonymous volumes that hold byconity's HDFS namenode state and the bench database, so after the playground's pre-snapshot stop and post-snapshot restore the schema was gone. Use docker compose stop instead — it leaves containers (and their volumes) intact.

…ries.sql

drill-embedded crashes with ExceptionInInitializerError / "Could not initialize class RootAllocator" on JDK 17+ when java.base/java.nio (and neighbours) aren't opened to ALL-UNNAMED. Set DRILL_JAVA_OPTS on the docker run so every fresh JVM start in ./query has the right flags. playground: trust duckdb-datalake{,-partitioned} for outbound internet These two read from S3 at query time, same as clickhouse-datalake.

mongod 8.0.23 hardcodes "MongoDB cannot start: Linux kernel versions 6.19 and newer" and exits; the playground's guest kernel is 7.0. The MongoDB 7.0 release line predates that check and runs fine.

The CLI was invoked with --output-format=NULL, which suppresses the result. Switch to ALIGNED so the playground shows actual rows. The benchmark still gets its timing from the wall-clock around the java -jar invocation.

Apache Doris -> Doris across the doris and doris-parquet template.json, every historical results/*.json, the get-result-json.sh helpers, and the rendered data.generated.js. README.md and install scripts keep 'Apache Doris' since those reference the upstream project name.

…tart' After a snapshot+restore the on-disk monetdbd lock files (.merovingian.lock, .gdk_lock) outlive the process. `monetdbd start` then exits with "another monetdbd is already running" without relaunching, mclient has no server to reach, and ./check spins for the agent's full 15-minute timeout. - If no monetdbd is actually running, wipe the lock files and any stale mserver5 before invoking monetdbd start. - After daemon-side relaunch, `monetdb start test` is also needed to actually start mserver5 for the database; `monetdb release` only un-marks maintenance. - Wait up to 60 s for mclient to go through; bail loudly with status output instead of leaving the agent's blind 15-min poll loop.

dpkg-deb -x preserves the package's internal layout — the deb stores the loader at /usr/lib64/ld-linux-x86-64.so.2, so extraction at $NOBLE_DIR puts it at $NOBLE_DIR/usr/lib64/..., not $NOBLE_DIR/lib64/... arc was failing with 'No such file or directory' on every restart.

don't hang on the credentials timeout DuckDB's S3 driver probes 169.254.169.254 for IAM credentials before each S3 request. The playground's SNI proxy blocks that IP (correctly, it's host metadata); each query then waits the full IMDS timeout before falling through to anonymous access, which the user sees as a hang. - duckdb-datalake: switch from s3://... to the HTTPS URL directly. httpfs reads HTTPS with no credential lookup at all. - duckdb-datalake-partitioned: keep s3:// (httpfs has no HTTPS glob), but add a CREATE SECRET ... TYPE S3 ... KEY_ID '' that short- circuits the credential chain to anonymous and skips IMDS.

Same root cause as the duckdb-datalake hang: ClickHouse's S3 engine probes 169.254.169.254 for IAM credentials before each request. The playground's SNI proxy blocks IMDS (correctly — it's host metadata), and each query waits the full timeout before falling through to anonymous. NOSIGN tells the S3 engine to skip the credential chain entirely and make anonymous requests, which the public bucket accepts.

Upstream bench-vortex no longer exposes a 'clickbench' bin target; only compress, public_bi, query_bench, random_access remain. Match the partitioned variant and build query_bench. The playground's ./query path uses datafusion-cli either way, but install needs to succeed for the system to provision.

CREATE TABLE in ./load failed with "Table replication num should be less than of equal to the number of available BE nodes" because the blind 'sleep 30' after ALTER SYSTEM ADD BACKEND wasn't enough on a fresh cold start — BE registration was still in progress when the load script proceeded. Replace with two active waits: first for FE to accept connections, then poll SHOW BACKENDS until Alive=true. Also make the script idempotent on both fronts (FE up + BE alive).
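The move from a blind `sleep 30` to active waits follows a generic pattern, sketched below. `wait_until` is a hypothetical helper; in the real shell script the probes would be "FE accepts a connection" and "SHOW BACKENDS reports Alive=true".

```python
import time


def wait_until(probe, timeout_s, interval_s=1.0):
    """Call probe() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # a refused connection just means "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Two such waits run back to back (FE first, then BE); each returns as soon as its condition holds, so a warm start pays almost nothing and a cold start gets the full budget.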

Two dataframe systems were carrying both files: - queries.sql with SQL equivalents that nothing ran - queries.py with the Python expressions the server actually eval'd plus a BENCH_QUERIES_FILE=queries.py override in benchmark.sh. Drop the unused queries.sql, rename queries.py -> queries.sql, drop the override. The lib/benchmark-common.sh default BENCH_QUERIES_FILE=queries.sql now picks up the Python expressions unchanged. Updated docstrings in server.py to note the new file contains Python (the filename matches the cross-system convention).

…ectly polars/server.py kept a 43-entry list of (sql_string, lambda) tuples and the /query endpoint did a dict lookup. Replace with the polars-dataframe pattern: /query takes a Python expression and eval()s it against the loaded LazyFrame, with hits/pl/date in scope. queries.sql now holds those Python expressions (one per line), same shape as polars-dataframe. The load remains lazy (pl.scan_parquet without collect) so the streaming behaviour that distinguishes polars from polars-dataframe is preserved.

install was bumped from 33.0.0 to 37.0.0 (33.0.0 was retired from the apache mirror) but start/load/data-size still pointed at the old $DRUID_DIR. start launched nothing — directory didn't exist — and the agent timed out the 900 s check loop.

The previous attempt set DRILL_JAVA_OPTS only — that env var is consumed by drillbit.sh, but drill-embedded launches sqlline which reads JAVA_OPTS instead. The RootAllocator's static init still failed on JDK 17+ inside the embedded JVM with ExceptionInInitializerError. Set DRILL_JAVA_OPTS / DRILL_SHELL_JAVA_OPTS / JAVA_OPTS / _JAVA_OPTIONS so whichever path the apache/drill launcher follows picks the flags up. Also widen the list of --add-opens (lang.reflect, util.concurrent, jdk.internal.misc/ref) which Arrow's allocator touches.

Previous fix got the daemon running ('test' DB shows R 100% in monetdb status) but mclient was still failing. Without -h, mclient hunts for a unix socket that doesn't always exist after a restart, and a hostname mismatch can stall TCP. Use -h 127.0.0.1 (both in check and start), and extend the wait to 120 s while logging mclient's actual error on failure for diagnosis.

CREATE SECRET in create.sql is session-scoped: it lives only inside the duckdb process that ran ./load. When ./query reopens hits.db the secret is gone, so DuckDB falls back to the us-east-1 default, ListObjects against the eu-central-1 bucket gets a 301 redirect that the playground's SNI-filter proxy can't follow cleanly, and Q1 on the partitioned variant hangs the full 60 s. Apply the SET s3_region + CREATE OR REPLACE SECRET on every duckdb invocation in ./query so the region is always correct and the credential chain never probes IMDS.

monetdb: Feed SQL via stdin instead of -s 'STMT' on the mclient command line. The previous flag form caused mclient to dump --help (and exit non-zero) instead of running the SELECT 1 health probe, even though the daemon was listening on 127.0.0.1:50000. check + start both switched to echo | mclient.

siglens: siglens 1.0.54's go.mod requires Go 1.21+. Ubuntu's `golang` package is 1.18, so go mod tidy fails. Install go 1.22.7 from the official tarball into /usr/local/go and prepend to PATH.

starrocks: FE was reporting "FE saved address not match backend address" after snapshot+restore because `hostname -i` could return different IPs between runs. Pin priority_networks=127.0.0.1/32 in both fe.conf and be.conf, register the BE under 127.0.0.1:9050, and drop any stale BE entry before re-adding. Stable rendezvous across restarts.

tidb: cluster_info reports a tiflash row before the placement driver is ready to accept SET TIFLASH REPLICA. ./load failed with "tiflash server count: 0" on a fresh playground. Add a second phase to start: probe with a throwaway table until ALTER TABLE ... SET TIFLASH REPLICA 1 actually succeeds, then drop the probe.

parseable and victorialogs both decompress hits.json.gz inside the VM at load time. The 75 GB output overflows the 200 GB sysdisk (after the install / wget overhead) and load fails with "No space left on device". Pre-decompress on the host once and make the file available to load scripts as a symlink. - playground/images/build-base-rootfs.sh: add lib stubs download-hits-json and download-hits-json-gz so per-system load scripts can pick up the file via the same pattern as the other formats. - playground/scripts/download-datasets.sh: extends the host-side download flow to decompress hits.json.gz into datasets/hits.json. - parseable/load: prefer the pre-decompressed file; the legacy wget + gunzip path stays as a fallback for standalone use. - victorialogs/load: same — use the read-only hits.json directly with split -n r/8.

datafusion-vortex{,-partitioned}: Check looked for the renamed `clickbench` binary; switch to `command -v datafusion-cli` (matches what ./query actually uses) and have install symlink ~/.cargo/bin/datafusion-cli into /usr/local/bin so the agent's stripped PATH finds it.

duckdb-datalake{,-partitioned}: readlink -f the installed duckdb before symlinking into /usr/local/bin so the link stays valid regardless of $HOME at provision/query time.

gizmosql: v1.26+ links GLIBC_2.38; Ubuntu 22.04 has 2.35. Wrap both binaries with the same noble-loader trick used for arc.

pinot: start/load referenced the old 1.3.0 dir; install was bumped to 1.5.0. Sync.

systems.py:
- disable pandas (peak RSS ~30 GB OOMs the 16 GB VM).
- disable paradedb (postgres crashes during index VACUUM under 16 GB).

MongoDB doesn't publish a 7.0 apt repository for noble, only 8.0+. But 8.0 has the hardcoded "Linux kernel >= 6.19" refusal we need to avoid (the playground guest runs kernel 7.0). The MongoDB 7.0 packages built for jammy depend on libssl3 which noble also provides, so they install fine on a 24.04 base.

…body pinot: schema 1.5.0 requires schemaName == tableName; rename schema 'hitsSchema' to 'hits'. Also wait for the controller's REST API to be live before AddTable and drop the silent '|| true' that masked genuine failures. daft-parquet-partitioned: drop daft==0.7.4 version pin — the old release doesn't have col(...).decode('utf-8'), which made /load return 500 'Internal Server Error' before the data ever loaded. vm_manager: when a /provision fails, save the full agent response body to logs/provision-<system>.log so the real failure (often in start/check/load) is recoverable; the 2000-byte tail in the exception message usually catches only the install epilogue. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…epared hits.json monetdb: drop the query.expect wrapper. The interactive 'password:' expect cycle silently dumped --help on certain mclient builds and broke load/check. Use 'mclient -h 127.0.0.1 -P monetdb' driven from stdin instead — credentials inline, no PTY, no expect timeout. siglens: prefer /opt/clickbench/datasets_ro/hits.json (pre-decompressed 217 GB on the readonly dataset disk) over running pigz inside the VM. Previously the in-VM gunzip blew the 200 GB rootfs ENOSPC partway in. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pinot's QuickStart -type batch keeps the controller's segment store in /tmp (a tmpfs). The pre-snapshot stop/start cycle wipes that, and the snapshotted daemon never re-registers the hits table on restore: /ready stays false, the host's 600s budget expires, and /api/query times out. Treat pinot like the dataframe systems: preserve the running daemon's state across snapshot. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…tion Pandas (and the other in-process dataframe systems) had /query return only {"elapsed": <s>}. The playground UI then surfaced '{"elapsed": 0.026}' as the entire query output, which is the timing — not the value the user asked for. server.py for pandas, chdb-dataframe, duckdb-dataframe, polars-dataframe, daft-parquet, daft-parquet-partitioned now also returns {"result": str(...)} (ClickHouse Pretty for chdb, repr for everything else; the agent's OUTPUT_LIMIT caps it before it crosses the host boundary). query scripts rewritten to feed the JSON body through a python3 heredoc so they print {result} on stdout and {elapsed} on stderr — matches the cross-system shell contract. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
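The stdout/stderr contract described above can be sketched as follows. `emit` is a hypothetical name; the real query scripts do this inside a python3 heredoc fed from the server's JSON response.

```python
import json
import sys


def emit(body: str, out=sys.stdout, err=sys.stderr) -> None:
    """Split a /query JSON reply into result (stdout) and elapsed (stderr)."""
    reply = json.loads(body)
    print(reply.get("result", ""), file=out)  # what the user asked for
    print(reply["elapsed"], file=err)         # timing, parsed separately
```

Keeping the timing on stderr means the benchmark's timing extraction and the playground's result pane each read their own stream and never need to strip the other's data.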

Cross-system convention is queries.sql. Renaming victorialogs and mongodb to match means handle_queries can stay simple — drop the multi-extension fallback added a moment ago. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… S3 plugin is gone) trino:latest no longer ships the hadoop-S3 plugin, so the fs.hadoop.enabled=true + custom AWSCredentialsProvider shim path is broken: 'External location is not a valid file system URI: s3://...'. Switch hive.properties to fs.native-s3.enabled=true with region + endpoint set explicitly; the public bucket allows unauthenticated GETs and the AWS SDK falls through its default credentials chain to anonymous when no creds are configured. The shim + core-site.xml mounts in start stay around as no-ops for now. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The original benchmark setup used --output-format=NULL because it measures timing only; under the playground that produces 200 OK with an empty body, which the UI faithfully shows as '(no output)'. Switch to ALIGNED — same human-readable table presto* uses — so the saved row + the UI both have something to display. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…host swap umbra still OOMs at create.sql:109 with the 256 GB swap on the host: docker's default cgroup setup gives the container the host's full memory but Umbra's own allocator caps itself at the cgroup's 'available' figure, which lands near the 16 GB physical RAM. Pin the container to 128 GB with unlimited swap so Umbra's allocator sees enough room to load the table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

apache/drill ships a JDK whose CgroupV2Subsystem.getInstance() NPEs when 'anyController is null' — happens on the playground VM where cgroup2 is mounted with no controllers visible to the container. The NPE killed RootAllocator init and every query returned 'Could not initialize class org.apache.drill.exec.memory.RootAllocator' with no other visible output beyond the JVM picking up the _JAVA_OPTIONS env line. Turning off the JVM's container-aware sysinfo path with -XX:-UseContainerSupport skips the broken code; SELECT now works end-to-end (verified: SELECT 1 -> '1 row selected (1.105 seconds)'). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The original runner used '.mode trash' to keep timing parsing clean by throwing away result rows. Under the playground that yielded an empty result body even when the query succeeded. '.mode box' renders a readable table; the 'Run Time:' line still matches the timing regex. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…nvisible to parallel/sh GNU parallel runs each job under /bin/sh (not bash) by default, and 'export -f ingest_chunk' only carries the function into bash children. The chunks were silently routed into a non-existent command name, parallel exited 0, the load took 4700+ s, and the table came back with 0 rows. Inline the awk + curl pipeline as parallel's literal command string so it's interpreted directly by /bin/sh. Add curl --fail --show-error so an HTTP error from /api/v1/ingest now propagates to the load script's exit code. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…es.sql The error-detection branch in quickwit/query only looked at .error / .status. Quickwit returns the JSON-parse failure as {"message": "expected value at line 1 column 1"} which the old check missed, so the playground recorded the failed query as a success. Add .message + a 'no .took' fallback so any shape of malformed response surfaces as exit 1. Also rename the workload file from queries.json to queries.sql (removing the cosmetic SQL one that was sitting alongside) so the playground UI picks it up via the standard handle_queries path. Quickwit consumes Elasticsearch DSL JSON; the .sql name is just the cross-system convention for the file the playground reads. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
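A sketch of the widened failure check, written in Python for illustration (the actual quickwit/query script is shell). Treating the mere presence of `error`, `status`, or `message` as failure is an assumption about how the script tests those fields; `query_failed` is a hypothetical name.

```python
import json


def query_failed(raw: str) -> bool:
    """Return True for any response shape that is not a normal search reply."""
    try:
        reply = json.loads(raw)
    except ValueError:
        return True  # not JSON at all
    if not isinstance(reply, dict):
        return True
    if any(key in reply for key in ("error", "status", "message")):
        return True
    # Fallback: a successful search reply carries a "took" field.
    return "took" not in reply
```

The final `took` check is the catch-all: an error shape nobody anticipated still fails because it lacks the one field every successful reply has.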

…starts tiup playground generates a fresh data dir per invocation if no --tag is given. Pre-snapshot stop killed the loaded cluster and the subsequent pre-snapshot start spun up a brand new one; the snapshot captured the empty replacement and queries against the restored VM returned 'Table test.hits doesn't exist'. Pin --tag clickbench so the load survives stop/start. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Post-snapshot-restore, ES responds on :9200 but shards are still recovering. Queries land before allocation completes and fail with no_shard_available_action_exception (status 503). Make start/check poll _cluster/health/hits and require active shards before returning success. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
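One way the readiness gate could evaluate a `_cluster/health/hits` response is sketched below. The field names are those of the Elasticsearch cluster health API; accepting `yellow` as well as `green` is an assumption, since the commit only says "require active shards", and `index_ready` is a hypothetical name.

```python
import json


def index_ready(health_json: str) -> bool:
    """Decide from a cluster-health response whether queries can be served."""
    health = json.loads(health_json)
    return (
        not health.get("timed_out", False)
        and health.get("status") in ("yellow", "green")  # assumption, see above
        and health.get("active_shards", 0) > 0
    )
```

A port that answers on :9200 is not enough; only a response that passes this check means a query will find its shards allocated.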

gizmosql_client is a DuckDB-cli fork. With stdout piped, DuckDB-cli truncates any table taller than the default page to a "<N> rows (<M> columns)" summary — even under .mode box. Setting .maxrows -1 and .maxwidth 0 disables both axes of truncation so the user sees the actual rows. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…nymous shim trino:latest no longer ships the legacy hadoop-S3 plugin (removed in v461). The replacement native-S3 filesystem has no anonymous-creds mode, so it can't read the public clickhouse-public-datasets bucket even with region/endpoint set — the URI is rejected outright with 'External location is not a valid file system URI: s3://...'. Pin trino:455 (last release with hadoop-S3) and restore the fs.hadoop.enabled=true + S3AnonymousProvider shim path that was working until the recent :latest bump. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The recent 16 -> 96 GiB memory override was unfair to every other engine. Revert it; do what we did for the dataframe systems instead:
- raise the docker cap from --memory=128g to --memory=256g to allow swap-backed growth, keep --memory-swap=-1, add --memory-swappiness=100 so the cgroup pages out anon memory aggressively the moment we exceed physical RAM
- flip the guest's vm.overcommit_memory to 1 and vm.swappiness to 100 inside ./start so the kernel stops refusing the large mmap requests Umbra issues during COPY

Removes MEM_OVERRIDES_MIB and the vm_manager plumbing for it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The AWS-SDK + Hadoop-jar name-glob no longer matches anything in trino:455 (the dependency tree shifted between releases), so the S3AnonymousProvider compile dies with 'package com.amazonaws.auth does not exist'. Always use the full /usr/lib/trino/**/*.jar classpath; the shim has no class-name collisions to worry about. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Druid's JVMs survive a snapshot restore but the SQL stack stays dead for 10+ minutes, likely ZK session skew across the snapshot boundary. The old druid/start only checked /status (router up fast), so it returned 'idempotent: nothing to do' and queries kept landing on the broken SQL endpoint.
- druid/start: probe SELECT 1 with a 5s curl, and on failure pkill -KILL every druid JVM and cold-start the stack.
- druid/check already uses SELECT 1 so it's the right gate.

Independently, even with start fixed, /ready was reporting ready=true throughout the post-restore window because _daemon_started.is_set() is restored from the snapshot's Python memory. The host's _wait_for_daemon_ready passed instantly, /query landed mid-rebuild, and the 60s host budget fired. Fix:
- add a btime watcher thread that calls _maybe_reconcile_for_restore every second, so the moment the VM resumes the watcher clears _daemon_started and spawns _ensure_daemon_started off-thread.
- /ready also calls _maybe_reconcile_for_restore so a host probe can't beat the watcher.
- _maybe_reconcile_for_restore now kicks _ensure_daemon_started in a thread itself (it was previously synchronous-only from /query; the watcher must not block).
- bump _ensure_daemon_started's check loop from 60s to 10 min so slow daemons (Druid, Doris, Pinot) actually reach pass before /ready flips.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Parseable filters every query by [startTime, endTime] against the row's ingest timestamp. The benchmark script used today's calendar day, which is fine in a one-shot run-on-the-day-you-loaded-it benchmark — but in the playground we load during provisioning, snapshot the result, and then queries run hours-to-days later. Every row falls outside today's window and the result is always zero. Use [2000, 2099] so any plausible load + query date is included. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
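The effect of the window choice can be checked with a little date arithmetic. This is an illustrative sketch only: the helper names are hypothetical, and the exact boundary timestamps used for "[2000, 2099]" in the script are assumed.

```python
from datetime import datetime, timedelta, timezone

# Assumed boundaries for the "[2000, 2099]" window.
FIXED_WINDOW = (
    datetime(2000, 1, 1, tzinfo=timezone.utc),
    datetime(2099, 12, 31, 23, 59, 59, tzinfo=timezone.utc),
)


def todays_window(now):
    """The old behaviour: midnight-to-midnight of the current day."""
    start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + timedelta(days=1)


def visible(ingested_at, window):
    """True if a row ingested at ingested_at passes the time filter."""
    start, end = window
    return start <= ingested_at < end
```

With a load during provisioning and a query two days later, `visible(loaded, todays_window(now))` is false for every row, while the fixed window admits them all.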

heavyai/check sent 'SELECT 1' over omnisql stdin. omnisql 5.10.2 parses that as incomplete, exits with 'Missing semicolon at end of SQL command.' without ever contacting the daemon, and the agent's check loop spins for the full 900 s. Add the ';'. oxla's only public docker image (public.ecr.aws/oxla/release) was de-listed; the repo no longer surfaces in the ECR public gallery and there's no replacement on Docker Hub or GitHub Releases. Drop it from the catalog (alongside sirius). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

postgresql-orioledb: - Park PGDATA on the per-VM sysdisk instead of the container's overlay layer (which lives on the 200 GiB rootfs). The orioledb undo log doubles the write footprint of the base table and blew up at line ~70M of hits.tsv. - Bump the sysdisk for this engine to 400 GiB via a new SYSDISK_OVERRIDES_GB hook in systems.py. The image is sparse so physical cost is what postgres actually writes. - Rootfs is left at 200 GiB — build-system-rootfs.sh clones the base via sparse-cp with no resize2fs, so a rootfs override would need a deeper change. Moving PGDATA to sysdisk sidesteps that. UI: - Hovering a slab in the top system picker now highlights the matching row in the competition leaderboard, so the user can scan from picker to result without losing context. New .slab-hover CSS class toggled via mouseenter/mouseleave. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ooter The image's sqlline used to print '(N rows in X.YYY seconds)' below the result rows; current builds print 'N row(s) selected (X.YYY seconds)' instead. Our grep matched only the old form, so the result body kept the summary line and the timing extractor returned empty, failing every query with 'no marker in drill output'. Match either form for stripping, and pull the timing from any '(X.YYY seconds)' suffix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
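A sketch of matching both footer forms and pulling the timing from the trailing "(X.YYY seconds)". The regular expressions are written from the two examples quoted above, not copied from the script, and `split_output` is a hypothetical name.

```python
import re

OLD_FOOTER = re.compile(r"^\(\d+ rows? in \d+\.\d+ seconds\)$")
NEW_FOOTER = re.compile(r"^\d+ rows? selected \(\d+\.\d+ seconds\)$")
TIMING = re.compile(r"(\d+\.\d+) seconds\)\s*$")


def split_output(text):
    """Return (result_body, elapsed_seconds or None) for sqlline output."""
    rows, elapsed = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if OLD_FOOTER.match(stripped) or NEW_FOOTER.match(stripped):
            elapsed = float(TIMING.search(stripped).group(1))
            continue  # strip the summary line from the result body
        rows.append(line)
    return "\n".join(rows), elapsed
```

Returning `None` when no footer is found lets the caller raise the 'no marker in drill output' error itself instead of silently reporting a zero timing.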

tiup playground does not reuse the data dir across restarts even with --tag — each invocation initialises a fresh cluster, drops PD metadata about previously-stored TiKV regions, and the test.hits table becomes invisible. The agent's normal pre-snapshot stop-then-start cycle therefore destroys the data tidb-lightning just spent an hour loading. Mark .preserve-state so the snapshot captures TiDB running as-is (no stop/start cycle around the snapshot), and the restored VM resumes with the table intact. The post-restore btime watcher still re-runs ./start, which is idempotent (returns early when MySQL on :4000 already responds), so this remains compatible with the docker-reconcile path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

mongosh routes console.error() through its own log formatter rather than to process.stderr the way Node REPL does, so the elapsed time the eval block was printing never reached the agent's _extract_script_timing(stderr) parser. The UI's Time: column was empty for every mongo query. Wrap the mongosh invocation in shell-side date arithmetic and emit the seconds to stderr ourselves. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
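The fix measures wall-clock time outside the client process. The script does this with shell date arithmetic; the same idea in Python, as a hedged sketch with a hypothetical `run_timed` helper:

```python
import subprocess
import sys
import time


def run_timed(argv, err=sys.stderr):
    """Run argv, print elapsed seconds to err, return (returncode, stdout)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.PIPE)
    elapsed = time.perf_counter() - start
    # The wrapper, not the child, writes the timing, so the child's own
    # logging behaviour cannot swallow it.
    print(f"{elapsed:.3f}", file=err)
    return proc.returncode, proc.stdout
```

The measured time includes client startup, so it slightly overstates the query time compared with an in-process timer.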

Previous attempt set --memory=256g --memory-swap=-1 --memory-swappiness=100, but on cgroup v2 the swappiness flag is silently discarded and any --memory cap creates a hard cgroup ceiling that the kernel will OOM on regardless of swap. Let Umbra run with no docker memory cgroup and rely on the host kernel + 256 GiB swap drive. Also raise vm.max_map_count to 1048576 — Umbra issues many small mmaps for its memory-mapped storage and a 100M-row COPY blows past the 65530 default well before any OOM-killer fires. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… binary The trino:455 image ships no /usr/bin/find, so the previous 'find /usr/lib/trino -name "*.jar"' classpath collector silently returned empty and javac failed with 'package com.amazonaws.auth does not exist'. Use a brace-glob over the two specific HDFS-plugin jars (aws-java-sdk-core and hadoop-apache) and match either the legacy 'com.amazonaws_' / 'io.trino.hadoop_' name prefix used by older Trino builds or the bare modern name. Tested: javac produces S3AnonymousProvider.class against /usr/lib/trino/plugin/hive/hdfs/aws-java-sdk-core-1.12.770.jar /usr/lib/trino/plugin/hive/hdfs/hadoop-apache-3.3.5-3.jar Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

omnisci/core-os-cpu:v5.10.2 ships with an empty allowed-import-paths, so the load script's COPY hits FROM '/tmp/hits.csv' fails with 'File or directory path "/tmp/hits.csv" is not whitelisted.' Drop an omnisci.conf with [/tmp/] on the allowlist into heavyai-storage before launching the container — the startomnisci wrapper picks it up automatically. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

tursodb has been panicking partway through .import:

    thread 'main' panicked at core/storage/sqlite3_ondisk.rs:818:5:
    assertion failed: !*syncing.borrow()
    note: run with `RUST_BACKTRACE=1` environment variable ...

The note speaks for itself. Set RUST_BACKTRACE=1 so the panic line in the provision log (and any UI-facing panic from /query) ships with a call stack for the upstream bug report.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

SHOW BACKENDS TSV columns are 1 BackendId 2 IP 3 HeartbeatPort 4 BePort 5 HttpPort 6 BrpcPort 7 LastStartTime 8 LastHeartbeat 9 Alive ... We were inspecting column 10 (SystemDecommissioned), which is always "false" once the BE is registered — so the wait loop in ./start timed out even when the backend was alive and serving. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
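The off-by-one is easy to see with the column list above written as code. This is a sketch in Python (the real wait loop is shell), and `backend_alive` is a hypothetical name.

```python
ALIVE_COLUMN = 9  # 1-based position of Alive in the column list above


def backend_alive(tsv_output: str) -> bool:
    """True if at least one SHOW BACKENDS row reports Alive == true."""
    for line in tsv_output.splitlines():
        fields = line.split("\t")
        if len(fields) < ALIVE_COLUMN:
            continue  # header fragment or malformed row
        if fields[ALIVE_COLUMN - 1].strip().lower() == "true":
            return True
    return False
```

Reading column 10 instead would compare SystemDecommissioned, which stays "false" for a healthy backend, so the loop could never succeed.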

alexey-milovidov and others added 30 commits May 14, 2026 20:22

ClickBench/chdb-parquet-partitioned: use bare "hits_*.parquet" in que…

6c19356

…ries.sql

ClickBench/mongodb: pin to 7.0 to avoid Linux >= 6.19 refusal

6d5ef9e

ClickBench/presto*: print query rows instead of discarding them

6fc224d

alexey-milovidov and others added 28 commits May 14, 2026 20:22

alexey-milovidov self-assigned this May 14, 2026

alexey-milovidov merged commit 65fc071 into main May 14, 2026

alexey-milovidov merged 67 commits into main from clickbench-fixes
