
Add Hyrise #883

Open
alexey-milovidov wants to merge 2 commits into main from add-hyrise

@alexey-milovidov
Member

Summary

  • Add a ClickBench entry for Hyrise, a research in-memory column-oriented DBMS from HPI.
  • Hyrise has no upstream binaries and needs gcc-15 / clang-20, so the build runs inside a multi-stage Dockerfile (ubuntu:25.04 builder → slim runtime image with hyriseServer + libhyrise_impl.so + libjemalloc.so.2).
  • Load via COPY hits FROM '/data/hits.csv' WITH (FORMAT CSV); column types come from hits.csv.json placed next to the data file, because running CREATE TABLE before COPY trips an internal Hyrise assertion. Data size is reported from meta_segments.estimated_size_in_bytes since Hyrise has no on-disk persistence.
  • Hyrise's SQL is limited: no LENGTH, REGEXP_REPLACE, DATE_TRUNC, or OFFSET. run.sh now keys off psql's exit code, so the queries that use them (Q28, Q29, Q39–Q43) are recorded as null rather than picking up the timing line that psql still prints after an error.

Closes #751

Test plan

  • docker build produces a working image; hyriseServer starts and accepts psql connections
  • Loaded a 1000-row CSV slice; SELECT COUNT(*), MIN/MAX(EventDate), AVG(UserID), COUNT(DISTINCT UserID), EXTRACT(MINUTE FROM EventTime) all return expected results
  • All 43 queries run end-to-end: 36 produce timings, 7 produce null (Q28/29 — unsupported functions, Q39–Q43 — OFFSET/DATE_TRUNC)
  • Output format matches ClickBench expectations: Load time: line, Data size: line, 43 lines of [t1,t2,t3],
  • Full 100M-row run on a c6a-class machine — needs to be done by a maintainer
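
The output-format check in the test plan can be automated. The following is a minimal sketch in Python, not part of this PR: the function name check_output and the exact regular expression are illustrative assumptions, based only on the format described above (a Load time: line, a Data size: line, and 43 lines of [t1,t2,t3], where each slot is a number or null).

```python
import re

# Assumed shape of one result line: three slots, each a number or null,
# followed by a trailing comma, e.g. "[0.012,null,1.5],".
SLOT = r"(?:null|\d+(?:\.\d+)?)"
RESULT_RE = re.compile(rf"^\[{SLOT},{SLOT},{SLOT}\],$")


def check_output(text, expected_queries=43):
    """Return a list of problems found in benchmark output (empty if none)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    problems = []

    if not any(line.startswith("Load time:") for line in lines):
        problems.append("missing 'Load time:' line")
    if not any(line.startswith("Data size:") for line in lines):
        problems.append("missing 'Data size:' line")

    results = [line for line in lines if line.startswith("[")]
    if len(results) != expected_queries:
        problems.append(
            f"expected {expected_queries} result lines, found {len(results)}"
        )
    for line in results:
        if not RESULT_RE.match(line):
            problems.append(f"malformed result line: {line}")

    return problems
```

An empty return value means the output has the expected shape; it says nothing about whether the timings themselves are plausible.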

🤖 Generated with Claude Code

alexey-milovidov and others added 2 commits May 8, 2026 20:56
Closes #751

Hyrise is a research in-memory column-oriented database from HPI
(https://github.com/hyrise/hyrise). It implements the PostgreSQL wire
protocol, so the benchmark connects via psql and uses Hyrise's
COPY ... WITH (FORMAT CSV) to load the standard ClickBench CSV dataset.

The system is built from source via Hyrise's install_dependencies.sh and
cmake/ninja; install_dependencies.sh requires Ubuntu 25.04 or newer.
Since Hyrise has no on-disk persistence, the data size is reported as the
total estimated segment size from the meta_segments meta table.
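
As an illustration of the data-size step, here is a minimal Python sketch. It is not the script in this PR; the query text, the helper names, and the host/port defaults are assumptions. The only details taken from the description are the meta_segments table, its estimated_size_in_bytes column, and the Data size: output line.

```python
import subprocess

# Assumption: hits is the only table loaded, so summing over the whole
# meta table gives the size of the benchmark data.
SIZE_QUERY = "SELECT SUM(estimated_size_in_bytes) FROM meta_segments;"


def parse_size(psql_output):
    """Parse the single value printed by `psql -t -A` into a byte count."""
    value = psql_output.strip().splitlines()[0].strip()
    try:
        return int(value)
    except ValueError:
        # The sum may be printed as a floating-point value.
        return int(float(value))


def data_size_line(psql_output):
    """Format the line that the benchmark output is expected to contain."""
    return f"Data size: {parse_size(psql_output)}"


def query_data_size(host="127.0.0.1", port=5432):
    """Ask a running server for the total estimated segment size.

    host and port are assumptions; adjust them to the container setup.
    """
    proc = subprocess.run(
        ["psql", "-h", host, "-p", str(port), "-t", "-A", "-c", SIZE_QUERY],
        check=True,
        capture_output=True,
        text=True,
    )
    return data_size_line(proc.stdout)
```

Because the figure is an in-memory estimate, it is not directly comparable to the on-disk sizes reported by systems that persist their data.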

Hyrise has limited SQL coverage (no DATE/DATETIME types, no REGEXP_REPLACE,
no DATE_TRUNC). Queries that use unsupported functions are kept verbatim
and will be reported as null in the result file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move the build into a multi-stage Dockerfile (ubuntu:25.04 + gcc-15) so the
benchmark works on any Ubuntu host without polluting it with Hyrise's
toolchain. The runtime image only carries hyriseServer, libhyrise_impl.so,
libjemalloc.so, and a small set of shared-library deps (~250 MB).

Build args:
- HYRISE_REF (default master) — pin a Hyrise revision
- NO_LTO (default FALSE) — toggle LTO for faster development builds

Loading: drop create.sql and use hits.csv.json next to the data file as the
schema source. CREATE TABLE followed by COPY trips a Hyrise assertion
("set_immutable() should not be called on an empty chunk", chunk.cpp:125)
because COPY tries to seal the empty chunk left by CREATE TABLE; letting
COPY auto-create the table from the CSV meta avoids the issue.
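
To make the schema-source idea concrete, the sketch below generates a CSV meta file in Python. It rests on several assumptions: the JSON layout (a top-level "columns" list with "name", "type", and "nullable" keys) is recalled from Hyrise's CSV meta format and should be checked against the Hyrise sources; the four columns and their types are a hypothetical excerpt, not the mapping used in this PR; and any "config" section is omitted.

```python
import json

# Hypothetical excerpt of the hits schema as (name, type, nullable) tuples.
# Date and time columns are shown as strings because the description above
# says Hyrise has no DATE/DATETIME types; this mapping is an assumption.
EXAMPLE_COLUMNS = [
    ("WatchID", "long", False),
    ("Title", "string", False),
    ("EventDate", "string", False),
    ("UserID", "long", False),
]


def build_csv_meta(columns):
    """Build the meta document describing the CSV columns (assumed layout)."""
    return {
        "columns": [
            {"name": name, "type": type_name, "nullable": nullable}
            for name, type_name, nullable in columns
        ]
    }


def meta_path(data_path):
    """The meta file sits next to the data file: hits.csv -> hits.csv.json."""
    return data_path + ".json"


def write_csv_meta(data_path, columns):
    """Write the meta file and return its path."""
    path = meta_path(data_path)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_csv_meta(columns), handle, indent=2)
    return path
```

For example, write_csv_meta("/data/hits.csv", EXAMPLE_COLUMNS) would write /data/hits.csv.json, after which COPY can create the table itself and no empty chunk is left behind by a prior CREATE TABLE.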

run.sh: detect failed queries via psql's exit code rather than grepping the
output, so errors like "Invalid input error: Could not resolve function
'LENGTH'" are recorded as null. Hyrise lacks LENGTH, REGEXP_REPLACE,
DATE_TRUNC, and OFFSET, so 7 queries (Q28, Q29, Q39-Q43) are reported as
null; the remaining 36 succeed.
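
The exit-code rule can be sketched as follows. This is a Python illustration of the logic, not the shell code in run.sh: the function names, the psql flags chosen here (including ON_ERROR_STOP=1), and the host/port defaults are assumptions. The point it shows is that the timing line is trusted only when psql exits with status 0.

```python
import re
import subprocess

# psql's \timing prints lines such as "Time: 12.345 ms" or
# "Time: 1234.567 ms (00:01.235)"; only the millisecond value is used.
TIME_RE = re.compile(r"^Time:\s*([0-9.]+)\s*ms", re.MULTILINE)


def timing_from_result(exit_code, output):
    """Return the query time in seconds, or None if the query failed."""
    if exit_code != 0:
        # A failed query may still print a Time: line; ignore it.
        return None
    match = TIME_RE.search(output)
    if match is None:
        return None
    return round(float(match.group(1)) / 1000.0, 3)


def format_line(timings):
    """Format three timings as one result line, using null for failures."""
    slots = ("null" if t is None else str(t) for t in timings)
    return "[" + ",".join(slots) + "],"


def run_query(query, host="127.0.0.1", port=5432):
    """Run one query through psql and return its timing or None.

    host, port, and the flag set are assumptions for this sketch.
    """
    proc = subprocess.run(
        ["psql", "-h", host, "-p", str(port), "-v", "ON_ERROR_STOP=1",
         "-t", "-c", "\\timing", "-c", query],
        capture_output=True,
        text=True,
    )
    return timing_from_result(proc.returncode, proc.stdout)
```

Checking the exit status first avoids depending on the wording of Hyrise's error messages, which is what an approach based on grepping the output would have to do.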

Tested locally on arm64: docker build produced a working image, hyriseServer
accepts psql connections, COPY loads a 1000-row sample, and run.sh produces
the expected 43 lines of [t1,t2,t3] output with nulls in the right slots.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>