
Add Frigatebird #906

Open
alexey-milovidov wants to merge 1 commit into main from add-frigatebird


@alexey-milovidov
Member

Summary

  • Adds a frigatebird/ ClickBench recipe for Frigatebird, an embedded columnar SQL database written in Rust (push-based Volcano execution, morsel parallelism, LZ4 + O_DIRECT storage).
  • ./load streams hits.parquet through a small pyarrow script (parquet_to_inserts.py) into the Frigatebird REPL as batched INSERT INTO hits VALUES (...) statements — Frigatebird has no COPY / Parquet / CSV ingest path.
  • create.sql collapses all integer widths to BIGINT and DATE to TIMESTAMP (Frigatebird's type system has no narrower forms), and uses the mandatory ORDER BY (CounterID, EventDate, UserID, EventTime, WatchID).
  • ./query measures runtime with bash built-in time since the CLI has no built-in timer.
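The batching step in ./load can be sketched roughly as follows. This is a minimal illustration, not the contents of parquet_to_inserts.py: the function name batched_inserts, the multi-row VALUES form, the trailing semicolon, and the batch size of 1000 are all assumptions, and each row is expected to arrive as already-rendered SQL literals.

```python
from itertools import islice
from typing import Iterable, Iterator, Sequence


def batched_inserts(
    rows: Iterable[Sequence[str]],
    table: str = "hits",
    batch_size: int = 1000,  # assumed value; the recipe's real batch size is not stated
) -> Iterator[str]:
    """Yield one INSERT statement per batch of pre-rendered SQL literals.

    Assumes the target accepts multi-row VALUES lists terminated by ';'.
    """
    it = iter(rows)
    while True:
        # Pull at most batch_size rows without materialising the whole input.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        tuples = ", ".join("(" + ", ".join(row) + ")" for row in batch)
        yield f"INSERT INTO {table} VALUES {tuples};"


if __name__ == "__main__":
    demo = [["1", "'a'"], ["2", "'b'"], ["3", "'c'"]]
    for stmt in batched_inserts(demo, batch_size=2):
        print(stmt)
```

Because the generator consumes its input lazily, it can sit behind a streaming reader such as pyarrow's ParquetFile.iter_batches and write each statement to the REPL's stdin without holding the full dataset in memory.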

Notes

  • parquet_to_inserts.py emits negative integers as quoted strings to work around Frigatebird's INSERT planner rejecting UnaryOp { Minus, Number } literals; the column-type coercion path parses them back to i64.
  • Frigatebird's SQL surface doesn't include EXTRACT, REGEXP_REPLACE, LENGTH/STRLEN, CASE, etc., so several queries will fail at parse/plan time and land as null in the results JSON.
  • In smoke testing, Frigatebird's TEXT decompressor panics with "failed to decompress page payload: string is not valid utf8" on the non-UTF-8 bytes that the hits dataset's text columns contain. The recipe is wired up so the upstream behaviour on the full dataset is reproducible; expect many or all queries to be null until upstream stabilises ingest/scan for non-UTF-8 strings.
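The negative-integer workaround from the first note can be illustrated with a small literal renderer. This is a sketch under assumptions, not the code in parquet_to_inserts.py: the name sql_literal is hypothetical, and the NULL keyword and the doubled single quote as a string escape are assumed rather than confirmed for Frigatebird.

```python
def sql_literal(value) -> str:
    """Render a Python value as a SQL literal for an INSERT ... VALUES list."""
    if value is None:
        return "NULL"  # assumption: NULL is accepted as a literal
    if isinstance(value, int) and not isinstance(value, bool):
        if value < 0:
            # Quote negatives so the planner sees a string literal rather than
            # UnaryOp { Minus, Number }; column coercion parses it back to i64.
            return f"'{value}'"
        return str(value)
    # Everything else is emitted as a quoted string; embedded single quotes are
    # doubled (assumed escape convention).
    text = str(value).replace("'", "''")
    return f"'{text}'"


if __name__ == "__main__":
    print(", ".join(sql_literal(v) for v in (-5, 7, "it's", None)))
```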

Resolves #809

Test plan

  • ./install && ./benchmark.sh on a fresh Ubuntu 24.04 VM
  • Confirm that queries which succeed report plausible timings and that the remaining queries surface as null
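As an illustration of how a timing from bash's built-in time could end up as either a number or null in the results JSON, here is a hedged sketch. The helper name timing_or_null and the use of a non-zero exit code to detect failure are assumptions; the parsing relies on bash's default "real 0m1.234s" output format and is not taken from the recipe's ./query script.

```python
import json
import re
from typing import Optional

# Matches the "real" line of bash's default `time` output, e.g. "real\t0m1.250s".
_REAL = re.compile(r"^real\s+(\d+)m(\d+(?:\.\d+)?)s$", re.MULTILINE)


def timing_or_null(exit_code: int, time_output: str) -> Optional[float]:
    """Return wall-clock seconds for a successful query, or None for a failed one."""
    if exit_code != 0:
        return None  # assumption: failure is signalled by a non-zero exit code
    match = _REAL.search(time_output)
    if match is None:
        return None
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


if __name__ == "__main__":
    sample = "\nreal\t0m1.250s\nuser\t0m0.010s\nsys\t0m0.005s\n"
    runs = [timing_or_null(0, sample), timing_or_null(1, sample)]
    print(json.dumps(runs))  # None is serialised as JSON null
```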

Frigatebird (https://github.com/Frigatebird-db/frigatebird) is an
embedded columnar SQL database in Rust. It ingests only via
INSERT ... VALUES, so ./load streams hits.parquet through
parquet_to_inserts.py (pyarrow) as batched INSERTs into the REPL.

Per the README, expect many queries to show up as null: Frigatebird's
SQL surface lacks EXTRACT/REGEXP_REPLACE/LENGTH/CASE, and its TEXT
decompressor panics on the non-UTF-8 bytes in the hits dataset.

Resolves #809
