OpenTelemetry logs, traces, and metrics stored in DuckLake, visualized in Grafana.
canardstack is an experimental project that streams OpenTelemetry data to DuckLake, a lakehouse standard from the creators of DuckDB. The goal is to explore cheap and simple ways to query terabytes of observability data, stored in open formats on object storage, from a single node.
Builds on prior work from otlp2parquet, otlp2pipeline, and duckdb-otlp.
- Quickstart
- Demo
- What You Can Do
- Differences From ClickStack
- Send Telemetry
- Query Data
- Deployment
- Architecture
- Operations
- Limits
- For Developers
- Documentation
Install and start canardstack. With no options, it uses local DuckLake storage
under .canardstack and listens for OTLP data on 127.0.0.1:4318.
# requires rust toolchain: `curl https://sh.rustup.rs -sSf | sh`
cargo install --locked canardstack
# starts server on :4318
canardstack

In another terminal, send one OTLP/HTTP JSON log:
OTLP_TIME_UNIX_NANO="$(date +%s)000000000"
curl -sS -X POST http://127.0.0.1:4318/v1/logs \
-H 'Authorization: Bearer dev-canardstack-key' \
-H 'Content-Type: application/json' \
--data "{\"resourceLogs\":[{\"resource\":{\"attributes\":[{\"key\":\"service.name\",\"value\":{\"stringValue\":\"quickstart\"}}]},\"scopeLogs\":[{\"logRecords\":[{\"timeUnixNano\":\"${OTLP_TIME_UNIX_NANO}\",\"body\":{\"stringValue\":\"hello world\"}}]}]}]}"

canardstack acknowledges ingest after the raw request is fsynced locally. Give the scheduler a few seconds to register the log with the DuckLake catalog, then query it through the Loki-compatible API:
curl -sS -H 'Authorization: Bearer dev-canardstack-key' \
'http://127.0.0.1:4318/loki/api/v1/query?query=%7Bservice_name%3D%22quickstart%22%7D'

Run canardstack with the full OpenTelemetry demo using the demo guide.
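The percent-encoded `query` parameter in the quickstart Loki request decodes to `{service_name="quickstart"}`. The sketch below lets curl do the encoding with `-G --data-urlencode` and polls until the log becomes visible, since ingest is acknowledged before rows are queryable. The names `CANARD_URL`, `CANARD_KEY`, `query_logs`, and `wait_for_log` are illustrative and not part of canardstack; the one-second poll interval is an arbitrary assumption.

```shell
# Illustrative defaults taken from the quickstart above; override via the environment.
CANARD_URL="${CANARD_URL:-http://127.0.0.1:4318}"
CANARD_KEY="${CANARD_KEY:-dev-canardstack-key}"

# Run a stream selector for one service name ($1).
# -G turns the --data-urlencode value into a percent-encoded query string.
query_logs() {
  curl -sS -G \
    -H "Authorization: Bearer ${CANARD_KEY}" \
    --data-urlencode "query={service_name=\"$1\"}" \
    "${CANARD_URL}/loki/api/v1/query"
}

# Poll until the response body contains the expected text, or give up.
# $1 = service name, $2 = text to look for, $3 = max attempts (default 10)
wait_for_log() {
  attempts="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if query_logs "$1" | grep -q -F -- "$2"; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}
```

For the quickstart log, `wait_for_log quickstart 'hello world'` returns success once the record is query-visible and a non-zero status if it never appears within the attempt budget.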
Use canardstack to:
- Receive OTLP/HTTP logs, traces, gauge metrics, and sum metrics.
- Store normalized telemetry in DuckLake-backed DuckDB tables.
- Inspect data in Grafana through Prometheus, Loki, and Tempo-compatible APIs.
- Query the same DuckLake data directly from DuckDB, MotherDuck, or another SQL client.
- Run local experiments with a single Rust binary and one DuckDB process.
canardstack is best suited for local, single-tenant, or experimental deployments where the operator wants direct access to lakehouse telemetry data and can accept the current v0 durability and compatibility limits.
ClickStack is the production-grade observability stack built around ClickHouse, HyperDX, and an OpenTelemetry Collector.
canardstack is a narrower experiment with different tradeoffs:
- Storage is DuckLake over DuckDB, not ClickHouse. Telemetry lands in open DuckLake tables backed by Parquet data files, so DuckDB-native clients can inspect the same data directly.
- The Grafana-facing APIs are compatibility adapters, not the primary query path. canardstack implements bounded Prometheus, Loki, and Tempo subsets; it does not try to match ClickStack's HyperDX UI or query experience.
- Deployment is intentionally small: one Rust binary, one synchronous HTTP server, one DuckDB process per role, and no async runtime, Kafka, or separate hot store.
- Ingest durability is local-spool-first and at-least-once. A `2xx` means the raw request was fsynced and accepted for bounded processing, not that rows are already query-visible.
- Direct SQL access. Local clients can attach the DuckLake catalog directly, and cloud deployments can expose the catalog over Quack for DuckDB-native clients when the operator chooses to manage that access boundary.
- The scope is intentionally single-tenant and experimental.
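Because only a `2xx` means a request was fsynced and accepted, a producer may want to retry when the server signals back-pressure. The following is a minimal sketch of such a client, not canardstack's recommended policy: it treats `429` and `503` (the queue-pressure statuses mentioned under the limits) as retryable and anything else as final. The function name, the `CANARD_URL` and `CANARD_KEY` variables, and the doubling backoff schedule are all assumptions made for illustration.

```shell
# Illustrative defaults taken from the quickstart; override via the environment.
CANARD_URL="${CANARD_URL:-http://127.0.0.1:4318}"
CANARD_KEY="${CANARD_KEY:-dev-canardstack-key}"

# post_with_retry FILE [MAX_ATTEMPTS]
# Posts an OTLP/HTTP JSON log payload from FILE and retries on back-pressure.
post_with_retry() {
  payload_file="$1"
  max_attempts="${2:-5}"
  delay=1
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # -o /dev/null discards the body; -w prints only the HTTP status code.
    status="$(curl -sS -o /dev/null -w '%{http_code}' \
      -X POST "${CANARD_URL}/v1/logs" \
      -H "Authorization: Bearer ${CANARD_KEY}" \
      -H 'Content-Type: application/json' \
      --data-binary "@${payload_file}")"
    case "$status" in
      2??) return 0 ;;        # accepted: the raw request was fsynced
      429|503) ;;             # back-pressure: fall through, wait, retry
      *) echo "giving up: HTTP ${status}" >&2; return 1 ;;
    esac
    sleep "$delay"
    delay=$((delay * 2))      # assumed backoff: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
  return 1
}
```

Since delivery is at-least-once, a retry after an ambiguous failure (for example a timeout where the server did accept the request) can produce duplicate rows, so downstream queries should tolerate them.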
Configure OTLP/HTTP producers and OpenTelemetry Collectors with the send telemetry guide.
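As a rough illustration of the Collector path, the snippet below writes a small OpenTelemetry Collector config that receives OTLP/gRPC locally and forwards it over OTLP/HTTP to the quickstart address. It is a sketch under assumptions, and the send telemetry guide remains authoritative: the file name and the gRPC listen address are arbitrary, the bearer token is the quickstart development key, and whether canardstack expects the exporter's default protobuf encoding and gzip compression or needs other settings is not stated here.

```shell
# Write an example Collector config; the file name is an arbitrary choice.
cat > otelcol-canardstack.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4317   # local gRPC clients connect here

exporters:
  otlphttp:
    # Base URL; the exporter appends /v1/logs, /v1/traces, /v1/metrics.
    endpoint: http://127.0.0.1:4318
    headers:
      Authorization: "Bearer dev-canardstack-key"

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp]
      exporters: [otlphttp]
EOF
```

A Collector distribution can then be started with this file, for example `otelcol --config otelcol-canardstack.yaml`.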
The current high-level data-flow diagram lives in the site architecture guide.
Operator notes, configuration guidance, diagnostics, and failure response runbooks live in the operations docs.
canardstack is experimental and not production-ready.
Known v0 limits:
- Current single-node throughput is bounded by raw-spool append and backlog behavior. On May 20, 2026, the highest clean 10-minute mixed-signal run was `2000 GB/day` with `--ingest-concurrency 64` (`23.1 MB/s` accepted decoded throughput, no `429`/`503` or query failures). A `2500 GB/day` mixed run reached Vector-like log event rates briefly, but failed the 10-minute guardrail with `429` queue-pressure responses after roughly eight minutes.
- No exactly-once ingest acknowledgement. A crash after `2xx` should replay a fsynced raw-spool record if it was not checkpointed, but duplicate replay can occur when storage commit succeeds before raw-spool checkpoint.
- No OTLP/gRPC endpoint. Use an OpenTelemetry Collector if your clients need gRPC.
- No histograms or exponential histograms.
- No multi-tenancy.
- No full PromQL, LogQL, TraceQL, Prometheus, Loki, or Tempo implementation.
- No arbitrary SQL through compatibility APIs.
- No sub-second freshness target.
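The throughput figures in the first limit can be cross-checked with a little arithmetic. The helper below is only a convenience sketch with a made-up function name; it assumes decimal units (1 GB = 1000 MB) and an 86,400-second day, which is the reading that reproduces the stated 23.1 MB/s for 2000 GB/day.

```shell
# Convert a daily volume in GB/day to a sustained rate in MB/s.
# Assumes decimal units: 1 GB = 1000 MB, 1 day = 86400 s.
gb_per_day_to_mb_per_s() {
  awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 1000 / 86400 }'
}

gb_per_day_to_mb_per_s 2000   # prints 23.1
gb_per_day_to_mb_per_s 2500   # prints 28.9
```

By the same conversion, the 2500 GB/day run that tripped the guardrail corresponds to roughly 28.9 MB/s sustained.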
Contributor setup and implementation details live in docs/developer.md. Start there when changing canardstack itself.
- Developer guide
- Deployment overview
- V0 architecture
- Storage schema
- Query API
- Operator metrics
- Operations
- Benchmarking
- Benchmark gates
- Failure runbooks
Thanks to @hanorigins, Tyler Hillery, and @decalek from the DuckDB Discord for starting a discussion that led to this proof of concept.