Releases: Aryagorjipour/logdive
v0.3.0
What's new
Parenthesised query groups
OR and AND can now be combined with explicit grouping:
(level=error OR level=warn) AND service=payments
(service=payments OR service=auth) AND message contains "timeout"
level=error AND (tag=production OR tag=staging)
Case-insensitive level matching
level=ERROR, level=Error, and level=error all return the same rows and hit the same index. Implemented via a functional expression index (idx_level_norm ON log_entries(lower(level))). Existing databases pick it up automatically on first open — no migration step.
Benchmarked at 100k rows: all three case variants run in ~51 ms, identical within measurement noise.
Pagination
# CLI
logdive query 'level=error' --limit 50 --offset 100
# HTTP API
curl 'http://127.0.0.1:4000/query?q=level%3Derror&limit=50&offset=100'Distroless Docker runtime
Runtime stage changed from debian:bookworm-slim to gcr.io/distroless/cc-debian12:nonroot — no shell, no curl, no root (uid 65532).
The healthcheck no longer depends on curl. logdive-api now has a --health-check flag that TCP-connects to its own port and exits 0/1:
HEALTHCHECK CMD ["/usr/local/bin/logdive-api", "--health-check"]Breaking changes
| Area | Change | Migration |
|---|---|---|
| Library | execute() / execute_at() third parameter changed from Option<usize> to QueryOptions { limit, offset } |
Pass QueryOptions { limit: Some(n), offset: None } |
| CLI | logdive query --format renamed to --output |
Replace --format with --output in scripts |
| Docker | Healthcheck: curl GET /version → logdive-api --health-check |
Update compose / k8s probes |
Performance
Criterion benchmarks against a 100k-row SQLite index (Acer Nitro 5, Linux). Four new benchmark groups added in this release covering every new feature.
| Operation | Result |
|---|---|
| Indexed-field query, zero result | 23 µs |
| Indexed-field query, 25% match, LIMIT 1000 | ~49 ms |
| JSON field query, 25% match, LIMIT 1000 | ~4.1 ms |
| JSON field query, 0% match — full table scan | ~69 ms |
CONTAINS full-table scan |
~35–38 ms |
OR query, 2-branch 50% match |
~68 ms |
Parenthesised group (A OR B) AND C, 12.5% |
~45 ms |
| Case-insensitive level — all three case variants | ~51 ms (identical) |
| Pagination deep page overhead (offset 2450 vs 0) | +8 ms |
| Batched ingest, 10k rows | ~189k rows/s |
| Parse + ingest end-to-end, 10k rows | ~150k rows/s |
Binary sizes (release profile, stripped): 3.9 MB (logdive), 4.2 MB (logdive-api).
Run cargo bench to reproduce on your own hardware.
Test suite
417 tests across 9 test binaries — all pass.
Install
Cargo:
cargo install logdive # CLI
cargo install logdive-api # HTTP serverDocker (GHCR):
docker pull ghcr.io/aryagorjipour/logdive:0.3.0
docker run -p 4000:4000 -v "$PWD/data:/data" ghcr.io/aryagorjipour/logdive:0.3.0Prebuilt binaries: see assets below.
What's next — v0.4.0
Performance focus: extend benchmark suite to 500k rows, query latency improvements on the indexed path. Also planned: --output yaml / --output csv, Windows --follow, configurable retention by source tag.
v0.2.1 — Hardening
Install
cargo install logdive logdive-api
Or grab a prebuilt binary for Linux x86_64 / macOS arm64 from the assets below.
Security tests (H1)
10 new tests in crates/core/tests/security.rs:
- SQL injection via field name — tokenizer rejects ', ;, Unicode lookalikes (U+2019) before SQL is generated
- SQL injection via value — bound parameters prevent DROP TABLE and 1=1 tautology payloads
- LIKE wildcard escaping — _ and \ in contains queries match literally, not as SQL wildcards
- Resource exhaustion — 1 000-disjunct OR query completes without stack overflow; 10 MB raw line ingested without panic or OOM
Functional tests (H2)
28 new tests across 7 suites:
- Property-based (proptest): arbitrary input never panics; valid equality queries produce correct ASTs; OR disjunct count matches input
- Cross-format dedup: same raw line ingested twice → one row; JSON vs. logfmt with identical logical content → two distinct rows
- Concurrent ingest: two logdive ingest processes on the same DB produce no corruption and dedup is respected
- Parser edge cases: UTF-8 BOM rejected; deeply nested object in known field preserved; whitespace in field values preserved verbatim
- Time-range: space-separator datetime accepted; boundary row at cutoff included (>=); far-future timestamp returns empty; +00:00 equivalent to Z
- Follow mode: file deleted after open returns Ok([]), not error; burst of appended lines read completely in one call
- API integration: limit > match count returns all matches; contains operator; since time-range; CORS preflight returns access-control-allow-origin: *; raw field present in every response entry
- Prune boundary: entry at cutoff not deleted; idempotent second prune deletes nothing
Supply-chain hardening (H3)
- Cargo.lock tracked for reproducible builds and deterministic audit scans
- deny.toml: license allowlist (MIT / Apache-2.0 / BSD-2-Clause / BSD-3-Clause / ISC / CC0-1.0 / Unlicense / Zlib / BSL-1.0 and variants), RustSec advisory checks (vulnerability = deny, unsound = deny), crates.io-only source policy
- scripts/audit.sh: cargo-audit runner
- scripts/sbom.sh: CycloneDX JSON SBOM via cargo-cyclonedx
- .github/workflows/audit.yml: daily advisory scan + cargo deny check (informational)
- .github/workflows/ci.yml: permissions: contents: read added; cargo deny check added to lint job (merge-blocking)
Performance fixes
- entry_to_json_string in logdive-api now uses serde_json::to_string(&entry) directly — eliminates an O(fields) heap allocation per HTTP response row
- LogEntry::with_tag signature changed from Option to Option<&str> — eliminates a String clone per ingested entry
v0.2.0
Install
From crates.io
cargo install logdive logdive-api
[0.2.0] - 2026-05-15
Added
-
M6 — Docker image + multi-arch
- Multi-stage
Dockerfile(cargo-chef caching,debian:bookworm-slim
runtime) publishing bothlogdiveandlogdive-apibinaries in a
single image. - Default
ENTRYPOINT ["logdive-api"]; CLI accessible via
--entrypoint logdive. ENV LOGDIVE_DB=/data/index.dbandENV LOGDIVE_API_HOST=0.0.0.0
set sane container defaults without modifying binary source.VOLUME ["/data"]andEXPOSE 4000declared.HEALTHCHECKonGET /version(30 s interval, 5 s start period).- Non-root system user
logdive(UID/GID 1000). - GitHub Actions workflow (
.github/workflows/docker.yml):linux/amd64linux/arm64viadocker buildx+ QEMU; GHA cache (type=gha,
mode=min) for BuildKit layers; GHCR push viaGITHUB_TOKEN(no PAT);
semver tags onv*push, branch tags onmain/release/v*,
build-only (no cache write) on PRs.
logdive-apiauto-creates an empty index with initialized schema on
first run when the database file is absent, including any missing parent
directories. Genuinely bad paths still surface as startup failures.
- Multi-stage
-
M5 — API capability endpoints + CORS
GET /versionendpoint onlogdive-apireturningversion,
formats(ingest formats the binary was compiled with), and
capabilities(available endpoint names) as a JSON object — designed
for client-side feature detection.--cors-originsflag onlogdive-api(env:LOGDIVE_API_CORS_ORIGINS)
accepting a comma-separated list of allowed origins. Defaults to
disabled (same-origin only). Use*as the sole value to allow any
origin. Invalid values or mixing*with specific origins cause a
fast startup error.tower-httpCorsLayerwired into the router when origins are
configured; GET-only, no credentials, preflight handled automatically.LogFormat::ALLconst onlogdive-core'sLogFormatenum exposing
all supported ingest format variants.
-
M4 —
prunesubcommand +LOGDIVE_DBenv varlogdive prunesubcommand removes entries older than a given duration
(--older-than 30d) or before a specific datetime (--before 2026-01-01); the two flags are mutually exclusive and exactly one is
required.- Interactive
[y/N]confirmation by default (shows the row count to be
deleted);--yesbypasses. LOGDIVE_DBenvironment variable accepted on the global--dbflag
for bothlogdiveandlogdive-api; CLI flag takes precedence when
both are provided.
-
M3 — Follow mode
logdive ingest --followtails a file for new lines, similar to
tail -f. Detects log rotation (inode/device change) and truncation
and reopens the file automatically.- Uses
notify6.1 for cross-platform filesystem events andctrlc3.5
for clean Ctrl-C shutdown. CLI remains fully synchronous — no Tokio
dependency added. - Starts at end-of-file (follow semantics, not from-start).
--followrequires--file; rejected at parse time when used with
stdin.
-
M2 — logfmt and plain-text ingestion
--format json|logfmt|plainflag onlogdive ingest(defaultjson).- logfmt parser: hand-written tokenizer supporting bareword booleans,
escaped quotes, hyphenated/dotted keys, and last-write-wins on
duplicate keys. - Plain parser: entire line becomes
message; no timestamp or level
extraction. --timestamp-nowflag assigns RFC 3339 UTC timestamps to entries that
lack one, applicable to all three formats.
-
M1 — OR operator in query language
- Query language extended to support
ORbetweenAND-groups:
level=error OR level=warn. ANDbinds tighter thanOR; SQL generation always parenthesises
each AND-group:WHERE (a=? AND b=?) OR (c=?).
- Query language extended to support
Changed
build_routerinlogdive-apigains acors_origins: Vec<HeaderValue>
parameter (M5). Integration tests updated to passvec.
Breaking (library)
QueryNode::And(Vec<Clause>)replaced byQueryNode::Or(Vec<AndGroup>)
whereAndGroup { clauses: Vec<Clause> }(M1). Even single-clause
queries are wrapped in the two-level structure.parse_linesignature changed from(line: &str)to
(format: LogFormat, line: &str)(M2).
v0.1.0
Install
From crates.io
cargo install logdive logdive-api
[0.1.0] - 2026-04-19
Added
logdive ingest— ingest structured JSON logs from a file or stdin into
a local SQLite index (~/.logdive/index.dbby default).logdive query— query the index with a typed expression language
supporting=,!=,>,<,contains,last <duration>, and
since <datetime>operators combined withAND.logdive stats— display index metadata (entry count, time range, tags,
DB size).logdive-api— read-only HTTP API server withGET /query(NDJSON) and
GET /stats(JSON).- SQLite-backed storage via
rusqlite(bundled); hybrid schema with fixed
columns for known fields and a JSON blob for unknown fields queryable via
json_extract(). blake3row hashing for deduplication viaINSERT OR IGNORE.- Dual MIT OR Apache-2.0 license.