Duckle v0.1.0 (Hotfix)

github-actions released this 28 May 07:34
· 291 commits to main since this release

Hotfix for v0.1.0. Same feature set as v0.1.0 (60 UI languages, DuckDB Quack, xf.fill_backward, full UI i18n coverage), plus a sweep of engine fixes and performance work that came out of real user reports.

Performance

  • Batched single-CLI-spawn execution. Multi-stage SQL pipelines now run as one duckdb.exe invocation instead of one spawn per stage. Per-stage progress events still arrive in real time via NDJSON marker files each stage writes. Measured 4.5x speedup on fixed overhead on Windows (5-stage pipeline: 245 ms → 56 ms). Driver-backed stages (Oracle, SQL Server, Kafka, Mongo, REST, AI components) and ctl.* hooks transparently drop to the per-stage path.
  • DuckDB PRAGMA preset on every run: preserve_insertion_order=false + enable_object_cache=true + enable_progress_bar=false. Halves wall time on Parquet writes and avoids re-reading file metadata between stages that hit the same source.
  • Lazy materialization: single-consumer pure-SQL transforms now build CREATE OR REPLACE VIEW instead of TABLE. DuckDB inlines them into the downstream query and gets predicate / projection pushdown into the source read. 2-5x speedup on linear pipelines.
  • Streaming NDJSON writer in Oracle / SQL Server sources. A 1 M-row x 37-col Oracle pull used to peak at ~30 GB resident set; it now stays at O(64 KiB) regardless of row count.
  • Workspace memory knobs: DUCKLE_MEMORY_LIMIT, DUCKLE_THREADS, DUCKLE_TEMP_DIR set workspace-wide caps without touching every stage.
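
The PRAGMA preset and the workspace memory knobs above both amount to a short block of settings applied before the first stage runs. A minimal sketch of how such a preamble could be assembled is shown here. The setting names (preserve_insertion_order, enable_object_cache, enable_progress_bar, memory_limit, threads, temp_directory) are real DuckDB settings; the mapping from the DUCKLE_* variables to them, the use of SET rather than PRAGMA, and the function names are assumptions made for illustration, not Duckle's actual code.

```python
import os
from typing import Mapping

# Fixed preset named in the release notes.
PRESET = {
    "preserve_insertion_order": "false",
    "enable_object_cache": "true",
    "enable_progress_bar": "false",
}

# Assumed mapping from workspace env vars to DuckDB settings (hypothetical).
KNOBS = {
    "DUCKLE_MEMORY_LIMIT": "memory_limit",
    "DUCKLE_THREADS": "threads",
    "DUCKLE_TEMP_DIR": "temp_directory",
}


def sql_quote(value: str) -> str:
    """Quote a string literal for SQL by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_preamble(env: Mapping[str, str] = os.environ) -> str:
    """Return the SET statements to run once, before any stage SQL."""
    lines = [f"SET {name} = {value};" for name, value in PRESET.items()]
    for var, setting in KNOBS.items():
        raw = env.get(var)
        if not raw:
            continue  # knob not set: keep DuckDB's default
        if setting == "threads":
            # threads is an integer setting; int() rejects malformed input
            lines.append(f"SET threads = {int(raw)};")
        else:
            lines.append(f"SET {setting} = {sql_quote(raw)};")
    return "\n".join(lines)
```

For example, build_preamble({"DUCKLE_MEMORY_LIMIT": "4GB", "DUCKLE_THREADS": "8"}) yields the three preset lines followed by SET memory_limit = '4GB'; and SET threads = 8;. Unset knobs produce no statement, so DuckDB's defaults apply.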

Correctness

  • #4 - src.oracle wide-table data loss + slow fetch. Two root causes: a try-String-then-i64-then-f64 cascade silently NULLed DATE / TIMESTAMP / BLOB columns, and Oracle's default prefetch_rows = 1 meant 10 k rows = 10 k network round trips. Fix: type-aware dispatcher + prefetch_rows(1000).
  • src.sqlserver data loss on DATE / DATETIME / DECIMAL / BINARY / GUID. Same correctness pattern as #4, applied to tiberius.
  • Joins: composite keys, ambiguous-column dedupe (USING / EXCLUDE instead of JOIN ... ON, which emits both key columns), NULL-safe anti-join (NOT EXISTS instead of NOT IN, which silently returns no rows when the subquery contains a NULL).
  • UNION / INTERSECT: now BY NAME to dodge silent positional-column corruption when two upstreams have the same columns in different orders.
  • xf.assert fires reliably on Windows release builds (CTE materialization pattern; the optimizer was previously pruning the error() branch).
  • Window functions error at planner time when ORDER BY is missing for rank / lead / lag / etc., instead of producing nondeterministic output.
  • Column-existence validation: stages that reference missing upstream columns get a clear "did you mean ..." error at compile time instead of a runtime "column not found" error deep in the SQL plan.
  • xf.transpose / xf.pivot: dynamic PIVOT cannot live in a view in DuckDB 1.5; planner now forces TABLE materialization for those two components regardless of consumer count (regression introduced by lazy materialization).
  • Cast guards: xf.cast errors at planner time on empty cast entries or duplicate target columns.
  • arr.contains NULL-safe (COALESCE(list_contains(...), FALSE)).
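
The NULL-safe anti-join item above rests on standard SQL three-valued logic, which is easy to reproduce. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (an assumption made so the example needs no extra dependencies; DuckDB follows the same NOT IN semantics). Table and function names are hypothetical.

```python
import sqlite3


def anti_join_demo():
    """Compare NOT IN and NOT EXISTS when the excluded side contains a NULL."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE upstream(id INTEGER);
        CREATE TABLE blocklist(id INTEGER);
        INSERT INTO upstream VALUES (1), (2), (3);
        INSERT INTO blocklist VALUES (2), (NULL);
        """
    )
    # 1 NOT IN (2, NULL) evaluates to NULL, not TRUE, so the row is filtered out.
    not_in = [
        row[0]
        for row in con.execute(
            "SELECT id FROM upstream "
            "WHERE id NOT IN (SELECT id FROM blocklist) ORDER BY id"
        )
    ]
    # NOT EXISTS only asks whether a matching row exists; the NULL never matches.
    not_exists = [
        row[0]
        for row in con.execute(
            "SELECT id FROM upstream u "
            "WHERE NOT EXISTS (SELECT 1 FROM blocklist b WHERE b.id = u.id) "
            "ORDER BY id"
        )
    ]
    con.close()
    return not_in, not_exists
```

With this data the NOT IN form returns no rows at all, while the NOT EXISTS form returns ids 1 and 3, which is the result an anti-join is expected to produce.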

Features

  • xf.row_hash + xf.audit: CDC / provenance primitives. row_hash computes MD5 / SHA-256 over a chosen subset of columns; audit appends inserted_at / updated_at / source columns.
  • xf.fill_constant: fill nulls with a literal (companion to fill_forward and fill_backward).
  • src.csv dateFormat / timestampFormat props: override DuckDB's auto-detection per column.
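
As an illustration of the row-hash idea behind xf.row_hash, here is a minimal sketch that hashes a chosen subset of columns with MD5 or SHA-256. The serialization scheme (a length prefix per value and a distinct marker for NULL) is an assumption chosen to avoid ambiguous concatenations; the release notes do not describe how Duckle encodes values, so hashes from this sketch should not be expected to match Duckle's output.

```python
import hashlib
from typing import Any, Mapping, Sequence


def row_hash(row: Mapping[str, Any], columns: Sequence[str], algo: str = "sha256") -> str:
    """Hash the given columns of one row, in the given column order."""
    digest = hashlib.new(algo)  # e.g. "md5" or "sha256"
    for name in columns:
        value = row[name]
        if value is None:
            # Distinct marker so NULL and the empty string hash differently.
            digest.update(b"\x00")
        else:
            data = str(value).encode("utf-8")
            # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
            digest.update(b"\x01" + len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()
```

Because only the listed columns feed the digest, a change to any other column leaves the hash unchanged, which is the property a CDC comparison relies on when deciding whether a row needs an update.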

Fixes from v0.1.0 (carried forward)

  • #2 - DUCKLE_DUCKDB_BIN not set on REST-shaped sources (Oracle, SQL Server, Snowflake, Databricks, Synapse, BigQuery, SaaS REST aliases). Fix: env::set_var in the Tauri setup hook so both the in-process OnceLock and the helper's env-var lookup see the path.
  • #3 - CSV declared schema ignored at execution time. Setting a column to VARCHAR in the Schema panel now actually emits columns = {...} in the generated read_csv_auto(...).
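
To make the #3 fix concrete, the sketch below shows one way a declared schema could be rendered into the columns = {...} argument of the generated reader call. The columns parameter is part of DuckDB's CSV reader; the function name, the dict-shaped schema input, and the exact output formatting are assumptions for illustration rather than Duckle's generator.

```python
from typing import Mapping, Optional


def sql_quote(value: str) -> str:
    """Quote a string literal for SQL by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_read_csv(path: str, columns: Optional[Mapping[str, str]] = None) -> str:
    """Render a read_csv_auto(...) call, adding columns = {...} when a schema is declared."""
    if not columns:
        # No declared schema: leave type detection to DuckDB.
        return f"read_csv_auto({sql_quote(path)})"
    struct = ", ".join(
        f"{sql_quote(name)}: {sql_quote(col_type)}" for name, col_type in columns.items()
    )
    return f"read_csv_auto({sql_quote(path)}, columns = {{{struct}}})"
```

Calling build_read_csv("data.csv", {"id": "VARCHAR", "amount": "DOUBLE"}) produces read_csv_auto('data.csv', columns = {'id': 'VARCHAR', 'amount': 'DOUBLE'}), so a column pinned to VARCHAR in the Schema panel is no longer re-inferred at execution time.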

Binaries (6 platforms)

Platform                       File
Windows x64                    Duckle-windows-x64.exe
Windows arm64                  Duckle-windows-arm64.exe
Linux x64                      Duckle-linux-x64
Linux arm64                    Duckle-linux-arm64
macOS arm64 (Apple Silicon)    Duckle-macos-arm64
macOS x64 (Intel)              Duckle-macos-x64

Same single-binary, engines-download-on-first-launch shape as v0.1.0.

Upgrade

If you already have v0.1.0 or the previous v0.1.0-hotfix running, swap the binary in place. The workspace folder and the engine cache at ~/.duckle/engines/duckdb/ (or the OS equivalent) are untouched.