Duckle v0.1.0 (Hotfix v2)
Builds on v0.1.0-hotfix. This release fixes the desktop app's export / clipboard features (broken on every platform), wires up the Runs and Plans tabs, and folds in a large engine performance + correctness sweep that came out of real user reports.
Updated 2026-05-28 - this build was re-rolled under the same tag with: a fix for database sources (Oracle / SQL Server / Mongo / ...) hanging on wide tables so the sink never wrote a file (#4); a fix for "0 rows written" being reported on large pipelines (per-node row counts now populate correctly); paginated sources now fail loudly instead of silently truncating at the maxPages cap; plus desktop UI fixes (stale column pickers that could not be deselected, Plan-tab error surfacing, a column-validator false-positive). If you downloaded an earlier copy of this tag, please re-download the binary below.
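For readers who build their own paginated pulls, the "fail loudly at the maxPages cap" behavior can be sketched as follows. This is an illustration only, not Duckle's implementation: `fetch_all`, `fetch_page`, `TruncatedResultError`, and the fake source are hypothetical names, and the cursor protocol (a `None` cursor means "no more data") is an assumption.

```python
class TruncatedResultError(RuntimeError):
    """Raised when the page cap is hit while the source still has data."""


def fetch_all(fetch_page, max_pages):
    """Collect rows page by page; error out instead of returning a partial result.

    fetch_page(cursor) is a hypothetical callable returning (rows, next_cursor),
    where next_cursor is None once the source is exhausted.
    """
    rows = []
    cursor = None
    for _ in range(max_pages):
        page, cursor = fetch_page(cursor)
        rows.extend(page)
        if cursor is None:
            return rows  # source exhausted within the cap
    # The cap was reached and the source still reports more data upstream.
    raise TruncatedResultError(
        f"hit maxPages={max_pages} with more data upstream; raise the cap"
    )


def make_fake_source(pages):
    """In-memory stand-in for a paginated API (assumption, for the demo only)."""
    def fetch_page(cursor):
        index = 0 if cursor is None else cursor
        next_cursor = index + 1 if index + 1 < len(pages) else None
        return pages[index], next_cursor
    return fetch_page


source = make_fake_source([[1, 2], [3, 4], [5]])
all_rows = fetch_all(source, max_pages=3)
```

With `max_pages=2` the same call raises `TruncatedResultError` rather than returning the first four rows, which is the difference between the old silent truncation and the new behavior.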
Updated 2026-05-29 - re-rolled again under the same tag (the tag itself is unchanged) with three more fixes: a large Filter / Validator performance fix - a node whose reject port is not wired no longer materializes the discarded rows, so a 10M-row filter that keeps 2M now runs in under a second instead of ~16s (it used to write the other 8M rejected rows to disk for nothing); database sinks (SQL Server / Oracle) no longer error with "output path is required" and now auto-create the target table when it does not exist yet (#8, verified on live SQL Server 2022 + Oracle 21, for both new and existing tables); and src.oracle no longer loses precision on high-precision NUMBER columns (values past ~15 significant digits were silently rounded through a double). A follow-up extends the Filter / Validator performance fix to the case where the reject (or error) port IS wired: each output now becomes a lazy view when it has a single consumer, so a pipeline that routes the rejected rows to a sink no longer pays an intermediate 8M-row table write either (~17s to ~1.6s on that shape). Re-download the binary below for these.
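The Oracle precision issue is easy to reproduce outside Duckle. The sketch below uses Python's `decimal` module to show why a value with more than ~15 significant digits does not survive a trip through a 64-bit double; `survives_double` is a hypothetical helper, and the round-trip test it uses is one possible way to express "would not survive a double", not necessarily the check the engine performs.

```python
from decimal import Decimal


def survives_double(value: Decimal) -> bool:
    """True when value -> 64-bit float -> shortest decimal text gives the same number."""
    return Decimal(repr(float(value))) == value


# A value that fits an Oracle NUMBER(38,12) but carries 32 significant digits.
exact = Decimal("12345678901234567890.123456789012")

# What is left after the value has been squeezed through a double.
through_double = Decimal(repr(float(exact)))

short_value_ok = survives_double(Decimal("1234.5"))
long_value_ok = survives_double(exact)
```

Here `short_value_ok` is true and `long_value_ok` is false: the short value can safely be carried as a double, while the long one has to be kept as an exact decimal.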
Updated 2026-05-30 - re-rolled again under the same tag (the tag is unchanged) with a batch of engine speed + correctness fixes from a full component audit: control-flow nodes (wait / barrier / checkpoint / iterate / try / trigger / foreach) and conditional-split branches now compile to lazy views instead of copying the whole dataset, so a pass-through over 10M rows is milliseconds instead of ~12s; `src.csv` / `src.tsv` schema declarations that cover only some of a file's columns now work instead of failing with a cryptic error; a Diff Detect with no compare columns now fails clearly instead of silently dropping every changed row; SCD1 and set operations (intersect / except) align columns by name so a different column order no longer corrupts values; Snowflake / Databricks reads use far less memory on large result sets; the Format / numeric / sort / add-column transforms reject malformed input cleanly instead of corrupting output or aborting the run; and de-duplicate (uniqueness), skip, and distinct gained an optional ordering so the rows they keep are reproducible run-to-run. Cloud (S3 / GCS / Azure) source + sink option parity and external-sink streaming are still in progress. Re-download the binary below for these.
Desktop app fixes
- Export & clipboard work again (all platforms). Copy SQL, Export SQL, Export JSON, and copy-node-id were no-ops in the packaged app because the webview can't use the browser clipboard or `<a download>`. They now route through the Tauri clipboard plugin and a native save dialog.
- Runs and Plans tabs show real data (#6). They were placeholder panels. The Run tab now lists per-node status, row counts, timings, and errors from the last run; the Plan tab live-compiles the pipeline and shows the DuckDB SQL for each step (with a Copy button and a clear compile-error panel). Run history was persisted correctly all along - it just was not rendered. (A dedicated top-level Schedules tab is still a follow-up; schedules remain reachable via the pipeline right-click menu.)
- Column pickers keep stale selections visible. In the aggregation, filter, and column fields, a selected column that was no longer present in the upstream schema used to disappear from the list, which made it impossible to see or remove. Stale picks now render as "(not in input)" so you can deselect and fix them.
- Plan tab surfaces compile errors. A pipeline that fails to compile now shows the actual error in the Plan tab instead of a blank panel, and the Run / Plan panels no longer render transparently over the canvas.
- Accurate row counts on large runs. On bigger tables the run summary could report "0 rows written" despite a successful run that wrote the file correctly. The batched executor was reading a per-stage count file in the brief window before DuckDB finished writing it; it now waits for the complete value. Per-node counts are listed individually, and the summary's "rows written" counts sink rows only (it previously summed source + transforms + sink).
- Compiled export includes procedural steps (#7). Driver sources/sinks (Oracle, REST, Kafka, ...) and `ctl.*` control steps now appear in the exported SQL as descriptive comments instead of empty blocks.
- External links open in the system browser - CI build links and the GitHub/GitLab token pages used APIs the webview ignored; they now use the Tauri opener.
- Clearer column errors - referencing a column that does not exist upstream (e.g. an `xf.distinct` on a missing column) now fails at compile time with a "column not found in upstream" message across 20+ transforms, instead of a cryptic runtime DuckDB binder error.
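The compile-time column check in the last item amounts to comparing each transform's referenced columns against the upstream schema before any SQL runs. A minimal sketch, assuming the schema is available as a list of column names; `check_columns` and `ColumnNotFoundError` are hypothetical names, and only the error wording is taken from the notes above.

```python
class ColumnNotFoundError(ValueError):
    """Raised at compile time when a node references a column it cannot see."""


def check_columns(node_id, referenced, upstream_columns):
    """Fail early if any referenced column is absent from the upstream schema."""
    known = set(upstream_columns)
    # Keep the original order so the message lists columns as the user wrote them.
    missing = [col for col in referenced if col not in known]
    if missing:
        raise ColumnNotFoundError(
            f"{node_id}: column not found in upstream: {', '.join(missing)}"
        )


# A valid reference passes silently.
check_columns("xf.distinct", ["email"], ["id", "email", "created_at"])

# A misspelled reference is caught before the query reaches DuckDB.
try:
    check_columns("xf.distinct", ["emial"], ["id", "email", "created_at"])
    message = ""
except ColumnNotFoundError as err:
    message = str(err)
```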
Engine performance
- Filter / Validator no longer materializes discarded rows. A Filter (or any `qa.*` validator) split its input into a pass set and a reject set and always wrote the reject set to a temp table, even when nothing downstream consumed the reject port. On a 10M-row source kept down to 2M that wrote the other 8M rows to disk for nothing - the filter step alone took ~16s. The reject set is now materialized only when the reject port is actually wired; otherwise the step is a lazy view (10M -> 2M filter: ~16s -> under 1s, matching raw DuckDB).
- Batched single-CLI-spawn execution. All-SQL pipelines run as one `duckdb.exe` invocation instead of one spawn per stage. Measured 4.5x faster on fixed overhead on Windows (5-stage pipeline: 245 ms -> 56 ms). Driver-backed stages and `ctl.*` hooks fall back to the per-stage path automatically.
- DuckDB PRAGMA preset (`preserve_insertion_order=false`, `enable_object_cache=true`, `enable_progress_bar=false`) on every run.
- Lazy materialization: single-consumer pure-SQL steps compile to `CREATE VIEW` instead of `TABLE`, so DuckDB inlines them and gets predicate/projection pushdown. New `DUCKLE_FORCE_VIEWS=1` forces views even for multi-consumer nodes (#5).
- Streaming NDJSON source loads: a 1 M-row x 37-col Oracle/SQL Server pull now stays at O(64 KiB) resident set instead of peaking at ~30 GB.
- Workspace memory knobs: `DUCKLE_MEMORY_LIMIT`, `DUCKLE_THREADS`, `DUCKLE_TEMP_DIR`.
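The lazy-materialization rule above can be read as a small decision function. The sketch below is an illustration under stated assumptions, not the engine's code: `relation_kind` and `compile_step` are hypothetical names, and the precedence it picks (a dynamic PIVOT always becomes a table, even when `DUCKLE_FORCE_VIEWS=1` is set; non-SQL steps are treated as tables) is a guess consistent with these notes rather than documented behavior.

```python
import os


def relation_kind(consumers, pure_sql, dynamic_pivot=False, env=None):
    """Pick VIEW or TABLE for one compiled step.

    consumers: how many downstream nodes read this step's output.
    pure_sql: whether the step is expressed entirely in SQL.
    dynamic_pivot: dynamic PIVOT cannot live in a view, so it stays a table.
    env: environment mapping; defaults to os.environ.
    """
    env = os.environ if env is None else env
    if dynamic_pivot or not pure_sql:
        return "TABLE"  # assumption: these never become views
    if consumers <= 1:
        return "VIEW"  # single consumer: let DuckDB inline it
    if env.get("DUCKLE_FORCE_VIEWS") == "1":
        return "VIEW"  # opt-in: views even for multi-consumer nodes
    return "TABLE"


def compile_step(name, select_sql, **options):
    """Render the DDL for a step using the rule above."""
    return f"CREATE {relation_kind(**options)} {name} AS {select_sql}"


single = compile_step("kept", "SELECT * FROM src WHERE ok", consumers=1, pure_sql=True, env={})
shared = compile_step("kept", "SELECT * FROM src WHERE ok", consumers=2, pure_sql=True, env={})
forced = compile_step(
    "kept", "SELECT * FROM src WHERE ok",
    consumers=2, pure_sql=True, env={"DUCKLE_FORCE_VIEWS": "1"},
)
```

A view costs nothing until something reads it, which is why the single-consumer case benefits: the downstream query and the filter are planned together, so predicates and projections can be pushed down.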
Engine correctness
- #8 - loading into SQL Server / Oracle. The desktop wrongly required an output path for database / warehouse / broker sinks, and the driver sinks only ran INSERTs so a brand-new target table failed with "Invalid object name" / ORA-00942. Database sinks no longer require a path, and SQL Server / Oracle sinks now auto-create the target table (inferring column types from the upstream) when it does not exist. Verified on live SQL Server 2022 and Oracle 21 for both new and existing tables.
- src.oracle high-precision NUMBER: a scaled NUMBER was read through a 64-bit double, which only round-trips ~15 significant digits, so e.g. a NUMBER(38,12) silently lost its last digits. The exact value is now preserved when it would not survive a double; BINARY_DOUBLE / BINARY_FLOAT (true IEEE floats) are unchanged.
- #4 - database source on a wide table hung, sink wrote nothing: the per-stage executor previewed each node with `SELECT * ... LIMIT 100`, but read the CLI's output only after it exited. A wide table's preview (e.g. a 36-column date dimension, ~128 KiB of JSON) overflowed the OS pipe buffer, so the CLI blocked writing while the engine blocked waiting - hanging the run on the source node's preview before the sink ever ran. The runner now drains output concurrently, so any width/size completes. This was width-specific, not Oracle-specific (a wide SQL Server table hit it too).
- #4 - src.oracle wide-table data loss + slow fetch (earlier): type-aware value dispatch (no more silently-NULLed DATE/TIMESTAMP/BLOB) + `prefetch_rows(1000)`, plus session NLS normalization and a per-run liveness trace at `%TEMP%/duckle-oracle-trace.log`.
- Paginated sources fail loudly on truncation: REST (cursor/page/offset/link), Qdrant, Weaviate, Milvus, DynamoDB, and Elasticsearch now return a clear error if they hit the `maxPages` safety cap with more data still upstream, instead of silently materializing a partial result.
- src.sqlserver data loss on DATE/DATETIME/DECIMAL/BINARY/GUID - same correctness pattern.
- xf.cast honors its on-error setting (TRY_CAST vs CAST), and xf.addcol casts the new column to its declared type.
- Column validator no longer falsely rejects references to columns produced upstream by window / rank / row-number / other column-adding transforms.
- Joins: composite keys, ambiguous-column dedupe (USING / EXCLUDE), NULL-safe anti-join (NOT EXISTS).
- UNION / INTERSECT now BY NAME to dodge silent positional-column corruption.
- Window functions error at planner time when ORDER BY is missing.
- xf.transpose / xf.pivot no longer break under lazy materialization (dynamic PIVOT can't live in a view; forced to TABLE).
- arr.contains NULL-safe; xf.cast guards against empty/duplicate cast entries.
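The #4 wide-table hang is a classic pipe deadlock, and it is worth recognizing if you script around the DuckDB CLI yourself. The sketch below reproduces the safe pattern with Python's standard `subprocess` module; it is not Duckle's runner, `run_and_capture` is a hypothetical name, and a small Python child process stands in for the CLI printing a wide preview.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return (exit_code, stdout_bytes) without risking a pipe deadlock."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # communicate() reads stdout while the child is still running, so the child
    # never blocks on a full pipe. The broken pattern is the reverse order:
    # proc.wait() first, then proc.stdout.read(), which hangs as soon as the
    # output is larger than the OS pipe buffer.
    out, _ = proc.communicate()
    return proc.returncode, out


# Stand-in for a wide preview: a child that writes 200,000 bytes, which is
# more than the ~128 KiB preview mentioned above.
child = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 200000)"]
exit_code, output = run_and_capture(child)
```

The fix has nothing to do with the database on the other end, which matches the observation that a wide SQL Server table triggered the same hang as Oracle.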
Features carried since v0.1.0
60 UI languages, DuckDB Quack remote protocol, xf.fill_backward / xf.fill_constant, xf.row_hash + xf.audit, src.csv dateFormat/timestampFormat, full UI i18n coverage. Plus the v0.1.0-hotfix fixes for #2 (DUCKLE_DUCKDB_BIN) and #3 (CSV declared schema).
Build
- The macOS Intel binary is now cross-compiled on the Apple-Silicon runner (`x86_64-apple-darwin`), removing the dependency on GitHub's scarce `macos-13` runners that previously stalled releases for hours.
Binaries (6 platforms)
| Platform | File |
|---|---|
| Windows x64 | Duckle-windows-x64.exe |
| Windows arm64 | Duckle-windows-arm64.exe |
| Linux x64 | Duckle-linux-x64 |
| Linux arm64 | Duckle-linux-arm64 |
| macOS arm64 (Apple Silicon) | Duckle-macos-arm64 |
| macOS x64 (Intel) | Duckle-macos-x64 |
Single-binary, engines-download-on-first-launch. Swap the binary in place over any earlier build; your workspace folder and the engine cache at ~/.duckle/engines/ are untouched.