Duckle v0.3.0
Duckle v0.3.0 - local-first, embedded ETL on DuckDB.
Patched since the initial v0.3.0 release
The binaries on this release have been refreshed. Re-download the binary for
your OS below to pick these up (the version number is unchanged).
2026-06-16 re-roll - workspace-relative paths, clearer auto-layout, materialization control, JSON zip
- Auto-layout now arranges the canvas by dependency depth (issue #36): nodes
flow left to right by their position in the graph, siblings stacked and
centered, with generous spacing so edges and connectors stay readable. The
previous layout placed every node on a single row and ignored the wiring.
- Workspace-relative paths (issue #37): a built-in ${workspace} placeholder
(alias ${projectroot}) resolves to the active workspace root, so source and
sink paths can be written relative to it and a whole workspace folder stays
portable when it is copied or moved. No context needs to be defined; it
resolves on the canvas, in schema autodetect, and in headless / scheduled runs.
- Schema autodetect now resolves context variables before inspecting a source,
so a path (or any field) bound to a context variable can be detected, not just
a hand-typed literal. Previously autodetect sent the raw ${...} placeholder
to the engine and could not find the file.
- Per-stage materialization control on every node's Basic tab. Choose how a step
is stored: Auto (a view for a single consumer, a table when several steps read
it), View (always lazy), Memory (read once, held as a table buffered in RAM),
or Disk (read once, streamed through a temporary Parquet file to keep memory
low for very large intermediates). In addition, a source that feeds a filter
or quality validator which splits rows into pass and reject is now read once
automatically instead of being scanned twice.
- New Zip Arrays to Table transform (Transform > Array). Turn a record that
carries a list of column names and a list of row-arrays into a normal table
with one column per name and one row per array - the common "headings + rows"
JSON shape, with no hand-written SQL.
- The local webhook source no longer drops requests on macOS under load.
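
To illustrate the "headings + rows" shape that Zip Arrays to Table targets, here is a minimal Python sketch of the same idea. It is not Duckle's implementation; the key names "columns" and "rows" and the function name are assumptions chosen for the example.

```python
def zip_arrays_to_table(record, names_key="columns", rows_key="rows"):
    """Pair a list of column names with each row-array to build row dicts.

    names_key / rows_key are hypothetical field names for this sketch.
    """
    names = record[names_key]
    table = []
    for row in record[rows_key]:
        # A row whose length differs from the header cannot be zipped safely.
        if len(row) != len(names):
            raise ValueError(
                f"row has {len(row)} values, expected {len(names)}"
            )
        table.append(dict(zip(names, row)))
    return table


payload = {"columns": ["id", "name"], "rows": [[1, "Ada"], [2, "Grace"]]}
table = zip_arrays_to_table(payload)
```

Each dictionary in the result corresponds to one output row, with one key per column name.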
2026-06-15 re-roll - local accounts, faster context switching, corruption-safe workspaces
- Local multi-account profiles: create one or more named profiles (with an
optional picture), shown in the top-right corner, and switch between them for
quick context switching. Each profile remembers its own workspace folder, so
switching profiles swaps the whole project context in one click. Profiles are
stored only on this device, are never transmitted, and have no password -
they separate working contexts, they are not a security boundary.
- A single corrupt workspace file no longer hides everything (issue #35):
if a context, connection or pipeline JSON file is hand-edited into invalid
JSON, Duckle now skips just that file and shows a banner naming it, instead of
silently failing to load the entire workspace. If a structural file
(duckle.json / repository.json) is invalid, the workspace stays
un-editable and is never overwritten, so the good files on disk are protected
until you fix or restore the broken one.
- Account-switching fixes (all part of the new profiles feature):
  - Switching to, or deleting back to, a profile that points at the workspace
    already open no longer blanks the canvas down to the default sample.
  - On a cold start Duckle opens the active profile's workspace, not the last
    globally used folder.
  - The account menu and editor are no longer clipped by the top bar, so the
    dropdown opens and you can switch, add, edit and remove profiles.
- Project: a documentation site is now live at https://duckle.org.
2026-06-14 re-roll (issue fixes)
- src.xml now captures text inside CDATA sections instead of skipping it, so a
value written by snk.xml (which uses CDATA for complex cells) round-trips back
correctly (issue #33).
- A pipeline run via a schedule now resolves workspace context the same way the
canvas does, so a context-based value (for example an Oracle password stored
as a context variable) is substituted before the run. Previously the raw
placeholder reached the driver and a job that worked from the canvas failed
under a schedule with errors like ORA-01017 (issue #32).
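
The failure mode in issue #32 comes down to a substitution step that did not run before the driver saw the value. A rough Python sketch of such a step follows; the function name, the regular expression and the rule of leaving unknown placeholders untouched are assumptions for illustration, not Duckle's actual resolver.

```python
import re

# Matches ${name}; the name may contain anything except a closing brace.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(value, workspace_root, context):
    """Replace ${...} placeholders in a string field (hypothetical helper).

    - ${workspace} and its alias ${projectroot} map to workspace_root.
    - Any other name is looked up in the context dictionary.
    - Unknown names are left as-is so a later stage can handle them.
    """
    def substitute(match):
        name = match.group(1)
        if name in ("workspace", "projectroot"):
            return str(workspace_root)
        if name in context:
            return str(context[name])
        return match.group(0)

    return PLACEHOLDER.sub(substitute, value)


context = {"db_password": "s3cret"}
path = resolve_placeholders("${workspace}/data/in.csv", "/home/me/proj", context)
password = resolve_placeholders("${db_password}", "/home/me/proj", context)
```

Running the same function for canvas runs and scheduled runs is what keeps the two paths in agreement.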
2026-06-13 re-roll (UI follow-up)
- The run-failure banner in the Output panel now has a dismiss (X) button on
the right; it reappears on the next failed run.
- The component properties panel always opens on the Basic tab. Switching to
Schema / Preview / Advanced / Validation is your choice and stays put while a
component is selected; picking a different component resets back to Basic.
2026-06-13 re-roll - full-codebase correctness audit (67 fixes)
A multi-pass review of the whole codebase (engine, connectors, transforms,
desktop, frontend) with every finding independently verified before fixing.
Binaries refreshed; re-download below to pick these up.
- Reliability and data safety:
  - Cancelling a run now cancels only that run; a nested sub-pipeline (Iterate /
    ForEach / Run Job / Parallelize) no longer resets or steals another run's
    cancel.
  - The DuckDB engine and the local AI model now download atomically (written to
    a temp file and renamed into place), so an interrupted download can never
    leave a half-written file that looks installed.
  - A partial "Run from here" no longer advances incremental / change-feed
    watermarks, so a later full run cannot skip rows that a preview loaded but
    never wrote to a sink.
  - In fast batched runs, a failing transform is now blamed on the correct node
    instead of the downstream sink.
  - Autosave keeps a tab marked unsaved if the write actually failed, instead of
    silently losing edits.
- Formats and connectors:
  - Cloud (S3 / GCS / Azure) sources and sinks now reject Avro / ORC clearly
    instead of silently reading them as CSV or writing Parquet to a .avro/.orc
    path.
  - CSV "Windows-1252" encoding now works (it was previously rejected).
  - Kafka "Initial offset" (earliest / latest) is now honoured; the default
    reads the available backlog.
  - Snowflake and Databricks requests use the merged OS + bundled trust store
    again (fixes a corporate-proxy / Zscaler TLS regression). The Snowflake sink
    waits for an async statement to finish (no false success), REST page
    pagination starts from the right page, and the Snowflake source handles
    gzipped partitions and typed columns.
  - Identifier escaping hardened for ClickHouse, Cassandra and Oracle; the XML
    sink emits only valid element names; the Mongo source no longer drops a row
    when one value fails to convert.
- Transforms and SQL:
  - Window aggregate (aggwin) with an Order by keeps the per-partition total
    instead of silently becoming a running total.
  - INTERSECT / EXCEPT match by column name; NTILE uses the requested bucket
    count; rank direction is parsed correctly.
  - Denormalize, array-collect and JSON array-agg now produce a deterministic
    element order.
  - An aggregation on a named column with no function is rejected instead of
    silently becoming a row count.
- Write-mode safety:
  - A SQLite / DuckDB sink with an unrecognised write mode (a typo like "appnd")
    now errors instead of dropping and recreating the table.
  - Upsert with blank conflict columns is rejected; a database port above 65535
    is range-checked instead of wrapping to the wrong port.
- Hardening:
  - Closed several secret-leak paths in SQL export, the MCP server and git push
    output; the per-workspace secret key file is created owner-only.
  - UTF-8 panic guards in CSV type sniffing and the Map node, a size cap on the
    webhook body, temp-file cleanup, atomic config writes, and safer git
    clone/checkout argument handling.
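
The atomic download and atomic config write items above rely on the write-to-temp-then-rename pattern. A minimal Python sketch of that pattern is shown here for reference; the function name and the ".part" suffix are assumptions, and this is not the code Duckle ships.

```python
import os
import tempfile


def write_atomically(dest_path, chunks):
    """Write byte chunks to dest_path so readers never see a partial file.

    The data goes to a temp file in the same directory (so the rename stays
    on one filesystem) and is moved into place only after a full write.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, dest_path)  # replaces dest_path in one step
    except BaseException:
        # On any failure, remove the temp file and leave dest_path untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the chunk iterator raises partway through (for example a dropped connection), the destination path is never created, so nothing "looks installed".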
2026-06-12 re-roll - secrets at rest, connector correctness, desktop reliability
- Secrets at rest: saved connection secrets (passwords, tokens, keys) and the
cached git token are now encrypted with a per-workspace AES-256-GCM key under
.duckle/keys/. Only secret fields are encrypted; host, database and user
names stay readable. The key is gitignored, so a committed connections/
folder can be shared without exposing credentials. Existing plaintext values
are encrypted on the next save, and ${ENV:...} placeholders are never
encrypted.
- Connector correctness:
  - Databricks sink no longer counts a still-running or failed write as
    success: it inspects the statement state, polls it to completion and fails
    loudly on error.
  - RabbitMQ source acknowledges messages only after the batch is durably
    written, so a write failure leaves them queued for redelivery instead of
    dropping them.
  - Webhook source answers 200 only after the batch is persisted (503 on
    failure), so a sender never treats a never-stored event as delivered.
  - The Avro sink infers a nullable schema, so a null in any column no longer
    aborts the whole file.
  - The XML sink splits a literal ]]> across two CDATA sections when writing
    nested values, keeping the output well-formed.
  - The MongoDB sink delete propagation matches boolean and numeric flag
    columns, not only strings.
  - Snowflake and Databricks delete propagation escapes backslashes in the
    delete value so it matches the source value.
  - The Snowflake source errors clearly on an unnamed result column instead of
    risking misaligned columns.
- Desktop reliability:
  - A cached git token that cannot be decrypted (missing workspace key) now
    reports a clear error instead of a confusing push-authentication failure.
  - The local model download integrity check no longer deletes a valid
    download on a transient read error.
  - Auto Layout now persists the new node positions.
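
For readers unfamiliar with the CDATA fix listed under Connector correctness: a CDATA section cannot contain the sequence ]]>, so the usual workaround is to end the section after ]] and open a new one before >. The Python sketch below shows that general technique and checks that it round-trips; the helper name is hypothetical and it is not taken from the snk.xml source.

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA, splitting any literal ]]> across two sections."""
    # "]]>" becomes "]]" + end-of-section + new section + ">".
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


value = "a ]]> b"
document = "<cell>" + to_cdata(value) + "</cell>"

# Parsing the result gives back the original text, so the output is
# well-formed and the value survives a write/read round trip.
parsed = ET.fromstring(document).text
```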
2026-06-12 re-roll
- Engine correctness pass:
  - Cloud (S3 / GCS / Azure / HTTP) JSON sources now honour the Records path
    option and the 100 MB object-size cap, same as local JSON sources.
  - Inline dbt models with a non-identifier name (for example my-model) no
    longer fail with table-not-found on read-back; the written and read names
    now always agree.
  - A Distinct node with an Order by but no key columns now errors instead of
    silently dropping the ordering.
  - The GCP Pub/Sub source persists rows before acknowledging them, so a
    failure no longer loses the batch.
  - A REST response of {} now yields zero rows (like []), not one empty row.
  - BigQuery ATTACH escapes its project / dataset values.
- Multi-source guards and dbt:
  - Wiring a second input into a join / diff / SCD lookup port now errors
    (it was silently dropped). Map nodes still accept multiple lookups.
  - A dbt node exposes all its upstream tables to the project as the list
    var duckle_inputs (each is also a real table dbt reads via sources).
- App and community:
  - The native webview right-click menu (Back / Reload / Print) is suppressed
    on the app header, footer and chrome. The canvas and Projects tree keep
    their own menus, and text fields keep copy / paste.
  - A "Report a bug" button in the status bar and a Discord banner in the
    README link to the community server.
2026-06-11 re-roll
- Multiple DuckDB sources reading from the same database file in one pipeline
no longer fail with 'database with name "duckle_src" already exists'. Each
attach-backed stage releases its alias so any number can target one file.
- History tab (#29): it rendered transparent so the canvas showed through;
it now has an opaque background like the Run and Plan tabs.
- JSON source Records path (#27): the plain JSON source exposes the Records
path option too, not just the JSONL source.
- Saved connections in node parameters (#30): a kind-filtered "Saved
connection" picker at the top of each credential block auto-fills the
fields below.
- Copy and paste components (#28): Ctrl/Cmd+C and Ctrl/Cmd+V copy and paste
components within a pipeline or across pipelines.
Highlights
dbt, rebuilt around speed
- dbt Fusion is now the default engine: a Rust dbt runtime that parses and
builds projects far faster than the Python toolchain. dbt Core (Apache,
via dbt-duckdb) stays as an automatic fallback when Fusion is unavailable.
- Free first-launch provisioning: Duckle fetches and sets up its dbt engine
for you on first use. No Python setup, no separate install, no cost.
- Multi-source dbt models: an xf.dbt node accepts several upstream inputs at
once. Each wired source materializes a real table named by its node id, and
your models read them through dbt sources - cross-system modeling inside
one dbt build.
- A flagship Customer 360 example ships in the gallery: six sources across
four system types feed a dbt build (staging, intermediate and mart models
with tests), then enrich and fan out to four sinks.
JSON ingestion: nested record paths
- The JSON and JSONL sources have a Records path option. Point it at the key
that holds the record array in a REST envelope (for example data, or a
dotted path like response.records) and Duckle unnests it into proper
columns, recursively flattening nested structs.
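
As a rough picture of what a Records path does, the Python sketch below walks a dotted path to the record array and flattens nested objects into columns. The function names and the dot-joined column naming are assumptions for this example; Duckle's own column names may differ.

```python
def extract_records(document, records_path):
    """Follow a dotted path (e.g. "response.records") to the record array."""
    node = document
    for key in records_path.split("."):
        node = node[key]
    if not isinstance(node, list):
        raise TypeError(f"{records_path!r} does not point at an array")
    return node


def flatten(record, prefix=""):
    """Recursively flatten nested objects into a single-level dict.

    Nested keys are joined with "." here; that naming is an assumption.
    """
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


envelope = {
    "response": {
        "records": [
            {"id": 1, "address": {"city": "Oslo", "zip": "0150"}},
            {"id": 2, "address": {"city": "Bergen", "zip": "5003"}},
        ]
    }
}
rows = [flatten(r) for r in extract_records(envelope, "response.records")]
```

Each element of rows is one flat record, ready to be treated as a table row.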
Native brand icons across the palette
- Sources, sinks and SaaS connectors show full-colour brand logos on the
canvas, in the palette and in quick-add search. Transforms keep their
themed glyphs.
Local multi-account profiles
- Create named profiles with an optional picture, shown top-right, each bound
to its own workspace folder, and switch between them for quick context
switching. Stored only on this device, never transmitted, no password.
Canvas and panels
- The right-hand properties panel is collapsible to the edge, with the state
remembered between sessions.
- Each pipeline remembers its own canvas viewport (pan and zoom).
Operations (Phase 2)
- Structured error taxonomy in the engine for clearer, categorised failures.
- Prometheus metrics: a .prom textfile under the workspace logs/ dir for
node_exporter's textfile collector or Grafana Alloy (no HTTP endpoint, no
agent - works headless and air-gapped).
- Backfill controls: inspect and reset incremental watermarks from the UI and
the runner.
- A Runs history tab to review past executions and trends.
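
For context on the textfile approach, the Python sketch below renders two gauges in the Prometheus text exposition format and writes them to a .prom file via a temp file and rename, so a collector never reads a half-written file. The metric names, label and file name are hypothetical; the metrics Duckle actually emits are not listed in these notes.

```python
import os
import tempfile


def escape_label(value):
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(pipeline, rows_written, duration_seconds):
    """Build the text body; metric names here are made up for illustration."""
    label = f'pipeline="{escape_label(pipeline)}"'
    lines = [
        "# HELP duckle_run_rows_written Rows written by the last run.",
        "# TYPE duckle_run_rows_written gauge",
        f"duckle_run_rows_written{{{label}}} {rows_written}",
        "# HELP duckle_run_duration_seconds Duration of the last run.",
        "# TYPE duckle_run_duration_seconds gauge",
        f"duckle_run_duration_seconds{{{label}}} {duration_seconds}",
    ]
    return "\n".join(lines) + "\n"


def write_textfile(path, body):
    """Write to path + ".tmp" first, then rename over the .prom file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(body)
    os.replace(tmp_path, path)


logs_dir = tempfile.mkdtemp()  # stands in for the workspace logs/ directory
prom_path = os.path.join(logs_dir, "duckle.prom")
body = render_metrics("orders", 42, 1.5)
write_textfile(prom_path, body)
```

Pointing node_exporter's textfile collector at the directory is then enough for the metrics to be scraped, with no HTTP endpoint in the application itself.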
Reliability
- A single corrupt or hand-edited workspace JSON file no longer blocks the
whole workspace from loading; the bad file is skipped and named, and good
files on disk are protected from being overwritten.
- All helper subprocesses (dbt, git, shell, the DuckDB CLI) run without
flashing a console window on Windows.
- Engine staging is resilient when a binary is locked by a running instance:
Duckle stages the update aside and swaps it in rather than failing with an
access error.
Install
Download the raw executable for your OS below and run it - no installer.
Upgrading
This is a drop-in upgrade from v0.2.0. Existing pipelines and connections work
unchanged. The dbt engine provisions itself the first time you run a dbt node.