Releases: slothflowlabs/duckle
Duckle v0.4.1
Duckle is a local-first ETL/ELT studio: design pipelines on a canvas or describe them to the on-device AI assistant, and run them at native speed on DuckDB. v0.4.1 upgrades the bundled DuckDB engine to 1.5.4, adds in-app self-update so you never hand-download a build again, gives the duck-family sources a Custom SQL read mode, fixes REST connectivity behind a proxy, and ships a refreshed brand.
DuckDB 1.5.4 and in-app updates
- Managed DuckDB upgraded to 1.5.4. The bundled engine now pins DuckDB 1.5.4. Installs are version-aware: a .installed-version stamp is written on install, so a version bump is detected and re-fetched instead of leaving a stale binary in place.
- Existing installs are prompted to upgrade in place, not reinstall. An out-of-date engine keeps running and no longer triggers the blocking setup modal. A dismissible banner offers a one-click in-place upgrade ("DuckDB engine 1.5.4 is available"), the engine binary is overwritten in place, and the previous version's cached cross-OS binaries are cleared so an old version is not left behind.
- New in-app self-update: "Update now". One click downloads the latest build for your OS, verifies it against the release SHA256SUMS.txt (and refuses to proceed if checksums are missing), swaps it over the running app, and restarts onto it, so there is no more manually downloading new files. Progress is shown in the banner, and a manual download remains as a fallback. Releases now publish SHA256SUMS.txt so the updater can verify downloads.
- Update banner secondary action renamed to "View Changelog". Since "Update now" handles the upgrade in-app, the secondary button now opens the releases page for the per-tag notes.
- ARM64 Linux engine install fixed. The managed DuckDB download for ARM64 Linux pointed at the wrong asset name, so the engine could 404 and never install on arm64 Linux. The download URL is now correct.
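The checksum step of "Update now" can be pictured with a minimal Python sketch. It assumes SHA256SUMS.txt uses the common `sha256sum` layout (hex digest, whitespace, file name); the release notes do not specify the format, and the function names here are hypothetical rather than Duckle's own.

```python
import hashlib
from pathlib import Path


def parse_sha256sums(text):
    """Parse 'HEX  filename' lines (assumed sha256sum layout) into {filename: hex}."""
    sums = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def verify_download(path, sums_text):
    """Raise if the checksum entry is missing or does not match the file."""
    path = Path(path)
    expected = parse_sha256sums(sums_text).get(path.name)
    if expected is None:
        # Mirrors the documented behavior: no checksum, no update.
        raise RuntimeError(f"no checksum published for {path.name}; refusing to update")
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    if actual != expected:
        raise RuntimeError(f"checksum mismatch for {path.name}")
```

Treating a missing entry as a hard failure, not a warning, is what keeps an attacker from bypassing verification by simply withholding the checksum file.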
Duck-family source options (#76, #77)
- Custom SQL read mode for more duck sources (#77). The DuckLake, MotherDuck, and Quack sources now offer a Read mode choice of Whole table or Custom SQL, matching the flexibility the DuckDB source already had. Pick Custom SQL to write your own query against the attached catalog (referenced as duckle_src), e.g. SELECT * FROM duckle_src.main.orders WHERE status = $1, which also pushes your filter down into the source scan.
- Table name is now optional on those sources. Because Custom SQL supplies its own query, the Table field is no longer hard-required, so a Custom SQL read validates without naming a table.
- The View materialize choice is now honored on duck-family sources (#76). Selecting Materialize = View on an attach-backed source (DuckDB, DuckLake, MotherDuck, Quack, Iceberg, Delta) was previously dropped and silently turned into a table. It now runs through a safe lazy path so your choice sticks and downstream stages get projection and predicate pushdown. The default "auto" behavior is unchanged.
Merge mode and Inline SQL fixes (#39, #78)
- Merge write mode now available in the app for DuckDB and SQLite sinks (#39). The DuckDB-native Merge mode (update only the columns the source provides, insert new rows, and leave other target columns untouched) is now selectable from the write-mode dropdown for the DuckDB and SQLite sinks, matching what the engine already supports.
- Saved SQL routine picks now stick and run in the Inline SQL transform (#78). Selecting a saved routine no longer silently reverts or fails the run with "Custom SQL is empty". The chosen routine is applied in a single update so it persists, with its body inlined into the field the component actually reads.
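The Merge write mode semantics described above (update only the columns the source provides, insert new rows, leave other target columns untouched) can be illustrated with a small in-memory Python sketch. It models rows as dicts and is not the engine's implementation; `merge_rows` is a hypothetical name.

```python
def merge_rows(target, source, key):
    """Merge source rows into target rows (lists of dicts) matched on `key`.

    Matched rows: only the columns present in the source row are overwritten.
    Unmatched rows: appended as new rows.
    """
    by_key = {row[key]: row for row in target}
    for src in source:
        existing = by_key.get(src[key])
        if existing is None:
            new_row = dict(src)
            target.append(new_row)
            by_key[src[key]] = new_row
        else:
            existing.update(src)  # columns absent from src keep their old values
    return target
```

For example, merging `{"id": 1, "name": "b"}` into a target row `{"id": 1, "name": "a", "note": "keep"}` changes `name` and leaves `note` alone, whereas a delete-and-reinsert would lose it.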
Proxy support and output panel (#80)
- Proxy support for REST and cloud-API connectors (#80). The shared HTTP agent that powers the REST connector and all cloud-API connectors now reads an HTTP proxy from the environment, so calls route through it instead of connecting directly and timing out (os error 10060) on networks that require a proxy.
- Flexible proxy configuration. Duckle reads DUCKLE_HTTPS_PROXY / DUCKLE_PROXY first (a Duckle-only proxy that does not change your global env), then the conventional HTTPS_PROXY, ALL_PROXY, and HTTP_PROXY (any case). The URL may include credentials, e.g. http://user:pass@proxy:8080. An invalid proxy URL is logged and skipped rather than failing the run.
- In-app updater honors the proxy too. The updater's HTTP client picks up the same proxy setting, so update checks and downloads work behind a corporate proxy. New README rows document the env var and a matching troubleshooting entry.
- Run Output panel now scrolls. Removed a nested fixed height on the output area so it no longer clips its rows; the panel scrolls when a run produces many stages of output.
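The proxy lookup order from the notes above can be sketched in Python as follows. Two details are assumptions, since the notes do not spell them out: "any case" is modelled as the exact name followed by its lower-case form, and an invalid value is skipped so that the next variable is tried. `resolve_proxy` is a hypothetical name.

```python
import logging
import os
from urllib.parse import urlsplit

# Lookup order taken from the release note; Duckle-specific names win.
PROXY_VARS = ["DUCKLE_HTTPS_PROXY", "DUCKLE_PROXY", "HTTPS_PROXY", "ALL_PROXY", "HTTP_PROXY"]


def _lookup(env, name):
    # Assumption: "any case" means the upper-case name, then the lower-case one.
    return env.get(name) or env.get(name.lower())


def resolve_proxy(env=os.environ):
    """Return the first usable proxy URL, or None to connect directly."""
    for name in PROXY_VARS:
        value = _lookup(env, name)
        if not value:
            continue
        parts = urlsplit(value)
        if parts.scheme and parts.hostname:
            return value
        # Assumption: an invalid value is skipped and the next variable is tried.
        # Only the variable name is logged, because the URL may carry credentials.
        logging.warning("ignoring invalid proxy URL in %s", name)
    return None
```

With `DUCKLE_PROXY` and `HTTP_PROXY` both set, the sketch returns the `DUCKLE_PROXY` value, which is how a Duckle-only proxy can coexist with a different global setting.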
Brand refresh
- New orange accent throughout. The brand accent moves from lemon yellow to orange (#ff7a45 / #ed5f22) across the desktop app in both dark and light themes: buttons, focus rings, selection highlights, node run badges, and the in-app glow, plus the marketing site, docs graphics, and the announcement bar.
- New pixel-art "D" sloth logo and app icon. A fresh orange pixel-art "D" sloth replaces the old geometric mark in the in-app header, the README hero, and the regenerated desktop / iOS / Android icon sets, with a true vector version on the website nav, footers, and ecosystem hub (including a dark-circle badge so it reads on light backgrounds).
- "Duckle by SlothFlowLabs" maker signature. The header wordmark now carries a "by SlothFlowLabs" signature beneath "Duckle" in the app, and across the website and README.
- Independence clarification. The README and every website page now state that Duckle is an independent, open-source project that builds on the DuckDB engine but is not part of, affiliated with, or endorsed by DuckDB Labs or MotherDuck.
Download
Grab the single-file binary for your OS below. The headless runner and the MCP server are embedded, so there are no separate downloads.
| OS | Asset |
|---|---|
| Windows x64 | Duckle-windows-x64.exe |
| Windows ARM64 | Duckle-windows-arm64.exe |
| macOS (Apple Silicon) | Duckle-macos-arm64 |
| macOS (Intel) | Duckle-macos-x64 |
| Linux x64 | Duckle-linux-x64 |
| Linux ARM64 | Duckle-linux-arm64 |
On Windows, double-click (SmartScreen may warn the first time on an unsigned binary; choose More info then Run anyway). On macOS / Linux, chmod +x the file and run it. On first launch Duckle guides you through installing DuckDB (required) and, optionally, the on-device Duckie AI assistant.
Upgrade notes
- Workspaces are forward-compatible; no migration is needed.
- Existing installs are offered DuckDB 1.5.4 as a one-click in-place upgrade, and an update notification can fetch this build directly with Update now.
- Built artifacts are unsigned by design (appending the self-extracting payload invalidates any code signature), so do not codesign or Authenticode-sign them.
Duckle v0.4.0
Duckle is a local-first ETL/ELT studio: design pipelines on a canvas or describe them to the on-device AI assistant, and run them at native speed on DuckDB. v0.4.0 turns Build Pipeline into a real cross-platform deploy tool, speeds up fan-out workloads, adds new ways to materialize and loop, and ships a brand-new website and docs. This build also adds run-time date/time path placeholders with auto-created folders, project-tree drag-and-drop, REST source improvements, and a partial-column merge write mode.
Highlights
Cross-OS Build Pipeline
Build Pipeline turns a pipeline into ONE self-contained executable. v0.4.0 adds a Target OS selector to the build dialog:
- Windows and Linux build natively, and a Linux server file can now be cross-built from any host - the static Linux engine and a Linux DuckDB CLI are bundled into the artifact for you, so you can produce a Linux deployment file without a Linux box.
- macOS builds on a Mac (Apple's toolchain is Mac-only); the dialog says so up front instead of failing late, and only offers a target it can actually produce.
- The output stays a single file: a runner stub plus a zipped payload (the resolved pipeline, contexts, secret handling, DuckDB, and only the extensions that pipeline needs). Copy it to a server and run or schedule it - nothing to install.
Materialize a step to DuckDB
Any step can now be pinned to memory, to disk, or persisted to a real .duckdb file you can reopen later. Use it to cache an expensive join across runs or to hand a step's output to a BI tool. The memory/disk choice moved to the Basic tab so it is one click away.
Parallel loops (ctl.foreach)
For-each loops can run their per-row child pipeline with a configurable concurrency. Fan-out workloads - per-table loads, per-partition exports, per-tenant moves - now run several children at once instead of strictly one at a time, which dramatically shortens wide ingests.
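As a rough illustration of a for-each with configurable concurrency, the Python sketch below runs a child callable once per row on a bounded thread pool. How Duckle actually schedules child pipelines is not described in these notes, so the thread pool and the `run_foreach` name are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_foreach(rows, child, concurrency=4):
    """Run `child(index, row)` once per row, at most `concurrency` at a time.

    Results come back in row order, regardless of completion order.
    """
    with ThreadPoolExecutor(max_workers=max(1, concurrency)) as pool:
        futures = [pool.submit(child, i, row) for i, row in enumerate(rows)]
        # result() re-raises a child's exception, so a failed child fails the loop.
        return [f.result() for f in futures]
```

Setting `concurrency=1` reproduces the strictly one-at-a-time behavior; larger values let independent children (per table, per partition, per tenant) overlap.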
New transform: Zip Arrays to Table (xf.zip)
Turn a headings array plus row-arrays into a proper table, one column per heading - the missing piece for APIs and sources that return columnar arrays.
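The shape of this transform can be shown with a few lines of Python. This is a sketch of the idea only; raising on a length mismatch is an assumption, and the real xf.zip component may handle ragged rows differently.

```python
def zip_arrays_to_table(headings, rows):
    """Turn a list of column names plus a list of row-arrays into a list of row dicts."""
    table = []
    for row in rows:
        if len(row) != len(headings):
            # Assumption: mismatched rows are an error; the real transform may pad or truncate.
            raise ValueError(f"row has {len(row)} values, expected {len(headings)}")
        table.append(dict(zip(headings, row)))
    return table
```

Given headings `["id", "name"]` and rows `[[1, "a"], [2, "b"]]`, the result is one record per row-array with one field per heading.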
Portable workspaces (${workspace})
A built-in ${workspace} placeholder resolves to the current workspace folder, so file paths inside a pipeline travel with the project across machines and accounts.
Local multi-account profiles
Keep separate accounts and workspaces and switch between them from the top bar without restarting. Cold start opens the active account's workspace, and switching accounts no longer blanks the canvas.
Resolved context-variable hints
Bindable property fields now show the resolved value of a ${VAR} inline (secrets masked), so you can see exactly what a node will use before you run it.
Dynamic paths, data-driven loops, and project organization
- Run-time path placeholders. Drop ${date}, ${time}, ${datetime}, ${timestamp}, or ${now} into any source or sink path and they resolve to the current run time (UTC). A sink's parent folder is created for you, so ${workspace}/exports/${date}/orders.parquet writes into a fresh dated folder. They resolve the same way on the canvas, on a schedule, with the headless runner, and inside a built bundle - the time is stamped when the pipeline runs, not when it was built, so a daily run of one bundle lands in a new dated folder each day.
- Data-driven source names. ctl.foreach reads its upstream rows and runs a child pipeline once per row, exposing each column as ${ITER_ITEM_<COLUMN>} (plus ${ITER_INDEX}). A table of names can drive the source table name in the child, so one pipeline loops over many tables.
- Organize the project tree. Drag a pipeline (or any item, or one of your own folders) onto a folder to move it; the move is saved to the workspace.
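The run-time path placeholders can be sketched in Python as below. The rendered formats are assumptions (the notes name the placeholders but not their output), ${timestamp} and ${now} are left out for the same reason, and the function names are hypothetical.

```python
import re
from datetime import datetime, timezone
from pathlib import Path

# Assumed formats: the notes name the placeholders but not their rendered form.
FORMATS = {
    "date": "%Y-%m-%d",
    "time": "%H%M%S",
    "datetime": "%Y-%m-%dT%H%M%S",
}


def resolve_path(template, workspace, now=None):
    """Expand ${workspace}, ${date}, ${time} and ${datetime} using the run time in UTC."""
    now = now or datetime.now(timezone.utc)
    values = {"workspace": str(workspace)}
    values.update({name: now.strftime(fmt) for name, fmt in FORMATS.items()})
    # Anything else (context variables, ${timestamp}, ${now}) is left untouched here.
    return re.sub(r"\$\{(\w+)\}", lambda m: values.get(m.group(1), m.group(0)), template)


def prepare_sink(template, workspace, now=None):
    """Resolve a sink path and create its parent folder before writing."""
    path = Path(resolve_path(template, workspace, now))
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Because the clock is read when `resolve_path` is called, not when the template is stored, the same template yields a new dated folder on each daily run.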
A new website and docs
A complete documentation site is live at duckle.org: a landing page, getting-started / components / automation / AI docs, a filterable integrations directory, light and dark themes, and search-friendly metadata.
Engine and connectors
- MotherDuck: the real MotherDuck brand mark; the inline token is applied via SET (no more repeated device-login prompts); sinks auto-create the target table on first write; and the Snowflake endpoint plus MotherDuck upsert options are exposed in the property panel.
- Context variables now resolve inside ctl.foreach and ctl.runjob child pipelines, and for schema autodetect.
- Per-stage materialization controls, and attach-backed Parquet sources stay on the fast path even under a reject split.
- REST source: the API-key auth header name is now configurable (so endpoints like X-Redmine-API-Key work instead of a fixed X-API-Key), and offset pagination can stop on a body total_count rather than fetching one empty page past the end.
- Merge write mode: DuckDB-family sinks (DuckDB, SQLite, DuckLake, MotherDuck) gain a merge mode that updates only the columns the source provides via MERGE INTO, preserving the columns it does not - the win over delete-and-reinsert, which would null them.
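The REST pagination change in the list above (stopping on a body total_count instead of fetching one empty page past the end) can be sketched in Python. `fetch_page` is a hypothetical callable standing in for the HTTP request, and the `items` field name is an assumption; only `total_count` comes from the notes.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect rows from an offset-paginated endpoint.

    `fetch_page(offset, limit)` is a hypothetical callable returning a dict like
    {"items": [...], "total_count": N}; total_count may be absent.
    """
    rows, offset = [], 0
    while True:
        body = fetch_page(offset, page_size)
        items = body.get("items", [])
        rows.extend(items)
        offset += len(items)
        total = body.get("total_count")
        if total is not None and offset >= total:
            break  # the body says we have everything: no extra request
        if not items:
            break  # fallback: stop on the first empty page
    return rows
```

With five rows and a page size of two, the total_count path issues three requests, while the fallback path needs a fourth request just to see the empty page.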
Canvas and workspace
- Smarter one-click auto-layout: nodes are ranked by dependency depth and columns are spaced by node width, so connectors always have room.
- A single corrupt workspace JSON file no longer blocks the whole workspace from opening.
- The account dropdown and editor are no longer clipped behind the top bar.
Reliability fixes
- The webhook source no longer drops requests on macOS.
- Relational and MotherDuck sinks create the destination on the first write in append mode.
Download
Grab the single-file binary for your OS below. The headless runner and the MCP server are embedded, so there are no separate downloads.
| OS | Asset |
|---|---|
| Windows x64 | Duckle-windows-x64.exe |
| Windows ARM64 | Duckle-windows-arm64.exe |
| macOS (Apple Silicon) | Duckle-macos-arm64 |
| macOS (Intel) | Duckle-macos-x64 |
| Linux x64 | Duckle-linux-x64 |
| Linux ARM64 | Duckle-linux-arm64 |
On Windows, double-click (SmartScreen may warn the first time on an unsigned binary; choose More info then Run anyway). On macOS / Linux, chmod +x the file and run it. On first launch Duckle guides you through installing DuckDB (required) and, optionally, the on-device Duckie AI assistant.
Upgrade notes
- Workspaces are forward-compatible; no migration is needed.
- In-app update detection will notify existing installs that a newer build is available.
- Built artifacts are unsigned by design (appending the self-extracting payload invalidates any code signature), so do not codesign or Authenticode-sign them.
Duckle v0.3.0
Duckle v0.3.0 - local-first, embedded ETL on DuckDB.
Patched since the initial v0.3.0 release
The binaries on this release have been refreshed. Re-download the binary for
your OS below to pick these up (the version number is unchanged).
2026-06-16 re-roll - workspace-relative paths, clearer auto-layout, materialization control, JSON zip
- Auto-layout now arranges the canvas by dependency depth (issue #36): nodes
  flow left to right by their position in the graph, siblings stacked and
  centered, with generous spacing so edges and connectors stay readable. The
  previous layout placed every node on a single row and ignored the wiring.
- Workspace-relative paths (issue #37): a built-in ${workspace} placeholder
  (alias ${projectroot}) resolves to the active workspace root, so source and
  sink paths can be written relative to it and a whole workspace folder stays
  portable when it is copied or moved. No context needs to be defined; it
  resolves on the canvas, in schema autodetect, and in headless / scheduled runs.
- Schema autodetect now resolves context variables before inspecting a source,
  so a path (or any field) bound to a context variable can be detected, not just
  a hand-typed literal. Previously autodetect sent the raw ${...} placeholder
  to the engine and could not find the file.
- Per-stage materialization control on every node's Basic tab. Choose how a step
  is stored: Auto (a view for a single consumer, a table when several steps read
  it), View (always lazy), Memory (read once, held as a table buffered in RAM),
  or Disk (read once, streamed through a temporary Parquet file to keep memory
  low for very large intermediates). In addition, a source that feeds a filter
  or quality validator which splits rows into pass and reject is now read once
  automatically instead of being scanned twice.
- New Zip Arrays to Table transform (Transform > Array). Turn a record that
  carries a list of column names and a list of row-arrays into a normal table
  with one column per name and one row per array - the common "headings + rows"
  JSON shape, with no hand-written SQL.
- The local webhook source no longer drops requests on macOS under load.
2026-06-15 re-roll - local accounts, faster context switching, corruption-safe workspaces
- Local multi-account profiles: create one or more named profiles (with an
  optional picture), shown in the top-right corner, and switch between them for
  quick context switching. Each profile remembers its own workspace folder, so
  switching profiles swaps the whole project context in one click. Profiles are
  stored only on this device, are never transmitted, and have no password -
  they separate working contexts, they are not a security boundary.
- A single corrupt workspace file no longer hides everything (issue #35):
  if a context, connection or pipeline JSON file is hand-edited into invalid
  JSON, Duckle now skips just that file and shows a banner naming it, instead of
  silently failing to load the entire workspace. If a structural file
  (duckle.json / repository.json) is invalid, the workspace stays
  un-editable and is never overwritten, so the good files on disk are protected
  until you fix or restore the broken one.
- Account-switching fixes (all part of the new profiles feature):
  - Switching to, or deleting back to, a profile that points at the workspace
    already open no longer blanks the canvas down to the default sample.
  - On a cold start Duckle opens the active profile's workspace, not the last
    globally used folder.
  - The account menu and editor are no longer clipped by the top bar, so the
    dropdown opens and you can switch, add, edit and remove profiles.
- Project: a documentation site is now live at https://duckle.org.
2026-06-14 re-roll (issue fixes)
- src.xml now captures text inside CDATA sections instead of skipping it, so a
  value written by snk.xml (which uses CDATA for complex cells) round-trips back
  correctly (issue #33).
- A pipeline run via a schedule now resolves workspace context the same way the
  canvas does, so a context-based value (for example an Oracle password stored
  as a context variable) is substituted before the run. Previously the raw
  placeholder reached the driver and a job that worked from the canvas failed
  under a schedule with errors like ORA-01017 (issue #32).
2026-06-13 re-roll (UI follow-up)
- The run-failure banner in the Output panel now has a dismiss (X) button on
  the right; it reappears on the next failed run.
- The component properties panel always opens on the Basic tab. Switching to
  Schema / Preview / Advanced / Validation is your choice and stays put while a
  component is selected; picking a different component resets back to Basic.
2026-06-13 re-roll - full-codebase correctness audit (67 fixes)
A multi-pass review of the whole codebase (engine, connectors, transforms,
desktop, frontend) with every finding independently verified before fixing.
Binaries refreshed; re-download below to pick these up.
- Reliability and data safety:
  - Cancelling a run now cancels only that run; a nested sub-pipeline (Iterate /
    ForEach / Run Job / Parallelize) no longer resets or steals another run's cancel.
  - The DuckDB engine and the local AI model now download atomically (written to
    a temp file and renamed into place), so an interrupted download can never
    leave a half-written file that looks installed.
  - A partial "Run from here" no longer advances incremental / change-feed
    watermarks, so a later full run cannot skip rows that a preview loaded but
    never wrote to a sink.
  - In fast batched runs, a failing transform is now blamed on the correct node
    instead of the downstream sink.
  - Autosave keeps a tab marked unsaved if the write actually failed, instead of
    silently losing edits.
- Formats and connectors:
  - Cloud (S3 / GCS / Azure) sources and sinks now reject Avro / ORC clearly
    instead of silently reading them as CSV or writing Parquet to a .avro/.orc path.
  - CSV "Windows-1252" encoding now works (it was previously rejected).
  - Kafka "Initial offset" (earliest / latest) is now honoured; the default
    reads the available backlog.
  - Snowflake and Databricks requests use the merged OS + bundled trust store
    again (fixes a corporate-proxy / Zscaler TLS regression). The Snowflake sink
    waits for an async statement to finish (no false success), REST page
    pagination starts from the right page, and the Snowflake source handles
    gzipped partitions and typed columns.
  - Identifier escaping hardened for ClickHouse, Cassandra and Oracle; the XML
    sink emits only valid element names; the Mongo source no longer drops a row
    when one value fails to convert.
- Transforms and SQL:
  - Window aggregate (aggwin) with an Order by keeps the per-partition total
    instead of silently becoming a running total.
  - INTERSECT / EXCEPT match by column name; NTILE uses the requested bucket
    count; rank direction is parsed correctly.
  - Denormalize, array-collect and JSON array-agg now produce a deterministic
    element order.
  - An aggregation on a named column with no function is rejected instead of
    silently becoming a row count.
- Write-mode safety:
  - A SQLite / DuckDB sink with an unrecognised write mode (a typo like "appnd")
    now errors instead of dropping and recreating the table.
  - Upsert with blank conflict columns is rejected; a database port above 65535
    is range-checked instead of wrapping to the wrong port.
- Hardening:
  - Closed several secret-leak paths in SQL export, the MCP server and git push
    output; the per-workspace secret key file is created owner-only.
  - UTF-8 panic guards in CSV type sniffing and the Map node, a size cap on the
    webhook body, temp-file cleanup, atomic config writes, and safer git
    clone/checkout argument handling.
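The atomic download fix listed under "Reliability and data safety" (write to a temp file, then rename into place) follows a well-known pattern. A minimal Python sketch of that pattern, with a hypothetical `write_atomically` helper rather than Duckle's own code, might look like this:

```python
import os
import tempfile


def write_atomically(dest, chunks):
    """Write `chunks` (an iterable of bytes) to `dest` so it is either complete or unchanged."""
    dest = os.fspath(dest)
    # Temp file in the same directory, so the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".", suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())
        os.replace(tmp, dest)  # atomic rename over any existing file
    except BaseException:
        # Interrupted or failed: drop the partial file and leave dest as it was.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```

If the download is interrupted part-way, the destination path still holds the previous complete file (or nothing), so an installer that checks for the file's existence never sees a half-written binary.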
2026-06-12 re-roll - secrets at rest, connector correctness, desktop reliability
- Secrets at rest: saved connection secrets (passwords, tokens, keys) and the
  cached git token are now encrypted with a per-workspace AES-256-GCM key under
  .duckle/keys/. Only secret fields are encrypted; host, database and user
  names stay readable. The key is gitignored, so a committed connections/
  folder can be shared without exposing credentials. Existing plaintext values
  are encrypted on the next save, and ${ENV:...} placeholders are never
  encrypted.
- Connector correctness:
  - Databricks sink no longer counts a still-running or failed write as
    success: it inspects the statement state, polls it to completion and fails
    loudly on error.
  - RabbitMQ source acknowledges messages only after the batch is durably
    written, so a write failure leaves them queued for redelivery instead of
    dropping them.
  - Webhook source answers 200 only after the batch is persisted (503 on
    failure), so a sender never treats a never-stored event as delivered.
  - The Avro sink infers a nullable schema, so a null in any column no longer
    aborts the whole file.
  - The XML sink splits a literal ]]> across two CDATA sections when writing
    nested values, keeping the output well-formed.
  - The MongoDB sink delete propagation matches boolean and numeric flag
    columns, not only strings.
  - Snowflake and Databricks delete propagation escapes backslashes in the
    delete value so it matches the source value.
  - The Snowflake source errors clearly on an unnamed result column instead of
    risking misaligned columns.
- Desktop reliability:
  - A cached git token that cannot be decrypted (missing workspace key) now
    reports a clear error instead of a confusing push-authentication failure.
  - The local model download integrity check no longer deletes a valid
    download on a transient ...
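The XML sink fix above, splitting a literal ]]> across two CDATA sections, uses a standard XML technique that can be shown in a few lines of Python. This is a sketch of the technique, not the sink's actual code; `to_cdata` and `round_trip` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA, splitting any literal ']]>' across two sections."""
    # ']]>' would end the section early, so close after ']]' and reopen before '>'.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def round_trip(text):
    """Parse the wrapped value back out of a small document to check it survives."""
    return ET.fromstring("<cell>" + to_cdata(text) + "</cell>").text
```

A parser concatenates the adjacent sections, so a value containing `]]>` reads back unchanged while the document stays well-formed.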
Duckle v0.2.0
Duckle v0.2.0 - local-first, embedded ETL on DuckDB.
Highlights
Universal upsert + CDC delete propagation
- mode = upsert (MERGE) on every relational sink: PostgreSQL, MySQL/MariaDB,
  CockroachDB, SQL Server, Oracle, Snowflake, Databricks, DuckDB and SQLite,
  plus MongoDB (replace_one).
- Optional delete propagation: a flag column (e.g. a CDC change type) removes
  matched rows from the target instead of upserting them. Verified live in
  Docker for SQL Server, Oracle, MySQL, Snowflake and DuckDB.
Change data capture + incremental
- DuckLake CDC change-feed source (src.ducklake.changes): replays
  table_changes() since the last consumed snapshot, emitting a change_type
  column - pairs with delete propagation for a true mirror.
- Watermark incremental load (xf.incremental): processes only rows past the
  last successful run's high-water mark, saved to workspace state.
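The watermark idea behind the incremental load can be sketched in Python as follows. The JSON state file and its layout are assumptions made for illustration; the notes say only that the mark is saved to workspace state.

```python
import json
from pathlib import Path


def load_incremental(rows, column, state_file):
    """Return rows whose `column` is past the saved high-water mark, plus a commit callback."""
    state_file = Path(state_file)
    # Hypothetical state layout: {"mark": <last value seen>}.
    mark = json.loads(state_file.read_text())["mark"] if state_file.exists() else None
    fresh = [r for r in rows if mark is None or r[column] > mark]

    def commit():
        # Call only after the run succeeds, so a failed run is retried from the old mark.
        if fresh:
            new_mark = max(r[column] for r in fresh)
            state_file.write_text(json.dumps({"mark": new_mark}))

    return fresh, commit
```

Separating the read from the commit is the point of the design: the mark advances only after a successful run, so rows that were read but never written are picked up again next time.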
Orchestration + canvas
- Run Job (parent calls a child pipeline with context variables) and
  Parallelize (independent downstream branches run concurrently; 0 = auto, one
  branch per CPU core).
- Control-flow nodes: Log, Warn, Die.
- Undo/redo across nodes, edges and settings; Ctrl/Cmd+S save; component-level
  run logs written under the workspace.
Look and feel
- DuckDB-aligned theme in light and dark modes: lemon-yellow / orange brand
fills for primary actions and selection.
Install
Download the raw executable for your OS below and run it - no installer.
Enterprise / corporate networks (v0.2.0, refreshed 2026-06-05)
Duckle's HTTP clients now trust the operating-system certificate store in
addition to the bundled Mozilla roots, so it works behind a TLS-inspecting
corporate proxy (Zscaler, Netskope, ...) whose CA is installed in the OS
store, instead of failing with invalid peer certificate: UnknownIssuer.
This covers the engine + model downloads and the REST / cloud-API / warehouse
connectors plus the update and CI checks. The trust set is a strict superset
of the previous one, so machines without a corporate CA are unaffected.
- New DUCKLE_CA_CERT env var: point it at a PEM bundle to trust an extra CA
  explicitly (split-tunnel setups, or a CA handed out as a file).
- DuckDB's own extension downloads (extensions.duckdb.org) and cloud reads
  (S3 / GCS / Azure) run inside the DuckDB engine with its own TLS, so also
  allow / exempt extensions.duckdb.org from inspection for those.
- The Duckie AI assistant remains optional - only it needs huggingface.co.
Thanks to @DarekDan
Fixes (v0.2.0, refreshed 2026-06-06)
- CSV / TSV reject port (#15): a delimited source with a declared typed
column now routes rows that fail to parse (e.g. an invalid date) to its
reject output as raw text, so they can be written straight to a separate
CSV for review. Valid rows flow on, typed, instead of the whole read
aborting on a single bad value. Wire the source's reject output to a sink's
main input. Pipelines that do not wire the reject port are unchanged.
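The reject-port behavior can be illustrated with a short Python sketch that splits a CSV into typed rows and raw rejected lines. It is a simplification that ignores quoted fields containing newlines, and `read_with_rejects` is a hypothetical helper, not the component's implementation.

```python
import csv
from datetime import date


def read_with_rejects(text, parsers):
    """Split CSV text into typed rows and raw rejected lines.

    `parsers` maps a column name to a callable that raises ValueError on bad input.
    Simplification: each physical line is treated as one record.
    """
    lines = text.splitlines()
    header = next(csv.reader([lines[0]]))
    good, rejects = [], []
    for raw in lines[1:]:
        values = next(csv.reader([raw]))
        row = dict(zip(header, values))
        try:
            for name, parse in parsers.items():
                row[name] = parse(row[name])
        except (ValueError, KeyError):
            rejects.append(raw)  # keep the original text for review
        else:
            good.append(row)
    return good, rejects
```

Calling it with `{"id": int, "when": date.fromisoformat}` keeps rows with valid dates as typed records and returns a line such as `2,not-a-date` untouched in the reject list, ready to be written to a separate file.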
Fixes (v0.2.0, refreshed 2026-06-07)
- SFTP support (#16): the File Transfer source's Protocol dropdown now does
  real SFTP (SSH) via russh + russh-sftp, alongside FTP / FTPS. Password or
  OpenSSH private-key auth, plus an optional host-key fingerprint (SHA256) pin.
  Pure-Rust, no extra system dependencies.
- Parquet / CSV partition guard: a partitioned file sink now fails fast with a
  clear message if "Partition by columns" would create more than "Max
  partitions" files (default 10,000; 0 = unlimited), instead of silently
  writing tens of thousands of tiny files. Stops a high-cardinality partition
  key (e.g. country pairs) from turning a write into a multi-minute file storm.
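The partition guard amounts to counting distinct partition keys before writing. A minimal Python sketch, assuming rows are dicts and using a hypothetical `check_partition_count` helper:

```python
def check_partition_count(rows, partition_by, max_partitions=10_000):
    """Fail before writing if the partition columns would produce too many files."""
    keys = {tuple(row[c] for c in partition_by) for row in rows}
    # 0 means unlimited, matching the documented setting.
    if max_partitions and len(keys) > max_partitions:
        raise ValueError(
            f"{len(keys)} partitions exceeds Max partitions ({max_partitions})"
        )
    return len(keys)
```

Checking the count up front costs one pass over the keys, which is far cheaper than discovering the problem after thousands of small files have been created.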
Fixes (v0.2.0, refreshed 2026-06-08)
- Schema autodetect (#18): "Autodetect from source" returned a generic
  col_1 / col_2 / col_3 placeholder for Excel, DuckLake, Avro, Iceberg, Delta,
  Spatial and Fixed-Width sources, even though running the node read the real
  schema. Autodetect now builds the exact same query as a run, so it reports
  the real columns for every file and embedded source.
- Excel multi-file reads (#18): an Excel source pointed at a folder or a
  wildcard (e.g. data/*.xlsx) now reads every matching workbook instead of
  silently loading only the first one. The file picker also filters on
  .xlsx / .xls instead of .excel.
- Embedded + DuckLake upsert (#19): the SQLite and DuckDB sinks now expose the
  Upsert write mode (set-based delete-by-key + re-insert) that the engine
  already supported, and the DuckLake sink gains the same Upsert mode with
  conflict columns.
Build + deploy: standalone pipelines (v0.2.0, refreshed 2026-06-08)
- Build Pipeline: right-click a pipeline (in the project tree or on the
  canvas) and pick "Build pipeline" to produce ONE self-contained executable
  named after the pipeline - the equivalent of a Talend "Build Job". The file
  embeds the resolved pipeline, its contexts and routines, DuckDB, and only
  the DuckDB extensions that pipeline's components actually need, plus its
  secrets. There is no folder, no run script, and nothing extra to download.
- Run it anywhere: on the server it self-extracts to a temp cache, uses its
  own embedded DuckDB and extensions (so the host needs no DuckDB install),
  runs the pipeline, and exits with the pipeline's status code. Schedule it
  with whatever the OS already has - cron, systemd timers, or Windows Task
  Scheduler. See docs/current/scheduler.md.
- Secrets, your choice per build: Environment mode replaces every secret with
  a ${ENV:KEY} placeholder and ships a secrets.env.example, so nothing
  sensitive is written into the artifact; the runner resolves real env vars
  first, then a secrets.env beside the exe. Passphrase mode encrypts secrets
  with AES-256-GCM, unlocked at run time with DUCKLE_BUNDLE_PASSPHRASE.
- Lean by design: only the needed extensions are bundled, so a CSV-to-CSV
  pipeline builds to about 28 MB instead of hundreds of MB.
- File Transfer sink: a new sink uploads a pipeline's output over FTP, FTPS or
  SFTP (SSH), the write-side counterpart to the existing File Transfer source.
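The Environment-mode lookup order described above (real env vars first, then a secrets.env beside the executable) can be sketched in Python. The KEY=VALUE file syntax and the error on a missing key are assumptions; the notes give only the placeholder form and the precedence.

```python
import os
import re
from pathlib import Path


def load_env_file(path):
    """Read KEY=VALUE lines (assumed syntax); blank lines and # comments are skipped."""
    values = {}
    path = Path(path)
    if path.exists():
        for line in path.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    return values


def resolve_secrets(text, env_file, env=os.environ):
    """Replace ${ENV:KEY} with the real env var, else the secrets.env value."""
    fallback = load_env_file(env_file)

    def lookup(match):
        key = match.group(1)
        if key in env:
            return env[key]
        if key in fallback:
            return fallback[key]
        # Assumption: an unresolved secret is an error, not an empty string.
        raise KeyError(f"secret {key} is not set")

    return re.sub(r"\$\{ENV:([A-Za-z_][A-Za-z0-9_]*)\}", lookup, text)
```

Letting the process environment win over the file means a scheduler or container can inject secrets without touching anything shipped next to the artifact.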
Connect to Claude / any MCP client (v0.2.0, refreshed 2026-06-09)
- Connect to Claude: a new button in the designer top bar opens a popup that
  wires Duckle into Claude Code, Claude Desktop, Cursor, or any MCP client in
  one click. Duckle now ships a Model Context Protocol (MCP) server, so an AI
  assistant can browse the component catalog, generate a pipeline straight into
  your working directory, validate it, run it, and build a standalone artifact
  - all in your workspace, on your machine.
- One click: "Connect to Claude Code" runs the registration for you;
  "Add to Claude Desktop" / "Add to Cursor" write the server into that client's
  config (both the Microsoft Store / MSIX and standalone Claude Desktop layouts
  are handled); or copy the command / config for any other client.
- Bundled, no extra download: the MCP server ships inside the app and reuses the
  DuckDB engine, so there is nothing else to install. Headless usage + the full
  tool reference are in docs/current/mcp.md. @SouravRoy-ETL
Fixes (v0.2.0, refreshed 2026-06-09)
- Source schema preserved on run (#18 follow-up): running a pipeline no longer
  overwrites a source's autodetected / declared schema. Re-running keeps the
  schema you set (the engine uses it, e.g. CSV column types) and only refreshes
  the preview rows. Covers CSV, Excel, DuckDB and DuckLake sources.
- Connector username in ATTACH (#20): PostgreSQL / MySQL / CockroachDB /
  Redshift / pgvector connections now map the username field to the DB user in
  the generated connection string, so a connection that sets a username
  authenticates correctly. Thanks to @kyounghoonJang (#21).
Reroll - 2026-06-09 (logo, smaller downloads, Snowflake + SQLite/DuckDB fixes)
- New brand: a geometric "D." mark replaces the old logo across the app icon /
  taskbar, the README hero, and the in-app top bar (theme-aware - brand yellow
  on slate in dark mode, orange on a pale disc in light mode).
- Snowflake key-pair auth (#22): fixed 390144 "JWT token is invalid" on
  regional / PrivateLink accounts. The JWT now uses the account locator only in
  its iss / sub claims (the full account is still used for the REST URL).
- SQLite / DuckDB sinks (#19): the Write mode dropdown now offers Append,
  Truncate, and Upsert (delete-by-key + re-insert, with optional delete-flag
  propagation), not just Create or replace.
- MCP popup: the "Connect to Claude" action buttons now use Claude's orange to
  match its color scheme.
Reroll - 2026-06-10 (Snowflake, DuckDB/SQLite, Excel fixes)
- Snowflake source (#24): result sets that split into multiple partitions
(roughly n>300 wide rows) no longer fail with "response not JSON" - the
gzip-compressed partition bodies are now decoded, and result columns are
typed from the result metadata (real timestamps / dates / numbers instead
of VARCHAR).
- SQLite / DuckDB sinks (#19): selecting Upsert without conflict columns now
errors clearly ("upsert needs at least one conflict column") instead of
silently falling back to DROP TABLE + CREATE.
- Excel source (#25): the Schema panel is now respected - retyped and removed
columns are applied on read instead of being ignored.
Reroll - 2026-06-10 (canvas quick-add + Iterate / For Each fix)
- Quick-add on the canvas: start typing on the designer to fuzzy-search every
component (sources, transforms, sinks, connectors, control, quality, code)
and drop the match where your cursor is - Enter to add, Esc to close.
- Iterate / For Each (#26): these run a child pipeline, but the panel never let
you pick one, so a run failed with "pipelineRef required". You can now select
the pipeline to run (plus an iteration count for Iterate).
Duckle v0.1.0 (Hotfix v2)
Builds on v0.1.0-hotfix. This release fixes the desktop app's export / clipboard features (broken on every platform), wires up the Runs and Plans tabs, and folds in a large engine performance + correctness sweep that came out of real user reports.
Updated 2026-05-28 - this build was re-rolled under the same tag with: a fix for database sources (Oracle / SQL Server / Mongo / ...) hanging on wide tables so the sink never wrote a file (#4); a fix for "0 rows written" being reported on large pipelines (per-node row counts now populate correctly); paginated sources now fail loudly instead of silently truncating at the maxPages cap; plus desktop UI fixes (stale column pickers that could not be deselected, Plan-tab error surfacing, a column-validator false-positive). If you downloaded an earlier copy of this tag, please re-download the binary below.
Updated 2026-05-29 - re-rolled again under the same tag (the tag itself is unchanged) with three more fixes: a large Filter / Validator performance fix - a node whose reject port is not wired no longer materializes the discarded rows, so a 10M-row filter that keeps 2M now runs in under a second instead of ~16s (it used to write the other 8M rejected rows to disk for nothing); database sinks (SQL Server / Oracle) no longer error with "output path is required" and now auto-create the target table when it does not exist yet (#8, verified on live SQL Server 2022 + Oracle 21, for both new and existing tables); and src.oracle no longer loses precision on high-precision NUMBER columns (values past ~15 significant digits were silently rounded through a double). A follow-up extends the Filter / Validator performance fix to the case where the reject (or error) port IS wired: each output now becomes a lazy view when it has a single consumer, so a pipeline that routes the rejected rows to a sink no longer pays an intermediate 8M-row table write either (~17s to ~1.6s on that shape). Re-download the binary below for these.
Updated 2026-05-30 - re-rolled again under the same tag (the tag is unchanged) with a batch of engine speed + correctness fixes from a full component audit: control-flow nodes (wait / barrier / checkpoint / iterate / try / trigger / foreach) and conditional-split branches now compile to lazy views instead of copying the whole dataset, so a pass-through over 10M rows takes milliseconds instead of ~12s; `src.csv` / `src.tsv` schema declarations that cover only some of a file's columns now work instead of failing with a cryptic error; a Diff Detect with no compare columns now fails clearly instead of silently dropping every changed row; SCD1 and set operations (intersect / except) align columns by name so a different column order no longer corrupts values; Snowflake / Databricks reads use far less memory on large result sets; the Format / numeric / sort / add-column transforms reject malformed input cleanly instead of corrupting output or aborting the run; and de-duplicate (uniqueness), skip, and distinct gained an optional ordering so the rows they keep are reproducible run-to-run. Cloud (S3 / GCS / Azure) source + sink option parity and external-sink streaming are still in progress. Re-download the binary below for these.
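The src.oracle precision loss mentioned in the 2026-05-29 update is easy to reproduce outside Duckle. The sketch below uses Python's `decimal` module and is only an illustration: the sample value and the `survives_double` helper are hypothetical, and the engine's real check may differ.

```python
from decimal import Decimal

# A value that fits an Oracle NUMBER(38,12) but has far more than ~15 significant digits.
exact = Decimal("12345678901234567890.123456789012")

# Reading it through a 64-bit double keeps only the leading digits.
via_double = float(exact)


def survives_double(value: Decimal) -> bool:
    """True if the value comes back unchanged after a trip through a double.

    Uses the shortest round-trip repr of the float, so values such as 0.1
    count as surviving even though their binary form is inexact.
    """
    return Decimal(repr(float(value))) == value
```

A reader that keeps the exact decimal only when `survives_double` is false would leave ordinary values such as `Decimal("1.5")` on the fast float path.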
Desktop app fixes
- Export & clipboard work again (all platforms). Copy SQL, Export SQL, Export JSON, and copy-node-id were no-ops in the packaged app because the webview can't use the browser clipboard or `<a download>`. They now route through the Tauri clipboard plugin and a native save dialog.
- Runs and Plans tabs show real data (#6). They were placeholder panels. The Run tab now lists per-node status, row counts, timings, and errors from the last run; the Plan tab live-compiles the pipeline and shows the DuckDB SQL for each step (with a Copy button and a clear compile-error panel). Run history was persisted correctly all along - it just was not rendered. (A dedicated top-level Schedules tab is still a follow-up; schedules remain reachable via the pipeline right-click menu.)
- Column pickers keep stale selections visible. In the aggregation, filter, and column fields, a selected column that was no longer present in the upstream schema used to disappear from the list, which made it impossible to see or remove. Stale picks now render as "(not in input)" so you can deselect and fix them.
- Plan tab surfaces compile errors. A pipeline that fails to compile now shows the actual error in the Plan tab instead of a blank panel, and the Run / Plan panels no longer render transparently over the canvas.
- Accurate row counts on large runs. On bigger tables the run summary could report "0 rows written" despite a successful run that wrote the file correctly. The batched executor was reading a per-stage count file in the brief window before DuckDB finished writing it; it now waits for the complete value. Per-node counts are listed individually, and the summary's "rows written" counts sink rows only (it previously summed source + transforms + sink).
- Compiled export includes procedural steps (#7). Driver sources/sinks (Oracle, REST, Kafka, ...) and `ctl.*` control steps now appear in the exported SQL as descriptive comments instead of empty blocks.
- External links open in the system browser - CI build links and the GitHub/GitLab token pages used APIs the webview ignored; they now use the Tauri opener.
- Clearer column errors - referencing a column that does not exist upstream (e.g. an `xf.distinct` on a missing column) now fails at compile time with a "column not found in upstream" message across 20+ transforms, instead of a cryptic runtime DuckDB binder error.
Engine performance
- Filter / Validator no longer materializes discarded rows. A Filter (or any `qa.*` validator) split its input into a pass set and a reject set and always wrote the reject set to a temp table, even when nothing downstream consumed the reject port. On a 10M-row source kept down to 2M that wrote the other 8M rows to disk for nothing - the filter step alone took ~16s. The reject set is now materialized only when the reject port is actually wired; otherwise the step is a lazy view (10M -> 2M filter: ~16s -> under 1s, matching raw DuckDB).
- Batched single-CLI-spawn execution. All-SQL pipelines run as one `duckdb.exe` invocation instead of one spawn per stage. Measured 4.5x faster on fixed overhead on Windows (5-stage pipeline: 245 ms -> 56 ms). Driver-backed stages and `ctl.*` hooks fall back to the per-stage path automatically.
- DuckDB PRAGMA preset (`preserve_insertion_order=false`, `enable_object_cache=true`, `enable_progress_bar=false`) on every run.
- Lazy materialization: single-consumer pure-SQL steps compile to `CREATE VIEW` instead of `TABLE`, so DuckDB inlines them and gets predicate/projection pushdown. New `DUCKLE_FORCE_VIEWS=1` forces views even for multi-consumer nodes (#5).
- Streaming NDJSON source loads: a 1 M-row x 37-col Oracle/SQL Server pull now stays at O(64 KiB) resident set instead of peaking at ~30 GB.
- Workspace memory knobs: `DUCKLE_MEMORY_LIMIT`, `DUCKLE_THREADS`, `DUCKLE_TEMP_DIR`.
Engine correctness
- #8 - loading into SQL Server / Oracle. The desktop wrongly required an output path for database / warehouse / broker sinks, and the driver sinks only ran INSERTs so a brand-new target table failed with "Invalid object name" / ORA-00942. Database sinks no longer require a path, and SQL Server / Oracle sinks now auto-create the target table (inferring column types from the upstream) when it does not exist. Verified on live SQL Server 2022 and Oracle 21 for both new and existing tables.
- src.oracle high-precision NUMBER: a scaled NUMBER was read through a 64-bit double, which only round-trips ~15 significant digits, so e.g. a NUMBER(38,12) silently lost its last digits. The exact value is now preserved when it would not survive a double; BINARY_DOUBLE / BINARY_FLOAT (true IEEE floats) are unchanged.
- #4 - database source on a wide table hung, sink wrote nothing: the per-stage executor previewed each node with `SELECT * ... LIMIT 100`, but read the CLI's output only after it exited. A wide table's preview (e.g. a 36-column date dimension, ~128 KiB of JSON) overflowed the OS pipe buffer, so the CLI blocked writing while the engine blocked waiting - hanging the run on the source node's preview before the sink ever ran. The runner now drains output concurrently, so any width/size completes. This was width-specific, not Oracle-specific (a wide SQL Server table hit it too).
- #4 - src.oracle wide-table data loss + slow fetch (earlier): type-aware value dispatch (no more silently-NULLed DATE/TIMESTAMP/BLOB) + `prefetch_rows(1000)`, plus session NLS normalization and a per-run liveness trace at `%TEMP%/duckle-oracle-trace.log`.
- Paginated sources fail loudly on truncation: REST (cursor/page/offset/link), Qdrant, Weaviate, Milvus, DynamoDB, and Elasticsearch now return a clear error if they hit the `maxPages` safety cap with more data still upstream, instead of silently materializing a partial result.
- src.sqlserver data loss on DATE/DATETIME/DECIMAL/BINARY/GUID - same correctness pattern.
- xf.cast honors its on-error setting (TRY_CAST vs CAST), and xf.addcol casts the new column to its declared type.
- Column validator no longer falsely rejects references to columns produced upstream by window / rank / row-number / other column-adding transforms.
- Joins: composite keys, ambiguous-column dedupe (USING / EXCLUDE), NULL-safe anti-join (NOT EXISTS).
- UNION / INTERSECT now BY NAME to dodge silent positional-column corruption.
- Window functions error at planner time when ORDER BY is missing.
- xf.transpose / xf.pivot no longer break under lazy materialization (dynamic PIVOT can't live in a view; forced to TABLE).
- arr.contains NULL-safe; xf.cast guards against empty/duplicate cast entries.
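The wide-table hang in #4 is the classic full-pipe deadlock: the parent waits for the child to exit while the child waits for the parent to read. A minimal Python sketch of the "drain concurrently" fix follows. The child process here is a hypothetical stand-in for the DuckDB CLI and `run_and_drain` is an illustrative name, not the engine's runner.

```python
import subprocess
import sys

# Hypothetical stand-in for the CLI preview: a child that writes 256 KiB to stdout,
# more than a typical OS pipe buffer holds.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * (256 * 1024))"]


def run_and_drain(cmd):
    """Read the child's stdout while it runs, so a full pipe never blocks it."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # communicate() reads stdout until EOF and only then waits for the exit code.
    out, _ = proc.communicate()
    return proc.returncode, out


# Anti-pattern, shown only as a comment: calling proc.wait() first and
# proc.stdout.read() afterwards can hang once the output outgrows the pipe buffer.
returncode, output = run_and_drain(CHILD)
```

The same shape applies in any language: either read the pipe on a separate thread or use an API that interleaves reading with waiting.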
Features carrie...
Duckle v0.1.0 (Hotfix)
Hotfix for v0.1.0. Same feature set as v0.1.0 (60 UI languages, DuckDB Quack, xf.fill_backward, full UI i18n coverage), plus a sweep of engine fixes and performance work that came out of real user reports.
Performance
- Batched single-CLI-spawn execution. Multi-stage SQL pipelines now run as one `duckdb.exe` invocation instead of one spawn per stage. Per-stage progress events still arrive in real time via NDJSON marker files each stage writes. Measured 4.5x speedup on fixed overhead on Windows (5-stage pipeline: 245 ms → 56 ms). Driver-backed stages (Oracle, SQL Server, Kafka, Mongo, REST, AI components) and `ctl.*` hooks transparently drop to the per-stage path.
- DuckDB PRAGMA preset on every run: `preserve_insertion_order=false` + `enable_object_cache=true` + `enable_progress_bar=false`. Halves wall time on Parquet writes and avoids re-reading file metadata between stages that hit the same source.
- Lazy materialization: single-consumer pure-SQL transforms now build `CREATE OR REPLACE VIEW` instead of `TABLE`. DuckDB inlines them into the downstream query and gets predicate / projection pushdown into the source read. 2-5x speedup on linear pipelines.
- Streaming NDJSON writer in Oracle / SQL Server sources. A 1 M-row x 37-col Oracle pull used to peak at ~30 GB resident set; it now stays at O(64 KiB) regardless of row count.
- Workspace memory knobs: `DUCKLE_MEMORY_LIMIT`, `DUCKLE_THREADS`, `DUCKLE_TEMP_DIR` set workspace-wide caps without touching every stage.
Correctness
- #4 - src.oracle wide-table data loss + slow fetch. Two root causes: a try-String-then-i64-then-f64 cascade silently NULLed DATE / TIMESTAMP / BLOB columns, and Oracle's default `prefetch_rows = 1` meant 10 k rows = 10 k network round trips. Fix: type-aware dispatcher + `prefetch_rows(1000)`.
- src.sqlserver data loss on DATE / DATETIME / DECIMAL / BINARY / GUID. Same correctness pattern as #4, applied to tiberius.
- Joins: composite keys, ambiguous-column dedupe (`USING` / `EXCLUDE` instead of `JOIN ... ON` that produces both keys), NULL-safe anti-join (`NOT EXISTS` instead of `NOT IN` which silently drops on NULLs).
- UNION / INTERSECT: now `BY NAME` to dodge silent positional-column corruption when two upstreams have the same columns in different orders.
- xf.assert fires reliably on Windows release builds (CTE materialization pattern; the optimizer's aggressive pruning previously removed the error() branch).
- Window functions error at planner time when ORDER BY is missing for `rank` / `lead` / `lag` / etc., instead of producing nondeterministic output.
- Column-existence validation: stages that reference a nonexistent upstream column get a clear "did you mean ..." error at compile time instead of a runtime "column not found" deep in the SQL plan.
- xf.transpose / xf.pivot: dynamic PIVOT cannot live in a view in DuckDB 1.5; planner now forces TABLE materialization for those two components regardless of consumer count (regression introduced by lazy materialization).
- Cast guards: `xf.cast` errors at planner time on empty cast entries or duplicate target columns.
- arr.contains NULL-safe (`COALESCE(list_contains(...), FALSE)`).
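The NULL-safe anti-join item above is worth seeing once. The sketch below uses SQLite from the Python standard library as a stand-in for DuckDB (an assumption made so the example runs without extra installs; both follow SQL three-valued logic here). Table and column names are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE left_t (k INTEGER);
    CREATE TABLE right_t (k INTEGER);
    INSERT INTO left_t VALUES (1), (2), (3);
    INSERT INTO right_t VALUES (1), (NULL);
""")

# NOT IN: a single NULL on the right makes every comparison unknown,
# so no left row passes the filter.
not_in = con.execute(
    "SELECT k FROM left_t WHERE k NOT IN (SELECT k FROM right_t) ORDER BY k"
).fetchall()

# NOT EXISTS: the NULL simply never matches, so unmatched rows are kept.
not_exists = con.execute(
    "SELECT k FROM left_t l WHERE NOT EXISTS "
    "(SELECT 1 FROM right_t r WHERE r.k = l.k) ORDER BY k"
).fetchall()
```

With this data `not_in` is empty while `not_exists` returns keys 2 and 3, which is the silent row loss the fix avoids.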
Features
- xf.row_hash + xf.audit: CDC / provenance primitives. row_hash computes MD5 / SHA-256 over a chosen subset of columns; audit appends inserted_at / updated_at / source columns.
- xf.fill_constant: fill nulls with a literal (companion to fill_forward and fill_backward).
- src.csv dateFormat / timestampFormat props: override DuckDB's auto-detection per column.
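As a rough picture of what a row hash over a chosen subset of columns means, here is a small Python sketch. The separator, the NULL marker, and the `row_hash` signature are assumptions for illustration; the serialization xf.row_hash actually uses is not described in these notes.

```python
import hashlib

NULL_MARKER = "\x00NULL"   # assumed sentinel so NULL differs from the string "NULL"
FIELD_SEPARATOR = "\x1f"   # assumed separator so ("ab", "c") differs from ("a", "bc")


def row_hash(row: dict, columns: list, algorithm: str = "sha256") -> str:
    """Hash only the chosen columns, in the given order."""
    parts = [
        NULL_MARKER if row[col] is None else str(row[col])
        for col in columns
    ]
    payload = FIELD_SEPARATOR.join(parts).encode("utf-8")
    return hashlib.new(algorithm, payload).hexdigest()


before = {"id": 7, "status": "paid", "note": "first load"}
after = {"id": 7, "status": "paid", "note": "edited comment"}
changed = {"id": 7, "status": "refunded", "note": "first load"}

# Columns outside the chosen subset do not affect the hash.
same = row_hash(before, ["id", "status"]) == row_hash(after, ["id", "status"])
differs = row_hash(before, ["id", "status"]) != row_hash(changed, ["id", "status"])
```

Comparing such hashes between runs is one common way to detect changed rows for CDC without comparing every column.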
Fixes from v0.1.0 (carried forward)
- #2 - DUCKLE_DUCKDB_BIN not set on REST-shaped sources (Oracle, SQL Server, Snowflake, Databricks, Synapse, BigQuery, SaaS REST aliases). Fix: env::set_var in the Tauri setup hook so both the in-process OnceLock and the helper's env-var lookup see the path.
- #3 - CSV declared schema ignored at execution time. Setting a column to VARCHAR in the Schema panel now actually emits `columns = {...}` in the generated `read_csv_auto(...)`.
Binaries (6 platforms)
| Platform | File |
|---|---|
| Windows x64 | Duckle-windows-x64.exe |
| Windows arm64 | Duckle-windows-arm64.exe |
| Linux x64 | Duckle-linux-x64 |
| Linux arm64 | Duckle-linux-arm64 |
| macOS arm64 (Apple Silicon) | Duckle-macos-arm64 |
| macOS x64 (Intel) | Duckle-macos-x64 |
Same single-binary, engines-download-on-first-launch shape as v0.1.0.
Upgrade
If you already have v0.1.0 or the previous v0.1.0-hotfix running, swap the binary in place. The workspace folder and the engine cache at ~/.duckle/engines/duckdb/ (or the OS equivalent) are untouched.
Duckle v0.1.0
Highlights
DuckDB Quack remote protocol - the May 2026 spec lands as src.quack and snk.quack (Cloud Warehouses group). HTTP on port 9494, SECRET-based token auth, autoloaded by DuckDB 1.5.3. Bring up the server with CALL quack_serve('quack:localhost', token => 'super_secret'); and point Duckle at it. Append / overwrite / truncate / upsert all work through the existing relational sink path.
60 UI languages (was 35). Adds Norwegian, Danish, Finnish, Catalan, Bulgarian, Slovak, Croatian, Serbian, Slovenian, Lithuanian, Latvian, Estonian, Khmer, Burmese, Sinhala, Nepali, Swahili, Afrikaans, Welsh, Irish, Icelandic, Albanian, Azerbaijani, Mongolian, Kazakh. RTL still works for Arabic, Hebrew, Persian, Urdu. Switch from the globe in the topbar.
Full UI i18n coverage. Previous release only translated the topbar, AI chat, and palette top-level categories. v0.1.0 covers the left sidebar tabs, Canvas / Plan / Run, the Run / Stop / Save / Validate / Auto-layout / More toolbar, palette subgroup labels (Files, Databases, APIs, Streaming, NoSQL, etc.), the Properties panel tabs and empty states, the bottom panel (Problems / Output / Console), the status bar, the engine selector, and the node card kind header. Hand-curated overrides correct machine-translation slips for the highest-traffic terms across ten major languages.
xf.fill_backward - pandas-style bfill / fill up. Sibling to the existing xf.fill_forward. For each NULL it takes the next non-null value within the ordered window. Closes #1.
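The backward-fill rule is simple enough to state in a few lines. This is a plain-Python sketch of the semantics over an already ordered list, not the SQL Duckle generates; `fill_backward` here is a hypothetical helper name.

```python
def fill_backward(values):
    """Replace each None with the next non-None value later in the list."""
    filled = list(values)
    next_seen = None
    # Walk from the end so the most recent non-null value is always at hand.
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_seen
        else:
            next_seen = filled[i]
    return filled


result = fill_backward([None, 1, None, None, 4, None])
```

Trailing NULLs stay NULL because there is no later value to take, mirroring pandas `bfill`.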
Binaries
| Platform | File |
|---|---|
| Windows x64 | Duckle-windows-x64.exe |
| Linux x64 | Duckle-linux-x64 |
| macOS arm64 | Duckle-macos-arm64 |
Single self-contained executable. Engines (DuckDB CLI 1.5.3, optionally llama.cpp + Qwen for Duckie AI) download to app-data on first launch.
Quack quickstart
-- Server (any DuckDB v1.5.3 instance)
> CREATE TABLE orders AS SELECT 1 AS id, 'paid' AS status;
> CALL quack_serve('quack:localhost', token => 'super_secret');
In Duckle: drag a DuckDB Quack source from the Cloud Warehouses group. Host: localhost, port: 9494, token: super_secret, schema: main, table: orders. Wire to any sink, hit Run.
Notes
- The bundled DuckDB CLI was already v1.5.3 spec-wise, but installs from earlier Duckle releases may still hold v1.1.3 on disk. Delete `%APPDATA%\io.duckle.app\engines\duckdb\duckdb.exe` (Windows) or the equivalent on macOS / Linux to trigger a re-download on next launch.
- All translation files are machine-translated via Google with hand-curated overrides for the ten major languages. Native speakers are welcome to PR cleaner translations - see `frontend/src/i18n/README.md`.
Duckle v0.0.12
Highlights
Duckie AI Assistant - a local chat panel that turns natural-language requests into runnable Duckle pipelines. Runs llama.cpp with Qwen2.5-Coder-1.5B on your CPU; no API key, no network round-trip, no data leaves the machine. Streams the response and exposes an "Insert into canvas" button that drops the generated nodes straight into the editor.
In-app Git integration (GitHub + GitLab) - a Git side-panel inside Duckle for the workspace folder. Status, stage-all, commit, push, pull, branch create / checkout, remote set. PAT-based auth that tries the system credential helper first and only prompts on 401. PATs are stored at <workspace>/.duckle/secrets/git.json and the .duckle/.gitignore is auto-written so secrets never reach the repo.
CI status badge in the topbar - polls the latest pipeline on the current branch every 30 s. Detects provider from the remote URL (GitHub Actions or GitLab CI) and colour-codes success / failure / in-progress. Click to open the build in a browser.
README overhaul - new Quick Links nav table at the top, expanded component reference, six paste-and-run pipeline recipes, FAQ, troubleshooting, and a worked-example "run your first pipeline" walkthrough.
GitLab CI mirror - .gitlab-ci.yml now mirrors .github/workflows/ci.yml and release.yml on the GitLab Linux runner tier (test / build / integration / release stages).
Binaries
| Platform | File | Size |
|---|---|---|
| Windows x64 | Duckle-windows-x64.exe | 27.9 MB |
| Linux x64 | Duckle-linux-x64 | 31.0 MB |
| macOS arm64 | Duckle-macos-arm64 | 24.0 MB |
Self-contained. Engines (DuckDB CLI, llama.cpp, Qwen GGUF) download on first launch to app-data.
Notes
- v0.0.11 was skipped.
- DuckDB CLI engine is required for execution; Duckle prompts to download it on first launch.
- AI Chat is opt-in - the llama.cpp runtime and Qwen model download only when you open the Duckie panel for the first time.
Duckle v0.0.10
Release: bump tauri.conf.json to 0.0.10
v0.0.10 ships:
- Duckie AI Assistant (local Qwen + llama.cpp + chat panel)
- 5 SaaS REST aliases (Slack, Discord, Telegram, Twilio + 4 PM tools)
- 6 AI transforms (embed, llm, classify, chunk, pii, dedupe)
- src.dynamodb + src.kinesis via direct HTTP + AWS SigV4 (no AWS SDK)
- src.git, src.odata, src.soap, src.ftp, src.clipboard, src.email, src.webhook, snk.email
- code.javascript (via boa), code.wasm (via wasmi), code.shell
- XML response support in src.rest
- production-ready README rewrite
Counts move 269 -> 292 available, 11 -> 5 preview, 21 -> 16 planned.
Duckle v0.0.9
Engine: src.soap + XML response support in src.rest
Adds RestResponseFormat::Xml to RestSourceSpec so any REST request can parse an XML body via the same element-path walker src.xml uses. The walker is extracted into a free fn (walk_xml_to_rows) shared between src.xml and the new REST/SOAP path; src.xml's own behavior is unchanged.
src.soap is a thin alias: matches!() in the planner adds defaults of method=POST, Content-Type text/xml; charset=utf-8, responseFormat=xml, plus an optional SOAPAction header from the soapAction prop. The user supplies the envelope as `body` and the row_path as the element walk into the response (e.g. Envelope/Body/GetUsersResponse/Users/User).
The XML walker now matches local-name when comparing row_path segments to element names, so `Envelope/Body/Foo` matches a `soap:Envelope/soap:Body/Foo` document without forcing the user to spell the namespace prefix every time. Exact-match (`soap:Envelope/...`) still works since both sides are local-stripped.
Counts move 268 -> 269 available, 24 -> 23 planned.
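The local-name matching described above can be sketched in Python with the standard-library XML parser. This is an illustration of the idea only: `local_name` and `walk_rows` are hypothetical names, the sample envelope is made up, and the real walker (walk_xml_to_rows) lives in the engine's Rust code.

```python
import xml.etree.ElementTree as ET

DOC = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body><GetUsersResponse><Users>
    <User><Name>Ada</Name></User>
    <User><Name>Lin</Name></User>
  </Users></GetUsersResponse></soap:Body>
</soap:Envelope>"""


def local_name(name: str) -> str:
    """Strip an ElementTree '{uri}' namespace or a 'prefix:' from a name."""
    return name.rsplit("}", 1)[-1].rsplit(":", 1)[-1]


def walk_rows(root, row_path: str):
    """Follow row_path from the root, comparing local names on both sides."""
    segments = [local_name(s) for s in row_path.split("/")]
    if local_name(root.tag) != segments[0]:
        return []
    current = [root]
    for segment in segments[1:]:
        current = [
            child
            for node in current
            for child in node
            if local_name(child.tag) == segment
        ]
    return current


root = ET.fromstring(DOC)
plain = walk_rows(root, "Envelope/Body/GetUsersResponse/Users/User")
prefixed = walk_rows(root, "soap:Envelope/soap:Body/GetUsersResponse/Users/User")
names = [user.findtext("Name") for user in plain]
```

Because both the path segments and the element names are stripped, the prefixed and unprefixed paths select the same rows.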