Tycoon v0.1.3
Released: 2026-04-28
Headline
Five of the seven planned v0.1.3 themes shipped; the two XL items
(tycoon data sync cloud↔local snapshots, and one-command
MotherDuck + Nao + LM Studio setup) are deferred to v0.1.4.
- Templates are now parameterized.
tycoon init --template X --param name=value substitutes runtime values into
tycoon.yml and source configs at scaffold time, so
github-analytics and weather-station are actually runnable
pipelines, not init-only demos.
- Observability v2 landed. Every ingest captures dlt's
trace.pickle (byte sizes + per-step durations + per-job
detail); every dbt invocation snapshots manifest.json and
diffs against the previous run (SQL-hash changes, added /
removed / retyped columns, model adds/removes). All of this surfaces in
tycoon data history show <id> and the existing Rill dashboards.
- csv-import is buildable end-to-end on fresh install: the
template now ships dbt_project/ + a sample CSV, and CI runs
the full init → ingest → transform pipeline on every PR via a
new offline_e2e marker.
- Warehouse alignment now covers Snowflake, BigQuery, and
Redshift, not just DuckDB and MotherDuck.
What changed
Added
See the detailed "Planned scope" sections below. Quick index:
- Template parameterization (--param name=value): §1
- csv-import template ships a buildable dbt project + offline e2e
through dbt run: §3
- dlt trace enrichment (byte sizes, per-step durations, per-job
detail): §4
- dbt manifest schema-diff (SQL hash + column changes across
invocations): §5
- Snowflake / BigQuery / Redshift dbt-profile warehouse alignment
in tycoon register dbt: §2
- Nao context surface for coding agents: auto-generated
AGENTS.md pointer + tycoon ask context cat-command;
see "Nao context surface" section below.
Changed
- tycoon data history show <load_id> now renders pipeline
duration, total bytes written, and a per-step breakdown when a
dlt trace is captured. The per-table view gains a Bytes column.
- tycoon data history show <invocation_id> (dbt) now appends a
"Schema changes vs. previous run" table when a manifest diff
recorded changes.
- _extract_dbt_duckdb_path is retained as a thin shim over the
new structured _extract_dbt_warehouse_target. Existing
callers are unchanged.
- Dependency bumps. dlt[duckdb] 1.25.0 → 1.26.0,
dagster 1.13.0 → 1.13.2 (and the three sibling packages
dagster-webserver, dagster-dbt 0.29.0 → 0.29.2,
dagster-dlt 0.29.0 → 0.29.2), nao-core 0.1.7 → 0.1.8,
typer 0.24.1 → 0.25.0, uvicorn 0.44.0 → 0.46.0,
pydantic 2.13.2 → 2.13.3, fastapi 0.136.0 → 0.136.1.
All 328 tests still pass after the bumps; nothing in tycoon's
call surface required code changes.
- Rill 0.86 docstring refresh. Rill 0.86 was released the
same day as this version. The rill_generator.py module
docstring now references 0.86, but the generator's output
(Parquet bridge via local_file connector) is unchanged.
Rill 0.86's "DuckLake live connector" was probed for v0.1.3
scope but deferred: SQLite-backed DuckLake catalogs hold an
exclusive OS-level lock while attached, breaking the
"Rill running while pipelines write" workflow that the
Parquet bridge supports today. Migration is on the v0.1.4
candidate list pending either Rill shared-lock support or a
Postgres-catalog story.
Fixed
- generate_rill_config no longer accepts an unused
warehouse_db_path param (carried-forward known issue from
v0.1.2). The function only ever introspected raw_db_path;
the warehouse path was a leftover from an earlier draft. All
10 call sites in tests/test_explore.py plus the single
production caller in src/tycoon/commands/explore.py are
updated. Tightens the public surface.
- Type diagnostics in src/tycoon/ingestion/runner.py cleared
(carried-forward known issue from v0.1.2). The
rest_api_source call now casts its config dict to dlt's
exported RESTAPIConfig typed-dict type; the env-var warning
loop pulls the matched ${VAR} string from
_check_unexpanded_env_vars directly instead of re-running the
regex with a suppressed # type: ignore[union-attr].
_check_unexpanded_env_vars now returns list[tuple[str, str]]
(key + matched-var pairs). It is an internal-only helper, so there is
no public-API impact.
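The helper's new return shape can be sketched as follows. This is a minimal illustration, not tycoon's implementation: the function name (without the leading underscore), the regex, and the flat-dict input are all assumptions made for the example. The point is that returning the matched string alongside the key lets the caller build its warning without re-running the regex on a possibly-None match.

```python
from __future__ import annotations

import re

# Assumed pattern for an unexpanded ${VAR} reference; the real regex may differ.
_ENV_VAR = re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}")


def check_unexpanded_env_vars(config: dict[str, object]) -> list[tuple[str, str]]:
    """Return (key, matched_var) pairs for string values that still contain ${VAR}."""
    found: list[tuple[str, str]] = []
    for key, value in config.items():
        if not isinstance(value, str):
            continue  # only string values can carry a placeholder
        match = _ENV_VAR.search(value)
        if match is not None:
            found.append((key, match.group(0)))
    return found


# The warning loop consumes the pairs directly; no second regex pass is needed.
for key, var in check_unexpanded_env_vars({"token": "${GH_TOKEN}", "url": "https://x"}):
    print(f"warning: {key} contains unexpanded {var}")
```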
Upgrade notes
    pip install -U database-tycoon

No breaking changes. Four things are worth knowing:
- New observability tables. .tycoon/metadata.duckdb gains
five new tables (dlt_trace_runs, dlt_trace_steps,
dlt_trace_jobs, dbt_manifest_snapshots,
dbt_schema_changes). Existing projects pick these up
automatically on the next ingest / dbt run; nothing to
migrate. Delete .tycoon/metadata.duckdb if you want to start
fresh; it's still fully disposable.
- Template {owner}/{repo} placeholders are now
{{ owner }}/{{ repo }}. Anyone with an in-tree fork of
github-analytics or weather-station should update their
placeholder syntax; dbt Jinja ({{ ref(...) }}) still passes
through untouched.
- Cloud-adapter alignment is new behavior. Running
tycoon register dbt against a Snowflake, BigQuery, or Redshift dbt
project will now prompt to update stack.warehouse. Decline
the prompt to keep the old behavior.
- tycoon ask init / ask sync now write an AGENTS.md
at the project root. First-time existing-project users who
re-run either command will see a new AGENTS.md appear,
pointing coding agents at .tycoon/nao/databases/**,
.tycoon/nao/repos/dbt/, and .tycoon/nao/RULES.md. Commit it:
it's a static pointer file that stays useful in fresh
clones. If you already have a hand-authored AGENTS.md,
tycoon detects the missing <!-- @generated by tycoon ask -->
sentinel and leaves your file alone with a warning.
Known limitations (carried forward from v0.1.2)
- tycoon data query is DuckDB/MotherDuck-only. Snowflake and
BigQuery warehouses can be registered and aligned, but the
query command doesn't dispatch to them yet. Coming with the
deferred tycoon data sync work (§7).
- External dlt pipelines can't be run via
tycoon data sources run. Only tycoon-managed dlt sources are. Out of v0.1.3 scope.
Known issues (carried from v0.1.2)
- All three carried-forward known issues were resolved during the
v0.1.3 cycle. See "Fixed" above.
What's next (v0.1.4 candidates)
Both deferred v0.1.3 themes carry forward.
- tycoon data sync: cloud ↔ local DuckDB snapshots (issue #12).
Snapshot a MotherDuck warehouse to a local DuckDB file (for
offline analysis, AI training, air-gapped demos) and upload the
other direction. Uses DuckDB's native ATTACH 'md:...' + COPY FROM;
tycoon handles auth + path resolution + naming.
- One-command MotherDuck + Nao + LM Studio setup (issue #7).
A single command that walks through MotherDuck auth (OAuth or
token), Nao init + schema sync, and LM Studio detection +
wiring. Today each of these is a separate 2–5 step process and
ordering matters.
Scope detail
The themes targeted for v0.1.3, with the five that landed marked
✅ and the two deferred items kept here for context.
1. Template parameterization ✅ landed
Templates can now declare parameters that get substituted at tycoon init time, turning placeholder-laden templates into concrete working
projects on the first run.
Format. Each template gains an optional template.yml metadata
file next to its tycoon.yml:
    parameters:
      - name: owner
        description: GitHub username or organization
        example: octocat
        required: true
      - name: repo
        description: Repository name
        example: hello-world
        required: true

CLI. A new repeatable --param name=value flag on tycoon init:

    tycoon init --template github-analytics \
      --param owner=acme --param repo=widgets

Missing required parameters are prompted for interactively. The
template.yml metadata itself never lands in the scaffolded project.
Substitution. {{ name }} placeholders (with or without
whitespace) in .yml, .yaml, .sql, .md, and .txt files under
the template directory get replaced with the resolved values. Unknown
placeholders are left intact, so dbt Jinja ({{ ref('x') }}) passes
through untouched.
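The substitution rule above (replace known placeholders, leave everything else intact) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not tycoon's code: the function names parse_params and substitute, the identifier-only placeholder regex, and the error message are hypothetical.

```python
from __future__ import annotations

import re

# Matches {{ name }} with or without inner whitespace. Only plain identifiers
# qualify, so Jinja calls such as {{ ref('x') }} never match.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def parse_params(pairs: list[str]) -> dict[str, str]:
    """Turn repeated --param name=value strings into a dict (hypothetical helper)."""
    params: dict[str, str] = {}
    for pair in pairs:
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {pair!r}")
        params[name.strip()] = value
    return params


def substitute(text: str, params: dict[str, str]) -> str:
    """Replace known placeholders; unknown ones are returned unchanged."""

    def _replace(match: re.Match[str]) -> str:
        # Fall back to the original matched text when the name is not a parameter.
        return params.get(match.group(1), match.group(0))

    return _PLACEHOLDER.sub(_replace, text)


params = parse_params(["owner=acme", "repo=widgets"])
print(substitute("repos/{{ owner }}/{{repo}}/issues", params))
print(substitute("select * from {{ ref('x') }}", params))
```

Falling back to the matched text, rather than raising on unknown names, is what lets template parameters and dbt Jinja coexist in the same .sql file.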
Templates updated.
- github-analytics: declares owner + repo. All four dlt
resources (issues, pulls, stargazers, contributors) now use
the substituted values.
- weather-station: declares station_id + office + gridX +
gridY, matching the NOAA API's path segments.
e2e upgrades. Both templates' @pytest.mark.e2e tests now run
actual ingestion (--max-records 5 caps) instead of stopping at
init, with xfail-on-upstream-flake semantics so rate-limiting or
API downtime doesn't hard-fail CI.
2. Snowflake / BigQuery warehouse alignment ✅ landed
v0.1.2 extended the v0.1.1 warehouse-alignment check from DuckDB-only
to MotherDuck. v0.1.3 extends it to Snowflake, BigQuery, Redshift,
and "anything else." The tradeoff called out in the original scoping
turned out to be the right framing: cloud profiles aren't
structurally comparable by a single "path" field, so the alignment
check reduces to adapter-type equality (plus account-match for
Snowflake).
New structured extractor. _extract_dbt_warehouse_target(dbt_dir)
returns a frozen DbtWarehouseTarget with four fields:
- adapter_type: raw dbt adapter name (duckdb, snowflake,
bigquery, redshift, or anything dbt reports).
- identifier: the single best locator: filesystem path for local
DuckDB, md:<name> for MotherDuck, account for Snowflake,
project for BigQuery, host for Redshift.
- display: human-friendly string for prompts (e.g.,
snowflake://acme-us-east-1/ANALYTICS).
- details: per-adapter extras (database / schema / dataset /
warehouse / role / method / location / keyfile-path), kept so
callers can render richer warnings without re-parsing profiles.
A .tycoon_warehouse_type helper property maps the raw adapter to
tycoon's WarehouseType enum, handling the MotherDuck-via-DuckDB
overlap (md:* paths resolve to motherduck, everything else with
adapter_type == "duckdb" resolves to duckdb). Unknown adapters
(e.g. databricks) return None.
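A rough sketch of the shape described above, assuming a plain frozen dataclass. The WarehouseType members shown here are a hypothetical subset (the real enum may use different names or values), and the property is written without the leading dot for readability.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class WarehouseType(str, Enum):
    # Hypothetical subset of tycoon's enum, for illustration only.
    duckdb = "duckdb"
    motherduck = "motherduck"
    snowflake = "snowflake"
    bigquery = "bigquery"
    redshift = "redshift"


@dataclass(frozen=True)
class DbtWarehouseTarget:
    adapter_type: str   # raw dbt adapter name
    identifier: str     # path, md:<name>, account, project, or host
    display: str        # human-friendly string for prompts
    details: dict[str, str] = field(default_factory=dict)

    @property
    def tycoon_warehouse_type(self) -> Optional[WarehouseType]:
        """Map the raw adapter to WarehouseType; None for unknown adapters."""
        if self.adapter_type == "duckdb":
            # MotherDuck is reached through the duckdb adapter via md:* paths.
            if self.identifier.startswith("md:"):
                return WarehouseType.motherduck
            return WarehouseType.duckdb
        try:
            return WarehouseType(self.adapter_type)
        except ValueError:
            return None


target = DbtWarehouseTarget("duckdb", "md:analytics", "md:analytics")
print(target.tycoon_warehouse_type)
```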
Register-time alignment. register_dbt now branches on adapter
type:
- DuckDB / MotherDuck: unchanged from v0.1.2 (path-normalized
comparison, updates both database.warehouse and stack.warehouse).
- Snowflake / BigQuery / Redshift / unknown: a new
_align_cloud_warehouse helper warns when stack.warehouse
disagrees with the dbt adapter type and offers to update it.
database.warehouse is explicitly left alone (it only makes sense
for DuckDB/MotherDuck). For Snowflake, any pre-existing
warehouse_connection.account in tycoon.yml is compared to the
dbt profile's account and a non-fatal mismatch warning is emitted.
- Unknown adapters produce an informational warning but no
stack.warehouse change (we don't want to push users into a
WarehouseType.other we can't yet reason about).
_extract_dbt_duckdb_path is kept as a thin backwards-compatible
shim over the new structured extractor, so existing init-wizard
callers that only care about the DuckDB/MotherDuck subset stay
untouched.
Tests. 11 new register tests, including 6 for the structured extractor
(covering duckdb/md/snowflake/bigquery/unknown/missing-profile) and 4 for
the CLI-level alignment (covering Snowflake/BigQuery-update,
already-aligned no-op, and account-mismatch warning). Full suite:
328 passed.
3. dbt build step in e2e tests ✅ landed
The csv-import e2e test now runs the full init → ingest → transform
pipeline on every PR (via the offline_e2e marker). New in v0.1.3:
- csv-import template ships a real dbt project. A new
dbt_project/ subdirectory under the template bundles
dbt_project.yml, profiles.yml, and a
models/staging/stg_widgets.sql + schema.yml pair. Scaffolded
verbatim by tycoon init --template csv-import.
- Sample data included. data/input/widgets.csv ships with 10
rows so tycoon data sources run files && tycoon data transform run
works with zero manual setup.
- Staging model stg_widgets casts / trims raw CSV rows into
typed columns (widget_id INTEGER, widget_name VARCHAR,
quantity INTEGER) with unique + not_null tests on the PK.
- Gotcha worth documenting: dlt's filesystem source + read_csv()
transformer produces a single unioned _read_csv table per schema
(not one table per CSV file). The staging model references that
dlt-internal table name; users adding more CSVs can filter or split
downstream.
- e2e assertions extended: row-count and column-type checks on
main.stg_widgets, plus a cross-check that the observability layer
captured the dbt run invocation and the per-node success.
4. dlt trace enrichment (observability v2a) ✅ landed
v0.1.2's dlt observability captured _dlt_loads + per-table row
counts. Trace-level detail (byte sizes, per-step durations, per-job
error messages) lives in ~/.dlt/pipelines/<name>/trace.pickle — dlt
ships these as pickled PipelineTrace objects, not JSON. v0.1.3 adds
a capture_dlt_trace helper that unpickles the trace, calls
.asdict(), and inserts it into three new tables in the metadata
DB:
- dlt_trace_runs: one row per pipeline run
(transaction_id, pipeline_name, started_at, finished_at, duration_s,
engine_version, success, exception).
- dlt_trace_steps: one row per (transaction_id, step)
covering extract / normalize / load / run with per-step duration
and step_exception.
- dlt_trace_jobs: one row per (transaction_id, job_id) for
load-step packages: table_name, file_format, state, file_size_bytes,
elapsed_s, failed_message.
The capture hook runs after every successful ingest and reuses the
existing pipeline.pipeline_name to locate trace.pickle. All
operations are best-effort — a missing or malformed trace never
propagates to the caller.
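The best-effort contract can be illustrated with a small loader that never raises. This is a sketch only: load_trace_dict is a hypothetical name, and the table inserts are omitted. It relies solely on what is described above (the trace is a pickle whose object exposes .asdict()).

```python
from __future__ import annotations

import pickle
from pathlib import Path
from typing import Any, Optional


def load_trace_dict(trace_path: Path) -> Optional[dict[str, Any]]:
    """Return the pickled trace as a dict, or None if it is missing or malformed."""
    try:
        with trace_path.open("rb") as fh:
            trace = pickle.load(fh)
        return trace.asdict()
    except Exception:
        # Best-effort: a missing file, a corrupt pickle, or an object without
        # .asdict() must never propagate to the ingest caller.
        return None


# A missing trace simply yields None and the ingest carries on.
print(load_trace_dict(Path("does-not-exist") / "trace.pickle"))
```

Because unpickling executes code embedded in the file, a loader like this should only ever read traces written by the local dlt installation.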
Surfaces:
- tycoon data history show <load_id> now prints pipeline name,
total duration, total bytes written, and a Steps table (extract /
normalize / load durations + status). The per-table view gains a
Bytes column when trace data exists.
- Three new Parquet files under data/parquet/_tycoon/ keep Rill's
local_file connector in sync.
Tests. tests/test_observability_trace.py covers the dict-form
capture (insert + idempotency + missing-id + step-exception),
disk-form capture (missing file, pickled round-trip), and the
Parquet-export inclusion. tests/test_history.py verifies the
enriched drilldown renders Duration + Bytes when a trace is
captured.
5. dbt manifest.json schema-diff (observability v2b) ✅ landed
The v0.1.2 dbt_runs / dbt_nodes capture answers "what happened
during this invocation" but doesn't catch "this model's SQL
changed between runs" or "a column got renamed / dropped."
v0.1.3 fills the gap by snapshotting target/manifest.json after
every dbt invocation and diffing against the previous snapshot.
Two new tables:
- dbt_manifest_snapshots: one row per captured manifest
(invocation_id, generated_at, dbt_schema_version,
fingerprint_json). The fingerprint is a compact JSON blob of
{unique_id: {resource_type, checksum, columns: {name: type}}}
filtered to model / seed / snapshot resources, small enough to
keep the snapshot row cheap while preserving everything the diff
needs.
- dbt_schema_changes: one row per detected change, with
columns invocation_id / prev_invocation_id / change_type /
unique_id / column_name / old_value / new_value. Six
change types: model_added, model_removed, sql_changed,
column_added, column_removed, column_type_changed. First
snapshot records zero changes (nothing to diff against).
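A pure diff over two fingerprints of the shape described above might look like the following. It is a sketch rather than the shipped function: the name diff_fingerprints and the row tuple layout (change_type, unique_id, column_name, old_value, new_value) are assumptions chosen to mirror the dbt_schema_changes columns.

```python
from __future__ import annotations

from typing import Optional


def diff_fingerprints(prev: Optional[dict], curr: dict) -> list[tuple]:
    """Return (change_type, unique_id, column_name, old_value, new_value) rows."""
    if prev is None:
        return []  # first snapshot: nothing to diff against

    changes: list[tuple] = []
    for uid in sorted(curr.keys() - prev.keys()):
        changes.append(("model_added", uid, None, None, None))
    for uid in sorted(prev.keys() - curr.keys()):
        changes.append(("model_removed", uid, None, None, None))

    for uid in sorted(curr.keys() & prev.keys()):
        old, new = prev[uid], curr[uid]
        if old.get("checksum") != new.get("checksum"):
            changes.append(
                ("sql_changed", uid, None, old.get("checksum"), new.get("checksum"))
            )
        old_cols = old.get("columns") or {}
        new_cols = new.get("columns") or {}
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(("column_added", uid, col, None, new_cols[col]))
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(("column_removed", uid, col, old_cols[col], None))
        for col in sorted(new_cols.keys() & old_cols.keys()):
            if old_cols[col] != new_cols[col]:
                changes.append(
                    ("column_type_changed", uid, col, old_cols[col], new_cols[col])
                )
    return changes


prev = {"model.p.a": {"checksum": "h1", "columns": {"id": "INTEGER"}}}
curr = {"model.p.a": {"checksum": "h1", "columns": {"id": "BIGINT"}}}
print(diff_fingerprints(prev, curr))
```

Note that a diff on column names alone cannot distinguish a rename from a drop plus an add; in this sketch a renamed column shows up as one column_removed row and one column_added row.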
Capture hook runs best-effort after every tycoon data transform run/test/build — wired into _capture_dbt_and_refresh_safe next to
the existing capture_dbt_safe. Missing or malformed manifests
never propagate.
Surfaces:
- tycoon data history show <invocation_id> appends a "Schema
changes vs. previous run" table (Change · Node · Column · Old →
New) below the existing Nodes table when changes were captured.
- Both tables are part of the Parquet export
(data/parquet/_tycoon/dbt_manifest_snapshots.parquet + dbt_schema_changes.parquet),
ready for Rill dashboards.
Tests. tests/test_observability_manifest.py covers fingerprint
extraction (model/seed/snapshot filtering, missing checksum +
columns), the pure diff function (add / remove model, sql change,
column added / removed / type changed, first-capture no-op), and a
manifest.json round-trip with duplicate-invocation idempotency,
missing-file no-op, and safe-wrapper exception swallowing.
tests/test_history.py::test_show_dbt_surfaces_schema_changes
verifies the drilldown rendering.
Nao context surface for coding agents ✅ landed
A late-add to v0.1.3 (after the rest of the release notes were
"finalized") that shipped because it was small, additive, and
made the existing tycoon ask work meaningfully more useful for
agent-driven workflows.
What it does. tycoon ask init and tycoon ask sync now also
write an AGENTS.md at the project root pointing at the
Nao-synced context tree. Coding agents that auto-read AGENTS.md
(Claude Code, Cursor, Windsurf, etc.) get oriented to the
project's data context for free — they know to look at:
- .tycoon/nao/databases/type=<engine>/database=<name>/schema=<schema>/table=<table>/{columns,preview}.md
- .tycoon/nao/repos/dbt/models/ (synced dbt SQL + YAML)
- .tycoon/nao/RULES.md (project-specific agent rules)
Sentinel-based ownership. The generated AGENTS.md carries
an <!-- @generated by tycoon ask --> sentinel near the top.
On subsequent ask init / ask sync runs, tycoon refreshes the
file only if the sentinel is present in the first 500 chars.
If a user wrote their own AGENTS.md, tycoon prints a warning
and leaves it alone — no clobbering.
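The ownership check reduces to a substring test on the head of the file. The sketch below assumes the sentinel is written as the first line and uses a hypothetical write_agents_md helper; only the sentinel string and the 500-character window come from the description above.

```python
from __future__ import annotations

from pathlib import Path

SENTINEL = "<!-- @generated by tycoon ask -->"
SENTINEL_WINDOW = 500  # only the first 500 characters are inspected


def write_agents_md(path: Path, body: str) -> bool:
    """Write or refresh AGENTS.md; return False if a hand-authored file is kept."""
    if path.exists():
        head = path.read_text(encoding="utf-8")[:SENTINEL_WINDOW]
        if SENTINEL not in head:
            # No sentinel: the user owns this file, so warn and leave it alone.
            print(f"warning: {path} was not generated by tycoon; leaving it alone")
            return False
    path.write_text(f"{SENTINEL}\n{body}", encoding="utf-8")
    return True
```

Checking only the head of the file keeps the test cheap and means a sentinel quoted deep inside a hand-written document does not hand ownership to tycoon.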
tycoon ask context cat-command for piping context into any
agent harness without launching the chat UI:
    tycoon ask context                      # list every synced table
    tycoon ask context --table dim_users    # cat columns.md + preview.md
    tycoon ask context --schema mart        # all tables in schema mart
    tycoon ask context --rules-only         # cat RULES.md
    tycoon ask context --include-dbt        # appends synced dbt model SQL

Output is plain markdown on stdout so it composes cleanly:

    tycoon ask context --table dim_users | claude -p "explain this table"

Tests. tests/test_nao.py::TestAgentsMd covers the pure
generator + write logic (sentinel-detected overwrite, user-file
preservation, missing-file write). tests/test_ask_context.py
covers the CLI surface — listing mode, table/schema filters,
rules-only, missing-context errors, and filter-no-match
diagnostics.
6. One-command MotherDuck + Nao + LM Studio setup
Follow-through on issue #7. A
single command (tycoon register stack or similar) walks through:
(a) MotherDuck auth (OAuth or token), (b) Nao init + schema sync,
(c) LM Studio (or any local OpenAI-compatible endpoint) detection +
wiring. Today each of these is a separate 2–5 step process and
ordering matters.
7. tycoon data sync — cloud ↔ local DuckDB snapshots
Follow-through on issue #12. A
new subcommand that snapshots a MotherDuck warehouse to a local
DuckDB file (for offline analysis / AI training / air-gapped
demos), and conversely uploads a local DuckDB to MotherDuck. Uses
DuckDB's native ATTACH 'md:...' + COPY FROM under the hood;
tycoon's job is just the auth + path resolution + naming conventions.
Candidates carried forward from v0.1.2 "What's next"
These are the same themes enumerated above, referenced here for
continuity with the v0.1.2 release notes. If any slip past v0.1.3 they
should be moved to a v0.1.4 "What's next" section rather than left
undocumented.
Known issues to address
- Pre-existing ty type diagnostics in src/tycoon/ingestion/runner.py
lines 84 (dlt rest_api_source expecting RESTAPIConfig not dict)
and 260 (regex match.group on None), carried from v0.1.2.
- src/tycoon/scaffolding/rill_generator.py line 210: warehouse_db_path
public API param is accepted but unused inside generate_rill_config,
carried from v0.1.2. Either wire it in or deprecate the parameter.
Test + CI goals for v0.1.3
- Coverage floor ratchet: raise from 60% (v0.1.2 floor) to
62–63% if feasible. Each new capture helper + parameterization
path should ship with tests that push the overall number up.
- Full e2e gate on every PR: once dbt build runs in csv-import's
e2e, promote any future template that acquires a full offline
pipeline into the offline_e2e marker too.
- Dependabot: a follow-up to v0.1.2's dependency bumps. Configure
.github/dependabot.yml so Python and Actions deps get automated PRs
we can review at a glance instead of bundling them into release
cycles by hand.