Tycoon v0.1.2

Released: 2026-04-19

Six themes in this release: MotherDuck alignment,
tycoon register warehouse, network-gated e2e tests plus a new
PR-gated CI workflow that runs the default suite on every pull request
and on every push to main, a
Nao/ask cleanup pass, a new Tycoon observability layer that
captures dlt + dbt run history into a dedicated metadata DuckDB and
surfaces it as two auto-generated Rill dashboards, and a terminal-side
view of that history via tycoon data history plus a new Runs column
on tycoon data status. Together they finish the register family,
extend the warehouse-alignment story from v0.1.1, fix a cluster of
papercuts around tycoon ask that surfaced during the MotherDuck + Nao +
LM Studio dogfood walkthrough, and close the observability gap between
"I ran a pipeline" and "I can see exactly what it did, now and
historically".

Headline

Warehouse alignment now covers MotherDuck

The alignment check added in v0.1.1 prevented the "dbt writes to
data/warehouse.duckdb, tycoon data query reads from somewhere else"
divergence — but only for local DuckDB. v0.1.2 extends it to MotherDuck:
if your dbt project's profile targets md:theirs and tycoon.yml has a
local DuckDB warehouse (or the other way around), the wizard and
tycoon register dbt both prompt to adopt the dbt-side value.
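
To make the comparison concrete, here is a minimal sketch of the kind of check described above. It is not tycoon's actual implementation: the function names are hypothetical, and the only behavior taken from these notes is that an md:<name> target keeps its prefix and is compared against tycoon's warehouse value.

```python
import os

MD_PREFIX = "md:"  # MotherDuck targets look like md:<name>


def normalize_target(target: str) -> str:
    """Return a comparable form of a warehouse target (hypothetical helper)."""
    target = target.strip()
    if target.startswith(MD_PREFIX):
        # Keep MotherDuck targets verbatim; they are not filesystem paths.
        return target
    # Local DuckDB files: collapse "./" and duplicate separators.
    return os.path.normpath(target)


def needs_alignment(dbt_target: str, tycoon_target: str) -> bool:
    """True when the dbt profile and tycoon.yml point at different warehouses."""
    return normalize_target(dbt_target) != normalize_target(tycoon_target)


# dbt targets MotherDuck while tycoon.yml still has a local file: mismatch.
print(needs_alignment("md:theirs", "data/warehouse.duckdb"))
# Same local file written two ways: no prompt needed.
print(needs_alignment("./data/warehouse.duckdb", "data/warehouse.duckdb"))
```

When the check reports a mismatch, the wizard and tycoon register dbt would offer to adopt the dbt-side value, as described above.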

Snowflake and BigQuery are intentionally not covered yet. dbt profiles
for those warehouses are structurally different (no single "connection
string" to compare), and we'd rather ship MotherDuck cleanly than
half-cover the others. tycoon doctor still reports credential
readiness for Snowflake/BigQuery.

`tycoon register warehouse`

The register family grows one more:

tycoon register warehouse

The command opens an interactive prompt: "cloud or local?". Pick local
and you're asked for a DuckDB path. Pick cloud and you get a MotherDuck
name prompt plus instructions for grabbing a token from
app.motherduck.com/token if MOTHERDUCK_TOKEN isn't already set. The
command updates database.warehouse and stack.warehouse in tycoon.yml,
and prompts before overwriting an existing warehouse setting.

We considered a parallel tycoon register ingestion and shelved it —
for external ingestion (Airbyte, Fivetran) tycoon only records the
choice; the command would have been a one-line config flip, and the
wizard already covers that case. If it turns out to be useful, it's
two dozen lines in a follow-up.

Network-gated e2e template tests

A new @pytest.mark.e2e marker, deselected from the default pytest
run, powers per-template end-to-end tests. The default suite stays fast
and fully offline; pytest -m e2e opts into the network path.

One test per built-in template exists today; coverage depth varies by
template:

csv-import — full ingest + row-count assertion (no network).
nyc-transit — fetches nyc-dot with --max-records 50; flaky
upstream responses xfail rather than fail the build.
github-analytics — skipped without GITHUB_TOKEN; init-only
today (template URLs still hard-code {owner}/{repo} placeholders).
weather-station — init-only today (URL templates need
{station_id} / {office} fill-in; follow-up will add defaults).
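
For contributors adding a template test, the pattern above might look roughly like the sketch below. Everything here is illustrative: fetch_records is a stand-in for the real network call, fetch_or_xfail is a hypothetical helper, and the pyproject.toml snippet in the comment is an assumption about how the marker could be registered and deselected, not a copy of the project's config.

```python
import pytest

# Assumption: the marker is registered and deselected by default with
# something like this in pyproject.toml:
#   [tool.pytest.ini_options]
#   markers = ["e2e: network-gated end-to-end template tests"]
#   addopts = "-m 'not e2e'"


def fetch_or_xfail(fetch):
    """Run a network fetch; turn flaky upstream errors into an xfail."""
    try:
        return fetch()
    except (TimeoutError, ConnectionError) as exc:
        # pytest.xfail raises immediately, so the build is not failed.
        pytest.xfail(f"upstream unavailable: {exc}")


def fetch_records(max_records: int) -> list:
    """Placeholder for the real upstream call (hypothetical)."""
    return [{"id": i} for i in range(max_records)]


@pytest.mark.e2e
def test_nyc_transit_small_fetch():
    rows = fetch_or_xfail(lambda: fetch_records(50))
    assert len(rows) == 50
```

With a setup like this, a plain pytest run skips the test and pytest -m e2e selects it.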

A new .github/workflows/e2e.yml runs the suite on workflow_dispatch
only — no cron, no default-branch runs, no minutes burned on flaky
upstream timeouts. Someone clicks "Run workflow" when they want an
upstream compatibility check.

Tycoon observability: dlt + dbt run history, now a first-class citizen

tycoon data status has always shown the last dlt load time and
current row counts — useful, but silent about history. And dbt has
never had any tycoon-side observability at all: target/run_results.json
gets overwritten on every invocation, so unless you archived it
yourself you had no way to see "did my last build succeed? how long
did it take? which model is getting slower over time?"

v0.1.2 introduces a small observability layer that closes both gaps at
once.

A dedicated metadata DuckDB at .tycoon/metadata.duckdb. Four
tables, disposable by design — delete the file to reset history:

dlt_runs — one row per _dlt_loads entry across every source schema
dlt_rows_by_table — per-load row counts per table (via
count(*) GROUP BY _dlt_load_id)
dbt_runs — one row per dbt invocation (command, elapsed, success,
models_ok, models_error, tests_passed, tests_failed, dbt_version,
target, invocation_id)
dbt_nodes — one row per model/test/seed/snapshot per invocation
(status, execution_time, rows_affected, compile_time, message)

All writes are INSERT … ON CONFLICT DO NOTHING, so re-capturing the
same load or invocation is a no-op. You can also query the metadata DB
directly:

tycoon data query --db .tycoon/metadata.duckdb \
  "SELECT invocation_id, started_at, elapsed_s, success
   FROM dbt_runs ORDER BY started_at DESC LIMIT 10"
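
The idempotent-write behavior is easy to see in isolation. The sketch below swaps in Python's built-in sqlite3 for DuckDB so it runs with no extra dependencies (SQLite 3.24+ accepts the same ON CONFLICT ... DO NOTHING clause); the three-column table is a simplified, hypothetical stand-in for the real dbt_runs schema.

```python
import sqlite3


def capture_run(conn, invocation_id: str, command: str, success: bool) -> None:
    """Insert one run; a repeat capture of the same id is a no-op."""
    conn.execute(
        "INSERT INTO dbt_runs (invocation_id, command, success) "
        "VALUES (?, ?, ?) ON CONFLICT (invocation_id) DO NOTHING",
        (invocation_id, command, int(success)),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dbt_runs ("
    "invocation_id TEXT PRIMARY KEY, command TEXT, success INTEGER)"
)

# Capturing the same invocation twice leaves a single row behind.
capture_run(conn, "deadbeef", "build", True)
capture_run(conn, "deadbeef", "build", True)
row_count = conn.execute("SELECT count(*) FROM dbt_runs").fetchone()[0]
print(row_count)
```

The primary key on the run id is what makes the second insert a conflict rather than a duplicate.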

Capture hooks. Two thin best-effort calls, each wrapped in
try/except so observability can never break the operation it's
attached to:

The dlt ingestion runner mirrors new loads into the metadata DB at
the end of every successful tycoon data sources run.
tycoon data transform run/test/build parses
target/run_results.json immediately after each dbt invocation and
inserts the invocation + every node.
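
A rough sketch of what a best-effort dbt capture hook could look like follows. The decorator and function names are hypothetical and this is not the code in observability.py; the keys read from run_results.json (metadata.invocation_id, metadata.dbt_version, elapsed_time, and results[].unique_id / status) are part of dbt's artifact format.

```python
import functools
import json
import logging
from pathlib import Path

log = logging.getLogger("observability_sketch")


def best_effort(func):
    """Swallow any failure so capture can never break the wrapped command."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            log.debug("observability capture failed", exc_info=True)
            return None
    return wrapper


def summarize_run_results(doc: dict) -> dict:
    """Reduce a parsed run_results.json to one dbt_runs-style row."""
    results = doc.get("results", [])
    models = [r for r in results if r["unique_id"].startswith("model.")]
    tests = [r for r in results if r["unique_id"].startswith("test.")]
    return {
        "invocation_id": doc["metadata"]["invocation_id"],
        "dbt_version": doc["metadata"]["dbt_version"],
        "elapsed_s": doc["elapsed_time"],
        "models_ok": sum(r["status"] == "success" for r in models),
        "models_error": sum(r["status"] == "error" for r in models),
        "tests_passed": sum(r["status"] == "pass" for r in tests),
        "tests_failed": sum(r["status"] == "fail" for r in tests),
    }


@best_effort
def capture_from_file(path):
    """Parse the artifact on disk; returns None if anything goes wrong."""
    return summarize_run_results(json.loads(Path(path).read_text()))
```

A missing or malformed artifact makes capture_from_file return None instead of raising, which is the property the hooks above rely on.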

Two auto-generated Rill dashboards. Each appears only when its
table has data, so new projects don't start with empty explores:

_tycoon_dlt_usage.yaml — dlt load timeline, success rate, rows
loaded per schema/table/load.
_tycoon_dbt_usage.yaml — dbt run timeline, success rate, avg
duration, models built, model errors, tests passed/failed.

Parquet snapshots under data/parquet/_tycoon/ are re-exported from
the metadata DB after every capture, so Rill stays current without you
touching a scaffold command.

One caveat worth flagging on the dlt side. Row counts derive from
count(*) GROUP BY _dlt_load_id — exact for write_disposition=append,
best-effort for replace and merge (older loads' row counts drop as
rows get overwritten). Byte sizes + per-job durations (parseable from
~/.dlt/pipelines/<name>/trace.json) are deferred, as is dbt
schema-diff via manifest.json snapshotting.
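
The caveat is easier to see with a toy table. The sketch below again uses stdlib sqlite3 in place of DuckDB, a hypothetical events table, and models a replace load simply as delete-then-insert, which is an assumption made for illustration.

```python
import sqlite3

ROWS_BY_LOAD = (
    "SELECT _dlt_load_id, count(*) FROM events "
    "GROUP BY _dlt_load_id ORDER BY _dlt_load_id"
)


def rows_by_load(conn) -> dict:
    """Per-load row counts, derived the same way as described above."""
    return dict(conn.execute(ROWS_BY_LOAD).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, _dlt_load_id TEXT)")

# append: every load keeps its rows, so per-load counts stay exact.
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, "load_1"), (2, "load_1")])
conn.executemany("INSERT INTO events VALUES (?, ?)", [(3, "load_2")])
append_counts = rows_by_load(conn)

# replace (modelled as delete + insert): earlier loads vanish from the table,
# so a count taken now can no longer see load_1 or load_2.
conn.execute("DELETE FROM events")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "load_3"), (2, "load_3"), (3, "load_3")],
)
replace_counts = rows_by_load(conn)

print(append_counts)
print(replace_counts)
```

After the replace, only the latest load id has rows, which is why counts for older loads are best-effort under replace and merge.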

What's deliberately not hooked. tycoon run dbt … — the generic
CLI passthrough — skips observability capture. That command is pure
forwarding by design, and making it aware of a specific tool would
break the contract. Users who want history graduate to
tycoon data transform.

`tycoon data history` and the status Runs column

The metadata DB is great, but SQL'ing it by hand isn't the nicest UX.
v0.1.2 adds a first-class terminal view:

tycoon data history                          # last 20 runs, dlt + dbt mixed
tycoon data history --tool dbt --limit 50    # dbt-only, last 50
tycoon data history --source pokeapi         # all dlt loads for a given source
tycoon data history show deadbeef            # per-node / per-table drilldown

Short id prefixes resolve against both dlt load_ids and dbt
invocation_ids. Ambiguous prefixes error out with the candidates
listed, so you never operate on the wrong run by accident. The
--source filter accepts either a source name from tycoon.yml
(pokeapi) or a schema literal (raw_pokeapi) — both resolve to the
same filter. When --source is active, dbt runs are hidden from the
list since they aren't source-scoped.
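
Prefix resolution along these lines can be sketched in a few lines of Python. The function and exception names are hypothetical and the sample ids are made up; only the behavior (match against both id families, list the candidates on ambiguity) comes from the description above.

```python
class AmbiguousPrefix(ValueError):
    """Raised when a short id matches more than one captured run."""


def resolve_run_id(prefix, dlt_load_ids, dbt_invocation_ids):
    """Resolve a short prefix to exactly one (tool, full_id) pair."""
    candidates = [("dlt", i) for i in dlt_load_ids if i.startswith(prefix)]
    candidates += [("dbt", i) for i in dbt_invocation_ids if i.startswith(prefix)]
    if not candidates:
        raise LookupError(f"no run matches prefix {prefix!r}")
    if len(candidates) > 1:
        listed = ", ".join(f"{tool}:{run_id}" for tool, run_id in candidates)
        raise AmbiguousPrefix(f"prefix {prefix!r} is ambiguous: {listed}")
    return candidates[0]


dlt_ids = ["1713500000.11", "1713600000.22"]      # made-up load ids
dbt_ids = ["deadbeef-0001", "deadc0de-0002"]      # made-up invocation ids

print(resolve_run_id("deadb", dlt_ids, dbt_ids))  # unique match
```

Here "deadb" resolves to a single dbt invocation, while "dead" would raise AmbiguousPrefix and name both candidates.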

tycoon data status gains a new Runs column pulled from the same
metadata DB, showing the total number of captured dlt loads per source.
When any source has run history, a "Drill in with tycoon data history"
hint is printed beneath the table. Falls back to — when the metadata
DB doesn't exist yet.

Three small quality-of-life polish items landed alongside:

Scaffolded .gitignore excludes .tycoon/metadata.duckdb* — new
projects don't accidentally commit their run history on git add ..
tycoon data clean learns --metadata — by default (including
with --all), the observability metadata DB is preserved so
routine clean cycles don't nuke run history. --metadata is the
explicit opt-in to wipe it.
tycoon doctor gains an observability check — prints one of
"metadata DB not yet created", "no runs captured yet", or
"N dlt load(s), M dbt run(s) captured" so "why are my dashboards
empty?" is diagnosable in one glance.
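
The three doctor outcomes map onto a simple decision, sketched below. The function name and signature are hypothetical; the message strings are the ones quoted above.

```python
def observability_status(db_exists: bool, dlt_loads: int = 0, dbt_runs: int = 0) -> str:
    """Pick the doctor message for the current metadata DB state."""
    if not db_exists:
        return "metadata DB not yet created"
    if dlt_loads == 0 and dbt_runs == 0:
        return "no runs captured yet"
    return f"{dlt_loads} dlt load(s), {dbt_runs} dbt run(s) captured"


print(observability_status(False))
print(observability_status(True))
print(observability_status(True, dlt_loads=3, dbt_runs=1))
```

Each state that can leave the dashboards empty gets its own message, which is what makes the check useful for diagnosis.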

CI: tests now gate every PR

Until v0.1.2, the test suite only ran when someone remembered to invoke
uv run pytest locally. There was no PR workflow — the only CI job
was a tag-triggered PyPI publish and the manual e2e.yml. That's a
real hole.

New .github/workflows/ci.yml runs on every pull_request and push
to main:

Full default pytest -q suite (unit + offline-e2e; the csv-import
template's full init → sources add → sources run → row-count pipeline
now runs on every PR, not just when someone clicks "Run workflow")
uvx ruff check lint gate
Matrix on Python 3.12 + 3.13 so regressions on either minor
version get caught before release
Concurrency-gated so pushes cancel superseded runs

The network-gated e2e tests (nyc-transit, github-analytics,
weather-station) stay behind the original e2e marker and the
manual e2e.yml workflow. They need credentials and hit flaky
upstream APIs, which makes them unsuitable for per-PR gating.

Three contributor-facing additions round out the testing story:

Coverage floor: CI now fails if overall coverage drops below 60%.
Baseline is ~65%, so there are roughly 5 percentage points of drift
headroom. The floor lives in [tool.coverage.report].fail_under and is
meant to ratchet upward 1–2 points per release as real tests get
added. Aspirational targets are how coverage gates end up being
ignored; a regression gate that works is the point. The jump from 61%
to 65% came from the two targeted test-suite additions listed under
Added below.
CONTRIBUTING.md: first-time-contributor guide covering dev
setup, what CI gates on, the three test-marker tiers (default,
offline_e2e, e2e), code conventions, and the release process.
Optional pre-commit hooks: .pre-commit-config.yaml runs ruff
(--fix mode) plus a handful of standard hygiene hooks before each
commit. Opt-in via uvx pre-commit install — CI is still the source
of truth; this just catches failures before the PR opens.

What changed

Added

Tycoon observability layer (#13): new src/tycoon/observability.py
module owns .tycoon/metadata.duckdb
with four tables (dlt_runs, dlt_rows_by_table, dbt_runs,
dbt_nodes). Capture helpers capture_dlt and capture_dbt mirror
from each raw DB / parse target/run_results.json with idempotent
ON CONFLICT DO NOTHING writes. The ingestion runner calls
capture_dlt_safe after every successful run_source; the dbt
transform command calls capture_dbt_safe after every run / test
/ build. Both are best-effort — observability failures never
propagate. rill_generator.refresh_usage_dashboards re-exports the
four Parquets under data/parquet/_tycoon/ and idempotently writes
_tycoon_dlt_usage.yaml / _tycoon_dbt_usage.yaml (plus their
metrics_view + sources). Dashboards appear only when their backing
table is non-empty.
tycoon data history — terminal view of recent dlt + dbt runs
with --tool {all,dlt,dbt}, --limit N, and --source <name>
filters. tycoon data history show <id> drills into a specific run
(short id prefix resolution; ambiguous prefixes error out with
candidates listed). --source accepts either a config name from
tycoon.yml or a raw schema literal.
tycoon data status gains a Runs column pulled from
dlt_runs in the metadata DB, plus a drill-in hint when any source
has history. Falls back to — when the metadata DB doesn't exist yet.
tycoon doctor now checks observability — reports one of
"metadata DB not yet created", "no runs captured yet", or
"N dlt load(s), M dbt run(s) captured".
Scaffolded .gitignore excludes .tycoon/metadata.duckdb* so
run history never accidentally gets committed.
tycoon data clean --metadata — new flag for explicitly wiping
the observability metadata DB. By default (including --all), the
metadata DB is preserved so routine clean cycles don't nuke history.
.github/workflows/ci.yml — PR + main-push gate running the full
default pytest suite (unit + offline-e2e) plus uvx ruff check on
every change. Matrix on Python 3.12 + 3.13. Concurrency-gated.
offline_e2e pytest marker — promotes the csv-import template's
full init → ingest → row-count pipeline into the default pytest
run so CI gates on real integration, not just unit tests.
Ruff configuration in pyproject.toml (line-length 120,
target py312) with two per-file ignores for legitimate patterns.
Coverage gate via pytest-cov: floor at 60% in
[tool.coverage.report].fail_under; baseline ~65%. CI uploads
coverage.xml as an artifact on the 3.12 matrix leg.
FastAPI server tests (11 new) — /, /health, /check-updates
(mocked httpx), /api/status, /api/run/pipeline/{source_name},
/api/run/dbt (including 404 / 409 paths), and the
/ws/logs/{run_id} WebSocket.
Dagster orchestration smoke tests (10 new) — defs imports
cleanly, build_ingestion_assets handles missing and present
configs, dashed source names are sanitized, resource factories
return valid Dagster resources. Catches the
DagsterInvalidDefinitionError class of bug (legacy #4 / #13) at
import time.
CONTRIBUTING.md — onboarding doc: dev setup, CI gate details,
test marker semantics, code conventions, release process.
.pre-commit-config.yaml — optional pre-commit hooks (ruff +
standard hygiene) mirroring CI. Opt in via uvx pre-commit install.
MotherDuck warehouse alignment: _extract_dbt_duckdb_path and
the register dbt / init-wizard alignment flow now handle
path: md:<name> dbt profiles, preserving the prefix and comparing
against tycoon's md:* warehouse value.
tycoon register warehouse: cloud (MotherDuck) or local (DuckDB)
interactive prompt; surfaces MOTHERDUCK_TOKEN guidance when absent;
prompts before overwriting an existing warehouse.
@pytest.mark.e2e marker (registered in
[tool.pytest.ini_options]) with default-deselect behavior and
tests/test_templates_e2e.py covering all four built-in templates.
.github/workflows/e2e.yml: manual-trigger-only CI job for the
e2e suite, with GITHUB_TOKEN secret available for the
github-analytics slot.

Changed

Init wizard's warehouse-alignment branch now triggers when the
chosen warehouse is either DuckDB or MotherDuck (previously DuckDB
only). If alignment swaps a local path for an md:* target (or vice
versa), stack.warehouse is updated accordingly.

Fixed

tycoon init no longer emits database.raw == database.warehouse (#11).
The local-DuckDB wizard branch produced a tycoon.yml where both fields
pointed at the same file; tycoon data transform run then failed with
dbt-duckdb's Unique file handle conflict. Scaffolding now keeps raw
sibling-distinct.
tycoon ask sync / ask chat now work out of the box (#6). Replaced
python -m nao_core (which nao-core doesn't support: it has no
__main__.py) with venv-colocated nao binary resolution, mirroring how
tycoon data transform finds dbt.
MotherDuck URLs pass through Nao config verbatim (#5). Previously
md:my_catalog was path-joined to ../../md:my_catalog, silently breaking
every MotherDuck + Nao stack.
ask.include_schemas is glob-expanded before writing nao config (#10).
Bare names like mart silently matched nothing under Nao's
fnmatch(schema.table, pattern) filter; they're now auto-expanded to
mart.*. Already-qualified patterns are left alone.
Nao's chat SQLite lives at .tycoon/nao/db.sqlite (#8) instead of
inside the venv. Chat history and local Nao user accounts survive
uv sync and tycoon upgrades.
tycoon doctor recognizes cached MotherDuck OAuth (#3). It used to
unconditionally error with "MOTHERDUCK_TOKEN is not set" even when the
user had a live browser-OAuth session; it now reports token (env) /
OAuth (cached session) / not configured.
nao-core bumped 0.0.59 → 0.1.7 (#9) to silence the nagging
"run nao upgrade" banner on every invocation. Emitted config now uses
templates instead of the deprecated accessors key.
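
The include_schemas fix (#10) in the list above is easiest to understand with fnmatch itself. In the sketch below, expand_schema_pattern is a hypothetical helper, and the rule for "already qualified" (the pattern contains a dot or a glob character) is an assumption; only the fnmatch(schema.table, pattern) filter and the mart to mart.* expansion come from these notes.

```python
from fnmatch import fnmatch

GLOB_CHARS = set("*?[")


def expand_schema_pattern(pattern: str) -> str:
    """Turn a bare schema name into a schema.* glob; leave the rest alone."""
    if "." in pattern or GLOB_CHARS & set(pattern):
        return pattern  # assumed to be already qualified
    return f"{pattern}.*"


# A bare schema name never matches a schema.table string...
print(fnmatch("mart.orders", "mart"))
# ...but the expanded pattern does.
print(fnmatch("mart.orders", expand_schema_pattern("mart")))
```

Patterns such as staging.stg_* pass through unchanged, so existing configs that were already qualified behave as before.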

Upgrade notes

pip install -U database-tycoon

No breaking changes. Existing tycoon.yml files continue to work
unchanged; the new register subcommand and e2e marker are purely
additive.

Dependency bumps: rich, dlt, duckdb, pydantic, fastapi,
dagster*, nao-core, pytest all advanced minor-or-patch versions.
No API changes expected from any of them.

Known limitations (carried forward)

Snowflake and BigQuery: dbt can transform against them, but
tycoon data query is DuckDB/MotherDuck-only for now, and warehouse
alignment doesn't cover their dbt profiles yet.
External dlt pipelines still can't be run via
tycoon data sources run; only tycoon-managed dlt sources can.
github-analytics and weather-station e2e tests only verify init
today; full ingest requires template-side support for injecting
runtime values ({owner}/{repo}, {station_id}).

What's next (v0.1.3 candidates)

Template parameterization: support for injecting values into
template URL patterns at sources run time, so github-analytics
and weather-station e2e tests can actually hit live endpoints
end-to-end.
Snowflake/BigQuery warehouse alignment: structural comparison of
dbt profiles for non-DuckDB warehouses, now that the MotherDuck path
is proven.
dbt build in e2e: extend the e2e tests to run
tycoon data transform run after ingest. Waiting on templates
shipping at least one model to build — today they're ingest-only.
Observability v2: byte sizes + per-job durations for dlt (via
~/.dlt/pipelines/<name>/trace.json); dbt schema-diff via
manifest.json snapshotting (detect renamed models, added columns,
changed SQL hashes between invocations); capture hooks from
tycoon run dbt … if we decide to break the pure-passthrough
contract.
