Tycoon v0.1.2
Released: 2026-04-19
Six themes in this release: MotherDuck alignment,
tycoon register warehouse, network-gated e2e tests plus a new
PR-gated CI workflow that runs the default suite on every push, a
Nao/ask cleanup pass, a new Tycoon observability layer that
captures dlt + dbt run history into a dedicated metadata DuckDB and
surfaces it as two auto-generated Rill dashboards, and a terminal-side
view of that history via tycoon data history plus a new Runs column
on tycoon data status. Together they finish the register family,
extend the warehouse-alignment story from v0.1.1, fix a cluster of
papercuts around tycoon ask that surfaced during the MotherDuck + Nao +
LM Studio dogfood walkthrough, and close the observability gap between
"I ran a pipeline" and "I can see exactly what it did, now and
historically".
Headline
Warehouse alignment now covers MotherDuck
The alignment check added in v0.1.1 prevented the "dbt writes to
data/warehouse.duckdb, tycoon data query reads from somewhere else"
divergence — but only for local DuckDB. v0.1.2 extends it to MotherDuck:
if your dbt project's profile targets md:theirs and tycoon.yml has a
local DuckDB warehouse (or the other way around), the wizard and
tycoon register dbt both prompt to adopt the dbt-side value.
Snowflake/BigQuery intentionally not covered yet — dbt profiles for
those warehouses are structurally different (no single "connection
string" to compare), and we'd rather ship MotherDuck cleanly than
half-cover the others. tycoon doctor still reports credential
readiness for Snowflake/BigQuery.
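A minimal sketch of the comparison the alignment check has to make, assuming both sides are plain strings (a local DuckDB path or an md:<name> target). The function names here are hypothetical and are not tycoon's internals:

```python
from pathlib import PurePosixPath

MD_PREFIX = "md:"


def normalize_target(value: str) -> str:
    """Return a comparable form of a DuckDB path or a MotherDuck target."""
    value = value.strip()
    if value.startswith(MD_PREFIX):
        # MotherDuck targets are names, not filesystem paths: keep verbatim.
        return value
    # Local paths: collapse "./" and duplicate separators before comparing.
    return str(PurePosixPath(value))


def needs_alignment(dbt_path: str, tycoon_warehouse: str) -> bool:
    """True when dbt and tycoon.yml point at different warehouses."""
    return normalize_target(dbt_path) != normalize_target(tycoon_warehouse)
```

With these assumptions, needs_alignment("md:theirs", "data/warehouse.duckdb") is true and would trigger the prompt, while "./data/warehouse.duckdb" and "data/warehouse.duckdb" compare as equal.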
tycoon register warehouse
The register family grows one more:
tycoon register warehouse
An interactive prompt asks "cloud or local?". Pick local and you're asked
for a DuckDB path. Pick cloud and you get a MotherDuck name prompt plus
instructions for grabbing a token from
app.motherduck.com/token if
MOTHERDUCK_TOKEN isn't already set. Updates
database.warehouse and stack.warehouse in tycoon.yml; prompts
before overwriting an existing warehouse setting.
We considered a parallel tycoon register ingestion and shelved it —
for external ingestion (Airbyte, Fivetran) tycoon only records the
choice; the command would have been a one-line config flip, and the
wizard already covers that case. If it turns out to be useful, it's
two dozen lines in a follow-up.
Network-gated e2e template tests
A new @pytest.mark.e2e marker, deselected from the default pytest
run, powers per-template end-to-end tests. The default suite stays fast
and fully offline; pytest -m e2e opts into the network path.
One test per built-in template exists today; coverage depth varies by
template:
- csv-import: full ingest + row-count assertion (no network).
- nyc-transit: fetches nyc-dot with --max-records 50; flaky upstream
  responses xfail rather than fail the build.
- github-analytics: skipped without GITHUB_TOKEN; init-only today
  (template URLs still hard-code {owner}/{repo} placeholders).
- weather-station: init-only today (URL templates need
  {station_id}/{office} fill-in; follow-up will add defaults).
A new .github/workflows/e2e.yml runs the suite on workflow_dispatch
only — no cron, no default-branch runs, no minutes burned on flaky
upstream timeouts. Someone clicks "Run workflow" when they want an
upstream compatibility check.
Tycoon observability: dlt + dbt run history, now a first-class citizen
tycoon data status has always shown the last dlt load time and
current row counts — useful, but silent about history. And dbt has
never had any tycoon-side observability at all: target/run_results.json
gets overwritten on every invocation, so unless you archived it
yourself you had no way to see "did my last build succeed? how long
did it take? which model is getting slower over time?"
v0.1.2 introduces a small observability layer that closes both gaps at
once.
A dedicated metadata DuckDB at .tycoon/metadata.duckdb. Four
tables, disposable by design — delete the file to reset history:
- dlt_runs: one row per _dlt_loads entry across every source schema
- dlt_rows_by_table: per-load row counts per table (via
  count(*) GROUP BY _dlt_load_id)
- dbt_runs: one row per dbt invocation (command, elapsed, success,
  models_ok, models_error, tests_passed, tests_failed, dbt_version,
  target, invocation_id)
- dbt_nodes: one row per model/test/seed/snapshot per invocation
  (status, execution_time, rows_affected, compile_time, message)
All writes are INSERT … ON CONFLICT DO NOTHING, so re-capturing the
same load or invocation is a no-op. You can also query the metadata DB
directly:
tycoon data query --db .tycoon/metadata.duckdb \
"SELECT invocation_id, started_at, elapsed_s, success
FROM dbt_runs ORDER BY started_at DESC LIMIT 10"
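To illustrate why re-capturing is a no-op, here is a small sketch of the INSERT … ON CONFLICT DO NOTHING pattern. It uses Python's built-in sqlite3 as a stand-in for DuckDB so it runs without extra dependencies, and the three-column dbt_runs table is a simplified assumption, not the real schema:

```python
import sqlite3


def record_run(conn: sqlite3.Connection, invocation_id: str,
               command: str, success: bool) -> None:
    """Insert one run; a second insert with the same id is silently skipped."""
    conn.execute(
        "INSERT INTO dbt_runs (invocation_id, command, success) "
        "VALUES (?, ?, ?) ON CONFLICT DO NOTHING",
        (invocation_id, command, int(success)),
    )


conn = sqlite3.connect(":memory:")
# The primary key on invocation_id is what makes the conflict detectable.
conn.execute(
    "CREATE TABLE dbt_runs ("
    "invocation_id TEXT PRIMARY KEY, command TEXT, success INTEGER)"
)

record_run(conn, "deadbeef", "run", True)
record_run(conn, "deadbeef", "run", True)  # re-capture of the same invocation

run_count = conn.execute("SELECT count(*) FROM dbt_runs").fetchone()[0]
```

After both calls the table still holds a single row, which is the property the capture hooks rely on.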
Capture hooks. Two thin best-effort calls, each wrapped in
try/except so observability can never break the operation it's
attached to:
- The dlt ingestion runner mirrors new loads into the metadata DB at
  the end of every successful tycoon data sources run.
- tycoon data transform run/test/build parses
  target/run_results.json immediately after each dbt invocation and
  inserts the invocation + every node.
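The two hooks above can be sketched roughly as follows. This is an illustration only: best_effort and summarize_run_results are hypothetical names, and the fields read from run_results.json (metadata.invocation_id, elapsed_time, and per-result unique_id, status, execution_time) are the ones dbt documents for that artifact, not a statement about which columns tycoon actually stores.

```python
import functools
import json
import logging
from pathlib import Path
from typing import Any, Callable

log = logging.getLogger("observability")


def summarize_run_results(data: dict[str, Any]) -> dict[str, Any]:
    """Reduce a parsed run_results.json to one invocation plus its nodes."""
    return {
        "invocation_id": data.get("metadata", {}).get("invocation_id"),
        "elapsed_s": data.get("elapsed_time"),
        "nodes": [
            {
                "unique_id": r.get("unique_id"),
                "status": r.get("status"),
                "execution_time": r.get("execution_time"),
            }
            for r in data.get("results", [])
        ],
    }


def best_effort(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap fn so any failure is logged and swallowed, never re-raised."""
    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return fn(*args, **kwargs)
        except Exception:
            log.warning("observability capture failed", exc_info=True)
            return None
    return wrapper


@best_effort
def capture_from_file(path: Path) -> dict[str, Any]:
    """Read and summarize run_results.json; errors are swallowed."""
    return summarize_run_results(json.loads(path.read_text()))
```

A missing or malformed file makes capture_from_file return None instead of raising, so the dbt command it follows still exits with its own status.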
Two auto-generated Rill dashboards. Each appears only when its
table has data, so new projects don't start with empty explores:
- _tycoon_dlt_usage.yaml: dlt load timeline, success rate, rows
  loaded per schema/table/load.
- _tycoon_dbt_usage.yaml: dbt run timeline, success rate, avg
  duration, models built, model errors, tests passed/failed.
Parquet snapshots under data/parquet/_tycoon/ are re-exported from
the metadata DB after every capture, so Rill stays current without you
touching a scaffold command.
One caveat worth flagging on the dlt side. Row counts derive from
count(*) GROUP BY _dlt_load_id — exact for write_disposition=append,
best-effort for replace and merge (older loads' row counts drop as
rows get overwritten). Byte sizes + per-job durations (parseable from
~/.dlt/pipelines/<name>/trace.json) are deferred, as is dbt
schema-diff via manifest.json snapshotting.
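The append-versus-replace caveat is easy to see in a toy reproduction. The sketch below uses sqlite3 in place of DuckDB and simulates a replace load with a plain DELETE before the insert; table names and the load helper are made up for the example:

```python
import sqlite3


def rows_by_load(conn: sqlite3.Connection, table: str) -> dict[str, int]:
    """Per-load row counts, derived the same way as described above."""
    query = f'SELECT _dlt_load_id, count(*) FROM "{table}" GROUP BY _dlt_load_id'
    return dict(conn.execute(query).fetchall())


def load(conn: sqlite3.Connection, table: str, load_id: str,
         n_rows: int, disposition: str) -> None:
    """Simulate one load; 'replace' wipes existing rows first."""
    if disposition == "replace":
        conn.execute(f'DELETE FROM "{table}"')
    conn.executemany(
        f'INSERT INTO "{table}" VALUES (?, ?)',
        [(i, load_id) for i in range(n_rows)],
    )


conn = sqlite3.connect(":memory:")
for name in ("events_append", "events_replace"):
    conn.execute(f'CREATE TABLE "{name}" (id INTEGER, _dlt_load_id TEXT)')

load(conn, "events_append", "load_1", 3, "append")
load(conn, "events_append", "load_2", 2, "append")
load(conn, "events_replace", "load_1", 3, "replace")
load(conn, "events_replace", "load_2", 2, "replace")

append_counts = rows_by_load(conn, "events_append")
replace_counts = rows_by_load(conn, "events_replace")
```

With append, both loads keep their counts; with replace, load_1 no longer appears at all because its rows were overwritten before the count was taken.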
What's deliberately not hooked. tycoon run dbt … — the generic
CLI passthrough — skips observability capture. That command is pure
forwarding by design, and making it aware of a specific tool would
break the contract. Users who want history graduate to
tycoon data transform.
tycoon data history and the status Runs column
The metadata DB is great, but SQL'ing it by hand isn't the nicest UX.
v0.1.2 adds a first-class terminal view:
tycoon data history # last 20 runs, dlt + dbt mixed
tycoon data history --tool dbt --limit 50 # dbt-only, last 50
tycoon data history --source pokeapi # all dlt loads for a given source
tycoon data history show deadbeef # per-node / per-table drilldown
Short id prefixes resolve against both dlt load_ids and dbt
invocation_ids. Ambiguous prefixes error out with the candidates
listed, so you never operate on the wrong run by accident. The
--source filter accepts either a source name from tycoon.yml
(pokeapi) or a schema literal (raw_pokeapi) — both resolve to the
same filter. When --source is active, dbt runs are hidden from the
list since they aren't source-scoped.
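Prefix resolution of this kind is typically a few lines. The following is a sketch under the assumption that all known dlt load_ids and dbt invocation_ids are available as strings; resolve_run_id and AmbiguousPrefix are illustrative names, not tycoon's API:

```python
class AmbiguousPrefix(ValueError):
    """Raised when a short id matches more than one captured run."""


def resolve_run_id(prefix: str, known_ids: list[str]) -> str:
    """Return the single id starting with prefix, or raise with candidates."""
    matches = sorted({run_id for run_id in known_ids if run_id.startswith(prefix)})
    if not matches:
        raise LookupError(f"no run matches prefix {prefix!r}")
    if len(matches) > 1:
        raise AmbiguousPrefix(
            f"prefix {prefix!r} is ambiguous: {', '.join(matches)}"
        )
    return matches[0]
```

Listing the candidates in the error message lets the user lengthen the prefix by a character or two and retry, rather than guessing which run was picked.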
tycoon data status gains a new Runs column pulled from the same
metadata DB, showing the total number of captured dlt loads per source.
When any source has run history, a Drill in with tycoon data history
hint is printed beneath the table. Falls back to — when the metadata
DB doesn't exist yet.
Three small quality-of-life polish items landed alongside:
- Scaffolded .gitignore excludes .tycoon/metadata.duckdb*: new
  projects don't accidentally commit their run history on git add ..
- tycoon data clean learns --metadata: by default (including
  with --all), the observability metadata DB is preserved so
  routine clean cycles don't nuke run history. --metadata is the
  explicit opt-in to wipe it.
- tycoon doctor gains an observability check: prints one of
  "metadata DB not yet created", "no runs captured yet", or
  "N dlt load(s), M dbt run(s) captured" so "why are my dashboards
  empty?" is diagnosable in one glance.
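The three doctor messages map onto three states of the metadata DB. A sketch of that decision, assuming the caller has already checked whether the file exists and counted the rows (the function name and signature are hypothetical):

```python
def observability_status(db_exists: bool, dlt_loads: int = 0,
                         dbt_runs: int = 0) -> str:
    """Pick the doctor message for the current state of the metadata DB."""
    if not db_exists:
        return "metadata DB not yet created"
    if dlt_loads == 0 and dbt_runs == 0:
        return "no runs captured yet"
    return f"{dlt_loads} dlt load(s), {dbt_runs} dbt run(s) captured"
```

Separating "file missing" from "file present but empty" is what makes the empty-dashboard case diagnosable: the first means nothing has run through a hooked command yet, the second means capture ran but found nothing to record.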
CI: tests now gate every PR
Until v0.1.2, the test suite only ran when someone remembered to invoke
uv run pytest locally. There was no PR workflow — the only CI job
was a tag-triggered PyPI publish and the manual e2e.yml. That's a
real hole.
New .github/workflows/ci.yml runs on every pull_request and push
to main:
- Full default pytest -q suite (unit + offline-e2e; the csv-import
  template's full init → sources add → sources run → row-count
  pipeline now runs on every PR, not just when someone clicks "Run
  workflow")
- uvx ruff check lint gate
- Matrix on Python 3.12 + 3.13 so regressions on either minor
  version get caught before release
- Concurrency-gated so pushes cancel superseded runs
The network-gated e2e tests (nyc-transit, github-analytics,
weather-station) stay behind the original e2e marker and the
manual e2e.yml workflow — they need credentials and hit flaky
upstream APIs, not suitable for per-PR gating.
Three contributor-facing additions round out the testing story:
- Coverage floor: CI now fails if overall coverage drops below 60%.
  Baseline is ~65%, so there's ~5% drift headroom. The floor lives in
  [tool.coverage.report].fail_under and is meant to ratchet upward
  1–2 points per release as real tests get added. Aspirational
  tracking is how coverage gates end up being ignored; a regression
  gate that works is the point. The jump from 61% to 65% came from
  two targeted test-suite additions below.
- CONTRIBUTING.md: first-time-contributor guide covering dev
  setup, what CI gates on, the three test-marker tiers (default,
  offline_e2e, e2e), code conventions, and the release process.
- Optional pre-commit hooks: .pre-commit-config.yaml runs ruff
  (--fix mode) plus a handful of standard hygiene hooks before each
  commit. Opt-in via uvx pre-commit install. CI is still the source
  of truth; this just catches failures before the PR opens.
What changed
Added
- Tycoon observability layer (#13): new src/tycoon/observability.py
  module owns .tycoon/metadata.duckdb with four tables (dlt_runs,
  dlt_rows_by_table, dbt_runs, dbt_nodes). Capture helpers
  capture_dlt and capture_dbt mirror from each raw DB / parse
  target/run_results.json with idempotent ON CONFLICT DO NOTHING
  writes. The ingestion runner calls capture_dlt_safe after every
  successful run_source; the dbt transform command calls
  capture_dbt_safe after every run / test / build. Both are
  best-effort: observability failures never propagate.
  rill_generator.refresh_usage_dashboards re-exports the four
  Parquets under data/parquet/_tycoon/ and idempotently writes
  _tycoon_dlt_usage.yaml / _tycoon_dbt_usage.yaml (plus their
  metrics_view + sources). Dashboards appear only when their backing
  table is non-empty.
- tycoon data history: terminal view of recent dlt + dbt runs
  with --tool {all,dlt,dbt}, --limit N, and --source <name>
  filters. tycoon data history show <id> drills into a specific run
  (short id prefix resolution; ambiguous prefixes error out with
  candidates listed). --source accepts either a config name from
  tycoon.yml or a raw schema literal.
- tycoon data status gains a Runs column pulled from dlt_runs in
  the metadata DB, plus a drill-in hint when any source has history.
  Falls back to — when the metadata DB doesn't exist yet.
- tycoon doctor now checks observability: reports one of
  "metadata DB not yet created", "no runs captured yet", or
  "N dlt load(s), M dbt run(s) captured".
- Scaffolded .gitignore excludes .tycoon/metadata.duckdb* so
  run history never accidentally gets committed.
- tycoon data clean --metadata: new flag for explicitly wiping
  the observability metadata DB. By default (including --all), the
  metadata DB is preserved so routine clean cycles don't nuke history.
- .github/workflows/ci.yml: PR + main-push gate running the full
  default pytest suite (unit + offline-e2e) plus uvx ruff check on
  every change. Matrix on Python 3.12 + 3.13. Concurrency-gated.
- offline_e2e pytest marker: promotes the csv-import template's
  full init → ingest → row-count pipeline into the default pytest
  run so CI gates on real integration, not just unit tests.
- Ruff configuration in pyproject.toml (line-length 120,
  target py312) with two per-file ignores for legitimate patterns.
- Coverage gate via pytest-cov: floor at 60% in
  [tool.coverage.report].fail_under; baseline ~65%. CI uploads
  coverage.xml as an artifact on the 3.12 matrix leg.
- FastAPI server tests (11 new): /, /health, /check-updates
  (mocked httpx), /api/status, /api/run/pipeline/{source_name},
  /api/run/dbt (including 404 / 409 paths), and the
  /ws/logs/{run_id} WebSocket.
- Dagster orchestration smoke tests (10 new): defs imports
  cleanly, build_ingestion_assets handles missing and present
  configs, dashed source names are sanitized, resource factories
  return valid Dagster resources. Catches the
  DagsterInvalidDefinitionError class of bug (legacy #4 / #13) at
  import time.
- CONTRIBUTING.md: onboarding doc covering dev setup, CI gate
  details, test marker semantics, code conventions, release process.
- .pre-commit-config.yaml: optional pre-commit hooks (ruff +
  standard hygiene) mirroring CI. Opt in via uvx pre-commit install.
- MotherDuck warehouse alignment: _extract_dbt_duckdb_path and
  the register dbt / init-wizard alignment flow now handle
  path: md:<name> dbt profiles, preserving the prefix and comparing
  against tycoon's md:* warehouse value.
- tycoon register warehouse: cloud (MotherDuck) or local (DuckDB)
  interactive prompt; surfaces MOTHERDUCK_TOKEN guidance when absent;
  prompts before overwriting an existing warehouse.
- @pytest.mark.e2e marker (registered in
  [tool.pytest.ini_options]) with default-deselect behavior and
  tests/test_templates_e2e.py covering all four built-in templates.
- .github/workflows/e2e.yml: manual-trigger-only CI job for the
  e2e suite, with GITHUB_TOKEN secret available for the
  github-analytics slot.
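One of the Dagster smoke tests listed above checks that dashed source names are sanitized. Dagster rejects names containing characters outside letters, digits, and underscores, so a source such as github-analytics cannot be used as an asset name as-is. A minimal sketch of such a sanitizer, with a hypothetical function name and no claim that tycoon implements it this way:

```python
import re

# Anything outside letters, digits, and underscore is replaced.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_]")


def sanitize_asset_name(source_name: str) -> str:
    """Map a tycoon source name to a string usable as a Dagster name."""
    return _INVALID_CHARS.sub("_", source_name)
```

Running this at definition time means an invalid name never reaches Dagster, which is where the DagsterInvalidDefinitionError would otherwise be raised on import.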
Changed
- Init wizard's warehouse-alignment branch now triggers when the
  chosen warehouse is either DuckDB or MotherDuck (previously DuckDB
  only). If alignment swaps a local path for an md:* target (or vice
  versa), stack.warehouse is updated accordingly.
Fixed
- tycoon init no longer emits database.raw == database.warehouse
  (#11). The local-DuckDB wizard branch produced a tycoon.yml where
  both fields pointed at the same file; tycoon data transform run
  then failed with dbt-duckdb's Unique file handle conflict.
  Scaffolding now keeps raw sibling-distinct.
- tycoon ask sync / ask chat now work out of the box (#6). Replaced
  python -m nao_core (which nao-core doesn't support: no __main__.py)
  with venv-colocated nao binary resolution, mirroring how
  tycoon data transform finds dbt.
- MotherDuck URLs pass through Nao config verbatim (#5). Previously
  md:my_catalog was path-joined to ../../md:my_catalog, breaking every
  MotherDuck + Nao stack silently.
- ask.include_schemas is glob-expanded before writing nao config
  (#10). Bare names like mart silently matched nothing under Nao's
  fnmatch(schema.table, pattern) filter; they're now auto-expanded to
  mart.*. Already-qualified patterns are left alone.
- Nao's chat SQLite lives at .tycoon/nao/db.sqlite (#8) instead of
  inside the venv. Chat history and local Nao user accounts survive
  uv sync and tycoon upgrades.
- tycoon doctor recognizes cached MotherDuck OAuth (#3). Used to
  unconditionally error with MOTHERDUCK_TOKEN is not set even when the
  user had a live browser-OAuth session; now reports token (env) /
  OAuth (cached session) / not configured.
- nao-core bumped 0.0.59 → 0.1.7 (#9) to silence the nagging
  "run nao upgrade" banner on every invocation. Emitted config now
  uses templates instead of the deprecated accessors key.
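The include_schemas fix (#10) is easiest to see with fnmatch directly. In the sketch below, expand_schema_pattern is a hypothetical helper, and the rule for what counts as "already qualified" (contains a dot or a glob character) is an assumption rather than tycoon's documented behavior:

```python
from fnmatch import fnmatch

GLOB_CHARS = "*?["


def expand_schema_pattern(pattern: str) -> str:
    """Turn a bare schema name into 'schema.*'; leave other patterns alone."""
    if "." in pattern or any(ch in pattern for ch in GLOB_CHARS):
        return pattern  # assumed to be already qualified
    return f"{pattern}.*"


# The bare name never matches a schema.table string...
bare_matches = fnmatch("mart.orders", "mart")
# ...while the expanded pattern does.
expanded_matches = fnmatch("mart.orders", expand_schema_pattern("mart"))
```

Here bare_matches is False and expanded_matches is True, which is the silent mismatch the fix removes.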
Upgrade notes
pip install -U database-tycoon
No breaking changes. Existing tycoon.yml files continue to work
unchanged; the new register subcommand and e2e marker are purely
additive.
Dependency bumps: rich, dlt, duckdb, pydantic, fastapi,
dagster*, nao-core, pytest all advanced minor-or-patch versions.
No API changes expected from any of them.
Known limitations (carried forward)
- Snowflake and BigQuery: dbt can transform against them, but
  tycoon data query is DuckDB/MotherDuck-only for now, and warehouse
  alignment doesn't cover their dbt profiles yet.
- External dlt pipelines still can't be run via
  tycoon data sources run; only tycoon-managed dlt sources can.
- github-analytics and weather-station e2e tests only verify init
  today; full ingest requires template-side support for injecting
  runtime values ({owner}/{repo}, {station_id}).
What's next (v0.1.3 candidates)
- Template parameterization: support for injecting values into
  template URL patterns at sources run time, so github-analytics
  and weather-station e2e tests can actually hit live endpoints
  end-to-end.
- Snowflake/BigQuery warehouse alignment: structural comparison of
  dbt profiles for non-DuckDB warehouses, now that the MotherDuck path
  is proven.
- dbt build in e2e: extend the e2e tests to run
  tycoon data transform run after ingest. Waiting on templates
  shipping at least one model to build; today they're ingest-only.
- Observability v2: byte sizes + per-job durations for dlt (via
  ~/.dlt/pipelines/<name>/trace.json); dbt schema-diff via
  manifest.json snapshotting (detect renamed models, added columns,
  changed SQL hashes between invocations); capture hooks from
  tycoon run dbt … if we decide to break the pure-passthrough
  contract.