Add ibis/xorq analysis backend (v2 @stat pipeline)#686
Conversation
📦 TestPyPI package publishedpip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25199479204or with uv: uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25199479204MCP server for Claude Codeclaude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.13.5.dev25199479204" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table📖 Docs preview🎨 Storybook preview |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 212edeabf0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| class IbisDfStatsV2: | ||
| """Drop-in DfStats wrapper for ``ibis.Table`` inputs. | ||
|
|
||
| Mirrors the ``DfStatsV2`` / ``PlDfStatsV2`` surface (``.sdf``, ``.errs``, | ||
| ``.ap.ordered_a_objs``, ``verify_analysis_objects``) so DataFlow, |
There was a problem hiding this comment.
Implement add_analysis on IbisDfStatsV2
IbisDfStatsV2 is described as a drop-in DataFlow-compatible stats wrapper, but it does not implement add_analysis, while DataFlow.add_analysis unconditionally calls stats.add_analysis(...) (buckaroo/dataflow/dataflow.py:430-440). Any DataFlow configured to use IbisDfStatsV2 will raise AttributeError as soon as a user adds an analysis interactively, so this breaks the claimed no-changes integration path.
Useful? React with 👍 / 👎.
| .order_by(ibis.desc("__count")) | ||
| .limit(10) | ||
| ) | ||
| df = query.execute() |
There was a problem hiding this comment.
Route histogram execution through configured backend
The histogram code executes queries via query.execute() directly, which bypasses IbisStatPipeline._execute and therefore ignores any backend passed to IbisStatPipeline(…, backend=...). In mixed or unbound-expression setups this can execute against the wrong engine (or fail) specifically for histogram stats, while the rest of the pipeline uses the configured backend.
Useful? React with 👍 / 👎.
- Fix IbisAnalysisPipeline: remove premature xorq guard (only require ibis), add schema pre-seeding for dtype, filter None from expression results, add histogram_query_fns support - Add ibis_stats_v2.py with IbisTypingStats, IbisBaseSummaryStats, IbisNumericStats, IbisComputedSummaryStats, IbisHistogramStats - Add xorq optional dependency group to pyproject.toml - Add 23 tests using ibis.memtable (no xorq needed for local tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pre-push hook required ruff format on the new ibis files; regenerated the lock so xorq picks up 0.3.21 (was 0.3.10). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the existing v1-style class tests with tests targeting the yet-to-exist IbisStatPipeline + @stat-based ibis_stats_v2 module. Adds coverage that was missing in v1: - Live histogram (numeric + categorical) — was dead code - Structured error capture: a bad @stat surfaces as a StatError, not a silent stat dropout - DAG validation at pipeline construction - Constant-column histogram edge case Will fail collection until ibis_stat_pipeline.py exists. Implementation in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the v1-style IbisAnalysis class hierarchy with @stat-decorated functions executed through a new IbisStatPipeline that mirrors the typed-DAG semantics of StatPipeline. Two-phase execution: 1. Batch aggregate — every @stat with an IbisColumn parameter is folded into a single table.aggregate() query and executed once. 2. Per-column post-batch — computed stats and IbisTable-param stats (histograms) run through the standard typed-DAG executor. Resolves the issues called out in the adversarial review: - Histogram is live (was dead code in v1: IbisHistogramStats was defined but never added to IBIS_ANALYSIS). Numeric and categorical paths now have test coverage that asserts populated histograms. - Structured error capture replaces the three `except Exception: continue` blocks. Failures inside an @stat or in the batch query surface as StatError entries; downstream computed stats still see UpstreamError propagation. Tests assert that an intentionally bad @stat shows up in the returned errors list. - DAG ordering is implicit. requires_summary is no longer a comment — parameter names ARE the dependency declaration, validated at construction via build_typed_dag. - Naming matches the v2 pattern: filename stays ibis_stats_v2 but the contents now match the @stat / TypedDict style of pd_stats_v2. - Empty-table length=0 fix: the previous `COUNT(c) + SUM(...IS NULL)` expression returned NULL on an empty table because SUM-over-empty is NULL in SQL. Coalesced to 0. Adds IbisColumn / IbisTable marker types to stat_func.RAW_MARKER_TYPES and a small IbisTable injection branch in _execute_stat_func so the v2 pipeline can pass the table through to histogram-style stats. Deletes the v1 IbisAnalysisPipeline / IbisAnalysis (no remaining consumers in buckaroo/ or tests/). 25/25 ibis tests pass; full unit suite passes (modulo the pre-existing unrelated MCP uvx flake on main). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Asserts the DfStats-shaped wrapper that lets DataFlow / BuckarooWidget consume an ibis.Table: .sdf, .errs (v1 ErrDict shape), the verify_analysis_objects classmethod, and .ap.ordered_a_objs. Will fail collection until IbisDfStatsV2 is added in df_stats_v2.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Makes the ibis pipeline reachable from the rest of buckaroo:
- Adds IbisDfStatsV2 in df_stats_v2.py mirroring DfStatsV2 /
PlDfStatsV2 (.sdf, .errs, .ap.ordered_a_objs,
verify_analysis_objects). Any DataFlow consumer that takes an
arbitrary DFStatsClass now also accepts the ibis path.
- Adds process_table_v1_compat to IbisStatPipeline so the wrapper
returns the same {(col, stat): (Exception, kls)} ErrDict shape
that AnalysisPipeline produced.
- Adds ordered_a_objs property to IbisStatPipeline so the
add_analysis path matches StatPipeline.
ibis is still imported lazily inside IbisDfStatsV2 — the rest of the
df_stats_v2 module stays ibis-free.
End-to-end test asserts the round-trip (table -> .sdf with length=5
and mean=3.0) and that v1 errs shape matches the existing convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- IbisStatPipeline -> XorqStatPipeline - IbisColumn / IbisTable -> XorqColumn / XorqTable - IbisDfStatsV2 -> XorqDfStatsV2 - IBIS_STATS_V2 -> XORQ_STATS_V2 - ibis_stat_pipeline.py / ibis_stats_v2.py and tests -> xorq_* `import ibis` and HAS_IBIS kept — those reference the actual ibis-framework package, which is what the pipeline imports under the hood. The user-facing identifiers all move to Xorq to match buckaroo's branding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
XorqStatPipeline accepts a `backend=` and uses it for the batch aggregate via `self._execute`, but the histogram @stat calls `query.execute()` directly three times (bounds, numeric bucket, categorical group_by). A TrackingBackend should see >1 call (batch + at least one histogram query); today it sees exactly 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Histogram had three direct `query.execute()` calls (bounds, numeric bucket, categorical group_by) that bypassed `XorqStatPipeline._execute`, ignoring any user-supplied `backend=` argument. Adds a new marker type `XorqExecute` to the v2 framework. The pipeline injects its bound `_execute` method into any @stat that declares an `execute: XorqExecute` parameter, exactly like XorqTable / RawSeries. `histogram` and the helper functions now route every query through the injected callable. The TrackingBackend test now sees all 9 queries (1 batch + 4 columns × ~2 histogram queries) instead of just the batch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- #6: @stat(provides='foo') should override the auto-derived key - #4: histogram must not issue redundant bounds query (5 queries, not 7) - #7: XorqDfStatsV2.add_analysis must extend the DAG and reprocess - #10: StatPipeline.process_df must keep column keys for empty-but-typed dfs #5 has no behavior test — pure optimization, existing tests verify correctness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s=, add_analysis, empty df #6 — @stat(provides='key') as a clean alternative to the single-key TypedDict charade. Drops _LengthResult / _MinResult / etc. in xorq_stats_v2.py; functions now declare their actual scalar return type and use provides= to name the accumulator key. #4 — XorqStatPipeline pre-populates min=None, max=None per column and adds them to EXTERNAL_KEYS. Histogram declares min: float, max: float deps and uses them directly. base_min / base_max overwrite the None for numeric cols; non-numeric cols keep None and fall into the categorical branch. Removes the redundant per-column bounds query — query count drops from 7 to 5 on the 4-col fixture (regression test in place). #5 — length hoisted out of per-column @stat. The batch query now includes one ``__total_length__`` expression (table.count()) regardless of column count, replacing N per-column length expressions. base_length removed from XORQ_STATS_V2; length lives in EXTERNAL_KEYS and is populated post-batch. #7 — XorqDfStatsV2.add_analysis + XorqStatPipeline.add_stat. Skips the PERVERSE_DF unit-test that StatPipeline.add_stat runs (no ibis-side equivalent). Behavioral parity with DfStatsV2.add_analysis. #10 — StatPipeline.process_df short-circuits on len(df.columns) == 0 instead of len(df) == 0, so an empty-but-typed df produces per-column entries instead of {}. Brings the pandas pipeline in line with XorqStatPipeline's behavior on empty tables. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
StatPipeline runs a PERVERSE_DF unit_test in __init__ to catch dumb stat bugs early. XorqStatPipeline doesn't yet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#9 — XorqStatPipeline.__init__ now runs a unit_test against ibis.memtable wrapping PERVERSE_DF and stores the result in _unit_test_result. Catches dumb stat bugs (typos, wrong-dtype assumptions) at construction. The smoke check temporarily swaps backend=None so it doesn't inflate the caller's backend traffic — internal validation should run on the in-memory ibis backend, not the user's pipeline target. #8 — pyproject.toml now exposes two extras: - buckaroo[ibis]: just ibis-framework[duckdb] (minimal, sufficient for a single-backend workflow) - buckaroo[xorq]: ibis + xorq (cross-backend compute) The XorqStatPipeline ImportError now points at both options. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- _to_python_scalar must coerce pd.NA / pd.NaT to None (currently returns the singleton, which fails the framework's isinstance type check downstream with a confusing TypeError) - typing_stats must classify 'interval' as datetime-ish (today's substring-based check misses it) #11.3 (lazy-import comment) and #11.4 (XORQ_STATS_V2 docstring) are non-behavior changes — no failing-test commit for those. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ORQ_STATS_V2 note #11.1 — _to_python_scalar coalesces pd.NA / pd.NaT to None. Without this they pass through the helper unchanged and trip the framework's isinstance type check downstream with a confusing TypeError. np.nan is left alone since it's a valid float. #11.2 — typing_stats uses startswith for temporal classification rather than ``in``. Adds 'interval' which was previously missed. Tighter match guards against future ibis types with substring overlap. #11.3 — df_stats_v2 imports XorqStatPipeline at module level. The lazy import inside the methods was only ever decorative — xorq_stat_pipeline.py already wraps ``import ibis`` in try/except, so the module is import-safe regardless of whether ibis is installed. #11.4 — XORQ_STATS_V2 gets a NOTE explaining it's xorq-only: typing_stats requires 'dtype', histogram requires length/min/max — all externally pre-populated by XorqStatPipeline but not in StatPipeline's EXTERNAL_KEYS. Passing this list to the pandas StatPipeline raises DAGConfigError; use PD_ANALYSIS_V2 / PL_ANALYSIS_V2 there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ruff format applied default style (single → double quotes, etc.) on the files this branch touched. uv regenerated lock entries for the new ``buckaroo[ibis]`` extras group added in #8. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architectural rule: the xorq pipeline operates on xorq expressions only, so buckaroo's source must not ``import ibis`` / ``from ibis ...`` — xorq.api (which re-exports xorq.vendor.ibis) is the only sanctioned surface. Fails today on three call sites in xorq_stat_pipeline.py and xorq_stats_v2.py. Fix lands in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s only
The xorq stat pipeline talks to xorq expressions (xorq.vendor.ibis), so
buckaroo's source must not import raw ``ibis-framework`` — the two
namespaces yield distinct, incompatible Table / dtype / op classes, and
mixing them silently couples us to a second package whose identities
don't match xorq's vendored copies.
Changes:
- ``xorq_stat_pipeline.py``: ``import ibis`` → ``import xorq.api as xo``;
``HAS_IBIS`` → ``HAS_XORQ``; ``ibis.memtable(PERVERSE_DF)`` (in
``unit_test()``) → ``xo.memtable(PERVERSE_DF)``; install hint now
points only at ``buckaroo[xorq]``.
- ``xorq_stats_v2.py``: same import swap; ``ibis.desc("__count")`` →
``xo.desc("__count")`` in the categorical histogram order_by.
- Test fixtures: ``pytest.importorskip("ibis")`` →
``pytest.importorskip("xorq.api")``; ``ibis.memtable(...)`` →
``xo.memtable(...)``.
- ``pyproject.toml``: drop the ``[ibis]`` extra entirely; the ``[xorq]``
extra now declares only ``xorq>=0.1.0`` (xorq vendors ibis, no
separate ibis-framework dep needed). uv.lock regen drops 75 lines and
removes ``duckdb`` and ``ibis-framework`` from the resolution.
Verified locally: raw ``import ibis`` raises ImportError in the synced
venv; the 116 paf/xorq tests all pass against xorq-only.
Closes the rule introduced by the prior TDD-red commit
(test_no_raw_ibis_imports_in_source).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Windows ``path.read_text()`` defaults to the system encoding (cp1252), and buckaroo source files contain UTF-8 em-dashes in docstrings, so the test crashed with a ``UnicodeDecodeError`` on the Windows leg of CI even though the rule itself was satisfied. Pin the read to ``encoding="utf-8"``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
f885eb4 to
596056c
Compare
xorq 0.3.21 (latest as of writing) caps ``pyarrow<22`` and declares ``requires-python = '<3.14,>=3.10'``. Combined with the new ``pyarrow>=22`` floor for python>=3.14 from #689 (just merged into main), the [xorq] extra is unsatisfiable on 3.14 — uv lock failed with: "xorq>=0.3.1 depends on pyarrow<22 ... we can conclude that xorq>=0.1.14 depends on pyarrow<22" against the 3.14 floor. Gate the extra on ``python_version < '3.14'`` so: - 3.10–3.13 install buckaroo[xorq] as before (xorq + pyarrow 21) - 3.14 installs nothing extra; pyarrow 22 lands via the main deps; xorq tests skip via the existing ``pytest.importorskip("xorq.api")`` - When xorq ships 3.14 support and lifts the pyarrow cap, drop the gate. Also a clean lock regen (``rm uv.lock && uv lock``) — the rebase used ``-X theirs`` to drive uv.lock conflicts deterministically across 19 commits, but the resulting lock had a dangling source dup that would not parse. Regen produces a single coherent lock that satisfies all matrix Pythons (3.11, 3.12, 3.13, 3.14, plus Max-Versions × 4) on linux/macos/windows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
596056c to
e7c96a7
Compare
Summary
Replaces #518 (closed because the GitHub UI showed a phantom 499-file diff after a clean rebase — actual diff is 10 files).
Adds an ibis-backed analysis backend to buckaroo, ported to the v2
@stat/StatPipelineframework following an adversarial review of the original v1-style implementation.What's in
IbisStatPipeline(buckaroo/pluggable_analysis_framework/ibis_stat_pipeline.py) — two-phase executor: a single batchedtable.aggregate(...)query for allIbisColumn-param @stat funcs, then per-column post-batch (computed stats +IbisTable-param histograms) through the standard typed-DAG executor.ibis_stats_v2.pyrewritten as@stat-decorated functions: typing, length, null_count, min, max, distinct_count, mean, std, median, non_null_count, nan_per, distinct_per, histogram. Naming now matchespd_stats_v2(was v1-style classes despite the_v2filename).IbisDfStatsV2(df_stats_v2.py) — DfStats-shaped wrapper so DataFlow /CustomizableDataflowconsumers can run against anibis.Tablewithout changes.IbisColumnandIbisTableadded toRAW_MARKER_TYPES+ a smallibis_tableinjection branch in_execute_stat_func.xorqoptional dependency group in pyproject.toml; lock pins xorq 0.3.21 + ibis-framework 12.0.0.Adversarial review fixes baked in
except Exception: continueblocks replaced withOk/Erraccumulator semantics; failures surface asStatError. Test asserts a bad@statshows up inerrors.@statparameter typing; no longer a doc comment._v2and the contents now match the@statv2 system.SUMover empty returns NULL in SQL; coalesced.Test plan
ruff check+ruff formatclean on changed filestest_ibis_stats_v2.py(was 23 in v1) covering typing, batched aggregates, numeric stats, computed stats, histogram (numeric + categorical + constant column), structured error capture, full pipeline, empty tabletest_ibis_df_stats_v2.pycovering.sdf, v1-shape.errs,verify_analysis_objects,ordered_a_objs🤖 Generated with Claude Code