Skip to content

Add ibis/xorq analysis backend (v2 @stat pipeline)#686

Closed
paddymul wants to merge 20 commits into
mainfrom
feat/paf-v2-ibis-v2
Closed

Add ibis/xorq analysis backend (v2 @stat pipeline)#686
paddymul wants to merge 20 commits into
mainfrom
feat/paf-v2-ibis-v2

Conversation

@paddymul
Copy link
Copy Markdown
Collaborator

Summary

Replaces #518 (closed because the GitHub UI showed a phantom 499-file diff after a clean rebase — actual diff is 10 files).

Adds an ibis-backed analysis backend to buckaroo, ported to the v2 @stat / StatPipeline framework following an adversarial review of the original v1-style implementation.

What's in

  • IbisStatPipeline (buckaroo/pluggable_analysis_framework/ibis_stat_pipeline.py) — two-phase executor: a single batched table.aggregate(...) query for all IbisColumn-param @stat funcs, then per-column post-batch (computed stats + IbisTable-param histograms) through the standard typed-DAG executor.
  • ibis_stats_v2.py rewritten as @stat-decorated functions: typing, length, null_count, min, max, distinct_count, mean, std, median, non_null_count, nan_per, distinct_per, histogram. Naming now matches pd_stats_v2 (was v1-style classes despite the _v2 filename).
  • IbisDfStatsV2 (df_stats_v2.py) — DfStats-shaped wrapper so DataFlow / CustomizableDataflow consumers can run against an ibis.Table without changes.
  • Markers: IbisColumn and IbisTable added to RAW_MARKER_TYPES + a small ibis_table injection branch in _execute_stat_func.
  • xorq optional dependency group in pyproject.toml; lock pins xorq 0.3.21 + ibis-framework 12.0.0.

Adversarial review fixes baked in

  • Histogram is live — was dead code (defined but never registered); numeric + categorical paths both have tests asserting populated histograms.
  • Structured error capture — three silent except Exception: continue blocks replaced with Ok/Err accumulator semantics; failures surface as StatError. Test asserts a bad @stat shows up in errors.
  • DAG ordering — implicit via @stat parameter typing; no longer a doc comment.
  • Naming v1 vs v2 — the file is _v2 and the contents now match the @stat v2 system.
  • Empty-table length=0SUM over empty returns NULL in SQL; coalesced.
  • Real widget path — IbisDfStatsV2 makes the pipeline reachable from DataFlow.

Test plan

  • ruff check + ruff format clean on changed files
  • 25 tests in test_ibis_stats_v2.py (was 23 in v1) covering typing, batched aggregates, numeric stats, computed stats, histogram (numeric + categorical + constant column), structured error capture, full pipeline, empty table
  • 4 integration tests in test_ibis_df_stats_v2.py covering .sdf, v1-shape .errs, verify_analysis_objects, ordered_a_objs
  • Full unit suite: 762 passed, 1 pre-existing MCP flake unrelated to this work, 6 skipped (vs 756 / 1 / 6 on baseline — net +6 tests)

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 30, 2026

📦 TestPyPI package published

pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25199479204

or with uv:

uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25199479204

MCP server for Claude Code

claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.13.5.dev25199479204" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table

📖 Docs preview

🎨 Storybook preview

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 212edeabf0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +124 to +128
class IbisDfStatsV2:
"""Drop-in DfStats wrapper for ``ibis.Table`` inputs.

Mirrors the ``DfStatsV2`` / ``PlDfStatsV2`` surface (``.sdf``, ``.errs``,
``.ap.ordered_a_objs``, ``verify_analysis_objects``) so DataFlow,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Implement add_analysis on IbisDfStatsV2

IbisDfStatsV2 is described as a drop-in DataFlow-compatible stats wrapper, but it does not implement add_analysis, while DataFlow.add_analysis unconditionally calls stats.add_analysis(...) (buckaroo/dataflow/dataflow.py:430-440). Any DataFlow configured to use IbisDfStatsV2 will raise AttributeError as soon as a user adds an analysis interactively, so this breaks the claimed no-changes integration path.

Useful? React with 👍 / 👎.

.order_by(ibis.desc("__count"))
.limit(10)
)
df = query.execute()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Route histogram execution through configured backend

The histogram code executes queries via query.execute() directly, which bypasses IbisStatPipeline._execute and therefore ignores any backend passed to IbisStatPipeline(…, backend=...). In mixed or unbound-expression setups this can execute against the wrong engine (or fail) specifically for histogram stats, while the rest of the pipeline uses the configured backend.

Useful? React with 👍 / 👎.

paddymul and others added 19 commits April 30, 2026 22:25
- Fix IbisAnalysisPipeline: remove premature xorq guard (only require ibis),
  add schema pre-seeding for dtype, filter None from expression results,
  add histogram_query_fns support
- Add ibis_stats_v2.py with IbisTypingStats, IbisBaseSummaryStats,
  IbisNumericStats, IbisComputedSummaryStats, IbisHistogramStats
- Add xorq optional dependency group to pyproject.toml
- Add 23 tests using ibis.memtable (no xorq needed for local tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pre-push hook required ruff format on the new ibis files; regenerated
the lock so xorq picks up 0.3.21 (was 0.3.10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the existing v1-style class tests with tests targeting the
yet-to-exist IbisStatPipeline + @stat-based ibis_stats_v2 module.

Adds coverage that was missing in v1:
  - Live histogram (numeric + categorical) — was dead code
  - Structured error capture: a bad @stat surfaces as a StatError,
    not a silent stat dropout
  - DAG validation at pipeline construction
  - Constant-column histogram edge case

Will fail collection until ibis_stat_pipeline.py exists. Implementation
in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the v1-style IbisAnalysis class hierarchy with @stat-decorated
functions executed through a new IbisStatPipeline that mirrors the
typed-DAG semantics of StatPipeline.

Two-phase execution:
  1. Batch aggregate — every @stat with an IbisColumn parameter is
     folded into a single table.aggregate() query and executed once.
  2. Per-column post-batch — computed stats and IbisTable-param stats
     (histograms) run through the standard typed-DAG executor.

Resolves the issues called out in the adversarial review:

  - Histogram is live (was dead code in v1: IbisHistogramStats was
    defined but never added to IBIS_ANALYSIS). Numeric and categorical
    paths now have test coverage that asserts populated histograms.

  - Structured error capture replaces the three `except Exception:
    continue` blocks. Failures inside an @stat or in the batch query
    surface as StatError entries; downstream computed stats still see
    UpstreamError propagation. Tests assert that an intentionally bad
    @stat shows up in the returned errors list.

  - DAG ordering is implicit. requires_summary is no longer a comment —
    parameter names ARE the dependency declaration, validated at
    construction via build_typed_dag.

  - Naming matches the v2 pattern: filename stays ibis_stats_v2 but the
    contents now match the @stat / TypedDict style of pd_stats_v2.

  - Empty-table length=0 fix: the previous `COUNT(c) + SUM(...IS NULL)`
    expression returned NULL on an empty table because SUM-over-empty
    is NULL in SQL. Coalesced to 0.

Adds IbisColumn / IbisTable marker types to stat_func.RAW_MARKER_TYPES
and a small IbisTable injection branch in _execute_stat_func so the
v2 pipeline can pass the table through to histogram-style stats.

Deletes the v1 IbisAnalysisPipeline / IbisAnalysis (no remaining
consumers in buckaroo/ or tests/).

25/25 ibis tests pass; full unit suite passes (modulo the pre-existing
unrelated MCP uvx flake on main).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Asserts the DfStats-shaped wrapper that lets DataFlow / BuckarooWidget
consume an ibis.Table: .sdf, .errs (v1 ErrDict shape), the
verify_analysis_objects classmethod, and .ap.ordered_a_objs.

Will fail collection until IbisDfStatsV2 is added in df_stats_v2.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Makes the ibis pipeline reachable from the rest of buckaroo:

  - Adds IbisDfStatsV2 in df_stats_v2.py mirroring DfStatsV2 /
    PlDfStatsV2 (.sdf, .errs, .ap.ordered_a_objs,
    verify_analysis_objects). Any DataFlow consumer that takes an
    arbitrary DFStatsClass now also accepts the ibis path.

  - Adds process_table_v1_compat to IbisStatPipeline so the wrapper
    returns the same {(col, stat): (Exception, kls)} ErrDict shape
    that AnalysisPipeline produced.

  - Adds ordered_a_objs property to IbisStatPipeline so the
    add_analysis path matches StatPipeline.

ibis is still imported lazily inside IbisDfStatsV2 — the rest of the
df_stats_v2 module stays ibis-free.

End-to-end test asserts the round-trip (table -> .sdf with length=5
and mean=3.0) and that v1 errs shape matches the existing convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- IbisStatPipeline -> XorqStatPipeline
- IbisColumn / IbisTable -> XorqColumn / XorqTable
- IbisDfStatsV2 -> XorqDfStatsV2
- IBIS_STATS_V2 -> XORQ_STATS_V2
- ibis_stat_pipeline.py / ibis_stats_v2.py and tests -> xorq_*

`import ibis` and HAS_IBIS kept — those reference the actual ibis-framework
package, which is what the pipeline imports under the hood. The user-facing
identifiers all move to Xorq to match buckaroo's branding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
XorqStatPipeline accepts a `backend=` and uses it for the batch aggregate
via `self._execute`, but the histogram @stat calls `query.execute()`
directly three times (bounds, numeric bucket, categorical group_by).

A TrackingBackend should see >1 call (batch + at least one histogram
query); today it sees exactly 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Histogram had three direct `query.execute()` calls (bounds, numeric
bucket, categorical group_by) that bypassed `XorqStatPipeline._execute`,
ignoring any user-supplied `backend=` argument.

Adds a new marker type `XorqExecute` to the v2 framework. The pipeline
injects its bound `_execute` method into any @stat that declares an
`execute: XorqExecute` parameter, exactly like XorqTable / RawSeries.

`histogram` and the helper functions now route every query through the
injected callable. The TrackingBackend test now sees all 9 queries
(1 batch + 4 columns × ~2 histogram queries) instead of just the batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- #6: @stat(provides='foo') should override the auto-derived key
- #4: histogram must not issue redundant bounds query (5 queries, not 7)
- #7: XorqDfStatsV2.add_analysis must extend the DAG and reprocess
- #10: StatPipeline.process_df must keep column keys for empty-but-typed dfs

#5 has no behavior test — pure optimization, existing tests verify correctness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s=, add_analysis, empty df

#6@stat(provides='key') as a clean alternative to the single-key
TypedDict charade. Drops _LengthResult / _MinResult / etc. in
xorq_stats_v2.py; functions now declare their actual scalar return type
and use provides= to name the accumulator key.

#4 — XorqStatPipeline pre-populates min=None, max=None per column and
adds them to EXTERNAL_KEYS. Histogram declares min: float, max: float
deps and uses them directly. base_min / base_max overwrite the None for
numeric cols; non-numeric cols keep None and fall into the categorical
branch. Removes the redundant per-column bounds query — query count
drops from 7 to 5 on the 4-col fixture (regression test in place).

#5 — length hoisted out of per-column @stat. The batch query now
includes one ``__total_length__`` expression (table.count()) regardless
of column count, replacing N per-column length expressions. base_length
removed from XORQ_STATS_V2; length lives in EXTERNAL_KEYS and is
populated post-batch.

#7 — XorqDfStatsV2.add_analysis + XorqStatPipeline.add_stat. Skips the
PERVERSE_DF unit-test that StatPipeline.add_stat runs (no ibis-side
equivalent). Behavioral parity with DfStatsV2.add_analysis.

#10 — StatPipeline.process_df short-circuits on len(df.columns) == 0
instead of len(df) == 0, so an empty-but-typed df produces per-column
entries instead of {}. Brings the pandas pipeline in line with
XorqStatPipeline's behavior on empty tables.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
StatPipeline runs a PERVERSE_DF unit_test in __init__ to catch dumb stat
bugs early. XorqStatPipeline doesn't yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#9 — XorqStatPipeline.__init__ now runs a unit_test against ibis.memtable
wrapping PERVERSE_DF and stores the result in _unit_test_result. Catches
dumb stat bugs (typos, wrong-dtype assumptions) at construction. The
smoke check temporarily swaps backend=None so it doesn't inflate the
caller's backend traffic — internal validation should run on the
in-memory ibis backend, not the user's pipeline target.

#8 — pyproject.toml now exposes two extras:
  - buckaroo[ibis]: just ibis-framework[duckdb] (minimal, sufficient for
    a single-backend workflow)
  - buckaroo[xorq]: ibis + xorq (cross-backend compute)
The XorqStatPipeline ImportError now points at both options.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- _to_python_scalar must coerce pd.NA / pd.NaT to None (currently returns
  the singleton, which fails the framework's isinstance type check
  downstream with a confusing TypeError)
- typing_stats must classify 'interval' as datetime-ish (today's
  substring-based check misses it)

#11.3 (lazy-import comment) and #11.4 (XORQ_STATS_V2 docstring) are
non-behavior changes — no failing-test commit for those.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ORQ_STATS_V2 note

#11.1 — _to_python_scalar coalesces pd.NA / pd.NaT to None. Without this
they pass through the helper unchanged and trip the framework's
isinstance type check downstream with a confusing TypeError. np.nan is
left alone since it's a valid float.

#11.2 — typing_stats uses startswith for temporal classification rather
than ``in``. Adds 'interval' which was previously missed. Tighter match
guards against future ibis types with substring overlap.

#11.3 — df_stats_v2 imports XorqStatPipeline at module level. The lazy
import inside the methods was only ever decorative — xorq_stat_pipeline.py
already wraps ``import ibis`` in try/except, so the module is import-safe
regardless of whether ibis is installed.

#11.4 — XORQ_STATS_V2 gets a NOTE explaining it's xorq-only:
typing_stats requires 'dtype', histogram requires length/min/max — all
externally pre-populated by XorqStatPipeline but not in StatPipeline's
EXTERNAL_KEYS. Passing this list to the pandas StatPipeline raises
DAGConfigError; use PD_ANALYSIS_V2 / PL_ANALYSIS_V2 there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ruff format applied default style (single → double quotes, etc.) on the
files this branch touched. uv regenerated lock entries for the new
``buckaroo[ibis]`` extras group added in #8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architectural rule: the xorq pipeline operates on xorq expressions only,
so buckaroo's source must not ``import ibis`` / ``from ibis ...`` —
xorq.api (which re-exports xorq.vendor.ibis) is the only sanctioned
surface.

Fails today on three call sites in xorq_stat_pipeline.py and
xorq_stats_v2.py. Fix lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s only

The xorq stat pipeline talks to xorq expressions (xorq.vendor.ibis), so
buckaroo's source must not import raw ``ibis-framework`` — the two
namespaces yield distinct, incompatible Table / dtype / op classes, and
mixing them silently couples us to a second package whose identities
don't match xorq's vendored copies.

Changes:
- ``xorq_stat_pipeline.py``: ``import ibis`` → ``import xorq.api as xo``;
  ``HAS_IBIS`` → ``HAS_XORQ``; ``ibis.memtable(PERVERSE_DF)`` (in
  ``unit_test()``) → ``xo.memtable(PERVERSE_DF)``; install hint now
  points only at ``buckaroo[xorq]``.
- ``xorq_stats_v2.py``: same import swap; ``ibis.desc("__count")`` →
  ``xo.desc("__count")`` in the categorical histogram order_by.
- Test fixtures: ``pytest.importorskip("ibis")`` →
  ``pytest.importorskip("xorq.api")``; ``ibis.memtable(...)`` →
  ``xo.memtable(...)``.
- ``pyproject.toml``: drop the ``[ibis]`` extra entirely; the ``[xorq]``
  extra now declares only ``xorq>=0.1.0`` (xorq vendors ibis, no
  separate ibis-framework dep needed). uv.lock regen drops 75 lines and
  removes ``duckdb`` and ``ibis-framework`` from the resolution.

Verified locally: raw ``import ibis`` raises ImportError in the synced
venv; the 116 paf/xorq tests all pass against xorq-only.

Closes the rule introduced by the prior TDD-red commit
(test_no_raw_ibis_imports_in_source).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Windows ``path.read_text()`` defaults to the system encoding
(cp1252), and buckaroo source files contain UTF-8 em-dashes in
docstrings, so the test crashed with a ``UnicodeDecodeError`` on the
Windows leg of CI even though the rule itself was satisfied.

Pin the read to ``encoding="utf-8"``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xorq 0.3.21 (latest as of writing) caps ``pyarrow<22`` and declares
``requires-python = '<3.14,>=3.10'``. Combined with the new
``pyarrow>=22`` floor for python>=3.14 from #689 (just merged into
main), the [xorq] extra is unsatisfiable on 3.14 — uv lock failed
with: "xorq>=0.3.1 depends on pyarrow<22 ... we can conclude that
xorq>=0.1.14 depends on pyarrow<22" against the 3.14 floor.

Gate the extra on ``python_version < '3.14'`` so:
  - 3.10–3.13 install buckaroo[xorq] as before (xorq + pyarrow 21)
  - 3.14 installs nothing extra; pyarrow 22 lands via the main deps;
    xorq tests skip via the existing ``pytest.importorskip("xorq.api")``
  - When xorq ships 3.14 support and lifts the pyarrow cap, drop the
    gate.

Also a clean lock regen (``rm uv.lock && uv lock``) — the rebase used
``-X theirs`` to drive uv.lock conflicts deterministically across 19
commits, but the resulting lock had a dangling source dup that would
not parse. Regen produces a single coherent lock that satisfies all
matrix Pythons (3.11, 3.12, 3.13, 3.14, plus Max-Versions × 4) on
linux/macos/windows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@paddymul
Copy link
Copy Markdown
Collaborator Author

paddymul commented May 1, 2026

Replaced by #691 — fresh PR off main with only the xorq-substantive changes. The history on this branch accumulated a lot of unrelated reformatting churn (ruff-version skew vs main's #688 paddy-format roll-out), making review noisy. #691 is the same feature, trimmed.

@paddymul paddymul closed this May 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant