feat(xorq-search): plumb search term to JS as highlight_phrase via sd channel #765

Open: paddymul wants to merge 3 commits into main from feat/xorq-search-highlight-sd

@paddymul
Collaborator

Summary

Mirrors #758 for the xorq backend. The xorq path doesn't go through configure_buckaroo's lisp interpreter (ibis exprs can't .copy()), so it can't reuse the SDResult machinery from #755 directly. Instead this PR adds an analogous sd channel inside XorqAutocleaning:

  • Handlers in _XORQ_OP_HANDLERS may now return either a bare expr (legacy) or (expr, sd_updates). _apply_xorq_ops accumulates the per-column sd entries across ops, merging col-by-col.
  • handle_ops_and_clean runs the accumulated updates through _rekey_op_sd_to_internal (the same helper used by PandasAutocleaning.handle_ops_and_clean in #758), so the entries keyed by original column name land on buckaroo's internal a/b/c letter keys and compose cleanly with the summary_sd that XorqDataflow._get_summary_sd produces.
  • _xorq_search returns the filtered expr plus {col: {'highlight_phrase': [val]}} for every ibis-String column.
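
The three points above can be sketched in plain Python. This is a minimal illustration, not the PR's code: rows are plain dicts rather than ibis exprs, and `search_handler`, `apply_ops`, and `rekey_to_internal` are hypothetical stand-ins for `_xorq_search`, `_apply_xorq_ops`, and `_rekey_op_sd_to_internal`. The OR-across-columns filter and the positional a/b/c key mapping are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple, Union

Row = Dict[str, Any]
SDUpdates = Dict[str, Dict[str, Any]]
HandlerResult = Union[List[Row], Tuple[List[Row], SDUpdates]]


def search_handler(rows: List[Row], string_cols: List[str], val: str) -> HandlerResult:
    """Hypothetical stand-in for _xorq_search, working on plain dict rows."""
    if val == "":
        # Empty search: no filtering, no highlight (bare, legacy-style return).
        return rows
    # Assumption: a row is kept if any string column contains the needle.
    kept = [r for r in rows if any(val in r[c] for c in string_cols)]
    return kept, {c: {"highlight_phrase": [val]} for c in string_cols}


def apply_ops(
    rows: List[Row], ops: List[Callable[[List[Row]], HandlerResult]]
) -> Tuple[List[Row], SDUpdates]:
    """Run ops in order, accepting a bare result or a (result, sd_updates) pair."""
    sd: SDUpdates = {}
    for op in ops:
        result = op(rows)
        if isinstance(result, tuple):
            rows, updates = result
        else:
            rows, updates = result, {}
        # Merge column by column so later ops add to earlier entries.
        for col, entry in updates.items():
            sd.setdefault(col, {}).update(entry)
    return rows, sd


def rekey_to_internal(sd: SDUpdates, columns: List[str]) -> SDUpdates:
    """Assumed mapping: the i-th column gets the i-th lowercase letter."""
    letters = {col: chr(ord("a") + i) for i, col in enumerate(columns)}
    return {letters[col]: entry for col, entry in sd.items() if col in letters}


rows = [
    {"name": "admin_al", "role": "ops", "score": 1},
    {"name": "bo", "role": "admin", "score": 2},
    {"name": "cy", "role": "guest", "score": 3},
]
columns = ["name", "role", "score"]

filtered, sd = apply_ops(rows, [lambda r: search_handler(r, ["name", "role"], "admin")])
internal_sd = rekey_to_internal(sd, columns)
```

Here `internal_sd` carries a `highlight_phrase` entry under `a` and `b` only, which matches the shape the tests in this PR assert for the string columns, while the integer column gets no entry.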

Uses highlight_phrase (list of literal needles) rather than highlight_regex: ibis StringValue.contains is a literal substring match, so a phrase highlight matches the actual filter semantics.
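
To illustrate why a literal phrase is the safer highlight, here is a small sketch using only Python's standard library. It does not reproduce the JS highlighter or ibis; `literal_spans` and `regex_spans` are hypothetical helpers that show how the same needle marks different spans when treated as a regex.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]


def literal_spans(cell: str, needle: str) -> List[Span]:
    """Spans a literal-phrase highlighter would mark (non-overlapping)."""
    spans: List[Span] = []
    if not needle:
        return spans
    i = cell.find(needle)
    while i != -1:
        spans.append((i, i + len(needle)))
        i = cell.find(needle, i + len(needle))
    return spans


def regex_spans(cell: str, pattern: str) -> List[Span]:
    """Spans a regex highlighter would mark if the needle were used as a pattern."""
    return [m.span() for m in re.finditer(pattern, cell)]


cell = "a.c and abc"
needle = "a.c"

as_phrase = literal_spans(cell, needle)             # only the literal "a.c"
as_regex = regex_spans(cell, needle)                # "." also matches the "b" in "abc"
as_escaped = regex_spans(cell, re.escape(needle))   # escaping restores literal behaviour
```

With the needle `a.c`, the regex reading highlights `abc` as well, even though a literal substring filter would not have matched on it. A needle such as `(` would not even compile as a regex.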

Scope: only the search command is wired today. The sd channel itself is generic — other ops can opt in by returning (expr, sd_updates).

Test plan

  • pytest tests/unit/test_xorq_buckaroo_widget.py tests/unit/test_xorq_stats_v2.py tests/unit/test_xorq_df_stats_v2.py — 78 pass locally
  • CI green (xorq jobs)

Tests added

  • TestSearchHighlight.test_search_op_delivers_highlight_phrase_into_displayer_args — end-to-end through XorqBuckarooWidget with quick_command_args = {'search': ['admin']}. Asserts highlight_phrase == ['admin'] lands in displayer_args for both ibis-String columns (a=name, b=role) and the integer column (c=score) stays clean.
  • TestSearchHighlight.test_empty_search_drops_highlight_from_displayer_args — clearing the search box ("") pulls the highlight back out, symmetric to the empty-val short-circuit in _xorq_search.

TDD: the failing-tests commit was pushed first, so the CI run on that commit should show as failing before the implementation commit lands.

🤖 Generated with Claude Code

Pins the xorq equivalent of #758 polars Search → SDResult, but using
`highlight_phrase` (list) rather than `highlight_regex` — ibis
`StringValue.contains` is a literal substring match, so a phrase match
on the JS side matches the actual filter semantics.

Both tests fail today: the xorq path (XorqAutocleaning._apply_xorq_ops
bypassing the lisp interpreter) has no sd channel and the search
handler returns a bare expr.

- test_search_op_delivers_highlight_phrase_into_displayer_args:
  end-to-end through XorqBuckarooWidget with quick_command_args, asserts
  highlight_phrase lands in displayer_args for each ibis-String column
  (a=name, b=role) and the integer column (c=score) stays clean.
- test_empty_search_drops_highlight_from_displayer_args: clearing the
  search box pulls the highlight back out — symmetric to the empty-val
  short-circuit in _xorq_search.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: acbdd8eb66


Comment thread tests/unit/test_xorq_buckaroo_widget.py
@github-actions
Contributor

github-actions Bot commented May 17, 2026

📦 TestPyPI package published

pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.14.2.dev26006473751

or with uv:

uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.14.2.dev26006473751

MCP server for Claude Code

claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.14.2.dev26006473751" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table

📖 Docs preview

🎨 Storybook preview

… channel

Mirrors #758 for the xorq backend. The xorq path doesn't go through
configure_buckaroo's lisp interpreter (ibis exprs can't .copy()), so it
can't reuse the SDResult machinery from #755 directly. Instead this
adds an analogous sd channel inside XorqAutocleaning:

- Handlers in _XORQ_OP_HANDLERS may now return either a bare expr
  (legacy) or (expr, sd_updates). _apply_xorq_ops accumulates the
  per-column sd entries across ops, merging col-by-col.
- handle_ops_and_clean runs the accumulated updates through
  _rekey_op_sd_to_internal (the same helper PandasAutocleaning uses
  since #758) so orig-named entries land on buckaroo's internal a/b/c
  letter keys and compose cleanly with the summary_sd that
  XorqDataflow._get_summary_sd produces (also keyed by letter).
- _xorq_search returns the filtered expr plus {col: {'highlight_phrase':
  [val]}} for every ibis-String column.

Uses highlight_phrase (list of literal needles) rather than
highlight_regex because ibis StringValue.contains is a literal
substring match — matching the filter semantics on the highlight side
avoids regex-metacharacter divergence.

Scope: only the search command is wired today. The sd channel itself
is generic — other ops can opt in by returning (expr, sd_updates).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add an assert in handle_ops_and_clean that result_expr.columns ==
  df.columns. The rekey runs against the input expr — correct today
  because the only sd-producing handler is _xorq_search (filter
  preserves identity), but a future op that renames/drops columns
  would silently mis-map the sd entries onto the wrong letter keys.
- _apply_xorq_ops: type the signature (-> Tuple[Any, Dict[...]]) and
  collapse the dict accumulation to setdefault().update() — same
  semantics, less noise.
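
A rough sketch of why the guard matters and what the collapsed merge looks like. This assumes positional a/b/c letter keys, as in the test description (a=name, b=role, c=score); `letter_keys`, `merge_sd`, and `assert_columns_unchanged` are hypothetical names, not the PR's helpers.

```python
from typing import Any, Dict, List

SDUpdates = Dict[str, Dict[str, Any]]


def letter_keys(columns: List[str]) -> Dict[str, str]:
    """Assumed mapping: the i-th column gets the i-th lowercase letter."""
    return {col: chr(ord("a") + i) for i, col in enumerate(columns)}


def merge_sd(acc: SDUpdates, updates: SDUpdates) -> SDUpdates:
    """Column-by-column merge using the setdefault().update() idiom."""
    for col, entry in updates.items():
        acc.setdefault(col, {}).update(entry)
    return acc


def assert_columns_unchanged(input_columns: List[str], result_columns: List[str]) -> None:
    """Guard: rekeying against the input is only valid if columns are unchanged."""
    assert list(result_columns) == list(input_columns), (
        "ops changed the column set; sd entries would be rekeyed incorrectly"
    )


input_columns = ["name", "role", "score"]
# Hypothetical future op that drops the first column.
result_columns = ["role", "score"]

key_from_input = letter_keys(input_columns)["role"]    # key used when rekeying against the input
key_from_result = letter_keys(result_columns)["role"]  # key the result would actually use

merged = merge_sd(
    {"name": {"existing": 1}},
    {"name": {"highlight_phrase": ["admin"]}, "role": {"highlight_phrase": ["admin"]}},
)
```

Under these assumptions the same column resolves to different letters before and after the drop, which is the silent mis-mapping the assert is meant to turn into a loud failure.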

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@paddymul deployed to testpypi May 17, 2026 23:58 with GitHub Actions (Active)