
feat(polars-search): plumb search term to JS as highlight_regex #745

Closed
paddymul wants to merge 5 commits into main from feat/polars-search-highlight-sd


@paddymul
Collaborator

Summary

Polars Search.transform now returns (filtered_df, sd_updates) so the search term flows into cleaning_sd as highlight_regex on every polars-String column. DefaultMainStyling.style_column reads highlight_phrase / highlight_regex / highlight_color off col_meta and threads them into the string displayer_args, where the JS side already renders matches as <mark>.

This is the first concrete consumer of the (df, sd_updates) 2-tuple transform contract from #744.

Depends on

#744 — the jlisp interpreter contract change must merge first. CI here will fail until that lands; after it does, this PR will rebase and the duplicate commit drops via patch-id.

Pieces

  • feat(jlisp) — temporarily included; drops on rebase after feat(jlisp): transform may return (df, sd_updates) tuple #744.
  • fix: rename op-contributed sd keys + harden style_column against missing _type — autocleaning rewrites the op-supplied keys via old_col_new_col(cleaned_df) so they line up with buckaroo's internal a/b/c names; style_column falls back to obj when _type is absent.
  • feat(polars-search): plumb search term to JS highlight via sd — the Search transform change + styling-layer extraction.
  • test(polars): end-to-end highlight delivery into df_viewer_config — integration test asserting the full pipeline lands highlight_regex on the correct column in df_viewer_config.
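
The key-renaming step in the second piece can be sketched roughly as follows. This is a minimal illustration, not buckaroo's code: `rename_sd_keys` and `merge_column_meta` are hypothetical helpers standing in for the mapping derived from `old_col_new_col(cleaned_df)` and for the sd merge, and the column names and `_type` values are made up.

```python
def rename_sd_keys(sd_updates, orig_to_internal):
    """Re-key op-supplied sd entries from original to internal column names.

    Hypothetical helper: orig_to_internal plays the role of the
    column -> letter mapping described in the PR.
    """
    renamed = {}
    for orig_name, col_meta in sd_updates.items():
        # Unknown columns keep their original key instead of being dropped.
        internal = orig_to_internal.get(orig_name, orig_name)
        renamed[internal] = dict(col_meta)
    return renamed


def merge_column_meta(base_sd, updates):
    """Shallow per-column merge: update keys win, untouched keys survive."""
    merged = {col: dict(meta) for col, meta in base_sd.items()}
    for col, meta in updates.items():
        merged.setdefault(col, {}).update(meta)
    return merged


# Assumed example data: two columns renamed to internal names a and b.
orig_to_internal = {"name": "a", "age": "b"}
existing_sd = {"a": {"_type": "string"}, "b": {"_type": "integer"}}
op_updates = {"name": {"highlight_regex": "al"}}

merged = merge_column_meta(existing_sd, rename_sd_keys(op_updates, orig_to_internal))
```

Without the rename, the update would land under a separate `name` key next to `a`, which is the orphan case the fix is meant to avoid.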

Test plan

@github-actions
Contributor

github-actions Bot commented May 15, 2026

📦 TestPyPI package published

pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25992030988

or with uv:

uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25992030988

MCP server for Claude Code

claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.13.5.dev25992030988" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table

📖 Docs preview

🎨 Storybook preview


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80ee37703c


Comment on lines +106 to +108
for k in ('highlight_phrase', 'highlight_regex', 'highlight_color'):
    if k in column_metadata:
        disp[k] = column_metadata[k]
P2 Badge Add frontend handling for highlight fields

When a search term is present, these keys are now copied into displayer_args, but the current JS string displayer does not consume them: I checked packages/buckaroo-js-core/src/components/DFViewerParts/DFWhole.ts and Displayer.ts, where StringDisplayerA only defines max_length and getStringFormatter only slices/returns the raw value. In any Polars widget using the search op, the backend will send highlight_regex but the cells still render as plain strings, so the new highlighting feature has no visible effect until the frontend formatter/renderer is updated to use these fields.
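
For reference, the behaviour the review asks the frontend formatter to implement can be sketched in Python as below. This is only a stand-in for logic that would live in the JS string displayer: `mark_matches` is a hypothetical name, and Python's `re` syntax differs from JavaScript's `RegExp` in places, so the sketch shows the shape of the transformation rather than the actual renderer.

```python
import html
import re


def mark_matches(value, highlight_regex):
    """Wrap each non-empty regex match in <mark>, escaping everything else."""
    pattern = re.compile(highlight_regex)
    pieces = []
    pos = 0
    for m in pattern.finditer(value):
        if m.start() == m.end():
            continue  # skip empty matches so patterns like "x*" add no tags
        pieces.append(html.escape(value[pos:m.start()]))
        pieces.append("<mark>" + html.escape(m.group(0)) + "</mark>")
        pos = m.end()
    pieces.append(html.escape(value[pos:]))
    return "".join(pieces)


highlighted = mark_matches("alice & al", "al")
```

Escaping the unmatched text and the matched text separately keeps cell content from being interpreted as markup once the `<mark>` tags are added.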


paddymul and others added 5 commits May 17, 2026 09:20
Lets a Command thread styling-relevant metadata alongside its df result,
without a separate channel through the dataflow. configure_buckaroo wraps
each registered primitive: if a transform returns a 2-tuple ending in a
dict, the second element is captured in a per-call sd_accumulator and only
the df flows to the lisp interpreter. buckaroo_transform exposes
get_last_sd_updates() for the caller; autocleaning's handle_ops_and_clean
merges the snapshot into cleaning_sd via merge_sds.

Single-return transforms (all existing ones) are unchanged.
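
A rough sketch of the wrapping described in this commit message, under the assumption that the accumulator is a plain list. `wrap_transform` and the two example transforms are hypothetical; the real wiring lives in `configure_buckaroo` and is not reproduced here.

```python
def wrap_transform(transform, sd_accumulator):
    """Wrap a primitive so that only the df reaches the interpreter.

    If the transform returns a 2-tuple whose second element is a dict,
    the dict is appended to sd_accumulator and the first element is returned.
    Any other return value passes through unchanged.
    """
    def wrapped(*args, **kwargs):
        result = transform(*args, **kwargs)
        if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
            df, sd_updates = result
            sd_accumulator.append(sd_updates)
            return df
        return result
    return wrapped


# Hypothetical transforms; lists stand in for dataframes.
def plain_transform(df):
    return [x * 2 for x in df]


def tuple_transform(df):
    return df, {"name": {"highlight_regex": "al"}}


accumulator = []
plain_result = wrap_transform(plain_transform, accumulator)([1, 2])
tuple_result = wrap_transform(tuple_transform, accumulator)([1, 2])
```

After both calls the accumulator holds a single dict, contributed by `tuple_transform`; the single-return transform leaves it untouched.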

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ing _type

The rest of buckaroo's sd is keyed by internal a/b/c column names (the
analysis classes run against a renamed df), but a Command's transform sees
only the original column names. After collecting (df, sd_updates) from
the interpreter, autocleaning now rewrites the op-supplied keys via the
cleaned_df's column→letter mapping so the updates merge into the existing
sd entries instead of sitting alongside as orphans.

Belt-and-suspenders: DefaultMainStyling.style_column falls back to obj
when `_type` is missing, rather than KeyError-ing. This shouldn't trigger
in normal flow now that the rename is correct, but it stops a noisy
warning if an op contributes a column not present in summary_sd.
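
The fallback can be illustrated with a simplified stand-in for `style_column`. The displayer names and the return shape here are assumptions made for the example, and the real method handles more types than this sketch does.

```python
HIGHLIGHT_KEYS = ("highlight_phrase", "highlight_regex", "highlight_color")


def style_column_sketch(col, column_metadata):
    """Simplified stand-in: choose a displayer from _type, defaulting to obj."""
    # .get() avoids a KeyError when an op contributes a column without _type.
    col_type = column_metadata.get("_type", "obj")
    if col_type == "string":
        disp = {"displayer": "string"}
        # Copy only the highlight keys that are actually present.
        for k in HIGHLIGHT_KEYS:
            if k in column_metadata:
                disp[k] = column_metadata[k]
    else:
        disp = {"displayer": "obj"}
    return {"col_name": col, "displayer_args": disp}


with_type = style_column_sketch("a", {"_type": "string", "highlight_regex": "al"})
without_type = style_column_sketch("orphan", {"highlight_regex": "al"})
```

In this sketch a column without `_type` is rendered with the `obj` displayer and its highlight keys are ignored, since they are only threaded into string displayer args.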

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Search.transform now returns (filtered_df, {string_col: {'highlight_regex': val}})
so the search term flows into cleaning_sd through the interpreter's 2-tuple
return contract (jlisp). DefaultMainStyling.style_column picks up
highlight_phrase / highlight_regex / highlight_color from col_meta and
copies them into the string displayer_args, which is already understood
by the JS-side string displayer.

No changes to dataflow.py or the widget — the lowcode op contributes
styling metadata via the sd merge, which the styling layer reads natively.
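
The return shape described above can be sketched without polars. In this illustration a list of dict rows and a dtype map stand in for the dataframe and its schema, `search_transform` is a hypothetical function, and treating the term as a regex when filtering is an assumption about the real `Search.transform`.

```python
import re


def search_transform(rows, dtypes, val):
    """Return (filtered_rows, sd_updates) for a search term.

    rows: list of dicts, one per row (stand-in for a polars DataFrame).
    dtypes: column name -> dtype name (stand-in for the schema).
    """
    string_cols = [col for col, dtype in dtypes.items() if dtype == "String"]
    pattern = re.compile(val)
    # Keep a row if any string column contains a match for the term.
    filtered = [
        row for row in rows
        if any(row[col] is not None and pattern.search(row[col]) for col in string_cols)
    ]
    # Every string column receives the term, whether or not it matched.
    sd_updates = {col: {"highlight_regex": val} for col in string_cols}
    return filtered, sd_updates


rows = [
    {"name": "alice", "city": "oslo", "age": 30},
    {"name": "bob", "city": "bern", "age": 41},
]
dtypes = {"name": "String", "city": "String", "age": "Int64"}
filtered, sd_updates = search_transform(rows, dtypes, "al")
```

The integer column gets no entry in `sd_updates`, which matches the commit's description of highlighting only string columns.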

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two widget-level tests pinning both delivery paths:

- Search op → cleaning_sd → merged_sd → renamed col_meta → string
  displayer_args.highlight_regex on every polars-String column (and
  not on integer columns).
- column_config_overrides with highlight_phrase + highlight_color
  survives the merge_column_config step intact.

Both inspect df_display_args['main']['df_viewer_config']['column_config']
— the same structure that ships to the JS-side renderer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@paddymul
Collaborator Author

Superseded by #758, which is the same work rebased onto main after #755 merged. The obsolete tuple-contract commit (feat(jlisp): transform may return (df, sd_updates) tuple) is dropped; Search.transform now returns SDResult from #755 instead.

@paddymul paddymul closed this May 17, 2026