feat(styling): init_sd as augmentation channel (nested merge + delete_keys + demo)#748
paddymul wants to merge 9 commits into
📦 TestPyPI package published
pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25992032577
or with uv:
uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25992032577
MCP server for Claude Code
claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.13.5.dev25992032577" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1721611fcf
Lets a Command thread styling-relevant metadata alongside its df result, without a separate channel through the dataflow.
configure_buckaroo wraps each registered primitive: if a transform returns a 2-tuple ending in a dict, the second element is captured in a per-call sd_accumulator and only the df flows to the lisp interpreter. buckaroo_transform exposes get_last_sd_updates() for the caller; autocleaning's handle_ops_and_clean merges the snapshot into cleaning_sd via merge_sds.
Single-return transforms (all existing ones) are unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
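The wrapping contract described in this commit can be sketched in plain Python. This is a minimal illustration, not buckaroo's code: `wrap_transform`, the list-based `sd_accumulator`, and both stand-in transforms are hypothetical names, and plain lists stand in for dataframes.

```python
def wrap_transform(transform, sd_accumulator):
    """Wrap a transform so a trailing dict is captured instead of passed on.

    Hypothetical helper: a (df, dict) 2-tuple is split, the dict is appended
    to sd_accumulator, and only the df is returned to the caller.
    """
    def wrapped(*args, **kwargs):
        result = transform(*args, **kwargs)
        is_pair = isinstance(result, tuple) and len(result) == 2
        if is_pair and isinstance(result[1], dict):
            df, sd_updates = result
            sd_accumulator.append(sd_updates)
            return df
        # Single-return transforms pass through unchanged.
        return result
    return wrapped


# Stand-in transforms; lists of strings play the role of a dataframe.
def drop_first(rows):
    return rows[1:]


def keep_long(rows, min_len):
    kept = [r for r in rows if len(r) >= min_len]
    return kept, {"comments": {"min_len_used": min_len}}


sd_accumulator = []
plain = wrap_transform(drop_first, sd_accumulator)
with_sd = wrap_transform(keep_long, sd_accumulator)

plain_result = plain(["a", "b", "c"])
pair_result = with_sd(["rat", "mouse", "ox"], 3)
```

After both calls, `sd_accumulator` holds only the dict contributed by `keep_long`, and both wrapped calls return a bare list, which is the property that lets existing single-return transforms stay untouched.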
…ing _type
The rest of buckaroo's sd is keyed by internal a/b/c column names (the analysis classes run against a renamed df), but a Command's transform sees only the original column names. After collecting (df, sd_updates) from the interpreter, autocleaning now rewrites the op-supplied keys via the cleaned_df's column→letter mapping so the updates merge into the existing sd entries instead of sitting alongside as orphans.
Belt-and-suspenders: DefaultMainStyling.style_column falls back to obj when `_type` is missing, rather than KeyError-ing. This shouldn't trigger in normal flow now that the rename is correct, but it stops a noisy warning if an op contributes a column not present in summary_sd.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
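To illustrate the orphan problem and the rename fix, here is a small sketch under stated assumptions: `rename_sd_keys` and `merge_two_sds` are hypothetical helpers (the latter a simplified stand-in for buckaroo's `merge_sds`), and the column→letter mapping is written by hand.

```python
def rename_sd_keys(sd_updates, orig_to_internal):
    """Rewrite op-supplied keys from original column names to internal ones."""
    return {orig_to_internal.get(col, col): meta for col, meta in sd_updates.items()}


def merge_two_sds(base, updates):
    """Simplified stand-in for merge_sds: per-column shallow merge."""
    merged = {col: dict(meta) for col, meta in base.items()}
    for col, meta in updates.items():
        merged.setdefault(col, {}).update(meta)
    return merged


summary_sd = {"a": {"_type": "integer"}, "b": {"_type": "string"}}
orig_to_internal = {"id": "a", "comments": "b"}
op_updates = {"comments": {"highlight_regex": "rat"}}

# Without the rename, the update sits next to the real entry as an orphan.
orphaned = merge_two_sds(summary_sd, op_updates)

# With the rename, it lands on the existing entry for column "b".
fixed = merge_two_sds(summary_sd, rename_sd_keys(op_updates, orig_to_internal))

# The belt-and-suspenders fallback: a missing _type is treated as "obj".
orphan_type = orphaned["comments"].get("_type", "obj")
```

The orphaned entry has no `_type`, which is exactly the case the `style_column` fallback guards against.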
Search.transform now returns (filtered_df, {string_col: {'highlight_regex': val}})
so the search term flows into cleaning_sd through the interpreter's 2-tuple
return contract (jlisp). DefaultMainStyling.style_column picks up
highlight_phrase / highlight_regex / highlight_color from col_meta and
copies them into the string displayer_args, which is already understood
by the JS-side string displayer.
No changes to dataflow.py or the widget — the lowcode op contributes
styling metadata via the sd merge, which the styling layer reads natively.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
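A rough sketch of the styling-side pickup follows. It assumes a type dispatch keyed on `_type`; the function name `displayer_args_for` and the displayer names are stand-ins rather than buckaroo's actual values.

```python
HIGHLIGHT_KEYS = ("highlight_phrase", "highlight_regex", "highlight_color")


def displayer_args_for(col_meta):
    """Build displayer_args; only string columns receive highlight keys."""
    col_type = col_meta.get("_type", "obj")
    if col_type != "string":
        # Stand-in for the non-string branches of the type dispatch.
        return {"displayer": col_type}
    disp = {"displayer": "string"}
    for key in HIGHLIGHT_KEYS:
        if key in col_meta:
            disp[key] = col_meta[key]
    return disp


string_meta = {"_type": "string", "highlight_regex": "rat"}
integer_meta = {"_type": "integer", "highlight_regex": "rat"}

string_args = displayer_args_for(string_meta)
integer_args = displayer_args_for(integer_meta)
```

Because the highlight keys are read from flat top-level entries on the column metadata, an op only needs to contribute `{col: {'highlight_regex': val}}` and the styling layer does the rest.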
Two widget-level tests pinning both delivery paths:
- Search op → cleaning_sd → merged_sd → renamed col_meta → string displayer_args.highlight_regex on every polars-String column (and not on integer columns).
- column_config_overrides with highlight_phrase + highlight_color survives the merge_column_config step intact.
Both inspect df_display_args['main']['df_viewer_config']['column_config'], the same structure that ships to the JS-side renderer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
init_sd users can already carry the same nested shape they'd put in
column_config_overrides — e.g. {'comments': {'displayer_args': {'max_length':
2000}, 'ag_grid_specs': {'wrapText': True, 'width': 400}}}. But where
column_config_overrides REPLACES the styled bag via shallow row.update(),
init_sd needs to AUGMENT it so the op-supplied highlights from Search (which
contribute flat top-level keys onto the same column_metadata) coexist with
the user's per-column displayer config.
Two shallow merges in DefaultMainStyling.style_column:
- column_metadata['displayer_args'] (dict) merges into the styled disp
bag — caller wins per-key. Keeps highlight_regex / highlight_phrase /
highlight_color (read separately from flat top-level keys) intact.
- column_metadata['ag_grid_specs'] merges into base_config['ag_grid_specs']
— caller wins per-key, on top of styling's computed minWidth.
This is what makes init_sd a viable augmentation channel for per-column
display config that needs to play nice with lowcode ops.
Tests:
- test_style_column_merges_nested_displayer_args_and_ag_grid_specs:
nested displayer_args.max_length and ag_grid_specs.{wrapText,width,
maxWidth} all flow through, minWidth still computed.
- test_init_sd_displayer_args_and_search_highlight_coexist_on_same_column:
the actual point — Search's highlight_regex AND user's max_length both
end up on the same column's displayer_args after merge_sds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
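The two shallow merges can be sketched as a single helper. This is an illustration only: `augment_config` is a hypothetical name, and the `minWidth` value is a placeholder for whatever the styling layer computes.

```python
def augment_config(base_config, column_metadata):
    """Merge caller-supplied nested dicts into the styled config, per key."""
    merged = dict(base_config)
    for key in ("displayer_args", "ag_grid_specs"):
        caller = column_metadata.get(key)
        if isinstance(caller, dict):
            # Caller wins per key; keys the caller omits keep the styled value.
            merged[key] = {**base_config.get(key, {}), **caller}
    return merged


# Styled bag after highlight pickup, with a placeholder computed minWidth.
base_config = {
    "col_name": "b",
    "displayer_args": {"displayer": "string", "highlight_regex": "rat"},
    "ag_grid_specs": {"minWidth": 150},
}
column_metadata = {
    "_type": "string",
    "highlight_regex": "rat",
    "displayer_args": {"max_length": 2000},
    "ag_grid_specs": {"wrapText": True, "width": 400},
}

result = augment_config(base_config, column_metadata)
```

Both `highlight_regex` and `max_length` end up in the same `displayer_args`, and `minWidth` survives next to the caller's `wrapText` and `width`.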
style_column's type dispatch unconditionally attaches tooltip_config to string / time / categorical / period / interval / binary / fallback columns. init_sd's nested-merge channel can augment displayer_args and ag_grid_specs, but had no way to *remove* a styled default, so a user who wanted a string column without the permanent tooltip had no path short of subclassing the styling analysis.
delete_keys is a top-level list on column_metadata; after all merging, each listed key is popped from base_config. Operates on top-level keys only (tooltip_config, ag_grid_specs, displayer_args, col_name); nested removal isn't needed for the motivating case and would complicate the merge semantics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
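A minimal sketch of the delete_keys step, assuming it runs after all merging. `apply_delete_keys` is a hypothetical helper name and the `tooltip_config` contents are a placeholder.

```python
def apply_delete_keys(base_config, column_metadata):
    """Pop each key listed in delete_keys from the top level of the config."""
    result = dict(base_config)
    for key in column_metadata.get("delete_keys", []):
        # Missing keys are ignored; nested keys are never touched.
        result.pop(key, None)
    return result


base_config = {
    "col_name": "b",
    "displayer_args": {"displayer": "string", "max_length": 2000},
    "tooltip_config": {"placeholder": True},
}

without_tooltip = apply_delete_keys(base_config, {"delete_keys": ["tooltip_config"]})

# A nested key name has no effect because only top-level keys are considered.
nested_attempt = apply_delete_keys(base_config, {"delete_keys": ["max_length"]})
```

Keeping removal top-level only means the merge rules stay simple: merges add or override, and delete_keys is a single final pass.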
Working example of init_sd carrying displayer_args + ag_grid_specs + delete_keys (drops the auto-attached tooltip_config) on the comments column. Pairs with rowHeight: 105 and wrapText so long complaint text wraps inside a tall cell.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add df.filter(pl.col('viol_status') == 'Fail') to demonstrate the
typical "filter + view" flow with the live-search widget. Tighten
rowHeight to 70px (was 105) — five lines was overkill for the median
comment length. Note in a comment that the Polars Enum is case-sensitive
('Fail' not 'fail') — previously committed version used 'fail' which
silently filters to 0 rows.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b7cb29d to 7e84cbf
Resume note (2026-05-17): what's left to salvage from this branch after #758 lands. Main + #758 already cover the first 4 commits here (an older parallel implementation of the SDResult/highlight plumbing). Net-new content worth keeping:
Plan for resumption: after #758 merges, rebase this branch and drop the four superseded commits. Only the styling.py augmentation features, their tests, and the demo notebook should remain.
Summary
Turns `init_sd` into a proper augmentation channel for per-column display config: one that plays nice with lowcode ops (Search → highlight) instead of clobbering them like `column_config_overrides` does. Two `style_column` changes + a delete-keys escape hatch + a working demo notebook.
Depends on
#744 (jlisp 2-tuple transform contract) and #745 (polars Search → highlight_regex via sd). The test `test_init_sd_displayer_args_and_search_highlight_coexist_on_same_column` references Search's contribution shape; the notebook demo relies on the styling layer reading `highlight_*` from merged_sd.
After #744 and #745 merge, this PR will rebase and the duplicate commits drop via patch-id.
Pieces
- `feat(styling): merge column_metadata displayer_args + ag_grid_specs`: two shallow merges in `DefaultMainStyling.style_column`. When `column_metadata` carries `displayer_args` (e.g. `{'max_length': 5000}`) or `ag_grid_specs` (e.g. `{'wrapText': True, 'width': 400}`), they merge into the styled bag instead of replacing. Caller wins per-key. Critically, this lets a Search op's `highlight_regex` and a user's `max_length` coexist on the same column; `column_config_overrides`'s replace semantics would have clobbered the highlight.
- `feat(styling): init_sd delete_keys`: `column_metadata.delete_keys: list[str]` pops the named top-level keys from the final `base_config`. Motivating case: drop the auto-attached `tooltip_config` on a string column the user doesn't want a permanent tooltip on, without subclassing the styling analysis. Operates on top-level only (tooltip_config, ag_grid_specs, displayer_args, col_name); nested removal isn't needed for the motivating case.
- Restaurant-Complaints demo notebook: working example combining `init_sd` (nested displayer_args + ag_grid_specs + delete_keys) with `extra_grid_config` (rowHeight, pinnedRowHeight) and a `viol_status == 'Fail'` filter on the polars Enum.
Usage
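As a hedged sketch of the per-column shape and of why replace semantics lose the highlight, the snippet below uses plain dicts only. The `styled_row` contents are placeholders, and how `init_sd` is handed to the widget is deliberately left out because that call is not shown in this PR text.

```python
import copy

# Per-column shape, built from the keys named in this PR.
init_sd = {
    "comments": {
        "displayer_args": {"max_length": 5000},
        "ag_grid_specs": {"wrapText": True, "width": 400},
        "delete_keys": ["tooltip_config"],
    }
}

# Placeholder for the row styling has already computed for `comments`,
# including a highlight contributed by a Search op.
styled_row = {
    "col_name": "comments",
    "displayer_args": {"displayer": "string", "highlight_regex": "rat"},
    "ag_grid_specs": {"minWidth": 150},
}

nested = {k: v for k, v in init_sd["comments"].items() if k != "delete_keys"}

# Replace semantics: a shallow update swaps out each nested dict wholesale.
replaced = copy.deepcopy(styled_row)
replaced.update(nested)

# Augment semantics: each nested dict is merged key by key, caller wins.
augmented = copy.deepcopy(styled_row)
for key, caller_value in nested.items():
    augmented[key] = {**augmented.get(key, {}), **caller_value}
```

In `replaced`, `highlight_regex` and `minWidth` are gone; in `augmented`, they sit alongside the caller's values.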
This shape is intentionally the same as `column_config_overrides` would take, but `init_sd` augments via `merge_sds` (each key wins per-call), while `column_config_overrides` replaces via shallow `row.update()`.
Test plan
- `pytest tests/unit/dataflow/`: 118 pass
- `pnpm test` + `pnpm run build:tsc`: 212 pass, TS clean
- `comments` column, wraps long text, `max_length: 5000` honored