feat: paddy-format - lisp-style closing-bracket formatter #684
paddy-format will rewrite Python source so closing brackets ) ] } stack on the previous line instead of dangling on their own (the Black/ruff convention).
This commit is the failing-test half of TDD: golden input/output cases for Call/List/Dict/Tuple/Set/FunctionDef/Import, plus idempotence and graceful-syntax-error tests. The implementation is a stub passthrough, so every interesting case fails; the fix lands in the next commit.
uv.lock churn includes an incidental schema bump (revision 2 → 3, upload_time → upload-time) from current uv; libcst + transitive pyyaml-ft are the only real additions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-push ruff-format hook required this. Mostly blank lines after docstrings. Once paddy-format is the canonical formatter for these files we'll undo this. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📦 TestPyPI package published
pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25181081785
or with uv:
uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.13.5.dev25181081785
MCP server for Claude Code
claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.13.5.dev25181081785" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table
Walks the CST and rewrites the whitespace immediately before each closing bracket so that it stacks on the previous line. Two patterns:
Type A (Call / FunctionDef params / ImportFrom): the dangling-close whitespace lives in `last_item.comma.whitespace_after` (or in `whitespace_after_arg` when there is no trailing comma). Drop the trailing comma and clear the post-arg whitespace.
Type B (List / Set / Dict / Tuple-with-parens): the whitespace lives on the close-bracket node itself (rbracket / rbrace / rpar[0].whitespace_before). Drop the trailing comma and clear that whitespace.
Skipped when a comment lives in the affected whitespace: never absorb a comment by stacking the close. Returns input unchanged on syntax errors. Idempotent: a second pass is a no-op because the whitespace is no longer a ParenthesizedWhitespace.
Also adds a CLI: `paddy-format file.py` rewrites in place, `--check` exits 1 if any file would change.
Smoke-tested against all 92 .py files in the buckaroo package: every file parses after the rewrite, every file is idempotent, 52 would be lispified.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7370898c97
    if not params.params:
        return updated
Handle function defs without regular `params.params`
Avoid returning early when params.params is empty, because multiline function signatures can still carry closing-paren whitespace in other parameter groups (for example keyword-only-only signatures like def f(*, a, b,\n):). In those cases this branch skips formatting entirely, so the tool does not apply its advertised closing-bracket rewrite to valid function definitions.
New rule (and inverse of Black): a trailing comma in a multiline bracket group signals "this fits on a line" — collapse it. Updates only the call_with_trailing_comma case for now; this commit is purposefully red so the fix lands as a separate commit per TDD. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-rule transformer per bracket group (Call args, Function params,
List/Set/Dict/Tuple literals, parenthesized ImportFrom):
1. Trailing comma → collapse the whole group to one line, drop
the trailing comma. Trailing comma is the "this fits" signal —
the inverse of Black's magic-trailing-comma convention.
2. No trailing comma + multiline → stack the close on the previous
line (the existing behavior).
Comments still block both transforms — never absorb a comment by
moving whitespace. Single-element tuples (`(x,)`) keep their trailing
comma since it's semantic. Idempotent.
Updates the other golden test cases to expect the collapsed form,
matching the call_with_trailing_comma case set up in the previous
commit.
Smoke-tested across all 92 .py files in the buckaroo package: zero
parse failures, zero idempotency failures, 54 files would change.
Note that without a line-length budget, some long signatures
collapse to very long single lines — a length cap is a reasonable
follow-up if real use surfaces problems.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
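The two rules can be pictured with a small string-level sketch. It is not the PR's libcst transformer: `rewrite_group` and its parameters are hypothetical names, the group is assumed to be multiline with one item per line, and the single-element tuple exception is left out.

```python
from typing import Optional, Sequence


def rewrite_group(
    open_text: str,
    items: Sequence[str],
    close_text: str,
    *,
    trailing_comma: bool,
    has_comment: bool = False,
    indent: str = "    ",
) -> Optional[str]:
    """Apply the two rules to one multiline bracket group (hypothetical helper).

    open_text is everything up to and including the open bracket, e.g. "foo(".
    Returns None when the group must be left untouched.
    """
    if has_comment:
        # Comments block both transforms: never absorb a comment.
        return None
    if trailing_comma:
        # Rule 1: collapse the whole group to one line, drop the trailing comma.
        return open_text + ", ".join(items) + close_text
    # Rule 2: keep the line breaks, stack the close on the last item's line.
    body = (",\n" + indent).join(items)
    return f"{open_text}\n{indent}{body}{close_text}"


print(rewrite_group("foo(", ["a", "b"], ")", trailing_comma=True))
print(rewrite_group("foo(", ["a", "b"], ")", trailing_comma=False))
```

The first call prints the collapsed form and the second prints the stacked form, which is the same split the golden fixtures exercise.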
func(\n) should become func(). Failing test, fix lands next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
func(\n) → func(), [\n] → [], {\n} → {}, def f(\n): → def f():.
The trailing-comma rule doesn't apply (there's no comma), so this
is a separate path: empty body + multiline interior whitespace =>
flatten to single line.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
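For illustration only, the empty-group flatten can be approximated with a regular expression. The PR does this on the CST; the regex below is a swapped-in stand-in that would also rewrite matching text inside string literals, and `flatten_empty_groups` is a hypothetical name.

```python
import re

# Open bracket, optional spaces, a newline, any further whitespace, close bracket.
# A comment between the brackets prevents a match, so it is never absorbed.
_EMPTY_MULTILINE = re.compile(r"([(\[{])[ \t]*\n\s*([)\]}])")


def flatten_empty_groups(source: str) -> str:
    """Flatten bracket pairs whose interior is only multiline whitespace."""
    return _EMPTY_MULTILINE.sub(r"\1\2", source)


for sample in ("func(\n)", "x = [\n]", "y = {\n}", "def f(\n):"):
    print(repr(sample), "->", repr(flatten_empty_groups(sample)))
```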
Four new golden cases:
- long_call_greedy_wrap: 203-char call wraps greedily; continuation lines align one column past the open paren.
- multiline_collapse_target_too_long_wraps_instead: trailing-comma multiline whose collapsed form exceeds 120; it wraps greedily and the trailing comma is dropped (the collapse rule loses to the budget).
- long_list_greedy_wrap: 162-char list wraps greedily.
- unsplittable_single_arg_overflows: a single arg longer than 120 has nothing to break on; line stays as-is.
Three of the four are red against the current implementation; the fix lands next commit. The fourth passes incidentally because the current code returns the input unchanged for that shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a second pass that runs after the existing collapse / stack /
empty-flatten transforms. The wrap pass:
1. Re-parse the source with libcst + PositionProvider metadata.
2. Find every wrappable bracket group (Call, List, Set, Dict with
>= 2 items) whose containing line exceeds the budget.
3. Pick the outermost (leftmost start column, earliest line).
4. Greedy-pack: lay items left-to-right at column (open_bracket+1);
break the line whenever adding the next item with its trailing
comma would push past 120.
5. Continuation lines are aligned with the column right after the
open bracket — the lispy style discussed.
6. Repeat until no more over-budget lines are wrappable.
Trailing-comma collapse from pass 1 still wins when the collapsed
form fits in 120 chars; when it doesn't, pass 2 breaks the line and
the trailing comma stays dropped.
Falls back gracefully on unsplittable cases (single arg longer than
the budget — the line stays over-budget).
This commit handles Call/List/Set/Dict. FunctionDef params, Tuple,
and parenthesized ImportFrom remain on the table as follow-ups —
they're not covered by the test fixtures yet.
All 19 tests pass. Smoke-tested against the buckaroo package: 92
files parse, 92 idempotent.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
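The greedy-pack step (step 4, with the continuation column from step 5) might look roughly like the sketch below. It works on plain strings rather than CST nodes, and the name `greedy_pack` plus the `head` / `close` parameters are assumptions made for the example, not the PR's signature.

```python
from typing import List, Sequence


def greedy_pack(head: str, items: Sequence[str], close: str, budget: int = 120) -> List[str]:
    """Pack items left to right, breaking before an item that would pass the budget.

    head is the text up to and including the open bracket, e.g. "result = foo(".
    Continuation lines start at the column right after the open bracket.
    """
    if not items:
        return [head + close]
    col = len(head)  # column right after the open bracket
    lines: List[str] = []
    current = head
    row_has_item = False
    for idx, item in enumerate(items):
        # Every item carries its trailing comma; the last one carries the close.
        piece = item + (close if idx == len(items) - 1 else ",")
        sep = " " if row_has_item else ""
        if row_has_item and len(current) + len(sep) + len(piece) > budget:
            lines.append(current)
            current = " " * col
            sep = ""
        current += sep + piece
        row_has_item = True
    lines.append(current)
    return lines


print("\n".join(greedy_pack("f(", ["aaaa", "bbbb", "cccc"], ")", budget=13)))
```

A row is never broken before its first item, so a single item longer than the budget simply overflows, matching the unsplittable fallback described above.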
A multiline Call whose continuation line sits at column 0 (legal inside parens, visually broken) should be re-indented to original_indent + 4 spaces. Trailing whitespace after the comma is cleaned up too. This is a new transformation, distinct from wrap-at-120 (which aligns continuation with the open paren). Filed as a failing test; implementation deliberately not in this commit so we can talk about the apparent conflict with the existing wrap style first — see PR discussion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Unifies wrap and re-indent under one rule per discussion: continuation lines always sit at line_indent + 4, never aligned with the open bracket. Updates the existing wrap test fixtures to expect the new indent (col 4 instead of cols 28 / 11). Implementation follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related changes that share a single rule: continuation lines of
any multi-line bracket group sit at line_indent + 4.
1. New _reindent_pass: walks every multi-line Call/List/Set/Dict and
normalizes its continuation whitespace to line_indent + 4.
ParenthesizedWhitespace.indent is forced to False so the indent
count is exactly the last_line value (libcst would otherwise add
the parent statement's indent on top, breaking idempotency on the
second run).
2. Wrap pass updated: greedy_pack now takes separate first_col and
continuation_col. first_col stays "right after the open bracket"
(where line 1 places its first item); continuation_col is
line_indent + 4 (same rule as re-indent). Continuation lines no
longer align with the open bracket.
Idempotency assertion added to every fixture in
test_paddy_format_golden — every case is now automatically checked
to be a no-op on a second pass.
20 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
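As a rough picture of the line_indent + 4 rule, here is a line-based sketch. The PR's `_reindent_pass` works on libcst whitespace nodes; `reindent_continuations` below is a hypothetical stand-in that assumes the close bracket is already stacked on the last item's line and that no continuation line belongs to a multi-line string.

```python
from typing import List, Sequence


def reindent_continuations(lines: Sequence[str]) -> List[str]:
    """Put every continuation line of one bracket group at line_indent + 4."""
    first = lines[0].rstrip()  # also drops trailing whitespace after the comma
    line_indent = len(first) - len(first.lstrip(" "))
    target = " " * (line_indent + 4)
    return [first] + [target + line.strip() for line in lines[1:]]


before = ["    result = foo(a, ", "b,", "            c)"]
after = reindent_continuations(before)
print("\n".join(after))
```

Because the target column depends only on the first line's indent, running the function on its own output changes nothing, which is the property the per-fixture idempotency assertion checks.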
A new directive that, when placed on the line above an assignment, formats the value as a column table: one element per row, decimal points / least-significant digits aligned vertically.
Three failing fixtures:
- single-col floats: decimal points line up
- mixed ints + floats: ints align with the column where the decimal point would have been (least-significant digit)
- multi-col tuples: each tuple position is an independent column, right-aligned
Design choices baked in (push back if wrong):
- Directive syntax: comment `# table-format` on the line above.
- Output: trailing comma kept, close bracket on its own line at the original indent (Black-style data block); this overrides the usual trailing-comma-collapse rule when the directive is present.
- Per-column right-alignment of the integer part. Inter-column separator stays a plain ", "; tests use examples where col widths happen to align. The right-pad-vs-no-trailing-space tradeoff for variable col widths is deliberately not in any fixture yet.
Implementation deferred to the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ings Same fixtures, easier to read — no \n escapes. Matches the style of the other parametrized cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous hand-typed fixture had a math error (29 input items vs 25
in expected) and inconsistent decimal strides. Regenerated
programmatically with path-1 rules:
* Each cell is a uniform max_int_width + 1 + max_frac_width chars
wide (here, 3 + 1 + 2 = 6).
* Cells are left-padded for the integer part and right-padded for
the fraction part. Cells like " 45.6 " have a trailing space
inside before the comma — the cost of strict cross-row alignment.
* Continuation indent = position right after the open bracket
(col 8 for "data = [") so the first decimal of each row sits
at the same column.
* Decimal stride across cells is exactly 8 chars (cell + ", ").
Input: 24 items (six full cycles of [1.23, 45.6, 7.89, 100.5]).
Output: 14 items on row 1 (length 119), 10 items on row 2
(length 87, ends with `]`).
Implementation pending — multi_col_tuples fixture left as-is for now;
will revisit once the rule for "directive on a short list" is settled.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
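The path-1 arithmetic can be checked with a short sketch. It operates on the literals' source text, not on CST nodes, and `format_cells` / `table_rows` are hypothetical names rather than the PR's helpers.

```python
from typing import List, Sequence


def format_cells(literals: Sequence[str]) -> List[str]:
    """Left-pad integer parts and right-pad fraction parts to a uniform width."""
    parts = [lit.partition(".") for lit in literals]
    max_int = max(len(int_part) for int_part, _, _ in parts)
    max_frac = max(len(frac) for _, _, frac in parts)
    # max_int_width + 1 + max_frac_width, or just max_int_width for ints only.
    width = max_int + (1 + max_frac if max_frac else 0)
    return [(i.rjust(max_int) + dot + f).ljust(width) for i, dot, f in parts]


def table_rows(prefix: str, literals: Sequence[str], budget: int = 120) -> List[str]:
    """Greedy-pack uniform cells; continuation rows start right after the open bracket."""
    cells = format_cells(literals)
    indent = " " * len(prefix)
    rows: List[str] = []
    line, row_has_cell = prefix, False
    for idx, cell in enumerate(cells):
        closer = "]" if idx == len(cells) - 1 else ","
        sep = " " if row_has_cell else ""
        if row_has_cell and len(line) + len(sep) + len(cell) + len(closer) > budget:
            rows.append(line)
            line, sep = indent, ""
        line += sep + cell + closer
        row_has_cell = True
    rows.append(line)
    return rows


rows = table_rows("data = [", ["1.23", "45.6", "7.89", "100.5"] * 6)
print([len(row) for row in rows])
```

With the 24-item input from the fixture, this yields two rows of lengths 119 and 87, and the first decimal point of each row lands in the same column.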
Per option (1) — directive is a wrap-time hint, not a force-expand. Short list with `# table-format` stays on a single line; only the long-input variants get the table layout. Changes: - table_format_multi_col_tuples: now a no-op (input == expected). - table_format_multi_col_tuples_wrap: new fixture, 12 tuples, single-line form exceeds 120 chars, expected output is one tuple per line with cells aligned across rows. Continuation indent is the standard line_indent + 4 (every row has the same shape, so cross-row alignment is automatic). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a `# table-format` (or `#table-format`) comment sits on the line immediately above an assignment whose value is a List, and that list's single-line form would exceed the 120-char budget, the list is laid out as a column table instead of greedy-wrapped. Two shapes: Single-column list of numbers — uniform cells of width max_int_width + 1 + max_frac_width. Integer parts are left-padded and fraction parts are right-padded, so decimals (or the position the decimal would occupy for an int) line up at a fixed offset within every cell. Cells are packed greedily into rows; the continuation column for row 2+ is the column right after `[`, so decimals also line up across rows at exactly an 8-char stride. Trailing spaces inside cells before commas are accepted as the cost of strict alignment. Multi-column list of tuples — each tuple gets its own row at the standard line_indent + 4 continuation. Each tuple position is an independent column with its own padding; cells inside a row line up across rows. Output is data-block style: trailing comma after every tuple, close bracket on its own line at the original statement indent. Detection: `# table-format` may live in the module header (for the first statement) or in a statement's leading_lines (for subsequent statements, including those nested in IndentedBlocks). Comment text is matched permissively (with or without space after `#`). Implementation runs as the final pass in paddy_format(), so it overrides any prior transforms — Pass 1's collapse rule, Pass 2's re-indent, and Pass 3's wrap will all touch a directive-marked list on a re-parse, but Pass 4 always reasserts the table layout, so the final output is idempotent. Path-1 alignment chosen per the design discussion in the PR. 25/25 unit tests pass including the per-fixture idempotency check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
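The permissive directive match can be sketched on raw source lines. The PR reads the comment from libcst's module header and `leading_lines`; the line-scanning version below is a simplified stand-in, and `directive_targets` is a hypothetical name.

```python
import re
from typing import List

# Accepts "# table-format" and "#table-format", nothing else on the line.
_DIRECTIVE = re.compile(r"# ?table-format")


def directive_targets(source: str) -> List[int]:
    """Return 0-based indices of lines that directly follow a directive comment."""
    lines = source.splitlines()
    return [
        idx
        for idx in range(1, len(lines))
        if _DIRECTIVE.fullmatch(lines[idx - 1].strip())
    ]


sample = "# table-format\ndata = [1]\n#table-format\nx = [2]\ny = [3]\n"
print(directive_targets(sample))
```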
Two minimal fixtures distilled from buckaroo files that the smoke test flagged as non-idempotent: - idempotent_outer_call_continuation_shifts_inner_dict (from pluggable_analysis_framework/safe_summary_df.py): an outer Call with a continuation row, containing a Dict whose inner key sits at a column that's tied to the OLD outer continuation. After the outer continuation gets re-indented to line_indent+4, the inner key still references the old column, and a second pass is needed to re-resolve. - idempotent_nested_list_inside_dict_value (from ddd_library.py): same root cause — Pass 1 collapses an outer trailing-comma block, shifting the line that holds the start of an inner List. The inner List's continuation lines are computed off the inner List's start line, which moves between passes. Both expected outputs are the steady-state (run-2) form. With the current implementation, paddy_format(input) != steady_state, so the golden assertion fails AND the in-fixture idempotency check would fail. Fix follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two interacting passes were each idempotent on their own but together
required a second run to settle:
1. _reindent_pass uses the *current* line indent of each multi-line
bracket group's start line. When an outer bracket is re-indented,
an inner bracket's start line shifts, but the inner indent was
already computed off the old line.
2. _wrap_pass on a long single-line collapsed list (Pass 1's output)
turns it multi-line, which moves the start line of any nested
multi-line value. The next re-indent run then sees a different
line_indent and produces a different continuation column.
Two changes:
* _reindent_pass now wraps a single sweep (_reindent_pass_once) in
a fixed-point loop. Handles purely nested re-indent cascades.
* paddy_format() now loops re-indent + wrap + table-format until
the source stops changing. Handles the cross-pass interaction
where wrap reveals a new line_indent for an inner group.
Both red repros (idempotent_outer_call_continuation_shifts_inner_dict
and idempotent_nested_list_inside_dict_value) now pass. Smoke test
across all 92 buckaroo .py files: 60 would change, 0 non-idempotent,
0 parse failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
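The fixed-point loop is a generic pattern, sketched here with a hypothetical `run_until_stable` helper. The `max_rounds` cap is an assumption added so a non-converging pair of passes fails loudly instead of spinning; the source does not say whether the PR has such a cap.

```python
from typing import Callable, Sequence


def run_until_stable(
    source: str,
    passes: Sequence[Callable[[str], str]],
    max_rounds: int = 20,
) -> str:
    """Apply the passes in order, repeatedly, until the source stops changing."""
    for _ in range(max_rounds):
        updated = source
        for apply_pass in passes:
            updated = apply_pass(updated)
        if updated == source:
            return source
        source = updated
    raise RuntimeError("passes did not converge")


# Toy pass: each sweep only halves runs of "a", so several sweeps are needed.
print(run_until_stable("aaaa", [lambda text: text.replace("aa", "a")]))
```

Once the loop returns, one more run over the result is a no-op by construction, which is why wrapping the passes this way removes the "needs a second run" failures.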
Two additions to lock down current table-format behavior on number shapes the existing fixtures didn't cover:
- table_format_ints_only_wrap: max_int_width = 5, max_frac = 0, cell width 5. Cells right-aligned to width 5; least-significant digits line up across rows at a 7-char stride.
- table_format_mixed_ints_floats_wrap: max_int_width = 2, max_frac_width = 3, cell width 6. Ints in a float column are padded out with trailing spaces to fill the cell (so "1" renders as " 1    ": one leading space, the value, then four trailing spaces for the missing ".XXX"). Decimal column lines up at offset 2 within every cell.
Both pass against the current implementation; they're regression locks, not red repros. Idempotency assertion in test_paddy_format_golden covers them automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a failing fixture for `# table-format` applied to a list of dicts that share keys and have numeric values. Per the design discussion: each dict < 100 chars, no nested dicts. Each key becomes a column; values per column are decimal-aligned with uniform cell widths so the keys themselves line up across rows. Currently `_TableFormatter.leave_List` only routes to the multi-col path when every element is a `cst.Tuple`; dict elements fall through to the single-col / no-op branches. The implementation to make this pass follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the directive's target is a list of Dicts that share the same keys in the same order, lay it out one dict per line with values in each column decimal-aligned (uniform cells, leading int-pad + trailing frac-pad). Each key — `'a':`, `'b':` — therefore lines up across rows because the preceding cells are all the same width. Constraint per the design discussion: each dict has to be a flat mapping of atom keys (strings or names) to atom values (numbers or tuples of numbers). Mismatched keys, nested dicts, or non-atom values trip the function and it returns None — falls through to the no-op branch in `_TableFormatter.leave_List`. `_atom_text` extended to render Dict / SimpleString / Name so the budget check (`_list_compact_length`) knows what a flat dict costs. The leading int-pad of each value lives in `whitespace_after_colon` (default one space + the leading-pad width); the trailing frac-pad lives in the comma's `whitespace_before` for non-last columns and in the dict's `rbrace.whitespace_before` for the last column. Outer list uses the same data-block style as the multi-col tuples path: trailing comma after every dict, close bracket on its own line. 30/30 unit tests pass with idempotency. Smoke-tested across all 92 buckaroo .py files: 0 non-idempotent, 0 parse failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…y collapse
Four parametrized cases that all fail today:
- comment_in_args_blocks_collapse_close_stays_at_col0: comment-protected collapse leaves the source as-is, but the reindent pass still drifts the close bracket from col 0 to col 4.
- kwonly_only / kwonly_after_regular / posonly: _collapse_funcdef only walks Parameters.params, ignoring posonly_params, kwonly_params, and star_arg, so signatures using `*` or `/` get partial / broken collapses.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- test_table_format_directive_outside_top_level_is_ignored: pin that `# table-format` only fires on top-level / IndentedBlock Assign+List; a directive in front of a list nested inside a Call argument is a no-op by design. - test_cli_*: --check returns 1 when changes needed, 0 otherwise; the unflagged form rewrites in place and handles a multi-file run. - test_paddy_format_smoke_on_buckaroo: parametrized round-trip over every `buckaroo/**/*.py` file, asserting the output still parses and is idempotent. Marked `slow` (~30s for ~90 files); CI's `-m "not slow"` skips it. Run locally with `pytest -m slow`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
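The CLI contract those tests pin down could be sketched as follows. The formatter is passed in as `fmt` so the example stays self-contained; the function shape, and returning 0 after an in-place rewrite, are assumptions rather than the PR's actual entry point.

```python
import argparse
from pathlib import Path
from typing import Callable, Sequence


def main(argv: Sequence[str], fmt: Callable[[str], str]) -> int:
    """Rewrite files in place, or with --check only report whether any would change."""
    parser = argparse.ArgumentParser(prog="paddy-format")
    parser.add_argument("files", nargs="+", type=Path)
    parser.add_argument("--check", action="store_true")
    args = parser.parse_args(argv)

    would_change = False
    for path in args.files:
        original = path.read_text()
        formatted = fmt(original)
        if formatted == original:
            continue
        would_change = True
        if not args.check:
            path.write_text(formatted)
    # --check exits 1 if any file would change; otherwise exit 0 (assumed).
    return 1 if args.check and would_change else 0
```

A console-script wrapper would call something like `sys.exit(main(sys.argv[1:], paddy_format))`, with `paddy_format` being the formatter function named in the commits above.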
Five collapse paths (Call, FunctionDef, Collection, Tuple, ImportFrom) shared the same shape: validate trailing comma + cleanliness of every whitespace slot, then rebuild items with `, ` separators and an empty post-item whitespace. Pull the shape into two module-level helpers and let each method shrink to ~10 lines. Tuple's single-element trailing- comma preservation is now a flag (`preserve_singleton_comma=True`). No behavior change — all existing tests still pass, the four RED parametrize cases for the comment+reindent and kwonly/posonly bugs still fail the same way. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lapse Two fixes for the RED tests in 06557d0: 1. _Reindenter no longer reindents the LAST item's comma.whitespace_after on a Call / List / Set / Dict. That whitespace sits before the close bracket, not before another item — when a comment had blocked the collapse pass and left the close on its own line at the user's chosen column, the reindent was relocating it to indent+4. Skip the last item's comma; whitespace_before for the close stays put. 2. _collapse_funcdef now walks every parameter slot — posonly_params, posonly_ind (`/`), params, star_arg (`*` or `*args`), kwonly_params, star_kwarg (`**kwargs`) — instead of just `params.params`. New `_iter_param_slots` helper yields (kind, item) in source order; the collapse rebuilds Parameters with `, ` separators on every non-last slot and DEFAULT on the last. Signatures using `*` or `/` now collapse fully instead of producing partial multi-line layouts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
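The slot order that `_iter_param_slots` is described as yielding can be illustrated with the standard-library `ast` module. This swaps in `ast` for libcst, so it only shows the ordering of parameter kinds and cannot rebuild whitespace; the kind labels reuse the libcst field names from the commit message, and `iter_param_slots` here is a hypothetical re-creation.

```python
import ast
from typing import Iterator, Tuple


def iter_param_slots(func_source: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, text) for every parameter slot of a function, in source order."""
    func = ast.parse(func_source).body[0]
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise ValueError("expected a function definition")
    args = func.args
    for arg in args.posonlyargs:
        yield "posonly_params", arg.arg
    if args.posonlyargs:
        yield "posonly_ind", "/"
    for arg in args.args:
        yield "params", arg.arg
    if args.vararg is not None:
        yield "star_arg", "*" + args.vararg.arg
    elif args.kwonlyargs:
        # A bare "*" only appears when keyword-only params follow it.
        yield "star_arg", "*"
    for arg in args.kwonlyargs:
        yield "kwonly_params", arg.arg
    if args.kwarg is not None:
        yield "star_kwarg", "**" + args.kwarg.arg


print(list(iter_param_slots("def f(*, a, b):\n    pass\n")))
```

The keyword-only-only signature from the review comment produces three slots even though the regular `params` group is empty, which is why an early return on empty `params.params` skipped it.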
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds `scripts/paddy_format.py`, a libcst-based formatter that rewrites Python source toward compact lisp-style brackets:
- … `) ] }` moves up to the previous line.
- … `func(\n)` becomes `func()`, same for `[\n]` / `{\n}`.
Comments anywhere in the affected whitespace block all four transforms: never absorb a comment. Idempotent. Returns input unchanged on syntax errors.
Bracket types covered: `Call`, `List`, `Set`, `Dict`, `Tuple`, `FunctionDef.params`, parenthesized `ImportFrom` (collapse + stack); `Call`, `List`, `Set`, `Dict` (wrap). Extending wrap to `FunctionDef` / `Tuple` / `ImportFrom` is a follow-up.
Scope (intentionally narrow)
- … `--check`. Not run across the codebase.
- … `buckaroo/` and `tests/` and add `paddy-format --check` to `LintPython`.
- … `libcst` as a dev dep.
CLI
Testing
- … `buckaroo/`: zero parse failures, zero idempotency failures.
Notes
`uv.lock` diff is large because current `uv` bumps the schema (revision 2 → 3, `upload_time` → `upload-time`). Real additions are `libcst` + transitive `pyyaml-ft`.
🤖 Generated with Claude Code