fix(#969): canonize after meet compression so --compress still dedups#970
Conversation
--canonize renames each step's λ-bindings to F1, F2, … numbering by traversal position and restarting per expression, so the same logical lambda got a different Fn in different steps. The meet pass compares sub-expressions by structural equality, so those renamed lambdas never matched and identical sub-expressions were never factored into \phinoMeet/\phinoAgain — --canonize silently defeated --compress. Canonization now runs after the meet compression instead of before it: the meet pass sees the original (stable) lambda names, and only its output is canonized. The _canonize flag is threaded through PrintContext/LatexContext and applied at render time (inside rewrittensToLatex after the meet for the LaTeX sequence path, and in printRewrittens for the other formats) rather than pre-applied to the chain in the runners. canonizeExpression also gains ExPhiMeet/ExPhiAgain cases so lambdas inside a factored-out \phinoMeet body still get renamed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01JJxZDChgfupZEqUMbtSzHm
Benchmark comparisonRegression threshold: +20%. Same-runner A/B against the PR merge-base; the bench fixture is frozen across both runs.
OK: no case regressed above the +20% threshold. |
|
@rultor merge |
@yegor256 Oops, I failed. You can see the full log here (spent 19min). |
There was a problem hiding this comment.
Pull request overview
Reorders --canonize relative to LaTeX meet compression so --compress can still deduplicate structurally identical sub-expressions across sequence steps (producing \phinoMeet/\phinoAgain) even when canonization is enabled.
Changes:
- Threaded a
_canonizeflag throughPrintContext→LatexContextand moved canonization to render-time, instead of canonizing the rewritten chain up-front in CLI runners. - In the LaTeX sequence pipeline, applied canonization only after meet compression (
meetInExpressions) for both whole-expression and focused-expression sequences. - Extended the canonizer to traverse
ExPhiMeet/ExPhiAgainand added a regression CLI test asserting\phinoMeetis produced fordataize --output=latex --sequence --compress --canonize.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| test/CLISpec.hs | Adds regression coverage for --compress --canonize --sequence producing a \phinoMeet. |
| src/LaTeX.hs | Implements “compress first, canonize after” for LaTeX sequence rendering and threads _canonize via LatexContext. |
| src/CLI/Types.hs | Adds _canonize to PrintContext so printing decisions can be made at render time. |
| src/CLI/Runners.hs | Stops canonizing rewritten chains up-front; passes _canonize through PrintContext. |
| src/CLI/Helpers.hs | Canonizes right before rendering for non-LaTeX-sequence formats; passes _canonize into LatexContext. |
| src/Canonizer.hs | Exports canonizeExpr, reuses it in canonize, and canonizes inside ExPhiMeet/ExPhiAgain wrappers. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@yegor256 Thanks for the contribution! You've earned +8 points for this: +16 as a basis; -8 for the lack of code review. Please, keep them coming. Your running score is +1707; don't forget to check your Zerocracy account too). |
Closes #969.
Problem
When
--compressand--canonizeare combined for LaTeX sequence output(e.g.
dataize --output=latex --sequence --compress --canonize),structurally identical sub-expressions across reduction steps were not
factored out into
\phinoMeet/\phinoAgain— they were printed in full overand over.
The cause was ordering:
canonizeran before the meet compression. Itrenames each step's λ-bindings to
F1,F2, … numbering by traversalposition and restarting the counter for every expression. Because the
counter restarts per step and numbers by position, the same logical lambda
got a different
Fnin different steps. The meet pass compares candidates bystructural equality, so
Function "F3"andFunction "F4"looked distinct andthe enclosing sub-expressions never matched —
--canonizesilently sabotagedthe deduplication
--compressis meant to perform.Fix
Canonization now runs after the meet compression, so the meet pass sees the
stable, original lambda names:
_canonizeflag is threaded throughPrintContext/LatexContextandapplied at render time rather than pre-applied to the chain in the runners
(
runRewrite/runDataizeno longer canonize up front).rewrittensToLatexfeeds un-canonizedexpressions to
meetInExpressions(viacompressedRewrittens/compressedExpressions) and canonizes only its output(
canonizedRewrittens/canonizedExpressions).printRewrittenscanonizes justbefore rendering, preserving the previous behavior.
canonizeExpressiongainsExPhiMeet/ExPhiAgaincases so lambdas inside afactored-out
\phinoMeet{…}body still get renamed.Tests
Added a
dataize --output=latex --compress --canonize --sequenceCLI test overa whole-expression sequence that asserts a
\phinoMeetis produced — thescenario that previously collapsed only when
--canonizewas dropped. Theexisting focused compress+canonize test continues to pass.
🤖 Generated with Claude Code
https://claude.ai/code/session_01JJxZDChgfupZEqUMbtSzHm
Generated by Claude Code