refactor(ecospold2): factor parseWithXeno into SAX combinators #97

Merged
ccomb merged 1 commit into advanced_haskell from worktree-ecospold2parser on May 28, 2026

ccomb commented May 28, 2026

Summary

parseWithXeno was a single ~465-line function. Its closeTag handler held two near-duplicate ~115-line exchange-finalization blocks plus a dozen field setters, all repeating the same three idioms: text harvest, path pop, and per-kind InIntermediateExchange/InElementaryExchange dispatch.

This extracts small top-level pure combinators so each block now carries only its domain decision logic:

  • idioms: accumText, popPath, popText, pathAt
  • dispatch: onExchange (over mapExchange); the exhaustive ElementContext matches live in currentIntermediate / currentElementary / inGeneralComment, so call sites stay wildcard-free
  • exchange close: mkUnit, resolveGroups, missingUnitWarning, parseUUIDOrNil, finishExchange, and the add* state pushes
  • plumbing: result handling becomes an Either-monad do-block (Data.Bifunctor.first); buildResult's tail becomes an fmap

Combinators are kept allocation-neutral (no lens on this hot strict fold). Net −29 lines, single file.
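
To make the dispatch bullet concrete, here is a minimal sketch of how an exhaustive match can be kept in one place while field setters go through a single combinator. The types, constructor payloads, and signatures below are simplified assumptions for illustration and are not taken from the actual parser; only the names onExchange, mapExchange, and the two exchange constructors come from the description above.

```haskell
-- Hypothetical, simplified types: the real ElementContext and exchange
-- records are richer than this.
data Exchange = Exchange
  { exName :: String
  , exUnit :: Maybe String
  } deriving (Eq, Show)

data ElementContext
  = InIntermediateExchange Exchange
  | InElementaryExchange Exchange
  | InGeneralComment
  | Other
  deriving (Eq, Show)

newtype ParseState = ParseState { psContext :: ElementContext }
  deriving (Eq, Show)

-- The only place that enumerates every constructor. With -Wall, adding a
-- constructor produces an incomplete-pattern warning here instead of being
-- silently absorbed by a wildcard at each call site.
mapExchange :: (Exchange -> Exchange) -> ElementContext -> ElementContext
mapExchange f ctx = case ctx of
  InIntermediateExchange e -> InIntermediateExchange (f e)
  InElementaryExchange e   -> InElementaryExchange (f e)
  InGeneralComment         -> InGeneralComment
  Other                    -> Other

-- Apply an update to whichever exchange is currently open, if any.
onExchange :: (Exchange -> Exchange) -> ParseState -> ParseState
onExchange f st = st { psContext = mapExchange f (psContext st) }

-- A field setter written against the combinator (hypothetical example).
setUnit :: String -> ParseState -> ParseState
setUnit u = onExchange (\e -> e { exUnit = Just u })
```

A setter such as setUnit never pattern-matches on the context itself, which is the sense in which call sites can stay wildcard-free.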

Test plan

  • cabal build clean — only the 2 pre-existing MCP warnings, none from this file
  • cabal test — 1107/1107 hspec green, 1 pending (unchanged from baseline)
  • EcoSpold2 group green: per-exchange comments, property-comment isolation, waste patterns A & B, native activityType, malformed-input Left

parseWithXeno was a single ~465-line function whose closeTag handler held
two near-duplicate ~115-line exchange-finalization blocks plus a dozen field
setters all repeating the same three idioms: text harvest
(T.concat . reverse . map bsToText), path pop (drop 1 on psPath), and per-kind
InIntermediateExchange/InElementaryExchange context dispatch.
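
The text-harvest and path-pop idioms named above could be factored roughly as follows. This is a sketch under assumptions: the ParseState record here has only two fields, psText is an invented field name, bsToText is assumed to be a plain UTF-8 decode, and the signature of popText is a guess. Only the expressions `T.concat . reverse . map bsToText` and `drop 1` on psPath come from the commit message.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical two-field state; the real parser state carries much more.
data ParseState = ParseState
  { psPath :: [BS.ByteString] -- open elements, innermost first
  , psText :: [BS.ByteString] -- text chunks, most recent first (assumed name)
  } deriving (Eq, Show)

-- Assumption: a strict UTF-8 decode; the real helper may be lenient.
bsToText :: BS.ByteString -> T.Text
bsToText = TE.decodeUtf8

-- Text harvest: chunks were consed on as they arrived, so reverse them
-- before concatenating.
accumText :: [BS.ByteString] -> T.Text
accumText = T.concat . reverse . map bsToText

-- Path pop: leave the innermost element; a no-op on an empty path.
popPath :: ParseState -> ParseState
popPath st = st { psPath = drop 1 (psPath st) }

-- Harvest the accumulated text and reset the buffer in one step.
popText :: ParseState -> (T.Text, ParseState)
popText st = (accumText (psText st), st { psText = [] })
```

Because these are ordinary top-level functions over plain records, they add no lens machinery to the fold.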

Extract small top-level pure combinators: accumText / popPath / popText /
pathAt for the idioms; onExchange (over mapExchange) for the per-kind dispatch,
with currentIntermediate / currentElementary / inGeneralComment holding the
exhaustive ElementContext matches so the call sites stay wildcard-free; and
mkUnit / resolveGroups / missingUnitWarning / parseUUIDOrNil / finishExchange
plus the add* pushes for the shared exchange-close scaffolding. The result
plumbing becomes an Either-monad do-block (Data.Bifunctor.first) and
buildResult's tail an fmap.
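
The plumbing change described above follows a common pattern: convert each step's error type with Data.Bifunctor.first, sequence the steps in the Either monad, and finish with an fmap. The sketch below illustrates that pattern only; ParseError, Activity, lookupField, readAmount, and buildActivity are hypothetical names and do not reflect the parser's real types.

```haskell
import Data.Bifunctor (first)
import Text.Read (readMaybe)

-- Hypothetical low-level error and result types.
data ParseError = ParseError Int String
  deriving (Eq, Show)

data Activity = Activity { actName :: String, actAmount :: Double }
  deriving (Eq, Show)

-- Hypothetical low-level steps that fail with ParseError.
lookupField :: String -> [(String, String)] -> Either ParseError String
lookupField k kvs =
  maybe (Left (ParseError 1 ("missing field " ++ k))) Right (lookup k kvs)

readAmount :: String -> Either ParseError Double
readAmount s =
  maybe (Left (ParseError 2 ("bad number " ++ s))) Right (readMaybe s)

renderError :: ParseError -> String
renderError (ParseError code msg) = "E" ++ show code ++ ": " ++ msg

-- Each step has its Left mapped to String with `first`; the do-block stops
-- at the first failure, and the final step is an fmap over the last result.
buildActivity :: [(String, String)] -> Either String Activity
buildActivity kvs = do
  name <- first renderError (lookupField "name" kvs)
  raw  <- first renderError (lookupField "amount" kvs)
  Activity name <$> first renderError (readAmount raw)
```

For example, `buildActivity [("name", "steel")]` short-circuits at the second step and yields a Left carrying the rendered message.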

Helpers are kept allocation-neutral (no lens on this hot strict fold). Each big
block now carries only its domain decision logic. Behaviour is unchanged:
1107/1107 hspec green, net -29 lines.

ccomb merged commit 8950615 into advanced_haskell May 28, 2026
ccomb deleted the worktree-ecospold2parser branch May 28, 2026 15:52
ccomb added a commit that referenced this pull request May 28, 2026
…off (#99)

## What

The cut-off post-processing — reducing a multi-output dataset to a
single reference product so the engine sees a single-output process —
was duplicated **verbatim** between the EcoSpold1 and EcoSpold2 parsers
(~70 lines, the same eight-function chain in each). This moves that
block into one shared `EcoSpold.Cutoff` module that both parsers import.

This is the last remaining piece of the EcoSpold refactor series (after
the Parser2 split #97 and the Parser1 fold dedup #94), which left the
cut-off block still copy-pasted in both files.

## Changes

- **New `src/EcoSpold/Cutoff.hs`** — holds `applyCutoffStrategy`,
`hasReferenceProduct`, `removeZeroAmountCoproducts`,
`assignSingleProductAsReference`, `isProductionExchange` (exported) plus
the internal `updateReferenceProduct` / `markAsReference` /
`unmarkAsReference`.
- **`Parser1.hs` / `Parser2.hs`** — delete the duplicated block, `import
EcoSpold.Cutoff (applyCutoffStrategy)`. Parser1 drops its "exported for
testing" cut-off helpers.
- **`EcoSpold1Spec.hs`** — imports the cut-off helpers from
`EcoSpold.Cutoff` instead of the old `EcoSpold.Parser1` re-export.
- **`volca.cabal`** — registers the new library module.

Pure post-fold logic, runs once per activity (not in the SAX loop), so
no behaviour or throughput change. Net: ~154 duplicated lines removed
across the two parsers, replaced by one ~95-line module.
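
For readers unfamiliar with the idea, a cut-off pass of this kind can be pictured as a small pure pipeline over an activity's exchanges. The sketch below is a guess at the general shape only: the Exchange record, the function names (dropZeroCoproducts, promoteSoleProduct, cutoffSketch), and the exact rules are assumptions loosely modeled on the helper names listed above, not the contents of `EcoSpold.Cutoff`.

```haskell
-- Hypothetical exchange record; field names are invented for this sketch.
data Exchange = Exchange
  { exName        :: String
  , exAmount      :: Double
  , exIsOutput    :: Bool
  , exIsReference :: Bool
  } deriving (Eq, Show)

-- Assumption: a production exchange is simply an output.
isProduction :: Exchange -> Bool
isProduction = exIsOutput

hasReference :: [Exchange] -> Bool
hasReference = any exIsReference

-- Drop outputs with zero amount unless they are the reference product.
dropZeroCoproducts :: [Exchange] -> [Exchange]
dropZeroCoproducts = filter keep
  where
    keep e = not (isProduction e) || exIsReference e || exAmount e /= 0

-- If no reference is marked and exactly one product remains, promote it.
promoteSoleProduct :: [Exchange] -> [Exchange]
promoteSoleProduct exs
  | hasReference exs = exs
  | otherwise = case filter isProduction exs of
      [p] -> map (\e -> if e == p then e { exIsReference = True } else e) exs
      _   -> exs

-- The whole pass is one composition, run once per activity after the fold.
cutoffSketch :: [Exchange] -> [Exchange]
cutoffSketch = promoteSoleProduct . dropZeroCoproducts
```

Because a pass like this touches only the finished exchange list, sharing it between parsers does not affect the SAX loop.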

## Verification

- `cabal build lib:volca` + `exe:volca`: clean (only pre-existing
unrelated MCP warnings).
- `cabal test lca-tests`: **1107 examples, 0 failures, 1 pending** —
including the EcoSpold1 cut-off-helper tests now resolving against the
new module.