refactor(ecospold2): factor parseWithXeno into SAX combinators #97
Merged
parseWithXeno was a single ~465-line function whose closeTag handler held two near-duplicate ~115-line exchange-finalization blocks plus a dozen field setters all repeating the same three idioms: text harvest (T.concat . reverse . map bsToText), path pop (drop 1 on psPath), and per-kind InIntermediateExchange/InElementaryExchange context dispatch. Extract small top-level pure combinators: accumText / popPath / popText / pathAt for the idioms; onExchange (over mapExchange) for the per-kind dispatch, with currentIntermediate / currentElementary / inGeneralComment holding the exhaustive ElementContext matches so the call sites stay wildcard-free; and mkUnit / resolveGroups / missingUnitWarning / parseUUIDOrNil / finishExchange plus the add* pushes for the shared exchange-close scaffolding. The result plumbing becomes an Either-monad do-block (Data.Bifunctor.first) and buildResult's tail an fmap. Helpers are kept allocation-neutral (no lens on this hot strict fold). Each big block now carries only its domain decision logic. Behaviour is unchanged: 1107/1107 hspec green, net -29 lines.
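The text-harvest and path-pop idioms named above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `ParseState` record, its `psTextAccum` field, the body of `bsToText`, and the exact shape of `popText` are assumptions made for the example; only the helper names, `psPath`, and the `T.concat . reverse . map bsToText` pipeline come from the description.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical, cut-down parse state; the real record has more fields.
data ParseState = ParseState
  { psPath      :: ![BS.ByteString]  -- open element names, innermost first
  , psTextAccum :: ![BS.ByteString]  -- assumed field: text chunks, newest first
  } deriving (Eq, Show)

-- Assumed definition: lenient UTF-8 decoding of one SAX text chunk.
bsToText :: BS.ByteString -> T.Text
bsToText = decodeUtf8With lenientDecode

-- Text harvest: chunks are consed on as they arrive, so reverse before joining.
accumText :: ParseState -> T.Text
accumText = T.concat . reverse . map bsToText . psTextAccum

-- Path pop: leave the innermost element (a no-op on an empty path).
popPath :: ParseState -> ParseState
popPath st = st { psPath = drop 1 (psPath st) }

-- Assumed shape: harvest the text, then clear the buffer and pop the path.
popText :: ParseState -> (T.Text, ParseState)
popText st = (accumText st, (popPath st) { psTextAccum = [] })

-- Sample state: inside <exchange><name>, having seen "hel" and then "lo".
sampleState :: ParseState
sampleState = ParseState
  { psPath      = map BC.pack ["name", "exchange"]
  , psTextAccum = map BC.pack ["lo", "hel"]
  }
```

With helpers of this shape, a close-tag handler for a text field can call `popText` once and pass the harvested `Text` to its setter, instead of repeating the concat/reverse/drop sequence inline.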
ccomb added a commit that referenced this pull request on May 28, 2026
…off (#99)

## What

The cut-off post-processing, which reduces a multi-output dataset to a single reference product so the engine sees a single-output process, was duplicated **verbatim** between the EcoSpold1 and EcoSpold2 parsers (~70 lines, the same eight-function chain in each). This moves that block into one shared `EcoSpold.Cutoff` module that both parsers import.

This is the last remaining piece of the EcoSpold refactor series (after the Parser2 split #97 and the Parser1 fold dedup #94), which left the cut-off block still copy-pasted in both files.

## Changes

- **New `src/EcoSpold/Cutoff.hs`**: holds `applyCutoffStrategy`, `hasReferenceProduct`, `removeZeroAmountCoproducts`, `assignSingleProductAsReference`, `isProductionExchange` (exported) plus the internal `updateReferenceProduct` / `markAsReference` / `unmarkAsReference`.
- **`Parser1.hs` / `Parser2.hs`**: delete the duplicated block, `import EcoSpold.Cutoff (applyCutoffStrategy)`. Parser1 drops its "exported for testing" cut-off helpers.
- **`EcoSpold1Spec.hs`**: imports the cut-off helpers from `EcoSpold.Cutoff` instead of the old `EcoSpold.Parser1` re-export.
- **`volca.cabal`**: registers the new library module.

Pure post-fold logic, runs once per activity (not in the SAX loop), so no behaviour or throughput change. Net: ~154 duplicated lines removed across the two parsers, replaced by one ~95-line module.

## Verification

- `cabal build lib:volca` + `exe:volca`: clean (only pre-existing unrelated MCP warnings).
- `cabal test lca-tests`: **1107 examples, 0 failures, 1 pending**, including the EcoSpold1 cut-off-helper tests now resolving against the new module.
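To make the cut-off idea concrete, here is a rough sketch of how a multi-output exchange list might be reduced to a single reference product. The function names are borrowed from the commit message, but the `Exchange` record, every signature, and every function body are guesses made for illustration; the real `EcoSpold.Cutoff` module may work differently.

```haskell
-- Hypothetical exchange record; field names are assumptions for this sketch.
data Exchange = Exchange
  { exName        :: String
  , exAmount      :: Double
  , exIsOutput    :: Bool  -- assumed flag: True for a product output
  , exIsReference :: Bool  -- True if marked as the reference product
  } deriving (Eq, Show)

-- Assumed meaning: a production exchange is a product output.
isProductionExchange :: Exchange -> Bool
isProductionExchange = exIsOutput

hasReferenceProduct :: [Exchange] -> Bool
hasReferenceProduct = any (\ex -> isProductionExchange ex && exIsReference ex)

-- Drop non-reference product outputs whose amount is zero; keep all inputs.
removeZeroAmountCoproducts :: [Exchange] -> [Exchange]
removeZeroAmountCoproducts = filter keep
  where
    keep ex =
      not (isProductionExchange ex) || exIsReference ex || exAmount ex /= 0

-- If no reference is marked and exactly one product remains, mark it.
assignSingleProductAsReference :: [Exchange] -> [Exchange]
assignSingleProductAsReference exs
  | hasReferenceProduct exs = exs
  | otherwise =
      case filter isProductionExchange exs of
        [_] -> map markIfProduct exs
        _   -> exs
  where
    markIfProduct ex
      | isProductionExchange ex = ex { exIsReference = True }
      | otherwise               = ex

-- Assumed composition: prune empty coproducts first, then pick the reference.
applyCutoffStrategy :: [Exchange] -> [Exchange]
applyCutoffStrategy = assignSingleProductAsReference . removeZeroAmountCoproducts
```

Because a function like this works on an already-built exchange list, it can live in one module and be called once per activity by both parsers, which matches the commit's remark that the logic sits outside the SAX loop.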
Summary
`parseWithXeno` was a single ~465-line function. Its `closeTag` handler held two near-duplicate ~115-line exchange-finalization blocks plus a dozen field setters, all repeating the same three idioms: text harvest, path pop, and per-kind `InIntermediateExchange`/`InElementaryExchange` dispatch.

This extracts small top-level pure combinators so each block now carries only its domain decision logic:

- Idioms: `accumText`, `popPath`, `popText`, `pathAt`
- Per-kind dispatch: `onExchange` (over `mapExchange`); the exhaustive `ElementContext` matches live in `currentIntermediate` / `currentElementary` / `inGeneralComment`, so call sites stay wildcard-free
- Shared exchange-close scaffolding: `mkUnit`, `resolveGroups`, `missingUnitWarning`, `parseUUIDOrNil`, `finishExchange`, and the `add*` state pushes
- Result plumbing: an `Either`-monad do-block (`Data.Bifunctor.first`); `buildResult`'s tail becomes an `fmap`

Combinators are kept allocation-neutral (no lens on this hot strict fold). Net −29 lines, single file.
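The per-kind dispatch can be pictured with a sketch along these lines. The constructor names `InIntermediateExchange` / `InElementaryExchange` and the helper names `mapExchange`, `onExchange`, and `currentIntermediate` come from the summary; the payload types, the extra `InGeneralComment` / `Other` constructors, the `psContext` field, and all signatures are assumptions made for illustration.

```haskell
-- Hypothetical payloads; the real exchange records carry many more fields.
data IntermediateData = IntermediateData { idName :: String, idAmount :: Double }
  deriving (Eq, Show)

data ElementaryData = ElementaryData { edName :: String, edAmount :: Double }
  deriving (Eq, Show)

-- Assumed context type; only the two exchange constructors are named in the PR.
data ElementContext
  = InIntermediateExchange IntermediateData
  | InElementaryExchange ElementaryData
  | InGeneralComment
  | Other
  deriving (Eq, Show)

-- The one place that lists every constructor: no wildcard, so adding a
-- constructor makes GHC's incomplete-pattern warning point here.
mapExchange
  :: (IntermediateData -> IntermediateData)
  -> (ElementaryData -> ElementaryData)
  -> ElementContext
  -> ElementContext
mapExchange f g ctx = case ctx of
  InIntermediateExchange d -> InIntermediateExchange (f d)
  InElementaryExchange d   -> InElementaryExchange (g d)
  InGeneralComment         -> ctx
  Other                    -> ctx

-- Exhaustive projection, again without a wildcard.
currentIntermediate :: ElementContext -> Maybe IntermediateData
currentIntermediate ctx = case ctx of
  InIntermediateExchange d -> Just d
  InElementaryExchange _   -> Nothing
  InGeneralComment         -> Nothing
  Other                    -> Nothing

-- Hypothetical state wrapper with a single assumed field.
newtype ParseState = ParseState { psContext :: ElementContext }
  deriving (Eq, Show)

-- Lift a per-kind update to the whole parse state.
onExchange
  :: (IntermediateData -> IntermediateData)
  -> (ElementaryData -> ElementaryData)
  -> ParseState
  -> ParseState
onExchange f g st = st { psContext = mapExchange f g (psContext st) }

-- Example field setter: one line per kind, no case expression at the call site.
setAmount :: Double -> ParseState -> ParseState
setAmount x = onExchange (\d -> d { idAmount = x }) (\d -> d { edAmount = x })
```

The design point is that a field setter such as `setAmount` no longer pattern-matches on the context itself, so the constructor list lives in a handful of helpers rather than in every handler.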
Test plan
- `cabal build` clean: only the 2 pre-existing MCP warnings, none from this file
- `cabal test`: 1107/1107 hspec green, 1 pending (unchanged from baseline)