refactor(validate): extract phase2 evaluator and surface failure context#751
Conversation
Move the amaru-uplc invocation out of `tx.rs` into a dedicated
`phase2::evaluator` module exposing `eval_script` -> `ScriptEvalResult`.
This isolates the script-evaluation surface from the rest of phase-two
plumbing and makes it directly testable.
Promote machine failures from a silently-discarded `Err` on
`result.term` into structured data: `MachineFailure { message, budget,
logs }` is propagated up and surfaced on `TxEvalResult` via a new
`failure_message: Option<String>` field. Previously a failing script
would yield `success: false` with no diagnostic; callers now receive
the evaluator's error message and the same line is emitted at debug
level via `tracing`.
Restructure `Error::Machine` from an opaque tuple wrapping the
arena-borrowed `MachineError` into a `{ message, budget, logs }` struct
with a clean `Display` impl backed by free `format_machine_traces` and
`indent_trace` helpers. The previous variant could never actually be
constructed (the arena lifetime didn't permit it) and was dead.
Drop the `* 11 / 10` budget inflation hack in `tx.rs` — the evaluator's
reported `consumed_budget` is now passed through verbatim. Callers
that relied on the old margin will see slightly lower numbers.
Add unit tests in `evaluator.rs` covering: CBOR decode failure typing,
flat decode failure typing, machine failure capturing logs and a
non-empty message, V2 application order (datum -> redeemer -> context),
V2 with missing datum, and a regression test confirming `serialiseData`
on V3 evaluates without panicking.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughA new Plutus phase2 script evaluator is introduced that decodes UPLC programs, applies version-specific arguments, evaluates programs, captures logs, and returns structured results. The error variant for machine failures is refactored to use named fields with helper trace formatting. ChangesPlutus Phase2 Evaluator Refactoring
Sequence DiagramsequenceDiagram
participant Caller as Caller
participant TxEval as execute_script<br/>(tx.rs)
participant Evaluator as eval_script<br/>(evaluator.rs)
participant UPLC as UPLC<br/>Decoder/Evaluator
participant Result as ScriptEvalResult
Caller->>TxEval: Language, TxInfo, script bytes, datum, redeemer
TxEval->>TxEval: Build script context from TxInfo
TxEval->>Evaluator: eval_script(Language, script_bytes, datum, redeemer, context)
Evaluator->>UPLC: Map Language to PlutusVersion
Evaluator->>UPLC: CBOR decode script bytes
Evaluator->>UPLC: Flat-decode to UPLC Program
Evaluator->>UPLC: Convert datum/redeemer/context to terms
Evaluator->>UPLC: Apply arguments (version-specific order)
UPLC-->>Evaluator: Evaluation result & budget
Evaluator->>Evaluator: Convert budget to ExUnits
Evaluator->>Evaluator: Determine success (V3 requires Unit term)
Evaluator-->>Result: ScriptEvalResult {success, units, logs, failure}
Result-->>TxEval: Return result
TxEval->>TxEval: Populate TxEvalResult with failure_message
TxEval-->>Caller: TxEvalResult
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pallas-validate/src/phase2/evaluator.rs (1)
77-91: 💤 Low valueV3 success logic correctly implements CIP-0069 semantics.
For PlutusV3, scripts must return
Unitto indicate success, not just complete without error. However, note that when a V3 script returnsOk(non-Unit),failurewill beNonebutsuccesswill befalse. This may be intentional, but callers won't have diagnostic information for this failure mode.Consider capturing non-Unit V3 results as a failure
If you want to surface why a V3 script failed when it returns a non-Unit value:
let failure = result.term.as_ref().err().map(|err| MachineFailure { message: err.to_string(), budget: units, logs: logs.clone(), }); let success = match (&result.term, &language) { (Ok(_), Language::PlutusV1 | Language::PlutusV2) => true, (Ok(term), Language::PlutusV3) => matches!( term, Term::Constant(constant) if matches!(**constant, Constant::Unit) ), (Err(_), _) => false, }; + // For V3, if evaluation succeeded but didn't return Unit, create a failure + let failure = match (&result.term, &language, failure) { + (Ok(_), Language::PlutusV3, None) if !success => Some(MachineFailure { + message: "PlutusV3 script did not return Unit".to_string(), + budget: units, + logs: logs.clone(), + }), + (_, _, f) => f, + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pallas-validate/src/phase2/evaluator.rs` around lines 77 - 91, The current match sets success=false for PlutusV3 when the returned term is non-Unit but leaves failure as-is (often None), which loses diagnostic info; update the evaluation logic around the match on (&result.term, &language) (the block assigning success) to also populate failure (the variable used in the ScriptEvalResult) when Language::PlutusV3 returns Ok(term) that is not Constant::Unit — e.g., detect the Ok(term) non-Unit branch (the same place using Term::Constant and Constant::Unit) and set failure to Some descriptive error (e.g., "PlutusV3 script must return Unit") while keeping success=false, then return ScriptEvalResult with the updated failure field.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pallas-validate/src/phase2/evaluator.rs`:
- Around line 77-91: The current match sets success=false for PlutusV3 when the
returned term is non-Unit but leaves failure as-is (often None), which loses
diagnostic info; update the evaluation logic around the match on (&result.term,
&language) (the block assigning success) to also populate failure (the variable
used in the ScriptEvalResult) when Language::PlutusV3 returns Ok(term) that is
not Constant::Unit — e.g., detect the Ok(term) non-Unit branch (the same place
using Term::Constant and Constant::Unit) and set failure to Some descriptive
error (e.g., "PlutusV3 script must return Unit") while keeping success=false,
then return ScriptEvalResult with the updated failure field.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 245b6c8f-1c02-4126-b2a7-49eb78234a34
📒 Files selected for processing (4)
pallas-validate/src/phase2/error.rspallas-validate/src/phase2/evaluator.rspallas-validate/src/phase2/mod.rspallas-validate/src/phase2/tx.rs
Summary
phase2/tx.rsinto a dedicatedphase2::evaluatormodule returning a structuredScriptEvalResult.TxEvalResultgainsfailure_message: Option<String>, and the same context is emitted at debug level viatracing.Error::Machineinto{ message, budget, logs }with a cleanDisplayimpl. The previous opaque tuple variant was dead code (its arena lifetime made it un-constructible).* 11 / 10budget inflation hack intx.rs— the evaluator's reportedconsumed_budgetis now passed through verbatim. Behavior change: callers will see slightly lowerunitsvalues than before.Context
These are the architectural improvements proposed in #739, ported on top of the recently merged amaru-uplc switch (#749). #739 itself is now obsolete — the dependency change it introduced is superseded — but the error-handling refactor, the dedicated evaluator module, and the test coverage were still missing from main.
Test plan
cargo test -p pallas-validate --features phase2— 6 new evaluator unit tests + 121 existing tests all passcargo clippy -p pallas-validate --features phase2 --all-targets -- -D warningscleanserialise_data_v3_script_evaluates_without_panic) confirms the original motivation behind fix(validate): migrate phase2 evaluation to uplc #739 — thatserialiseDatano longer panics — holds under amaru-uplc🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes