refactor(validate): extract phase2 evaluator and surface failure context by scarmuega · Pull Request #751 · txpipe/pallas

scarmuega · 2026-05-04T16:44:14Z

Summary

Extracts the amaru-uplc invocation out of phase2/tx.rs into a dedicated phase2::evaluator module returning a structured ScriptEvalResult.
Surfaces machine-failure diagnostics that were previously discarded: TxEvalResult gains failure_message: Option<String>, and the same context is emitted at debug level via tracing.
Restructures Error::Machine into { message, budget, logs } with a clean Display impl. The previous opaque tuple variant was dead code (its arena lifetime made it un-constructible).
Drops the legacy * 11 / 10 budget inflation hack in tx.rs — the evaluator's reported consumed_budget is now passed through verbatim. Behavior change: callers will see slightly lower units values than before.
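
For reviewers gauging the size of the behavior change in the last bullet, here is a minimal sketch of the arithmetic. `ExUnits` is a local stand-in with assumed `mem` and `steps` fields, and `inflate_legacy` / `pass_through` are hypothetical names; the actual code in `tx.rs` may differ in shape.

```rust
/// Local stand-in for the execution-unit pair (assumed field names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExUnits {
    pub mem: u64,
    pub steps: u64,
}

/// Hypothetical reconstruction of the old behavior: pad the consumed
/// budget by 10% using integer arithmetic (the division rounds down).
pub fn inflate_legacy(consumed: ExUnits) -> ExUnits {
    ExUnits {
        mem: consumed.mem * 11 / 10,
        steps: consumed.steps * 11 / 10,
    }
}

/// New behavior as described in this PR: report the consumed budget verbatim.
pub fn pass_through(consumed: ExUnits) -> ExUnits {
    consumed
}
```

Under these assumptions, a script that consumed 1,000 memory units was previously reported as 1,100 and is now reported as 1,000, so callers that relied on the implicit margin would need to add their own.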

Context

These are the architectural improvements proposed in #739, ported on top of the recently merged amaru-uplc switch (#749). #739 itself is now obsolete — the dependency change it introduced is superseded — but the error-handling refactor, the dedicated evaluator module, and the test coverage were still missing from main.

Test plan

cargo test -p pallas-validate --features phase2 — 6 new evaluator unit tests + 121 existing tests all pass
cargo clippy -p pallas-validate --features phase2 --all-targets -- -D warnings clean
New regression test (serialise_data_v3_script_evaluates_without_panic) confirms that the original motivation behind #739 (fix(validate): migrate phase2 evaluation to uplc), namely that serialiseData no longer panics, holds under amaru-uplc

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added Plutus script evaluation functionality with support for multiple Plutus versions and comprehensive error tracking.
- Enhanced failure reporting to include detailed error messages and execution logs.
Bug Fixes
- Improved error handling and logging capture during script execution.

Move the amaru-uplc invocation out of `tx.rs` into a dedicated `phase2::evaluator` module exposing `eval_script` -> `ScriptEvalResult`. This isolates the script-evaluation surface from the rest of phase-two plumbing and makes it directly testable.

Promote machine failures from a silently-discarded `Err` on `result.term` into structured data: `MachineFailure { message, budget, logs }` is propagated up and surfaced on `TxEvalResult` via a new `failure_message: Option<String>` field. Previously a failing script would yield `success: false` with no diagnostic; callers now receive the evaluator's error message and the same line is emitted at debug level via `tracing`.

Restructure `Error::Machine` from an opaque tuple wrapping the arena-borrowed `MachineError` into a `{ message, budget, logs }` struct with a clean `Display` impl backed by free `format_machine_traces` and `indent_trace` helpers. The previous variant could never actually be constructed (the arena lifetime didn't permit it) and was dead.

Drop the `* 11 / 10` budget inflation hack in `tx.rs`; the evaluator's reported `consumed_budget` is now passed through verbatim. Callers that relied on the old margin will see slightly lower numbers.

Add unit tests in `evaluator.rs` covering: CBOR decode failure typing, flat decode failure typing, machine failure capturing logs and a non-empty message, V2 application order (datum -> redeemer -> context), V2 with missing datum, and a regression test confirming `serialiseData` on V3 evaluates without panicking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
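
To make the `Error::Machine` restructuring concrete, the following is a rough sketch of a struct-style variant with a `Display` impl built on two free helpers. Only the variant's field names and the helper names come from the commit message; the output layout, the indentation width, the empty-log placeholder, and the local `ExUnits` stand-in are assumptions for illustration.

```rust
use std::fmt;

/// Local stand-in for the budget type (assumed field names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExUnits {
    pub mem: u64,
    pub steps: u64,
}

/// Owned data only, so the variant carries no arena lifetime.
#[derive(Debug)]
pub enum Error {
    Machine {
        message: String,
        budget: ExUnits,
        logs: Vec<String>,
    },
}

/// Indent every line of one trace entry (assumed four-space indent).
fn indent_trace(trace: &str) -> String {
    trace
        .lines()
        .map(|line| format!("    {line}"))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Render all captured traces, one indented entry after another.
fn format_machine_traces(logs: &[String]) -> String {
    if logs.is_empty() {
        return "    (no traces)".to_string();
    }
    logs.iter()
        .map(|entry| indent_trace(entry))
        .collect::<Vec<_>>()
        .join("\n")
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Machine { message, budget, logs } => write!(
                f,
                "machine error: {message} (mem: {}, steps: {})\ntraces:\n{}",
                budget.mem,
                budget.steps,
                format_machine_traces(logs)
            ),
        }
    }
}
```

Because the variant holds a `String` rather than a borrowed `MachineError`, it can be constructed after the evaluation arena is gone, which is the property the old tuple form lacked.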

coderabbitai · 2026-05-04T16:44:29Z

📝 Walkthrough

Walkthrough

A new Plutus phase2 script evaluator is introduced that decodes UPLC programs, applies version-specific arguments, evaluates programs, captures logs, and returns structured results. The error variant for machine failures is refactored to use named fields with helper trace formatting.

Changes

Plutus Phase2 Evaluator Refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Data Shape `pallas-validate/src/phase2/error.rs` | `Error::Machine` variant changes from tuple form `(MachineError, ExBudget, Vec<String>)` to struct form with `message: String`, `budget: ExUnits`, and `logs: Vec<String>`. Helper functions `format_machine_traces` and `indent_trace` format log output. |
| Core Evaluator Implementation `pallas-validate/src/phase2/evaluator.rs` | New module implements `eval_script` to decode CBOR/flat-encoded UPLC programs, convert datum/redeemer/context into terms, apply arguments in version-specific order (V1/V2: datum+redeemer+context; V3: context only), evaluate programs, and convert budgets to `ExUnits`. Introduces `ScriptEvalResult` and `MachineFailure` structures. Includes helpers for `PlutusData` term conversion, budget conversion, and language mapping. |
| Script Execution Integration `pallas-validate/src/phase2/tx.rs` | `execute_script` is refactored to accept `Language` instead of `PlutusVersion`, delegate to `evaluator::eval_script`, and return `TxEvalResult` with new `failure_message` field. `eval_redeemer` passes `Language::PlutusV` and corresponding `TxInfoV` into the updated `execute_script`. |
| Module Registration `pallas-validate/src/phase2/mod.rs` | Internal `evaluator` module is declared. |
| Tests & Helpers `pallas-validate/src/phase2/evaluator.rs` | Unit tests verify decode/flat-encode error typing, log preservation on machine failure, V3 `serialiseData` no-panic behavior, and V2 argument-application order with optional datum. |
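
The version-specific argument order described for the evaluator can be sketched as below. This is a toy model: arguments are plain string labels rather than UPLC terms, `script_args` is a hypothetical name, and skipping an absent datum for V1/V2 is an assumption inferred from the "V2 with missing datum" test rather than confirmed behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    PlutusV1,
    PlutusV2,
    PlutusV3,
}

/// Return the arguments in the order they would be applied to the script.
/// Labels stand in for the real datum/redeemer/context terms.
pub fn script_args<'a>(
    language: Language,
    datum: Option<&'a str>,
    redeemer: &'a str,
    context: &'a str,
) -> Vec<&'a str> {
    match language {
        // V1/V2: datum (when present), then redeemer, then context.
        Language::PlutusV1 | Language::PlutusV2 => {
            let mut args = Vec::with_capacity(3);
            if let Some(d) = datum {
                args.push(d);
            }
            args.push(redeemer);
            args.push(context);
            args
        }
        // V3: the script context is the only argument.
        Language::PlutusV3 => vec![context],
    }
}
```

Keeping the ordering in one small function like this is what makes it unit-testable in isolation, which matches the stated motivation for extracting the evaluator.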

Sequence Diagram

sequenceDiagram
    participant Caller as Caller
    participant TxEval as execute_script<br/>(tx.rs)
    participant Evaluator as eval_script<br/>(evaluator.rs)
    participant UPLC as UPLC<br/>Decoder/Evaluator
    participant Result as ScriptEvalResult

    Caller->>TxEval: Language, TxInfo, script bytes, datum, redeemer
    TxEval->>TxEval: Build script context from TxInfo
    TxEval->>Evaluator: eval_script(Language, script_bytes, datum, redeemer, context)
    Evaluator->>UPLC: Map Language to PlutusVersion
    Evaluator->>UPLC: CBOR decode script bytes
    Evaluator->>UPLC: Flat-decode to UPLC Program
    Evaluator->>UPLC: Convert datum/redeemer/context to terms
    Evaluator->>UPLC: Apply arguments (version-specific order)
    UPLC-->>Evaluator: Evaluation result & budget
    Evaluator->>Evaluator: Convert budget to ExUnits
    Evaluator->>Evaluator: Determine success (V3 requires Unit term)
    Evaluator-->>Result: ScriptEvalResult {success, units, logs, failure}
    Result-->>TxEval: Return result
    TxEval->>TxEval: Populate TxEvalResult with failure_message
    TxEval-->>Caller: TxEvalResult
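
The last two steps of the diagram, where `execute_script` turns a `ScriptEvalResult` into a `TxEvalResult`, might look roughly like this. The structs below are trimmed local stand-ins containing only the fields named in this PR (the real `TxEvalResult` has more), and the `From` impl is a hypothetical way to express the mapping.

```rust
/// Local stand-in for the budget type (assumed field names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExUnits {
    pub mem: u64,
    pub steps: u64,
}

/// Structured failure as described in the PR: message, budget, logs.
#[derive(Debug, Clone)]
pub struct MachineFailure {
    pub message: String,
    pub budget: ExUnits,
    pub logs: Vec<String>,
}

/// Evaluator output, mirroring `{success, units, logs, failure}`.
#[derive(Debug, Clone)]
pub struct ScriptEvalResult {
    pub success: bool,
    pub units: ExUnits,
    pub logs: Vec<String>,
    pub failure: Option<MachineFailure>,
}

/// Trimmed stand-in for the caller-facing result.
#[derive(Debug, Clone)]
pub struct TxEvalResult {
    pub success: bool,
    pub units: ExUnits,
    pub logs: Vec<String>,
    pub failure_message: Option<String>,
}

impl From<ScriptEvalResult> for TxEvalResult {
    fn from(result: ScriptEvalResult) -> Self {
        TxEvalResult {
            success: result.success,
            // Budget is passed through unchanged.
            units: result.units,
            logs: result.logs,
            // Only the human-readable message is surfaced to callers.
            failure_message: result.failure.map(|f| f.message),
        }
    }
}
```

With a mapping like this, a failing script yields `success: false` together with `failure_message: Some(..)` instead of a bare boolean.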

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

chore(validate): switch from txpipe uplc-turbo fork to pragma-org amaru-uplc #749: Modifies the same phase2 code paths, including execute_script/eval_redeemer signatures and Machine-related error handling and imports.

Poem

🐰 A script evaluator hops in,
With UPLC terms and budgets thin,
V1, V2, V3 all aligned,
Plutus phases now refined! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main refactoring: extracting phase2 evaluator logic and surfacing failure context in error handling. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

🧹 Nitpick comments (1)

pallas-validate/src/phase2/evaluator.rs (1)

77-91: 💤 Low value

V3 success logic correctly implements CIP-0117 (explicit script return values) semantics.

For PlutusV3, scripts must return Unit to indicate success, not just complete without error. However, note that when a V3 script returns Ok(non-Unit), failure will be None but success will be false. This may be intentional, but callers won't have diagnostic information for this failure mode.

Consider capturing non-Unit V3 results as a failure

If you want to surface why a V3 script failed when it returns a non-Unit value:

     let failure = result.term.as_ref().err().map(|err| MachineFailure {
         message: err.to_string(),
         budget: units,
         logs: logs.clone(),
     });

     let success = match (&result.term, &language) {
         (Ok(_), Language::PlutusV1 | Language::PlutusV2) => true,
         (Ok(term), Language::PlutusV3) => matches!(
             term,
             Term::Constant(constant) if matches!(**constant, Constant::Unit)
         ),
         (Err(_), _) => false,
     };

+    // For V3, if evaluation succeeded but didn't return Unit, create a failure
+    let failure = match (&result.term, &language, failure) {
+        (Ok(_), Language::PlutusV3, None) if !success => Some(MachineFailure {
+            message: "PlutusV3 script did not return Unit".to_string(),
+            budget: units,
+            logs: logs.clone(),
+        }),
+        (_, _, f) => f,
+    };
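
The same idea can be shown as a self-contained sketch that computes `success` and the failure message in a single match, which avoids shadowing `failure`. `Term` here is a toy enum rather than the amaru-uplc term type, `classify` is a hypothetical name, and the error text is the one proposed in the diff above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    PlutusV1,
    PlutusV2,
    PlutusV3,
}

/// Toy stand-in for an evaluated term: either the unit constant or anything else.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Term {
    Unit,
    Other(String),
}

/// Return `(success, failure_message)` for an evaluation outcome.
pub fn classify(
    language: Language,
    outcome: Result<Term, String>,
) -> (bool, Option<String>) {
    match (outcome, language) {
        // Machine error: always a failure, keep the evaluator's message.
        (Err(message), _) => (false, Some(message)),
        // V1/V2: any normal termination counts as success.
        (Ok(_), Language::PlutusV1 | Language::PlutusV2) => (true, None),
        // V3: only the unit constant counts as success.
        (Ok(Term::Unit), Language::PlutusV3) => (true, None),
        // V3 with a non-unit result: fail and say why.
        (Ok(_), Language::PlutusV3) => (
            false,
            Some("PlutusV3 script did not return Unit".to_string()),
        ),
    }
}
```

Written this way, every `success == false` path carries a message, so the non-Unit V3 case no longer reaches callers without diagnostics.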

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pallas-validate/src/phase2/evaluator.rs` around lines 77 - 91, The current
match sets success=false for PlutusV3 when the returned term is non-Unit but
leaves failure as-is (often None), which loses diagnostic info; update the
evaluation logic around the match on (&result.term, &language) (the block
assigning success) to also populate failure (the variable used in the
ScriptEvalResult) when Language::PlutusV3 returns Ok(term) that is not
Constant::Unit — e.g., detect the Ok(term) non-Unit branch (the same place using
Term::Constant and Constant::Unit) and set failure to Some descriptive error
(e.g., "PlutusV3 script must return Unit") while keeping success=false, then
return ScriptEvalResult with the updated failure field.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 245b6c8f-1c02-4126-b2a7-49eb78234a34

📥 Commits

Reviewing files that changed from the base of the PR and between a7b5a86 and bf4aaa1.

📒 Files selected for processing (4)

pallas-validate/src/phase2/error.rs
pallas-validate/src/phase2/evaluator.rs
pallas-validate/src/phase2/mod.rs
pallas-validate/src/phase2/tx.rs

coderabbitai Bot reviewed May 4, 2026

scarmuega mentioned this pull request May 4, 2026

fix(validate): migrate phase2 evaluation to uplc #739

Closed

scarmuega merged commit e1d0555 into main May 8, 2026
15 checks passed

scarmuega deleted the validate/phase2-evaluator-refactor branch May 8, 2026 16:18

scarmuega merged 1 commit into main from validate/phase2-evaluator-refactor
