Skip to content

refactor(validate): extract phase2 evaluator and surface failure context#751

Merged
scarmuega merged 1 commit into
mainfrom
validate/phase2-evaluator-refactor
May 8, 2026
Merged

refactor(validate): extract phase2 evaluator and surface failure context#751
scarmuega merged 1 commit into
mainfrom
validate/phase2-evaluator-refactor

Conversation

@scarmuega
Copy link
Copy Markdown
Member

@scarmuega scarmuega commented May 4, 2026

Summary

  • Extracts the amaru-uplc invocation out of phase2/tx.rs into a dedicated phase2::evaluator module returning a structured ScriptEvalResult.
  • Surfaces machine-failure diagnostics that were previously discarded: TxEvalResult gains failure_message: Option<String>, and the same context is emitted at debug level via tracing.
  • Restructures Error::Machine into { message, budget, logs } with a clean Display impl. The previous opaque tuple variant was dead code (its arena lifetime made it un-constructible).
  • Drops the legacy * 11 / 10 budget inflation hack in tx.rs — the evaluator's reported consumed_budget is now passed through verbatim. Behavior change: callers will see slightly lower units values than before.

Context

These are the architectural improvements proposed in #739, ported on top of the recently merged amaru-uplc switch (#749). #739 itself is now obsolete — the dependency change it introduced is superseded — but the error-handling refactor, the dedicated evaluator module, and the test coverage were still missing from main.

Test plan

  • cargo test -p pallas-validate --features phase2 — 6 new evaluator unit tests + 121 existing tests all pass
  • cargo clippy -p pallas-validate --features phase2 --all-targets -- -D warnings clean
  • New regression test (serialise_data_v3_script_evaluates_without_panic) confirms the original motivation behind fix(validate): migrate phase2 evaluation to uplc #739 — that serialiseData no longer panics — holds under amaru-uplc

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added Plutus script evaluation functionality with support for multiple Plutus versions and comprehensive error tracking.
    • Enhanced failure reporting to include detailed error messages and execution logs.
  • Bug Fixes

    • Improved error handling and logging capture during script execution.

Move the amaru-uplc invocation out of `tx.rs` into a dedicated
`phase2::evaluator` module exposing `eval_script` -> `ScriptEvalResult`.
This isolates the script-evaluation surface from the rest of phase-two
plumbing and makes it directly testable.

Promote machine failures from a silently-discarded `Err` on
`result.term` into structured data: `MachineFailure { message, budget,
logs }` is propagated up and surfaced on `TxEvalResult` via a new
`failure_message: Option<String>` field. Previously a failing script
would yield `success: false` with no diagnostic; callers now receive
the evaluator's error message and the same line is emitted at debug
level via `tracing`.

Restructure `Error::Machine` from an opaque tuple wrapping the
arena-borrowed `MachineError` into a `{ message, budget, logs }` struct
with a clean `Display` impl backed by free `format_machine_traces` and
`indent_trace` helpers. The previous variant could never actually be
constructed (the arena lifetime didn't permit it) and was dead.

Drop the `* 11 / 10` budget inflation hack in `tx.rs` — the evaluator's
reported `consumed_budget` is now passed through verbatim. Callers
that relied on the old margin will see slightly lower numbers.

Add unit tests in `evaluator.rs` covering: CBOR decode failure typing,
flat decode failure typing, machine failure capturing logs and a
non-empty message, V2 application order (datum -> redeemer -> context),
V2 with missing datum, and a regression test confirming `serialiseData`
on V3 evaluates without panicking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

📝 Walkthrough

Walkthrough

A new Plutus phase2 script evaluator is introduced that decodes UPLC programs, applies version-specific arguments, evaluates programs, captures logs, and returns structured results. The error variant for machine failures is refactored to use named fields with helper trace formatting.

Changes

Plutus Phase2 Evaluator Refactoring

Layer / File(s) Summary
Data Shape
pallas-validate/src/phase2/error.rs
Error::Machine variant changes from tuple form (MachineError, ExBudget, Vec<String>) to struct form with message: String, budget: ExUnits, and logs: Vec<String>. Helper functions format_machine_traces and indent_trace format log output.
Core Evaluator Implementation
pallas-validate/src/phase2/evaluator.rs
New module implements eval_script to decode CBOR/flat-encoded UPLC programs, convert datum/redeemer/context into terms, apply arguments in version-specific order (V1/V2: datum+redeemer+context; V3: context only), evaluate programs, and convert budgets to ExUnits. Introduces ScriptEvalResult and MachineFailure structures. Includes helpers for PlutusData term conversion, budget conversion, and language mapping.
Script Execution Integration
pallas-validate/src/phase2/tx.rs
execute_script is refactored to accept Language instead of PlutusVersion, delegate to evaluator::eval_script, and return TxEvalResult with new failure_message field. eval_redeemer passes Language::PlutusV* and corresponding TxInfoV* into the updated execute_script.
Module Registration
pallas-validate/src/phase2/mod.rs
Internal evaluator module is declared.
Tests & Helpers
pallas-validate/src/phase2/evaluator.rs
Unit tests verify decode/flat-encode error typing, log preservation on machine failure, V3 serialiseData no-panic behavior, and V2 argument-application order with optional datum.

Sequence Diagram

sequenceDiagram
    participant Caller as Caller
    participant TxEval as execute_script<br/>(tx.rs)
    participant Evaluator as eval_script<br/>(evaluator.rs)
    participant UPLC as UPLC<br/>Decoder/Evaluator
    participant Result as ScriptEvalResult

    Caller->>TxEval: Language, TxInfo, script bytes, datum, redeemer
    TxEval->>TxEval: Build script context from TxInfo
    TxEval->>Evaluator: eval_script(Language, script_bytes, datum, redeemer, context)
    Evaluator->>UPLC: Map Language to PlutusVersion
    Evaluator->>UPLC: CBOR decode script bytes
    Evaluator->>UPLC: Flat-decode to UPLC Program
    Evaluator->>UPLC: Convert datum/redeemer/context to terms
    Evaluator->>UPLC: Apply arguments (version-specific order)
    UPLC-->>Evaluator: Evaluation result & budget
    Evaluator->>Evaluator: Convert budget to ExUnits
    Evaluator->>Evaluator: Determine success (V3 requires Unit term)
    Evaluator-->>Result: ScriptEvalResult {success, units, logs, failure}
    Result-->>TxEval: Return result
    TxEval->>TxEval: Populate TxEvalResult with failure_message
    TxEval-->>Caller: TxEvalResult
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 A script evaluator hops in,
With UPLC terms and budgets thin,
V1, V2, V3 all aligned,
Plutus phases now refined! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main refactoring: extracting phase2 evaluator logic and surfacing failure context in error handling.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch validate/phase2-evaluator-refactor

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pallas-validate/src/phase2/evaluator.rs (1)

77-91: 💤 Low value

V3 success logic correctly implements CIP-0069 semantics.

For PlutusV3, scripts must return Unit to indicate success, not just complete without error. However, note that when a V3 script returns Ok(non-Unit), failure will be None but success will be false. This may be intentional, but callers won't have diagnostic information for this failure mode.

Consider capturing non-Unit V3 results as a failure

If you want to surface why a V3 script failed when it returns a non-Unit value:

     let failure = result.term.as_ref().err().map(|err| MachineFailure {
         message: err.to_string(),
         budget: units,
         logs: logs.clone(),
     });

     let success = match (&result.term, &language) {
         (Ok(_), Language::PlutusV1 | Language::PlutusV2) => true,
         (Ok(term), Language::PlutusV3) => matches!(
             term,
             Term::Constant(constant) if matches!(**constant, Constant::Unit)
         ),
         (Err(_), _) => false,
     };

+    // For V3, if evaluation succeeded but didn't return Unit, create a failure
+    let failure = match (&result.term, &language, failure) {
+        (Ok(_), Language::PlutusV3, None) if !success => Some(MachineFailure {
+            message: "PlutusV3 script did not return Unit".to_string(),
+            budget: units,
+            logs: logs.clone(),
+        }),
+        (_, _, f) => f,
+    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pallas-validate/src/phase2/evaluator.rs` around lines 77 - 91, The current
match sets success=false for PlutusV3 when the returned term is non-Unit but
leaves failure as-is (often None), which loses diagnostic info; update the
evaluation logic around the match on (&result.term, &language) (the block
assigning success) to also populate failure (the variable used in the
ScriptEvalResult) when Language::PlutusV3 returns Ok(term) that is not
Constant::Unit — e.g., detect the Ok(term) non-Unit branch (the same place using
Term::Constant and Constant::Unit) and set failure to Some descriptive error
(e.g., "PlutusV3 script must return Unit") while keeping success=false, then
return ScriptEvalResult with the updated failure field.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pallas-validate/src/phase2/evaluator.rs`:
- Around line 77-91: The current match sets success=false for PlutusV3 when the
returned term is non-Unit but leaves failure as-is (often None), which loses
diagnostic info; update the evaluation logic around the match on (&result.term,
&language) (the block assigning success) to also populate failure (the variable
used in the ScriptEvalResult) when Language::PlutusV3 returns Ok(term) that is
not Constant::Unit — e.g., detect the Ok(term) non-Unit branch (the same place
using Term::Constant and Constant::Unit) and set failure to Some descriptive
error (e.g., "PlutusV3 script must return Unit") while keeping success=false,
then return ScriptEvalResult with the updated failure field.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 245b6c8f-1c02-4126-b2a7-49eb78234a34

📥 Commits

Reviewing files that changed from the base of the PR and between a7b5a86 and bf4aaa1.

📒 Files selected for processing (4)
  • pallas-validate/src/phase2/error.rs
  • pallas-validate/src/phase2/evaluator.rs
  • pallas-validate/src/phase2/mod.rs
  • pallas-validate/src/phase2/tx.rs

@scarmuega scarmuega merged commit e1d0555 into main May 8, 2026
15 checks passed
@scarmuega scarmuega deleted the validate/phase2-evaluator-refactor branch May 8, 2026 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant