Port recent changes from zk-prove-any/sp1-runner to master#4521
Port recent changes from zk-prove-any/sp1-runner to master#4521pmikolajczyk41 merged 21 commits intomasterfrom
Conversation
…re containing pure data
…-port-changes-to-master
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #4521 +/- ##
==========================================
+ Coverage 32.67% 32.98% +0.31%
==========================================
Files 497 497
Lines 58849 58849
==========================================
+ Hits 19227 19414 +187
+ Misses 36245 36036 -209
- Partials 3377 3399 +22 |
❌ 7 Tests Failed:
View the top 3 failed tests by shortest run time
📣 Thoughts on this report? Let Codecov know! | Powered by Codecov |
bragaigor
left a comment
There was a problem hiding this comment.
LGTM overall just a few minor comments
crates/validation/src/lib.rs
Outdated
| if req.delayed_msg_nr != 0 && !req.delayed_msg.is_empty() { | ||
| delayed_messages.insert(req.delayed_msg_nr, req.delayed_msg.clone()); | ||
| } |
There was a problem hiding this comment.
should we also be checking req.has_delayed_msg? Or would that be incorrect if delayed_msg_nr == 0?
There was a problem hiding this comment.
good point, I think that we should actually inspect has_delayed_msg instead of delayed_msg_nr - fixed in 7187218
crates/validation/src/lib.rs
Outdated
|
|
||
| impl ValidationInput { | ||
| /// Extract runtime data from a request for the given target architecture. | ||
| pub fn from_request(req: &ValidationRequest, target: &str) -> Self { |
There was a problem hiding this comment.
should we add a few unit tests for this from_request(...)? At least one that test the case from my comment below?
| last_block_hash: Bytes32(env.input.large_globals[0]), | ||
| last_send_root: Bytes32(env.input.large_globals[1]), | ||
| inbox_position: env.input.small_globals[0], | ||
| position_within_message: env.input.small_globals[1], |
There was a problem hiding this comment.
nitpick: wondering if it would improve readability if we introduced const's for these magic indexes?
impl ValidationInput {
pub const INBOX_POSITION: usize = 0;
pub const POSITION_WITHIN_MESSAGE: usize = 1;
pub const LAST_BLOCK_HASH: usize = 0;
pub const LAST_SEND_ROOT: usize = 1;
}There was a problem hiding this comment.
hmm, hard to say, as these indices are used everywhere (nitro, JIT, prover, SP1, validation data structures conversion, transfer protocol); I'm open to do this, but I'd prefer to do this in another PR
…-port-changes-to-master # Conflicts: # Cargo.lock # crates/jit/src/machine.rs
eljobe
left a comment
There was a problem hiding this comment.
Before I do my own manual review, I wanted to get your opinion on the 8 ideas that Claude's pr-review feature came up with:
PR #4521 Review Summary
"Port recent changes from zk-prove-any/sp1-runner to master" — introduces ValidationInput intermediate struct, renames old ValidationInput to ValidationRequest, refactors
JIT/prover/validator crates.
Critical Issues (2 found)
- Delayed message semantics changed silently
crates/validation/src/lib.rs:47 | from_request
The new from_request checks req.has_delayed_msg && !req.delayed_msg.is_empty(), but the old continuous-mode path (wavmio.rs) only checked has_delayed_msg. If Go sends has_delayed_msg=true
with an empty body (plausible edge case), the new code silently drops it. The old prepare.rs also used a different condition (delayed_msg_nr != 0). This is now the unified path — it should
match the most permissive old behavior or log a warning when discarding.
Recommendation: Either trust has_delayed_msg alone, or add a warning log when has_delayed_msg == true && delayed_msg.is_empty().
- Arch validation missing in native mode
crates/validator/src/engine/execution.rs:45 | validate_native
The stylus arch mismatch check was moved to machine.rs feed_machine (continuous mode only). In native mode, validate_native calls ValidationInput::from_request(&input, local_target())
directly — if user_wasms contains programs for the wrong arch, from_request silently returns empty module_asms. The old native path would panic (user_wasms[local_target()].clone()), which is
bad but at least visible. Now it's silent.
Recommendation: Move the arch validation into from_request itself, or add it to validate_native too.
Important Issues (3 found)
- Removed "multiple delayed batches" wire protocol validation
crates/validation/src/transfer/receiver.rs | receive_inbox
The old receive_delayed_message enforced at-most-one delayed message at the protocol level. The new generic receive_inbox accepts any number silently. If a buggy sender sends multiple
delayed messages, they'll be silently accepted.
Recommendation: Add a post-receive check delayed_messages.len() <= 1 in receive_validation_input.
- Preimage type no longer validated on receive
crates/validation/src/transfer/receiver.rs | receive_preimages
Old code: PreimageType::try_from(read_u8(reader)?) — rejected unknown types. New code: accepts any u8. Invalid preimage types are carried through until lookup, where they produce eprintln! +
Ok(0) (a silent failure disguised as success).
Recommendation: Keep PreimageType::try_from validation in the receiver, or improve the lookup-site error handling.
- No test for relocated arch validation
crates/validator/src/engine/machine.rs
The test local_stylus_target_must_be_present_if_some_target_is_present was removed from transfer/tests.rs but no replacement was added for the arch check now in feed_machine. The
load_validation_input function is also untested.
Suggestions (3 found)
- load_validation_input clone-then-drain is wasteful
crates/jit/src/machine.rs:337 — Clones all WASM binaries then immediately drains them. Consider taking ValidationInput by value where possible to avoid the clone.
- Positional globals are fragile
small_globals[0] = batch, [1] = pos_in_batch — purely by convention. A named struct (still rkyv-compatible with primitive fields) would be safer and more readable.
- HashMap vs BTreeMap inconsistency
module_asms uses HashMap while all other maps use BTreeMap. Non-deterministic iteration could matter if rkyv-serialized before draining. Document the choice or unify.
Strengths
- Consolidation of scattered state into ValidationInput eliminates 3-4 duplicate conversion paths
- Fixed old serialization bug where send_user_wasms returned early without writing a count byte
- Good test coverage for from_request (8 focused unit tests)
- Clean module extractions (InternalFunc, MemoryType) — pure moves with no logic changes
- The wasip1_stub macro simplification is a nice cleanup
- Wire format sender/receiver stay in lockstep
Recommended Action
Yes, this is a bit messy. Especially that part of our codebase supports multiple delayed messages here, while nitro will never send more than 1. The checks were not consistent either. For now, I simplified it even more, in d9a3ce6, to checking just
Good point, in 4e7cde2 I made
Same answer as to 1.
This is not a protocol responsibility. I moved checks to other, more relevant places.
This is tested in
Good point. Applied in 3345bec and 7ced81b
See #4521 (comment)
This inconsistency is very old... anyway, unified in 4a8530b |
Porting changes from:
closes NIT-4671