Conversation
WalkthroughAdds a wasmtime/Cranelift stack/value-stack limit patch, a new AFL-style GenVM fuzzing executor with assets, refactors Host socket abstraction/visibility, extends fuzzing/dev tooling and tests for recursion/stack-overflow, and updates related build/nix and runtime fingerprint derives. Changes
Sequence Diagram(s)sequenceDiagram
participant AFL as AFL Harness
participant GenExec as genvm-executor (Tokio)
participant Host as Mock Host (Sock)
participant VM as VM Runner (leader/validator)
participant Comparator as Comparator
Note over AFL,GenExec: start fuzz iteration
AFL->>GenExec: supply FuzzingInput
GenExec->>Host: construct/inject Sock (DuplexStream)
GenExec->>VM: spawn leader run (async)
GenExec->>VM: spawn validator run (async)
VM->>Host: perform host syscalls/events
VM-->>GenExec: return result + fingerprint + events (leader)
VM-->>GenExec: return result + fingerprint + events (validator)
GenExec->>Comparator: compare fingerprints, outputs, events
Comparator-->>GenExec: report mismatch? (interesting/crash)
GenExec-->>AFL: emit outcome
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
tests/runner/run.py (1)
406-418: Consider adding validation for storage_json format.The implementation correctly handles preloading storage from JSON, but lacks validation that could help catch configuration errors early.
Consider adding basic validation:
if storage_json := first_conf.get('storage_json'): + if not Path(storage_json).exists(): + raise FileNotFoundError(f"storage_json file not found: {storage_json}") storage_b64 = json.loads(Path(storage_json).read_text()) + if not isinstance(storage_b64, dict): + raise ValueError("storage_json must contain a JSON object") base_mock_storage._storages = {This would provide clearer error messages if the configuration is malformed.
.git-third-party/patches/executor/third-party/wasmtime/5 (1)
75-126: Consider clarifying the error message.The logic correctly enforces the 65535 limit on total locals + stack usage, but the error message could be more informative.
Consider updating the error message to better reflect what's being limited:
if total_subtract > 65535 { - return Err(wasmtime_types::WasmError::InvalidWebAssembly { message: "too many locals".to_owned(), offset: 0 }); + return Err(wasmtime_types::WasmError::InvalidWebAssembly { + message: format!("stack and locals exceed limit: {} + {} > 65535", total_locals, max_stack), + offset: 0 + }); }This would help developers understand which part of their code (locals vs stack depth) is causing the issue.
executor/fuzz/genvm-executor.rs (1)
48-67: Clarify commented-out non-blocking setup.The
fcntlcalls to setO_NONBLOCKare commented out. Is this intentional? If the pipes don't need non-blocking mode, consider removing the commented code for clarity. If they should be non-blocking, the calls should be uncommented.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
executor/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (35)
.git-third-party/config.json(1 hunks).git-third-party/patches/executor/third-party/wasmtime/5(1 hunks)executor/Cargo.toml(3 hunks)executor/default.nix(1 hunks)executor/fuzz/genvm-executor.rs(1 hunks)executor/fuzz/genvm-executor/a-exports.wat(1 hunks)executor/fuzz/genvm-executor/a-imports.wat(1 hunks)executor/fuzz/genvm-executor/b-exports.wat(1 hunks)executor/fuzz/genvm-executor/b-imports.wat(1 hunks)executor/fuzz/genvm-executor/registry/all.json(1 hunks)executor/fuzz/genvm-executor/registry/latest.json(1 hunks)executor/fuzz/inputs-genvm-executor/0a62ce8989f2e131ad0e6641a48b80a525bba57ef526a7994375f768182bd953(1 hunks)executor/src/exe/run.rs(2 hunks)executor/src/host/message.rs(2 hunks)executor/src/host/mod.rs(3 hunks)executor/src/lib.rs(1 hunks)executor/src/rt/errors.rs(3 hunks)executor/src/rt/vm/mod.rs(2 hunks)executor/src/runners/cache.rs(1 hunks)modules/install/config/genvm-llm-default.lua(1 hunks)runners/support/all/generation-00.nix(1 hunks)tests/cases/stable/exploits/rec.jsonnet(1 hunks)tests/cases/stable/exploits/rec.stdout(1 hunks)tests/cases/stable/exploits/rec.wat(1 hunks)tests/cases/stable/exploits/rec_1023.jsonnet(1 hunks)tests/cases/stable/exploits/rec_1023.stdout(1 hunks)tests/cases/stable/exploits/rec_1023.wat(1 hunks)tests/cases/stable/exploits/rec_1024.jsonnet(1 hunks)tests/cases/stable/exploits/rec_1024.stdout(1 hunks)tests/cases/stable/exploits/rec_1024.wat(1 hunks)tests/cases/stable/exploits/rec_tail.jsonnet(1 hunks)tests/cases/stable/exploits/rec_tail.stdout(1 hunks)tests/cases/stable/exploits/rec_tail.wat(1 hunks)tests/runner/run.py(2 hunks)tests/rust.sh(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
tests/rust.sh (2)
executor/fuzz/genvm-executor.rs (1)
main(874-895)modules/implementation/fuzz/genvm-modules-normalize-whitespace.rs (1)
main(83-103)
executor/src/exe/run.rs (1)
executor/src/host/mod.rs (1)
connect(40-58)
executor/src/host/mod.rs (1)
executor/fuzz/genvm-executor.rs (1)
new(127-141)
executor/src/rt/vm/mod.rs (2)
executor/src/exe/run.rs (1)
fmt(18-20)executor/src/rt/errors.rs (2)
fmt(12-14)fmt(105-107)
tests/runner/run.py (1)
runners/genlayer-py-std/src/genlayer/py/types.py (1)
Address(122-254)
executor/fuzz/genvm-executor.rs (5)
executor/src/host/message.rs (5)
arbitrary(84-102)len(21-23)len(50-52)zero(17-19)zero(46-48)executor/src/host/mod.rs (3)
new(37-39)read_u32(61-65)write_slice(76-83)tests/runner/origin/host_fns.py (2)
Errors(25-30)Methods(7-22)executor/common/src/cancellation.rs (1)
make(25-44)executor/src/lib.rs (1)
run_with(133-251)
🪛 Ruff (0.14.4)
tests/runner/run.py
447-454: Starting a process with a partial executable path
(S607)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: module-test-python
🔇 Additional comments (68)
tests/cases/stable/exploits/rec.wat (1)
1-17: WebAssembly stack overflow test correctly implemented.The recursive function design properly triggers unbounded recursion for stack overflow detection. The module structure, type definitions, exports, and memory declarations are all syntactically correct and align with the PR objective to extend test coverage with stack overflow scenarios.
tests/cases/stable/exploits/rec.stdout (1)
1-1: Expected output correctly specifies stack overflow trap.The test expectation aligns with the recursive execution in rec.wat and matches the VMError format for stack overflow traps introduced by stack-limiting enforcement.
tests/cases/stable/exploits/rec_1023.stdout (1)
1-1: LGTM! Test expectation is clear.The expected output indicates successful execution with a zero return value, which aligns with the recursion test completing without stack overflow.
tests/cases/stable/exploits/rec.jsonnet (1)
1-2: LGTM! Test configuration follows the established pattern.The dynamic path construction using
${jsonnetDir}and${fileBaseName}enables flexible test file discovery. This pattern is consistent across multiple test files in this PR.tests/cases/stable/exploits/rec_1023.jsonnet (1)
1-2: LGTM! Configuration is consistent with test patterns.The structure mirrors rec.jsonnet, maintaining consistency across recursion test variants.
tests/cases/stable/exploits/rec_1023.wat (3)
3-17: LGTM! Recursion logic is correct.The recursive function correctly implements a countdown pattern: when called with parameter N > 0, it calls itself with N-1, terminating when the parameter reaches 0. This creates exactly 1023 recursive calls (1022 → 1021 → ... → 1 → 0), matching the test name.
19-23: LGTM! Entry point correctly initiates the recursion test.The
_startfunction properly initiates the recursion with 1022, testing the stack limit enforcement mentioned in the PR objectives.
25-26: Memory size is intentional and standardized.The 257-page memory allocation is not arbitrary. This size is uniformly applied across all recursion-related test files in the exploit suite (rec_tail.wat, rec_1024.wat, rec_1023.wat, rec.wat, and flt.wat), indicating a deliberate, standardized test setup. The rec_1023.wat test specifically exercises recursion at depth 1022 as a boundary condition; using consistent, sufficient memory across these tests ensures reproducible test conditions and prevents stack/memory exhaustion from interfering with the recursion depth testing itself.
tests/rust.sh (2)
111-111: LGTM: AFL benchmarking mode enabled.The
AFL_BENCH_UNTIL_CRASH=1flag appropriately configures AFL to stop after detecting the first crash, which is useful for CI environments and focused crash detection.
192-199: Verify multi-process fuzzing configuration.The
-M mainflag designates this as the main fuzzer instance for AFL's multi-core fuzzing mode. However, I don't see any secondary fuzzer instances (typically launched with-Sflags) in this script.
- If you intend to run only a single fuzzer instance, the
-Mflag is unnecessary (AFL defaults to single-instance mode without it).- If you plan to run multiple parallel fuzzers, you'll need to launch secondary instances with
-S secondary_nameflags, either in this script or elsewhere.The 5-second timeout (
-t "5000") appears reasonable for fuzzing targets.Please confirm whether:
- This is intended as a single-instance setup (in which case
-M maincan be removed), or- Secondary fuzzer instances are launched by another mechanism not visible in this script.
.git-third-party/config.json (1)
6-6: LGTM: Wasmtime patch version incremented.The patch version bump from 4 to 5 aligns with the stack-limiting enforcement mentioned in the PR summary. This is a standard dependency version update.
executor/fuzz/inputs-genvm-executor/0a62ce8989f2e131ad0e6641a48b80a525bba57ef526a7994375f768182bd953 (1)
1-1: Fuzzing input artifact - no review needed.This is a fuzzing test case input file, part of the expanded fuzzing corpus. These binary/text inputs are typically generated or curated automatically and don't require code review.
runners/support/all/generation-00.nix (1)
9-11: LGTM: Version revisions added.The three new revision hashes for versions 0.2.0, 0.2.3, and 0.2.4 are properly formatted and follow the existing pattern. This maintains the version history tracking.
executor/src/runners/cache.rs (2)
25-28: LGTM: Enhanced error context for file operations.Adding
.with_context()to provide the registry path in error messages is a good practice. This will make debugging file access issues significantly easier.
34-37: LGTM: Consistent error context addition.The error context enhancement for
latest.jsonmatches the pattern used forall.json, ensuring consistent error reporting across the registry file operations.executor/src/rt/errors.rs (3)
110-114: LGTM: Equality derives added to Frame.The
PartialEqandEqderives are appropriate forFramesince both fields (module_name: Stringandfunc: u32) implement these traits. This enables comparison and testing of stack frames.
94-94: LGTM: Enhanced fingerprint logging.Adding
frames = fingerprint.frames.len()to the debug log provides useful diagnostic information about the captured stack trace depth.
119-124: LGTM: Equality derives added to Fingerprint.The
PartialEqandEqderives are valid forFingerprint. The constituent types support these traits:
Vec<Frame>supports Eq (Frame now derives Eq)BTreeMap<String, wasmtime::ModuleFingerprint>supports Eq (per the Wasmtime changes in this PR)This enables deterministic comparison of error fingerprints, which is valuable for fuzzing validation.
executor/src/rt/vm/mod.rs (2)
6-6: LGTM: Debug derive added to RunOk.Adding the
Debugderive alongside the existing serialization provides better debugging support. RunOk now has bothDebug(for development) andDisplay(for user-facing output), which is a common and recommended pattern.
36-62: Change verified: no breaking changes detected.The implementation is sound.
RunOkretains itsDebugderive macro while adding the newDisplayimplementation, so both traits coexist safely. Existing code like the fuzz test (ReturnToComparestruct) that relies on Debug formatting continues to work without modification. The Display implementation properly escapes control characters, backslashes, and invalid UTF-8 sequences, providing the intended user-friendly output.executor/default.nix (1)
37-37: LGTM: Derivation name updated for clarity.The rename from
genvm-modules-${target}togenvm-executor-${target}better reflects the package's purpose and aligns with the PR's focus on executor fuzzing infrastructure. This is purely a naming change with no functional impact.executor/Cargo.toml (3)
19-21: LGTM: Fuzzing example entry added correctly.The example entry follows the standard format and correctly points to the fuzzing executor implementation.
49-49: LGTM: Tokio "fs" feature added.The "fs" feature is appropriately added to support file system operations in the fuzzing infrastructure.
64-73: LGTM: Fuzzing dependencies added correctly.The dependencies are appropriate for implementing comprehensive fuzzing:
arbitrarywith "derive" for generating test inputsaflfor fuzzing infrastructurewasm-smith,wasm-encoder,watfor WASM generation and manipulationlibcfor low-level operationsAll dependencies correctly use path references to third-party submodules.
tests/runner/run.py (1)
447-455: LGTM: Tail-call support added for WAT compilation.The addition of
--enable-tail-callflag is appropriate for testing the new stack limiting functionality. The reformatting to a multi-line list also improves readability..git-third-party/patches/executor/third-party/wasmtime/5 (6)
19-31: LGTM: Fixes incorrect wast dependency path.The patch correctly fixes the
wastdependency path from pointing to thewatcrate to the correctwastcrate.
32-52: LGTM: FunctionBuilder extended for stack size tracking.The new
value_stack_size_instfield appropriately tracks the placeholder instruction for the stack size value, which is computed after translation completes.
156-201: LGTM: Stack depth tracking implementation.The
max_stacktracking correctly maintains the peak stack depth by updating on every push operation. The use ofmax()ensures accurate tracking across all control flow paths.
204-232: LGTM: Compiler integration for stack limits.The global values are correctly initialized to point to the runtime limit fields in
VMRuntimeLimits, using appropriate pointer arithmetic and offsets.
332-354: Verify overflow handling in before_return is intentional.The
before_returnmethod increments stack limits usinguadd_overflowbut doesn't trap on overflow. This could theoretically wrap around on extremely deep (malformed) call stacks, though this would require exceeding u32::MAX calls.Is the silent wraparound on
uadd_overflowintentional? If the stack depth tracking becomes corrupted (e.g., through a bug), this could allow bypassing limits. Consider whether to trap on overflow:let (new_val, overflowed) = builder.ins().uadd_overflow(old_val, one); self.trapnz(builder, overflowed, ir::TrapCode::HeapMisaligned); // Or a custom trap code builder.ins().store(ir::MemFlags::trusted(), new_val, limit_addr, 0);Alternatively, document why silent wraparound is acceptable here.
414-416: Stack limits are hardcoded without configurability or documentation.The default values (1024 for call stack, 65535 for value stack) are hardcoded in wasmtime's
VMRuntimeLimits::Defaultimplementation via the patch. No configuration mechanism exists in GenVM to adjust these per execution context, and no documentation explains the rationale for these specific values.Verification findings:
- Values are hardcoded in the patch with no conditional or configuration logic
- GenVM codebase contains no config struct or runtime setting for stack limits
- No documentation (README, ARCHITECTURE, or comments) explains these limits
- The actual enforcement of these limits occurs within wasmtime's internal logic, which is not visible in GenVM code
tests/cases/stable/exploits/rec_tail.stdout (1)
1-1: LGTM: Test expectation for recursive tail-call timeout.This test correctly validates that recursive tail calls are handled and timeout as expected.
executor/fuzz/genvm-executor/registry/all.json (1)
1-1: LGTM: Fuzzing registry initialization.The empty JSON object serves as the initial state for the fuzzing registry, which will be populated during fuzzing runs.
executor/fuzz/genvm-executor/registry/latest.json (1)
1-1: LGTM: Fuzzing registry initialization.The empty JSON object serves as the initial state for the latest registry, tracking the most recent fuzzing state.
tests/cases/stable/exploits/rec_1024.stdout (1)
1-1: LGTM: Test validates stack overflow protection.This test correctly validates that the new stack limiting enforcement triggers StackOverflow traps on deep recursion, demonstrating the security improvement.
tests/cases/stable/exploits/rec_1024.jsonnet (1)
1-2: LGTM: Test case configuration.The test case correctly uses the template system to run the recursion exploit test, validating stack overflow protection.
executor/fuzz/genvm-executor/a-exports.wat (1)
1-5: LGTM! Appropriate fuzzing artifact.This module provides a minimal export surface for fuzzing scenarios. The
unreachablebodies are intentional to test trap handling and module instantiation.executor/fuzz/genvm-executor/b-exports.wat (1)
1-3: LGTM! Appropriate fuzzing artifact.This module provides a minimal
_startentry point for fuzzing. Theunreachableinstruction is intentional for testing trap scenarios.executor/src/lib.rs (1)
3-3: LGTM! Necessary API exposure.Making the host module public enables external access to the Host constructors and socket abstractions, which is essential for the fuzzing infrastructure introduced in this PR.
tests/cases/stable/exploits/rec_tail.jsonnet (1)
1-4: LGTM! Appropriate test configuration.The test driver correctly configures a deadline-based execution for the tail recursion exploit test. The 4-second deadline ensures the test completes within a reasonable timeframe while allowing the stack overflow behavior to manifest.
executor/fuzz/genvm-executor/a-imports.wat (1)
1-54: LGTM! Comprehensive import surface for fuzzing.This module declares the complete WASI and GenLayer SDK import surface, enabling the fuzzing infrastructure to test import resolution, linking behavior, and host function interactions without executing actual code.
executor/src/exe/run.rs (2)
118-118: LGTM! Aligns with Host API refactoring.The change from
Host::newtoHost::connectis consistent with the socket abstraction refactoring that decouples socket creation from Host initialization.
170-170: LGTM! Cleaner output format.Switching from Debug (
{res:?}) to Display ({res:}) provides cleaner user-facing output. Verbose debug output remains available via the--print stderr-fulloption.tests/cases/stable/exploits/rec_tail.wat (1)
1-14: LGTM! Appropriate exploit test for stack overflow.This test module intentionally implements infinite tail recursion to exercise stack-limiting enforcement. Key features:
$fooperforms unbounded tail recursion to trigger stack limits- The 257-page memory allocation tests memory boundary handling
- All patterns align with the PR's objective of testing stack-limiting enforcement in Wasmtime/Cranelift
The test correctly triggers timeout behavior as expected (per the corresponding
.stdoutfile).executor/fuzz/genvm-executor/b-imports.wat (1)
1-56: LGTM! Comprehensive fuzzing artifact with cross-module imports.This module extends the import surface with a cross-module dependency on
mod_a.foo, enabling the fuzzing infrastructure to test multi-module linking scenarios. The combination witha-exports.watcreates realistic module interaction patterns for fuzzing.tests/cases/stable/exploits/rec_1024.wat (1)
1-27: LGTM! Well-structured stack overflow test.This WebAssembly module correctly implements a deep recursion test with 1023 recursive calls, designed to test stack limit enforcement. The structure is appropriate for validating that the runtime properly handles stack overflow scenarios.
executor/src/host/message.rs (2)
1-9: LGTM! Appropriate fuzzing support.Adding the
Arbitraryderive toAccountAddressenables random instance generation for fuzzing, which aligns with the PR's fuzzing objectives.
83-103: LGTM! Well-implemented fuzzing support for MessageData.The custom
Arbitraryimplementation properly handles:
- Timestamp validation with fallible conversion
- Chain ID generation from random bytes via U256
- Random generation of all address and optional fields
The implementation correctly uses
?for error propagation when arbitrary data is insufficient.executor/src/host/mod.rs (3)
1-1: LGTM! Exposes host_fns for external use.Making the
host_fnsmodule public enables external code (like the fuzzing harness) to access method and error definitions, which is necessary for implementing mock hosts.
20-24: LGTM! Public Sock trait enables mock implementations.Making the
Socktrait public allows external code to implement custom socket types for testing and fuzzing, which aligns with the PR's objectives.
37-58: Excellent refactoring for testability.The decoupling of socket creation from Host instantiation is a clean design improvement:
Host::new(sock, metrics)accepts a pre-constructed socket for dependency injectionHost::connect(addr, metrics)provides the original connection logicThis enables mock socket injection for testing while maintaining backward compatibility through the
connectmethod.executor/fuzz/genvm-executor.rs (17)
14-26: LGTM! Conservative WASM configuration for fuzzing.The configuration appropriately disables advanced WebAssembly features (SIMD, threads, memory64, etc.) while enabling bulk memory operations. This simplifies the fuzzing surface and focuses on core execution paths.
28-41: LGTM! Clean error type for fuzzing flow control.The
FuzzNexterror type provides a clean way to signal that a fuzz iteration should be skipped (e.g., due to timeout) without treating it as a failure.
112-124: LGTM! Standard pipe creation pattern.The
connected_files()function correctly useslibc::pipeto create bidirectional IPC channels. The unsafe blocks are properly justified, and error handling with panic is appropriate for initialization code.
144-162: LGTM! Clear data structures for determinism verification.The
SequentialEventenum andHostAccumulatedDatastruct provide clean abstractions for tracking and comparing execution outputs between leader and validator runs.
242-279: LGTM! Page-based storage implementation.The
write_storage_slicemethod correctly implements page-based storage with 32-byte pages. It handles:
- Page alignment and offsets
- Partial page writes
- Read-modify-write for existing pages
281-315: LGTM! Consistent storage read implementation.The
read_storage_slicemethod mirrors the write implementation with appropriate handling of page boundaries and missing pages (zero-filled).
317-326: LGTM! Proper code storage setup.The
write_codemethod correctly stores contract code using the CODE slot indirection with length prefix, consistent with the GenVM storage model.
336-420: LGTM! Core host method handlers implemented correctly.The mock host properly handles:
- Calldata retrieval
- Storage read/write with page-based model
- Result consumption with acknowledgment
- Leader/validator nondet result synchronization with appropriate error codes
422-503: LGTM! Message and contract operation handlers.The handlers correctly:
- Collect nondet results for later comparison
- Track posted messages and deployments
- Provide mock responses for EthCall/EthSend
505-556: LGTM! Event and utility handlers.The remaining handlers appropriately:
- Collect event logs with topics
- Return mock balance and fuel values
- Error on nondet disagreement notifications (expected not to occur)
577-635: LGTM! Contract generation from fuzzing input.The
generate_contractfunction correctly:
- Configures wasm-smith with appropriate imports/exports
- Generates two modules from fuzzing data
- Packages them in a ZIP with runner configuration
The two-module approach (link + start) provides good coverage of the GenVM module system.
649-669: LGTM! Timeout mechanism with cancellation.The
start_timeoutsfunction properly:
- Creates a cancellation token for graceful shutdown
- Sets a 20-second timeout for each fuzz iteration
- Signals timeout via atomic flag
671-760: LGTM! Contract execution with mock host.The
run_withfunction correctly:
- Sets up GenVM configuration and shared state
- Injects the mock host socket
- Spawns mock host in separate task
- Runs contract execution via supervisor
- Collects and returns results with timeout checking
762-770: LGTM! RAII pattern for cleanup.The
CallOnDroputility ensures the cancellation callback is invoked on scope exit, providing proper resource cleanup.
772-823: LGTM! Determinism verification logic.The
do_fuzzingfunction properly:
- Runs contract as leader (generating nondet results)
- Runs contract as validator (consuming leader's nondet results)
- Asserts equality of fingerprints, messages, results, and return values
- Distinguishes between different RunOk variants for precise comparison
This correctly validates deterministic execution between leader and validator modes.
825-841: LGTM! Fuzzing orchestration with runtime management.The
runfunction correctly:
- Generates contract code from fuzzing input
- Creates a dedicated Tokio runtime with appropriate thread counts
- Executes the fuzzing iteration
- Explicitly drops the runtime for cleanup
874-895: LGTM! AFL fuzzing integration with error handling.The
mainfunction correctly:
- Initializes logging with memlimiter filtering
- Uses the AFL fuzzing macro with
FuzzingInput- Distinguishes
FuzzNexterrors (timeouts) from real failures- Panics on actual errors for AFL to detect bugs
| args.prompt = lib.rs.filter_text(args.prompt, { | ||
| 'NFKC', | ||
| 'RmZeroWidth', | ||
| 'NormalizeWS' | ||
| }) |
There was a problem hiding this comment.
🧩 Analysis chain
Verify sanitization consistency and error handling.
The sanitization logic looks appropriate for preventing injection attacks and handling fuzzing edge cases. However, please verify:
-
Inconsistency:
ExecPromptTemplate(line 93) doesn't apply similar sanitization. Should template-based prompts also be sanitized before transformation? -
Error handling: What happens if
lib.rs.filter_textfails? Consider whether the call needs to be wrapped inpcallor if the function is guaranteed to succeed.
Run the following script to check if ExecPromptTemplate or other prompt-handling functions should have similar sanitization:
🏁 Script executed:
#!/bin/bash
# Description: Search for other prompt execution functions that might need sanitization
# Search for functions that handle prompts
ast-grep --pattern 'function $FUNC($$$, args, $$$)
$$$
llm.$_transform(args)
$$$'
# Search for direct usage of args.prompt without sanitization
rg -n -B3 -A3 'args\.prompt' --type=luaLength of output: 979
🏁 Script executed:
#!/bin/bash
# View both ExecPrompt and ExecPromptTemplate functions
echo "=== ExecPrompt and ExecPromptTemplate functions ==="
sed -n '78,110p' modules/install/config/genvm-llm-default.lua
echo ""
echo "=== Check filter_text implementation ==="
rg -n -B5 -A10 'filter_text' --type=lua modules/install/lib/
echo ""
echo "=== Check for pcall patterns around filter_text usage ==="
rg -n -B3 -A3 'pcall.*filter_text|filter_text.*pcall' --type=luaLength of output: 713
🏁 Script executed:
#!/bin/bash
# Find lib.rs module definition/import
echo "=== lib.rs import/definition ==="
rg -n 'lib\.rs|require.*rs|import.*rs' --type=lua modules/install/config/ modules/install/lib/
echo ""
echo "=== Type definitions for payloads ==="
rg -n 'LLMExecPromptPayload|LLMExecPromptTemplatePayload' --type=lua
echo ""
echo "=== Check exec_prompt_template_transform calls ==="
rg -n -B3 -A3 'exec_prompt_template_transform' --type=lua
echo ""
echo "=== Search for lib.rs in Rust files if present ==="
find . -name '*.rs' -type f | head -20Length of output: 3646
🏁 Script executed:
#!/bin/bash
# Search for all filter_text usages
echo "=== All filter_text usages ==="
rg -n -B2 -A2 'filter_text' --type=lua
echo ""
echo "=== Search Rust code for filter_text implementation ==="
rg -n -B5 -A10 'filter_text' --type=rs executor/
echo ""
echo "=== Check Lua-Rust binding hints ==="
rg -n 'filter_text' --type=lua modules/Length of output: 597
🏁 Script executed:
#!/bin/bash
# Try searching Rust files with find and grep instead
echo "=== Search for filter_text in Rust files ==="
find . -name "*.rs" -type f -exec grep -l "filter_text" {} \;
echo ""
echo "=== Check if filter_text might be in a different module ==="
find . -path ./target -prune -o -name "*.rs" -type f -print | xargs grep -n "fn filter_text" 2>/dev/null | head -20Length of output: 156
🏁 Script executed:
#!/bin/bash
# Examine filter_text implementation
echo "=== filter_text implementation ==="
grep -n -B10 -A20 "fn filter_text\|filter_text.*fn" ./modules/implementation/src/scripting/ctx/dflt.rs
echo ""
echo "=== Check for Result<> or error return types in surrounding context ==="
grep -n -B5 -A5 "filter_text" ./modules/implementation/src/scripting/ctx/dflt.rsLength of output: 535
Wrap filter_text call in pcall to handle potential failures.
The type inconsistency concern is invalid—ExecPromptTemplate processes templates (different data structure) rather than prompts, so no sanitization should apply there.
However, the error handling concern is valid: the Rust implementation of filter_text can fail (the underlying vm.from_value or to_str() calls propagate errors via the ? operator). The call at lines 82–86 should be wrapped in pcall:
local success, result = pcall(lib.rs.filter_text, args.prompt, {
'NFKC',
'RmZeroWidth',
'NormalizeWS'
})
if not success then
lib.rs.user_error({code = 'FILTER_FAILED', message = 'Prompt filtering failed'})
end
args.prompt = resultThis matches the error-handling pattern already used elsewhere in the codebase (e.g., genvm-web-default.lua line 57).
🤖 Prompt for AI Agents
In modules/install/config/genvm-llm-default.lua around lines 82–86, wrap the
lib.rs.filter_text(args.prompt, {...}) call in pcall to catch Rust-side
failures; on failure call lib.rs.user_error with {code='FILTER_FAILED',
message='Prompt filtering failed'} and on success assign the returned result
back to args.prompt, preserving the same normalization options
('NFKC','RmZeroWidth','NormalizeWS').
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
libs/libc.nix (1)
29-29: Fix path version mismatch in LDFLAGS.The LDFLAGS reference
clang/20in the compiler-rt path, but the compiler-rt package was updated to version 21.1.5-1 (Line 12). This path mismatch will likely cause build failures or incorrect linking when building for arm64 targets.Apply this diff to fix the path:
- LDFLAGS="-fuse-ld=lld ${if name-target == "arm64" then "./compiler-rt/usr/lib/clang/20/lib/linux/libclang_rt.builtins-aarch64.a" else ""}" \ + LDFLAGS="-fuse-ld=lld ${if name-target == "arm64" then "./compiler-rt/usr/lib/clang/21/lib/linux/libclang_rt.builtins-aarch64.a" else ""}" \
🧹 Nitpick comments (4)
executor/fuzz/genvm-executor.rs (4)
14-26: Remove duplicatesimd_enabledassignment.Line 16 and Line 21 both set
config.simd_enabled = false, which is redundant.Apply this diff to remove the duplicate:
fn wasm_smith_config() -> wasm_smith::Config { let mut config = wasm_smith::Config::default(); config.simd_enabled = false; config.bulk_memory_enabled = true; config.reference_types_enabled = false; config.relaxed_simd_enabled = false; config.saturating_float_to_int_enabled = false; - config.simd_enabled = false; config.threads_enabled = false; config.memory64_enabled = false; config }
49-66: Consider removing or documenting commented O_NONBLOCK code.Lines 50-61 contain commented-out code for setting
O_NONBLOCKflags. If this was intentionally disabled, add a comment explaining why. Otherwise, remove the dead code.
459-474: Clarify the purpose ofcalldata.extend_from_slice(&code).Line 464 has a cryptic comment "// just ok" after extending calldata with code. Consider adding a more descriptive comment explaining why this concatenation is necessary for the mock deployment behavior.
843-851: Clarify thewscncomment.Line 848 has a comment
// wscnwhich is unclear. If this represents permission flags (write, send, ?, nondet), consider making it more explicit, such as// Permission flags: write, send, nondet (read is always enabled).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
executor/fuzz/genvm-executor.rs(1 hunks)libs/libc.nix(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
executor/fuzz/genvm-executor.rs (4)
executor/src/host/message.rs (4)
len(21-23)len(50-52)zero(17-19)zero(46-48)executor/src/host/mod.rs (3)
new(37-39)read_u32(61-65)write_slice(76-83)executor/common/src/cancellation.rs (1)
make(25-44)executor/src/lib.rs (1)
run_with(133-251)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: module-test-python
🔇 Additional comments (5)
executor/fuzz/genvm-executor.rs (5)
144-162: LGTM!The
SequentialEventenum andHostAccumulatedDatastruct are well-designed for tracking host interactions during fuzzing.
577-635: LGTM!The contract generation logic correctly packages two WASM modules with appropriate imports, exports, and runner configuration. The use of
CompressionMethod::Storedis appropriate for fuzzing where speed is prioritized.
649-669: LGTM!The timeout mechanism is well-implemented with proper cancellation token propagation. The 20-second timeout is reasonable for a fuzzing harness to avoid hanging on infinite loops or deadlocks.
772-841: LGTM!The fuzzing orchestration correctly executes leader and validator runs with proper result comparison. Creating a new Tokio runtime per iteration (lines 828-832) ensures clean state between fuzz inputs, which is appropriate for fuzzing despite the performance cost.
853-868: Verify permission coverage for fuzzing completeness.The
get_perms()method always includes'r'unconditionally and has nocan_readfield to disable it. This means fuzzing never tests scenarios without read permission. Additionally, the validation logic (per past review context) supports a'c'permission that is not represented inFuzzingInputfields.Consider whether the fuzzing harness should test:
- Scenarios without read permission (add
can_readfield, make'r'conditional)- The
'c'permission (add corresponding field)If always granting read permission is intentional for fuzzing, add a comment explaining this design choice.
Note: The past review comment about duplicate
'r'characters does not apply to the current code, which only includes one'r'(the unconditional initial one).
| afl::fuzz!(|data: FuzzingInput| { | ||
| let res = run(data); | ||
|
|
||
| if let Err(err) = &res { | ||
| if err.downcast_ref::<FuzzNext>().is_none() { | ||
| eprintln!("error: {err:?}"); | ||
| panic!("error detected!!!"); | ||
| } else { | ||
| log_info!(error:ah = err; "fuzz next"); | ||
| } | ||
| } | ||
| }); |
There was a problem hiding this comment.
Fix probable typo in logging macro.
Line 885 uses error:ah which appears to be a typo. This should likely be error:? for debug formatting.
Apply this diff:
} else {
- log_info!(error:ah = err; "fuzz next");
+ log_info!(error:? = err; "fuzz next");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| afl::fuzz!(|data: FuzzingInput| { | |
| let res = run(data); | |
| if let Err(err) = &res { | |
| if err.downcast_ref::<FuzzNext>().is_none() { | |
| eprintln!("error: {err:?}"); | |
| panic!("error detected!!!"); | |
| } else { | |
| log_info!(error:ah = err; "fuzz next"); | |
| } | |
| } | |
| }); | |
| afl::fuzz!(|data: FuzzingInput| { | |
| let res = run(data); | |
| if let Err(err) = &res { | |
| if err.downcast_ref::<FuzzNext>().is_none() { | |
| eprintln!("error: {err:?}"); | |
| panic!("error detected!!!"); | |
| } else { | |
| log_info!(error:? = err; "fuzz next"); | |
| } | |
| } | |
| }); |
🤖 Prompt for AI Agents
In executor/fuzz/genvm-executor.rs around lines 877 to 888, the logging macro
uses a probable typo `error:ah` which is invalid — replace it with `error:?` (or
the correct key/value formatting for the logging macro you use) so the error is
formatted for debug output; update the log_info! invocation to pass the error
with `error:? = err` (or the macro-specific equivalent) and keep the message
"fuzz next".
e36f6ab to
ecba2d6
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
executor/fuzz/genvm-executor.rs (1)
877-888: Fix probable typo in logging macro.Line 885 uses
error:ahwhich appears to be a typo. This should likely beerror:?for debug formatting.Apply this diff:
} else { - log_info!(error:ah = err; "fuzz next"); + log_info!(error:? = err; "fuzz next"); }
🧹 Nitpick comments (6)
executor/fuzz/genvm-executor.rs (6)
14-26: Remove duplicatesimd_enabledassignment.Lines 16 and 21 both set
config.simd_enabled = false. The second assignment is redundant.Apply this diff:
fn wasm_smith_config() -> wasm_smith::Config { let mut config = wasm_smith::Config::default(); config.simd_enabled = false; config.bulk_memory_enabled = true; config.reference_types_enabled = false; config.relaxed_simd_enabled = false; config.saturating_float_to_int_enabled = false; - config.simd_enabled = false; config.threads_enabled = false; config.memory64_enabled = false; config }
48-66: Consider uncommenting the O_NONBLOCK setup or clarify why it's omitted.Lines 50-61 contain commented-out code that would set the file descriptors to non-blocking mode. Converting
std::fs::Filetotokio::fs::Filewithout settingO_NONBLOCKcan cause blocking I/O operations on the Tokio runtime, potentially leading to thread pool starvation and poor async performance.If non-blocking mode was intentionally omitted due to issues, please document why; otherwise, consider uncommenting and fixing those lines.
422-435: Consider more efficient vector resizing.Lines 426-428 grow
collected_eq_outputsby pushing emptyVecs in a loop. This is safe but could be replaced withresizefor clarity and efficiency.Apply this diff:
let call_no = self.sock.read_u32().await?; let result = self.sock.read_slice().await?; - while self.collected_eq_outputs.len() <= call_no as usize { - self.collected_eq_outputs.push(Vec::new()); - } + if self.collected_eq_outputs.len() <= call_no as usize { + self.collected_eq_outputs.resize(call_no as usize + 1, Vec::new()); + } // Store the result for later comparison self.collected_eq_outputs[call_no as usize] = result;
459-474: Clarify the calldata+code concatenation.Line 464 concatenates
calldataandcodewith the comment// just ok, which is unclear. This appears to be a mock simplification, but consider adding a more detailed comment explaining why this differs from the real implementation.
762-770: Remove unnecessaryClonebound onF.The
Clonebound onFinCallOnDrop<F: FnOnce() + Clone>is unnecessary since the closure is stored in anOptionand only taken once.FnOncealone is sufficient.Apply this diff:
-struct CallOnDrop<F: FnOnce() + Clone>(Option<F>); +struct CallOnDrop<F: FnOnce()>(Option<F>); -impl<F: FnOnce() + Clone> Drop for CallOnDrop<F> { +impl<F: FnOnce()> Drop for CallOnDrop<F> { fn drop(&mut self) { if let Some(foo) = self.0.take() { foo(); } } }
853-868: Unconditionally granting read permission may not be intended.Line 855 initializes the permission string with
"r", meaning read permission is always granted regardless of the fuzzing input. If read permission should be conditional (like write, send, and nondet), consider adding acan_readfield toFuzzingInputand building the permission string from an empty string.Based on learnings (from past review comments indicating the validation logic requires unique permissions), this aligns with the previous suggestion to start with an empty string.
Apply this diff if read permission should be conditional:
+ can_read: bool, can_write: bool, // wscn can_send: bool, can_nondet: bool, } impl FuzzingInput { fn get_perms(&self) -> String { - let mut perms = "r".to_owned(); + let mut perms = String::new(); + if self.can_read { + perms.push('r'); + } if self.can_write { perms.push('w'); }Alternatively, if read permission should always be granted, document this decision with a comment.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
executor/fuzz/genvm-executor.rs(1 hunks)libs/libc.nix(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
executor/fuzz/genvm-executor.rs (3)
executor/src/host/mod.rs (3)
new(37-39)read_u32(61-65)write_slice(76-83)executor/common/src/cancellation.rs (1)
make(25-44)executor/src/lib.rs (2)
run_with(133-251)create_supervisor(31-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: module-test-python
🔇 Additional comments (14)
libs/libc.nix (2)
29-29: Path reference correctly updated to match compiler-rt v21.The LDFLAGS path now correctly references
clang/21/to align with the compiler-rt version 21.1.5-1 update, resolving the previously flagged critical issue.
12-13: Version update is internally consistent.The compiler-rt version bump to 21.1.5-1 is correctly reflected in both the tarball URL and the library path reference.
Also applies to: 29-29
executor/fuzz/genvm-executor.rs (12)
28-41: LGTM!The
FuzzNexterror type cleanly signals when a fuzz input should be skipped without failing the fuzzer.
144-162: LGTM!The
SequentialEventtype cleanly models the observable events and messages from contract execution, and the data structures are appropriately defined.
180-239: LGTM!The mock host writer helpers correctly implement the length-prefixed binary protocol using async I/O.
241-326: LGTM!The paged storage implementation correctly handles cross-page reads and writes with proper chunking and zero-filling for missing pages. The 32-byte page size and arithmetic are consistent throughout.
328-403: LGTM!The mock host protocol handlers correctly implement the expected message flow, with appropriate simplifications (ignoring account/mode) for fuzzing purposes.
505-556: LGTM!The remaining mock host methods correctly handle events, balance queries, fuel, and appropriately fail on
NotifyNondetDisagreementwhich should not occur during fuzzing.
559-575: LGTM!The trait implementations correctly delegate to the underlying file descriptors.
577-635: LGTM!The contract generation correctly creates a runner ZIP with two wasm modules and configuration. The asymmetry in name section handling (only
mod_agets named) appears intentional.
637-669: LGTM!The timeout mechanism correctly sets up a 20-second deadline and provides both a cancellation token and a flag to distinguish timeout from normal completion.
671-760: LGTM!The
run_withfunction correctly orchestrates the GenVM execution with mock host, timeout handling, and result collection. The configuration appropriately disables caching and sets up the registry path.
772-823: LGTM!The fuzzing orchestration correctly runs leader and validator executions, then performs comprehensive deterministic comparison of fingerprints, messages, return values, and final results.
825-841: LGTM!The run function correctly generates the contract, creates a fresh Tokio runtime for isolation, and executes the fuzzing iteration.
ecba2d6 to
6a7c82b
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.git-third-party/config.json(1 hunks).git-third-party/patches/executor/third-party/wasm-tools/1(1 hunks)executor/fuzz/genvm-executor.rs(1 hunks)libs/libc.nix(2 hunks)modules/implementation/src/manager/handlers.rs(1 hunks)modules/implementation/src/manager/run.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- executor/fuzz/genvm-executor.rs
🧰 Additional context used
🧬 Code graph analysis (1)
modules/implementation/src/manager/handlers.rs (1)
modules/implementation/src/manager/run.rs (1)
get_current_permits(101-103)
🔇 Additional comments (6)
.git-third-party/config.json (1)
6-6: The patch counts are correct and match actual patch files.The original review comment was based on an incorrect assumption about patch file location. Patches are stored in
.git-third-party/patches/executor/third-party/{repo}/{number}, not inexecutor/third-party/{repo}/patches/.Verification confirms:
wasmtime: config declares 5 patches → 5 files present ✓wasm-tools: config declares 1 patch → 1 file present ✓Likely an incorrect or invalid review comment.
modules/implementation/src/manager/handlers.rs (1)
20-20: LGTM: correctly updated to match synchronous signature.The removal of
.awaitcorrectly aligns with the signature change ofget_current_permits()inrun.rs.modules/implementation/src/manager/run.rs (1)
101-103: Async keyword removal is correct and all call sites are properly updated.The change from
async fntofnis correct sinceSemaphore::available_permits()is synchronous. Verification confirms the only call site inhandlers.rs:20correctly calls the method without.await..git-third-party/patches/executor/third-party/wasm-tools/1 (1)
15-21: LGTM! Improved assertion diagnostic.Adding the debug output for
tywhen the assertion fails will help diagnose issues during fuzzing, making failures easier to troubleshoot.libs/libc.nix (2)
29-29: LGTM! Path now matches compiler-rt version 21.The LDFLAGS path correctly references
clang/21/lib/linux/, which is consistent with the compiler-rt version 21.1.5-1 update on lines 11-14. This ensures the library file will be found during the build for arm64 targets.
11-14: Verify sha256 checksum for compiler-rt-21.1.5-1 against official Arch repository.The package endpoint is reachable (HTTP 302 redirect confirmed), and compiler-rt 21 is generally compatible with musl. The version update to 21.1.5-1 paired with the path update to clang/21 on line 29 shows internal consistency.
However, the exact sha256 checksum cannot be verified in this environment. Manually confirm the checksum
1z3p3r8bbbhnh8sda5xp6n15n0593lbghrcn5p34mfifcriyr435against the official Arch Linux ARM package repository to ensure integrity before merging.
| @@ -6,7 +6,8 @@ impl Module { | ||
| self.encoded().finish() | ||
| } | ||
|
|
||
| - fn encoded(&self) -> wasm_encoder::Module { | ||
| + /// no doc | ||
| + pub fn encoded(&self) -> wasm_encoder::Module { |
There was a problem hiding this comment.
Replace placeholder doc comment with meaningful documentation.
The "/// no doc" comment is a placeholder. Since this method is being exposed publicly for fuzzing purposes, it should have proper documentation explaining:
- What the method does
- Why it's public (fuzzing use case)
- Any constraints or expected usage patterns
Apply this diff to improve the documentation:
- /// no doc
+ /// Returns the encoded WASM module for fuzzing/testing purposes.
+ ///
+ /// This method is exposed publicly to enable fuzzing workflows that need
+ /// access to the intermediate encoded module representation.
pub fn encoded(&self) -> wasm_encoder::Module {Note: Patching third-party dependencies requires maintenance when updating. Consider documenting the rationale for this patch in a README or comment file within the .git-third-party/patches/ directory.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @@ -6,7 +6,8 @@ impl Module { | |
| self.encoded().finish() | |
| } | |
| - fn encoded(&self) -> wasm_encoder::Module { | |
| + /// no doc | |
| + pub fn encoded(&self) -> wasm_encoder::Module { | |
| self.encoded().finish() | |
| } | |
| /// Returns the encoded WASM module for fuzzing/testing purposes. | |
| /// | |
| /// This method is exposed publicly to enable fuzzing workflows that need | |
| /// access to the intermediate encoded module representation. | |
| pub fn encoded(&self) -> wasm_encoder::Module { |
🤖 Prompt for AI Agents
.git-third-party/patches/executor/third-party/wasm-tools/1 around lines 28 to
34: replace the placeholder "/// no doc" with a proper doc comment that explains
what encoded(&self) returns (a wasm_encoder::Module representing this Module's
encoded bytes), why the method is pub (exposed for fuzzing/inspection by test
harnesses), and any constraints or expected usage (e.g., callers should not
mutate the returned Module, returned Module reflects current state and may be
expensive to build, and intended only for testing/fuzzing clients). Keep the doc
concise, mention the fuzzing rationale so future maintainers understand the
public visibility, and consider adding a short note near the patch metadata
about maintaining this third-party change on updates.
closes GVM-148
closes GVM-152
closes GVM-151
closes GVM-149
closes GVM-141
closes GVM-131
Summary by CodeRabbit
New Features
Tests
Improvements
Chores