fix(fuzz): make default pass selection complete and add runtime equivalence boundary

## Summary

During a non-invasive local review of Azoth’s fuzzing path, I found that the current fuzzer appears to miss one of the default obfuscation passes and that `--check-deploy` validates deployment success rather than runtime behavioral equivalence.

This is not a production exploit claim. The issue is about assurance coverage: Azoth is positioned as a deterministic EVM bytecode obfuscator intended to make Mirage execution contracts indistinguishable from ordinary unverified deployments. Because of that role, fuzzing should ideally exercise all default passes and distinguish deployability from semantic/runtime equivalence.

The concrete issue is that the default pass list contains four passes, but the fuzzing pass-selection mask appears to generate only three bits, making `string_obfuscate` unreachable through the current randomized pass selection path.

## Affected components

```text
crates/cli/src/commands/fuzz.rs
crates/cli/src/commands/mod.rs
```

## Technical description

The default pass string includes:

```text
arithmetic_chain, push_split, slot_shuffle, string_obfuscate
```

However, the fuzzer appears to select passes using:

```rust
let passes = passes_from_bits((rng.next_u32() % 8) as u8);
```

Since `% 8` only produces values from `0` through `7`, only the low three bits (bits 0 to 2) can ever be set. If the default passes are mapped sequentially, bit 3 is required to select the fourth pass, `string_obfuscate`. Under the current mask, that pass is not reachable through the randomized pass-combination path.

This narrows fuzzing coverage relative to the default transform set.
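The reachability argument can be checked in isolation with a minimal sketch. It assumes that bit `i` selects the i-th entry of the default pass string; the `passes_from_bits` below is a hypothetical stand-in written for illustration, not the real function in `fuzz.rs`, and `reachable_passes` is a helper introduced only for this example.

```rust
use std::collections::BTreeSet;

// Assumed to mirror the default pass string quoted above.
const DEFAULT_PASSES: &str = "arithmetic_chain, push_split, slot_shuffle, string_obfuscate";

/// Hypothetical stand-in for the real `passes_from_bits`:
/// bit i of `bits` selects the i-th default pass (sequential mapping assumed).
fn passes_from_bits(bits: u8) -> Vec<&'static str> {
    DEFAULT_PASSES
        .split(',')
        .map(str::trim)
        .enumerate()
        // Only the first 8 passes can be addressed by a u8 mask.
        .filter(|(i, _)| *i < 8 && bits & (1u8 << *i) != 0)
        .map(|(_, name)| name)
        .collect()
}

/// Collect every pass that at least one mask in `0..modulus` can select.
fn reachable_passes(modulus: u32) -> BTreeSet<&'static str> {
    (0..modulus)
        .flat_map(|mask| passes_from_bits(mask as u8))
        .collect()
}
```

Under these assumptions, `reachable_passes(8)` contains three passes and omits `string_obfuscate`, while `reachable_passes(16)` contains all four. If the real mapping is not sequential, the unreachable pass may be a different one, but one of the four would still be excluded by a three-bit mask.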

A second, broader assurance-boundary issue is that `--check-deploy` appears to compare deployment success, not runtime equivalence. Deployment success is useful, but it is weaker than checking that original and obfuscated bytecode behave equivalently over calldata, state, environment, reverts, logs, returndata, external calls, storage writes, and gas.

For a privacy-critical obfuscation pipeline, a contract can deploy successfully while still diverging at runtime or exposing stable runtime artifacts. Naturally, EVM bytecode finds a way to be annoying exactly where the happy path stops looking.

## Proof of Concept

Run the following local inspection commands:

```bash
rg -n "DEFAULT_PASSES|passes_from_bits|rng.next_u32\\(\\) % 8|Contract::ALL|check_deploy" crates/cli/src/commands
```

Expected relevant observations:

```text
DEFAULT_PASSES = "arithmetic_chain, push_split, slot_shuffle, string_obfuscate"
```

and:

```rust
let passes = passes_from_bits((rng.next_u32() % 8) as u8);
```

Because `% 8` yields only three usable bits, the fourth default pass is not selected by this fuzzing path.

Then run smoke fuzzing:

```bash
cargo run --bin azoth -- fuzz -i 100
cargo run --bin azoth -- fuzz -i 100 --check-deploy
```

Observed local result:

```text
cargo run --bin azoth -- fuzz -i 100

Iterations: 114
Successes: 100
Errors: 0
Unique crashes saved: 0
```

```text
cargo run --bin azoth -- fuzz -i 100 --check-deploy

Iterations: 114
Successes: 100
Errors: 0
Deployment mismatches: 0
Unique crashes saved: 0
```

The successful smoke runs are good, but they do not exercise `string_obfuscate` through the current bitmask and do not establish runtime equivalence.

The `Iterations: 114` value for `-i 100` also suggests the parallel worker counter can overshoot the requested fuzzing budget. That is not the main issue here, but exact iteration accounting would improve reproducibility for CI and research reporting.

## Trace / evidence

Default passes:

```text
arithmetic_chain
push_split
slot_shuffle
string_obfuscate
```

Current randomized mask:

```rust
rng.next_u32() % 8
```

Reachable bit positions:

```text
bit 0 -> reachable
bit 1 -> reachable
bit 2 -> reachable
bit 3 -> not reachable
```

Therefore:

```text
string_obfuscate -> not selected by current randomized pass mask
```

Runtime-equivalence boundary:

```text
--check-deploy checks deployment success
--check-deploy does not compare runtime traces
--check-deploy does not compare returndata
--check-deploy does not compare revert data
--check-deploy does not compare logs
--check-deploy does not compare storage writes
--check-deploy does not compare external-call effects
--check-deploy does not compare gas behavior
```

## Impact

The main impact is reduced fuzzing assurance.

If one default pass is unreachable, bugs or distinguishability artifacts specific to that pass may remain undetected. This is especially relevant for `string_obfuscate`, because revert strings, error payloads, and string-like byte sequences can be externally observable or classifier-visible depending on how they are transformed.

The runtime-equivalence gap is also important. Deployment success is necessary, but not sufficient, for an obfuscation system. Original and obfuscated bytecode should ideally be compared over runtime behavior, including success/revert status, returndata, revert payloads, logs, storage effects, external calls, and gas deltas.

At Mirage level, this matters because Azoth is intended to support indistinguishability of execution contracts. A transform can preserve deployability while still creating runtime-visible divergence or stable classifier features.

## Recommended mitigation

1. Replace the hardcoded `% 8` mask with a mask derived from the number of default passes.

For example, derive the maximum mask from the pass count instead of hardcoding three bits:

```rust
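// Derive the mask width from the default pass list instead of hardcoding 3 bits.
let pass_count = DEFAULT_PASSES.split(',').count() as u32;
// `passes_from_bits` takes a u8, so the mask can address at most 8 passes.
assert!(pass_count <= 8, "pass mask does not fit in u8");
let mask_limit = 1u32 << pass_count; // 16 for four passes; never zero, so `%` cannot panic
let mask = (rng.next_u32() % mask_limit) as u8;
let passes = passes_from_bits(mask);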
```

2. Add a unit test proving every default pass is reachable through the fuzz pass-selection mechanism.

Suggested test intent:

```rust
#[test]
fn fuzz_pass_selection_can_reach_every_default_pass() {
    let default_passes: Vec<_> = DEFAULT_PASSES
        .split(',')
        .map(|p| p.trim())
        .collect();

    for expected_pass in &default_passes {
        let reachable = (0u32..(1u32 << default_passes.len()))
            .any(|mask| passes_from_bits(mask as u8).contains(expected_pass));

        assert!(
            reachable,
            "default pass {expected_pass} is not reachable by fuzz pass selection"
        );
    }
}
```

3. Consider renaming or documenting `--check-deploy` as a deployability check rather than an equivalence check.

4. Add a future `--check-runtime-equivalence` or equivalent mode using REVM differential execution.

That mode should compare original and obfuscated runtime behavior over generated calldata, state, and environment.

Suggested comparison fields:

```text
status: success / revert / halt / invalid / out-of-gas
returndata
revert data
logs and topics
storage writes
external calls and call outcomes
gas used
gas remaining
generated calldata
generated environment
```

5. Make fuzz iteration accounting exact where possible, especially for CI and research runs.
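One way to make the budget exact, sketched under the assumption that the overshoot comes from parallel workers checking a shared counter and incrementing it in separate steps. The cause has not been confirmed in the Azoth source, and `try_claim` and `run_with_exact_budget` are hypothetical names. The idea is that a worker must atomically claim an iteration before running it, so the total cannot exceed the budget.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Try to claim one iteration from a shared budget.
/// The check and the increment happen in one atomic update, so the number
/// of successful claims can never exceed `budget`.
fn try_claim(claimed: &AtomicUsize, budget: usize) -> bool {
    claimed
        .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| {
            if n < budget { Some(n + 1) } else { None }
        })
        .is_ok()
}

/// Run `workers` threads that loop until the budget is exhausted.
/// Returns the number of iterations actually executed.
fn run_with_exact_budget(budget: usize, workers: usize) -> usize {
    let claimed = AtomicUsize::new(0);
    let executed = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| {
                while try_claim(&claimed, budget) {
                    // A real worker would run one fuzz case here.
                    executed.fetch_add(1, Ordering::SeqCst);
                }
            });
        }
    });
    executed.load(Ordering::SeqCst)
}
```

With this pattern, `run_with_exact_budget(100, 8)` executes exactly 100 iterations regardless of thread scheduling. If exact accounting is not desirable for throughput reasons, documenting the overshoot is a reasonable alternative.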

## Suggested regression tests

```text
1. every pass in DEFAULT_PASSES is reachable by fuzz pass selection
2. string_obfuscate appears in at least one generated fuzz pass combination
3. --check-deploy remains deployability-only and is documented as such
4. runtime differential smoke test compares original vs obfuscated execution for a small fixture
5. fuzzing with -i N reports an exact or explicitly documented iteration budget
```

## Suggested invariant

For a default pass set `P`, the fuzzing pass-selection function must be capable of selecting every pass `p ∈ P`.

For runtime assurance, for every fuzz-generated input where original deployment succeeds, the obfuscated deployment should succeed and generated runtime transactions should produce equivalent observable results under the selected equivalence relation:

```text
status
returndata
revert data
logs
storage delta
external call effects
gas policy
```
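The equivalence relation above could be expressed as a plain comparison over recorded outcomes. The following is a minimal sketch: `Status`, `Outcome`, `GasPolicy`, and `first_divergence` are hypothetical names introduced for illustration and do not correspond to REVM or Azoth types. External call effects are omitted for brevity, and the gas policy is left configurable on the assumption that obfuscation may legitimately change gas usage.

```rust
use std::collections::BTreeMap;

/// Hypothetical execution status; not a REVM or Azoth type.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Status {
    Success,
    Revert,
    Halt,
}

/// Hypothetical summary of one runtime call; all field names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Outcome {
    status: Status,
    output: Vec<u8>,                             // returndata or revert data
    logs: Vec<(Vec<[u8; 32]>, Vec<u8>)>,         // (topics, data) per log
    storage_delta: BTreeMap<[u8; 32], [u8; 32]>, // slot -> final value
    gas_used: u64,
}

/// How strictly gas is compared between the two executions.
enum GasPolicy {
    Ignore,
    MaxDelta(u64),
}

/// Return the name of the first field that differs, or None if the two
/// outcomes are equivalent under the chosen gas policy.
fn first_divergence(orig: &Outcome, obf: &Outcome, gas: &GasPolicy) -> Option<&'static str> {
    if orig.status != obf.status {
        return Some("status");
    }
    if orig.output != obf.output {
        return Some("output");
    }
    if orig.logs != obf.logs {
        return Some("logs");
    }
    if orig.storage_delta != obf.storage_delta {
        return Some("storage delta");
    }
    match gas {
        GasPolicy::Ignore => None,
        GasPolicy::MaxDelta(limit) if orig.gas_used.abs_diff(obf.gas_used) > *limit => Some("gas"),
        GasPolicy::MaxDelta(_) => None,
    }
}
```

Reporting the first diverging field, rather than a bare boolean, keeps saved crash artifacts easier to triage.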


