
Interpreter fixes: Ore stack simplification, resolver improvements, and IDL parsing#60

Merged
adiman9 merged 60 commits into main from interpreter-fixes
Mar 19, 2026

Conversation


@adiman9 adiman9 commented Mar 17, 2026

Summary

This PR introduces significant improvements to the interpreter and resolver system, particularly around the Ore stack implementation.

Key Changes

Ore Stack Simplification

  • Simplified the Ore stack architecture (internally referred to as keccak_rng - a working name for now)
  • Updated Ore stack IDL with program address and improved formatting
  • Added Yellowstone gRPC dependencies for enhanced streaming capabilities
  • Enabled treasury streaming in TypeScript examples

Resolver System Fixes

  • Fixed cross-account lookup resolution at round boundaries
  • Implemented skip_resolvers mechanism for stale data reprocessing
  • Removed duplicate URL path qualification in nested struct resolver
  • Restored proper entity rollover handling

Core Interpreter Improvements

  • Added Keccak256 hashing and slot hash caching
  • Improved slot scheduler with notification-based waiting
  • Added rt-multi-thread feature support
  • Enhanced logging throughout the resolver pipeline

IDL Parsing

  • Improved IDL parsing coverage and error handling
  • Added optional chaining for safer property access in examples

Cleanup

  • Removed excessive debug logging from resolver pipeline

Testing

  • Updated TypeScript examples with safer property access patterns

Note: keccak_rng is a working internal name and is acknowledged as temporary.

adiman9 added 15 commits March 16, 2026 01:18
Fixes three bugs introduced after the Scheduled URL Resolver merge that
broke cross-account field mapping (lookup_index with register_from) at
round boundaries:

1. Compiler key resolution priority swap — resolved_key_reg (PDA reverse
   lookup result) was given priority over the LookupIndex result, causing
   mutations to be keyed by round_address instead of round_id.

2. Null-key handler execution — when LookupIndex returned null, handlers
   continued with key=null creating phantom entities that prevented
   queueing for reprocessing. Added AbortIfNullKey opcode.

3. Stale lookup index entries — PDA mapping changes at round boundaries
   left old entries that shadowed the updated mapping. Added PDA change
   detection, stale index clearing, and last_account_data cache for
   reprocessing.
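The AbortIfNullKey guard described in point 2 can be sketched as follows. The `Opcode` enum, handler loop, and mutation shape are illustrative names, not the actual interpreter API; the point is that a null key aborts with an empty mutation set instead of creating a phantom entity.

```rust
#[derive(Debug, PartialEq)]
enum Opcode {
    AbortIfNullKey,
    EmitMutation(&'static str),
}

fn run_handler(key: Option<&str>, ops: &[Opcode]) -> Result<Vec<String>, String> {
    let mut mutations = Vec::new();
    for op in ops {
        match op {
            // Bail out with an empty mutation set instead of continuing
            // with key=null and creating a phantom entity.
            Opcode::AbortIfNullKey => {
                if key.is_none() {
                    return Ok(Vec::new());
                }
            }
            Opcode::EmitMutation(field) => {
                mutations.push(format!("{}={}", key.unwrap_or("null"), field));
            }
        }
    }
    Ok(mutations)
}

fn main() {
    let ops = [Opcode::AbortIfNullKey, Opcode::EmitMutation("score")];
    // Null key: handler aborts cleanly, no phantom entity.
    assert_eq!(run_handler(None, &ops).unwrap().len(), 0);
    // Resolved key: mutation is emitted as usual.
    assert_eq!(run_handler(Some("round_42"), &ops).unwrap(), vec!["round_42=score".to_string()]);
    println!("ok");
}
```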
- Add Keccak256 computed expression for cryptographic hashing
- Implement slot_hash_bytes and expires_at_slot_hash fields in Ore stack
- Add pre_reveal_rng and pre_reveal_winning_square computed fields
- Add sha3 dependency to hyperstack runtime features
- Add skip_resolvers flag to UpdateContext for reprocessed cached data
- Create UpdateContext::new_reprocessed() for PDA mapping change scenarios
- Add is_stale_reprocess flag to PendingAccountUpdate
- Skip QueueResolver opcodes when processing stale reprocessed data
- Add slot_hash_cache module for efficient slot hash lookups
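The skip_resolvers flow above can be sketched roughly as follows; the struct and constructor names mirror the commit message but are assumptions, not the real UpdateContext API.

```rust
#[derive(Debug, Clone)]
struct UpdateContext {
    slot: u64,
    // Set for cached data replayed after a PDA mapping change, so the VM
    // skips QueueResolver opcodes and does not re-schedule resolver work.
    skip_resolvers: bool,
}

impl UpdateContext {
    fn new(slot: u64) -> Self {
        Self { slot, skip_resolvers: false }
    }

    // Constructor for stale-data reprocessing scenarios.
    fn new_reprocessed(slot: u64) -> Self {
        Self { slot, skip_resolvers: true }
    }
}

// Checked wherever a QueueResolver opcode would otherwise fire.
fn should_queue_resolver(ctx: &UpdateContext) -> bool {
    !ctx.skip_resolvers
}

fn main() {
    assert!(should_queue_resolver(&UpdateContext::new(100)));
    assert!(!should_queue_resolver(&UpdateContext::new_reprocessed(100)));
    println!("ok");
}
```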
…nced logging

- Add slot tracker notification support with 5s polling fallback
- Export generate_computed_expr_code_with_cache for intra-section caching
- Add detailed tracing for scheduler callback processing
- Improve condition evaluation visibility with field value logging
- Add SetOnce guard logging to track skipped callbacks
- Add yellowstone-grpc-client and yellowstone-grpc-proto dependencies
- Update Ore React components with latest schema changes
- Update Ore TypeScript example with local development URL
- Update hyperstack-server with new health check improvements
- Sync all Cargo.lock files with new dependencies

vercel bot commented Mar 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: hyperstack-docs
Status: Ready
Actions: Preview, Comment
Updated (UTC): Mar 19, 2026 8:57am



greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR is a substantial second iteration that resolves the majority of issues raised in the previous review cycle. It delivers Ore stack slot-hash infrastructure, resolver improvements, IDL parsing fixes, and a reworked AbortIfNullKey guard — with most prior concerns properly addressed.

What changed

  • AbortIfNullKey opcode now correctly returns Ok(Vec::new()) and uses a compile-time is_account_event flag (from SourceSpec) rather than a runtime string heuristic
  • SlotHashResolver now returns the correct { bytes: [...] } object shape matching the TypeScript interface; keccak_rng returns a String-serialized u64 to avoid JavaScript precision loss
  • json_array_to_bytes uses u8::try_from to reject out-of-range values; disc.value as u8 replaced with u8::try_from + warn fallback
  • Steel IDL detection switched to all() with an empty-array guard, fixing both the vacuous-true and single-instruction misclassification bugs
  • new_multi_entity() now allocates 256 registers (was 32); ore::RoundState hardcoded string removed from the generic VM
  • Generated code caches account data for all state IDs (not just state_id = 0), fixing round-boundary PDA mapping reprocessing
  • Slot subscription task: TLS is now conditional on endpoint scheme; _keep_alive (not _) retains the subscription sender; parse_and_cache_slot_hashes is a plain fn; checked_mul/checked_add prevent overflow
  • TypeScript SDK: SlotHashBytes/KeccakRngValue types used consistently; rng and pre_reveal_rng typed as string
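The u8::try_from hardening mentioned for json_array_to_bytes can be sketched like this; the JSON layer is elided and plain i64s stand in for parsed JSON numbers, so the function name and signature are illustrative.

```rust
// Values outside 0..=255 become an error instead of silently
// truncating, as `value as u8` would.
fn array_to_bytes(values: &[i64]) -> Result<Vec<u8>, String> {
    values
        .iter()
        .map(|&v| u8::try_from(v).map_err(|_| format!("value {} out of u8 range", v)))
        .collect()
}

fn main() {
    assert_eq!(array_to_bytes(&[0, 127, 255]).unwrap(), vec![0u8, 127, 255]);
    // 256 would have wrapped to 0 under `as u8`; try_from rejects it.
    assert!(array_to_bytes(&[256]).is_err());
    assert!(array_to_bytes(&[-1]).is_err());
    println!("ok");
}
```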

Issues found

  • SlotTracker::record() calls notify_waiters() which requires a waiter to already be registered. If record() fires while the scheduler is between iterations (processing callbacks), the notification is lost and the scheduler delays up to 5 seconds. notify_one() would store a permit and avoid this race.
  • GLOBAL_SLOT_TRACKER, init_global_slot_tracker, and the module-level get_slot_hash function added to health.rs are dead code — never called from the generated code or anywhere else. The actual slot hash path goes through interpreter/src/slot_hash_cache.rs. The SlotTracker::record_slot_hash/get_slot_hash methods are similarly unreachable. This creates confusion about which cache is authoritative.
  • SlotTracker::record_slot_hash holds the write lock during a full O(n) traversal + collect + remove-loop; hashes.retain(|&s, _| s >= threshold) would be simpler and avoids the intermediate Vec.

Confidence Score: 4/5

  • Safe to merge with minor fixes — the notify_waiters race has a bounded 5-second fallback and the dead code is non-breaking.
  • The vast majority of prior review issues are correctly addressed. The two new concerns (missed-notification race, dead code in health.rs) are real but low-severity: the race only introduces a latency spike bounded by the 5-second fallback rather than data corruption, and the dead code is simply unused infrastructure rather than a correctness bug. Core VM, resolver, IDL, and TypeScript SDK changes are solid.
  • rust/hyperstack-server/src/health.rs — notify_waiters() race and unused GLOBAL_SLOT_TRACKER infrastructure need attention before the slot-notification path is relied upon in production.

Important Files Changed

  • rust/hyperstack-server/src/health.rs: Adds Notify to SlotTracker for event-driven wakeups, plus a GLOBAL_SLOT_TRACKER infrastructure with record_slot_hash/get_slot_hash. The notify_waiters() call can miss notifications if fired between scheduler iterations (should be notify_one()). The GLOBAL_SLOT_TRACKER and its helper functions are dead code — they are never called from anywhere in the codebase.
  • interpreter/src/slot_hash_cache.rs: New module providing a global BTreeMap-backed slot hash cache. Uses pop_first() for O(log n) LRU eviction — correctly addresses the prior O(n) full-scan concern. Clean and well-tested.
  • interpreter/src/resolvers.rs: Adds SlotHashResolver with slot_hash and keccak_rng computed methods. All previously flagged issues are resolved: evaluate_slot_hash returns Value::Object({bytes:[...]}) matching the TypeScript interface; keccak_rng returns Value::String (not Value::Number) avoiding u64 precision loss; json_array_to_bytes uses u8::try_from to reject out-of-range values; is_output_type uses the extra_output_types() trait method instead of raw string scanning.
  • interpreter/src/vm.rs: Addresses all prior concerns: AbortIfNullKey now correctly returns Ok(Vec::new()) on null key; new_multi_entity() allocates 256 registers (not 32); ore::RoundState hardcoded string removed from is_fresh_update; skip_resolvers flag correctly prevents stale-data scheduling via new_reprocessed context.
  • interpreter/src/compiler.rs: Adds AbortIfNullKey opcode. The is_account_event flag is now derived from SourceSpec::Source { is_account: true } at compile time — correctly replacing the fragile runtime string heuristic noted in the prior review.
  • hyperstack-macros/src/codegen/vixen_runtime.rs: Adds a dedicated gRPC slot subscription task. All prior concerns addressed: TLS is now conditional on https:///grpcs:// prefix; _keep_alive binding (not _) keeps the subscription sender alive; parse_and_cache_slot_hashes is now a plain fn; integer overflow uses checked_mul/checked_add; cache_last_account_data now iterates all state IDs. The notified() wakeup from SlotTracker has a missed-notification race (see health.rs comment).
  • hyperstack-idl/src/snapshot.rs: Fixes Steel IDL detection: all() replaces any() to avoid single-instruction misclassification; empty-array guard prevents vacuous-true; u8::try_from(disc.value) with a warning fallback replaces the silent truncating as u8 cast. All prior issues addressed.
  • stacks/sdk/typescript/src/ore/index.ts: Prior undefined-identifier issues (SlotHash, SlotHashSchema, SlotHashTypes) all fixed; correct names SlotHashBytes/SlotHashBytesSchema used throughout. rng and pre_reveal_rng now typed as KeccakRngValue = string avoiding u64 precision loss. expires_at_slot_hash uses SlotHashBytesSchema (not z.array(z.any())). Schema definitions look correct and consistent.
  • hyperstack-idl/tests/parse_fixtures.rs: Hardcoded developer machine path replaced with fixture_path("ore.json"). New tests add proper assertions for discriminant parsing, program_id from address field, and Steel discriminant_size = 1. Previously unresolved TODO comment removed.
  • interpreter/src/scheduler.rs: Reworked SlotScheduler to use a reverse slot_index for O(1) targeted removal when re-registering a callback at a new slot. Dedup key now deterministically includes condition state, preventing stale callbacks from accumulating.
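The BTreeMap-backed eviction described for slot_hash_cache.rs can be sketched as below. The names and capacity constant are illustrative; the point is that pop_first() evicts the oldest slot in O(log n) because BTreeMap keys stay ordered.

```rust
use std::collections::BTreeMap;

struct SlotHashCache {
    hashes: BTreeMap<u64, String>,
    capacity: usize,
}

impl SlotHashCache {
    fn new(capacity: usize) -> Self {
        Self { hashes: BTreeMap::new(), capacity }
    }

    fn record(&mut self, slot: u64, hash: String) {
        self.hashes.insert(slot, hash);
        // BTreeMap keys are ordered, so the first entry is the oldest slot.
        while self.hashes.len() > self.capacity {
            self.hashes.pop_first();
        }
    }

    fn get(&self, slot: u64) -> Option<&String> {
        self.hashes.get(&slot)
    }
}

fn main() {
    let mut cache = SlotHashCache::new(2);
    cache.record(10, "a".into());
    cache.record(11, "b".into());
    cache.record(12, "c".into()); // evicts slot 10, the oldest key
    assert!(cache.get(10).is_none());
    assert_eq!(cache.get(12).map(String::as_str), Some("c"));
    println!("ok");
}
```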

Sequence Diagram

sequenceDiagram
    participant GeyserGRPC as Yellowstone gRPC
    participant SlotSubTask as Slot Subscription Task
    participant SlotTracker as SlotTracker
    participant SlotHashCache as slot_hash_cache (global)
    participant SlotScheduler as Slot Scheduler Task
    participant VM as VmContext
    participant Resolver as SlotHashResolver

    GeyserGRPC->>SlotSubTask: SlotUpdate(slot)
    SlotSubTask->>SlotTracker: record(slot)
    SlotTracker->>SlotTracker: notify_waiters() ⚠️ (may miss if no waiter registered)
    SlotTracker-->>SlotScheduler: notified() OR 5s timeout

    GeyserGRPC->>SlotSubTask: AccountUpdate(SlotHashes sysvar)
    SlotSubTask->>SlotHashCache: record_slot_hash(slot, hash) × 512

    SlotScheduler->>SlotTracker: get() → current_slot
    SlotScheduler->>VM: process_event (due callbacks)

    Note over VM,Resolver: Computed field resolution
    VM->>Resolver: evaluate_computed("slot_hash", [slot])
    Resolver->>SlotHashCache: get_slot_hash(slot)
    SlotHashCache-->>Resolver: base58 hash string
    Resolver-->>VM: Value::Object { bytes: [...] }

    VM->>Resolver: evaluate_computed("keccak_rng", [hash, seed, samples])
    Resolver-->>VM: Value::String (u64 XOR-fold as decimal string)
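The final diagram step — a u64 XOR-fold returned as a decimal string — can be sketched as follows. The actual Keccak hashing (sha3 crate) is elided and a fixed digest stands in for it; the little-endian word interpretation is an assumption.

```rust
// Fold a 32-byte digest into a u64 by XOR-ing its four 8-byte words,
// then serialize as a decimal string so JavaScript consumers never hit
// Number's 2^53 safe-integer limit.
fn xor_fold_to_string(digest: &[u8; 32]) -> String {
    let folded = digest
        .chunks_exact(8)
        .map(|chunk| u64::from_le_bytes(chunk.try_into().unwrap()))
        .fold(0u64, |acc, word| acc ^ word);
    folded.to_string()
}

fn main() {
    let mut digest = [0u8; 32];
    digest[0] = 1; // first word = 1, remaining words 0, so the fold is 1
    assert_eq!(xor_fold_to_string(&digest), "1");
    println!("ok");
}
```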


Review comment on rust/hyperstack-server/src/health.rs, lines 74-96:

Comment:
**Dead code: `GLOBAL_SLOT_TRACKER` and related functions are never used**

`init_global_slot_tracker`, `GLOBAL_SLOT_TRACKER`, and the module-level `get_slot_hash` function are added in this PR but are never called from anywhere else in the codebase (confirmed by search). The actual slot-hash storage used by the resolvers is `interpreter/src/slot_hash_cache.rs` — the generated code calls `hyperstack_interpreter::record_slot_hash` and `SlotHashResolver::evaluate_slot_hash` reads from `crate::slot_hash_cache::get_slot_hash`, bypassing this infrastructure entirely.

Similarly, `SlotTracker::record_slot_hash` and `SlotTracker::get_slot_hash` are never called from the generated code. Leaving these in creates confusion about which cache is authoritative and adds maintenance surface. Consider either:
- Wiring up `init_global_slot_tracker` so this infrastructure is actually used, or
- Removing it until it is needed.



Last reviewed commit: "fix: Keccak types"

adiman9 added 2 commits March 18, 2026 23:35
- Add missing fields to IdlAccountSnapshot test initialization
- Implement automatic discriminant size detection based on instruction format
- Add address field to ore.json fixture for IdlSnapshot compatibility
- Remove non-standard discriminant_size field from ore.json
- Remove unused imports in hyperstack-macros and hyperstack-server
- Fix empty else branch in vm.rs
- Remove unused variables in scheduler.rs and vm.rs
- Fix let-and-return pattern in VmContext::new_multi_entity()
- Add #[allow(clippy::type_complexity)] for complex entity_evaluator type
…esolver output types

The `pre_reveal_rng` field (computed via `keccak_rng` resolver) was being
typed as `number` instead of `string`. The resolver returns `Value::String`
to avoid JavaScript precision loss for large u64 values.

Changes:
- Updated `is_output_type()` to check TypeScript interfaces for type aliases
- Fixed `add_unmapped_fields()` to use resolver types from field_mappings
- Updated `uses_builtin_type()` to check field_mappings for computed fields

This ensures computed fields that use resolver methods get the correct
TypeScript types generated.
adiman9 added 2 commits March 19, 2026 07:29
…t emitter

- Map `slot_hash` resolver method to `SlotHashBytes` (not `SlotHash`) so the
  interface and Zod schema names match what the resolver actually declares
- Extend `is_output_type()` to recognise types declared via `export interface`
  in addition to type aliases, so `SlotHashBytes` is found as a builtin type
- Extend `generate_builtin_resolver_interfaces/schemas` to emit the full
  resolver block whenever any type it declares is used (not just the primary
  output_type), covering the `KeccakRngValue`/`SlotHashBytes` split
- Filter the schemas const-block to only include `*Schema`-named entries in
  both the single-entity and multi-entity code paths, preventing bare group
  names like `SlotHashTypes` from leaking into the stack object
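The schemas const-block filtering in the last bullet can be sketched as a simple name filter; the function name is illustrative, not the actual emitter API.

```rust
// Keep only identifiers ending in "Schema", so bare group names like
// SlotHashTypes never leak into the generated stack object.
fn schema_entries<'a>(names: &[&'a str]) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| name.ends_with("Schema"))
        .collect()
}

fn main() {
    let names = ["SlotHashBytesSchema", "SlotHashTypes", "KeccakRngValueSchema"];
    assert_eq!(
        schema_entries(&names),
        vec!["SlotHashBytesSchema", "KeccakRngValueSchema"]
    );
    println!("ok");
}
```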
Comment on lines +34 to +36
        self.notify.notify_waiters();
    }
}

P1 notify_waiters() can miss notifications under load

notify_waiters() only wakes tasks that are currently registered as waiters (i.e., already awaiting notified()). If record() fires while the scheduler loop is between iterations — processing callbacks and not yet suspended in the tokio::select! — the notification is silently dropped and the scheduler must wait the full 5-second fallback timeout.

For Solana's ~400 ms slot cadence this means time-sensitive callbacks (e.g. Ore round-expiry) can be up to 5 seconds late in the worst case. notify_one() stores a permit and guarantees the next notified() call resolves immediately regardless of when the wakeup arrives:

pub fn record(&self, slot: u64) {
    let old = self.last_slot.fetch_max(slot, Ordering::Relaxed);
    if slot > old {
        self.notify.notify_one(); // stores a permit — never missed
    }
}

Comment on lines +39 to +52
pub async fn record_slot_hash(&self, slot: u64, slot_hash: String) {
    let mut hashes = self.slot_hashes.write().await;
    hashes.insert(slot, slot_hash);

    // Prune old entries to prevent unbounded growth (keep last 10000 slots)
    let slots_to_remove: Vec<u64> = hashes
        .keys()
        .filter(|&&s| s < slot.saturating_sub(10000))
        .copied()
        .collect();
    for s in slots_to_remove {
        hashes.remove(&s);
    }
}

P2 record_slot_hash holds the write lock during O(n) full-map traversal

The pruning logic holds the RwLock write lock while iterating every key to find stale entries, then removes them one by one. This approach acquires the lock, scans all entries, collects pruning candidates, and removes them — all while blocking every reader. For a HashMap keyed by slot, a more efficient approach is to track the watermark separately (since slot is monotonically increasing) so that entries older than slot - 10000 can be removed without a full scan:

// Only needs to check entries older than the watermark; in practice 
// the collect+remove pattern is already the best option for HashMap,
// but holding the lock only after computing the removal list would help:
let threshold = slot.saturating_sub(10_000);
hashes.retain(|&s, _| s >= threshold);

Note: this function appears to be unreachable dead code currently (see the GLOBAL_SLOT_TRACKER comment above), so fixing the performance here is secondary to deciding whether this code path will ever be used.


@adiman9 adiman9 merged commit 6ff70bc into main Mar 19, 2026
10 checks passed
@adiman9 adiman9 deleted the interpreter-fixes branch March 19, 2026 09:19
adiman9 added a commit that referenced this pull request Mar 25, 2026
Interpreter fixes: Ore stack simplification, resolver improvements, and IDL parsing
