feat: genvm manager by kp2pml30 · Pull Request #243 · genlayerlabs/genvm

kp2pml30 · 2025-10-06T15:01:00Z

Summary by CodeRabbit

New Features
- HTTP Manager with full GenVM/module lifecycle, versioning, and many management APIs; CLI parse-version tool; async entrypoints for Web/LLM; module start/stop/status and GenVM lifecycle manager.
Bug Fixes
- Percent-sign escaping in LLM prompt templates; safer WASI pread boundary handling.
Documentation
- Architecture docs updated to include Manager and revised module links.
Tests
- Refactored test runner, new/semi-stable test cases and updated expectations.
Chores
- Updated configs/manifests, added runtime/test deps, CI/workflow and pre-commit tweaks.

coderabbitai · 2025-10-06T15:01:18Z

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a Warp-based HTTP Manager with versioning and GenVM lifecycle control, refactors module entrypoints to async with cancellation, converts tests/runner and host to remote Manager use, adds parse-version CLI and logger/sync utilities, updates CI/workflows and deps, and includes many test-case and expectation updates.

Changes

Cohort / File(s)	Summary
CI & workflows `.github/workflows/incl_docs.yml`, `.github/workflows/release.yaml`, `.github/workflows/queue.yaml`	Docs deploy source path changed from `build/docs/html/` → `build/doc/html/`; stable tests now start manager (`--start-manager`); added disk-space-reclaimer step and Docker port mapping in queue workflow.
Pre-commit / deps / nix `.pre-commit-config.yaml`, `flake.nix`, `tests/runner/requirements.txt`	Added `pass_filenames: false` to a local hook; added `aiohttp` to Nix dev shells; added `aiohttp==3.12.15` to test requirements.
Build tooling `configure.rb`	Codegen build targets updated to use `tests/runner/origin/*` sources.
Docs `doc/website/src/impl-spec/.../01-components.rst`	Added Manager component and Host→Manager links; reorganized Module docs under Host and added Manager anchors.
Executor common utilities `executor/common/Cargo.toml`, `executor/common/src/cancellation.rs`, `executor/common/src/lib.rs`	Added `clap` derive dependency; new `CancellationDrop<F>` invoking Fn on Drop; extracted `populate_default_config_vars` and call from `load_config`.
Executor logging & sync `executor/common/src/logger/mod.rs`, `executor/common/src/sync.rs`	`Level` gains `clap::ValueEnum` and `#[repr(u32)]`; Logger.filter uses `AtomicU32` with `set_filter`/`get_filter`; added Capture macro arm; introduced `DArcStruct`, expanded `DArc` API and public re-exports.
Executor CLI & tooling `executor/src/exe/mod.rs`, `executor/src/exe/parse_version.rs`, `executor/src/main.rs`	Added `parse-version` subcommand and handler; new optional `--log-level`; split `main` into wrapper + `main_impl`.
Executor runtime & install scripts `executor/src/rt/memlimiter.rs`, `executor/src/wasi/preview1.rs`, `executor/install/config/genvm.yaml`, `executor/install/lib/python/post-install/__main__.py`	Minor formatting; `pread` uses `saturating_sub` for bounds; narrowed `log_disable` to `genvm::rt::memlimiter`; added `executor_root_dir`, `installation_root_dir`, `rpath_dir` and cached `get_interpreter_path`.
New Manager feature `*modules/implementation/src/manager/.rs**` (mod.rs, handlers.rs, run.rs, versioning.rs, modules.rs)	Added Warp-based HTTP Manager: config/context types, routes/handlers, GenVM lifecycle (start/status/shutdown), version manifest loader/resolution, module start/stop/status with cancellation, permits, graceful shutdown, CLI entrypoint.
Modules deps & manifests `modules/implementation/Cargo.toml`, `modules/install/config/genvm-manager.yaml`, `modules/install/data/manifest.yaml`	Added dependencies (warp, bytes, sysinfo, libc, dashmap, sha3); enabled tokio `process` and chrono `serde`; added manager config and manifest entry `v0.2.3`.
LLM / Web module refactors `modules/implementation/src/llm/mod.rs`, `modules/implementation/src/web/mod.rs`, `modules/install/config/genvm-llm-default.lua`, `modules/install/config/genvm-module-llm.yaml`, `modules/install/lib/genvm-lua/lib-llm.lua`, `modules/implementation/src/scripting/ctx/filters.rs`, `modules/implementation/fuzz/`	Added async `run_llm_module`/`run_web_module` with cancellation; moved runtime creation into callers; relaxed fatal provider error path (commented out user_error); updated Google model id; escape percent signs in Lua prompt substitution; small refactors.
Tests: runner, host, mock changes `tests/runner/run.py`, `tests/rust.sh`, `tests/runner/origin/`, `tests/runner/mock_host.py`, `tests/runner/logger.py`	Rewrote runner into modular async runner with `--start-manager/--start-modules`; origin host switched to manager-based GenVM orchestration and status polling; new logger abstraction; mocks adapted to `origin.*` and `public_abi` types.
Tests: cases & expectations `*tests/cases/semi-stable/llm/.jsonnet`, `/.py`, `/.stdout**`	Added percent-in-prompt and WizardOfCoin test cases and updated multiple expected stdout files.
Webdriver minor `modules/webdriver/src/prj/src/index.ts`	Added console.log in catch to report navigation errors.
Misc small edits & formatting `modules/implementation/fuzz/`, `executor/src/exe/`, other formatting-only lines**`	Small style/iterator/ownership changes with no semantic impact in affected spots.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor CI as CI / Runner
  participant TestRunner as Test Runner (run.py)
  participant Manager as Manager HTTP
  participant Executor as GenVM Executor
  participant Module as LLM/Web Module

  CI->>TestRunner: start --start-manager --start-modules
  TestRunner->>Manager: POST /module/start (LLM/Web)
  Manager-->>TestRunner: 200 OK
  TestRunner->>Manager: POST /genvm/run {code, host_data}
  Manager->>Executor: spawn genvm process (resolve version via manifest)
  loop execution lifecycle
    Executor-->>Manager: stdout/stderr (JSON log lines)
    TestRunner->>Manager: GET /genvm/{id} (poll status)
  end
  Manager-->>TestRunner: final status + outputs
  TestRunner->>CI: report results

sequenceDiagram
  autonumber
  participant User as CLI user
  participant Genvm as executor main
  participant PV as parse_version tool

  User->>Genvm: genvm parse-version-pattern < stdin
  Genvm->>PV: handle(args, config)
  PV->>PV: read stdin, parse arch, extract version
  PV-->>User: write version or "v*.*.*"

sequenceDiagram
  autonumber
  participant HostLoop as Origin Host
  participant Manager as Manager HTTP
  participant Versioning as Versioning Subsystem
  participant GenVM as GenVM Exec

  HostLoop->>Manager: POST /genvm/run (sync/async)
  Manager->>Versioning: resolve executor version (manifest + timestamp)
  Versioning-->>Manager: Version vM.m.p
  Manager->>GenVM: spawn executor with resolved version/env
  HostLoop->>Manager: GET /genvm/{id} (status poll)
  Manager-->>HostLoop: {running/done, outputs}

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

Batch update #235 — Large overlap: tests/runner, origin host, and manager orchestration refactors touch the same host/runner/manager integration points.
batch update #226 — Overlaps on executor/common logger, public ABI and related type changes used in this PR.
Batch update #238 — Related CI/workflow edits (incl_docs/release) that modify docs deploy paths.

Poem

I am a rabbit, tidy and fleet,
I started a manager with tiny feet.
Versions parsed and logs went atomic,
Modules run async, the HTTP’s emphatic.
Tests hop through doors—hooray, what a treat! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title “feat: genvm manager” succinctly captures the primary new feature introduced by this changeset, namely the extensive GenVM management surface, and aligns with the conventional commit style without extraneous details.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09ffae6 and 85770e2.

📒 Files selected for processing (5)

.github/workflows/queue.yaml (2 hunks)
flake.nix (3 hunks)
modules/implementation/src/llm/providers.rs (3 hunks)
modules/install/config/genvm-module-llm.yaml (1 hunks)
tests/rust.sh (3 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 24

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

executor/install/lib/python/post-install/__main__.py (1)

143-146: DynamicEntryRpath argument type likely wrong (list vs str).

LIEF expects a string path for RPATH; passing a list can fail or produce invalid entries.
-            rpath_entry = lief.ELF.DynamicEntryRpath([rpath_str])
+            rpath_entry = lief.ELF.DynamicEntryRpath(rpath_str)

tests/runner/run.py (1)

541-551: Exception reporting can crash on failures.

Expanding e.args[-1] assumes a dict; often it’s not, masking the original error.

Apply this diff:

-            except Exception as e:
-                time_elapsed = time.monotonic() - time_start
-                self.reporter.report_single(
-                    test_name,
-                    {
-                        'category': 'fail',
-                        'steps': steps,
-                        'exception': 'internal error',
-                        'exc': e,
-                        'elapsed': time_elapsed,
-                        **e.args[-1],
-                    },
-                )
+            except Exception as e:
+                time_elapsed = time.monotonic() - time_start
+                extra = e.args[-1] if e.args and isinstance(e.args[-1], dict) else {}
+                self.reporter.report_single(
+                    test_name,
+                    {
+                        'category': 'fail',
+                        'steps': steps,
+                        'exception': 'internal error',
+                        'exc': e,
+                        'elapsed': time_elapsed,
+                        **extra,
+                    },
+                )
                 return

executor/common/src/sync.rs (1)

242-264: into_gep_async lifetime comment indicates incorrect quantification

The current signature risks misuse. Prefer HRTB over 'a on the method:

Consider:

Keep into_gep_async but ensure the future is created and polled entirely within the scope of the borrow, or

Mark unsafe and document safety, or

Replace with a combinator built on gep + async block returning &'a R immediately.

Up to you; but don't ship with “impossible to fix” comment as-is.

🧹 Nitpick comments (29)

modules/webdriver/src/prj/src/index.ts (1)
82-82: Consider structured logging for better observability.

While adding error logging is valuable for debugging, console.log lacks structure and log levels. For production services, consider using a structured logger (e.g., winston, pino) with appropriate log levels (error, warn, info, debug). Also, be mindful that navigationError might contain sensitive information like URLs with credentials or internal paths.

Example with a structured logger:
+import winston from 'winston';
+
+const logger = winston.createLogger({
+  level: 'info',
+  format: winston.format.json(),
+  transports: [new winston.transports.Console()]
+});
Then replace the console.log:
-    console.log('Navigation Error', navigationError)
+    logger.error('Navigation error', { 
+      message: navigationError.message,
+      name: navigationError.name,
+      url: targetUrl // only if safe to log
+    });
executor/install/lib/python/post-install/__main__.py (3)
64-77: Polish get_interpreter_path and verify target interpreter.

Prefer sys.exit over exit.

Add return type.

Please verify libc.so is intended for PT_INTERP (on glibc it’s usually ld-linux-.so.2; on musl, ld-musl-.so.1). If not, compute the correct loader path per platform/arch.
-def get_interpreter_path():
+def get_interpreter_path() -> Path:
@@
-    exit(1)
+    sys.exit(1)
Also add:
import sys
near the imports.

157-159: Avoid duplicate Mach‑O RPATH entries.

Only add RPATH if not already present.

Replace the add block (around Lines 189-192) with:
if rpath_str in existing_rpaths:
    logger.info(f'RPATH already present for Mach-O binary: "{rpath_str}" — skipping')
else:
    rpath_cmd = lief.MachO.RPathCommand.create(rpath_str)
    binary.add(rpath_cmd)
    logger.info(f'Added RPATH to Mach-O binary: "{rpath_str}"')
164-167: Broaden handled Mach‑O dylib command types and dedupe code.

Include REEXPORT_DYLIB/LAZY_LOAD_DYLIB and avoid repeating the set.
# near other constants
DYLIB_CMDS = {
    lief.MachO.LoadCommand.TYPE.LOAD_DYLIB,
    lief.MachO.LoadCommand.TYPE.LOAD_WEAK_DYLIB,
    lief.MachO.LoadCommand.TYPE.REEXPORT_DYLIB,
    lief.MachO.LoadCommand.TYPE.LAZY_LOAD_DYLIB,
}

# use in both loops
if cmd.command in DYLIB_CMDS:
    ...
Also applies to: 173-176
.github/workflows/incl_docs.yml (1)
55-65: Guard against missing docs path when deploying.

Fail fast with a clear message if build/doc/html is absent.

Apply this diff:
       - name: Deploy version to target repository
         run: |
-          SRC=$(readlink -f build/doc/html/)
+          SRC=$(readlink -f build/doc/html/) || true
+          if [ ! -d "$SRC" ]; then
+            echo "Docs directory not found at build/doc/html. Ensure docs.sh outputs to this path." >&2
+            exit 1
+          fi
           cd target-repo
tests/rust.sh (1)
233-234: Forward script filter to run.py.

Let users limit JSONNET tests via the same FILTER flag.

Apply this diff:
-    python3 ./tests/runner/run.py --ci --start-manager --start-modules
+    python3 ./tests/runner/run.py --ci --start-manager --start-modules --filter "$FILTER"
tests/runner/run.py (2)
271-279: Clarify boolean precedence in condition.

Make intent explicit with parentheses.

Apply this diff:
-            if res['category'] == 'fail' and 'steps' in res or self.config.args.show_steps:
+            if (res['category'] == 'fail' and 'steps' in res) or self.config.args.show_steps:
139-155: Optional: replace fixed sleep with a health probe.

Poll the manager endpoint until ready instead of time.sleep(2) to reduce flakiness.

Example:
# after starting the process
for i in {1..40}; do
  if curl -sf "$config.manager/health" >/dev/null; then break; fi
  sleep 0.25
done
executor/src/exe/parse_version.rs (1)

19-23: Consider documenting the fallback version format.

The hardcoded fallback "v*.*.*" is used when no version is found in the architecture data. Consider:

Documenting this fallback behavior in a comment

Verifying that downstream consumers can handle this wildcard pattern

Whether a more specific error or empty output would be more appropriate
tests/runner/origin/logger.py (2)
68-78: Add newline/flush and a default serializer in StderrLogger.log

Without a newline and flush, JSON lines can coalesce or be buffered. Also, dumping non-JSON-serializable values will raise. Add both for robustness.
-        json.dump(
+        json.dump(
             {
                 'message': msg,
                 'level': level,
                 **kwargs,
             },
             sys.stderr,
+            default=str,
         )
+        sys.stderr.write('\n')
+        sys.stderr.flush()
51-52: Silence unused arg in NoLogger.with_keys (Ruff ARG002)

Rename the parameter to underscore-prefixed to avoid the linter warning.
-    def with_keys(self, keys: dict) -> 'Logger':
+    def with_keys(self, _keys: dict) -> 'Logger':
         return self
modules/implementation/src/llm/mod.rs (1)
129-135: Adjust log level when empty backends are allowed

Logging an error but proceeding is misleading. Use warn when allow_empty_backends is true.
-    if config.backends.is_empty() {
-        log_error!("no valid backend detected")
-    }
-
-    if !allow_empty_backends && config.backends.is_empty() {
-        anyhow::bail!("no valid backend detected");
-    }
+    if config.backends.is_empty() {
+        if allow_empty_backends {
+            log_warn!("no valid backend detected");
+        } else {
+            log_error!("no valid backend detected");
+        }
+    }
+
+    if !allow_empty_backends && config.backends.is_empty() {
+        anyhow::bail!("no valid backend detected");
+    }
modules/implementation/src/web/mod.rs (2)
31-31: Remove unused variable

_webdriver_host is never used. Drop it.
-    let _webdriver_host = config.webdriver_host.clone();
37-41: Minor clone cleanup in vm_pool closure

Avoid redundant clones; clone once per unique capture.
-    let vm_pool = scripting::pool::new(config.mod_base.vm_count, move || {
-        let moved_config_1 = moved_config.clone();
-        let moved_config_2 = moved_config.clone();
-        let moved_config = moved_config.clone();
+    let vm_pool = scripting::pool::new(config.mod_base.vm_count, move || {
+        let moved_config_1 = moved_config.clone();
+        let moved_config_2 = moved_config.clone();
(Use moved_config_1 and moved_config_2 directly where moved_config was used.)
executor/common/src/logger/mod.rs (1)

185-188: bytes capture currently prints Debug, not the $Bytes(...) encoding

If you intended a compact byte encoder (like write_bytes), this arm should route through that path. If Debug is fine, ignore this.
modules/implementation/src/manager/modules.rs (3)
39-47: Avoid holding the write lock across .await

start() holds the module write lock through sleep(5s).await. This blocks stop()/status calls unnecessarily. Acquire the lock to check/set state, then drop it before awaiting. Reacquire only when storing the canceller.

Also applies to: 106-114

106-110: Surface startup errors from the spawned task

If the task finishes early, capture and include its error instead of a generic message. Example:
if module_task.is_finished() {
    match module_task.await {
        Ok(Ok(())) => anyhow::bail!("module exited immediately"),
        Ok(Err(e)) => anyhow::bail!(e),
        Err(join_err) => anyhow::bail!(join_err),
    }
}
(Note: store the JoinHandle somewhere if you need it later; dropping the handle detaches the task.)

151-168: Refactor get_module_locks to return concrete lock types
Returning Option<impl Any> hides the actual (RwLockReadGuard<LLMModule>, RwLockReadGuard<WebModule>) tuple and forces callers to ignore or downcast. Instead, change the signature to:
pub async fn get_module_locks(
    zelf: sync::DArc<Ctx>,
) -> Option<(crate::llm::LLMModuleReadGuard, crate::web::WebModuleReadGuard)>
so callers can directly work with the two guards.
libs/install/bin/genvm-detect-version-server.py (3)
3-18: Use a threaded HTTP server for concurrency

Swap TCPServer for ThreadingHTTPServer.
 import http.server
+from http.server import ThreadingHTTPServer
@@
-	with socketserver.TCPServer((args.host, args.port), DetectVersionHandler) as httpd:
+	with ThreadingHTTPServer((args.host, args.port), DetectVersionHandler) as httpd:
Sanity-check that multi-request CI tests expect concurrent handling. If single-threaded is intentional, ignore.

Also applies to: 212-214

19-24: Verify data location: ROOT_PATH may not match manifest path in repo

This server reads libs/install/data/version-timestamps.json, while the PR adds modules/install/data/manifest.yaml. Ensure the JSON exists at the expected path or accept a --manifest/--versions-path argument.

Would you like a patch to add a CLI flag and YAML support?

Also applies to: 137-139

191-199: Minor cleanups

Rename unused signal args to underscores to appease linters.

Remove unused urlparse import.
-from urllib.parse import urlparse
+# from urllib.parse import urlparse  # unused
@@
-def signal_handler(signum, frame):
+def signal_handler(_signum, _frame):
executor/common/src/sync.rs (1)
435-440: Update call site if marking into_get_sub_async unsafe

If you adopt the unsafe mitigation, wrap the call:
-        let res = entry
-            .into_get_sub_async(|cell| cell.get_or_try_init(creator))
-            .await
+        let res = unsafe {
+            entry.into_get_sub_async(|cell| cell.get_or_try_init(creator)).await
+        }
             .lift_result()?
             .into_arc();
modules/implementation/src/manager/mod.rs (2)
38-49: Duplicate public Manifest/ExecutorVersion types across modules

This file defines Manifest/ExecutorVersion differing from versioning.rs (types/field shapes). This splits the source of truth and risks deserialization/runtime mismatches.

Remove/privatize these here and re-export the types from versioning.rs, or consolidate into a shared module.

Also applies to: 71-75

50-69: Compile the version regex once

Avoid recompiling the regex on every call.

Example:
+    static VERSION_RE: once_cell::sync::Lazy<regex::Regex> = once_cell::sync::Lazy::new(|| {
+        regex::Regex::new(r"^v(\d+|\*)\.(\d+|\*)\.(\d+|\*)$").unwrap()
+    });
-    let version_re = regex::Regex::new(r"^v(\d+|\*)\.(\d+|\*)\.(\d+|\*)$").unwrap();
-    let captures = version_re
+    let captures = VERSION_RE
         .captures(version)
         .ok_or_else(|| anyhow::anyhow!("Invalid version format: {}", version))?;
modules/implementation/src/manager/versioning.rs (1)

147-154: Avoid blocking_read in async contexts

Using blocking_read() on a Tokio RwLock can block the runtime thread.

Make get_latest_major async and use self.manifest.read().await.

modules/implementation/src/manager/run.rs (1)

25-37: Naming nit: finished_at holds elapsed ms

finished_at serializes “time since finish (ms)”. Consider renaming to finished_ago_ms or serializing an absolute timestamp for clarity.
modules/implementation/src/manager/handlers.rs (3)
29-29: Remove unnecessary clone.

The clone() is unnecessary since serde_json::from_value consumes the value.

Apply this diff:
-    let stop_request = serde_json::from_value::<StopRequest>(calldata.clone())?;
+    let stop_request = serde_json::from_value::<StopRequest>(calldata)?;
96-96: Use field init shorthand.

Redundant field name.

Apply this diff:
-        major: major,
+        major,
187-202: Enforce upper‐bound on permits and require auth

run_ctx.set_permits already rejects values <2 but allows arbitrarily large values; cap at a safe max (e.g. ≤1000) before calling it.

Ensure the /permits POST endpoint is accessible only to authenticated/authorized clients.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de381db and decff4a.

⛔ Files ignored due to path filters (2)

executor/Cargo.lock is excluded by !**/*.lock
modules/implementation/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (49)

.github/workflows/incl_docs.yml (1 hunks)
.github/workflows/release.yaml (2 hunks)
.pre-commit-config.yaml (1 hunks)
configure.rb (2 hunks)
doc/website/src/impl-spec/01-core-architecture/01-components.rst (3 hunks)
executor/common/Cargo.toml (1 hunks)
executor/common/src/cancellation.rs (1 hunks)
executor/common/src/lib.rs (2 hunks)
executor/common/src/logger/mod.rs (4 hunks)
executor/common/src/sync.rs (6 hunks)
executor/install/config/genvm.yaml (1 hunks)
executor/install/lib/python/post-install/__main__.py (4 hunks)
executor/src/exe/mod.rs (1 hunks)
executor/src/exe/parse_version.rs (1 hunks)
executor/src/main.rs (3 hunks)
executor/src/rt/memlimiter.rs (1 hunks)
flake.nix (2 hunks)
libs/default.nix (1 hunks)
libs/install/bin/genvm-detect-version-server.py (1 hunks)
modules/implementation/Cargo.toml (2 hunks)
modules/implementation/src/llm/mod.rs (3 hunks)
modules/implementation/src/main.rs (3 hunks)
modules/implementation/src/manager/handlers.rs (1 hunks)
modules/implementation/src/manager/mod.rs (1 hunks)
modules/implementation/src/manager/modules.rs (1 hunks)
modules/implementation/src/manager/run.rs (1 hunks)
modules/implementation/src/manager/versioning.rs (1 hunks)
modules/implementation/src/web/mod.rs (2 hunks)
modules/install/config/genvm-manager.yaml (1 hunks)
modules/install/config/genvm-module-llm.yaml (1 hunks)
modules/install/data/manifest.yaml (1 hunks)
modules/install/lib/genvm-lua/lib-llm.lua (1 hunks)
modules/webdriver/src/prj/src/index.ts (1 hunks)
tests/cases/semi-stable/llm/percent_in_prompt.jsonnet (1 hunks)
tests/cases/semi-stable/llm/percent_in_prompt.py (1 hunks)
tests/cases/semi-stable/llm/percent_in_prompt.stdout (1 hunks)
tests/cases/semi-stable/llm/woc.0.stdout (1 hunks)
tests/cases/semi-stable/llm/woc.1.stdout (1 hunks)
tests/cases/semi-stable/llm/woc.2.stdout (1 hunks)
tests/cases/semi-stable/llm/woc.3.stdout (1 hunks)
tests/cases/semi-stable/llm/woc.jsonnet (1 hunks)
tests/cases/semi-stable/llm/woc.py (1 hunks)
tests/cases/semi-stable/llm/woc.stdout (1 hunks)
tests/runner/mock_host.py (5 hunks)
tests/runner/origin/base_host.py (13 hunks)
tests/runner/origin/logger.py (1 hunks)
tests/runner/requirements.txt (1 hunks)
tests/runner/run.py (4 hunks)
tests/rust.sh (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (19)

executor/common/src/cancellation.rs (1)

executor/common/src/sync.rs (2)

drop (112-130)

drop (314-332)

modules/implementation/src/main.rs (2)

modules/implementation/src/manager/mod.rs (1)

entrypoint (325-350)

modules/implementation/src/web/mod.rs (1)

entrypoint (91-107)

executor/install/lib/python/post-install/__main__.py (1)

tests/runner/origin/logger.py (1)

info (16-17)

tests/runner/origin/logger.py (1)

tests/runner/run.py (1)

log (177-191)

executor/src/main.rs (2)

executor/src/exe/parse_version.rs (1)

handle (11-26)

executor/common/src/logger/mod.rs (1)

std (82-82)

executor/src/exe/mod.rs (1)

modules/implementation/src/manager/mod.rs (1)

parse_version (50-69)

modules/implementation/src/manager/run.rs (3)

executor/common/src/sync.rs (10)

serialize (137-143)

into_arc (33-42)

new (150-178)

new (371-376)

new (416-418)

new (453-455)

Self (155-155)

drop (112-130)

drop (314-332)

ptr (168-168)

tests/runner/run.py (1)

log (177-191)

executor/common/src/cancellation.rs (1)

drop (20-22)

tests/runner/origin/base_host.py (4)

tests/runner/origin/logger.py (6)

Logger (6-26)

NoLogger (47-52)

trace (10-11)

debug (13-14)

warning (19-20)

error (22-23)

tests/runner/origin/host_fns.py (2)

Errors (25-30)

Methods (7-22)

tests/runner/origin/public_abi.py (2)

StorageType (14-17)

ResultCode (7-11)

tests/runner/mock_host.py (8)

loop_enter (115-134)

get_calldata (136-137)

consume_result (157-160)

get_leader_nondet_result (165-177)

post_nondet_result (179-180)

consume_gas (206-207)

get_balance (209-210)

remaining_fuel_as_gen (145-146)

modules/implementation/src/manager/handlers.rs (4)

executor/common/src/logger/mod.rs (2)

std (82-82)

from_str (36-46)

modules/implementation/src/manager/mod.rs (4)

anyhow_to_rejection (20-26)

warp (188-188)

warp (203-203)

warp (271-271)

modules/implementation/src/manager/modules.rs (3)

get_status (133-149)

get_module_locks (151-167)

new (30-36)

modules/implementation/src/manager/run.rs (2)

start_genvm (400-533)

new (177-213)

modules/implementation/src/llm/mod.rs (2)

modules/implementation/src/manager/modules.rs (1)

new (30-36)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

tests/cases/semi-stable/llm/woc.py (2)

tests/cases/semi-stable/web/file_schema_web.py (2)

Contract (6-11)

nondet (8-9)

runners/genlayer-py-std/tests/test_schema.py (1)

public (9-11)

modules/implementation/src/web/mod.rs (2)

modules/implementation/src/manager/mod.rs (2)

entrypoint (325-350)

args (306-306)

executor/common/src/lib.rs (1)

load_config (93-107)

executor/common/src/sync.rs (1)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

modules/implementation/src/manager/modules.rs (7)

modules/implementation/src/manager/handlers.rs (2)

serde_json (29-29)

serde_json (46-46)

executor/common/src/sync.rs (5)

new (150-178)

new (371-376)

new (416-418)

new (453-455)

Self (155-155)

executor/common/src/cancellation.rs (1)

make (25-44)

executor/common/src/lib.rs (1)

populate_default_config_vars (71-91)

executor/common/src/templater.rs (2)

patch_str (62-77)

patch_json (110-133)

modules/implementation/src/llm/mod.rs (1)

run_llm_module (106-168)

modules/implementation/src/web/mod.rs (1)

run_web_module (27-89)

tests/runner/mock_host.py (3)

tests/runner/origin/public_abi.py (2)

StorageType (14-17)

ResultCode (7-11)

tests/runner/origin/base_host.py (1)

HostException (24-29)

tests/runner/origin/host_fns.py (1)

Errors (25-30)

tests/runner/run.py (4)

tests/runner/origin/logger.py (9)

error (22-23)

Logger (6-26)

log (8-8)

log (40-41)

log (48-49)

log (68-78)

with_keys (25-26)

with_keys (43-44)

with_keys (51-52)

runners/genlayer-py-std/src/genlayer/py/types.py (1)

Address (122-254)

tests/runner/mock_host.py (3)

MockHost (55-215)

MockStorage (30-52)

write (41-52)

tests/runner/origin/base_host.py (2)

save_code_callback (112-125)

run_genvm (349-489)

modules/implementation/src/manager/mod.rs (5)

modules/implementation/src/manager/run.rs (3)

tokio (508-508)

new (177-213)

start_service (237-256)

executor/common/src/sync.rs (4)

new (150-178)

new (371-376)

new (416-418)

new (453-455)

modules/implementation/src/manager/modules.rs (1)

new (30-36)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

modules/implementation/src/manager/handlers.rs (15)

serde_json (29-29)

serde_json (46-46)

handle_status (13-18)

handle_module_start (42-53)

handle_module_stop (25-40)

handle_genvm_run (55-72)

handle_genvm_run_readonly (74-115)

handle_contract_detect_version (117-131)

handle_set_log_level (133-151)

handle_manifest_reload (153-159)

handle_set_env (161-180)

handle_get_permits (182-185)

handle_set_permits (187-202)

handle_genvm_shutdown (214-234)

handle_genvm_status (236-246)

libs/install/bin/genvm-detect-version-server.py (2)

modules/implementation/src/manager/mod.rs (1)

args (306-306)

executor/src/main.rs (1)

main (63-72)

modules/implementation/src/manager/versioning.rs (2)

modules/implementation/src/manager/run.rs (2)

tokio (508-508)

new (177-213)

executor/common/src/sync.rs (7)

Self (155-155)

new (150-178)

new (371-376)

new (416-418)

new (453-455)

drop (112-130)

drop (314-332)

🪛 Ruff (0.13.3)

tests/runner/origin/logger.py

51-51: Unused method argument: keys

(ARG002)

tests/cases/semi-stable/llm/percent_in_prompt.py

2-2: from genlayer import * used; unable to detect undefined names

(F403)

5-5: gl may be undefined, or defined from star imports

(F405)

6-6: gl may be undefined, or defined from star imports

(F405)

11-11: gl may be undefined, or defined from star imports

(F405)

tests/runner/origin/base_host.py

360-360: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

397-397: f-string without any placeholders

Remove extraneous f prefix

(F541)

399-399: Create your own exception

(TRY002)

399-399: Avoid specifying long messages outside the exception class

(TRY003)

404-404: Store a reference to the return value of asyncio.ensure_future

(RUF006)

463-463: Do not catch blind exception: Exception

(BLE001)

467-467: Do not catch blind exception: Exception

(BLE001)

471-471: Create your own exception

(TRY002)

489-489: Create your own exception

(TRY002)

489-489: Avoid specifying long messages outside the exception class

(TRY003)

tests/cases/semi-stable/llm/woc.py

3-3: from genlayer import * used; unable to detect undefined names

(F403)

8-8: gl may be undefined, or defined from star imports

(F405)

14-14: gl may be undefined, or defined from star imports

(F405)

43-43: gl may be undefined, or defined from star imports

(F405)

47-47: gl may be undefined, or defined from star imports

(F405)

55-55: gl may be undefined, or defined from star imports

(F405)

tests/runner/mock_host.py

25-25: from origin.base_host import * used; unable to detect undefined names

(F403)

140-140: Unused method argument: mode

(ARG002)

140-140: public_abi may be undefined, or defined from star imports

(F405)

158-158: Unused method argument: type

(ARG002)

158-158: public_abi may be undefined, or defined from star imports

(F405)

158-158: Unused method argument: data

(ARG002)

158-158: collections may be undefined, or defined from star imports

(F405)

167-167: HostException may be undefined, or defined from star imports

(F405)

167-167: host_fns may be undefined, or defined from star imports

(F405)

169-169: HostException may be undefined, or defined from star imports

(F405)

169-169: host_fns may be undefined, or defined from star imports

(F405)

172-172: public_abi may be undefined, or defined from star imports

(F405)

174-174: public_abi may be undefined, or defined from star imports

(F405)

176-176: public_abi may be undefined, or defined from star imports

(F405)

tests/runner/run.py

117-117: Create your own exception

(TRY002)

117-117: Avoid specifying long messages outside the exception class

(TRY003)

127-127: Unused function argument: frame

(ARG001)

139-139: subprocess call: check for execution of untrusted input

(S603)

271-271: Parenthesize a and b expressions when chaining and and or together, to make the precedence clear

Parenthesize the and subexpression

(RUF021)

324-324: Use raise without specifying exception name

Remove exception name

(TRY201)

391-391: subprocess call: check for execution of untrusted input

(S603)

412-412: Function definition does not bind loop variable addr

(B023)

425-425: subprocess call: check for execution of untrusted input

(S603)

426-426: Starting a process with a partial executable path

(S607)

443-443: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)

471-475: Use of possibly insecure function; consider using ast.literal_eval

(S307)

481-481: Probable insecure usage of temporary file or directory: "/tmp"

(S108)

539-539: Do not catch blind exception: Exception

(BLE001)

594-594: Use explicit conversion flag

Replace with conversion flag

(RUF010)

594-594: Use explicit conversion flag

Replace with conversion flag

(RUF010)

699-699: Function definition does not bind loop variable f

(B023)

702-702: Use raise without specifying exception name

Remove exception name

(TRY201)

712-712: Function definition does not bind loop variable future

(B023)

715-715: Use raise without specifying exception name

Remove exception name

(TRY201)

libs/install/bin/genvm-detect-version-server.py

28-28: Avoid specifying long messages outside the exception class

(TRY003)

80-80: Avoid specifying long messages outside the exception class

(TRY003)

92-92: subprocess call: check for execution of untrusted input

(S603)

99-99: Do not catch blind exception: Exception

(BLE001)

154-154: Do not catch blind exception: Exception

(BLE001)

195-195: Unused function argument: signum

(ARG001)

195-195: Unused function argument: frame

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: module-test-python

🔇 Additional comments (22)

executor/src/rt/memlimiter.rs (1)

35-38: No issues with the formatting tweak.

Purely stylistic split; behavior unchanged.

modules/install/lib/genvm-lua/lib-llm.lua (1)

173-174: Percent escaping handles gsub placeholder semantics

Nice touch on doubling the percent signs before substitution—this stops string.gsub from treating patterns like %0 or %1 as capture references, so literal % values survive intact.

executor/install/lib/python/post-install/__main__.py (3)

53-53: Whitespace-only change — no action needed.

102-105: LGTM: robust library name collection.

Handles both str and object cases cleanly.

115-117: LGTM: clearer RPATH/RUNPATH logging.

tests/runner/requirements.txt (1)

4-4: Add aiohttp for manager HTTP calls — looks good.

Pin aligns with Python 3.12 usage in CI and flake. Ensure version parity with Nix (pkgs.python312Packages.aiohttp) and pip envs to avoid drift.

flake.nix (1)

139-142: Nix shells include aiohttp — good alignment with tests.

This keeps dev/CI environments consistent with tests/runner requirements. No further changes needed.

Also applies to: 169-173

.github/workflows/release.yaml (1)

152-154: Switch to manager-driven tests — OK.

CI now runs tests via the Manager; macOS step installs aiohttp via requirements. Looks consistent with flake changes and run.py.

Also applies to: 190-195

configure.rb (1)

330-338: Codegen targets moved under tests/runner/origin — OK.

The phony codegen aggregate includes the new origin host/public ABI outputs. No issues spotted.

Also applies to: 350-355

executor/common/Cargo.toml (1)

31-31: LGTM!

The addition of clap with the derive feature is appropriate for CLI parsing support. The version (4.5.35) is current and the derive feature will enable ergonomic CLI argument parsing using derive macros.

executor/install/config/genvm.yaml (1)

11-11: Verify the narrowed log disable scope is intentional.

The log_disable configuration has been significantly narrowed from disabling multiple external modules (wasmtime*, cranelift*, tracing*, polling*, tungstenite*, tokio_tungstenite*, genvm::memlimiter*) to only disabling genvm::rt::memlimiter.

This change will enable logs from previously-silenced external dependencies, which may substantially increase log volume and noise. Confirm that:

This increased log verbosity is intentional and aligns with observability requirements

The change has been tested in a production-like environment to assess log volume impact

Log aggregation and storage systems can handle the additional throughput

modules/implementation/Cargo.toml (1)

22-22: LGTM!

The dependency additions appropriately support the new manager HTTP server functionality:

tokio "process" feature: enables spawning and managing child processes

chrono "serde" feature: enables timestamp serialization for API responses

warp (0.3.7): composable web framework for HTTP routing

bytes (1.10.1): efficient byte buffer handling

sysinfo (0.36): system information gathering (CPU, memory, processes)

libc (0.2): low-level system bindings

dashmap (6.1.0): concurrent HashMap for thread-safe state management

All versions are current and align with the PR's objective of adding a Manager HTTP server.

Also applies to: 37-37, 44-48

modules/implementation/src/main.rs (1)

5-5: LGTM!

The Manager subcommand integration follows the established pattern used by existing Web and Llm commands:

Public module declaration

CLI variant with associated args type

Command routing to the module's entrypoint function

The implementation is consistent with the codebase's CLI architecture.

Also applies to: 17-17, 35-35

executor/common/src/cancellation.rs (1)

14-23: LGTM!

The CancellationDrop wrapper correctly implements the RAII (Resource Acquisition Is Initialization) pattern, ensuring that the cleanup function is called exactly once when the value goes out of scope. The implementation is straightforward and follows Rust's drop semantics correctly.

executor/common/src/lib.rs (1)

71-91: LGTM!

The refactoring cleanly extracts configuration variable population into a reusable helper function. Benefits:

Separates concerns: variable population vs. config loading

Enables testing of variable population independently

Allows reuse in other contexts (e.g., the new Manager module)

Maintains identical functionality with proper error propagation

The implementation correctly handles executable path resolution, version injection, and environment variable mapping.

Also applies to: 93-97

modules/install/config/genvm-module-llm.yaml (1)

61-61: Gemini 2.0 Flash model update is valid. It’s generally available and retains JSON/image support while offering lower latency, stronger benchmarks, richer multimodal output, and large context windows—no code changes required.

modules/install/data/manifest.yaml (1)

1-3: Timestamp format OK; ensure consumers accept RFC3339 "Z"

Keep using RFC3339 ("...Z"). Ensure all parsers normalize "Z" to +00:00 (the Python server currently needs this fix).

executor/src/exe/mod.rs (1)

1-1: LGTM: exposing parse_version module

Matches CLI addition and decouples concerns cleanly.

modules/install/config/genvm-manager.yaml (1)

1-6: Config looks good

Values are reasonable; manifest_path aligns with modules/install. Ensure any auxiliary tools (e.g., the detect-version server) read the same manifest location.

executor/src/main.rs (1)

30-32: CLI wiring and logging override look solid

Log level override before setup is correct; new subcommand routing matches the added handler. Please just ensure logger::Level implements Clap parsing (FromStr or ValueEnum) so --log-level works as expected.

Also applies to: 54-61, 63-72

modules/implementation/src/manager/handlers.rs (2)

55-72: LGTM!

The handler correctly obtains module locks, warns if modules aren't running, and starts the GenVM. The receiver is intentionally ignored for async fire-and-forget execution.

204-212: LGTM!

The ShutdownRequest struct with default timeout is well-designed for graceful shutdown control.

doc/website/src/impl-spec/01-core-architecture/01-components.rst

executor/common/src/logger/mod.rs

executor/common/src/sync.rs

coderabbitai · 2025-10-06T15:19:45Z

executor/common/src/sync.rs

+        pub fn into_get_sub<'a, R>(self, getter: impl FnOnce(&'a T) -> R) -> DArcStruct<R>
+        where
+            R: 'a,
+        {
+            // SAFETY: actual_ptr is valid for the lifetime of self
+            let data_ref = unsafe { self.actual_ptr.as_ref() };
+            let sub = getter(data_ref);
+            let cb = self.control_block;
+            std::mem::forget(self);
+            DArcStruct {
+                control_block: cb,
+                actual_data: sub,
+            }
+        }
+
+        /// this function is unsound because forall 'a should be in function, but it's nearly impossible to type
+        pub async fn into_get_sub_async<'a, R, F>(
+            self,
+            getter: impl FnOnce(&'a T) -> F,
+        ) -> DArcStruct<R>
+        where
+            F: std::future::Future<Output = R>,
+            R: 'a,
+        {
+            // SAFETY: actual_ptr is valid for the lifetime of self
+            let data_ref = unsafe { self.actual_ptr.as_ref() };
+            let sub = getter(data_ref).await;
+            let cb = self.control_block;
+            std::mem::forget(self);
+            DArcStruct {
+                control_block: cb,
+                actual_data: sub,
+            }
+        }
+


⚠️ Potential issue | 🔴 Critical

Unsound API: into_get_sub and into_get_sub_async expose lifetime bugs

Both functions admit unsoundness. Either constrain R to not borrow from temporaries or mark these as unsafe.

Minimal mitigation:

-pub fn into_get_sub<'a, R>(self, getter: impl FnOnce(&'a T) -> R) -> DArcStruct<R> +pub unsafe fn into_get_sub<'a, R>(self, getter: impl FnOnce(&'a T) -> R) -> DArcStruct<R> @@ -pub async fn into_get_sub_async<'a, R, F>( +pub async unsafe fn into_get_sub_async<'a, R, F>(

Follow-up: document safety requirements (R must only contain data derived from T or be owned/'static). Alternatively, provide a safe variant that enforces R: 'static and recommend gep/gep_async for borrowed returns.

coderabbitai · 2025-10-06T15:19:45Z

executor/install/lib/python/post-install/__main__.py

 executor_root_dir = Path(__file__).parent.parent.parent.parent
 logger.info(f'Executor root directory: {executor_root_dir}')



⚠️ Potential issue | 🔴 Critical

executor_root_dir off-by-one (points to executor/install).

With 4 parents from this file, you end at executor/install, so bin/genvm likely won’t be found. Go up one more level.

-executor_root_dir = Path(__file__).parent.parent.parent.parent +executor_root_dir = Path(__file__).resolve().parent.parent.parent.parent.parent

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

executor_root_dir = Path(__file__).parent.parent.parent.parent

logger.info(f'Executor root directory: {executor_root_dir}')

executor_root_dir = Path(__file__).resolve().parent.parent.parent.parent.parent

logger.info(f'Executor root directory: {executor_root_dir}')

🤖 Prompt for AI Agents

In executor/install/lib/python/post-install/__main__.py around lines 54 to 56, executor_root_dir is set using Path(__file__).parent.parent.parent.parent which stops at executor/install (off-by-one); change it to climb one more level (parent.parent.parent.parent.parent) so executor_root_dir points to the repository root (so bin/genvm and other top-level paths resolve correctly).

coderabbitai · 2025-10-06T15:19:48Z

modules/implementation/src/manager/run.rs

+    let _ = unsafe { libc::fcntl(read_fd, libc::F_SETFL, libc::O_CLOEXEC) };
+
+    let read_fd = unsafe { std::os::unix::io::OwnedFd::from_raw_fd(read_fd) };
+    let write_fd = unsafe { std::os::unix::io::OwnedFd::from_raw_fd(write_fd) };
+


⚠️ Potential issue | 🟠 Major

Incorrect fcntl for close-on-exec

Use F_SETFD with FD_CLOEXEC on the read end. O_CLOEXEC with F_SETFL is wrong.

Apply this diff:

- let _ = unsafe { libc::fcntl(read_fd, libc::F_SETFL, libc::O_CLOEXEC) }; + let _ = unsafe { libc::fcntl(read_fd.as_raw_fd(), libc::F_SETFD, libc::FD_CLOEXEC) };

Note: ensure you pass a raw fd (not OwnedFd) to fcntl, or call as_raw_fd() on it.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In modules/implementation/src/manager/run.rs around lines 456 to 460, the call uses libc::fcntl with F_SETFL and O_CLOEXEC which is incorrect for setting close-on-exec; change it to call libc::fcntl(read_fd_raw, libc::F_SETFD, libc::FD_CLOEXEC) where read_fd_raw is the raw file descriptor (use read_fd as_raw_fd() if you have an OwnedFd, or pass the original raw fd before wrapping), then proceed to construct OwnedFd from_raw_fd for both read and write fds.

modules/implementation/src/manager/versioning.rs

coderabbitai · 2025-10-06T15:19:48Z

tests/runner/mock_host.py

+from origin.base_host import *
+import origin.host_fns
+import origin.public_abi



⚠️ Potential issue | 🔴 Critical

Undefined imports and names; star import. Fix imports and type hints.

public_abi and host_fns are undefined (used without origin. prefix). asyncio and collections are also not imported; annotations will raise at import time. Replace star import and alias modules.

Apply this diff to fix imports and references:

-from origin.base_host import * -import origin.host_fns -import origin.public_abi +import asyncio +from origin.base_host import ( + IHost, + HostException, + host_loop, + DefaultTransactionData, + DefaultEthTransactionData, +) +import origin.host_fns as host_fns +import origin.public_abi as public_abi +from collections.abc import Buffer

And adjust type hints/unused params:

- async def storage_read( - self, mode: public_abi.StorageType, account: bytes, slot: bytes, index: int, le: int - ) -> bytes: + async def storage_read( + self, _mode: public_abi.StorageType, account: bytes, slot: bytes, index: int, le: int + ) -> bytes: assert self.storage is not None return self.storage.read(Address(account), slot, index, le) @@ - async def storage_write( + async def storage_write( self, slot: bytes, index: int, - got: collections.abc.Buffer, + got: Buffer, ) -> None: @@ - async def consume_result( - self, type: public_abi.ResultCode, data: collections.abc.Buffer - ) -> None: + async def consume_result( + self, _type: public_abi.ResultCode, _data: Buffer + ) -> None: self._has_result = True @@ - async def get_leader_nondet_result(self, call_no: int, /) -> collections.abc.Buffer: + async def get_leader_nondet_result(self, call_no: int, /) -> Buffer: if self.leader_nondet is None: raise HostException(host_fns.Errors.I_AM_LEADER) if call_no >= len(self.leader_nondet): raise HostException(host_fns.Errors.ABSENT) res = self.leader_nondet[call_no] if res['kind'] == 'return': return bytes([public_abi.ResultCode.RETURN]) + _calldata.encode(res['value']) if res['kind'] == 'rollback': return bytes([public_abi.ResultCode.USER_ERROR]) + res['value'].encode('utf-8') if res['kind'] == 'contract_error': return bytes([public_abi.ResultCode.VM_ERROR]) + res['value'].encode('utf-8') assert False

Also applies to: 139-176

🧰 Tools

🪛 Ruff (0.13.3)

25-25: from origin.base_host import * used; unable to detect undefined names

(F403)

🤖 Prompt for AI Agents

In tests/runner/mock_host.py around lines 25-28 (and similarly for lines 139-176), replace the star import from origin.base_host with explicit imports of only the symbols you use, and import origin.public_abi and origin.host_fns as modules (e.g. import origin.public_abi as public_abi, import origin.host_fns as host_fns); also add missing imports for asyncio and collections (or typing collections where needed) so annotations don't fail at import time; update all references that currently rely on implicit origin. qualification to use the module aliases (public_abi.X, host_fns.Y), remove or fix any unused parameters and convert forward/optional type hints to string literals or from __future__ import annotations to avoid runtime evaluation, and ensure type hints reference the explicitly imported symbols rather than relying on a star import.

tests/runner/origin/base_host.py

tests/runner/run.py

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

executor/src/wasi/preview1.rs (1)

640-648: Potential panic when offset exceeds file size.

While line 640 correctly clamps remaining_len to 0 when offset > contents.len(), the slice operation at line 643 can still panic. Rust's slice indexing &contents[offset..offset] panics when offset > contents.len(), even for an empty range.

POSIX pread should return 0 bytes read when offset is at or beyond EOF, not trap. A WASM guest could legitimately probe beyond file size or pass an invalid offset.

Apply this diff to guard the slice operation:
     let remaining_len = contents.len().saturating_sub(offset);
     let mut buf_len: usize = iov.buf_len.try_into()?;
     buf_len = buf_len.min(remaining_len);
-    let cont_slice: &[u8] = &contents.as_ref()[offset..(offset + buf_len)];
-    memory.copy_from_slice(cont_slice, iov.buf.as_array(buf_len.try_into()?))?;
+    if buf_len > 0 {
+        let cont_slice: &[u8] = &contents.as_ref()[offset..(offset + buf_len)];
+        memory.copy_from_slice(cont_slice, iov.buf.as_array(buf_len.try_into()?))?;
+    }
     offset += buf_len;
     written += buf_len;

tests/runner/run.py (1)

544-556: Don’t assume e.args[-1] is a dict; avoid masking original error.

This can raise TypeError/IndexError inside the except. Safely merge extras.

-            except Exception as e:
+            except Exception as e:
                 time_elapsed = time.monotonic() - time_start
-                self.reporter.report_single(
+                extras = {}
+                try:
+                    if getattr(e, "args", None) and isinstance(e.args[-1], dict):
+                        extras = e.args[-1]
+                except Exception:
+                    extras = {}
+                self.reporter.report_single(
                     test_name,
                     {
                         'category': 'fail',
                         'steps': steps,
                         'exception': 'internal error',
                         'exc': e,
                         'elapsed': time_elapsed,
-                        **e.args[-1],
+                        **extras,
                     },
                 )
                 return

♻️ Duplicate comments (8)

tests/runner/run.py (1)

110-119: Fixed: single-read of aiohttp response body.

No more double-reading; robust check before returning parsed JSON.

modules/implementation/src/manager/handlers.rs (3)

207-227: Return non-200 on shutdown failure.

Currently returns 200 with error payload; use error propagation so Warp sets proper status.

-    match result {
-        Ok(()) => Ok(warp::reply::json(&serde_json::json!({
-            "result": "shutdown_completed",
-            "genvm_id": genvm_id
-        }))),
-        Err(e) => Ok(warp::reply::json(&serde_json::json!({
-            "error": format!("{}", e),
-            "genvm_id": genvm_id
-        }))),
-    }
+    result?;
+    Ok(warp::reply::json(&serde_json::json!({
+        "result": "shutdown_completed",
+        "genvm_id": genvm_id
+    })))

73-111: Incomplete readonly handler: placeholders and swallowed results.

host set to "TODO"
bypasses module locks (Box::new(()))
discards recv/status results
returns hardcoded {"schema":"contract_schema"}

Implement real logic or fail explicitly; placeholders will mislead clients.

-    let req = run::Request { /* ... */, host: "TODO".to_owned(), /* ... */ };
-    let (genvm_id, recv) = run::start_genvm(ctx.clone(), req, Box::new(())).await?;
-    let _ = recv.await;
-    let _ = ctx.run_ctx.get_genvm_status(genvm_id).await;
-    Ok(warp::reply::json(&serde_json::json!({"schema": "contract_schema"})))
+    // Option A: implement actual readonly introspection (preferred)
+    // - Acquire module locks if required
+    // - Use a dedicated executor subcommand to extract schema without a host
+    // - Return parsed schema JSON
+    // Option B: until implemented, return 501
+    anyhow::bail!("genvm_run_readonly_not_implemented");

Or, if you must start a GenVM here, supply a valid host, pass real module locks via Ctx::get_module_locks, and propagate recv/status errors.

154-173: Critical: unauthenticated arbitrary env var setter.

Allows setting any OS env via HTTP; high-risk (code execution/creds exposure). Gate with auth and whitelist keys.

-pub async fn handle_set_env(
-    _ctx: sync::DArc<AppContext>,
+const ALLOWED_ENV_VARS: &[&str] = &["GENVM_LOG_LEVEL", "GENVM_CONFIG"];
+pub async fn handle_set_env(
+    _ctx: sync::DArc<AppContext>,
     data: serde_json::Value,
 ) -> Result<impl warp::Reply> {
-    let key = data
+    let key = data
         .get("key")
         .and_then(|v| v.as_str())
         .ok_or_else(|| anyhow::anyhow!("invalid env var key"))?;
+    if !ALLOWED_ENV_VARS.contains(&key) {
+        anyhow::bail!("env_var_not_allowed");
+    }
     let value = data
         .get("value")
         .and_then(|v| v.as_str())
         .ok_or_else(|| anyhow::anyhow!("invalid env var value"))?;
     std::env::set_var(key, value);
     Ok(warp::reply::json(
         &serde_json::json!({"result": "env_var_set", "key": key, "value": value}),
     ))
 }

Also add auth/authorization before applying changes.

modules/implementation/src/manager/mod.rs (2)

96-103: Invalid return type for unwrap_all_anyhow (Extract associated type).

Use Extract = (R,) to satisfy Warp’s trait bounds.

-fn unwrap_all_anyhow<R: warp::Reply + 'static>(
+fn unwrap_all_anyhow<R: warp::Reply + 'static>(
     route: impl warp::Filter<Extract = (anyhow::Result<R>,), Error = warp::Rejection>
         + Clone
         + Send
         + Sync
         + 'static,
-) -> impl warp::Filter<Error = warp::Rejection, Extract: warp::Reply> + Clone + Send + Sync + 'static
+) -> impl warp::Filter<Extract = (R,), Error = warp::Rejection> + Clone + Send + Sync + 'static
 {

327-331: Avoid unwrap on host IP; propagate parse errors.

Prevent panic on invalid args.host.

-    let (addr, fut) = serv.bind_with_graceful_shutdown(
-        (args.host.parse::<std::net::IpAddr>().unwrap(), args.port),
+    let ip: std::net::IpAddr = args
+        .host
+        .parse()
+        .with_context(|| format!("invalid host IP: {}", args.host))?;
+    let (addr, fut) = serv.bind_with_graceful_shutdown(
+        (ip, args.port),
         async move { cancellation.chan.closed().await },
     );

modules/implementation/src/manager/run.rs (2)

79-86: Stop leaking throttled permits

We’re still calling mem::forget on owned permits when shrinking the pool. That leaks the permit handle forever and prevents us from releasing those tokens when the limit goes back up. Please retain the permits in a tracked collection and drop them when increasing the limit, as suggested earlier. For example:

 pub struct Ctx {
     known_executions: dashmap::DashMap<GenVMId, sync::DArc<SingleGenVMContext>>,
     next_genvm_id: std::sync::atomic::AtomicU64,
     pub permits: Arc<tokio::sync::Semaphore>,
     pub max_permits: tokio::sync::Mutex<usize>,
+    throttled_permits: tokio::sync::Mutex<Vec<tokio::sync::OwnedSemaphorePermit>>,
@@
-        if *permits_lock > permits {
-            let delta = *permits_lock - permits;
-            let p = self.permits.acquire_many(delta as u32).await.unwrap();
-            std::mem::forget(p);
+        if *permits_lock > permits {
+            let delta = *permits_lock - permits;
+            let mut buf = self.throttled_permits.lock().await;
+            for _ in 0..delta {
+                let p = self.permits.clone().acquire_owned().await.unwrap();
+                buf.push(p);
+            }
         } else {
             let delta = permits - *permits_lock;
-            self.permits.add_permits(delta);
+            let mut buf = self.throttled_permits.lock().await;
+            let mut remaining = delta;
+            while remaining > 0 {
+                if let Some(p) = buf.pop() {
+                    drop(p);
+                    remaining -= 1;
+                } else {
+                    self.permits.add_permits(remaining);
+                    break;
+                }
+            }
         }

Don’t forget to initialize throttled_permits in Ctx::new().

460-460: Use FD_CLOEXEC when setting close-on-exec

The fcntl call still passes O_CLOEXEC. With F_SETFD you must use FD_CLOEXEC; otherwise the close-on-exec flag isn’t set correctly.

-    let _ = unsafe { libc::fcntl(read_fd, libc::F_SETFD, libc::O_CLOEXEC) };
+    let _ = unsafe { libc::fcntl(read_fd, libc::F_SETFD, libc::FD_CLOEXEC) };

That keeps the read end from leaking into the child after exec.

🧹 Nitpick comments (13)

modules/implementation/src/scripting/ctx/filters.rs (1)
21-23: Remove unnecessary .peekable() call.

The refactoring from while let to for ch in is good, but since .peek() is never called in the function body, the .peekable() adapter is now dead code and should be removed for clarity.

Apply this diff:
-    let chars_iter = input.chars().peekable();
+    let chars_iter = input.chars();
modules/install/config/genvm-llm-default.lua (1)
59-61: Consider removing commented-out code.

The commented-out lines suggest this is a transitional state. If the new retry behavior is confirmed and permanent, remove lines 60-61 to keep the codebase clean.

Apply this diff to remove the dead code:
-			-- lets fall back to retry
-			-- as_user_error.fatal = true
-			-- lib.rs.user_error(as_user_error)
tests/runner/origin/base_host.py (3)
116-138: Recommend adding timeout to aiohttp request.

The aiohttp.request call at Line 119 lacks a timeout parameter. If the manager becomes unresponsive, this request could hang indefinitely, blocking test execution.

Apply this diff to add a timeout:
 async def get_pre_deployment_writes(
 	code: bytes, timestamp: datetime.datetime, manager_uri: str
 ) -> list[tuple[bytes, int, bytes]]:
 	async with aiohttp.request(
 		'POST',
+		timeout=aiohttp.ClientTimeout(total=30),
 		headers={
 			'Deployment-Timestamp': timestamp.astimezone(datetime.UTC).isoformat(
 				timespec='milliseconds'
 			),
 		},
 		url=f'{manager_uri}/contract/pre-deploy-writes',
 		data=code,
 	) as resp:
386-386: Hardcoded datetime fallback may mask missing test data.

The hardcoded fallback '2024-11-26T06:42:42.424242Z' at Line 386 could hide cases where the test message is missing the required datetime field. Consider whether this should raise an error or log a warning instead.

437-449: Add timeout to status polling request.

The aiohttp.request call at Line 437 lacks a timeout parameter. Status polling should have a reasonable timeout to prevent indefinite hangs.

Apply this diff:
 		async with aiohttp.request(
 			'GET',
+			timeout=aiohttp.ClientTimeout(total=30),
 			f'{manager_uri}/genvm/{genvm_id}',
 		) as resp:
modules/implementation/Cargo.toml (1)

44-49: New deps LGTM; verify sysinfo features.

warp/bytes/dashmap/sha3 choices are standard.

sysinfo 0.36 with default-features = false, features = ["system"] should cover memory stats; ensure no code path needs “process” or other features.

Based on learnings
tests/runner/run.py (3)
128-133: Silence unused arg; keep signature.

Rename frame to _frame to satisfy linters.
-    def stop_handler(_, frame):
+    def stop_handler(_, _frame):
272-279: Clarify precedence in condition.

Parenthesize to satisfy Ruff and avoid ambiguity.
-            if res['category'] == 'fail' and 'steps' in res or self.config.args.show_steps:
+            if (res['category'] == 'fail' and 'steps' in res) or self.config.args.show_steps:
475-480: Avoid eval for untrusted inputs (use literal_eval or restrict env).

If only literals are expected, switch to ast.literal_eval; otherwise tightly scope globals.
+        import ast
         calldata_bytes = calldata.encode(
-            eval(
-                single_conf['calldata'],
-                globals(),
-                single_conf['vars'].copy(),
-            )
+            ast.literal_eval(single_conf['calldata'])
         )
If expressions are required, consider eval with an empty globals and a minimal locals whitelist.
modules/implementation/src/manager/handlers.rs (1)

247-291: Confirm code-slot derivation and size encoding meet spec.

code_slot = Sha3([0u8;32] || 1u32le) is constant; ensure this is the intended slot for all contracts.

code length is stored as u32; guard against len > u32::MAX (unlikely, but add sanity check).
modules/implementation/src/manager/versioning.rs (3)
189-194: Check child exit status; surface failure.

Ensure parse helper exited successfully before using result.
-    let detected_version = task.await;
-
-    let _ = proc.wait().await;
-
-    detected_version.map(|v| v.min(possible_major))
+    let detected_version = task.await;
+    let status = proc.wait().await?;
+    if !status.success() {
+        anyhow::bail!("genvm parse-version-pattern failed with status {}", status);
+    }
+    detected_version.map(|v| v.min(possible_major))
111-122: Patch-raising loop likely redundant.

You already select max version by (major, minor, patch); the loop incrementing patch adds no value. Consider removing for simplicity.
-        loop {
-            let mut next = ver;
-            next.patch += 1;
-            if lock.executor_versions.get(&next).is_some() {
-                ver = next;
-            } else {
-                break;
-            }
-        }
+        // ver is already the highest <= timestamp for the requested major.
180-185: Nit: use char in strip_prefix.

Minor perf/readability.
-        let res = res.strip_prefix("v").unwrap_or(res);
+        let res = res.strip_prefix('v').unwrap_or(res);

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between decff4a and 4490c1a.

⛔ Files ignored due to path filters (2)

executor/common/Cargo.lock is excluded by !**/*.lock
modules/implementation/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (13)

doc/website/src/impl-spec/01-core-architecture/01-components.rst (3 hunks)
executor/common/src/sync.rs (6 hunks)
executor/src/wasi/preview1.rs (1 hunks)
modules/implementation/Cargo.toml (2 hunks)
modules/implementation/fuzz/genvm-modules-normalize-whitespace.rs (1 hunks)
modules/implementation/src/manager/handlers.rs (1 hunks)
modules/implementation/src/manager/mod.rs (1 hunks)
modules/implementation/src/manager/run.rs (1 hunks)
modules/implementation/src/manager/versioning.rs (1 hunks)
modules/implementation/src/scripting/ctx/filters.rs (2 hunks)
modules/install/config/genvm-llm-default.lua (1 hunks)
tests/runner/origin/base_host.py (13 hunks)
tests/runner/run.py (4 hunks)

🧰 Additional context used

🧬 Code graph analysis (7)

modules/implementation/src/manager/run.rs (2)

executor/common/src/logger/mod.rs (3)

std (82-82)

fmt (22-30)

from_str (36-46)

executor/common/src/sync.rs (9)

into_arc (33-42)

new (156-184)

new (377-382)

new (422-424)

new (459-461)

Self (161-161)

drop (118-136)

drop (320-338)

ptr (174-174)

tests/runner/run.py (4)

tests/runner/origin/logger.py (9)

error (22-23)

Logger (6-26)

log (8-8)

log (40-41)

log (48-49)

log (68-78)

with_keys (25-26)

with_keys (43-44)

with_keys (51-52)

runners/genlayer-py-std/src/genlayer/py/types.py (1)

Address (122-254)

tests/runner/mock_host.py (3)

MockHost (55-215)

MockStorage (30-52)

write (41-52)

tests/runner/origin/base_host.py (2)

get_pre_deployment_writes (116-137)

run_genvm (360-500)

tests/runner/origin/base_host.py (4)

tests/runner/origin/logger.py (6)

Logger (6-26)

NoLogger (47-52)

trace (10-11)

debug (13-14)

warning (19-20)

error (22-23)

tests/runner/origin/host_fns.py (2)

Errors (25-30)

Methods (7-22)

tests/runner/origin/public_abi.py (2)

StorageType (14-17)

ResultCode (7-11)

tests/runner/mock_host.py (8)

loop_enter (115-134)

get_calldata (136-137)

consume_result (157-160)

get_leader_nondet_result (165-177)

post_nondet_result (179-180)

consume_gas (206-207)

get_balance (209-210)

remaining_fuel_as_gen (145-146)

modules/implementation/src/manager/versioning.rs (1)

modules/implementation/src/manager/run.rs (2)

tokio (512-512)

new (181-217)

modules/implementation/src/manager/handlers.rs (4)

executor/common/src/logger/mod.rs (2)

std (82-82)

from_str (36-46)

modules/implementation/src/manager/modules.rs (3)

get_status (133-149)

get_module_locks (151-167)

new (30-36)

modules/implementation/src/manager/versioning.rs (3)

res (184-184)

new (68-74)

detect_major_spec (125-194)

modules/implementation/src/manager/run.rs (2)

start_genvm (404-537)

new (181-217)

executor/common/src/sync.rs (1)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

modules/implementation/src/manager/mod.rs (6)

executor/common/src/sync.rs (4)

new (156-184)

new (377-382)

new (422-424)

new (459-461)

modules/implementation/src/manager/run.rs (2)

new (181-217)

start_service (241-260)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

modules/implementation/src/manager/modules.rs (1)

new (30-36)

modules/implementation/src/manager/handlers.rs (16)

serde_json (28-28)

serde_json (45-45)

handle_status (12-17)

handle_module_start (41-52)

handle_module_stop (24-39)

handle_genvm_run (54-71)

handle_genvm_run_readonly (73-111)

handle_contract_detect_version (113-124)

handle_set_log_level (126-144)

handle_manifest_reload (146-152)

handle_set_env (154-173)

handle_get_permits (175-178)

handle_set_permits (180-195)

handle_genvm_shutdown (207-227)

handle_genvm_status (229-239)

handle_make_deployment_storage_writes (247-291)

executor/common/src/lib.rs (1)

load_config (93-107)

🪛 Ruff (0.13.3)

tests/runner/run.py

118-118: Create your own exception

(TRY002)

118-118: Avoid specifying long messages outside the exception class

(TRY003)

128-128: Unused function argument: frame

(ARG001)

140-140: subprocess call: check for execution of untrusted input

(S603)

272-272: Parenthesize a and b expressions when chaining and and or together, to make the precedence clear

Parenthesize the and subexpression

(RUF021)

325-325: Use raise without specifying exception name

Remove exception name

(TRY201)

392-392: subprocess call: check for execution of untrusted input

(S603)

430-430: subprocess call: check for execution of untrusted input

(S603)

431-431: Starting a process with a partial executable path

(S607)

448-448: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)

476-480: Use of possibly insecure function; consider using ast.literal_eval

(S307)

486-486: Probable insecure usage of temporary file or directory: "/tmp"

(S108)

544-544: Do not catch blind exception: Exception

(BLE001)

599-599: Use explicit conversion flag

Replace with conversion flag

(RUF010)

599-599: Use explicit conversion flag

Replace with conversion flag

(RUF010)

704-704: Function definition does not bind loop variable f

(B023)

707-707: Use raise without specifying exception name

Remove exception name

(TRY201)

717-717: Function definition does not bind loop variable future

(B023)

720-720: Use raise without specifying exception name

Remove exception name

(TRY201)

tests/runner/origin/base_host.py

130-130: Create your own exception

(TRY002)

130-130: Avoid specifying long messages outside the exception class

(TRY003)

371-371: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

408-408: f-string without any placeholders

Remove extraneous f prefix

(F541)

410-410: Create your own exception

(TRY002)

410-410: Avoid specifying long messages outside the exception class

(TRY003)

415-415: Store a reference to the return value of asyncio.ensure_future

(RUF006)

474-474: Do not catch blind exception: Exception

(BLE001)

478-478: Do not catch blind exception: Exception

(BLE001)

482-482: Create your own exception

(TRY002)

500-500: Create your own exception

(TRY002)

500-500: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (14)

modules/implementation/src/scripting/ctx/filters.rs (1)

85-88: Stylistic change from map_or to if let.

The change from c.width().map_or(false, |w| w > 0) to the if let Some(w) form is functionally equivalent. Both are valid; the if let version is slightly more verbose but arguably more explicit.

modules/install/config/genvm-llm-default.lua (1)

58-62: Verify that all non-overload provider failures should be retryable.

The change converts non-overload provider failures from fatal errors to retryable conditions. Confirm that non-transient errors (authentication failures, invalid configuration, etc.) won't be masked by the generic NO_PROVIDER_FOR_PROMPT error raised after exhausting all providers.

Consider documenting this behavior change and adding metrics or structured logging to track which error types trigger retries versus immediate failure.

modules/implementation/fuzz/genvm-modules-normalize-whitespace.rs (2)

89-92: LGTM! Cleaner argument passing.

Passing s directly instead of &s avoids an unnecessary borrow layer (since s is already &str). This is cleaner and more direct.

94-96: LGTM! Idiomatic empty check.

Using is_empty() instead of res.len() == 0 follows Rust conventions and improves readability.

executor/src/wasi/preview1.rs (1)

640-640: LGTM: Cleaner boundary handling.

The change from checked_sub(offset).unwrap_or(0) to saturating_sub(offset) is more idiomatic and achieves the same result—clamping underflow to 0 when offset exceeds content length.

tests/runner/origin/base_host.py (7)

9-22: LGTM!

The new imports (aiohttp, host_fns, public_abi, Logger) are appropriate for the remote GenVM manager orchestration and logging integration.

25-29: LGTM!

The HostException constructor correctly uses host_fns.Errors for the error_code parameter, aligning with the new module structure.

53-61: LGTM!

The storage_read signature correctly uses public_abi.StorageType for the mode parameter, aligning with the public ABI types.

72-74: LGTM!

The consume_result signature correctly uses public_abi.ResultCode for the type parameter, aligning with the public ABI types.

140-146: LGTM!

The logger parameter integration is clean, and the trace logging at loop entry/exit provides useful debugging information.

172-335: LGTM!

The method dispatch logic correctly uses host_fns.Methods enum constants throughout and consistently returns host_fns.Errors.OK on success paths. The type conversions to public_abi.StorageType (Line 186) and public_abi.ResultCode (Line 212) are also correct.

345-358: LGTM!

The error handling for transport-level exceptions (aiohttp.ClientError, asyncio.TimeoutError) has been correctly implemented, addressing the concern from the previous review. Cleanup failures are now logged without propagating and aborting the main execution.

modules/implementation/Cargo.toml (2)

22-22: Tokio “process” feature addition looks correct.

Matches use of tokio::process in manager; no issues.

37-37: Enabling chrono “serde” is appropriate.

Required for DateTime (versioning/manifests); good.

executor/common/src/sync.rs

coderabbitai · 2025-10-07T15:32:09Z

tests/runner/origin/base_host.py

+	message: typing.Any,
+	host_data: str = '',
+	host: str,
+	extra_args: list[str] = [],


⚠️ Potential issue | 🟠 Major

Fix mutable default argument.

Using a mutable list [] as a default argument can cause subtle bugs if the list is mutated across multiple function calls. Python evaluates default arguments once at function definition time, not at call time.

Apply this diff to fix:

- extra_args: list[str] = [], + extra_args: list[str] | None = None, ) -> RunHostAndProgramRes: + if extra_args is None: + extra_args = []

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

extra_args: list[str] = [],

extra_args: list[str] | None = None,

) -> RunHostAndProgramRes:

if extra_args is None:

extra_args = []

🧰 Tools

🪛 Ruff (0.13.3)

371-371: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🤖 Prompt for AI Agents

In tests/runner/origin/base_host.py around line 371, the function parameter uses a mutable default argument extra_args: list[str] = [], which can be shared across calls; change the signature to accept None (e.g., extra_args: Optional[list[str]] = None) and inside the function set extra_args = [] if extra_args is None so each call gets a fresh list, preserving the type hint and behavior.

coderabbitai · 2025-10-07T15:32:09Z

tests/runner/origin/base_host.py

+			async with aiohttp.request(
+				'POST',
+				f'{manager_uri}/genvm/run',
+				json={
+					'major': 0,  # FIXME
+					'message': message,
+					'is_sync': is_sync,
+					'capture_output': capture_output,
+					'host_data': host_data,
+					'max_execution_minutes': max_exec_mins,  # this parameter is needed to prevent zombie genvms
+					'timestamp': timestamp,
+					'host': host,
+					'extra_args': extra_args,
+				},
+			) as resp:
+				logger.debug('post /genvm/run', status=resp.status)
+				data = await resp.json()
+				logger.trace('post /genvm/run', body=data)
+				if resp.status != 200:
+					logger.error(
+						f'genvm manager /genvm/run failed', status=resp.status, body=data
+					)
+					raise Exception(f'genvm manager /genvm/run failed: {resp.status} {data}')
+				else:
+					genvm_id = data['id']
+					logger.debug('genvm manager /genvm', genvm_id=genvm_id, status=resp.status)
+					genvm_id_cell[0] = genvm_id
+					asyncio.ensure_future(wrap_timeout(genvm_id))


⚠️ Potential issue | 🟠 Major

Add timeout to manager request and store ensure_future result.

Two issues:

The aiohttp.request call at Line 388 lacks a timeout, which could cause test hangs if the manager is unresponsive.

Line 415: The result of asyncio.ensure_future(wrap_timeout(genvm_id)) is not stored, which can lead to task reference leaks and makes it impossible to cancel or check the timeout task status.

Apply this diff:

async with aiohttp.request( 'POST', + timeout=aiohttp.ClientTimeout(total=60), f'{manager_uri}/genvm/run', json={ 'major': 0, # FIXME 'message': message, 'is_sync': is_sync, 'capture_output': capture_output, 'host_data': host_data, 'max_execution_minutes': max_exec_mins, # this parameter is needed to prevent zombie genvms 'timestamp': timestamp, 'host': host, 'extra_args': extra_args, }, ) as resp: logger.debug('post /genvm/run', status=resp.status) data = await resp.json() logger.trace('post /genvm/run', body=data) if resp.status != 200: logger.error( f'genvm manager /genvm/run failed', status=resp.status, body=data ) raise Exception(f'genvm manager /genvm/run failed: {resp.status} {data}') else: genvm_id = data['id'] logger.debug('genvm manager /genvm', genvm_id=genvm_id, status=resp.status) genvm_id_cell[0] = genvm_id - asyncio.ensure_future(wrap_timeout(genvm_id)) + timeout_task = asyncio.ensure_future(wrap_timeout(genvm_id))

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools

🪛 Ruff (0.13.3)

408-408: f-string without any placeholders

Remove extraneous f prefix

(F541)

410-410: Create your own exception

(TRY002)

410-410: Avoid specifying long messages outside the exception class

(TRY003)

415-415: Store a reference to the return value of asyncio.ensure_future

(RUF006)

🤖 Prompt for AI Agents

In tests/runner/origin/base_host.py around lines 388 to 415, the POST to manager lacks a request timeout and the ensure_future call is not stored; update the aiohttp.request call to include a ClientTimeout (e.g., aiohttp.ClientTimeout(total=...) or an appropriate per-request timeout) so the test won’t hang if the manager is unresponsive, and capture the return of asyncio.ensure_future(wrap_timeout(genvm_id)) into a variable or a pre-existing cell/list (so the task reference can be cancelled or inspected later) instead of discarding it.

coderabbitai · 2025-10-07T15:32:09Z

tests/runner/origin/base_host.py

+				if resp.status != 200 and body['status'] is not None:
+					new_res = Exception(f'genvm manager /genvm failed: {resp.status} {body}')
+				else:
+					new_res = typing.cast(dict, body['status'])
+			status_cell[0] = new_res
+			return new_res


⚠️ Potential issue | 🔴 Critical

Fix status check logic error.

Line 444 checks resp.status != 200 and body['status'] is not None, which creates an error only when both conditions are true. This logic is backwards:

If resp.status != 200, it's an error regardless of body['status'].

If resp.status == 200 but body['status'] is None, that's also likely an error (incomplete response).

The condition should use OR logic, not AND.

Apply this diff to fix the logic:

logger.debug('get /genvm', genvm_id=genvm_id, status=resp.status) body = await resp.json() logger.trace('get /genvm', genvm_id=genvm_id, body=body) - if resp.status != 200 and body['status'] is not None: + if resp.status != 200 or body['status'] is None: new_res = Exception(f'genvm manager /genvm failed: {resp.status} {body}') else: new_res = typing.cast(dict, body['status'])

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if resp.status != 200 and body['status'] is not None:

new_res = Exception(f'genvm manager /genvm failed: {resp.status} {body}')

else:

new_res = typing.cast(dict, body['status'])

status_cell[0] = new_res

return new_res

logger.debug('get /genvm', genvm_id=genvm_id, status=resp.status)

body = await resp.json()

logger.trace('get /genvm', genvm_id=genvm_id, body=body)

if resp.status != 200 or body['status'] is None:

new_res = Exception(f'genvm manager /genvm failed: {resp.status} {body}')

else:

new_res = typing.cast(dict, body['status'])

status_cell[0] = new_res

return new_res

🤖 Prompt for AI Agents

In tests/runner/origin/base_host.py around lines 444 to 449, the status-check uses "resp.status != 200 and body['status'] is not None" which only treats errors when both are true; change the condition to use OR so it treats any non-200 response or a missing body['status'] as an error. Update the if to "if resp.status != 200 or body['status'] is None" and construct the Exception in that branch, otherwise cast and return body['status'] as before so status_cell[0] is set to the dict on success.

coderabbitai · 2025-10-07T15:32:09Z

tests/runner/origin/base_host.py

+	async def prob_died():
+		await asyncio.wait(
+			[
+				asyncio.ensure_future(asyncio.sleep(1)),
+				asyncio.ensure_future(cancellation_event.wait()),
+			],
 			return_when=asyncio.FIRST_COMPLETED,
 		)
-		if coro_loop in done:
-			await wait_all_timeout()
+		genvm_id = genvm_id_cell[0]
+		if genvm_id is None:
+			return
+		status = await poll_status(genvm_id)
+		if status is not None and not cancellation_event.is_set():
+			logger.error('genvm died without connecting', genvm_id=genvm_id, status=status)
+			cancellation_event.set()



🛠️ Refactor suggestion | 🟠 Major

Clarify prob_died type checking logic.

At Line 463, status is checked as status is not None, but status can be either a dict or an Exception (per lines 445-447). The check should distinguish between these types more explicitly to avoid confusion.

Apply this diff for clarity:

async def prob_died(): await asyncio.wait( [ asyncio.ensure_future(asyncio.sleep(1)), asyncio.ensure_future(cancellation_event.wait()), ], return_when=asyncio.FIRST_COMPLETED, ) genvm_id = genvm_id_cell[0] if genvm_id is None: return status = await poll_status(genvm_id) - if status is not None and not cancellation_event.is_set(): + if isinstance(status, (dict, Exception)) and not cancellation_event.is_set(): logger.error('genvm died without connecting', genvm_id=genvm_id, status=status) cancellation_event.set()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def prob_died():

await asyncio.wait(

[

asyncio.ensure_future(asyncio.sleep(1)),

asyncio.ensure_future(cancellation_event.wait()),

],

return_when=asyncio.FIRST_COMPLETED,

)

if coro_loop in done:

await wait_all_timeout()

genvm_id = genvm_id_cell[0]

if genvm_id is None:

return

status = await poll_status(genvm_id)

if status is not None and not cancellation_event.is_set():

logger.error('genvm died without connecting', genvm_id=genvm_id, status=status)

cancellation_event.set()

async def prob_died():

await asyncio.wait(

[

asyncio.ensure_future(asyncio.sleep(1)),

asyncio.ensure_future(cancellation_event.wait()),

],

return_when=asyncio.FIRST_COMPLETED,

)

genvm_id = genvm_id_cell[0]

if genvm_id is None:

return

status = await poll_status(genvm_id)

if isinstance(status, (dict, Exception)) and not cancellation_event.is_set():

logger.error('genvm died without connecting', genvm_id=genvm_id, status=status)

cancellation_event.set()

🤖 Prompt for AI Agents

In tests/runner/origin/base_host.py around lines 451 to 466, the current check "status is not None" is ambiguous because poll_status can return either a dict or an Exception; change the logic to explicitly check types: treat a successful status only when isinstance(status, dict) (and then log the genvm died with that dict and set the cancellation_event), and add a separate branch for isinstance(status, Exception) to log the exception (and also set cancellation_event if appropriate). Use isinstance checks rather than None checks so the behavior is clear and correct.

coderabbitai · 2025-10-07T15:32:09Z

tests/runner/origin/base_host.py

+	fut_host = asyncio.ensure_future(wrap_host())
+	fut_proc = asyncio.ensure_future(wrap_proc())
+	await asyncio.wait([fut_host, fut_proc, asyncio.ensure_future(prob_died())])

+	exceptions: list[Exception] = []
+	try:
+		fut_host.result()
+	except Exception as e:
+		exceptions.append(e)
 	try:
-		await coro_loop
-	except ConnectionResetError:
-		pass
-	except (Exception, asyncio.CancelledError) as e:
-		errors.append(e)
-
-	exit_code = await process.wait()
-
-	if not handler.has_result():
-		if (
-			deadline_future is None
-			or deadline_future is not None
-			and deadline_future not in done
-		):
-			errors.append(Exception('no result provided'))
-		else:
-			await handler.consume_result(ResultCode.VM_ERROR, b'timeout')
-
-	result = RunHostAndProgramRes(
-		b''.join(stdout).decode(),
-		b''.join(stderr).decode(),
-		b''.join(genvm_log).decode(),
-	)
-
-	if len(errors) > 0:
-		raise Exception(
-			*errors,
-			{
-				'stdout': result.stdout,
-				'stderr': result.stderr,
-				'genvm_log': result.genvm_log,
-			},
-		) from errors[0]
-
-	return result
+		fut_proc.result()
+	except Exception as e:
+		exceptions.append(e)
+
+	if len(exceptions) > 0:
+		raise Exception(*exceptions) from exceptions[0]
+
+	genvm_id = genvm_id_cell[0]
+	if genvm_id is not None:
+		await _send_timeout(manager_uri, genvm_id, logger)
+
+		status = await poll_status(genvm_id)
+		if isinstance(status, Exception):
+			exceptions.append(status)
+		if len(exceptions) > 0:
+			final_exception = Exception('execution failed', exceptions[1:])
+			raise final_exception from exceptions[0]
+		return RunHostAndProgramRes(
+			stdout=status['stdout'],
+			stderr=status['stderr'],
+			genvm_log='# currently absent',
+		)
+
+	raise Exception('Execution failed')


⚠️ Potential issue | 🟠 Major

Simplify exception aggregation and finalization logic.

The exception handling flow (Lines 471-500) is convoluted:

Lines 471-479: Collect exceptions from fut_host and fut_proc.

Lines 481-482: Raise if exceptions exist.

Lines 484-493: Poll status again and potentially raise again.

Line 500: Raise if genvm_id is None.

This creates several issues:

Lines 481-482 raise immediately, making lines 484-500 unreachable in error cases.

Line 492: Creates Exception('execution failed', exceptions[1:]) but exceptions[1:] ignores the first exception in the error message.

The flow doesn't clearly separate "task failed" vs "status retrieval failed" cases.

Consider refactoring to:

Collect all exceptions without early raises.

Poll status only once at the end.

Aggregate all exceptions and raise once with clear context.

Example refactor:

fut_host = asyncio.ensure_future(wrap_host()) fut_proc = asyncio.ensure_future(wrap_proc()) await asyncio.wait([fut_host, fut_proc, asyncio.ensure_future(prob_died())]) exceptions: list[Exception] = [] try: fut_host.result() except Exception as e: exceptions.append(e) try: fut_proc.result() except Exception as e: exceptions.append(e) - - if len(exceptions) > 0: - raise Exception(*exceptions) from exceptions[0] - + genvm_id = genvm_id_cell[0] - if genvm_id is not None: + if genvm_id is None: + exceptions.append(Exception('GenVM ID not set - execution did not start')) + else: await _send_timeout(manager_uri, genvm_id, logger) - status = await poll_status(genvm_id) if isinstance(status, Exception): exceptions.append(status) - if len(exceptions) > 0: - final_exception = Exception('execution failed', exceptions[1:]) - raise final_exception from exceptions[0] - return RunHostAndProgramRes( + elif len(exceptions) == 0: + return RunHostAndProgramRes( stdout=status['stdout'], stderr=status['stderr'], genvm_log='# currently absent', ) - - raise Exception('Execution failed') + + # Raise all collected exceptions + if len(exceptions) > 0: + final_exception = Exception('GenVM execution failed', *exceptions[1:]) + raise final_exception from exceptions[0]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

fut_host = asyncio.ensure_future(wrap_host())

fut_proc = asyncio.ensure_future(wrap_proc())

await asyncio.wait([fut_host, fut_proc, asyncio.ensure_future(prob_died())])

exceptions: list[Exception] = []

try:

fut_host.result()

except Exception as e:

exceptions.append(e)

try:

await coro_loop

except ConnectionResetError:

pass

except (Exception, asyncio.CancelledError) as e:

errors.append(e)

exit_code = await process.wait()

if not handler.has_result():

if (

deadline_future is None

or deadline_future is not None

and deadline_future not in done

):

errors.append(Exception('no result provided'))

else:

await handler.consume_result(ResultCode.VM_ERROR, b'timeout')

result = RunHostAndProgramRes(

b''.join(stdout).decode(),

b''.join(stderr).decode(),

b''.join(genvm_log).decode(),

)

if len(errors) > 0:

raise Exception(

*errors,

{

'stdout': result.stdout,

'stderr': result.stderr,

'genvm_log': result.genvm_log,

},

) from errors[0]

return result

fut_proc.result()

except Exception as e:

exceptions.append(e)

if len(exceptions) > 0:

raise Exception(*exceptions) from exceptions[0]

genvm_id = genvm_id_cell[0]

if genvm_id is not None:

await _send_timeout(manager_uri, genvm_id, logger)

status = await poll_status(genvm_id)

if isinstance(status, Exception):

exceptions.append(status)

if len(exceptions) > 0:

final_exception = Exception('execution failed', exceptions[1:])

raise final_exception from exceptions[0]

return RunHostAndProgramRes(

stdout=status['stdout'],

stderr=status['stderr'],

genvm_log='# currently absent',

)

raise Exception('Execution failed')

fut_host = asyncio.ensure_future(wrap_host())

fut_proc = asyncio.ensure_future(wrap_proc())

await asyncio.wait([fut_host, fut_proc, asyncio.ensure_future(prob_died())])

exceptions: list[Exception] = []

try:

fut_host.result()

except Exception as e:

exceptions.append(e)

try:

fut_proc.result()

except Exception as e:

exceptions.append(e)

genvm_id = genvm_id_cell[0]

if genvm_id is None:

exceptions.append(Exception('GenVM ID not set - execution did not start'))

else:

await _send_timeout(manager_uri, genvm_id, logger)

status = await poll_status(genvm_id)

if isinstance(status, Exception):

exceptions.append(status)

elif len(exceptions) == 0:

return RunHostAndProgramRes(

stdout=status['stdout'],

stderr=status['stderr'],

genvm_log='# currently absent',

)

# Raise all collected exceptions

if len(exceptions) > 0:

final_exception = Exception('GenVM execution failed', *exceptions[1:])

raise final_exception from exceptions[0]

🧰 Tools

🪛 Ruff (0.13.3)

474-474: Do not catch blind exception: Exception

(BLE001)

478-478: Do not catch blind exception: Exception

(BLE001)

482-482: Create your own exception

(TRY002)

500-500: Create your own exception

(TRY002)

500-500: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In tests/runner/origin/base_host.py around lines 467-500, remove the early raise after collecting fut_host/fut_proc exceptions and instead collect exceptions into a single list, then proceed to handle genvm_id and poll_status once; if poll_status returns an Exception append it to the same list; after all work is done, if the exceptions list is non-empty raise a single aggregated Exception('execution failed', exceptions) chained from exceptions[0]; if no exceptions and genvm_id is present return the RunHostAndProgramRes as before; if genvm_id is None and there are no prior exceptions raise a clear Exception('execution failed: missing genvm_id').

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (8)

tests/runner/origin/base_host.py (3)
371-371: Fix mutable default argument.

Using a mutable list [] as a default argument can cause subtle bugs if the list is mutated across multiple function calls. Python evaluates default arguments once at function definition time, not at call time.

Apply this diff:
-	extra_args: list[str] = [],
+	extra_args: list[str] | None = None,
 ) -> RunHostAndProgramRes:
+	if extra_args is None:
+		extra_args = []
388-415: Add timeout to manager request and store ensure_future result.

Two issues:

The aiohttp.request call at Line 388 lacks a timeout, which could cause test hangs if the manager is unresponsive.

Line 415: The result of asyncio.ensure_future(wrap_timeout(genvm_id)) is not stored, which can lead to task reference leaks and makes it impossible to cancel or check the timeout task status.

Apply this diff:
 			async with aiohttp.request(
 				'POST',
+				timeout=aiohttp.ClientTimeout(total=60),
 				f'{manager_uri}/genvm/run',
 				json={
 					'major': 0,  # FIXME
 					'message': message,
 					'is_sync': is_sync,
 					'capture_output': capture_output,
 					'host_data': host_data,
 					'max_execution_minutes': max_exec_mins,
 					'timestamp': timestamp,
 					'host': host,
 					'extra_args': extra_args,
 				},
 			) as resp:
 				logger.debug('post /genvm/run', status=resp.status)
 				data = await resp.json()
 				logger.trace('post /genvm/run', body=data)
 				if resp.status != 200:
 					logger.error(
 						f'genvm manager /genvm/run failed', status=resp.status, body=data
 					)
 					raise Exception(f'genvm manager /genvm/run failed: {resp.status} {data}')
 				else:
 					genvm_id = data['id']
 					logger.debug('genvm manager /genvm', genvm_id=genvm_id, status=resp.status)
 					genvm_id_cell[0] = genvm_id
-					asyncio.ensure_future(wrap_timeout(genvm_id))
+					timeout_task = asyncio.ensure_future(wrap_timeout(genvm_id))
473-504: Fix critical cleanup bypass on early exception.

If fut_host or fut_proc fail, Line 484 raises immediately, which skips the cleanup code at Lines 488-502. This means _send_timeout is never called, potentially leaving zombie GenVMs running on the manager. The cleanup must execute regardless of task failures.

Refactor to ensure cleanup always runs:
 	exceptions: list[Exception] = []
 	try:
 		fut_host.result()
 	except Exception as e:
 		exceptions.append(e)
 	try:
 		fut_proc.result()
 	except Exception as e:
 		exceptions.append(e)
-
-	if len(exceptions) > 0:
-		raise Exception(*exceptions) from exceptions[0]
-
+	
 	genvm_id = genvm_id_cell[0]
-	if genvm_id is not None:
+	if genvm_id is None:
+		exceptions.append(Exception('GenVM ID not set - execution did not start'))
+	else:
 		await _send_timeout(manager_uri, genvm_id, logger)
-
 		status = await poll_status(genvm_id)
-		if status is None:
+		if status is None:
 			exceptions.append(Exception('execution failed: no status'))
-		elif isinstance(status, Exception):
+		elif isinstance(status, Exception):
 			exceptions.append(status)
-		if len(exceptions) > 0:
-			final_exception = Exception('execution failed', exceptions[1:])
-			raise final_exception from exceptions[0]
-		return RunHostAndProgramRes(
+		elif len(exceptions) == 0:
+			return RunHostAndProgramRes(
 			stdout=status['stdout'],
 			stderr=status['stderr'],
 			genvm_log='# currently absent',
 		)
-
-	raise Exception('Execution failed')
+	
+	# Raise all collected exceptions
+	if len(exceptions) > 0:
+		final_exception = Exception('GenVM execution failed', *exceptions[1:])
+		raise final_exception from exceptions[0]
modules/implementation/src/manager/mod.rs (1)
96-119: Return type syntax is still invalid; will not compile.

The return type uses Extract: warp::Reply which is not valid Rust syntax. It should specify the exact tuple type as Extract = (R,).

Apply this diff:
 fn unwrap_all_anyhow<R: warp::Reply + 'static>(
     route: impl warp::Filter<Extract = (anyhow::Result<R>,), Error = warp::Rejection>
         + Clone
         + Send
         + Sync
         + 'static,
-) -> impl warp::Filter<Error = warp::Rejection, Extract: warp::Reply> + Clone + Send + Sync + 'static
+) -> impl warp::Filter<Extract = (R,), Error = warp::Rejection> + Clone + Send + Sync + 'static
 {
     route
         .boxed()
         .and_then(|x: anyhow::Result<R>| async move {
             x.map_err(|e| warp::reject::custom(AnyhowRejection(e)))
         })
modules/implementation/src/manager/run.rs (1)
482-482: Wrong flag for F_SETFD; use FD_CLOEXEC instead of O_CLOEXEC.

Line 482 pairs F_SETFD with O_CLOEXEC, which is incorrect. F_SETFD requires FD_CLOEXEC.

Apply this diff:
-    let _ = unsafe { libc::fcntl(read_fd, libc::F_SETFD, libc::O_CLOEXEC) };
+    let _ = unsafe { libc::fcntl(read_fd, libc::F_SETFD, libc::FD_CLOEXEC) };
modules/implementation/src/manager/handlers.rs (2)
158-177: Critical: arbitrary environment modification without authentication.

Lines 162-172 allow any HTTP client to set arbitrary OS environment variables (PATH, credentials, etc.), enabling code execution and credential theft. Add authentication, whitelist allowed keys, validate values, and audit all changes—or remove this endpoint.

211-231: HTTP 200 returned for both success and failure.

Lines 221-230 wrap both outcomes in Ok(...), yielding HTTP 200 regardless of whether shutdown succeeded. Propagate the error with result? so failures return an appropriate HTTP status.

Apply this diff:
 pub async fn handle_genvm_shutdown(
     ctx: sync::DArc<AppContext>,
     genvm_id: run::GenVMId,
     req: ShutdownRequest,
 ) -> Result<impl warp::Reply> {
-    let result = ctx
+    ctx
         .run_ctx
         .graceful_shutdown(genvm_id, req.wait_timeout_ms)
-        .await;
-
-    match result {
-        Ok(()) => Ok(warp::reply::json(&serde_json::json!({
-            "result": "shutdown_completed",
-            "genvm_id": genvm_id
-        }))),
-        Err(e) => Ok(warp::reply::json(&serde_json::json!({
-            "error": format!("{}", e),
-            "genvm_id": genvm_id
-        }))),
-    }
+        .await?;
+    Ok(warp::reply::json(&serde_json::json!({
+        "result": "shutdown_completed",
+        "genvm_id": genvm_id
+    })))
 }
executor/common/src/sync.rs (1)

192-226: Unsound API: into_get_sub and into_get_sub_async still expose lifetime bugs.

Both functions remain unsound because the lifetime 'a allows R to borrow from the temporary reference without proper constraints. The developer comments (lines 207, 249) acknowledge this but claim it's "nearly impossible to type."

Mitigation options:

Mark both methods as unsafe and document caller requirements (R must not borrow from temporaries or must be 'static/derived from T).

Provide only the sound alternatives (into_gep/into_gep_async with for<'a>) and deprecate these methods.

Restrict R with a bound like R: 'static to prevent borrowed returns.

The current CacheMap usage (lines 442-447) happens to be sound in practice because OnceCell::get_or_try_init returns a reference into the cell, but the API doesn't enforce this at the type level.

🧹 Nitpick comments (4)

tests/runner/origin/base_host.py (2)
130-130: Consider a custom exception for manager errors.

Using a generic Exception for API failures makes it harder for callers to distinguish manager errors from other failures. A custom exception class would improve error handling clarity.

453-467: Clarify status type handling in prob_died.

At Line 465, the check status is not None is ambiguous because poll_status returns dict | Exception | None. Using explicit type checks would improve clarity.

Apply this diff:
 		status = await poll_status(genvm_id)
-		if status is not None and not cancellation_event.is_set():
+		if isinstance(status, (dict, Exception)) and not cancellation_event.is_set():
 			logger.error('genvm died without connecting', genvm_id=genvm_id, status=status)
 			cancellation_event.set()
modules/implementation/src/manager/mod.rs (1)
41-60: Compile regex once, not on every call.

The regex at Line 42 is compiled each time parse_version is invoked. Use lazy_static or once_cell::sync::Lazy to compile it once.

Apply this diff:
+use once_cell::sync::Lazy;
+
+static VERSION_RE: Lazy<regex::Regex> = Lazy::new(|| {
+    regex::Regex::new(r"^v(\d+|\*)\.(\d+|\*)\.(\d+|\*)$").unwrap()
+});
+
 pub fn parse_version(version: &str) -> Result<VersionParts> {
-    let version_re = regex::Regex::new(r"^v(\d+|\*)\.(\d+|\*)\.(\d+|\*)$").unwrap();
-    let captures = version_re
+    let captures = VERSION_RE
         .captures(version)
         .ok_or_else(|| anyhow::anyhow!("Invalid version format: {}", version))?;
modules/implementation/src/manager/versioning.rs (1)
184-184: Log parse failures to aid debugging.

Line 184 silently falls back to possible_major on parse errors. Log the failure so issues with the parse-version-pattern command are visible.

Apply this diff:
-        let res = res.parse::<u32>().unwrap_or(possible_major);
+        let res = res.parse::<u32>().unwrap_or_else(|e| {
+            log_warn!(error:err = e, raw = res; "failed to parse version from genvm, using fallback");
+            possible_major
+        });

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4490c1a and 78788da.

📒 Files selected for processing (7)

executor/common/src/sync.rs (6 hunks)
executor/install/lib/python/post-install/__main__.py (4 hunks)
modules/implementation/src/manager/handlers.rs (1 hunks)
modules/implementation/src/manager/mod.rs (1 hunks)
modules/implementation/src/manager/run.rs (1 hunks)
modules/implementation/src/manager/versioning.rs (1 hunks)
tests/runner/origin/base_host.py (13 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

executor/install/lib/python/post-install/main.py

🧰 Additional context used

🧬 Code graph analysis (6)

executor/common/src/sync.rs (1)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

tests/runner/origin/base_host.py (4)

tests/runner/origin/logger.py (6)

Logger (6-26)

NoLogger (47-52)

trace (10-11)

debug (13-14)

warning (19-20)

error (22-23)

tests/runner/origin/host_fns.py (2)

Errors (25-30)

Methods (7-22)

tests/runner/origin/public_abi.py (2)

StorageType (14-17)

ResultCode (7-11)

tests/runner/mock_host.py (8)

loop_enter (115-134)

get_calldata (136-137)

consume_result (157-160)

get_leader_nondet_result (165-177)

post_nondet_result (179-180)

consume_gas (206-207)

get_balance (209-210)

remaining_fuel_as_gen (145-146)

modules/implementation/src/manager/mod.rs (7)

executor/common/src/sync.rs (4)

new (157-185)

new (378-383)

new (423-425)

new (460-462)

modules/implementation/src/manager/run.rs (2)

new (199-239)

start_service (263-282)

modules/implementation/src/manager/versioning.rs (1)

new (68-74)

modules/implementation/src/manager/modules.rs (1)

new (30-36)

modules/implementation/src/manager/handlers.rs (16)

serde_json (28-28)

serde_json (45-45)

handle_status (12-17)

handle_module_start (41-52)

handle_module_stop (24-39)

handle_genvm_run (54-71)

handle_genvm_run_readonly (73-115)

handle_contract_detect_version (117-128)

handle_set_log_level (130-148)

handle_manifest_reload (150-156)

handle_set_env (158-177)

handle_get_permits (179-182)

handle_set_permits (184-199)

handle_genvm_shutdown (211-231)

handle_genvm_status (233-243)

handle_make_deployment_storage_writes (251-295)

modules/implementation/src/web/mod.rs (1)

entrypoint (91-107)

executor/common/src/lib.rs (1)

load_config (93-107)

modules/implementation/src/manager/versioning.rs (2)

modules/implementation/src/manager/run.rs (2)

tokio (534-534)

new (199-239)

executor/common/src/sync.rs (5)

Self (162-162)

new (157-185)

new (378-383)

new (423-425)

new (460-462)

modules/implementation/src/manager/handlers.rs (4)

executor/common/src/logger/mod.rs (2)

std (82-82)

from_str (36-46)

modules/implementation/src/manager/modules.rs (3)

get_status (133-149)

get_module_locks (151-167)

new (30-36)

modules/implementation/src/manager/versioning.rs (3)

res (184-184)

new (68-74)

detect_major_spec (125-194)

modules/implementation/src/manager/run.rs (2)

start_genvm (426-559)

new (199-239)

modules/implementation/src/manager/run.rs (2)

executor/common/src/logger/mod.rs (3)

std (82-82)

fmt (22-30)

from_str (36-46)

executor/common/src/sync.rs (9)

into_arc (33-42)

new (157-185)

new (378-383)

new (423-425)

new (460-462)

Self (162-162)

drop (117-137)

drop (321-339)

ptr (175-175)

🪛 Ruff (0.13.3)

tests/runner/origin/base_host.py

130-130: Create your own exception

(TRY002)

130-130: Avoid specifying long messages outside the exception class

(TRY003)

371-371: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

408-408: f-string without any placeholders

Remove extraneous f prefix

(F541)

410-410: Create your own exception

(TRY002)

410-410: Avoid specifying long messages outside the exception class

(TRY003)

415-415: Store a reference to the return value of asyncio.ensure_future

(RUF006)

476-476: Do not catch blind exception: Exception

(BLE001)

480-480: Do not catch blind exception: Exception

(BLE001)

484-484: Create your own exception

(TRY002)

504-504: Create your own exception

(TRY002)

504-504: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (7)

modules/implementation/src/manager/handlers.rs (1)

78-80: Early return guards unfinished implementation.

The explicit error at Line 78 prevents using the incomplete readonly handler until it's ready. This addresses the past concern about placeholders.

executor/common/src/sync.rs (6)

68-86: LGTM! Refcount leak fixed.

The lift_result implementation now correctly handles the error path by explicitly dropping a temporary DArcStruct to decrement the refcount (lines 78-81). This addresses the refcount leak that was previously flagged.

116-138: LGTM! Drop ordering corrected.

The Drop implementation now correctly drops actual_data first (line 118) before decrementing the refcount and potentially freeing the control block. This prevents use-after-free for guard-like payloads (e.g., MutexGuard) whose destructors must run while the backing resource is still alive.

27-30: LGTM! DArcStruct structure and traits.

The DArcStruct<T> definition with ManuallyDrop<T> provides the necessary control over drop order. The Deref implementation and Send/Sync bounds are appropriate.

Also applies to: 108-114, 358-359

32-43: LGTM! Sound conversion from DArcStruct<&T> to DArc<T>.

The into_arc implementation correctly transfers ownership of the control block while creating a DArc pointing to the dereferenced T. This is analogous to gep and is sound because the control block owns the root object containing T.

45-66: LGTM! Panic-safe async transformation.

The await_inner implementation correctly maintains the refcount during the await by creating a temporary DArcStruct with (). If a panic occurs during await, temp_arc will drop and decrement the refcount appropriately. After successful await, wrapping temp_arc in ManuallyDrop transfers its ownership to the returned DArcStruct.

88-106: LGTM! Correct Option lifting.

The lift_option implementation mirrors lift_result and correctly drops the control block in the None case (lines 98-101).

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78788da and a4f9047.

📒 Files selected for processing (1)

.github/workflows/queue.yaml (1 hunks)

coderabbitai · 2025-10-08T14:45:56Z

.github/workflows/queue.yaml

+      - uses:  insightsengineering/disk-space-reclaimer@v1
+        with:
+          tools-cache: false
+          swap-storage: false
+          docker-images: false
+      - uses: wimpysworld/nothing-but-nix@v6
+        with:
+          hatchet-protocol: holster


⚠️ Potential issue | 🟠 Major

Pin third-party actions to immutable SHAs.

Both insightsengineering/disk-space-reclaimer@v1 and wimpysworld/nothing-but-nix@v6 are external actions referenced by moving tags. Please switch them to specific commit SHAs (and optionally document the source repo) so we’re protected against tag hijacking.

🤖 Prompt for AI Agents

.github/workflows/queue.yaml around lines 38 to 45: the workflow uses third‑party actions with mutable tags (insightsengineering/disk-space-reclaimer@v1 and wimpysworld/nothing-but-nix@v6); replace each tag with the corresponding immutable commit SHA from the action’s repository (e.g., owner/repo@sha256-or-full-commit) and update any comments to document the source repo and chosen SHA so future reviewers can verify the provenance; ensure you pin both actions to their specific commit SHAs and run a quick workflow test to confirm compatibility.

* wip: multi version selection * feat: introduce genvm manager * fix: patch lua % exploit * chore: refactor test framework * fix: pass CI * fix: pre-CI improvements * chore: apply review * chore: pass CI

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between baa8a10 and 09ffae6.

📒 Files selected for processing (5)

.github/workflows/queue.yaml (1 hunks)
flake.nix (3 hunks)
modules/implementation/src/llm/providers.rs (3 hunks)
modules/install/config/genvm-module-llm.yaml (1 hunks)
tests/rust.sh (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

modules/implementation/src/llm/providers.rs
.github/workflows/queue.yaml
modules/install/config/genvm-module-llm.yaml
tests/rust.sh

coderabbitai · 2025-10-08T18:24:34Z

flake.nix

 									llvmPackages.libllvm

 									python312Packages.jsonnet
+									pkgs.python312Packages.aiohttp


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify version consistency with requirements.txt.

The addition of aiohttp is appropriate for the Manager HTTP service. However, requirements.txt specifies aiohttp==3.12.15 while the Nix environment will use whatever version is in the pinned nixpkgs (line 14). This could lead to version mismatches between different development environments.

Run the following script to check the aiohttp version in the pinned nixpkgs:

🏁 Script executed:

#!/bin/bash # Description: Check aiohttp version in the pinned nixpkgs commit # Extract the nixpkgs commit hash from flake.nix NIXPKGS_COMMIT="2b4230bf03deb33103947e2528cac2ed516c5c89" # Fetch the version from nixpkgs GitHub curl -s "https://raw.githubusercontent.com/NixOS/nixpkgs/${NIXPKGS_COMMIT}/pkgs/development/python-modules/aiohttp/default.nix" | grep -E 'version\s*=' | head -1

Length of output: 22

Align aiohttp versions in flake.nix and requirements.txt

Nix-provided aiohttp is 3.10.10, but tests/runner/requirements.txt requires 3.12.15. This discrepancy can cause compatibility issues across environments.

Override aiohttp in flake.nix to 3.12.15 (e.g., via overridePythonModule).

Or adjust the pip requirements to match the pinned nixpkgs version.

🤖 Prompt for AI Agents

In flake.nix around line 140, the Nix-provided aiohttp (3.10.10) conflicts with tests/runner/requirements.txt which pins 3.12.15; fix by overriding the Nix python package to match 3.12.15: replace the direct reference to pkgs.python312Packages.aiohttp with an overridden module using pkgs.python312Packages.aiohttp.overridePythonModule or pkgs.python312Packages.buildPythonPackage with the source set to aiohttp 3.12.15 (or import the exact wheel/source for 3.12.15), ensuring the overridden package is used wherever pkgs.python312Packages.aiohttp is referenced; alternatively, if you prefer changing requirements, update tests/runner/requirements.txt to the nix-pinned version instead so both match.

kp2pml30 added 5 commits October 6, 2025 23:54

wip: multi version selection

c0fa035

feat: introduce genvm manager

7d41383

fix: patch lua % exploit

f2ad561

chore: refactor test framework

3285334

fix: pass CI

decff4a

kp2pml30 self-assigned this Oct 6, 2025

coderabbitai bot reviewed Oct 6, 2025

View reviewed changes

kp2pml30 added the test:skip:pr label Oct 7, 2025

fix: pre-CI improvements

4490c1a

coderabbitai bot reviewed Oct 7, 2025

View reviewed changes

chore: apply review

78788da

kp2pml30 added this pull request to the merge queue Oct 7, 2025

coderabbitai bot reviewed Oct 7, 2025

View reviewed changes

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 7, 2025

kp2pml30 force-pushed the batch-update branch from bcc97c6 to a4f9047 Compare October 8, 2025 14:41

kp2pml30 enabled auto-merge October 8, 2025 14:41

kp2pml30 added this pull request to the merge queue Oct 8, 2025

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 8, 2025

kp2pml30 force-pushed the batch-update branch from a4f9047 to 565cc87 Compare October 8, 2025 14:54

kp2pml30 added this pull request to the merge queue Oct 8, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 8, 2025

kp2pml30 force-pushed the batch-update branch from 565cc87 to e0564e9 Compare October 8, 2025 15:18

kp2pml30 enabled auto-merge October 8, 2025 15:18

kp2pml30 added this pull request to the merge queue Oct 8, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 8, 2025

kp2pml30 force-pushed the batch-update branch from e0564e9 to baa8a10 Compare October 8, 2025 16:19

kp2pml30 added this pull request to the merge queue Oct 8, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 8, 2025

kp2pml30 force-pushed the batch-update branch from baa8a10 to 09ffae6 Compare October 8, 2025 18:19

kp2pml30 added this pull request to the merge queue Oct 8, 2025

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 8, 2025

chore: pass CI

85770e2

kp2pml30 force-pushed the batch-update branch from 09ffae6 to 85770e2 Compare October 8, 2025 19:06

kp2pml30 merged commit 4828f8c into main Oct 8, 2025
4 of 5 checks passed

kp2pml30 deleted the batch-update branch October 8, 2025 19:09

coderabbitai bot mentioned this pull request Oct 9, 2025

feat: manager polishing #244

Merged

coderabbitai bot mentioned this pull request Oct 20, 2025

feat(manager): polishing #245

Closed

coderabbitai bot mentioned this pull request Nov 10, 2025

Feat/improve manager #249

Merged

coderabbitai bot mentioned this pull request Dec 3, 2025

Batch update #253

Merged

coderabbitai bot mentioned this pull request Feb 6, 2026

feat: modules no-bind fd implementation #261

Merged

coderabbitai bot mentioned this pull request Mar 27, 2026

feat: restructure GenVM SDK docs #267

Open

6 tasks

		executor_root_dir = Path(__file__).parent.parent.parent.parent
		logger.info(f'Executor root directory: {executor_root_dir}')

-	extra_args: list[str] = [],
+	extra_args: list[str] | None = None,
+) -> RunHostAndProgramRes:
+	if extra_args is None:
+		extra_args = []

Conversation

kp2pml30 commented Oct 6, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Oct 6, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 6, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 6, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Oct 6, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

kp2pml30 commented Oct 6, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 6, 2025 •

edited

Loading