Skip to content

Batch update#230

Merged
kp2pml30 merged 5 commits intomainfrom
batch-update
Jul 10, 2025
Merged

Batch update#230
kp2pml30 merged 5 commits intomainfrom
batch-update

Conversation

@kp2pml30
Copy link
Member

@kp2pml30 kp2pml30 commented Jul 9, 2025

Summary by CodeRabbit

  • New Features

    • Added comprehensive documentation on GenVM architecture, components, host interface, communication protocol, and glossary terms.
    • Introduced a new section for the GenVM Implementation Specification in the documentation.
  • Bug Fixes

    • Improved error message clarity in test outputs and documentation.
    • Fixed typographical and cross-reference issues in documentation.
  • Refactor

    • Standardized and simplified socket communication and result encoding in host communication for better maintainability and observability.
    • Updated and streamlined method signatures and result handling in Python host test interfaces.
  • Chores

    • Updated Rust toolchain version and enhanced diagnostic output in CI workflows.
    • Updated subproject commit reference for runner support registry.
  • Documentation

    • Added and updated multiple documentation files for clarity, consistency, and expanded coverage of internal protocols and architecture.

@kp2pml30 kp2pml30 self-assigned this Jul 9, 2025
@kp2pml30 kp2pml30 enabled auto-merge July 9, 2025 13:56
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 9, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This update introduces extensive documentation for GenVM's architecture, host interface, and glossary, alongside significant refactoring of the host communication protocol in both Rust and Python test harnesses. The Rust host interface now uses a unified length-prefixed byte slice protocol, simplifying result encoding and decoding. Minor bug fixes, logging improvements, and public API extensions are also included.

Changes

Files / Group Change Summary
.github/workflows/full.yaml Updated Rust toolchain version in CI and added cargo version output.
doc/internal/node.md Added new documentation detailing GenVM operation, protocol, and storage.
doc/website/src/impl-spec/... (multiple index, glossary, and arch) Added and updated documentation: core architecture overview, host interface, modules, and glossary. Adjusted terminology and references for clarity and consistency.
doc/website/src/spec/01-core-architecture/01-components.rst Replaced detailed architecture description with a concise summary of supervisor, sub-VMs, and runners.
doc/website/src/spec/01-core-architecture/02-vm.rst Minor textual and reference updates; clarified links and terminology.
doc/website/src/spec/01-core-architecture/index.rst Removed two entries from the toctree.
doc/website/src/spec/03-contract-interface/01-calldata.rst Standardized quotation marks in documentation.
doc/website/src/spec/03-contract-interface/03-storage.rst Added note clarifying the default derivation algorithm is not mandatory.
doc/website/src/spec/03-contract-interface/05-execution-flow.rst Updated reference for upgradability section.
doc/website/src/spec/glossary.rst Removed definitions for "Greyboxing," "Host," and "Module."
doc/website/src/spec/index.rst Clarified that GenLayer VM is "abstract."
executor/common/src/calldata/bin.rs
executor/common/src/calldata/mod.rs
Made encode_to function public and re-exported it.
executor/modules/implementation/scripting/llm-default.lua Refined error logging for user errors.
executor/src/host/mod.rs Refactored host communication: unified length-prefixed writes, improved result encoding/decoding, added tracing logs, and simplified error handling.
executor/testdata/cases/py-core/leader_errors/leader_no_nondet.stdout Changed error string from absent_leader_result to absent.
executor/testdata/runner/base_host.py
executor/testdata/runner/mock_host.py
executor/testdata/runner/host_fns.py
Simplified Python host interface: removed explicit result code usage, unified result handling as byte slices, updated method signatures, and renamed enum member.
runners/support/registry Updated subproject commit reference.

Sequence Diagram(s)

sequenceDiagram
    participant Node
    participant GenVM
    participant Host

    Node->>GenVM: Start process (--host, --message)
    GenVM->>Host: Connect via socket, send/receive messages
    loop Protocol
        Host->>GenVM: Send method ID + data
        GenVM->>Host: Process request, respond with result slice
        alt Nondeterministic call
            GenVM->>Host: post_nondet_result (call_no, encoded result)
            Host->>GenVM: get_leader_nondet_result (call_no)
            Host-->>GenVM: Return result slice (ResultCode + data)
        end
    end
    Node->>GenVM: Terminate (SIGTERM/SIGKILL)
    GenVM-->>Node: Exit (send result or crash)
Loading

Possibly related PRs

  • docs: add spec #229: Introduces foundational specification documents covering GenVM architecture, host interface, modules, runners, calldata format, ABI, storage, upgradability, and execution flow, which relate closely to the documentation updates in this PR.

Suggested labels

test:skip:pr

Poem

In the warren of code, new docs now bloom,
Host and GenVM in byte-slice tune.
Rust and Python, hand in paw,
Speak length-prefixed, without a flaw.
Logs now trace each rabbit hop,
As specs and glossaries fill the shop.
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Clippy (1.86.0)
Updating crates.io index

error: failed to get wasmparser as a dependency of package genvm v0.1.0 (/executor)

Caused by:
failed to load source for dependency wasmparser

Caused by:
Unable to update /executor/third-party/wasm-tools/crates/wasmparser

Caused by:
failed to read /executor/third-party/wasm-tools/crates/wasmparser/Cargo.toml

Caused by:
No such file or directory (os error 2)


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9de112 and caee1e7.

📒 Files selected for processing (2)
  • doc/internal/node.md (1 hunks)
  • executor/modules/implementation/src/scripting/ctx/signing.rs (1 hunks)
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@kp2pml30 kp2pml30 added this pull request to the merge queue Jul 9, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
doc/website/src/impl-spec/01-core-architecture/03-modules.rst (1)

7-8: Repeated escaped plural

Same remark applies here (and all other occurrences) to avoid visual artefacts in the generated HTML/PDF.

🧹 Nitpick comments (13)
.github/workflows/full.yaml (1)

104-108: cargo --version is usually printed by setup-rust-toolchain
The extra call is harmless but adds noise. Drop it unless you need the explicit line in logs.

doc/website/src/spec/03-contract-interface/03-storage.rst (1)

50-53: Good clarification, minor grammar tweak
but using it is not mandatory” → consider “its use is optional” for smoother phrasing.

doc/website/src/spec/index.rst (1)

4-5: Grammar polish
“abstract GenLayer's virtual machine” reads awkwardly. Prefer:

This specification describes GenLayer’s *abstract* virtual machine, which runs :term:`Intelligent Contract`\s.
doc/website/src/impl-spec/01-core-architecture/03-modules.rst (2)

1-2: Inline pluralisation may render oddly

Using :term:`Module`\s relies on Sphinx treating the back-slash as a literal escape. In some themes this shows up as Module\s. Consider the more common pattern:

:term:`Module` s

or simply keep the term singular in running text.


19-23: Consistency with glossary definition

The bullet list re-uses the pluralised escape. Once the first change is addressed, grep the file to update the remaining spots for consistency.

doc/website/src/impl-spec/glossary.rst (1)

6-8: Wording could be tightened

“... to prevent forcing LLMs to do unintended by contract creator actions.” →
“... to prevent LLMs from performing actions unintended by the contract creator.”

executor/testdata/runner/mock_host.py (1)

150-159: Static analysis warnings about undefined names from star imports.

The static analysis tool flags several undefined names (HostException, Errors, ResultCode) that are imported via star imports. While this works at runtime due to the star import from base_host, it makes the code harder to analyze statically.

Consider using explicit imports for better code clarity and static analysis:

-from base_host import *
+from base_host import (
+    IHost, HostException, Errors, ResultCode, StorageType,
+    DefaultTransactionData, DefaultEthTransactionData,
+    host_loop, asyncio, collections
+)
doc/website/src/impl-spec/01-core-architecture/01-components.rst (2)

79-82: Minor formatting issue in the deterministic mode description.

The text appears to have formatting issues where list items are concatenated without proper line breaks.

-**Deterministic Mode:** - Executes blockchain consensus logic - Provides
-reproducible results across all validators - Handles storage operations,
-message passing, and standard computation
+**Deterministic Mode:**
+- Executes blockchain consensus logic
+- Provides reproducible results across all validators  
+- Handles storage operations, message passing, and standard computation

91-96: Consider consistent formatting for WASI interface descriptions.

The WASI Preview 1 description uses a different formatting style than other sections.

-**WASI Preview 1 (``wasip1``)** - Standard WASI interface with
-deterministic modifications - File system operations, environment
-access, time functions - Modified to ensure reproducible behavior across
-validators
+**WASI Preview 1 (``wasip1``)**
+- Standard WASI interface with deterministic modifications
+- File system operations, environment access, time functions
+- Modified to ensure reproducible behavior across validators
doc/internal/node.md (3)

21-141: Add language identifier to the fenced code block.

The pseudo-code block should have a language identifier for proper syntax highlighting.

-```
+```pseudo
 const ACCOUNT_ADDR_SIZE = 20

3-4: Fix grammar issues in the process arguments section.

-To run a genvm, one must start a genvm process with following arguments:
-- `--host` tcp-it address or `unix://` prefixed unix domain socket
+To run GenVM, one must start a GenVM process with the following arguments:
+- `--host` TCP/IP address or `unix://` prefixed Unix domain socket

15-15: Fix typo in error handling description.

-Send it `SIGTERM`. If it doesn't quit in some sensible amount of time just `SIGKILL` it
+Send it `SIGTERM`. If it doesn't quit in a sensible amount of time, just `SIGKILL` it
executor/testdata/runner/base_host.py (1)

195-291: Consider addressing static analysis warnings about undefined names.

The static analysis tool flags ResultCode, Errors, and Methods as potentially undefined due to the dynamic imports via exec().

Consider using a more explicit import mechanism that static analysis tools can understand:

# Instead of:
if typing.TYPE_CHECKING:
    from .host_fns import *
    from .result_codes import *
else:
    exec(Path(__file__).parent.joinpath('host_fns.py').read_text())
    exec(Path(__file__).parent.joinpath('result_codes.py').read_text())

# Use:
from .host_fns import Methods, Errors, StorageType
from .result_codes import ResultCode

This would make the code more maintainable and eliminate the static analysis warnings.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b12b17d and c9de112.

📒 Files selected for processing (26)
  • .github/workflows/full.yaml (2 hunks)
  • doc/internal/node.md (1 hunks)
  • doc/website/src/impl-spec/01-core-architecture/01-components.rst (1 hunks)
  • doc/website/src/impl-spec/01-core-architecture/02-host-interface.rst (2 hunks)
  • doc/website/src/impl-spec/01-core-architecture/03-modules.rst (2 hunks)
  • doc/website/src/impl-spec/01-core-architecture/index.rst (1 hunks)
  • doc/website/src/impl-spec/glossary.rst (1 hunks)
  • doc/website/src/impl-spec/index.rst (1 hunks)
  • doc/website/src/index.rst (1 hunks)
  • doc/website/src/spec/01-core-architecture/01-components.rst (1 hunks)
  • doc/website/src/spec/01-core-architecture/02-vm.rst (3 hunks)
  • doc/website/src/spec/01-core-architecture/index.rst (0 hunks)
  • doc/website/src/spec/03-contract-interface/01-calldata.rst (1 hunks)
  • doc/website/src/spec/03-contract-interface/03-storage.rst (1 hunks)
  • doc/website/src/spec/03-contract-interface/05-execution-flow.rst (1 hunks)
  • doc/website/src/spec/glossary.rst (0 hunks)
  • doc/website/src/spec/index.rst (1 hunks)
  • executor/common/src/calldata/bin.rs (1 hunks)
  • executor/common/src/calldata/mod.rs (1 hunks)
  • executor/modules/implementation/scripting/llm-default.lua (1 hunks)
  • executor/src/host/mod.rs (12 hunks)
  • executor/testdata/cases/py-core/leader_errors/leader_no_nondet.stdout (1 hunks)
  • executor/testdata/runner/base_host.py (5 hunks)
  • executor/testdata/runner/host_fns.py (1 hunks)
  • executor/testdata/runner/mock_host.py (1 hunks)
  • runners/support/registry (1 hunks)
💤 Files with no reviewable changes (2)
  • doc/website/src/spec/01-core-architecture/index.rst
  • doc/website/src/spec/glossary.rst
🧰 Additional context used
🧬 Code Graph Analysis (2)
executor/common/src/calldata/mod.rs (1)
executor/common/src/calldata/bin.rs (2)
  • encode (280-286)
  • encode_to (209-278)
executor/testdata/runner/mock_host.py (4)
executor/testdata/runner/base_host.py (1)
  • HostException (26-31)
executor/testdata/runner/host_fns.py (1)
  • Errors (24-29)
executor/testdata/runner/result_codes.py (1)
  • ResultCode (7-11)
runners/genlayer-py-std/src/genlayer/py/calldata.py (1)
  • encode (108-201)
🪛 LanguageTool
doc/internal/node.md

[grammar] ~1-~1: Use correct spacing
Context: ## Running the GenVM To run a genvm, one must start a genvm ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~3-~3: Use proper capitalization
Context: ## Running the GenVM To run a genvm, one must start a genvm process with fo...

(QB_NEW_EN_OTHER_ERROR_IDS_6)


[grammar] ~3-~3: Use proper capitalization
Context: ...GenVM To run a genvm, one must start a genvm process with following arguments: - `--...

(QB_NEW_EN_OTHER_ERROR_IDS_6)


[grammar] ~3-~3: There might be a problem here.
Context: ...vm, one must start a genvm process with following arguments: - --host tcp-it address or unix:// prefixed unix doma...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~4-~4:
Context: ...dress or unix:// prefixed unix domain socket - --message message data as json, follows [schema](../schem...

(QB_NEW_EN_OTHER_ERROR_IDS_)


[grammar] ~5-~5: There might be a problem here.
Context: ...ket - --message message data as json, follows schema See [example](../../executor/testdata/templ...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~6-~6: There might be a mistake here.
Context: ...templates/message.json) that is used in tests ## How to ask GenVM to quit? Send it `SIGT...

(QB_NEW_EN_OTHER)


[grammar] ~9-~9: There might be a mistake here.
Context: ...doesn't quit in some sensible amount of time just SIGKILL it ## How node receives...

(QB_NEW_EN_OTHER)


[grammar] ~9-~9: There might be a mistake here.
Context: ... sensible amount of time just SIGKILL it ## How node receives code, message, ... fr...

(QB_NEW_EN_OTHER)


[grammar] ~11-~11: Use articles correctly
Context: ...mount of time just SIGKILL it ## How node receives code, message, ... from user I...

(QB_NEW_EN_OTHER_ERROR_IDS_11)


[grammar] ~11-~11: There might be a problem here.
Context: ...SIGKILL` it ## How node receives code, message, ... from user It is for node to decide....

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~11-~11: There might be a problem here.
Context: ...w node receives code, message, ... from user It is for node to decide. GenVM knows only...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~12-~12: Use articles correctly
Context: ... code, message, ... from user It is for node to decide. GenVM knows only about the c...

(QB_NEW_EN_OTHER_ERROR_IDS_11)


[grammar] ~12-~12: There might be a mistake here.
Context: ...a (and potentially message) and nothing else ## Communication protocol All further comm...

(QB_NEW_EN_OTHER)


[grammar] ~14-~14: Use colons correctly
Context: ...age) and nothing else ## Communication protocol All further communication is done via s...

(QB_NEW_EN_OTHER_ERROR_IDS_30)


[grammar] ~15-~15: Use articles correctly
Context: ...er communication is done via socket. If genvm process exited before sending the resul...

(QB_NEW_EN_OTHER_ERROR_IDS_11)


[grammar] ~15-~15: Use the right verb tense
Context: ...sending the result, it means that genvm crushed. Potential bug should be reported Metho...

(QB_NEW_EN_OTHER_ERROR_IDS_13)


[grammar] ~15-~15: Make sure to use plural and singular nouns correctly
Context: ... it means that genvm crushed. Potential bug should be reported Method ids list is ...

(QB_NEW_EN_OTHER_ERROR_IDS_10)


[grammar] ~15-~15: There might be a mistake here.
Context: ... genvm crushed. Potential bug should be reported Method ids list is available as [json](...

(QB_NEW_EN_OTHER)


[grammar] ~17-~17: There might be a mistake here.
Context: ...o use it in the build system to codegen constants Below is pseudo code, but [real code](....

(QB_NEW_EN_OTHER)


[grammar] ~19-~19: There might be a mistake here.
Context: ...and is as readable as this code is (see async def host_loop(handler: IHost):) const ACCOUNT_ADDR_SIZE = 20 const SLOT_ID_SIZE = 32 fn write_byte_slice(arr): write_u32_le len(arr) write_bytes arr fn read_slice(): len := read_u32_le data := read_bytes(len) return data loop: method_id := read_byte match method_id json/methods/get_calldata: calldata, err := host_get_calldata() if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_byte_slice calldata json/methods/storage_read: read_type := read_byte as json/storage_type address := read_bytes(ACCOUNT_ADDR_SIZE) slot := read_bytes(SLOT_ID_SIZE) index := read_u32_le len := read_u32_le data, err := host_storage_read(read_type, address, slot, index, len) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_bytes data # must be exactly len in size json/methods/storage_write: slot := read_bytes(SLOT_ID_SIZE) index := read_u32_le len := read_u32_le data := read_bytes(len) err := host_storage_write(slot, index, data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/consume_result: host_result := read_slice() # this is needed to ensure that genvm doesn't close socket before all data is read write_byte 0x00 break json/methods/get_leader_nondet_result: call_no := read_u32_le data, err := host_get_leader_nondet_result(call_no) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_byte_slice result_data json/methods/post_nondet_result: call_no := read_u32_le result := read_slice() err := host_post_nondet_result(call_no, result) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/post_message: address := read_bytes(ACCOUNT_ADDR_SIZE) calldata := read_slice() message_data := read_slice() # JSON string err := host_post_message(address, calldata, message_data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/consume_fuel: gas := read_u64_le host_consume_fuel(gas) # note: this method doesn't send any response json/methods/deploy_contract: calldata := read_slice() code := read_slice() message_data := read_slice() # JSON string err := host_deploy_contract(calldata, code, message_data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/eth_call: address := read_bytes(ACCOUNT_ADDR_SIZE) calldata := read_slice() result, err := host_eth_call(address, calldata) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_byte_slice result json/methods/eth_send: address := read_bytes(ACCOUNT_ADDR_SIZE) calldata := read_slice() message_data := read_slice() # JSON string err := host_eth_send(address, calldata, message_data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/get_balance: address := read_bytes(ACCOUNT_ADDR_SIZE) balance, err := host_get_balance(address) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_bytes balance.to_le_bytes(32) # 256-bit integer json/methods/remaining_fuel_as_gen: fuel, err := host_remaining_fuel_as_gen() if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_bytes fuel.to_le_bytes(8) # 64-bit integer, must be safe integer (fits in double) See [mock implementation](../../executo...

(QB_NEW_EN_OTHER)


[grammar] ~143-~143: There might be a problem here.
Context: ...integer (fits in double) ``` See mock implementation ## Types ### VM results There are following codes: -...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~147-~147: Use colons correctly
Context: .../runner/mock_host.py) ## Types ### VM results There are following codes: - Return -...

(QB_NEW_EN_OTHER_ERROR_IDS_30)


[grammar] ~148-~148: There might be a problem here.
Context: ...py) ## Types ### VM results There are following codes: - Return - VMError indicating vm-produced error that usually can't be...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~150-~150: There might be a problem here.
Context: ...vm-produced error that usually can't be handled - UserError user-produced error #### Nondet blocks and sandbox encoding - 1 ...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~153-~153: Use colons correctly
Context: ...d error #### Nondet blocks and sandbox encoding - 1 byte of result code - result as-is ...

(QB_NEW_EN_OTHER_ERROR_IDS_30)


[grammar] ~155-~155: Use correct spacing
Context: ...s-is (calldata for Return, string for VMError|UserError) #### Parent VM result - 1 byte of result cod...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~157-~157: Use colons correctly
Context: ...or VMError|UserError) #### Parent VM result - 1 byte of result code - calldata for ...

(QB_NEW_EN_OTHER_ERROR_IDS_30)


[grammar] ~159-~159: Use correct spacing
Context: ...ssage": "string", "fingerprint": ... }forVMError|UserError ### Calldataget_calldata` method must re...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~161-~161: Use correct spacing
Context: ...t": ... }forVMError|UserError ### Calldataget_calldata` method must return [calldata encoded](../calld...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~162-~162: Use correct spacing
Context: ...](../calldata.md) bytes that conform to ABI: typescript { method?: string, // only for non-consturctors args: Array<any>, kwargs?: { [key: string]: any } } ### Read result It has code followed by byt...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~175-~175: Use correct spacing
Context: ...nternal error, like llm's modules being absent ### Storage format Storage can be seen as a ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~178-~178: There might be a mistake here.
Context: ...eading undefined memory must return zeroes Storage can be seen as a file system tr...

(QB_NEW_EN_OTHER)


[grammar] ~180-~180: There might be a mistake here.
Context: ...em tree containing directories named as contracts which contain files named as slots, the...

(QB_NEW_EN_OTHER)


[grammar] ~180-~180: There might be a mistake here.
Context: ... contracts which contain files named as slots, then following implementation is valid:...

(QB_NEW_EN_OTHER)


[grammar] ~180-~180: Use articles correctly
Context: ...hich contain files named as slots, then following implementation is valid: ```bash # read...

(QB_NEW_EN_OTHER_ERROR_IDS_11)


[grammar] ~180-~180: Use correct spacing
Context: ...slots, then following implementation is valid: bash # read contract_a slot_b 10...100 cat db/contract_a/slot_b.bytes /dev/null | tail -c +10 | head -c +100 NOTE: calculating storage updates, ha...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~186-~186: Add a comma
Context: ... NOTE: calculating storage updates, hashes and so on is host's (node's) responsibi...

(QB_NEW_EN_OTHER_ERROR_IDS_22)


[grammar] ~186-~186: Use articles correctly
Context: ...ng storage updates, hashes and so on is host's (node's) responsibility. It is [the sam...

(QB_NEW_EN_OTHER_ERROR_IDS_11)


[grammar] ~186-~186: Use articles correctly
Context: .../state/state_object.go#L232): they have dirty override for the store

(QB_NEW_EN_OTHER_ERROR_IDS_11)


[grammar] ~186-~186: Use a period to end declarative sentences
Context: ...L232): they have dirty override for the store

(QB_NEW_EN_OTHER_ERROR_IDS_25)

🪛 markdownlint-cli2 (0.17.2)
doc/internal/node.md

21-21: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.11.9)
executor/testdata/runner/mock_host.py

148-148: collections may be undefined, or defined from star imports

(F405)


150-150: HostException may be undefined, or defined from star imports

(F405)


150-150: Errors may be undefined, or defined from star imports

(F405)


152-152: HostException may be undefined, or defined from star imports

(F405)


152-152: Errors may be undefined, or defined from star imports

(F405)


155-155: ResultCode may be undefined, or defined from star imports

(F405)


157-157: ResultCode may be undefined, or defined from star imports

(F405)


159-159: ResultCode may be undefined, or defined from star imports

(F405)


162-162: collections may be undefined, or defined from star imports

(F405)

executor/testdata/runner/base_host.py

195-195: ResultCode may be undefined, or defined from star imports

(F405)


205-205: Errors may be undefined, or defined from star imports

(F405)


209-209: Methods may be undefined, or defined from star imports

(F405)


291-291: Methods may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: initial / pre-commit
🔇 Additional comments (30)
doc/website/src/spec/03-contract-interface/01-calldata.rst (1)

27-28: Typographic change looks good
Switching to straight quotes improves consistency with the rest of the spec. No further action required.

.github/workflows/full.yaml (1)

77-81: Rust toolchain pinned to 1.87.0 – verify compatibility
Jumping three versions (1.84 ➜ 1.87) may surface clippy or MSRV breakages in crates. Confirm the workspace compiles on CI and locally before merging; otherwise pin to stable or the minimal version dictated by cargo msrv.

doc/website/src/index.rst (1)

8-11: Confirm the target document is discoverable by Sphinx

impl-spec/index is now referenced in the root toctree.
Please double-check that:

  1. doc/website/src/impl-spec/index.rst exists on this branch.
  2. impl-spec is listed (or excluded) in exclude_patterns inside conf.py as intended.
  3. A local make html (or RTD build) succeeds without “toctree contains reference to nonexisting file” warnings.

If any of these fail the docs build, CI will break.

executor/testdata/cases/py-core/leader_errors/leader_no_nondet.stdout (1)

1-1: Synchronise new error string with all expectations

The expected output changed from absent_leader_resultabsent.
Ensure every test case, mock host, and Rust error mapping now emits the truncated identifier; otherwise cross-language tests will diverge.

runners/support/registry (1)

1-1: Ensure CI fetches full submodule history and downstream tests still pass

I wasn’t able to list the commits between ac95d7e262ffccbfbc366035fe64b232d1db51cbc0f54a8e77b8eb3254ed76940136886a2d0c8355 because the old SHA isn’t present in the registry submodule clone. Before merging, please manually verify:

  • The registry remote contains the old commit (ac95d7e262ff…), and your CI isn’t using a shallow clone that omits it.
  • All workflows that check out this repo include submodule support, e.g.:
    • git submodule update --init --recursive
    • or, for GitHub Actions, with: submodules: true under actions/checkout
  • Run cargo test --workspace --all-targets (and any Python tests) locally and confirm CI jobs pass against the bumped registry layout.
doc/website/src/spec/03-contract-interface/05-execution-flow.rst (1)

28-28: Cross-reference target needs verification

The link now points to 04-upgradability. Please confirm that the referenced document/label actually exists after the documentation reshuffle; otherwise Sphinx will emit a broken-link warning.

executor/testdata/runner/host_fns.py (1)

21-22: Enum rename requires downstream updates

REMAINING_FUEL_AS_GEN replaces REMAINING_FUEL.
Make sure all Python test harnesses and the Rust side (executor/src/host/**) dispatch on the new discriminant 13; any stale literal values or pattern matches will silently mis-route messages.

doc/website/src/impl-spec/01-core-architecture/index.rst (1)

6-11: Verify toctree paths

Please make sure the three referenced files are located exactly at the same directory level; a mismatch (e.g. missing .rst or differing filename case) breaks make html.

doc/website/src/impl-spec/index.rst (1)

1-14: Well-structured documentation entry point.

The documentation structure is clean and follows reStructuredText conventions properly. The toctree organization with glossary and core architecture sections provides good navigation for users.

doc/website/src/impl-spec/01-core-architecture/02-host-interface.rst (2)

100-103: Documentation correctly reflects protocol changes.

The update from read_result() to read_slice() accurately reflects the simplified host communication protocol. The comment about preventing socket closure before data is read provides helpful context for the protocol design.


112-112: Improved variable naming for clarity.

The change from data to result_data provides better semantic clarity about what the variable contains, improving documentation readability.

executor/common/src/calldata/mod.rs (1)

7-7: Good API extension for encoding flexibility.

Making encode_to public allows users to encode directly into existing buffers, which is more efficient than the existing encode function that creates new vectors. This is a useful addition that maintains backward compatibility.

executor/modules/implementation/scripting/llm-default.lua (1)

49-53: Improved logging reduces noise while maintaining error visibility.

The conditional logging change is well-designed. It only logs warnings for unexpected errors (when as_user_error is nil), reducing log noise while ensuring that non-user errors are still visible for debugging. The logic correctly handles the error flow.

doc/website/src/spec/01-core-architecture/02-vm.rst (3)

8-8: Good typography standardization.

Changing from curly quotes to straight quotes improves consistency and readability in technical documentation.


142-142: Corrected cross-reference maintains documentation accuracy.

The update from "04-calldata" to "01-calldata" ensures the cross-reference points to the correct documentation section, maintaining navigability.


148-148: Minor wording improvements enhance clarity.

The small wording changes improve readability without altering the technical meaning, making the documentation more accessible.

Also applies to: 179-179

executor/testdata/runner/mock_host.py (2)

148-160: Protocol change looks good - aligns with the unified result encoding.

The change to return a single bytes buffer with the ResultCode as the first byte followed by the encoded payload is consistent with the new protocol specification. The use of exceptions for error cases instead of return codes is a cleaner approach.


162-163: Simplified method signature aligns with protocol changes.

The removal of the type: ResultCode parameter is consistent with the protocol changes where the result code is now embedded in the data itself.

doc/website/src/spec/01-core-architecture/01-components.rst (1)

1-19: Documentation restructuring provides clear separation of concerns.

The condensed high-level overview effectively summarizes the GenVM architecture while avoiding duplication with the detailed implementation specification. The focus on the dual-mode execution model and supervisor pattern provides a good conceptual understanding.

doc/website/src/impl-spec/01-core-architecture/01-components.rst (1)

1-142: Comprehensive architectural documentation with clear structure.

This new documentation provides an excellent detailed overview of GenVM's architecture. The Mermaid diagram effectively visualizes the component relationships, and the detailed descriptions of each component are well-organized and informative.

doc/internal/node.md (1)

1-187: Valuable technical documentation with detailed protocol specification.

This internal documentation provides essential details about the GenVM process operation and communication protocol. The pseudo-code implementation is particularly helpful for understanding the protocol flow.

executor/testdata/runner/base_host.py (5)

77-81: Interface simplification aligns with the new protocol design.

The removal of explicit ResultCode parameters and the switch to raw byte buffers simplifies the interface while maintaining the same functionality through embedded result codes.


150-153: Clean implementation of the slice reading function.

The read_slice function correctly reads length-prefixed data and returns it as a memoryview for efficient buffer handling.


194-195: Correct extraction of ResultCode from the received slice.

The implementation properly extracts the result code from the first byte and passes the remaining data to the handler.


205-208: Proper implementation of length-prefixed data sending.

The code correctly sends the data as a length-prefixed byte slice after the OK status.


291-291: Method name update for consistency.

The rename from REMAINING_FUEL to REMAINING_FUEL_AS_GEN aligns with the naming convention used across the codebase.

executor/src/host/mod.rs (4)

11-11: LGTM!

The core::str import is appropriately added to support UTF-8 conversions in the refactored result decoding logic.


275-275: Good addition of observability!

The logging statements provide valuable debugging information and are well-placed at function entry points and after key operations.

Also applies to: 329-329, 335-335, 370-370, 394-394, 414-414, 434-434, 448-448, 472-472, 496-496, 514-514


265-265: Excellent refactoring to use unified helper!

The consistent use of write_slice across storage_write, post_message, and deploy_contract reduces code duplication and ensures uniform protocol handling.

Also applies to: 403-404, 422-424


356-363: Document or handle UTF-8 assumptions for host error messages

The host/VM protocol currently uses str::from_utf8(rest)? when decoding user and VM error payloads:

executor/src/host/mod.rs
   …
   x if x == ResultCode::UserError as u8 =>
       vm::RunOk::UserError(String::from(str::from_utf8(rest)?))
   x if x == ResultCode::VmError as u8 =>
       vm::RunOk::VMError(String::from(str::from_utf8(rest)?), None)

We didn’t find any explicit guarantee in the protocol spec or code comments that error messages are always valid UTF-8. As a result, if the host ever sends non-UTF-8 bytes, this will bubble up as a decoding error.

• Either document in your protocol/message format that all error payloads must be UTF-8 encoded
• Or switch to a lossy conversion (e.g. String::from_utf8_lossy(rest).into_owned()) and handle any replacement characters gracefully

Please review the intended guarantees for host error strings and update the code or documentation accordingly.

Comment on lines +209 to 210
pub fn encode_to(to: &mut Vec<u8>, value: &Value) {
match value {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Making encode_to public needs accompanying contract + docs
encode_to was private and is now exposed to external callers, but:

  • It mutates to blindly and relies on callers to respect the already-encoded invariants (ordered map keys, no circular references, etc.).
  • There is no #[doc] comment explaining these invariants or the expected pre-/post-conditions.
  • The function name is exported verbatim; consider pub(crate) if only GenVM internals need it, otherwise document it thoroughly and add fuzz/round-trip tests to lock the wire format.
-pub fn encode_to(to: &mut Vec<u8>, value: &Value) {
+/// Encodes `value` in Calldata binary format and appends the bytes to `to`.
+///
+/// Caller must ensure map keys are **strictly** ascending and the buffer
+/// does **not** already contain partial Calldata; otherwise the stream will be
+/// un-decodable.
+pub fn encode_to(to: &mut Vec<u8>, value: &Value) {

🏁 Script executed:

#!/usr/bin/env bash
set -e

echo "1. Function definition and visibility:"
rg -n -C3 "^pub fn encode_to" executor/common/src/calldata/bin.rs || true

echo
echo "2. Doc comments directly above encode_to:"
rg -B5 "^pub fn encode_to" executor/common/src/calldata/bin.rs | sed -n '1,10p' || true

echo
echo "3. All references to encode_to in the repository:"
rg -n "encode_to" .

echo
echo "4. Tests or examples mentioning encode_to:"
rg -n "encode_to" -g "*test*.rs" || true

Length of output: 1648


Add documentation and tests for encode_to

The newly public encode_to function in executor/common/src/calldata/bin.rs (line 209) has no docs and relies on strict caller invariants. Please:

  • Insert a /// doc comment above encode_to detailing:
    • That it appends Calldata in binary format to to.
    • Caller guarantees: empty or fully-encoded buffer, strictly ascending map keys, no circular refs.
  • Add fuzz or round-trip tests (e.g. in tests/) to lock down the wire format.
  • If callers outside the executor internals don’t need this API, consider changing its re-export in executor/common/src/calldata/mod.rs (around line 7) to pub(crate) instead of pub use.

Suggested doc stub:

 executor/common/src/calldata/bin.rs
@@ -206,6 +206,13 @@
     }
 }
 
+/// Encode `value` into Calldata’s binary format, appending to `to`.
+///
+/// # Safety & Invariants
+/// - `to` must be empty or already contain a complete Calldata stream.
///  - Map keys in `value` must be strictly ascending.
///  - No circular references allowed (will overflow/loop).
+/// - Violating these will render the stream un-decodable.
 pub fn encode_to(to: &mut Vec<u8>, value: &Value) {
     match value {
         Value::Null => to.push(SPECIAL_NULL),

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In executor/common/src/calldata/bin.rs at lines 209-210, the public function
encode_to lacks documentation and tests. Add a /// doc comment above encode_to
explaining that it appends Calldata in binary format to the provided buffer and
specifying caller guarantees: the buffer must be empty or fully encoded, map
keys strictly ascending, and no circular references. Then, create fuzz or
round-trip tests in the tests/ directory to verify and lock down the wire
format. Finally, if external callers outside executor internals do not require
this function, change its re-export in executor/common/src/calldata/mod.rs near
line 7 from pub use to pub(crate) to restrict visibility.

Comment on lines 313 to 318
Err(e) => {
let data = calldata::Value::Str(format!("{e:?}"));
let val = calldata::encode(&data);
let mut encoded = Vec::from([ResultCode::VmError as u8]);
encoded.extend_from_slice(format!("{e:?}").as_bytes());

(ResultCode::InternalError, val)
encoded
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use Display formatting for external error messages.

Debug formatting ({e:?}) on line 315 may expose internal implementation details. Consider using Display formatting for cleaner external error messages.

         Err(e) => {
             let mut encoded = Vec::from([ResultCode::VmError as u8]);
-            encoded.extend_from_slice(format!("{e:?}").as_bytes());
+            encoded.extend_from_slice(format!("{e}").as_bytes());
 
             encoded
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Err(e) => {
let data = calldata::Value::Str(format!("{e:?}"));
let val = calldata::encode(&data);
let mut encoded = Vec::from([ResultCode::VmError as u8]);
encoded.extend_from_slice(format!("{e:?}").as_bytes());
(ResultCode::InternalError, val)
encoded
}
Err(e) => {
let mut encoded = Vec::from([ResultCode::VmError as u8]);
encoded.extend_from_slice(format!("{e}").as_bytes());
encoded
}
🤖 Prompt for AI Agents
In executor/src/host/mod.rs around lines 313 to 318, the error message is
formatted using Debug (`{e:?}`), which may expose internal details. Change the
formatting to use Display (`{e}`) instead by replacing `format!("{e:?}")` with
`format!("{e}")` to provide cleaner and safer external error messages.

Comment on lines +70 to 77
fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> {
let len = data.len() as u32;

sock.write_all(&len.to_le_bytes())?;
sock.write_all(data)?;

Ok(())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add bounds check to prevent silent truncation.

The cast from usize to u32 on line 71 could silently truncate if the data length exceeds 4GB. While unlikely in practice, it's better to fail explicitly.

 fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> {
-    let len = data.len() as u32;
+    let len = u32::try_from(data.len())
+        .context("Data length exceeds maximum size (4GB)")?;
 
     sock.write_all(&len.to_le_bytes())?;
     sock.write_all(data)?;
 
     Ok(())
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> {
let len = data.len() as u32;
sock.write_all(&len.to_le_bytes())?;
sock.write_all(data)?;
Ok(())
}
fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> {
let len = u32::try_from(data.len())
.context("Data length exceeds maximum size (4GB)")?;
sock.write_all(&len.to_le_bytes())?;
sock.write_all(data)?;
Ok(())
}
🤖 Prompt for AI Agents
In executor/src/host/mod.rs around lines 70 to 77, the cast from usize to u32
for data length can silently truncate if data exceeds 4GB. Add an explicit check
before the cast to verify that data.len() does not exceed u32::MAX, and return
an error if it does, preventing silent truncation.

sock.write_all(&call_no.to_le_bytes())?;

write_result(sock, Ok(res))?;
write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Consider avoiding unnecessary allocation.

The Vec::from_iter creates an allocation that might be avoidable if the result can provide bytes as a slice.

-        write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;
+        // If res has a method that returns &[u8] directly:
+        write_slice(sock, res.as_bytes())?;
+        // Otherwise, consider writing bytes directly without intermediate Vec

🏁 Script executed:

#!/bin/bash
# Check if RunOk has an as_bytes() method or similar
ast-grep --pattern 'impl $_ for RunOk {
  $$$
  fn as_bytes($$$) $$$
  $$$
}'

Length of output: 79


We still need to confirm what res actually is and how its as_bytes_iter() is implemented. Let’s locate its type and the iterator definition:


🏁 Script executed:

#!/bin/bash
# Find the definition of the result type (e.g., RunOk)
rg -n "struct RunOk" -A 3
rg -n "enum RunOk" -A 3

# Find how as_bytes_iter is implemented
rg -n "as_bytes_iter" -A 5

Length of output: 2389


Avoid heap allocation in post_nondet_result

The current code collects the entire byte sequence into a Vec<u8>, causing an unnecessary heap allocation:

write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;

Instead, you can write the status byte and payload directly, eliminating the temporary Vec:

-        write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;
+        use crate::public_abi::ResultCode;
+
+        // Write the status code byte
+        let code = [match res {
+            RunOk::Return(_)    => ResultCode::Return    as u8,
+            RunOk::UserError(_) => ResultCode::UserError as u8,
+            RunOk::VMError(_,_) => ResultCode::VMError   as u8,
+        }];
+        write_slice(sock, &code)?;
+
+        // Write the payload bytes without an intermediate Vec
+        match res {
+            RunOk::Return(buf)    => write_slice(sock, buf)?,
+            RunOk::UserError(msg) => write_slice(sock, msg.as_bytes())?,
+            RunOk::VMError(msg, _)=> write_slice(sock, msg.as_bytes())?,
+        }

This refactoring removes the extra allocation while preserving the exact same output.

🤖 Prompt for AI Agents
In executor/src/host/mod.rs at line 379, the code creates a temporary Vec from
an iterator causing unnecessary heap allocation. Refactor the code to write the
status byte and payload directly to the socket without collecting into a Vec, by
calling write_slice separately for the status byte and the payload iterator,
thus eliminating the intermediate allocation while preserving output.

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 9, 2025
kp2pml30 and others added 2 commits July 10, 2025 18:32
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@kp2pml30 kp2pml30 merged commit 0366da9 into main Jul 10, 2025
1 of 2 checks passed
@kp2pml30 kp2pml30 deleted the batch-update branch July 10, 2025 09:34
@coderabbitai coderabbitai bot mentioned this pull request Aug 28, 2025
@coderabbitai coderabbitai bot mentioned this pull request Dec 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant