Conversation
|
Caution Review failedThe pull request is closed. WalkthroughThis update introduces extensive documentation for GenVM's architecture, host interface, and glossary, alongside significant refactoring of the host communication protocol in both Rust and Python test harnesses. The Rust host interface now uses a unified length-prefixed byte slice protocol, simplifying result encoding and decoding. Minor bug fixes, logging improvements, and public API extensions are also included. Changes
Sequence Diagram(s)sequenceDiagram
participant Node
participant GenVM
participant Host
Node->>GenVM: Start process (--host, --message)
GenVM->>Host: Connect via socket, send/receive messages
loop Protocol
Host->>GenVM: Send method ID + data
GenVM->>Host: Process request, respond with result slice
alt Nondeterministic call
GenVM->>Host: post_nondet_result (call_no, encoded result)
Host->>GenVM: get_leader_nondet_result (call_no)
Host-->>GenVM: Return result slice (ResultCode + data)
end
end
Node->>GenVM: Terminate (SIGTERM/SIGKILL)
GenVM-->>Node: Exit (send result or crash)
Possibly related PRs
Suggested labels
Poem
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Clippy (1.86.0)error: failed to get Caused by: Caused by: Caused by: Caused by: 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (1)
doc/website/src/impl-spec/01-core-architecture/03-modules.rst (1)
7-8: Repeated escaped pluralSame remark applies here (and all other occurrences) to avoid visual artefacts in the generated HTML/PDF.
🧹 Nitpick comments (13)
.github/workflows/full.yaml (1)
104-108:cargo --versionis usually printed bysetup-rust-toolchain
The extra call is harmless but adds noise. Drop it unless you need the explicit line in logs.doc/website/src/spec/03-contract-interface/03-storage.rst (1)
50-53: Good clarification, minor grammar tweak
“but using it is not mandatory” → consider “its use is optional” for smoother phrasing.doc/website/src/spec/index.rst (1)
4-5: Grammar polish
“abstract GenLayer's virtual machine” reads awkwardly. Prefer:This specification describes GenLayer’s *abstract* virtual machine, which runs :term:`Intelligent Contract`\s.doc/website/src/impl-spec/01-core-architecture/03-modules.rst (2)
1-2: Inline pluralisation may render oddlyUsing
:term:`Module`\srelies on Sphinx treating the back-slash as a literal escape. In some themes this shows up asModule\s. Consider the more common pattern::term:`Module` sor simply keep the term singular in running text.
19-23: Consistency with glossary definitionThe bullet list re-uses the pluralised escape. Once the first change is addressed, grep the file to update the remaining spots for consistency.
doc/website/src/impl-spec/glossary.rst (1)
6-8: Wording could be tightened“... to prevent forcing LLMs to do unintended by contract creator actions.” →
“... to prevent LLMs from performing actions unintended by the contract creator.”executor/testdata/runner/mock_host.py (1)
150-159: Static analysis warnings about undefined names from star imports.The static analysis tool flags several undefined names (
HostException,Errors,ResultCode) that are imported via star imports. While this works at runtime due to the star import frombase_host, it makes the code harder to analyze statically.Consider using explicit imports for better code clarity and static analysis:
-from base_host import * +from base_host import ( + IHost, HostException, Errors, ResultCode, StorageType, + DefaultTransactionData, DefaultEthTransactionData, + host_loop, asyncio, collections +)doc/website/src/impl-spec/01-core-architecture/01-components.rst (2)
79-82: Minor formatting issue in the deterministic mode description.The text appears to have formatting issues where list items are concatenated without proper line breaks.
-**Deterministic Mode:** - Executes blockchain consensus logic - Provides -reproducible results across all validators - Handles storage operations, -message passing, and standard computation +**Deterministic Mode:** +- Executes blockchain consensus logic +- Provides reproducible results across all validators +- Handles storage operations, message passing, and standard computation
91-96: Consider consistent formatting for WASI interface descriptions.The WASI Preview 1 description uses a different formatting style than other sections.
-**WASI Preview 1 (``wasip1``)** - Standard WASI interface with -deterministic modifications - File system operations, environment -access, time functions - Modified to ensure reproducible behavior across -validators +**WASI Preview 1 (``wasip1``)** +- Standard WASI interface with deterministic modifications +- File system operations, environment access, time functions +- Modified to ensure reproducible behavior across validatorsdoc/internal/node.md (3)
21-141: Add language identifier to the fenced code block.The pseudo-code block should have a language identifier for proper syntax highlighting.
-``` +```pseudo const ACCOUNT_ADDR_SIZE = 20
3-4: Fix grammar issues in the process arguments section.-To run a genvm, one must start a genvm process with following arguments: -- `--host` tcp-it address or `unix://` prefixed unix domain socket +To run GenVM, one must start a GenVM process with the following arguments: +- `--host` TCP/IP address or `unix://` prefixed Unix domain socket
15-15: Fix typo in error handling description.-Send it `SIGTERM`. If it doesn't quit in some sensible amount of time just `SIGKILL` it +Send it `SIGTERM`. If it doesn't quit in a sensible amount of time, just `SIGKILL` itexecutor/testdata/runner/base_host.py (1)
195-291: Consider addressing static analysis warnings about undefined names.The static analysis tool flags
ResultCode,Errors, andMethodsas potentially undefined due to the dynamic imports viaexec().Consider using a more explicit import mechanism that static analysis tools can understand:
# Instead of: if typing.TYPE_CHECKING: from .host_fns import * from .result_codes import * else: exec(Path(__file__).parent.joinpath('host_fns.py').read_text()) exec(Path(__file__).parent.joinpath('result_codes.py').read_text()) # Use: from .host_fns import Methods, Errors, StorageType from .result_codes import ResultCodeThis would make the code more maintainable and eliminate the static analysis warnings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
.github/workflows/full.yaml(2 hunks)doc/internal/node.md(1 hunks)doc/website/src/impl-spec/01-core-architecture/01-components.rst(1 hunks)doc/website/src/impl-spec/01-core-architecture/02-host-interface.rst(2 hunks)doc/website/src/impl-spec/01-core-architecture/03-modules.rst(2 hunks)doc/website/src/impl-spec/01-core-architecture/index.rst(1 hunks)doc/website/src/impl-spec/glossary.rst(1 hunks)doc/website/src/impl-spec/index.rst(1 hunks)doc/website/src/index.rst(1 hunks)doc/website/src/spec/01-core-architecture/01-components.rst(1 hunks)doc/website/src/spec/01-core-architecture/02-vm.rst(3 hunks)doc/website/src/spec/01-core-architecture/index.rst(0 hunks)doc/website/src/spec/03-contract-interface/01-calldata.rst(1 hunks)doc/website/src/spec/03-contract-interface/03-storage.rst(1 hunks)doc/website/src/spec/03-contract-interface/05-execution-flow.rst(1 hunks)doc/website/src/spec/glossary.rst(0 hunks)doc/website/src/spec/index.rst(1 hunks)executor/common/src/calldata/bin.rs(1 hunks)executor/common/src/calldata/mod.rs(1 hunks)executor/modules/implementation/scripting/llm-default.lua(1 hunks)executor/src/host/mod.rs(12 hunks)executor/testdata/cases/py-core/leader_errors/leader_no_nondet.stdout(1 hunks)executor/testdata/runner/base_host.py(5 hunks)executor/testdata/runner/host_fns.py(1 hunks)executor/testdata/runner/mock_host.py(1 hunks)runners/support/registry(1 hunks)
💤 Files with no reviewable changes (2)
- doc/website/src/spec/01-core-architecture/index.rst
- doc/website/src/spec/glossary.rst
🧰 Additional context used
🧬 Code Graph Analysis (2)
executor/common/src/calldata/mod.rs (1)
executor/common/src/calldata/bin.rs (2)
encode(280-286)encode_to(209-278)
executor/testdata/runner/mock_host.py (4)
executor/testdata/runner/base_host.py (1)
HostException(26-31)executor/testdata/runner/host_fns.py (1)
Errors(24-29)executor/testdata/runner/result_codes.py (1)
ResultCode(7-11)runners/genlayer-py-std/src/genlayer/py/calldata.py (1)
encode(108-201)
🪛 LanguageTool
doc/internal/node.md
[grammar] ~1-~1: Use correct spacing
Context: ## Running the GenVM To run a genvm, one must start a genvm ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~3-~3: Use proper capitalization
Context: ## Running the GenVM To run a genvm, one must start a genvm process with fo...
(QB_NEW_EN_OTHER_ERROR_IDS_6)
[grammar] ~3-~3: Use proper capitalization
Context: ...GenVM To run a genvm, one must start a genvm process with following arguments: - `--...
(QB_NEW_EN_OTHER_ERROR_IDS_6)
[grammar] ~3-~3: There might be a problem here.
Context: ...vm, one must start a genvm process with following arguments: - --host tcp-it address or unix:// prefixed unix doma...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~4-~4:
Context: ...dress or unix:// prefixed unix domain socket - --message message data as json, follows [schema](../schem...
(QB_NEW_EN_OTHER_ERROR_IDS_)
[grammar] ~5-~5: There might be a problem here.
Context: ...ket - --message message data as json, follows schema See [example](../../executor/testdata/templ...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~6-~6: There might be a mistake here.
Context: ...templates/message.json) that is used in tests ## How to ask GenVM to quit? Send it `SIGT...
(QB_NEW_EN_OTHER)
[grammar] ~9-~9: There might be a mistake here.
Context: ...doesn't quit in some sensible amount of time just SIGKILL it ## How node receives...
(QB_NEW_EN_OTHER)
[grammar] ~9-~9: There might be a mistake here.
Context: ... sensible amount of time just SIGKILL it ## How node receives code, message, ... fr...
(QB_NEW_EN_OTHER)
[grammar] ~11-~11: Use articles correctly
Context: ...mount of time just SIGKILL it ## How node receives code, message, ... from user I...
(QB_NEW_EN_OTHER_ERROR_IDS_11)
[grammar] ~11-~11: There might be a problem here.
Context: ...SIGKILL` it ## How node receives code, message, ... from user It is for node to decide....
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~11-~11: There might be a problem here.
Context: ...w node receives code, message, ... from user It is for node to decide. GenVM knows only...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~12-~12: Use articles correctly
Context: ... code, message, ... from user It is for node to decide. GenVM knows only about the c...
(QB_NEW_EN_OTHER_ERROR_IDS_11)
[grammar] ~12-~12: There might be a mistake here.
Context: ...a (and potentially message) and nothing else ## Communication protocol All further comm...
(QB_NEW_EN_OTHER)
[grammar] ~14-~14: Use colons correctly
Context: ...age) and nothing else ## Communication protocol All further communication is done via s...
(QB_NEW_EN_OTHER_ERROR_IDS_30)
[grammar] ~15-~15: Use articles correctly
Context: ...er communication is done via socket. If genvm process exited before sending the resul...
(QB_NEW_EN_OTHER_ERROR_IDS_11)
[grammar] ~15-~15: Use the right verb tense
Context: ...sending the result, it means that genvm crushed. Potential bug should be reported Metho...
(QB_NEW_EN_OTHER_ERROR_IDS_13)
[grammar] ~15-~15: Make sure to use plural and singular nouns correctly
Context: ... it means that genvm crushed. Potential bug should be reported Method ids list is ...
(QB_NEW_EN_OTHER_ERROR_IDS_10)
[grammar] ~15-~15: There might be a mistake here.
Context: ... genvm crushed. Potential bug should be reported Method ids list is available as [json](...
(QB_NEW_EN_OTHER)
[grammar] ~17-~17: There might be a mistake here.
Context: ...o use it in the build system to codegen constants Below is pseudo code, but [real code](....
(QB_NEW_EN_OTHER)
[grammar] ~19-~19: There might be a mistake here.
Context: ...and is as readable as this code is (see async def host_loop(handler: IHost):) const ACCOUNT_ADDR_SIZE = 20 const SLOT_ID_SIZE = 32 fn write_byte_slice(arr): write_u32_le len(arr) write_bytes arr fn read_slice(): len := read_u32_le data := read_bytes(len) return data loop: method_id := read_byte match method_id json/methods/get_calldata: calldata, err := host_get_calldata() if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_byte_slice calldata json/methods/storage_read: read_type := read_byte as json/storage_type address := read_bytes(ACCOUNT_ADDR_SIZE) slot := read_bytes(SLOT_ID_SIZE) index := read_u32_le len := read_u32_le data, err := host_storage_read(read_type, address, slot, index, len) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_bytes data # must be exactly len in size json/methods/storage_write: slot := read_bytes(SLOT_ID_SIZE) index := read_u32_le len := read_u32_le data := read_bytes(len) err := host_storage_write(slot, index, data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/consume_result: host_result := read_slice() # this is needed to ensure that genvm doesn't close socket before all data is read write_byte 0x00 break json/methods/get_leader_nondet_result: call_no := read_u32_le data, err := host_get_leader_nondet_result(call_no) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_byte_slice result_data json/methods/post_nondet_result: call_no := read_u32_le result := read_slice() err := host_post_nondet_result(call_no, result) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/post_message: address := read_bytes(ACCOUNT_ADDR_SIZE) calldata := read_slice() message_data := read_slice() # JSON string err := host_post_message(address, calldata, message_data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/consume_fuel: gas := read_u64_le host_consume_fuel(gas) # note: this method doesn't send any response json/methods/deploy_contract: calldata := read_slice() code := read_slice() message_data := read_slice() # JSON string err := host_deploy_contract(calldata, code, message_data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/eth_call: address := read_bytes(ACCOUNT_ADDR_SIZE) calldata := read_slice() result, err := host_eth_call(address, calldata) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_byte_slice result json/methods/eth_send: address := read_bytes(ACCOUNT_ADDR_SIZE) calldata := read_slice() message_data := read_slice() # JSON string err := host_eth_send(address, calldata, message_data) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok json/methods/get_balance: address := read_bytes(ACCOUNT_ADDR_SIZE) balance, err := host_get_balance(address) if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_bytes balance.to_le_bytes(32) # 256-bit integer json/methods/remaining_fuel_as_gen: fuel, err := host_remaining_fuel_as_gen() if err != json/errors/ok: write_byte err else: write_byte json/errors/ok write_bytes fuel.to_le_bytes(8) # 64-bit integer, must be safe integer (fits in double) See [mock implementation](../../executo...
(QB_NEW_EN_OTHER)
[grammar] ~143-~143: There might be a problem here.
Context: ...integer (fits in double) ``` See mock implementation ## Types ### VM results There are following codes: -...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~147-~147: Use colons correctly
Context: .../runner/mock_host.py) ## Types ### VM results There are following codes: - Return -...
(QB_NEW_EN_OTHER_ERROR_IDS_30)
[grammar] ~148-~148: There might be a problem here.
Context: ...py) ## Types ### VM results There are following codes: - Return - VMError indicating vm-produced error that usually can't be...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~150-~150: There might be a problem here.
Context: ...vm-produced error that usually can't be handled - UserError user-produced error #### Nondet blocks and sandbox encoding - 1 ...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~153-~153: Use colons correctly
Context: ...d error #### Nondet blocks and sandbox encoding - 1 byte of result code - result as-is ...
(QB_NEW_EN_OTHER_ERROR_IDS_30)
[grammar] ~155-~155: Use correct spacing
Context: ...s-is (calldata for Return, string for VMError|UserError) #### Parent VM result - 1 byte of result cod...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~157-~157: Use colons correctly
Context: ...or VMError|UserError) #### Parent VM result - 1 byte of result code - calldata for ...
(QB_NEW_EN_OTHER_ERROR_IDS_30)
[grammar] ~159-~159: Use correct spacing
Context: ...ssage": "string", "fingerprint": ... }forVMError|UserError ### Calldataget_calldata` method must re...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~161-~161: Use correct spacing
Context: ...t": ... }forVMError|UserError ### Calldataget_calldata` method must return [calldata encoded](../calld...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~162-~162: Use correct spacing
Context: ...](../calldata.md) bytes that conform to ABI: typescript { method?: string, // only for non-consturctors args: Array<any>, kwargs?: { [key: string]: any } } ### Read result It has code followed by byt...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~175-~175: Use correct spacing
Context: ...nternal error, like llm's modules being absent ### Storage format Storage can be seen as a ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~178-~178: There might be a mistake here.
Context: ...eading undefined memory must return zeroes Storage can be seen as a file system tr...
(QB_NEW_EN_OTHER)
[grammar] ~180-~180: There might be a mistake here.
Context: ...em tree containing directories named as contracts which contain files named as slots, the...
(QB_NEW_EN_OTHER)
[grammar] ~180-~180: There might be a mistake here.
Context: ... contracts which contain files named as slots, then following implementation is valid:...
(QB_NEW_EN_OTHER)
[grammar] ~180-~180: Use articles correctly
Context: ...hich contain files named as slots, then following implementation is valid: ```bash # read...
(QB_NEW_EN_OTHER_ERROR_IDS_11)
[grammar] ~180-~180: Use correct spacing
Context: ...slots, then following implementation is valid: bash # read contract_a slot_b 10...100 cat db/contract_a/slot_b.bytes /dev/null | tail -c +10 | head -c +100 NOTE: calculating storage updates, ha...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~186-~186: Add a comma
Context: ... NOTE: calculating storage updates, hashes and so on is host's (node's) responsibi...
(QB_NEW_EN_OTHER_ERROR_IDS_22)
[grammar] ~186-~186: Use articles correctly
Context: ...ng storage updates, hashes and so on is host's (node's) responsibility. It is [the sam...
(QB_NEW_EN_OTHER_ERROR_IDS_11)
[grammar] ~186-~186: Use articles correctly
Context: .../state/state_object.go#L232): they have dirty override for the store
(QB_NEW_EN_OTHER_ERROR_IDS_11)
[grammar] ~186-~186: Use a period to end declarative sentences
Context: ...L232): they have dirty override for the store
(QB_NEW_EN_OTHER_ERROR_IDS_25)
🪛 markdownlint-cli2 (0.17.2)
doc/internal/node.md
21-21: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.11.9)
executor/testdata/runner/mock_host.py
148-148: collections may be undefined, or defined from star imports
(F405)
150-150: HostException may be undefined, or defined from star imports
(F405)
150-150: Errors may be undefined, or defined from star imports
(F405)
152-152: HostException may be undefined, or defined from star imports
(F405)
152-152: Errors may be undefined, or defined from star imports
(F405)
155-155: ResultCode may be undefined, or defined from star imports
(F405)
157-157: ResultCode may be undefined, or defined from star imports
(F405)
159-159: ResultCode may be undefined, or defined from star imports
(F405)
162-162: collections may be undefined, or defined from star imports
(F405)
executor/testdata/runner/base_host.py
195-195: ResultCode may be undefined, or defined from star imports
(F405)
205-205: Errors may be undefined, or defined from star imports
(F405)
209-209: Methods may be undefined, or defined from star imports
(F405)
291-291: Methods may be undefined, or defined from star imports
(F405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: initial / pre-commit
🔇 Additional comments (30)
doc/website/src/spec/03-contract-interface/01-calldata.rst (1)
27-28: Typographic change looks good
Switching to straight quotes improves consistency with the rest of the spec. No further action required..github/workflows/full.yaml (1)
77-81: Rust toolchain pinned to 1.87.0 – verify compatibility
Jumping three versions (1.84 ➜ 1.87) may surface clippy or MSRV breakages in crates. Confirm the workspace compiles on CI and locally before merging; otherwise pin tostableor the minimal version dictated bycargo msrv.doc/website/src/index.rst (1)
8-11: Confirm the target document is discoverable by Sphinx
impl-spec/indexis now referenced in the root toctree.
Please double-check that:
doc/website/src/impl-spec/index.rstexists on this branch.impl-specis listed (or excluded) inexclude_patternsinsideconf.pyas intended.- A local
make html(or RTD build) succeeds without “toctree contains reference to nonexisting file” warnings.If any of these fail the docs build, CI will break.
executor/testdata/cases/py-core/leader_errors/leader_no_nondet.stdout (1)
1-1: Synchronise new error string with all expectationsThe expected output changed from
absent_leader_result→absent.
Ensure every test case, mock host, and Rust error mapping now emits the truncated identifier; otherwise cross-language tests will diverge.runners/support/registry (1)
1-1: Ensure CI fetches full submodule history and downstream tests still passI wasn’t able to list the commits between
ac95d7e262ffccbfbc366035fe64b232d1db51cb→c0f54a8e77b8eb3254ed76940136886a2d0c8355because the old SHA isn’t present in the registry submodule clone. Before merging, please manually verify:
- The registry remote contains the old commit (
ac95d7e262ff…), and your CI isn’t using a shallow clone that omits it.- All workflows that check out this repo include submodule support, e.g.:
git submodule update --init --recursive- or, for GitHub Actions,
with: submodules: trueunderactions/checkout- Run
cargo test --workspace --all-targets(and any Python tests) locally and confirm CI jobs pass against the bumped registry layout.doc/website/src/spec/03-contract-interface/05-execution-flow.rst (1)
28-28: Cross-reference target needs verificationThe link now points to
04-upgradability. Please confirm that the referenced document/label actually exists after the documentation reshuffle; otherwise Sphinx will emit a broken-link warning.executor/testdata/runner/host_fns.py (1)
21-22: Enum rename requires downstream updates
REMAINING_FUEL_AS_GENreplacesREMAINING_FUEL.
Make sure all Python test harnesses and the Rust side (executor/src/host/**) dispatch on the new discriminant 13; any stale literal values or pattern matches will silently mis-route messages.doc/website/src/impl-spec/01-core-architecture/index.rst (1)
6-11: Verify toctree pathsPlease make sure the three referenced files are located exactly at the same directory level; a mismatch (e.g. missing
.rstor differing filename case) breaksmake html.doc/website/src/impl-spec/index.rst (1)
1-14: Well-structured documentation entry point.The documentation structure is clean and follows reStructuredText conventions properly. The toctree organization with glossary and core architecture sections provides good navigation for users.
doc/website/src/impl-spec/01-core-architecture/02-host-interface.rst (2)
100-103: Documentation correctly reflects protocol changes.The update from
read_result()toread_slice()accurately reflects the simplified host communication protocol. The comment about preventing socket closure before data is read provides helpful context for the protocol design.
112-112: Improved variable naming for clarity.The change from
datatoresult_dataprovides better semantic clarity about what the variable contains, improving documentation readability.executor/common/src/calldata/mod.rs (1)
7-7: Good API extension for encoding flexibility.Making
encode_topublic allows users to encode directly into existing buffers, which is more efficient than the existingencodefunction that creates new vectors. This is a useful addition that maintains backward compatibility.executor/modules/implementation/scripting/llm-default.lua (1)
49-53: Improved logging reduces noise while maintaining error visibility.The conditional logging change is well-designed. It only logs warnings for unexpected errors (when
as_user_erroris nil), reducing log noise while ensuring that non-user errors are still visible for debugging. The logic correctly handles the error flow.doc/website/src/spec/01-core-architecture/02-vm.rst (3)
8-8: Good typography standardization.Changing from curly quotes to straight quotes improves consistency and readability in technical documentation.
142-142: Corrected cross-reference maintains documentation accuracy.The update from "04-calldata" to "01-calldata" ensures the cross-reference points to the correct documentation section, maintaining navigability.
148-148: Minor wording improvements enhance clarity.The small wording changes improve readability without altering the technical meaning, making the documentation more accessible.
Also applies to: 179-179
executor/testdata/runner/mock_host.py (2)
148-160: Protocol change looks good - aligns with the unified result encoding.The change to return a single bytes buffer with the
ResultCodeas the first byte followed by the encoded payload is consistent with the new protocol specification. The use of exceptions for error cases instead of return codes is a cleaner approach.
162-163: Simplified method signature aligns with protocol changes.The removal of the
type: ResultCodeparameter is consistent with the protocol changes where the result code is now embedded in the data itself.doc/website/src/spec/01-core-architecture/01-components.rst (1)
1-19: Documentation restructuring provides clear separation of concerns.The condensed high-level overview effectively summarizes the GenVM architecture while avoiding duplication with the detailed implementation specification. The focus on the dual-mode execution model and supervisor pattern provides a good conceptual understanding.
doc/website/src/impl-spec/01-core-architecture/01-components.rst (1)
1-142: Comprehensive architectural documentation with clear structure.This new documentation provides an excellent detailed overview of GenVM's architecture. The Mermaid diagram effectively visualizes the component relationships, and the detailed descriptions of each component are well-organized and informative.
doc/internal/node.md (1)
1-187: Valuable technical documentation with detailed protocol specification.This internal documentation provides essential details about the GenVM process operation and communication protocol. The pseudo-code implementation is particularly helpful for understanding the protocol flow.
executor/testdata/runner/base_host.py (5)
77-81: Interface simplification aligns with the new protocol design.The removal of explicit
ResultCodeparameters and the switch to raw byte buffers simplifies the interface while maintaining the same functionality through embedded result codes.
150-153: Clean implementation of the slice reading function.The
read_slicefunction correctly reads length-prefixed data and returns it as a memoryview for efficient buffer handling.
194-195: Correct extraction of ResultCode from the received slice.The implementation properly extracts the result code from the first byte and passes the remaining data to the handler.
205-208: Proper implementation of length-prefixed data sending.The code correctly sends the data as a length-prefixed byte slice after the OK status.
291-291: Method name update for consistency.The rename from
REMAINING_FUELtoREMAINING_FUEL_AS_GENaligns with the naming convention used across the codebase.executor/src/host/mod.rs (4)
11-11: LGTM!The
core::strimport is appropriately added to support UTF-8 conversions in the refactored result decoding logic.
275-275: Good addition of observability!The logging statements provide valuable debugging information and are well-placed at function entry points and after key operations.
Also applies to: 329-329, 335-335, 370-370, 394-394, 414-414, 434-434, 448-448, 472-472, 496-496, 514-514
265-265: Excellent refactoring to use unified helper!The consistent use of
write_sliceacrossstorage_write,post_message, anddeploy_contractreduces code duplication and ensures uniform protocol handling.Also applies to: 403-404, 422-424
356-363: Document or handle UTF-8 assumptions for host error messagesThe host/VM protocol currently uses
str::from_utf8(rest)?when decoding user and VM error payloads:executor/src/host/mod.rs … x if x == ResultCode::UserError as u8 => vm::RunOk::UserError(String::from(str::from_utf8(rest)?)) x if x == ResultCode::VmError as u8 => vm::RunOk::VMError(String::from(str::from_utf8(rest)?), None)We didn’t find any explicit guarantee in the protocol spec or code comments that error messages are always valid UTF-8. As a result, if the host ever sends non-UTF-8 bytes, this will bubble up as a decoding error.
• Either document in your protocol/message format that all error payloads must be UTF-8 encoded
• Or switch to a lossy conversion (e.g.String::from_utf8_lossy(rest).into_owned()) and handle any replacement characters gracefullyPlease review the intended guarantees for host error strings and update the code or documentation accordingly.
| pub fn encode_to(to: &mut Vec<u8>, value: &Value) { | ||
| match value { |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Making encode_to public needs accompanying contract + docs
encode_to was private and is now exposed to external callers, but:
- It mutates
toblindly and relies on callers to respect the already-encoded invariants (ordered map keys, no circular references, etc.). - There is no
#[doc]comment explaining these invariants or the expected pre-/post-conditions. - The function name is exported verbatim; consider
pub(crate)if only GenVM internals need it, otherwise document it thoroughly and add fuzz/round-trip tests to lock the wire format.
-pub fn encode_to(to: &mut Vec<u8>, value: &Value) {
+/// Encodes `value` in Calldata binary format and appends the bytes to `to`.
+///
+/// Caller must ensure map keys are **strictly** ascending and the buffer
+/// does **not** already contain partial Calldata; otherwise the stream will be
+/// un-decodable.
+pub fn encode_to(to: &mut Vec<u8>, value: &Value) {🏁 Script executed:
#!/usr/bin/env bash
set -e
echo "1. Function definition and visibility:"
rg -n -C3 "^pub fn encode_to" executor/common/src/calldata/bin.rs || true
echo
echo "2. Doc comments directly above encode_to:"
rg -B5 "^pub fn encode_to" executor/common/src/calldata/bin.rs | sed -n '1,10p' || true
echo
echo "3. All references to encode_to in the repository:"
rg -n "encode_to" .
echo
echo "4. Tests or examples mentioning encode_to:"
rg -n "encode_to" -g "*test*.rs" || trueLength of output: 1648
Add documentation and tests for encode_to
The newly public encode_to function in executor/common/src/calldata/bin.rs (line 209) has no docs and relies on strict caller invariants. Please:
- Insert a
///doc comment aboveencode_todetailing:- That it appends Calldata in binary format to
to. - Caller guarantees: empty or fully-encoded buffer, strictly ascending map keys, no circular refs.
- That it appends Calldata in binary format to
- Add fuzz or round-trip tests (e.g. in
tests/) to lock down the wire format. - If callers outside the executor internals don’t need this API, consider changing its re-export in
executor/common/src/calldata/mod.rs(around line 7) topub(crate)instead ofpub use.
Suggested doc stub:
executor/common/src/calldata/bin.rs
@@ -206,6 +206,13 @@
}
}
+/// Encode `value` into Calldata’s binary format, appending to `to`.
+///
+/// # Safety & Invariants
+/// - `to` must be empty or already contain a complete Calldata stream.
/// - Map keys in `value` must be strictly ascending.
/// - No circular references allowed (will overflow/loop).
+/// - Violating these will render the stream un-decodable.
pub fn encode_to(to: &mut Vec<u8>, value: &Value) {
match value {
Value::Null => to.push(SPECIAL_NULL),Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In executor/common/src/calldata/bin.rs at lines 209-210, the public function
encode_to lacks documentation and tests. Add a /// doc comment above encode_to
explaining that it appends Calldata in binary format to the provided buffer and
specifying caller guarantees: the buffer must be empty or fully encoded, map
keys strictly ascending, and no circular references. Then, create fuzz or
round-trip tests in the tests/ directory to verify and lock down the wire
format. Finally, if external callers outside executor internals do not require
this function, change its re-export in executor/common/src/calldata/mod.rs near
line 7 from pub use to pub(crate) to restrict visibility.
| Err(e) => { | ||
| let data = calldata::Value::Str(format!("{e:?}")); | ||
| let val = calldata::encode(&data); | ||
| let mut encoded = Vec::from([ResultCode::VmError as u8]); | ||
| encoded.extend_from_slice(format!("{e:?}").as_bytes()); | ||
|
|
||
| (ResultCode::InternalError, val) | ||
| encoded | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Use Display formatting for external error messages.
Debug formatting ({e:?}) on line 315 may expose internal implementation details. Consider using Display formatting for cleaner external error messages.
Err(e) => {
let mut encoded = Vec::from([ResultCode::VmError as u8]);
- encoded.extend_from_slice(format!("{e:?}").as_bytes());
+ encoded.extend_from_slice(format!("{e}").as_bytes());
encoded
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Err(e) => { | |
| let data = calldata::Value::Str(format!("{e:?}")); | |
| let val = calldata::encode(&data); | |
| let mut encoded = Vec::from([ResultCode::VmError as u8]); | |
| encoded.extend_from_slice(format!("{e:?}").as_bytes()); | |
| (ResultCode::InternalError, val) | |
| encoded | |
| } | |
| Err(e) => { | |
| let mut encoded = Vec::from([ResultCode::VmError as u8]); | |
| encoded.extend_from_slice(format!("{e}").as_bytes()); | |
| encoded | |
| } |
🤖 Prompt for AI Agents
In executor/src/host/mod.rs around lines 313 to 318, the error message is
formatted using Debug (`{e:?}`), which may expose internal details. Change the
formatting to use Display (`{e}`) instead by replacing `format!("{e:?}")` with
`format!("{e}")` to provide cleaner and safer external error messages.
| fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> { | ||
| let len = data.len() as u32; | ||
|
|
||
| sock.write_all(&len.to_le_bytes())?; | ||
| sock.write_all(data)?; | ||
|
|
||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Add bounds check to prevent silent truncation.
The cast from usize to u32 on line 71 could silently truncate if the data length exceeds 4GB. While unlikely in practice, it's better to fail explicitly.
fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> {
- let len = data.len() as u32;
+ let len = u32::try_from(data.len())
+ .context("Data length exceeds maximum size (4GB)")?;
sock.write_all(&len.to_le_bytes())?;
sock.write_all(data)?;
Ok(())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> { | |
| let len = data.len() as u32; | |
| sock.write_all(&len.to_le_bytes())?; | |
| sock.write_all(data)?; | |
| Ok(()) | |
| } | |
| fn write_slice(sock: &mut dyn Sock, data: &[u8]) -> Result<()> { | |
| let len = u32::try_from(data.len()) | |
| .context("Data length exceeds maximum size (4GB)")?; | |
| sock.write_all(&len.to_le_bytes())?; | |
| sock.write_all(data)?; | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
In executor/src/host/mod.rs around lines 70 to 77, the cast from usize to u32
for data length can silently truncate if data exceeds 4GB. Add an explicit check
before the cast to verify that data.len() does not exceed u32::MAX, and return
an error if it does, preventing silent truncation.
| sock.write_all(&call_no.to_le_bytes())?; | ||
|
|
||
| write_result(sock, Ok(res))?; | ||
| write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?; |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Consider avoiding unnecessary allocation.
The Vec::from_iter creates an allocation that might be avoidable if the result can provide bytes as a slice.
- write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;
+ // If res has a method that returns &[u8] directly:
+ write_slice(sock, res.as_bytes())?;
+ // Otherwise, consider writing bytes directly without intermediate Vec🏁 Script executed:
#!/bin/bash
# Check if RunOk has an as_bytes() method or similar
ast-grep --pattern 'impl $_ for RunOk {
$$$
fn as_bytes($$$) $$$
$$$
}'Length of output: 79
We still need to confirm what res actually is and how its as_bytes_iter() is implemented. Let’s locate its type and the iterator definition:
🏁 Script executed:
#!/bin/bash
# Find the definition of the result type (e.g., RunOk)
rg -n "struct RunOk" -A 3
rg -n "enum RunOk" -A 3
# Find how as_bytes_iter is implemented
rg -n "as_bytes_iter" -A 5Length of output: 2389
Avoid heap allocation in post_nondet_result
The current code collects the entire byte sequence into a Vec<u8>, causing an unnecessary heap allocation:
write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;Instead, you can write the status byte and payload directly, eliminating the temporary Vec:
- write_slice(sock, &Vec::from_iter(res.as_bytes_iter()))?;
+ use crate::public_abi::ResultCode;
+
+ // Write the status code byte
+ let code = [match res {
+ RunOk::Return(_) => ResultCode::Return as u8,
+ RunOk::UserError(_) => ResultCode::UserError as u8,
+ RunOk::VMError(_,_) => ResultCode::VMError as u8,
+ }];
+ write_slice(sock, &code)?;
+
+ // Write the payload bytes without an intermediate Vec
+ match res {
+ RunOk::Return(buf) => write_slice(sock, buf)?,
+ RunOk::UserError(msg) => write_slice(sock, msg.as_bytes())?,
+ RunOk::VMError(msg, _)=> write_slice(sock, msg.as_bytes())?,
+ }This refactoring removes the extra allocation while preserving the exact same output.
🤖 Prompt for AI Agents
In executor/src/host/mod.rs at line 379, the code creates a temporary Vec from
an iterator causing unnecessary heap allocation. Refactor the code to write the
status byte and payload directly to the socket without collecting into a Vec, by
calling write_slice separately for the status byte and the payload iterator,
thus eliminating the intermediate allocation while preserving output.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores
Documentation