execution-semantics

Here is 1 public repository matching this topic...

kroq86 / honeybadger

A formal VM benchmark and inspectable reasoning runtime for testing whether language models can follow machine-like execution semantics on synthetic tasks.

benchmark vm semantics mcp language-models copilot vm-benchmark mcp-server copilot-coding-agent dataset-factory eval-toolkit reasoning-runtime execution-semantics

Updated Mar 20, 2026
Python
