This repository is a specialized version of the HAMLET framework (Hierarchical Agents for Multi-level Learning, Execution & Tasking), enhanced with capabilities for generating and executing Discrete Event System Specification (DEVS) simulations.
git clone <repository-url>
cd HAMLET_publish

We recommend using Conda:
conda create -n hamlet_env python=3.10 -y
conda activate hamlet_env
conda install -c pytorch faiss-cpu -y
conda install pandas -y
conda install pytorch -y
pip install -r requirements.txt

Copy the example file and fill in your keys:

cp .env.example .env

Then edit `.env` with your API credentials:
| Variable | Required | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | Yes (recommended) | API key from OpenRouter |
| `OPENROUTER_API_BASE` | No | OpenRouter base URL, default: https://openrouter.ai/api/v1 |
| `OPENAI_API_KEY` | Yes (alternative) | OpenAI-compatible API key (also works with Aliyun, etc.) |
| `OPENAI_BASE_URL` | No | OpenAI-compatible base URL, default: https://api.openai.com/v1 |
| `SERPER_API_KEY` | Yes | API key from Serper for web search |
| `JINA_API_KEY` | Yes | API key from Jina for web scraping |
| `HF_TOKEN` | Optional | Hugging Face token for downloading models |

Note: You need at least one of `OPENROUTER_API_KEY` or `OPENAI_API_KEY`. `SERPER_API_KEY` and `JINA_API_KEY` are required for web search capabilities.

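
The key requirements in the note above can be checked before launching the application. The following is a minimal sketch, not part of the repository: the helper name `check_env` is hypothetical, and it assumes the values from `.env` have already been loaded into the process environment (for example by exporting them in the shell).

```python
import os

# Key names come from the variable table; the grouping follows the note.
LLM_KEYS = ("OPENROUTER_API_KEY", "OPENAI_API_KEY")  # at least one is needed
SEARCH_KEYS = ("SERPER_API_KEY", "JINA_API_KEY")     # both needed for web search


def check_env(env=None):
    """Return a list of problems; an empty list means the keys look complete.

    Hypothetical helper: it only checks that values are present and non-empty,
    not that the keys are valid.
    """
    env = os.environ if env is None else env
    problems = []
    if not any(env.get(key) for key in LLM_KEYS):
        problems.append("set at least one of " + " or ".join(LLM_KEYS))
    for key in SEARCH_KEYS:
        if not env.get(key):
            problems.append(f"{key} is missing (required for web search)")
    return problems


if __name__ == "__main__":
    for problem in check_env():
        print("WARNING:", problem)
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to try out without touching real credentials.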
python -m devs_app.run

python -m devs_app.run --mode cli

Generate a DEVS model from a benchmark specification:

python -m devs_app.run \
--mode generate \
--debug_args_file benchmark/ABP/ABP_D1.yaml \
--concur_num 4

| Argument | Default | Description |
|---|---|---|
| `--model_id` | `gpt-4.1` | Model for the agent (weak model) |
| `--model_id_strong` | `gpt-5.2` | Model for check/repair steps (strong model) |
| `--mode` | `gradio` | One of `gradio`, `cli`, `server`, `generate`, `generate_and_test` |
| `--debug_args_file` | `devs_app/devs_model_inputs/example1.json` | Path to JSON/YAML file with tool parameters |
| `--target_tool` | `devs_construct_tree` | Tool name to invoke in generate modes |
| `--concur_num` | `4` | Number of concurrent generation workers |
| `--agent_planning_interval` | `4` | Planning interval for manager agent |
| `--agent_max_steps` | `80` | Max reasoning steps for manager agent |
| `--agent_log_level` | `DEBUG` | Log verbosity: `DEBUG`, `INFO`, `WARNING`, `ERROR` |

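
To see how the defaults combine with the flags passed on the command line, the argument table can be mirrored in a small `argparse` parser. This is an illustrative sketch only, not the parser in `devs_app/run.py`: the flag names and defaults are copied from the table, while the value types, `choices` restrictions, and the `build_parser` helper are assumptions.

```python
import argparse

# Names and defaults mirror the argument table; types and choices are assumptions.
MODES = ("gradio", "cli", "server", "generate", "generate_and_test")
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


def build_parser():
    """Build a stand-in parser with the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="devs_app.run")
    parser.add_argument("--model_id", default="gpt-4.1")
    parser.add_argument("--model_id_strong", default="gpt-5.2")
    parser.add_argument("--mode", default="gradio", choices=MODES)
    parser.add_argument("--debug_args_file",
                        default="devs_app/devs_model_inputs/example1.json")
    parser.add_argument("--target_tool", default="devs_construct_tree")
    parser.add_argument("--concur_num", type=int, default=4)
    parser.add_argument("--agent_planning_interval", type=int, default=4)
    parser.add_argument("--agent_max_steps", type=int, default=80)
    parser.add_argument("--agent_log_level", default="DEBUG", choices=LOG_LEVELS)
    return parser


if __name__ == "__main__":
    # Same flags as the generate example; unspecified options keep their defaults.
    args = build_parser().parse_args(
        ["--mode", "generate", "--debug_args_file", "benchmark/ABP/ABP_D1.yaml"]
    )
    print(args.mode, args.debug_args_file, args.concur_num, args.model_id_strong)
```

In this sketch, omitting `--mode` falls back to `gradio`, which matches the default listed in the table.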
We compare against OpenHands, MetaGPT, and SWE-agent. Each baseline requires a separate Conda environment and has its own .env and requirements.txt in its folder. Fill in the API keys in each baseline's .env file.
conda create -n openhands python=3.12
conda activate openhands
pip install -r devs_baseline/openhands_run/requirements.txt

Then edit devs_baseline/openhands_run/.env with your API keys. See devs_baseline/openhands_run/README.md for details.

conda create -n metagpt python=3.10
conda activate metagpt
pip install -r devs_baseline/meta_gpt_run/requirements.txt

Then edit devs_baseline/meta_gpt_run/.env with your API keys. See devs_baseline/meta_gpt_run/README.md for details.

conda create -n sweagent python=3.12
conda activate sweagent
pip install -r devs_baseline/swe_agent_run/requirements.txt

Build the Docker image:

cd devs_baseline/swe_agent_run/docker_construct
docker build -t python-xdevs-simpy .

The Docker image installs xdevs==3.0.0 from PyPI during build (network access required).
Then edit devs_baseline/swe_agent_run/.env with your API keys. See devs_baseline/swe_agent_run/README.md for details.
Navigate to the tester directory:
cd devs_tester

Run a single generation task:

python gen_runner.py --framework devs_fast_plan --model openai/qwen3.6-plus --benchmark ABP --workspace /tmp/ws

List available frameworks and benchmarks:

python gen_runner.py --list-frameworks
python gen_runner.py --list-benchmarks

Evaluate a generated project:

python eval_runner.py --benchmark ABP --sim_cwd /tmp/ws/abp_model --sim_script run.py --workspace /tmp/results

Edit devs_tester/experiment_config.py to configure:

- `BENCHMARKS`: benchmark catalog with paths to gen config, test config, and checker
- `TARGET_BENCHMARKS`: which benchmarks to run
- `EXPERIMENT_LLMS`: model short_name to OpenRouter model_id mapping
- `EXPERIMENT_FRAMEWORKS`: which frameworks to test
- `GENERATION_TIMEOUTS`: per-framework generation timeouts

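
For a quick sweep that does not go through `experiment_config.py`, the single-task `gen_runner.py` command can be repeated over several frameworks and benchmarks. The sketch below only builds and prints the command lines as a dry run. It uses the flags documented above (`--framework`, `--model`, `--benchmark`, `--workspace`); the helper name `build_gen_commands` and the one-workspace-per-run directory layout are assumptions, not project conventions.

```python
import itertools
import shlex


def build_gen_commands(frameworks, benchmarks, model, workspace_root):
    """Return one gen_runner.py command (as an argument list) per combination.

    Hypothetical helper: the per-run workspace naming is an assumption.
    """
    commands = []
    for framework, benchmark in itertools.product(frameworks, benchmarks):
        workspace = f"{workspace_root}/{framework}_{benchmark}"
        commands.append([
            "python", "gen_runner.py",
            "--framework", framework,
            "--model", model,
            "--benchmark", benchmark,
            "--workspace", workspace,
        ])
    return commands


if __name__ == "__main__":
    sweep = build_gen_commands(
        frameworks=["devs_tool", "devs_fast_plan"],
        benchmarks=["ABP", "SEIRD"],
        model="openai/qwen3.6-plus",
        workspace_root="/tmp/ws",
    )
    for command in sweep:
        # Dry run: print each command; pass the list to subprocess.run to execute.
        print(shlex.join(command))
```

Run from inside `devs_tester/`, each printed line can be pasted into the shell or executed with `subprocess.run(command, check=True)`.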
HAMLET_publish/
├── benchmark/ # Benchmark specifications (ABP, SEIRD, SA, etc.)
├── default_tools/ # Default HAMLET tools (file editing, web search, KB, etc.)
├── devs_app/ # Main DEVS agent application
│ └── run.py # Entry point for all modes
├── devs_baseline/ # Baseline implementations
│ ├── devs_skill/ # Skill-guided baseline scripts
│ ├── meta_gpt_run/ # MetaGPT runner
│ ├── openhands_run/ # OpenHands runner
│ └── swe_agent_run/ # SWE-agent runner
├── devs_tester/ # Experiment orchestration suite
│ ├── experiment_config.py # All experiment parameters
│ ├── gen_runner.py # Code generation engine
│ └── eval_runner.py # Evaluation pipeline (simulation + checker)
├── devs_tools/ # DEVS-specific tools
│ └── devs_construct_pure_fast_plan/ # Plan-then-construct DEVS tool
├── example_app/ # Standard HAMLET example application
├── src/ # Core HAMLET framework code
│ ├── models.py # Model definitions
│ ├── local_python_executor.py
│ ├── remote_executors.py
│ └── ...
└── requirements.txt
| Name | Description |
|---|---|
| ABP | Alternating Bit Protocol |
| SEIRD | Epidemiological model |
| SA | Simulated Annealing |
| OTrain | Airport Operations Train |
| IOBS | Island Observing Station |
| barbershop | Barber Shop simulation |
| oft | Ocean Freight Terminal |
| ComplexSup1 | Complex Supply Network 1 |
| ComplexSup2 | Complex Supply Network 2 |

| Framework | Description |
|---|---|
| `devs_tool` | DEVS construct with check loop |
| `devs_fast_plan` | DEVS fast plan-then-construct (no check, concurrent) |
| `meta_gpt` | MetaGPT multi-agent |
| `swe_agent` | SWE-agent standard |
| `openhands` | OpenHands standard |

Since this project is built on HAMLET, it supports all standard HAMLET features for general-purpose agent tasks.
HAMLET includes a suite of default tools located in default_tools/:
- File Management: `list_dir`, `see_text_file`, `modify_file`, etc.
- Knowledge Base (KB): Semantic search, file addition, and management.
- Web Capabilities: `web_search` (Deep Search), `text_web_browser`.
- Visual Capabilities: `visualizer` (Visual QA).

A pre-configured example application is included in example_app/:
python -m example_app.run

- Main project license: see `LICENSE` (Apache-2.0).
- Project acknowledgements: see `ACKNOWLEDGEMENTS.md`.
- Third-party license notices: see `THIRD_PARTY_NOTICES.md`.