spec2code generates C implementations from structured requirements and evaluates them using compile checks, formal verification, and non-functional critics.
This repository is based on the methodology from:
- Generating Safety-Critical Automotive C-programs using LLMs with Formal Verification (NeSy 2025)
- OpenReview: https://openreview.net/forum?id=xcLrUIeGo3
This codebase includes a shutdown-algorithm use case based on An Empirical Study of the Code Generation of Safety-Critical Software Using LLMs (Applied Sciences 2024).
Suggested citation (paper):

```bibtex
@inproceedings{sevenhuijsen2025safetycritical,
  title={Generating Safety-Critical Automotive C-programs using LLMs with Formal Verification},
  author={Sevenhuijsen, Merlijn and Patil, Minal Suresh and Nyberg, Mattias and Ung, Gustav},
  year={2025},
  url={https://openreview.net/forum?id=xcLrUIeGo3}
}
```

Reference paper for the shutdown-algorithm use case:
```bibtex
@article{liu2024empirical,
  title={An Empirical Study of the Code Generation of Safety-Critical Software Using LLMs},
  author={Liu, Mingxing and Wang, Junfeng and Lin, Tao and Ma, Quan and Fang, Zhiyang and Wu, Yanqun},
  journal={Applied Sciences},
  volume={14},
  number={3},
  pages={1046},
  year={2024},
  doi={10.3390/app14031046},
  url={https://www.mdpi.com/2076-3417/14/3/1046}
}
```

- This project is released under a non-commercial license.
- Commercial use is not allowed without written permission.
- See `LICENSE` for full terms.
- Install and run: docs/INSTALLING_AND_RUNNING.md
- Add a critic: docs/ADDING_CRITICS.md
- Docs index: docs/README.md
High-level pipeline flow:
- CLI entrypoint parses args: `src/spec2code/cli/run_pipeline.py`
- Orchestrator builds the runtime and executes configs: `src/spec2code/core/runner.py`
- Config parser validates JSON experiment files: `src/spec2code/pipeline_modules/config_loader.py`
- Prompt inputs are assembled from specs/interfaces/headers.
- LLMs are called through the provider abstraction: `src/spec2code/pipeline_modules/llms.py`
- Artifacts are parsed/written and critics are run: `src/spec2code/core/artifacts.py`
- Per-sample and run-level JSON outputs are stored in the configured output folder.
- The HTML report is generated at `<SPEC2CODE_OUTPUT_ROOT>/reports/last-run.html`.
```mermaid
flowchart TD
    A[CLI: run_pipeline.py] --> B[Runner: core/runner.py]
    B --> C[Solver init + config loading]
    C --> D[Runtime build: models + critics]
    D --> E[Pipeline executor]
    E --> F[LLM call via provider layer]
    F --> G[Artifact parser + writer]
    G --> H[Critics execution]
    H --> I[output.json and output_pipeline.json]
    I --> J[GUI report: <SPEC2CODE_OUTPUT_ROOT>/reports/last-run.html]
```
For a deeper component walkthrough, see docs/ARCHITECTURE.md.
Full setup/run instructions are maintained in docs/INSTALLING_AND_RUNNING.md.
This project is designed to run cleanly inside Docker.
- macOS / Windows: Docker Desktop
- Linux: Docker Engine + Docker CLI
Official install docs: https://docs.docker.com/get-docker/
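To verify the installation before building (hello-world is Docker's standard smoke-test image):

```bash
# Confirm the CLI is on PATH and the daemon can run containers.
docker --version
docker run --rm hello-world
```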
From repository root:

```bash
docker build -f dockerfile -t spec2code:local .
```

Vernfr (`tools/nfrcheck`) is optional and is not built by default.

Apple Silicon (M1/M2/M3) notes:
- Native build (recommended):

```bash
docker build --platform linux/arm64 -f dockerfile -t spec2code:local .
```

- If you explicitly need x86_64 behavior/tools:

```bash
docker build --platform linux/amd64 -f dockerfile -t spec2code:local .
```

Optional (only if you need Bedrock CLI commands in-container):

```bash
docker build -f dockerfile --build-arg INSTALL_AWSCLI=1 -t spec2code:local .
```

Optional (enable the Vernfr build in the Docker image):

```bash
docker build -f dockerfile --build-arg BUILD_NFRCHECK=1 -t spec2code:local .
```

Start a container with the repository mounted:

```bash
docker run --rm -it -v "$(pwd)":/workspace spec2code:local bash
```

If you built linux/amd64 on Apple Silicon, run with the matching platform:

```bash
docker run --rm -it --platform linux/amd64 -v "$(pwd)":/workspace spec2code:local bash
```

Inside the container:

```bash
cd /workspace
PYTHONPATH=src python3 -m spec2code.cli.run_pipeline --config config/gui_templates/shutdown-algorithm-template.json
```

The default template config (`config/gui_templates/shutdown-algorithm-template.json`) uses `test-llm-shutdown` (a mock model), so it runs without cloud credentials or local Ollama.
If a container is already running:

```bash
docker exec -it <container_name_or_id> bash
cd /workspace
PYTHONPATH=src python3 -m spec2code.cli.run_pipeline --config config/gui_templates/shutdown-algorithm-template.json
```

You can launch a local web UI to run pipeline templates without editing JSON manually:
```bash
PYTHONPATH=src python -m spec2code.gui.run_server --host 127.0.0.1 --port 8080
```

If running from inside Docker, publish the port and bind to all interfaces:
```bash
docker run --rm -it -p 8080:8080 -v "$(pwd)":/workspace spec2code:local bash
cd /workspace
PYTHONPATH=src python -m spec2code.gui.run_server --host 0.0.0.0 --port 8080
```

Then open:
- http://127.0.0.1:8080/runner
- http://127.0.0.1:8080/results
The UI provides:
- `/runner` page: template mode and custom JSON mode
- `/results` page: embedded latest report and quick analytics (bar/pie charts)
- theme options (Indigo/Monokai/Sunrise/Slate)
- run logs and warning hints (e.g., missing Why3)
By default, GUI/runtime outputs are written outside the repository to:
`../spec2code_output` (resolved from repo root)

By default, case study assets are also loaded from outside the repository:
`../spec2code_case_studies` (resolved from repo root)

Additional user-managed input assets can be stored in:
`../spec2code_input` (resolved from repo root; shown in the GUI file picker)

Override these locations with:

```bash
export SPEC2CODE_OUTPUT_ROOT=/absolute/path/to/spec2code_output
export SPEC2CODE_CASE_STUDIES_ROOT=/absolute/path/to/spec2code_case_studies
export SPEC2CODE_INPUT_ROOT=/absolute/path/to/spec2code_input
```

GUI template files are read from:
`config/gui_templates/`

Temporary JSON files created by GUI runs are stored under:
`<SPEC2CODE_OUTPUT_ROOT>/gui_tmp/`
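As a usage sketch, the overrides can also be supplied inline for a single run; the `/tmp` path below is purely illustrative:

```bash
# Redirect outputs for this invocation only (path is an example).
SPEC2CODE_OUTPUT_ROOT=/tmp/spec2code_output \
PYTHONPATH=src python -m spec2code.cli.run_pipeline --config config/gui_templates/shutdown-algorithm-template.json
```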
Provider-based design (not tied to AWS):
- `ollama` for local models
- `openai-compatible` for vLLM and compatible APIs
- optional `bedrock`

Config file:
`config/llm_providers.yaml`
Override the config location:

```bash
export SPEC2CODE_LLM_CONFIG=/path/to/llm_providers.yaml
```

To add a model:
- Add (or reuse) a provider in `config/llm_providers.yaml`.
- Add a model under `models:`.
- Use that model key in `llms_used` in your run config JSON.
Example:

```yaml
providers:
  my_vllm:
    type: openai-compatible
    base_url: http://localhost:8000/v1
    api_key_env: VLLM_API_KEY

models:
  my-vllm-model:
    provider: my_vllm
    model: Qwen/Qwen2.5-Coder-7B-Instruct
    max_tokens: 2048
```

Then in the pipeline config:

```json
"llms_used": ["my-vllm-model"]
```

Pipeline configs are JSON arrays (multiple runs per file).
Key fields (combined in the example below):
- `name`, `selected_prompt_template`
- `llms_used`, `n_programs_generated`, `temperature`
- `natural_spec_path`, `interface_path`, `headers_dir`, `headers_manifest`
- `include_dirs`, `verification_header_path`
- `critics`, `timeout_s`, `framac_wp_timeout_s`, `framac_wp_no_let`

Notes:
- `case_study` is still accepted for backward compatibility, but current templates use explicit file paths.
- Relative `case_studies/...` paths resolve against `<SPEC2CODE_CASE_STUDIES_ROOT>` when assets are externalized.

Reference:
`config/gui_templates/shutdown-algorithm-template.json`
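For orientation, a minimal run-config sketch combining these fields. The values, prompt-template name, and relative paths are illustrative assumptions, not the shipped template; see the reference template above for the authoritative version:

```json
[
  {
    "name": "shutdown-demo",
    "selected_prompt_template": "default",
    "llms_used": ["test-llm-shutdown"],
    "n_programs_generated": 2,
    "temperature": 0.2,
    "natural_spec_path": "case_studies/shutdown_algorithm/spec.md",
    "interface_path": "case_studies/shutdown_algorithm/interface.h",
    "critics": ["compile", "framac-wp"],
    "timeout_s": 600
  }
]
```

Because the top level is a JSON array, several run entries can share one file.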
- Supported out of the box: `shutdown_algorithm` in `<SPEC2CODE_CASE_STUDIES_ROOT>/shutdown_algorithm`
- Legacy/experimental identifiers like `sgmm*` and `sfld*` may be accepted by parser logic, but they are not shipped as ready-to-run templates/assets in this OSS package.
- See docs/CASE_STUDIES.md for the expected folder layout and status by case study; a hypothetical layout sketch follows below.
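As a hypothetical sketch only (file and folder names below are assumptions based on the path fields above; docs/CASE_STUDIES.md is authoritative):

```
<SPEC2CODE_CASE_STUDIES_ROOT>/
  shutdown_algorithm/
    spec.md        # hypothetical natural_spec_path target
    interface.h    # hypothetical interface_path target
    headers/       # hypothetical headers_dir, listed via headers_manifest
```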
Run the shutdown algorithm use case:

```bash
PYTHONPATH=src python -m spec2code.cli.run_pipeline --config config/gui_templates/shutdown-algorithm-template.json
```

Configurable critics:
- `compile`
- `framac-wp`
- `cppcheck-misra`
- `vernfr-control-flow`
- `vernfr-data-flow`
Vernfr support is optional. Build/install it only when you want Vernfr critics. If Vernfr is not installed, other critics still work normally.
Implementation entrypoint:
`src/spec2code/pipeline_modules/critics/critics_runner.py`
- Per sample: `<output_folder>/<llm>/sample_000/output.json`
- Run summary: `<output_folder>/output_pipeline.json`
- Report page: `<SPEC2CODE_OUTPUT_ROOT>/reports/last-run.html`
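To spot-check a finished run, the summary can be pretty-printed with Python's built-in JSON tool (substitute your configured output folder; the summary's schema is defined by the pipeline and not documented here):

```bash
# Pretty-print the run-level summary JSON.
python3 -m json.tool <output_folder>/output_pipeline.json
```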
For adding providers, critics, prompt templates, parser formats, and pipeline features:
- docs/README.md
- docs/EXTENDING.md
- docs/ADDING_LLMS.md
- docs/ADDING_CRITICS.md
- docs/CRITIC_EXTENSIBILITY_PLAN.md
- docs/ADDING_PROMPTS.md
- docs/ADDING_PIPELINE_FEATURES.md
- docs/CASE_STUDIES.md
- docs/ARCHITECTURE.md
If needed, you can still run locally:
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
PYTHONPATH=src python -m spec2code.cli.run_pipeline --config config/gui_templates/shutdown-algorithm-template.json
```

For Vernfr critics in local (non-Docker) runs, install the optional tooling first:

```bash
bash tools/install_optional_vernfr.sh
```

Manual equivalent:

```bash
eval "$(opam env --switch=ocaml5)"
cd tools/nfrcheck
dune build @install
dune install
```