GRETA (Grammar REpair via Tree Automata) demonstrates how ambiguous context-free grammars (CFGs) can be disambiguated using tree automata synthesized from user-provided examples.
GRETA introduces a tree automata learning algorithm and a programming-by-example synthesis framework to formally and automatically repair CFG ambiguities based on tree examples selected by the user. The project was originally motivated by the paper Restricting Grammars with Tree Automata by Michael D. Adams and Matthew Might [1].
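To make the notion of an ambiguous CFG concrete, the following minimal sketch enumerates every parse tree of one token sequence under a toy grammar. The grammar `E -> E "+" E | E "*" E | NUM`, the token sequence, and the helper names `parse_trees` and `show` are assumptions chosen for illustration; they are not taken from GRETA's code or test grammars.

```python
from functools import lru_cache

# Toy ambiguous grammar (an assumption for illustration, not a GRETA grammar):
#   E -> E "+" E | E "*" E | NUM
TOKENS = ("1", "+", "2", "*", "3")
OPERATORS = ("+", "*")


@lru_cache(maxsize=None)
def parse_trees(lo: int, hi: int) -> tuple:
    """Return every parse tree deriving TOKENS[lo:hi] from E."""
    if hi - lo == 1:
        tok = TOKENS[lo]
        # A single token is a valid E only if it is a number.
        return (tok,) if tok not in OPERATORS else ()
    trees = []
    # Try each operator position as the root of the expression.
    for mid in range(lo + 1, hi - 1):
        if TOKENS[mid] in OPERATORS:
            for left in parse_trees(lo, mid):
                for right in parse_trees(mid + 1, hi):
                    trees.append((TOKENS[mid], left, right))
    return tuple(trees)


def show(tree) -> str:
    """Render a tree as a fully parenthesized expression."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({show(left)} {op} {show(right)})"


ALL_TREES = parse_trees(0, len(TOKENS))
for t in ALL_TREES:
    print(show(t))
```

The same tokens yield two distinct trees, which is what makes the grammar ambiguous. Choosing between such trees is the kind of decision a user-selected tree example expresses.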
.
├── dune
├── dune-project
├── Makefile
├── bin
│ ├── dune
│ ├── opt_type.ml // types for optimization ablations
│ └── main.ml // main repl
├── lib
│ ├── dune
│ ├── converter.ml // conversion between mly and CFG and between CFG and TA
│ ├── operation.ml // TA intersection operation
│ ├── operation_alt.ml // TA intersection operation w/o optimizations
│ ├── examples.ml // tree examples generator
│ ├── learner.ml // TA-learner
│ ├── utils.ml // some glue code to hook things together
│ ├── cfg.ml // definition of CFG
│ ├── ta.ml // definition of TA and tree
│ ├── treeutils.ml // utilities for TA
│ ├── pp.ml // pretty printers
│ ├── parser.mly // definition of the grammar in menhir
│ ├── lexer.ml // definition of a lexer
│ └── ast.ml // definition of an ast
└── test
  ├── grammars-revamp/ // grammars tested for OOPSLA 2026
  ├── :
  └── dune
GRETA requires:
- OCaml (via opam)
- dune
- Menhir (custom fork with CFG dumping support)
- Some Python tools (for testing)
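Once the installation steps are done, a quick check that the required executables are reachable can save a failed build. The sketch below only looks names up on `PATH`; the executable names are assumptions derived from the requirement list, and `find_tools` is a hypothetical helper, not part of GRETA. It cannot tell whether the `menhir` it finds is the custom fork.

```python
import shutil

# Executable names are assumptions based on the requirement list;
# adjust them if your installation uses different names.
REQUIRED_TOOLS = ("opam", "dune", "menhir", "python3")


def find_tools(names=REQUIRED_TOOLS) -> dict:
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:8} {path or 'NOT FOUND'}")
```

Run it after `eval $(opam env)` so that tools installed into the active opam switch are visible on `PATH`.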
The instructions below are OS-specific.
On Debian/Ubuntu (using apt):
sudo apt update
sudo apt install -y \
opam \
expect \
python3-pandas \
python3-matplotlib
opam init -y
opam switch create 5.1.1
eval $(opam env --switch=5.1.1)
opam install -y \
dune \
sedlex \
ppx_deriving \
ppx_deriving_yojson \
num \
core \
core_unix \
qcheck \
fileutils \
stdint \
graphics
With Homebrew (e.g. on macOS):
brew install opam
opam init
eval "$(opam env)"
opam switch create 5.1.1
eval "$(opam env)"
opam install -y \
dune \
sedlex \
ppx_deriving \
ppx_deriving_yojson \
num \
core \
core_unix \
qcheck \
fileutils \
stdint \
graphics
GRETA depends on a custom Menhir fork that supports CFG dumping.
- Clone the Menhir repository.
- Build and install it:
  make install
Note: this Menhir version must be installed before building GRETA.
- GRETA is configured to run with:
  (lang dune 2.1)
  (using menhir 2.0)
- If you encounter Merlin errors such as:
  ... seems to be compiled with a version of OCaml that is not supported by Merlin
  switch to OCaml 4.14.0:
  opam switch create 4.14.0
  eval $(opam env)
  You can also do this by running opam switch list and selecting the ocaml.4.14.0 compiler for this project. Make sure to run eval $(opam env) after switching to version 4.14.0.
From the project root directory:
make
This command builds the project and launches the GRETA interactive REPL. Follow the on-screen prompts to select the tree example(s) representing
You can choose optimization/ablation settings. By default, GRETA runs with all optimizations enabled.
- Run without reachability-based optimization:
  make run-wo-opt1
- Run without duplicate removal optimization:
  make run-wo-opt2
- Run without epsilon introduction optimization:
  make run-wo-opt3
- Run without any of the 3 optimizations:
  make run-wo-opt123
The test suite reproduces the experimental evaluation used in the OOPSLA 2026 paper.
cd menhir # path to the custom Menhir fork (already in dump-cfg)
make install
cd ../greta/test
./harness.py
Python Environment Notes
When running the test harness, you may encounter errors about uninstalled packages.
If you try installing these packages with:
python3 -m pip install pandas matplotlib
you may instead see error: externally-managed-environment.
This happens because Python installed via the system package manager prevents pip from modifying the system-managed environment.
Recommended solution: use a virtual environment
From the project root directory:
cd /path/to/greta
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install pandas matplotlib
Then run the harness while the virtual environment is active:
cd test
./harness.py
Note that the virtual environment must be activated (source .venv/bin/activate) each time you run the test harness in a new terminal session.
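Before launching the harness, it can help to confirm that the interpreter is running inside the virtual environment and that the two packages are importable. The sketch below uses only the standard library; `in_virtualenv` and `missing_packages` are hypothetical helper names and are not part of the GRETA harness.

```python
import importlib.util
import sys

# Packages named in the pip install command.
HARNESS_PACKAGES = ("pandas", "matplotlib")


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(names=HARNESS_PACKAGES) -> list:
    """Return the top-level package names this interpreter cannot import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    missing = missing_packages()
    print("missing packages:", ", ".join(missing) if missing else "none")
```

If the first line reports `False`, the environment was probably not activated in the current terminal session.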
The harness produces the following outputs:
- Per-grammar results: for each grammar variant and mode, a results folder is created at grammars-revamp/<group>/<variant>_<mode>_results_artifact/ containing individual CSV files, .cfg, .conflicts, .trees, and any intermediate .mly files.
- Aggregated results: results.csv inside each results folder, produced by the aggregator.
- LaTeX table: artifact_table.tex in the test/ directory, containing the full evaluation table ready for inclusion in a paper.
- Scatter plots: convert_time_vs_ambiguities.pdf, learn_time_vs_ambiguities.pdf, and intersect_time_vs_ambiguities.pdf in the test/ directory.
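To inspect the aggregated results across all grammars at once, something like the following sketch could be used. It relies only on the folder layout described above and assumes each results.csv begins with a header row; the column names are not documented here, so the code makes no assumption about them. `collect_results` is a hypothetical helper, not part of the harness.

```python
import csv
from pathlib import Path


def collect_results(test_dir: str) -> dict:
    """Map each results folder (relative to test_dir) to its results.csv rows."""
    root = Path(test_dir)
    pattern = "grammars-revamp/*/*_results_artifact/results.csv"
    collected = {}
    for csv_path in sorted(root.glob(pattern)):
        with csv_path.open(newline="") as handle:
            # Assumption: the first row of results.csv holds the column names.
            rows = list(csv.DictReader(handle))
        collected[csv_path.parent.relative_to(root).as_posix()] = rows
    return collected


if __name__ == "__main__":
    # Run from the test/ directory after the harness has finished.
    for folder, rows in collect_results(".").items():
        print(f"{folder}: {len(rows)} row(s)")
```

Keying by the relative folder path rather than the folder name avoids collisions if two groups contain a variant with the same name.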
[1] Michael D. Adams and Matthew Might. Restricting Grammars with Tree Automata.