Code and artifacts for finetuning LLMs to generate MiniZinc constraint programming models from natural language problem descriptions. We train on problems from the text2zinc dataset and evaluate on a held-out test set of 100 problems originally sourced from CardinalOperations/IndustryOR. Additional training problems are drawn from the OR-Instruct-3K dataset.
```
learn2zinc/
├── finetuning.ipynb                # Unified finetuning notebook (all models × all strategies)
├── verify_results.py               # Standalone result verification script
├── data/                           # Local data files
├── evaluations/
│   └── generated_code/             # Pre-generated MiniZinc files for every model/strategy
│       ├── gemma-2-9b_original/                          # Base model (no finetuning)
│       ├── gemma-2-9b_learn2zinc_base_finetuned/
│       ├── gemma-2-9b_learn2zinc_cot_finetuned/
│       ├── gemma-2-9b_learn2zinc_augmented_finetuned/
│       ├── gemma-2-9b_combined_finetuned/
│       ├── retry_gemma-2-9b_learn2zinc_augmented_finetuned/
│       ├── ...                                           # (same structure for all 5 base models)
│       └── ensemble_learn2zinc_augmented/                # Ensemble cascade across models
└── LICENSE
```
Each strategy folder contains up to 100 `.mzn` files, one per problem in the IndustryOR test set.
| Model | Size | HuggingFace (finetuned) |
|---|---|---|
| Qwen3 | 0.6B | skadio/learn2zinc-Qwen3-0.6B |
| Llama 3.2 | 1B | skadio/learn2zinc-Llama-3.2-1B |
| Llama 3.2 | 3B | skadio/learn2zinc-Llama-3.2-3B |
| Gemma 2 | 9B | skadio/learn2zinc-Gemma-2-9B |
| GPT-oss | 20B | skadio/learn2zinc-GPT-oss-20B |
The published checkpoints above are the best-performing variant for each base model, all finetuned on the Learn2Zinc-Augmented dataset.
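The checkpoints can be pulled straight from the Hub; below is a minimal sketch, assuming they are published as merged causal LMs (if they are LoRA adapters instead, load them with `peft` on top of the base model) and using an illustrative prompt format:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "skadio/learn2zinc-Qwen3-0.6B"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# Illustrative prompt; see the dataset cards for the exact training format.
prompt = "Write a MiniZinc model for the following problem:\n..."
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(out[0], skip_special_tokens=True))
```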
Each dataset provides (natural language description → MiniZinc code) training pairs; the three datasets differ in their prompting strategy:
| Strategy | Dataset | Description |
|---|---|---|
| Learn2Zinc-Base (Direct) | skadio/learn2zinc-base | Direct problem-to-code pairs |
| Learn2Zinc-CoT (Reasoning) | skadio/learn2zinc-cot | Chain-of-thought reasoning before code |
| Learn2Zinc-Augmented | skadio/learn2zinc-augmented | Augmented training data with syntax error correction examples |
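All three datasets live on the HuggingFace Hub and load with the `datasets` library; a minimal sketch (the split and column names are assumptions, check the dataset cards):

```python
from datasets import load_dataset

# Direct problem→code pairs; the split and column names here are assumptions.
ds = load_dataset("skadio/learn2zinc-base", split="train")
print(ds[0])  # inspect one (natural language description, MiniZinc code) pair
```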
The `evaluations/generated_code/` directory contains outputs from the following configurations:

| Configuration | Description |
|---|---|
| `{model}_original` | Base model without any finetuning |
| `{model}_learn2zinc_base_finetuned` | Finetuned on the Learn2Zinc-Base (direct) dataset |
| `{model}_learn2zinc_cot_finetuned` | Finetuned on the Learn2Zinc-CoT dataset |
| `{model}_learn2zinc_augmented_finetuned` | Finetuned on the Learn2Zinc-Augmented dataset |
| `{model}_combined_finetuned` | Finetuned on a mix of the Learn2Zinc-Base and Learn2Zinc-CoT strategies |
| `retry_{model}_learn2zinc_augmented_finetuned` | Multi-attempt retry (up to 5 attempts) with error feedback; see the sketch below |
| `ensemble_learn2zinc_augmented` | Cascade ensemble across all models, from smallest to largest |
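The retry strategy is a generate → solve → repair loop; here is a minimal sketch, assuming a hypothetical `generate(prompt)` wrapper around the finetuned model (the exact prompt template and error handling in the notebook may differ):

```python
import subprocess

MAX_ATTEMPTS = 5

def retry_generate(problem: str, generate) -> str | None:
    """Generate MiniZinc code, feeding solver errors back into the prompt."""
    prompt = problem
    for _ in range(MAX_ATTEMPTS):
        code = generate(prompt)  # hypothetical wrapper around the finetuned model
        with open("candidate.mzn", "w") as f:
            f.write(code)
        run = subprocess.run(
            ["minizinc", "--solver", "highs", "--time-limit", "120000", "candidate.mzn"],
            capture_output=True, text=True,
        )
        if run.returncode == 0:
            return code  # the solver accepted and solved the model
        # Feed the solver's error message back and ask for a corrected model.
        prompt = f"{problem}\n\nYour previous model failed with:\n{run.stderr}\nFix it."
    return None
```

The cascade ensemble applies the same fallback idea across models rather than attempts: each problem starts with the smallest model and escalates to the next larger one when the current output fails.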
You can verify all results from the paper tables without running any model inference: the pre-generated `.mzn` files are evaluated directly against the test set with the MiniZinc solver.

First install the Python dependencies:

```bash
pip install datasets pandas tqdm
```

Then install the MiniZinc CLI with the HiGHS solver.
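To sanity-check the toolchain before a full run, you can list the solvers the MiniZinc CLI knows about (`--solvers` is a standard MiniZinc flag; the assertion below assumes HiGHS registers under a tag containing "highs"):

```python
import subprocess

# List the solvers registered with the MiniZinc CLI; HiGHS should appear.
out = subprocess.run(["minizinc", "--solvers"], capture_output=True, text=True, check=True)
print(out.stdout)
assert "highs" in out.stdout.lower(), "HiGHS not found; install or register it first"
```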
```bash
# Evaluate all strategies
python verify_results.py
```

The script will:
- Load the 100 IndustryOR problems from the `skadio/text2zinc` HuggingFace dataset
- Read the pre-generated `.mzn` files for each strategy
- Run each generated model through the MiniZinc solver (HiGHS, 120 s timeout)
- Compare objective values against the ground truth
- Print execution accuracy and solution accuracy per strategy, and save detailed results to CSV
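Per problem, the check reduces to something like the following sketch (the CLI flags are standard MiniZinc; the file path is illustrative, and `verify_results.py` is the authoritative version):

```python
import subprocess

def run_model(mzn_path: str, time_limit_ms: int = 120_000) -> str:
    """Solve one generated MiniZinc model with HiGHS and return its output."""
    result = subprocess.run(
        ["minizinc", "--solver", "highs", "--time-limit", str(time_limit_ms), mzn_path],
        capture_output=True, text=True,
    )
    return result.stdout

# Illustrative path: a run counts toward execution accuracy if the solver
# returns a solution, and toward solution accuracy if the reported objective
# matches the ground-truth value from the test set.
print(run_model("evaluations/generated_code/gemma-2-9b_original/problem_1.mzn"))
```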
The `finetuning.ipynb` notebook contains the full finetuning pipeline: it iterates over all base models and dataset strategies, applying LoRA-based finetuning with Unsloth.
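At its core, each run is a standard Unsloth LoRA setup; here is a minimal sketch (the hyperparameters are illustrative rather than the paper's settings, the text column name is an assumption, and `SFTTrainer` arguments vary across `trl` versions):

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load a 4-bit quantized base model; any of the five base models works here.
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/gemma-2-9b",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters to the usual attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=load_dataset("skadio/learn2zinc-augmented", split="train"),
    dataset_text_field="text",  # assumed column name; check the dataset card
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```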