
Fine-tune LLMs on AL (Business Central) code using training data from al-corpus.
| Item | Detail |
|---|---|
| Language | Python 3.11+ |
| Build | hatchling via pyproject.toml |
| Core deps | anthropic, click, pyyaml, sacrebleu |
| Optional deps | unsloth, trl, transformers, torch (training) |
| Data source | SShadowS/al-corpus |
| Model target | LoRA adapter via Unsloth QLoRA |
```
al-corpus pairs    → pairs.jsonl
       ↓
al-train describe  → described_pairs.jsonl  (Claude Haiku)
       ↓
al-train format    → train.jsonl + eval.jsonl  (ChatML)
       ↓
al-train train     → LoRA adapter  (Unsloth QLoRA)
       ↓
al-train eval      → report  (5 metrics)
```
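Every stage of the pipeline reads and writes JSON Lines. A minimal sketch of the I/O involved, using only the stdlib — note that the record fields shown (`input`, `output`) are invented for illustration and are not the tool's documented schema:

```python
import json

# Hypothetical record shape -- the actual field names produced by
# al-corpus / al-train are assumptions here, not a documented schema.
pair = {"input": "procedure shape", "output": "AL source code"}

def read_jsonl(path):
    """Yield one dict per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

def write_jsonl(path, records):
    """Write an iterable of dicts, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```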
```bash
python -m venv .venv
source .venv/Scripts/activate   # Windows Git Bash (use .venv/bin/activate on Linux/macOS)
pip install -e ".[train]"
```
```bash
# 1. Generate pairs with al-corpus
al-corpus pairs ./my-al-project -o pairs.jsonl

# 2. Generate descriptions (test with 100 first)
al-train describe pairs.jsonl -o described.jsonl -n 100

# 3. Full corpus via batch API
al-train describe pairs.jsonl -o described.jsonl --batch
al-train describe --poll described.jsonl.batch_id -o described.jsonl

# 4. Format for training
al-train format described.jsonl -o train.jsonl --eval eval.jsonl

# 5. Train
al-train train train.jsonl --eval eval.jsonl

# 6. Evaluate
al-train eval ./output/al-coder-lora --eval-set eval.jsonl
```
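Under the hood, the describe step asks Claude Haiku for a natural-language description of each AL snippet. A sketch of one such call via the Anthropic Python SDK — the prompt wording and the model alias are assumptions for illustration; only the Messages API usage is real:

```python
# Sketch of what "al-train describe" does per pair. The prompt text and
# model name below are assumptions, not the tool's actual internals.

def build_prompt(al_code: str) -> str:
    """Prompt asking for a one-paragraph description of an AL snippet."""
    return (
        "Describe in one paragraph what the following AL (Business Central) "
        "code does:\n\n" + al_code
    )

def describe(al_code: str, model: str = "claude-3-5-haiku-latest") -> str:
    from anthropic import Anthropic  # reads ANTHROPIC_API_KEY from the env
    client = Anthropic()
    msg = client.messages.create(
        model=model,
        max_tokens=512,
        messages=[{"role": "user", "content": build_prompt(al_code)}],
    )
    return msg.content[0].text

if __name__ == "__main__":
    print(describe("procedure Hello() begin Message('Hi'); end;"))
```

The `--batch` path submits the same prompts through the Anthropic batch API instead, which is cheaper for a full corpus.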
| Command | Input | Output | Description |
|---|---|---|---|
| al-train describe | pairs.jsonl | described_pairs.jsonl | Generate natural-language descriptions via Claude Haiku |
| al-train describe --batch | pairs.jsonl | described_pairs.jsonl | Submit as an Anthropic batch job |
| al-train describe --poll | .batch_id file | described_pairs.jsonl | Poll and retrieve a completed batch |
| al-train format | described_pairs.jsonl | train.jsonl, eval.jsonl | Convert to ChatML format and split train/eval |
| al-train train | train.jsonl | LoRA adapter | Fine-tune with Unsloth QLoRA |
| al-train eval | LoRA adapter + eval.jsonl | Report | Evaluate across 5 metrics |
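The format step turns each described pair into a ChatML-style message list and splits off an eval set. A minimal stand-in sketch — the field names (`description`, `code`), system prompt, and split ratio are assumptions, not al-train's actual defaults:

```python
import random

def to_chatml(record):
    """Turn a described pair into a ChatML-style message list.
    Field names ("description", "code") are assumed for illustration."""
    return {
        "messages": [
            {"role": "system", "content": "You write AL (Business Central) code."},
            {"role": "user", "content": record["description"]},
            {"role": "assistant", "content": record["code"]},
        ]
    }

def split(records, eval_fraction=0.1, seed=42):
    """Shuffle deterministically, then split into (train, eval) lists."""
    records = list(records)
    random.Random(seed).shuffle(records)
    n_eval = max(1, int(len(records) * eval_fraction))
    return records[n_eval:], records[:n_eval]
```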
| Requirement | Notes |
|---|---|
| Python 3.11+ | |
| ANTHROPIC_API_KEY | Required for description generation |
| NVIDIA GPU, 24 GB+ VRAM | Required for training (.[train] extras) |
| al-corpus on PATH | Required for pair generation and evaluation |
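The requirements above can be verified up front before a long run. A hypothetical preflight check (not part of al-train) using only the stdlib plus an optional torch probe:

```python
import os
import shutil

def missing_requirements(need_gpu: bool = False):
    """Return human-readable names of requirements that are not satisfied."""
    missing = []
    if "ANTHROPIC_API_KEY" not in os.environ:
        missing.append("ANTHROPIC_API_KEY")
    if shutil.which("al-corpus") is None:
        missing.append("al-corpus on PATH")
    if need_gpu:
        try:
            import torch
            if not torch.cuda.is_available():
                missing.append("NVIDIA GPU")
        except ImportError:
            missing.append("torch (.[train] extras)")
    return missing
```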
| File | Purpose |
|---|---|
| pyproject.toml | Package metadata, dependencies, entry points |
| pairs.jsonl | Raw AL code pairs from al-corpus |
| described_pairs.jsonl | Pairs enriched with Claude-generated descriptions |
| train.jsonl / eval.jsonl | ChatML-formatted split ready for training |
| output/al-coder-lora/ | Trained LoRA adapter output directory |
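The eval report covers five metrics that are not enumerated here. As a stand-in illustration of one plausible metric, a whitespace-normalized exact-match rate over generated vs. reference AL code (sacrebleu, a core dependency, would supply BLEU-style scores for the rest):

```python
def exact_match_rate(predictions, references):
    """Fraction of predictions identical to their reference,
    after collapsing whitespace. A hypothetical example metric,
    not necessarily one of al-train's five."""
    norm = lambda s: " ".join(s.split())
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references) if references else 0.0
```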
Author: Torben Leth <sshadows@sshadows.dk> — License: MIT