Skip to content

ailiance/ailiance-models-tuning

Repository files navigation

ailiance-models-tuning

Fine-tuning pipeline for the Ailiance domain-expert LLM family — 10 hardware/embedded domains on Qwen 2.5-32B.

Part of the Ailiance platform. Upstream sibling: ailiance-mac-tuner (MLX toolkit for Apple Silicon). Downstream consumer: micro-kiki (MoE-LoRA routing runtime).


Where to find related artifacts

Ailiance is the EU-sovereign LLM serving stack of L'Electron Rare, a French SME. Multi-model, audit-grade, EU AI Act Art. 13/15/52/53 transparency.

What it does

  • Builds chat-format datasets (ShareGPT + Hugging Face merges) for 10 hardware domains.
  • Trains QLoRA NF4 4-bit adapters on top of Qwen/Qwen2.5-32B-Instruct.
  • Evaluates adapters via token-overlap against held-out samples per domain.
  • Publishes adapters to the Hugging Face Hub with autogenerated model cards.
  • Maintains a JSON model registry (artifacts/model_registry.json) tracking version, metrics, base model.

Domain catalog

10 domains, one recipe shape per domain:

Category Domains
Firmware / MCU stm32, embedded, platformio, iot
EDA / Hardware kicad, spice, emc, power
Signal / CAD dsp, freecad

Plus espidf available via datasets/builders/expand_espidf.py. Each domain has a dedicated seed set, system prompt, and YAML recipe.

Hardware

Training target: KXKM-AI — RTX 4090 24 GB only. Qwen 2.5-32B QLoRA fits via:

max_memory = {0: "22GiB", "cpu": "50GiB"}
llm_int8_enable_fp32_cpu_offload = True

GrosMac / Tower cannot train 32B — use them for dataset building, eval, and publishing only. See parent monorepo ../CLAUDE.md for SSH setup. Ask before launching a multi-hour job on KXKM-AI.

Pipeline

datasets/builders/build_<domain>_dataset.py       seed + HF merge → JSONL
  → scripts/validate_dataset.py                   role/content schema
  → scripts/train_sft.py --config configs/...     QLoRA on 4090
  → outputs/sft-<domain>/adapter_model.safetensors
  → scripts/eval_adapters.py                      token-overlap, 5 samples/domain
  → scripts/publish_adapters.py                   HF Hub + model card
  → src/ailiance_tuning/registry.py                   JSON registry

Quick start

# Dependencies (Python via uv, parent F4L workspace uses 3.14)
uv sync

# Tests (CPU only, fast)
uv run python -m pytest tests/ -v

# Build all domain datasets (seeds only, no HF download)
./scripts/build_all_datasets.sh

# With Hugging Face enrichment
./scripts/build_all_datasets.sh --with-hf

# Validate dataset schema
python scripts/validate_dataset.py datasets/processed/*.jsonl

# Train one domain (on KXKM-AI, via SSH)
python scripts/train_sft.py \
  --base-model Qwen/Qwen2.5-32B-Instruct \
  --dataset datasets/processed/stm32_train.jsonl \
  --output-dir outputs/sft-stm32

# Evaluate all adapters
python scripts/eval_adapters.py --samples 5

# Publish all adapters to HF
python scripts/publish_adapters.py --org clemsail

Project structure

src/ailiance_tuning/       Thin lib — config dataclasses, registry, validator
scripts/               Real entry points (train / eval / publish / build)
datasets/builders/     One builder per domain (seed + HF merge)
datasets/processed/    Built JSONL outputs (gitignored)
configs/               YAML recipes, one per domain
outputs/               Trained adapters (gitignored)
artifacts/             Model registry JSON (gitignored)
tests/                 Config validation, dataset schema checks

Cross-repo map

Repo Role
mascarade LLM orchestration — loads adapters at inference time
micro-kiki Downstream runtime — 35-domain routing + cognitive layer
ailiance-mac-tuner Sibling — MLX fine-tuning for Mac Studio (distillation target = teacher)

Gotchas

  • Base-model mismatch: train_sft.py defaults to Qwen2.5-32B-Instruct but eval_adapters.py / publish_adapters.py historically hardcoded Qwen3-8B. Align all three when switching base; adapters are base-specific.
  • Adapter paths: training writes to outputs/sft-<domain>-qwen25-32b/, eval expects outputs/sft-<domain>/. Rename or symlink.
  • ShareGPT vs OpenAI format: builders emit ShareGPT (conversations/from/value), validator enforces OpenAI (messages/role/content). Always run sharegpt_to_openai() before writing JSONL.
  • HF streaming datasets can hang silently — --max-samples is mandatory.

More in CLAUDE.md (Hardware reality + Gotchas sections).

License

MIT. See LICENSE.

About

ailiance fine-tuning pipeline — model training, evaluation, registry (Unsloth, LoRA).

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors