Model v4

@avafloww released this 25 Feb 08:25 · 1 commit to main since this release

Model v4 — Diverse Augmentation

Training data now uses real-world package names from PyPI (800+), npm (1300+), and crates.io (500+), plus curated Docker images, repo names, and system packages. The model no longer hallucinates the myapp placeholder.
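As a rough illustration of this kind of augmentation, the sketch below fills a command template with a name drawn at random from a per-ecosystem pool. The pool contents, the `{name}` slot syntax, and the `augment` helper are hypothetical and are not taken from the actual training pipeline.

```python
import random

# Hypothetical name pools; the real lists are far larger (800+ PyPI names, etc.).
NAME_POOLS = {
    "pypi": ["requests", "numpy", "flask"],
    "npm": ["express", "lodash", "react"],
    "crates": ["serde", "tokio", "clap"],
}


def augment(template: str, ecosystem: str, rng: random.Random) -> str:
    """Fill the {name} slot with a randomly chosen package name (assumed template syntax)."""
    name = rng.choice(NAME_POOLS[ecosystem])
    return template.format(name=name)


# Seeded generator so the sampled examples are reproducible.
rng = random.Random(0)
examples = [augment("pip install {name}", "pypi", rng) for _ in range(5)]
```

Sampling from many real names, rather than reusing one placeholder, is what keeps any single name from dominating the data.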

Improvements

  • pytohn → python: previously hallucinated myapp:v1.0, now correct
  • rm mydir/ → rm -rf mydir/: previously garbled, now correct
  • pip install under PEP 668 (externally managed environment): now suggests uvx, pipx, and venv creation
  • docker ps permission error: now suggests both sudo and usermod -aG docker
  • All v3 fixes retained (clean EOS stopping, multi-alt where appropriate)

Stats

  • 60K training examples with 2600+ unique package/project names
  • No single placeholder exceeds 0.1% of training data (was 7% for myapp)
  • Train loss: 0.099, eval loss: 0.068
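
The placeholder statistic above could be checked with something like the following sketch. It assumes training examples are plain strings and that a set of known names is available; `max_name_share`, the tokenization, and the toy data are all hypothetical, not the project's actual tooling.

```python
from collections import Counter


def max_name_share(examples, names):
    """Return (name, share) for the name that appears in the largest fraction of examples."""
    counts = Counter()
    for text in examples:
        # Crude tokenization (an assumption): split on whitespace, ':' and '/'.
        tokens = set(text.replace(":", " ").replace("/", " ").split())
        for name in names & tokens:
            counts[name] += 1
    if not counts:
        return None, 0.0
    name, hits = counts.most_common(1)[0]
    return name, hits / len(examples)


# Toy data only: myapp appears in 2 of 4 examples.
toy_examples = [
    "docker run myapp:v1.0",
    "docker build -t myapp .",
    "pip install requests",
    "cargo add serde",
]
toy_names = {"myapp", "requests", "serde"}
name, share = max_name_share(toy_examples, toy_names)
```

On a real dataset, a share above a chosen threshold such as 0.001 (0.1%) would flag an over-represented name.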