PTG v0.3.1 — Setup & serve phases

dirvine released this 28 Jun 11:07

A focused follow-on to v0.3.0: the released ptg binary can now prepare its own
runtime environment, so running a mesh is a one-command affair rather than a
manual server/model setup.

What's new

ptg setup --yes    # detect llama-server, download the gated Gemma QAT model (~2.7 GB), write config
ptg serve          # launch the inference server in the foreground (config is remembered)
ptg --probe        # then just run a mesh — no --vllm-url / --model flags needed
  • ptg setup detects llama-server (flag → PTG_LLAMA_SERVER / LLAMA_SERVER →
    ~/.cache/ptg/bin → ~/llama-spike/.../build/bin → PATH),
    downloads the verified Gemma QAT model into the cache, and writes
    ~/.config/ptg/config.toml (%APPDATA% on Windows). Model download prefers
    the hf / huggingface-cli tool (handles Gemma gating best) and falls back
    to a native resumable download (HF_TOKEN bearer auth, .part → rename) if
    the CLI is absent or broken. The token is used only for the download and is
    never persisted.
  • ptg serve reads the config, short-circuits if the server is already
    running, else launches llama-server in the foreground with the exact verified
    flags. --dry-run prints the command. It does not daemonize.
  • Config-aware run path: the mesh command (ptg with no subcommand) now
    resolves server URL and model as flag → env → setup config → bundled default,
    so "setup once, then just ptg" works. The subcommand is optional, so all
    existing invocations are unchanged.
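
The flag → env → setup config → bundled default order described above can be sketched as a small "first present value wins" function. This is an illustrative sketch only, not ptg's actual code: the names `Source` and `resolve`, the URLs in the usage note, and the choice to treat empty strings as unset are all assumptions made for the example.

```rust
/// Where a resolved setting came from, in decreasing priority.
/// (Hypothetical type for this sketch; not taken from the ptg source.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Flag,
    Env,
    SetupConfig,
    BundledDefault,
}

/// Return the first candidate that is present, in the order
/// flag, env, setup config, and finally the bundled default.
pub fn resolve(
    flag: Option<&str>,
    env: Option<&str>,
    config: Option<&str>,
    default: &str,
) -> (String, Source) {
    let candidates = [
        (flag, Source::Flag),
        (env, Source::Env),
        (config, Source::SetupConfig),
    ];
    for (value, source) in candidates {
        // Assumption of this sketch: blank values count as "not set".
        if let Some(v) = value.map(str::trim).filter(|v| !v.is_empty()) {
            return (v.to_string(), source);
        }
    }
    // Nothing was supplied, so fall back to the built-in value.
    (default.to_string(), Source::BundledDefault)
}
```

With made-up URLs, `resolve(None, None, Some("http://127.0.0.1:8080"), "http://localhost:8000")` yields the config value tagged `Source::SetupConfig`, which is the "setup once, then just ptg" case. Passing a flag value as the first argument overrides everything else. Returning the source alongside the value makes it easy to log which layer supplied a setting.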

Honest prerequisites (documented in the README, not hidden)

  • llama-server must be installed. ptg setup detects it and prints exact
    install instructions if missing, but deliberately does not build it
    (GPU/platform-specific). Install from llama.cpp.
  • Gemma is gated. Accept the license at the model's HuggingFace page, then
    authenticate (hf login or set HF_TOKEN).

Validation

  • 114 unit/integration tests pass; clippy and doc builds are clean; the code
    is panic-free (-D warnings, with panic/unwrap/expect denied).
  • Live-tested locally: setup --dry-run (detects the real server),
    serve --dry-run (exact server flags), config-aware run, and a real native
    download (broken hf shim → graceful fallback → file downloaded + renamed).
  • The release workflow now smoke-tests ptg setup --dry-run on all four shipped
    targets (Linux, macOS arm64/x86_64, Windows), gating packaging on the setup
    command actually running on every platform.
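
The ".part → rename" step of the native fallback download can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not ptg's implementation: the function names (`part_path`, `resume_offset`, `finish_download`) are hypothetical, and the HTTP request, the HF_TOKEN bearer header, and any checksum verification are left out. The body is modelled as any `Read` source.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Read, Write};
use std::path::{Path, PathBuf};

/// Temporary file name for an in-progress download: `model.gguf` -> `model.gguf.part`.
pub fn part_path(dest: &Path) -> PathBuf {
    let mut name = dest.as_os_str().to_os_string();
    name.push(".part");
    PathBuf::from(name)
}

/// Bytes already on disk from an earlier, interrupted attempt (0 if none).
/// A real client would use this as the start of an HTTP Range request.
pub fn resume_offset(dest: &Path) -> u64 {
    fs::metadata(part_path(dest)).map(|m| m.len()).unwrap_or(0)
}

/// Append `body` to the `.part` file, then rename it onto `dest`.
/// `body` must contain only the bytes from `resume_offset(dest)` onward.
/// Returns the total size of the finished file.
pub fn finish_download<R: Read>(body: &mut R, dest: &Path) -> io::Result<u64> {
    let part = part_path(dest);
    let mut file = OpenOptions::new().create(true).append(true).open(&part)?;
    io::copy(body, &mut file)?;
    file.flush()?;
    // Push the data to disk before the file becomes visible under its final name.
    file.sync_all()?;
    let total = file.metadata()?.len();
    // Close the handle first; renaming an open file can fail on Windows.
    drop(file);
    fs::rename(&part, dest)?;
    Ok(total)
}
```

Because the final name only appears after the rename, an interrupted run leaves a `.part` file behind, not a truncated model. The next attempt can then ask the server for the bytes starting at `resume_offset(dest)`.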

No research changes

This release adds tooling only. The benchmarked structured-lateral findings and
the runtime concurrency fix from v0.3.0 are unchanged.