Chaty v0.3.3

Fangyuan025 released this 06 Jun 15:26

Chaty v0.3.3 — graceful out-of-memory handling

  • OOM-aware model loading. If a model doesn't fit, Chaty now automatically backs off the GPU offload. The back-off covers both the weights and the KV-cache/compute buffers (the latter often exhaust a small GPU's VRAM even when the weights fit). A toast tells you what happened (e.g. "Low VRAM — GPU offload reduced to 20/28 layers" or "…fell back to CPU").
  • If even a pure-CPU load runs out of memory, you get a clear message ("Out of memory — try a smaller / more-quantized model, or free up RAM") instead of a cryptic crash.
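The back-off behaviour described above could be sketched roughly as follows. This is an illustrative Python sketch only, not Chaty's actual implementation: the `OutOfMemory` exception, the `try_load` callback, the `load_with_backoff` helper, and the step size of 4 layers are all hypothetical names and values chosen for the example. The toast strings are copied from the notes above, including the elided "…fell back to CPU" text.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


class OutOfMemory(Exception):
    """Hypothetical error raised by a loader when an allocation fails."""


def load_with_backoff(
    try_load: Callable[[int], T],
    total_layers: int,
    step: int = 4,  # assumed back-off step; the real step size is not documented
) -> Tuple[T, int, Optional[str]]:
    """Try full GPU offload, then fewer layers, then pure CPU (0 layers).

    Returns (model, gpu_layers_used, toast_message_or_None).
    """
    layers = total_layers
    while True:
        try:
            model = try_load(layers)
        except OutOfMemory:
            if layers == 0:
                # Even the pure-CPU load failed: surface a clear message.
                raise RuntimeError(
                    "Out of memory — try a smaller / more-quantized model, "
                    "or free up RAM"
                ) from None
            layers = max(0, layers - step)
            continue
        if layers == total_layers:
            return model, layers, None  # everything fit, no toast needed
        if layers == 0:
            # Full toast text is elided in the release notes; kept as-is.
            return model, layers, "…fell back to CPU"
        toast = f"Low VRAM — GPU offload reduced to {layers}/{total_layers} layers"
        return model, layers, toast


def fake_loader(gpu_layers: int) -> str:
    """Stand-in loader: pretends only 20 layers fit on the GPU."""
    if gpu_layers > 20:
        raise OutOfMemory()
    return f"model with {gpu_layers} GPU layers"


if __name__ == "__main__":
    model, used, toast = load_with_backoff(fake_loader, total_layers=28)
    print(used, toast)
```

In this sketch the loader is retried with progressively fewer offloaded layers, so a failure caused by the KV-cache or compute buffers is handled the same way as one caused by the weights: whatever allocation fails, the next attempt simply asks for less GPU memory.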

Install (Windows x64)

Download Chaty_0.3.3_x64-setup.exe below and run it. The installer is per-user and does not need admin rights.