Chaty v0.3.3 — graceful out-of-memory handling
- OOM-aware model loading. If a model doesn't fit, Chaty now automatically reduces the GPU offload and shows a toast saying what happened (e.g. "Low VRAM — GPU offload reduced to 20/28 layers" or "…fell back to CPU"). The back-off accounts for both the weights and the KV-cache/compute buffers; on a small GPU the buffers often exhaust VRAM even when the weights fit.
- If even a pure-CPU load runs out of memory, you get a clear message ("Out of memory — try a smaller / more-quantized model, or free up RAM") instead of a cryptic crash.
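The back-off described above can be pictured roughly as follows. This is only an illustrative sketch, not Chaty's actual code: `OutOfMemory`, `load_with_backoff`, `toast_for`, the `try_load` callback, and the fixed step of 4 layers are all hypothetical names and choices made for the example. The message strings are copied from these notes, including the elided "…" prefix.

```python
from typing import Callable, Optional, Tuple


class OutOfMemory(Exception):
    """Hypothetical error a loader raises when RAM or VRAM is exhausted."""


def load_with_backoff(
    try_load: Callable[[int], object],
    total_layers: int,
    step: int = 4,  # assumed back-off step; the real schedule is not documented here
) -> Tuple[object, int]:
    """Try full GPU offload first, then fewer layers, then CPU only.

    Returns (model, gpu_layers_used). Raises OutOfMemory with a readable
    message if even the pure-CPU attempt (0 GPU layers) fails.
    """
    gpu_layers = total_layers
    while True:
        try:
            return try_load(gpu_layers), gpu_layers
        except OutOfMemory:
            if gpu_layers == 0:
                # Nothing left to back off to: report clearly instead of crashing.
                raise OutOfMemory(
                    "Out of memory — try a smaller / more-quantized model, or free up RAM"
                ) from None
            gpu_layers = max(0, gpu_layers - step)


def toast_for(gpu_layers: int, total_layers: int) -> Optional[str]:
    """Pick the toast text for the outcome; None means no back-off was needed."""
    if gpu_layers == total_layers:
        return None
    if gpu_layers == 0:
        return "…fell back to CPU"  # prefix is elided in the release notes
    return f"Low VRAM — GPU offload reduced to {gpu_layers}/{total_layers} layers"


if __name__ == "__main__":
    def fake_load(n_gpu_layers: int) -> dict:
        # Simulated loader: pretend only 20 of 28 layers fit in VRAM.
        if n_gpu_layers > 20:
            raise OutOfMemory("simulated VRAM exhaustion")
        return {"gpu_layers": n_gpu_layers}

    _, used = load_with_backoff(fake_load, total_layers=28)
    print(toast_for(used, 28))  # Low VRAM — GPU offload reduced to 20/28 layers
```

Separating the retry loop from the message selection keeps the loader testable with a fake `try_load`, as in the usage at the bottom of the block.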
Install (Windows x64)
Download Chaty_0.3.3_x64-setup.exe below and run it. The installer is per-user and needs no admin rights.