Releases · Fangyuan025/Chaty

15 Jun 04:08

v0.8.0

f0fbefd

v0.8.0 Latest

Latest

Install

Platform	File
Windows x64	`Chaty_*_x64-setup.exe` — per-user installer, no admin
macOS (Apple Silicon)	`Chaty_*_aarch64.dmg`

⚠️ macOS first launch — "Apple could not verify…" / "damaged"

Chaty is ad-hoc signed but not notarized (no paid Apple Developer
account), so Gatekeeper flags it on first launch. The app is safe —
everything runs locally. Clear the download quarantine once, in Terminal:

xattr -dr com.apple.quarantine /Applications/Chaty.app

Then open Chaty normally. (Alternatively: try to open it once, then go to
System Settings → Privacy & Security → Open Anyway.)

Full Changelog: v0.7.0...v0.8.0

Assets 4

13 Jun 03:24

github-actions

v0.7.0

0619353

v0.7.0

Install

Platform	File
Windows x64	`Chaty_*_x64-setup.exe` — per-user installer, no admin
macOS (Apple Silicon)	`Chaty_*_aarch64.dmg`

⚠️ macOS first launch — "Apple could not verify…" / "damaged"

xattr -dr com.apple.quarantine /Applications/Chaty.app

Then open Chaty normally. (Alternatively: try to open it once, then go to
System Settings → Privacy & Security → Open Anyway.)

Full Changelog: v0.6.0...v0.7.0

Assets 4

12 Jun 02:13

github-actions

v0.6.0

ff7e41c

v0.6.0 — macOS (Apple Silicon) port

Chaty now runs natively on Apple Silicon Macs, alongside Windows.

macOS support

Metal GPU backend — selected per target via a feature-multiplexer crate; Windows keeps Vulkan unchanged. On unified memory, all layers are offloaded when the model fits (recommendedMaxWorkingSetSize budget), with P-core-only worker threads.
Native window chrome — traffic lights (titleBarStyle: Overlay), Dock-icon reopen, menu-bar tray, Cmd+Shift+Space global hotkey.
Clean quit on every path — tray Quit, app-menu Quit and Cmd+Q no longer trip ggml/ONNX teardown crashes ("Chaty quit unexpectedly").
.dmg packaging with entitlements (mic, JIT, library validation for the bundled ONNX dylibs); CI builds the dmg headlessly via hdiutil.
Native microphone capture (CoreAudio/cpal) — WKWebView never exposes capture devices to embedded apps, so recording bypasses it entirely; devices are scanned (no phantom default-device failures) and the system mic consent is requested properly.

Model support

Gemma 4 — native renderer for the <|turn>role …<turn|> format, <|think|> thinking control, <|channel>thought reasoning folded into the think panel, turn-boundary stop insurance.
Qwen 3.5 / 3.6 — pre-opened <think> handling (synthetic open tag for the UI), reliable no-think (empty think block), detected by GGUF architecture so community finetunes with custom templates behave too; the dead /no_think soft switch is never sent to 3.5+.
Robust template fallback chain — embedded template → system-role folding → per-architecture built-in → ChatML, so unusual GGUFs still chat.
Families covered: Llama 3, Gemma 3, Gemma 4, Qwen 3, Qwen 3.5/3.6.

Memory & model switching

Synchronous eject before load — switching models fully tears down (and verifies release of) the old model before the new one loads; no more unified-memory swap freezes. mmap is disabled on macOS (Metal-wired pages of mmap'd MoE models never returned to the kernel).
Model load progress bar (eject → weights % → ready) in the titlebar and model chip.
Pre-flight guard — models that cannot physically fit in RAM are refused with a clear message.
Context auto-fit — "Auto" now uses as much of the model's trained context as memory allows (KV-cache-size aware); custom values are capped to fit, with a visible notice when clamped; the settings slider adapts to the loaded model's trained length.

Chat & UI

Stop reason shown after each reply (finished / length / context full / stop sequence / cancelled).
Focused thinking view — while reasoning streams, a small window follows the newest text with older lines fading out; expandable as before.
Circular context-usage ring (amber > 80 %, red > 95 %).
Unlimited reply length by default (opt-in cap in settings); "Reload to apply" button under the context setting.
Playable HTML preview — single-file web games work (keyboard focus + localStorage shim in the sandbox).
Mermaid: theme-aware, looser parsing, and visible error messages instead of silently showing raw code.
Higher-contrast light theme; "Open models folder" in the model menu; backend errors are bilingual (中文/English).
GPU/memory usage in the hardware panel now reports the app's real footprint.

Release engineering

Cross-platform release CI — pushing a vx.y.z tag builds the Windows installer and macOS dmg and publishes both to one GitHub Release (version consistency is checked against the tag).
scripts/bump-version.sh syncs the version across package.json, tauri.conf.json, Cargo.toml and Cargo.lock.
The in-app updater picks the right asset per platform (.exe / .dmg).

Assets 4

08 Jun 22:21

Fangyuan025

v0.5.1

85631b9

Chaty v0.5.1

Fixed

Softer light theme. Light mode was too harsh — pure-white surfaces with near-black text and heavy drop shadows. Retuned to off-white surfaces, soft-black text, hairline borders, and gentler panel shadows. Dark mode is unchanged.

Install (Windows x64): download Chaty_0.5.1_x64-setup.exe below and run it (per-user, no admin).

Assets 3

08 Jun 21:50

Fangyuan025

v0.5.0

dadbe16

Chaty v0.5.0

A big feature release — the remaining items from the roadmap all landed.

New

Drag & drop a file or image onto the window to attach it instantly.
Sampling controls — Top‑K, Min‑P, repeat penalty, and stop sequences, all in Settings.
Prompt presets — save and one‑click apply multiple system prompts.
Voice picker & speech rate — choose among 11 Kokoro voices and adjust read‑aloud / live‑mode speed.
Themes — light, dark, or follow‑system.
Export conversations to Markdown or JSON, and full‑text search across all your chats from the sidebar.
In‑app model downloader — paste a HuggingFace repo (e.g. Qwen/Qwen3-4B-GGUF), pick a GGUF, and download it into your models folder with a live progress bar.
Mermaid diagrams — ```mermaid code blocks render as diagrams (lazy‑loaded so the app stays light).

Install (Windows x64): download Chaty_0.5.0_x64-setup.exe below and run it (per‑user, no admin).

Assets 3

08 Jun 04:55

Fangyuan025

v0.4.3

9a7a235

Chaty v0.4.3

Fixed

Qwen3.5+ thinking is controllable again. These models dropped the /no_think soft switch and default to reasoning on. The think toggle is re-enabled for them, and turning it off now actually disables thinking (the backend pre-fills an empty <think></think> block). Title/summary/voice helpers always skip reasoning on these models for speed.
Settings panel no longer overflows the window under display scaling or a small window — it caps to the viewport and scrolls instead.
No more rubber-band / page shift when the mouse wheel hits the top or bottom of a list; only inner panes scroll now.

Install (Windows x64): download Chaty_0.4.3_x64-setup.exe below and run it (per-user, no admin).

Assets 3

08 Jun 04:24

Fangyuan025

v0.4.2

482b445

Chaty v0.4.2

Qwen3.5+ thinking adaptation

Qwen3.5 dropped the /think · /no_think soft switch (it now uses an enable_thinking template flag, default off). Chaty detects this from the chat template and no longer injects /no_think into the prompt for such models — so the tag can't leak into replies.
For Qwen3.5+, the thinking toggle is disabled with a note that reasoning is auto-managed by the model.
Prior <think> reasoning is never fed back into the model on later turns (per Qwen's guidance), saving context and avoiding confusion.

New

Zoomable HTML preview — zoom buttons plus Ctrl + / - / 0 (0.4×–2.5×).
Configurable context length — Settings now lets you pick the context window (Auto, or up to the model's native trained length, e.g. far beyond the old 8192 cap).
Auto context compaction — as a conversation nears the context limit, older turns are summarised into the prompt automatically so you can keep chatting. Non-destructive: every message still shows in the UI.

Fixed

Clipboard no longer triggers a permission prompt — copy, and right-click cut/copy/paste, now use the native clipboard.

Install (Windows x64): download Chaty_0.4.2_x64-setup.exe below and run it (per-user, no admin).

Assets 3

07 Jun 05:27

Fangyuan025

v0.4.1

3ca745e

Chaty v0.4.1

Chaty v0.4.1 — edit & regenerate

Regenerate any assistant reply — a new Regenerate button re-runs the turn (and drops anything after it).
Edit your messages — hover a message you sent and click the pencil to edit it in place; saving re-sends from that point.

This is also the first release that the in-app auto-updater (added in v0.4.0) can offer you automatically.

Install (Windows x64)

Update from inside Chaty if you're on v0.4.0+, or download Chaty_0.4.1_x64-setup.exe below.

Assets 3

06 Jun 22:18

Fangyuan025

v0.4.0

9a1d887

Chaty v0.4.0

Chaty v0.4.0

New

In-app auto-update. Chaty now checks GitHub for a newer release on launch and offers a one-click update (downloads and runs the installer).
Context-usage meter. The stats line shows how full the model's context window is (ctx 3.2K/8K with a bar).

Fixes

Model switching no longer crashes. Switching models could fail with "failed to initialize inference context: null reference" — the old model is now freed first, and a failed context allocation backs off the GPU offload instead of erroring out.
Better model compatibility. Some models (e.g. certain Qwen variants) failed on the 2nd turn with a decode error and then stayed broken — prompt decoding now falls back to a clean re-decode and self-recovers, so more models work reliably.
No more date‑reciting. The "today's date" hint is only sent when your question is actually time-related, so short prompts no longer make the model recite the date.
Thinking mode is only used (and only injects /no_think) when the model actually supports reasoning; it's disabled in the menu otherwise.
Thinking and Web search are now mutually exclusive (enabling one turns the other off).
Live mode now shows only the sentence currently being spoken, in a larger font (the full transcript is saved to the chat as before).

Install (Windows x64)

Download Chaty_0.4.0_x64-setup.exe below and run it (per-user, no admin). After this version, future updates are offered in-app.

Assets 3

06 Jun 15:26

Fangyuan025

v0.3.3

ddd4f8f

Chaty v0.3.3

Chaty v0.3.3 — graceful out-of-memory handling

OOM-aware model loading. If a model doesn't fit, Chaty now automatically backs off the GPU offload — covering both the weights and the KV-cache/compute buffers (the latter often runs a small GPU out of VRAM even when the weights fit) — and tells you what happened with a toast (e.g. "Low VRAM — GPU offload reduced to 20/28 layers" or "…fell back to CPU").
If even a pure-CPU load runs out of memory, you get a clear message ("Out of memory — try a smaller / more-quantized model, or free up RAM") instead of a cryptic crash.

Install (Windows x64)

Download Chaty_0.3.3_x64-setup.exe below and run it (per-user, no admin).

Assets 3

Releases: Fangyuan025/Chaty

v0.8.0

Install

⚠️ macOS first launch — "Apple could not verify…" / "damaged"

Uh oh!

v0.7.0

Install

⚠️ macOS first launch — "Apple could not verify…" / "damaged"

Uh oh!

v0.6.0 — macOS (Apple Silicon) port

macOS support

Model support

Memory & model switching

Chat & UI

Release engineering

Uh oh!

Chaty v0.5.1

Chaty v0.5.1

Fixed

Uh oh!

Chaty v0.5.0

Chaty v0.5.0

New

Uh oh!

Chaty v0.4.3

Chaty v0.4.3

Fixed

Uh oh!

Chaty v0.4.2

Chaty v0.4.2

Qwen3.5+ thinking adaptation

New

Fixed

Uh oh!

Chaty v0.4.1

Install (Windows x64)

Uh oh!

Chaty v0.4.0

New

Fixes

Install (Windows x64)

Uh oh!

Chaty v0.3.3

Install (Windows x64)

Uh oh!