Releases: bloom500/feral
Feral v0.2.3
Released 2026-06-14 — Windows, macOS (Apple Silicon + Intel), and Linux.
Added
- GPU acceleration. Feral now ships a GPU backend on every platform —
Vulkan on Windows and Linux, Metal on macOS — and offloads the whole model
to the GPU by default. Local models that previously ran CPU-only (slow,
sometimes "not responding" for minutes) now use the graphics card. If the GPU
can't be used — missing or old driver, no Vulkan runtime, or not enough VRAM
for the model's context — Feral automatically falls back to CPU so the model
still loads instead of failing. - Desktop control (opt-in). The agent can now drive native applications
through the OS accessibility tree — list windows, read controls, type, click,
and send real keystrokes. Off by default; enable it under Settings → Agent,
with a Safe mode (confirm every action) and a YOLO mode (no prompts). A hard
denylist (password managers, system security dialogs, Feral itself) can never
be controlled. - Configurable token budget. The agent's conversation budget is now
unlimited by default — no more hitting "budget exhausted" mid-task. Optional
caps (1M/5M/20M/50M) are available under Settings → Agent for cost control. - Live context ring. A live indicator of how full the model's context
window is, so you can see when the conversation is approaching the limit.
Fixed
- Loading a model no longer crashes the machine. Modern models advertise
enormous training contexts (up to 256K), and the KV cache was sized to that
full context and allocated up front — roughly 90 GB for a 4B model, which
instantly exhausted memory (a kernel panic and reboot on macOS, a near-hang
on Windows). The load-time context is now capped to a safe default (8192,
raisable viaFERAL_MAX_CONTEXT), clamped to what the model actually
supports.
Feral v0.2.3-rc.4
ci(release): use TheMrMilchmann/setup-msvc-dev (ilammer action was 404) The previous attempt referenced ilammer/msvc-dev-cmd, which no longer exists as a repository — GitHub failed every matrix job at "Set up job" (action resolution happens for all steps regardless of their `if:`), so Linux and macOS regressed too. Swap to TheMrMilchmann/setup-msvc-dev@v3.0.2 (confirmed to exist) to load the MSVC environment for the Ninja-based Vulkan build. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Feral v0.2.2
Released 2026-06-13 — Windows, macOS (Apple Silicon + Intel), and Linux.
Fixed
- The agent no longer stops mid-reply when it runs out of tokens. Long
writing sessions (reports, theses, articles) were cut off mid-sentence at
the per-call token limit and silently presented as the final answer, forcing
the user to type "continue" by hand. All inference providers now report why
generation ended (finish_reason/done_reason/stop_reason), and on a
mid-answer cutoff the agent automatically resumes exactly where it stopped —
the reply streams on in the same bubble, up to 4 automatic continuations. - More malformed tool-call shapes are caught and retried. The v0.2.1
detector matchedname/toolkeys only; real-world transcripts showed
models emitting{"invoke name="write_file">(a JSON/XML hybrid) and
XML-style<invoke name=...>, which escaped detection — ending the turn
mid-task with the garbage visible in chat. Both shapes are now scrubbed and
trigger the corrective retry nudge. ask_uservalidation errors teach the model the right shape. Every
rejection now includes a complete valid example call, so a model that got
the structure wrong can self-correct on retry instead of failing the same
way twice and giving up.
Feral v0.2.1
Released 2026-06-12 — Windows, macOS (Apple Silicon + Intel), and Linux.
Fixed
- Raw tool calls no longer leak into the chat. Two holes closed: the
inference providers streamed the re-encoded<tool_call>{json}</tool_call>
tag to the UI as visible tokens, and the frontend only suppressed tool-call
text at the start of a streamed answer — prose followed by a mid-message
tool call (e.g.<tool_call>{"name="memory_graph">) was displayed verbatim.
Tool calls now travel only through the parsed content channel, and the
streaming view cuts tool-call text anywhere it appears (including a partial
opener at the end of the buffer), keeping the prose before it visible. - The agent no longer stops mid-task on a malformed tool call. When a
model emitted a tool call with corrupted JSON, the parser scrubbed it but
executed nothing, and the loop treated the turn as a final answer — the
task silently died. The loop now detects the malformed attempt and feeds a
corrective nudge back to the model so it can re-emit a valid call (bounded
at 3 retries). - Parallel tool calls are no longer dropped. All providers (OpenAI-
compatible, Ollama, Anthropic) re-encoded only the first native tool call
of a turn; any additional calls were silently discarded, leaving the model
waiting on results that never arrived. Every call in the turn is now
executed. ask_userworks reliably with native function calling. The native tool
schema declaredquestionsas a bare array with no item structure, so
models had to guess the nested{question, options:[{label}]}shape and
most calls failed validation. Tools can now publish a full JSON Schema per
parameter, andask_userships one — including option labels, descriptions,
therecommendedflag, and multi-select.
Feral v0.2.0
Released 2026-06-12 — Windows, macOS (Apple Silicon + Intel), and Linux.
⚠️ Migration notes from 0.1.x
- Updater key rotation. The original update-signing key was exposed in the
public git history and has been rotated. Upgrading from 0.1.7 or older
requires either installing the transitional 0.1.8 release first (it
carries the new verification key) or downloading the 0.2.0 installer
manually one time. Full plan:docs/UPDATER_KEY_MIGRATION.md. - Onboarding, conversations, models, and BYOK keys carry over unchanged.
New
- Vision — the agent can finally see your images. Pasted screenshots
(Ctrl+V) and uploaded image files now reach the model as real pixel data
(base64 data URLs), not just a[Image attached: …]filename note. Works on
both inference paths: direct chat (BYOK cloud via OpenAIimage_urlcontent
parts) and the Feral Agent sidecar (OpenAI-compatible, Ollamaimages, and
Anthropic base64 blocks). Local llama.cpp GGUF models remain text-only and
keep the filename note. - Memory that actually carries over. New conversations no longer start
cold: the chat tab now injects a "[Memory context]" block (facts from the
shared knowledge graph) into the system prompt on every send, and runs a
background extraction pass after each completed turn that writes
subject–predicate–object triples back into~/.feral/memory-graph.json—
the same graph the agent sidecar maintains. The sidecar side recalls graph
facts at every turn too, extracts from the very first assistant turn
(previously only every 3rd — short chats never learned anything), and the
previously-unregisteredmemory_ops/memory_graphtools are now live so
"remember X" / "forget Y" take effect immediately. - Memory Graph page redesign. Cognee-inspired dark visualization: glowing
neon nodes on a near-black canvas, degree-scaled node sizes, a filter rail
with per-type counts, relation chips, free-text node search, a labels
toggle, and a click-to-inspect detail card showing a node's connections. - MCP Extensions (native connector). Feral now consumes Model Context
Protocol servers through the officialrmcpRust SDK, managed entirely in
the Tauri host. New "Extensions" entry in the sidebar (under Skills) with an
app-store style UI: curated catalog (File Access, Long-term Memory, GitHub,
Web Search, Browser Automation, Deep Reasoning), one-click install, on/off
toggle, "What can it do?" tool listing, and humanized errors. No transports,
JSON-RPC, raw values, internal paths, or API keys ever reach the frontend;
configs (including keys) live backend-side in~/.feral/mcp.json. - Drag & drop + paste attachments — any file type. Files and images can
be dropped onto the chat input, and screenshots paste straight from the
clipboard (Ctrl+V). PDFs and Office documents (.docx, .pptx, .xlsx, .odt)
are now parsed natively (new Rustextract_file_textcommand) so their
text reaches the model; plain-text files of any extension work as before;
and true binaries are attached as a path reference the agent can open with
its file tools instead of becoming a dead "Unsupported format" chip. The
agent path now receives every attachment too — previously files whose text
couldn't be extracted were silently dropped and never reached the model. - macOS and Linux releases. The release pipeline now builds and signs
installers for Windows (.msi/.exe), macOS Apple Silicon + Intel (.dmg),
and Linux (.deb/.rpm) from a single tag, updater manifest included.
The agent sidecar resolves its per-platform binary on all five targets. - Mascot redesign — the real Feral monster. The pixel companion is now
the brand mascot itself: charcoal-black fluffy monster with orange horns,
an orange face plate with big black eyes, tiny white fangs, and a round
orange belly. 55 animated variants across all 22 states, composed from a
single base sprite so the cast stays consistent (laptop typing, thought
clouds and lightbulbs, magnifier searches, party hats, heart eyes, a real
side-run cycle, dissolve-in spawning, and more). It also renders 26%
larger at a crisp integer 3× pixel scale, so every state, variant, and
effect is clearly readable without crowding the chat input. A procedural
pixel-effects layer plays around it per state: confetti on celebrate,
rising hearts, drifting Z's while sleeping, thought dots, an orbiting
search ring, a drawn-in green check on done, flashing error cross, work
sparks during tool calls. - Sonnet-style agent voice. SOUL.md rewritten (super friendly + ultra
useful), plus new bundled IDENTITY.md and AGENTS.md companions — each
user-overridable at~/.feral/<NAME>.md— composed into the system prompt. - Contributor guide.
docs/CONTRIBUTOR_GUIDE.md: three-runtime
architecture, IPC protocols, test matrix, build & release flow.
Stability
-
macOS/Linux: cloud model selection works in agent mode. The agent
sidecar binary is now resolved next to the main executable
(Contents/MacOS/feral-agentin the .app bundle,/usr/bin/feral-agent
on deb/rpm installs) where Tauri actually places it — previously only the
resource directory was checked, so the sidecar was silently dead on
macOS/Linux production installs and picking a cloud (BYOK) model from the
model selector did nothing. Model-switch failures (sidecar offline,
provider disabled, missing key) now surface as error notifications
instead of vanishing silently. -
Agent stop actually stops (this release). The Stop button's signal now
travels the whole chain: newferal_stop_generationTauri command →stop
message over sidecar stdin →AgentLoop.stop(sessionId)aborts the
in-flight inference fetch and any running tool. Previously the frontend
called a Rust command that didn't exist, so agent generations were
unstoppable. -
Agent tasks survive chat/tab switches. A live per-session mirror
(lib/feralLiveSession.ts) accumulates streamed content, tool bubbles, and
agent phase even while another chat is on screen; re-opening the in-flight
conversation rehydrates all of it instead of showing the stale disk
snapshot (which made tasks look reset during long tool runs). -
Tool-call bubbles behave. The mascot's tool-call stack now grows upward
from above the mascot (it could previously extend down over the typing
bar), and each finished bubble fades out on its own after 4s instead of
piling up until the turn ended. -
Unified stream stop (A2). One stop entry point (
lib/streamControl.ts)
routes Stop to whichever path (chat backend / agent sidecar) actually owns
the in-flight generation. Previously the Stop button in Agent mode told the
idle chat backend to stop while the sidecar kept generating. Feral streams
are stoppable per-session, and a new send interrupts the previous in-flight
stream on both paths. -
Sidecar supervision (#11). The Tauri shell now watches the agent sidecar
process, restarts it with backoff on crashes, and shows an "agent offline /
restarting" banner. Before: a sidecar crash made Agent mode silently mute. -
GGUF chat template (A4). The prompt format is now read from the model's
own GGUF metadata (tokenizer.chat_template) via llama.cpp's template
engine; the filename-based guess is only a fallback. A renamed GGUF no
longer gets a wrong template and corrupted output. -
Real tokenizer endpoint (P3). New
/tokenizeroute on the local API
backed by the loaded model's actual vocabulary; the agent's context
accounting no longer relies on GPT-2 BPE guesses, and token counts are
cached per message text. -
Idle-timeout transparency (#13). A stream that stalls for 300s is now
reported as an explained error ("model stopped responding…") instead of
silently ending in a fake "stopped by user" state. -
No more mid-reasoning dead ends. When a completion exhausts its
per-call token limit while the model is still thinking, the agent loop now
feeds the partial reasoning back and asks the model to finish — up to 4
automatic continuations — instead of surfacing "(The model used all
available tokens on reasoning and produced no answer)". Tasks complete
end-to-end regardless of the Max Tokens slider.
Error handling
- Root React ErrorBoundary (#9). A render exception now shows a recovery
screen (try again / reload) instead of a blank window. - Humanized inference errors (#10). Raw provider errors (401/429/network/
context overflow…) are mapped to plain-English messages with a fix-it
action (e.g. "key rejected → Open Cloud Keys"); the raw error stays
available under "Technical details". Local-engine failures no longer leak
[Error: …]text into the chat transcript. - Cron visibility (X3). Scheduled jobs now run through the full agent
loop (tools, memory, budgets) instead of a bare LLM call, and their results
and failures surface as toasts. Failed runs emit acron_errorevent.
First-run experience
- Zero-models flow (#14). After the first model download finishes, it is
loaded automatically (when nothing else is loaded), taking a fresh install
from "empty app" to "ready to chat" without a manual Load click. - Hardware-aware recommendation (#15). The onboarding wizard's final step
now reads your RAM/VRAM/Vulkan detection and recommends a concrete model
size + quantization to download. - Slow-start indicator (#16). While a model loads (with %) or a long
prompt prefills, the chat shows an explanatory status instead of silent
dots. - Onboarding sequencing (#17). The agent-creation flow now actually
mounts (it was unreachable) and never stacks on top of the first-run
wizard.
UI
- Tool-call bubbles (#18) are interactive: finished calls expand to show
the tool's output (or error), and long-running tools show live
retry/progress notes. - **Empty states (#19...
Feral v0.1.6
Skills
- Claude Code-style skill injection. The system prompt now carries only a short "Available skills" menu (id, name, description) instead of dumping every installed skill's full body. The new
read_skilltool lets the LLM load the fullSKILL.mdbody of any installed skill on demand. Skills are refreshed every turn from Rust, so installing or removing a skill mid-conversation shows up in the next message without resetting the session.
Agent
- Helpful message on empty thinking completions. When a model is cut off mid-reasoning and produces no visible answer, the chat now shows a clear message (distinguishing "model used all tokens on reasoning" from a true empty response) instead of a silent empty bubble.
selectLocalAgentroutes through the model store. In Agent mode, picking a local model from the chat header now goes through the store'sload()method, so the ModelPill renders a real progress bar (withrole="progressbar") instead of bare text.
Feral v0.1.5
Mascot
- 8-bit animated mascot. A 16×16 pixel-art fluffy black monster with orange horns, big eyes, and two fangs now lives permanently on the typing bar. Reacts to what you're doing: blinks while idle, looks down while you type, eyes dart side-to-side while thinking, scans down while calling a tool, hops happily when the model finishes.
- Idle boredom run. After 18 seconds of inactivity the mascot gets bored, switches to a side-profile silhouette, and runs across the full width of the typing bar — leaving small pixel dust puffs in its wake — then flips around and runs back. Any activity (typing, streaming) snaps it straight back to the perch.
- Reduced-motion support. All canvas animations respect
prefers-reduced-motion.
Agent
- Token cap removed. No more daily or per-conversation token budget. Feral Agent runs on BYOK (user pays own provider), so capping was pointless and caused agent sessions to silently stall. Budget can be re-enabled via
FERAL_BUDGET_DAY/FERAL_BUDGET_CONVERSATIONenv vars if needed. - CI sidecar fix. Release builds now compile the Feral Agent sidecar from the vendored
FeralAgent/directory in the monorepo instead of cloning an outdated external repository. Eliminates a class of "agent not responding in production release" bugs.
UI fixes
- Real app version in sidebar. The version badge now reads from the Tauri API instead of the previously hardcoded
v0.1.3string. - Context ring in agent mode. The ring no longer stays stale when using the agent. It now shows a comet-arc activity indicator during agent streaming (the sidecar doesn't emit live token counts, so the spinning arc is the honest signal).
Feral v0.1.4
Agent mode
- Native Feral Agent runtime. Agents now run on a built-in Feral Agent sidecar (Bun/TS) wired directly into the chat stream — no external gateway process. A Chat/Agent toggle in the composer switches modes and auto-loads the selected local model into the agent engine.
- DeepResearch & adaptive reasoning. Dynamic max-iteration budgets for deep-research and complex tasks, model-fitness scoring, error-correcting control loop, and persistent agent memory.
- Sturdier tool calls. Parser now handles Gemma-style
<tool_call>, bracket and bare-JSON formats, and theargumentskey; adds silent tool calls and an empty-response fallback; raises token budgets for thinking models.
Chat & UI
- Live context ring. Real token usage straight from the model — exact prompt tokens from llama.cpp locally, real usage stats from cloud providers — with a hover popover showing tokens used, free space and message count, and the model's true context window instead of an estimate.
- Streaming polish. Words fade in one-by-one as tokens stream, and a phase indicator shows Thinking / Calling tool / Processing.
- Thinking blocks. Support for multiple formats (
<think>,<thinking>,<|channel>), a thinking timer, and blocks that now persist across chat and tab switches. - Response resilience. Partial responses are always persisted, and responses survive navigating away and back, tab switches, and hot-swapping the active model.
Stability & performance
- Fix a
GGML_ASSERTcrash on long agent prompts by chunking the prefill batch. - Memoize message rendering so streaming no longer re-parses already-completed messages.
- Warning-free
cargo clippyon the inference build, repo hygiene (.gitattributes, corrected.gitignore), andunist-util-visitpinned as a direct dependency.
Feral v0.1.3
What's new in v0.1.3
SkillHub — New Feature
The biggest addition in this release. Skills extend what your AI can do — think of them as plugins for your local models.
- Installed tab: see, preview, and remove every skill you have installed
- Discover tab: browse remote skills from the Feral manifest, preview content before installing
- Community tab: 27 community skills from ClawHub and GitHub contributors, browsable directly in the app — Skill Vetter, Self-Improving Agent, GitHub CLI, Google Workspace, Multi Search Engine, Obsidian, and more
- Import tab: install any skill by URL or local file path, with a live content preview before committing
- Install / remove flows: one-click install and remove with error feedback
- Click-outside to close: drawer dismisses naturally without needing a close button
Auto-updater
- Version number now shown correctly in Settings → General and the sidebar footer
- Update-available toast appears automatically when a new release is out
- "Check for updates" button in Settings gives instant feedback
CI / Release
- Release workflow now builds, signs, and publishes installer artifacts automatically on every version tag
latest.jsonand.sigfiles are uploaded automatically — auto-update works for all users from this release forward
Full changelog: v0.1.2...v0.1.3
Feral v0.1.2
What's new in v0.1.2
Models Tab — Complete Overhaul
- Browse HuggingFace redesigned: each card shows a smart Download pill pre-set to your hardware's ideal quant (e.g. Q4_K_M for 8 GB VRAM)
- One-click download: clicking the pill starts the download immediately — no need to open variants first
- Per-variant download buttons: expand variants to see individual Download buttons next to each quantization
- Compatibility badges: Fits / May be slow / Won't fit indicators with hover popover showing details per variant
- Smarter recommendations: full-precision (F32/F16/BF16) and split shards deprioritized automatically
- Download sidebar popover: click the download icon in the sidebar to see active download progress, cancel, and status
Performance
- System info instant: GPU/RAM info appears immediately when opening Models tab (pre-computed at startup)
- No terminal flash: GPU detection no longer spawns visible console windows on Windows
- Faster model scan: local model detection no longer blocks other IPC calls
UI & Window
- Frameless window: native title bar removed — custom controls integrated into the UI
- Window controls on all pages: minimize/maximize/close visible on Chat, Models and Settings tabs
- Dialog titles visible: fixed black text on dark background in all dialogs
- Delete spinner: circular animation when deleting a local model
- Scrollbar styling: thin themed scrollbars on Browse and Local Models tabs
- Empty state centered: No models installed message now correctly centered