Skip to content

Releases: bloom500/feral

Feral v0.2.3

14 Jun 21:08

Choose a tag to compare

Released 2026-06-14 — Windows, macOS (Apple Silicon + Intel), and Linux.

Added

  • GPU acceleration. Feral now ships a GPU backend on every platform —
    Vulkan on Windows and Linux, Metal on macOS — and offloads the whole model
    to the GPU by default. Local models that previously ran CPU-only (slow,
    sometimes "not responding" for minutes) now use the graphics card. If the GPU
    can't be used — missing or old driver, no Vulkan runtime, or not enough VRAM
    for the model's context — Feral automatically falls back to CPU so the model
    still loads instead of failing.
  • Desktop control (opt-in). The agent can now drive native applications
    through the OS accessibility tree — list windows, read controls, type, click,
    and send real keystrokes. Off by default; enable it under Settings → Agent,
    with a Safe mode (confirm every action) and a YOLO mode (no prompts). A hard
    denylist (password managers, system security dialogs, Feral itself) can never
    be controlled.
  • Configurable token budget. The agent's conversation budget is now
    unlimited by default — no more hitting "budget exhausted" mid-task. Optional
    caps (1M/5M/20M/50M) are available under Settings → Agent for cost control.
  • Live context ring. A live indicator of how full the model's context
    window is, so you can see when the conversation is approaching the limit.

Fixed

  • Loading a model no longer crashes the machine. Modern models advertise
    enormous training contexts (up to 256K), and the KV cache was sized to that
    full context and allocated up front — roughly 90 GB for a 4B model, which
    instantly exhausted memory (a kernel panic and reboot on macOS, a near-hang
    on Windows). The load-time context is now capped to a safe default (8192,
    raisable via FERAL_MAX_CONTEXT), clamped to what the model actually
    supports.

Feral v0.2.3-rc.4

14 Jun 20:19

Choose a tag to compare

Feral v0.2.3-rc.4 Pre-release
Pre-release
ci(release): use TheMrMilchmann/setup-msvc-dev (ilammer action was 404)

The previous attempt referenced ilammer/msvc-dev-cmd, which no longer exists
as a repository — GitHub failed every matrix job at "Set up job" (action
resolution happens for all steps regardless of their `if:`), so Linux and
macOS regressed too. Swap to TheMrMilchmann/setup-msvc-dev@v3.0.2 (confirmed
to exist) to load the MSVC environment for the Ninja-based Vulkan build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Feral v0.2.2

12 Jun 22:21

Choose a tag to compare

Released 2026-06-13 — Windows, macOS (Apple Silicon + Intel), and Linux.

Fixed

  • The agent no longer stops mid-reply when it runs out of tokens. Long
    writing sessions (reports, theses, articles) were cut off mid-sentence at
    the per-call token limit and silently presented as the final answer, forcing
    the user to type "continue" by hand. All inference providers now report why
    generation ended (finish_reason / done_reason / stop_reason), and on a
    mid-answer cutoff the agent automatically resumes exactly where it stopped —
    the reply streams on in the same bubble, up to 4 automatic continuations.
  • More malformed tool-call shapes are caught and retried. The v0.2.1
    detector matched name/tool keys only; real-world transcripts showed
    models emitting {"invoke name="write_file"> (a JSON/XML hybrid) and
    XML-style <invoke name=...>, which escaped detection — ending the turn
    mid-task with the garbage visible in chat. Both shapes are now scrubbed and
    trigger the corrective retry nudge.
  • ask_user validation errors teach the model the right shape. Every
    rejection now includes a complete valid example call, so a model that got
    the structure wrong can self-correct on retry instead of failing the same
    way twice and giving up.

Feral v0.2.1

12 Jun 21:03

Choose a tag to compare

Released 2026-06-12 — Windows, macOS (Apple Silicon + Intel), and Linux.

Fixed

  • Raw tool calls no longer leak into the chat. Two holes closed: the
    inference providers streamed the re-encoded <tool_call>{json}</tool_call>
    tag to the UI as visible tokens, and the frontend only suppressed tool-call
    text at the start of a streamed answer — prose followed by a mid-message
    tool call (e.g. <tool_call>{"name="memory_graph">) was displayed verbatim.
    Tool calls now travel only through the parsed content channel, and the
    streaming view cuts tool-call text anywhere it appears (including a partial
    opener at the end of the buffer), keeping the prose before it visible.
  • The agent no longer stops mid-task on a malformed tool call. When a
    model emitted a tool call with corrupted JSON, the parser scrubbed it but
    executed nothing, and the loop treated the turn as a final answer — the
    task silently died. The loop now detects the malformed attempt and feeds a
    corrective nudge back to the model so it can re-emit a valid call (bounded
    at 3 retries).
  • Parallel tool calls are no longer dropped. All providers (OpenAI-
    compatible, Ollama, Anthropic) re-encoded only the first native tool call
    of a turn; any additional calls were silently discarded, leaving the model
    waiting on results that never arrived. Every call in the turn is now
    executed.
  • ask_user works reliably with native function calling. The native tool
    schema declared questions as a bare array with no item structure, so
    models had to guess the nested {question, options:[{label}]} shape and
    most calls failed validation. Tools can now publish a full JSON Schema per
    parameter, and ask_user ships one — including option labels, descriptions,
    the recommended flag, and multi-select.

Feral v0.2.0

12 Jun 18:04

Choose a tag to compare

Released 2026-06-12 — Windows, macOS (Apple Silicon + Intel), and Linux.

⚠️ Migration notes from 0.1.x

  • Updater key rotation. The original update-signing key was exposed in the
    public git history and has been rotated. Upgrading from 0.1.7 or older
    requires either installing the transitional 0.1.8 release first (it
    carries the new verification key) or downloading the 0.2.0 installer
    manually one time. Full plan: docs/UPDATER_KEY_MIGRATION.md.
  • Onboarding, conversations, models, and BYOK keys carry over unchanged.

New

  • Vision — the agent can finally see your images. Pasted screenshots
    (Ctrl+V) and uploaded image files now reach the model as real pixel data
    (base64 data URLs), not just a [Image attached: …] filename note. Works on
    both inference paths: direct chat (BYOK cloud via OpenAI image_url content
    parts) and the Feral Agent sidecar (OpenAI-compatible, Ollama images, and
    Anthropic base64 blocks). Local llama.cpp GGUF models remain text-only and
    keep the filename note.
  • Memory that actually carries over. New conversations no longer start
    cold: the chat tab now injects a "[Memory context]" block (facts from the
    shared knowledge graph) into the system prompt on every send, and runs a
    background extraction pass after each completed turn that writes
    subject–predicate–object triples back into ~/.feral/memory-graph.json
    the same graph the agent sidecar maintains. The sidecar side recalls graph
    facts at every turn too, extracts from the very first assistant turn
    (previously only every 3rd — short chats never learned anything), and the
    previously-unregistered memory_ops / memory_graph tools are now live so
    "remember X" / "forget Y" take effect immediately.
  • Memory Graph page redesign. Cognee-inspired dark visualization: glowing
    neon nodes on a near-black canvas, degree-scaled node sizes, a filter rail
    with per-type counts, relation chips, free-text node search, a labels
    toggle, and a click-to-inspect detail card showing a node's connections.
  • MCP Extensions (native connector). Feral now consumes Model Context
    Protocol servers through the official rmcp Rust SDK, managed entirely in
    the Tauri host. New "Extensions" entry in the sidebar (under Skills) with an
    app-store style UI: curated catalog (File Access, Long-term Memory, GitHub,
    Web Search, Browser Automation, Deep Reasoning), one-click install, on/off
    toggle, "What can it do?" tool listing, and humanized errors. No transports,
    JSON-RPC, raw values, internal paths, or API keys ever reach the frontend;
    configs (including keys) live backend-side in ~/.feral/mcp.json.
  • Drag & drop + paste attachments — any file type. Files and images can
    be dropped onto the chat input, and screenshots paste straight from the
    clipboard (Ctrl+V). PDFs and Office documents (.docx, .pptx, .xlsx, .odt)
    are now parsed natively (new Rust extract_file_text command) so their
    text reaches the model; plain-text files of any extension work as before;
    and true binaries are attached as a path reference the agent can open with
    its file tools instead of becoming a dead "Unsupported format" chip. The
    agent path now receives every attachment too — previously files whose text
    couldn't be extracted were silently dropped and never reached the model.
  • macOS and Linux releases. The release pipeline now builds and signs
    installers for Windows (.msi/.exe), macOS Apple Silicon + Intel (.dmg),
    and Linux (.deb/.rpm) from a single tag, updater manifest included.
    The agent sidecar resolves its per-platform binary on all five targets.
  • Mascot redesign — the real Feral monster. The pixel companion is now
    the brand mascot itself: charcoal-black fluffy monster with orange horns,
    an orange face plate with big black eyes, tiny white fangs, and a round
    orange belly. 55 animated variants across all 22 states, composed from a
    single base sprite so the cast stays consistent (laptop typing, thought
    clouds and lightbulbs, magnifier searches, party hats, heart eyes, a real
    side-run cycle, dissolve-in spawning, and more). It also renders 26%
    larger at a crisp integer 3× pixel scale, so every state, variant, and
    effect is clearly readable without crowding the chat input. A procedural
    pixel-effects layer plays around it per state: confetti on celebrate,
    rising hearts, drifting Z's while sleeping, thought dots, an orbiting
    search ring, a drawn-in green check on done, flashing error cross, work
    sparks during tool calls.
  • Sonnet-style agent voice. SOUL.md rewritten (super friendly + ultra
    useful), plus new bundled IDENTITY.md and AGENTS.md companions — each
    user-overridable at ~/.feral/<NAME>.md — composed into the system prompt.
  • Contributor guide. docs/CONTRIBUTOR_GUIDE.md: three-runtime
    architecture, IPC protocols, test matrix, build & release flow.

Stability

  • macOS/Linux: cloud model selection works in agent mode. The agent
    sidecar binary is now resolved next to the main executable
    (Contents/MacOS/feral-agent in the .app bundle, /usr/bin/feral-agent
    on deb/rpm installs) where Tauri actually places it — previously only the
    resource directory was checked, so the sidecar was silently dead on
    macOS/Linux production installs and picking a cloud (BYOK) model from the
    model selector did nothing. Model-switch failures (sidecar offline,
    provider disabled, missing key) now surface as error notifications
    instead of vanishing silently.

  • Agent stop actually stops (this release). The Stop button's signal now
    travels the whole chain: new feral_stop_generation Tauri command → stop
    message over sidecar stdin → AgentLoop.stop(sessionId) aborts the
    in-flight inference fetch and any running tool. Previously the frontend
    called a Rust command that didn't exist, so agent generations were
    unstoppable.

  • Agent tasks survive chat/tab switches. A live per-session mirror
    (lib/feralLiveSession.ts) accumulates streamed content, tool bubbles, and
    agent phase even while another chat is on screen; re-opening the in-flight
    conversation rehydrates all of it instead of showing the stale disk
    snapshot (which made tasks look reset during long tool runs).

  • Tool-call bubbles behave. The mascot's tool-call stack now grows upward
    from above the mascot (it could previously extend down over the typing
    bar), and each finished bubble fades out on its own after 4s instead of
    piling up until the turn ended.

  • Unified stream stop (A2). One stop entry point (lib/streamControl.ts)
    routes Stop to whichever path (chat backend / agent sidecar) actually owns
    the in-flight generation. Previously the Stop button in Agent mode told the
    idle chat backend to stop while the sidecar kept generating. Feral streams
    are stoppable per-session, and a new send interrupts the previous in-flight
    stream on both paths.

  • Sidecar supervision (#11). The Tauri shell now watches the agent sidecar
    process, restarts it with backoff on crashes, and shows an "agent offline /
    restarting" banner. Before: a sidecar crash made Agent mode silently mute.

  • GGUF chat template (A4). The prompt format is now read from the model's
    own GGUF metadata (tokenizer.chat_template) via llama.cpp's template
    engine; the filename-based guess is only a fallback. A renamed GGUF no
    longer gets a wrong template and corrupted output.

  • Real tokenizer endpoint (P3). New /tokenize route on the local API
    backed by the loaded model's actual vocabulary; the agent's context
    accounting no longer relies on GPT-2 BPE guesses, and token counts are
    cached per message text.

  • Idle-timeout transparency (#13). A stream that stalls for 300s is now
    reported as an explained error ("model stopped responding…") instead of
    silently ending in a fake "stopped by user" state.

  • No more mid-reasoning dead ends. When a completion exhausts its
    per-call token limit while the model is still thinking, the agent loop now
    feeds the partial reasoning back and asks the model to finish — up to 4
    automatic continuations — instead of surfacing "(The model used all
    available tokens on reasoning and produced no answer)". Tasks complete
    end-to-end regardless of the Max Tokens slider.

Error handling

  • Root React ErrorBoundary (#9). A render exception now shows a recovery
    screen (try again / reload) instead of a blank window.
  • Humanized inference errors (#10). Raw provider errors (401/429/network/
    context overflow…) are mapped to plain-English messages with a fix-it
    action (e.g. "key rejected → Open Cloud Keys"); the raw error stays
    available under "Technical details". Local-engine failures no longer leak
    [Error: …] text into the chat transcript.
  • Cron visibility (X3). Scheduled jobs now run through the full agent
    loop (tools, memory, budgets) instead of a bare LLM call, and their results
    and failures surface as toasts. Failed runs emit a cron_error event.

First-run experience

  • Zero-models flow (#14). After the first model download finishes, it is
    loaded automatically (when nothing else is loaded), taking a fresh install
    from "empty app" to "ready to chat" without a manual Load click.
  • Hardware-aware recommendation (#15). The onboarding wizard's final step
    now reads your RAM/VRAM/Vulkan detection and recommends a concrete model
    size + quantization to download.
  • Slow-start indicator (#16). While a model loads (with %) or a long
    prompt prefills, the chat shows an explanatory status instead of silent
    dots.
  • Onboarding sequencing (#17). The agent-creation flow now actually
    mounts (it was unreachable) and never stacks on top of the first-run
    wizard.

UI

  • Tool-call bubbles (#18) are interactive: finished calls expand to show
    the tool's output (or error), and long-running tools show live
    retry/progress notes.
  • **Empty states (#19...
Read more

Feral v0.1.6

06 Jun 12:43

Choose a tag to compare

Skills

  • Claude Code-style skill injection. The system prompt now carries only a short "Available skills" menu (id, name, description) instead of dumping every installed skill's full body. The new read_skill tool lets the LLM load the full SKILL.md body of any installed skill on demand. Skills are refreshed every turn from Rust, so installing or removing a skill mid-conversation shows up in the next message without resetting the session.

Agent

  • Helpful message on empty thinking completions. When a model is cut off mid-reasoning and produces no visible answer, the chat now shows a clear message (distinguishing "model used all tokens on reasoning" from a true empty response) instead of a silent empty bubble.
  • selectLocalAgent routes through the model store. In Agent mode, picking a local model from the chat header now goes through the store's load() method, so the ModelPill renders a real progress bar (with role="progressbar") instead of bare text.

Feral v0.1.5

06 Jun 10:21

Choose a tag to compare

Mascot

  • 8-bit animated mascot. A 16×16 pixel-art fluffy black monster with orange horns, big eyes, and two fangs now lives permanently on the typing bar. Reacts to what you're doing: blinks while idle, looks down while you type, eyes dart side-to-side while thinking, scans down while calling a tool, hops happily when the model finishes.
  • Idle boredom run. After 18 seconds of inactivity the mascot gets bored, switches to a side-profile silhouette, and runs across the full width of the typing bar — leaving small pixel dust puffs in its wake — then flips around and runs back. Any activity (typing, streaming) snaps it straight back to the perch.
  • Reduced-motion support. All canvas animations respect prefers-reduced-motion.

Agent

  • Token cap removed. No more daily or per-conversation token budget. Feral Agent runs on BYOK (user pays own provider), so capping was pointless and caused agent sessions to silently stall. Budget can be re-enabled via FERAL_BUDGET_DAY / FERAL_BUDGET_CONVERSATION env vars if needed.
  • CI sidecar fix. Release builds now compile the Feral Agent sidecar from the vendored FeralAgent/ directory in the monorepo instead of cloning an outdated external repository. Eliminates a class of "agent not responding in production release" bugs.

UI fixes

  • Real app version in sidebar. The version badge now reads from the Tauri API instead of the previously hardcoded v0.1.3 string.
  • Context ring in agent mode. The ring no longer stays stale when using the agent. It now shows a comet-arc activity indicator during agent streaming (the sidecar doesn't emit live token counts, so the spinning arc is the honest signal).

Feral v0.1.4

05 Jun 23:13

Choose a tag to compare

Agent mode

  • Native Feral Agent runtime. Agents now run on a built-in Feral Agent sidecar (Bun/TS) wired directly into the chat stream — no external gateway process. A Chat/Agent toggle in the composer switches modes and auto-loads the selected local model into the agent engine.
  • DeepResearch & adaptive reasoning. Dynamic max-iteration budgets for deep-research and complex tasks, model-fitness scoring, error-correcting control loop, and persistent agent memory.
  • Sturdier tool calls. Parser now handles Gemma-style <tool_call>, bracket and bare-JSON formats, and the arguments key; adds silent tool calls and an empty-response fallback; raises token budgets for thinking models.

Chat & UI

  • Live context ring. Real token usage straight from the model — exact prompt tokens from llama.cpp locally, real usage stats from cloud providers — with a hover popover showing tokens used, free space and message count, and the model's true context window instead of an estimate.
  • Streaming polish. Words fade in one-by-one as tokens stream, and a phase indicator shows Thinking / Calling tool / Processing.
  • Thinking blocks. Support for multiple formats (<think>, <thinking>, <|channel>), a thinking timer, and blocks that now persist across chat and tab switches.
  • Response resilience. Partial responses are always persisted, and responses survive navigating away and back, tab switches, and hot-swapping the active model.

Stability & performance

  • Fix a GGML_ASSERT crash on long agent prompts by chunking the prefill batch.
  • Memoize message rendering so streaming no longer re-parses already-completed messages.
  • Warning-free cargo clippy on the inference build, repo hygiene (.gitattributes, corrected .gitignore), and unist-util-visit pinned as a direct dependency.

Feral v0.1.3

01 Jun 07:29

Choose a tag to compare

What's new in v0.1.3

SkillHub — New Feature

The biggest addition in this release. Skills extend what your AI can do — think of them as plugins for your local models.

  • Installed tab: see, preview, and remove every skill you have installed
  • Discover tab: browse remote skills from the Feral manifest, preview content before installing
  • Community tab: 27 community skills from ClawHub and GitHub contributors, browsable directly in the app — Skill Vetter, Self-Improving Agent, GitHub CLI, Google Workspace, Multi Search Engine, Obsidian, and more
  • Import tab: install any skill by URL or local file path, with a live content preview before committing
  • Install / remove flows: one-click install and remove with error feedback
  • Click-outside to close: drawer dismisses naturally without needing a close button

Auto-updater

  • Version number now shown correctly in Settings → General and the sidebar footer
  • Update-available toast appears automatically when a new release is out
  • "Check for updates" button in Settings gives instant feedback

CI / Release

  • Release workflow now builds, signs, and publishes installer artifacts automatically on every version tag
  • latest.json and .sig files are uploaded automatically — auto-update works for all users from this release forward

Full changelog: v0.1.2...v0.1.3

Feral v0.1.2

31 May 11:58

Choose a tag to compare

What's new in v0.1.2

Models Tab — Complete Overhaul

  • Browse HuggingFace redesigned: each card shows a smart Download pill pre-set to your hardware's ideal quant (e.g. Q4_K_M for 8 GB VRAM)
  • One-click download: clicking the pill starts the download immediately — no need to open variants first
  • Per-variant download buttons: expand variants to see individual Download buttons next to each quantization
  • Compatibility badges: Fits / May be slow / Won't fit indicators with hover popover showing details per variant
  • Smarter recommendations: full-precision (F32/F16/BF16) and split shards deprioritized automatically
  • Download sidebar popover: click the download icon in the sidebar to see active download progress, cancel, and status

Performance

  • System info instant: GPU/RAM info appears immediately when opening Models tab (pre-computed at startup)
  • No terminal flash: GPU detection no longer spawns visible console windows on Windows
  • Faster model scan: local model detection no longer blocks other IPC calls

UI & Window

  • Frameless window: native title bar removed — custom controls integrated into the UI
  • Window controls on all pages: minimize/maximize/close visible on Chat, Models and Settings tabs
  • Dialog titles visible: fixed black text on dark background in all dialogs
  • Delete spinner: circular animation when deleting a local model
  • Scrollbar styling: thin themed scrollbars on Browse and Local Models tabs
  • Empty state centered: No models installed message now correctly centered