Feral v0.1.4

github-actions released this 05 Jun 23:13

Agent mode

  • Native Feral Agent runtime. Agents now run on a built-in Feral Agent sidecar (Bun/TS) wired directly into the chat stream, with no external gateway process. A Chat/Agent toggle in the composer switches modes and automatically loads the selected local model into the agent engine.
  • DeepResearch & adaptive reasoning. Dynamic max-iteration budgets for deep-research and complex tasks, model-fitness scoring, an error-correcting control loop, and persistent agent memory.
  • Sturdier tool calls. The parser now handles Gemma-style <tool_call>, bracket, and bare-JSON formats, and accepts the arguments key. This release also adds silent tool calls and an empty-response fallback, and raises token budgets for thinking models.
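
To illustrate the kind of tolerant parsing the tool-call item describes, here is a minimal TypeScript sketch. It is not Feral's parser: the names ToolCall, toToolCall and parseToolCalls are hypothetical, "bracket format" is assumed here to mean a JSON array of calls, and accepting a parameters key alongside arguments is also an assumption.

```typescript
// Hypothetical shape for a normalized tool call; not Feral's actual type.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Parse JSON without throwing; returns undefined on invalid input.
function tryParseJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

// Turn one parsed JSON value into a ToolCall, or null if it does not look like one.
function toToolCall(value: unknown): ToolCall | null {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
  const obj = value as Record<string, unknown>;
  if (typeof obj.name !== "string") return null;
  // Assumption: accept either "arguments" or "parameters" as the argument key.
  let raw: unknown = obj.arguments ?? obj.parameters ?? {};
  // Some models emit the arguments as a JSON-encoded string; decode it if so.
  if (typeof raw === "string") raw = tryParseJson(raw) ?? {};
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) raw = {};
  return { name: obj.name, arguments: raw as Record<string, unknown> };
}

// Extract tool calls from model output in tagged, array, or bare-JSON form.
function parseToolCalls(output: string): ToolCall[] {
  const payloads: string[] = [];
  const tagged = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let match: RegExpExecArray | null;
  while ((match = tagged.exec(output)) !== null) {
    payloads.push((match[1] ?? "").trim());
  }
  // No tags found: treat the whole output as a candidate JSON payload.
  if (payloads.length === 0) payloads.push(output.trim());

  const calls: ToolCall[] = [];
  for (const payload of payloads) {
    const parsed = tryParseJson(payload);
    const items = Array.isArray(parsed) ? parsed : [parsed];
    for (const item of items) {
      const call = toToolCall(item);
      if (call !== null) calls.push(call);
    }
  }
  return calls;
}
```

Plain prose yields an empty array rather than an error, which leaves room for a caller to apply something like the empty-response fallback mentioned above.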

Chat & UI

  • Live context ring. Real token usage straight from the model: exact prompt tokens from llama.cpp locally and real usage stats from cloud providers. A hover popover shows tokens used, free space, and message count, and the ring uses the model's true context window instead of an estimate.
  • Streaming polish. Words fade in one by one as tokens stream, and a phase indicator shows Thinking / Calling tool / Processing.
  • Thinking blocks. Support for multiple formats (<think>, <thinking>, <|channel>), a thinking timer, and blocks that now persist across chat and tab switches.
  • Response resilience. Partial responses are always persisted, and responses survive navigating away and back, tab switches, and hot-swapping the active model.
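
The arithmetic behind a context ring like the one described above can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions: ContextUsage and contextUsage are hypothetical names, and the sketch assumes the backend reports prompt and completion token counts separately.

```typescript
// Hypothetical view model for a context ring popover; not Feral's actual type.
interface ContextUsage {
  used: number;         // tokens currently occupying the context window
  free: number;         // tokens still available
  fraction: number;     // used / contextWindow, in [0, 1]
  messageCount: number; // number of messages in the conversation
}

// Compute ring values from reported token counts and the model's context window.
function contextUsage(
  promptTokens: number,
  completionTokens: number,
  contextWindow: number,
  messageCount: number,
): ContextUsage {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  // Clamp so a slightly over-reported count never draws past a full ring.
  const used = Math.min(promptTokens + completionTokens, contextWindow);
  const free = contextWindow - used;
  return { used, free, fraction: used / contextWindow, messageCount };
}

// Example: 3000 prompt + 1096 completion tokens in an 8192-token window.
const usage = contextUsage(3000, 1096, 8192, 12);
console.log(usage.used, usage.free, usage.fraction); // 4096 4096 0.5
```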

Stability & performance

  • Fix a GGML_ASSERT crash on long agent prompts by chunking the prefill batch.
  • Memoize message rendering so streaming no longer re-parses already-completed messages.
  • Warning-free cargo clippy on the inference build, repo hygiene (.gitattributes, corrected .gitignore), and unist-util-visit pinned as a direct dependency.
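
The prefill fix above amounts to splitting a long prompt into pieces that each fit the backend's batch limit. The notes do not say which assertion fired or how the chunking is implemented, so the TypeScript below is a generic sketch: chunkTokens and batchSize are hypothetical names, and the real fix presumably lives in the inference build rather than in TS.

```typescript
// Split a token sequence into consecutive chunks of at most batchSize tokens,
// so each chunk can be submitted to the backend as its own prefill batch.
function chunkTokens(tokens: readonly number[], batchSize: number): number[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const chunks: number[][] = [];
  for (let start = 0; start < tokens.length; start += batchSize) {
    // slice clamps the end index, so the last chunk may be shorter.
    chunks.push(tokens.slice(start, start + batchSize));
  }
  return chunks;
}

// Example: five tokens with a batch limit of two.
console.log(chunkTokens([1, 2, 3, 4, 5], 2)); // [[1, 2], [3, 4], [5]]
```

Order is preserved across chunks, which matters because each prefill batch must be decoded in sequence for the positions to line up.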