OpenHuman v0.57.44

@github-actions released this 17 Jun 23:06
Immutable release. Only release title and notes can be modified.

The Orchestration & Everyday-Flow Upgrade

This release brings a big leap in how OpenHuman runs—from smarter multi-agent orchestration and skills, to a cleaner two-panel app shell, to faster and quieter memory + inference pipelines. 🚀

Highlights

Intelligence & multi-agent orchestration 🧠

OpenHuman’s “intelligence layer” gets much more real: async sub-agents can now be steered while running and awaited cleanly, teammate agents can actually execute tasks end-to-end (with messaging captured in timelines), and durable workflow orchestration now has a first-class UI with an approval gate for high-cost runs. Worktree isolation is also wired through RPC + UI so complex runs can stay contained and observable.

(#3641, #3637, #3697, #3696) — Thank you @senamakel and @oxoxDev!

Skills, vision, and better agent grounding 🧰

Skills graduate to a catalog-driven run_skill flow with an isolated executor and mid-session refresh—making skill runs more reliable and less invasive to chat history. There’s also a new vision-v1 tier plus a dedicated vision sub-agent for image understanding, and tighter numeric grounding rules to keep numbers copied exactly from observed sources.

(#3722, #3699, #3678) — Thank you @sanil-23 and @sam!

Chat & background work UX polish 💬

Detached sub-agents now have a proper “Background tasks” panel and deliver results back into chat in an idle-gated, batched way, so you don’t get interrupted mid-turn. Conversations also get a key rendering fix: task insights now appear once, after the full response. Proactive messages are now limited to fresh threads, resume-on-reopen errors are classified, and affected threads are “de-poisoned” so they recover cleanly.

(#3735, #3726, #3713, #3714) — Thank you @sanil-23!
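
As a rough illustration of what “idle-gated, batched” delivery can look like, here is a minimal Python sketch. It is not OpenHuman’s implementation: the `BackgroundResultQueue` class and all of its method names are hypothetical, and it assumes “idle” simply means no chat turn is in progress.

```python
from dataclasses import dataclass, field


@dataclass
class BackgroundResultQueue:
    """Holds finished background-task results until the chat is idle.

    Hypothetical illustration; not OpenHuman's actual class or API.
    """

    pending: list[str] = field(default_factory=list)
    turn_in_progress: bool = False

    def on_task_finished(self, result: str) -> list[str]:
        """Queue a result; deliver right away only if no turn is running."""
        self.pending.append(result)
        return self.flush_if_idle()

    def on_turn_started(self) -> None:
        self.turn_in_progress = True

    def on_turn_finished(self) -> list[str]:
        """Mark the chat idle and deliver everything queued meanwhile."""
        self.turn_in_progress = False
        return self.flush_if_idle()

    def flush_if_idle(self) -> list[str]:
        """Return all queued results as one batch, or nothing mid-turn."""
        if self.turn_in_progress or not self.pending:
            return []
        batch, self.pending = self.pending, []
        return batch
```

Results that finish while a turn is running stay queued and arrive together once the turn ends; results that finish while the chat is idle are delivered immediately.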

A new two-panel desktop shell & settings experience 🧭

The desktop app’s layout continues its transformation: TwoPanelLayout becomes the backbone across Settings/Brain/Connections/Chat, settings panels are unified with shared primitives for consistent navigation, and the root shell now adopts a unified sidebar with Chat as the home experience. There’s also a dedicated Desktop Agent setup panel to make permissions, seamless mode, always-on, and wake-word setup feel centralized and clear.

(#3643, #3646, #3751, #3634) — Thank you @senamakel and @M3gA-Mind!

MCP, connectors, and provider setup—faster and more actionable 🔌

MCP servers are now usable end-to-end by the agent (including OAuth flows and an agent bridge), boot is faster thanks to concurrent server spawning, and 401s are surfaced as an actionable “needs sign-in” state. Claude Code becomes an opt-in provider with a macOS auth fix and OpenHuman memory over MCP, and Composio/memory-source reconciliation is significantly more efficient. Settings also gain a “Get API key” link for Kimi.

(#3648, #3429, #3719, #3647, #3432, #3685) — Thank you @sanil-23, @mysma-9403, and @Huyuochi!
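
The two boot-related changes above (concurrent spawning and the “needs sign-in” state) can be pictured with a small Python sketch. Everything here is assumed for illustration: `start_server`, `boot_all`, the server names, and the state strings are hypothetical, and the HTTP status is simulated rather than coming from a real MCP server.

```python
import asyncio


async def start_server(name: str, status: int, delay: float) -> tuple[str, str]:
    """Stand-in for spawning one MCP server and probing it once.

    `status` simulates the HTTP status of the first authenticated call.
    """
    await asyncio.sleep(delay)  # simulated startup latency
    if status == 401:
        return name, "needs_sign_in"  # actionable state, not a generic error
    if 200 <= status < 300:
        return name, "ready"
    return name, "error"


async def boot_all(servers: list[tuple[str, int, float]]) -> dict[str, str]:
    """Start every server concurrently; total time tracks the slowest one."""
    results = await asyncio.gather(*(start_server(*s) for s in servers))
    return dict(results)


states = asyncio.run(
    boot_all([("notes", 200, 0.02), ("calendar", 401, 0.01), ("search", 500, 0.01)])
)
```

With sequential startup the delays would add up; with `asyncio.gather` the servers start side by side, and a 401 maps to its own state so the UI can prompt for sign-in.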

Memory & retrieval that feels more “right” 🧠

Memory status reporting is smarter (only unrecoverable failures show as “error,” and the degraded latch clears on any completed extraction). There’s a new RPC to fully delete a memory source by exact source_id, and a new cover_window retrieval mode that powers a more relevant “last 24h” morning brief. Under the hood, extraction scoring avoids unnecessary cloning for lower overhead.

(#3630, #3631, #3664, #3700, #3436) — Thank you @sanil-23, @michaelk1957, and @mysma-9403!
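
One way to read the status rules above is as a small state machine. The Python sketch below is an assumption-laden illustration, not the shipped code: `MemoryStatus`, its methods, and the `"degraded"` and `"ok"` labels are hypothetical (only `"error"` is named in these notes), and it assumes an unrecoverable failure is not cleared by later extractions.

```python
from dataclasses import dataclass


@dataclass
class MemoryStatus:
    """Hypothetical status tracker; names are not OpenHuman's API."""

    degraded: bool = False       # latched after a recoverable failure
    unrecoverable: bool = False  # set only by failures that cannot be retried

    def on_extraction_failed(self, recoverable: bool) -> None:
        if recoverable:
            self.degraded = True
        else:
            self.unrecoverable = True

    def on_extraction_completed(self) -> None:
        # Any completed extraction clears the degraded latch.
        self.degraded = False

    @property
    def label(self) -> str:
        # Only unrecoverable failures are reported as "error".
        if self.unrecoverable:
            return "error"
        return "degraded" if self.degraded else "ok"
```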

Inference, embeddings, and observability hardening 🛡️

This range focuses heavily on making the app quieter, safer, and more resilient: OAuth requests now use a Codex-compatible client version; Responses API role/kind mismatches are fixed; extraction now caps max_tokens and correctly treats provider 402s as non-retryable to stop retry storms. Embeddings get multiple guardrails (refuse empty inputs, block chat-only custom endpoints with no embeddings API, and prevent LM Studio “no models loaded” floods). BYO provider 401/403 auth failures are demoted from Sentry, SQLite “disk full” noise is filtered, and model test failures are properly classified.

(#3414, #3626, #3617, #3670, #3629, #3688, #3689, #3672, #3676) — Thank you @Junwon, @oxoxDev, @CodeGhost21, @obchain, and @Zhang!
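
To show why treating a 402 as non-retryable matters, here is a minimal Python sketch of a retry decision plus a token cap. It is a sketch under stated assumptions: `should_retry`, `capped_max_tokens`, the cap value, the attempt limit, and every status code other than 402 are hypothetical choices, not values taken from OpenHuman.

```python
# Only 402 is named in the release notes; 401 and 403 are assumed here.
NON_RETRYABLE = {401, 402, 403}
MAX_EXTRACTION_TOKENS = 1024  # hypothetical cap; the real value is not stated


def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """Retry only transient failures, and only while attempts remain."""
    if status in NON_RETRYABLE:
        return False  # a payment or auth problem will not fix itself
    transient = status == 429 or 500 <= status < 600
    return transient and attempt < max_attempts


def capped_max_tokens(requested: int) -> int:
    """Clamp the requested completion budget to the extraction cap."""
    return min(requested, MAX_EXTRACTION_TOKENS)
```

If a 402 were classified as transient, every queued extraction would keep retrying against a provider that has already refused payment, which is the kind of storm the fix avoids.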

Meetings & voice in the real world 🗣️

Meetings level up in multiple directions: the Meet bot can now participate in-call (streaming, approvals, active vs listen-only), Recent Calls is populated correctly for the backend meet flow, and after-call summaries + context labels make threads immediately scannable. Calendar-triggered prompts and a Meeting Assistant settings UI help you join at the right time. Telegram bots also now accept inbound voice messages and transcribe them before dispatching to the agent.

(#3677, #3710, #3709, #3721, #3679) — Thank you @YellowSnnowmann and @ron Liu!

Security, stability, and test coverage ✅

A major hardening audit pass tightens core RPC auth, deep-link CSRF protections, inference resilience, and secrets handling—with regression tests included. Agent harness behavior now has robust Rust + browser E2E coverage (plus docs updates), the desktop Appium suite is repaired after the IA refactors, and observability regressions are explicitly tested for known transient shapes. There’s also a deps bump in the Postgres stack to clear recent RUSTSEC advisories.

(#3652, #3665, #3666, #3667, #3669, #3668, #3649, #3658, #3659, #3660, #3661, #3662, #3690) — Thank you @senamakel, @YellowSnnowmann, @Zhang, and @obchain!

Core plumbing & compatibility improvements ⚙️

Under-the-hood changes improve compatibility and reduce surprises: CDP scanners are migrated to an in-process transport (dropping the TCP DevTools port), Notion markdown fetch is now bounded-concurrent, managed-tool budget probing is cached behind a short TTL, sub-agents can automatically compact and summarize their context when it fills, scheduler routing is scoped to avoid misroutes for calendar/meet-link reads, legacy health RPC names now alias to health_snapshot, and unknown-method probe traffic no longer pages Sentry. Login troubleshooting also improves by surfacing a previously silent /auth/me store failure.

(#3636, #3433, #3431, #3694, #3731, #3743, #3744, #3734) — Thank you @oxoxDev and @sanil-23!
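
Caching a probe “behind a short TTL” can be sketched in a few lines of Python. This is only an illustration of the general technique: the `TtlCached` class, the integer probe result, and the injectable `clock` parameter are all hypothetical and not drawn from the OpenHuman codebase.

```python
import time
from typing import Callable, Optional


class TtlCached:
    """Caches one probe result and re-runs the probe after `ttl_seconds`."""

    def __init__(
        self,
        probe: Callable[[], int],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._value: Optional[int] = None
        self._expires_at = 0.0

    def get(self) -> int:
        """Return the cached value, refreshing it once the TTL has elapsed."""
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._probe()
            self._expires_at = now + self._ttl
        return self._value
```

Repeated calls inside the TTL window reuse the stored value, so an expensive budget probe runs at most once per window instead of once per tool call.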

Cloud/marketplace integration (opt-in) ☁️

An opt-in AgentBox adapter adds marketplace-style /run + job polling endpoints and can auto-register a GMI OpenAI-compatible provider at startup—kept fully off by default to avoid impacting desktop builds.

(#3651) — Thank you @CodeGhost21!

Additional highlights 🔗

A few more focused contributions round out this release with targeted fixes and improvements that are worth calling out. (#3622, #3632, #3698) — Thank you @YellowSnnowmann, @senamakel, and @oxoxDev!

New Contributors

Thanks for your first contributions—welcome to the OpenHuman crew!

  • Thank you @Huyuochi for adding a Kimi provider “Get API key” affiliate link to streamline setup. (#3685)
  • Thank you @ron Liu for enabling Telegram inbound voice messages with transcription support. (#3679)
  • Thank you @sam for tightening numeric grounding rules in agent prompts (with regression coverage). (#3678)
  • Thank you @Zhang for expanding observability regression coverage across multiple real-world transient/error shapes. (#3658, #3659, #3660, #3661, #3662, #3676)

Hope you’ll join the Discord too—there are exclusive roles and contributor rewards waiting.

Contributor Credits

Thank you to everyone who shipped this release: @CodeGhost21, @Huyuochi, @Junwon, @mega Mind, @michaelk1957, @mysma-9403, @obchain, @oxoxDev, @ron Liu, @sam, @sanil-23, @steven Enamakel, @steven Enamakel's Droid, @YellowSnnowmann, @Zhang, and @github-actions[bot].

Full Compare

v0.57.40...v0.57.44