feat(local-ai): sequential multi-model downloads + multimodal local runtime #48

Merged
senamakel merged 23 commits into tinyhumansai:main from senamakel:feat/local-llm-3 on Mar 28, 2026

Conversation

@senamakel
Member

Summary

  • Reworked local AI runtime to support capability-specific assets: chat, vision, embeddings, STT (whisper.cpp), and TTS (piper).
  • Added sequential core download pipeline so bootstrap pulls models one-by-one in deterministic order.
  • Added per-asset download/status APIs and surfaced them in the Settings Local Model panel with direct download actions.
  • Extended Tauri/core/frontend command surfaces for vision prompt, embeddings, STT, TTS, and asset status.
  • Added local-ai config fields for per-capability model IDs, preload flags, quantization preference, and speech model download URLs.
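
A rough sketch of that per-capability config surface, for illustration only; the field names are assumptions, not the actual `LocalAiConfig` definition:

```rust
// Hypothetical sketch of per-capability local AI config fields.
// Field names are illustrative; the real LocalAiConfig may differ.
#[derive(Debug, Clone, Default)]
pub struct LocalAiConfig {
    /// Legacy single-model field, kept as the chat fallback.
    pub model_id: Option<String>,
    /// Per-capability model IDs.
    pub chat_model_id: Option<String>,
    pub vision_model_id: Option<String>,
    pub embedding_model_id: Option<String>,
    /// Preload flags per capability.
    pub preload_chat: bool,
    pub preload_vision: bool,
    /// Quantization preference, e.g. "q4_k_m".
    pub quantization: Option<String>,
    /// Speech model download URLs (whisper.cpp / piper).
    pub stt_model_download_url: Option<String>,
    pub tts_model_download_url: Option<String>,
}
```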

Problem

  • Local model bootstrap previously focused on a single model/runtime path and did not provide a complete flow for vision/embedding/speech assets.
  • Settings lacked controls to trigger capability-specific downloads and monitor their state.
  • Competing bootstrap calls could create inconsistent progress behavior.

Solution

  • Implemented capability-aware local runtime orchestration in Rust core with explicit model/asset states.
  • Added openhuman.local_ai_download_asset and openhuman.local_ai_assets_status RPC methods and Tauri bindings.
  • Added a sequential downloader in core (chat -> vision -> embedding -> stt -> tts) with progress/warning updates; a sketch follows this list.
  • Wired the Settings Local Model panel to display capability cards and trigger per-capability downloads.
  • Kept the STT implementation on whisper.cpp (whisper-cli) and TTS on piper, with workspace model paths and configurable download URLs.
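
A minimal sketch of the sequential order mentioned above; `Capability` and `download_asset` here are hypothetical stand-ins, not the actual core API:

```rust
// Illustrative sketch: pull core assets one-by-one in a fixed, deterministic order.
#[derive(Debug, Clone, Copy)]
enum Capability { Chat, Vision, Embedding, Stt, Tts }

fn download_all(download_asset: impl Fn(Capability) -> Result<(), String>) {
    // Deterministic order: chat -> vision -> embedding -> stt -> tts.
    for cap in [Capability::Chat, Capability::Vision, Capability::Embedding,
                Capability::Stt, Capability::Tts] {
        match download_asset(cap) {
            Ok(()) => println!("downloaded {:?}", cap),
            // A failed asset surfaces as a warning rather than aborting the rest.
            Err(e) => eprintln!("warning: {:?} download failed: {}", cap, e),
        }
    }
}
```

Serializing the downloads this way avoids the competing-bootstrap behavior called out in the Problem section, at the cost of total download time.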

Testing

  • yarn -s compile
  • cargo check --manifest-path src-tauri/Cargo.toml
  • Other checks run (listed below)
  • Manual validation completed (listed below)

Other checks run:

  • yarn -s tsc --noEmit
  • pre-push hooks (prettier --check, eslint, tsc --noEmit)

Manual validation completed:

  • Verified core compiles with new local AI config/runtime fields.
  • Verified Settings panel compiles with capability controls and test actions.
  • Verified branch push with pre-push hooks active.

Impact

  • Desktop/runtime: local AI now supports multi-capability asset management and sequential downloads via core.
  • UI: the Settings Local Model panel now exposes per-capability status and download actions.
  • Performance: reduced competing bootstrap/download behavior by serializing the download flow.
  • Compatibility: legacy local_ai.model_id remains supported as chat fallback.
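
The compatibility rule can be sketched in one line (field names assumed, as elsewhere in this description):

```rust
/// Hypothetical chat-model resolution: prefer the capability-specific ID,
/// fall back to the legacy local_ai.model_id.
fn resolve_chat_model(chat_model_id: Option<&str>, legacy_model_id: Option<&str>) -> Option<String> {
    chat_model_id.or(legacy_model_id).map(str::to_owned)
}
```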

Breaking Changes

  • None

Related

  • Issue(s): N/A
  • Follow-up PR(s)/TODOs:
    • Add automatic binary bootstrap for whisper.cpp/piper where absent (models now auto-download; binaries are still expected on PATH or via env override).
    • Improve per-capability progress UX for speech asset downloads.

…tation

- Removed references to the TinyHumans memory client and its dependencies from Cargo.toml and Cargo.lock.
- Introduced a new local memory client using SQLite for persistent storage, including methods for storing, querying, and managing memory documents and chunks (a sketch follows this list).
- Updated memory management commands to work with the new local implementation, ensuring compatibility with existing functionality.
- Enhanced error handling and logging for memory operations, improving overall reliability and user feedback.
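
As an illustration of the SQLite-backed store described above, a minimal sketch using `rusqlite`; the schema and method names are assumptions, not the actual client:

```rust
// Hypothetical SQLite memory store; table layout and API are illustrative only.
use rusqlite::{Connection, Result};

struct LocalMemoryClient {
    conn: Connection,
}

impl LocalMemoryClient {
    fn open(path: &str) -> Result<Self> {
        let conn = Connection::open(path)?;
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memory_chunks (
                 id INTEGER PRIMARY KEY,
                 doc_id TEXT NOT NULL,
                 content TEXT NOT NULL
             )",
            [],
        )?;
        Ok(Self { conn })
    }

    fn store_chunk(&self, doc_id: &str, content: &str) -> Result<()> {
        self.conn.execute(
            "INSERT INTO memory_chunks (doc_id, content) VALUES (?1, ?2)",
            rusqlite::params![doc_id, content],
        )?;
        Ok(())
    }

    fn query_chunks(&self, doc_id: &str) -> Result<Vec<String>> {
        let mut stmt = self
            .conn
            .prepare("SELECT content FROM memory_chunks WHERE doc_id = ?1")?;
        let rows = stmt.query_map([doc_id], |row| row.get::<_, String>(0))?;
        rows.collect()
    }
}
```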
…tion

- Changed the default download URL to the new GGUF format for Qwen3-1.7B.
- Updated the default artifact name to reflect the new naming convention for the model.
- Introduced a new `LocalAiPromptParams` struct for handling prompt requests.
- Implemented the `prompt` method in the `LocalAiService` to process prompts with optional token limits and a no-think mode.
- Updated the Tauri command `openhuman_local_ai_prompt` to invoke the new prompt functionality (see the sketch after this list).
- Enhanced the Local Model panel to include a UI for testing custom prompts, allowing users to input prompts and view responses.
- Added error handling for prompt execution in the UI, improving user feedback during interactions.
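
A sketch of the prompt plumbing this commit describes; struct fields and the command signature are inferred from the bullets above, not copied from the code:

```rust
// Hypothetical shape of the prompt request and its Tauri command.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct LocalAiPromptParams {
    pub prompt: String,
    /// Optional cap on generated tokens.
    pub max_tokens: Option<u32>,
    /// Skip the model's "thinking" preamble when true.
    pub no_think: bool,
}

#[tauri::command]
async fn openhuman_local_ai_prompt(params: LocalAiPromptParams) -> Result<String, String> {
    // The real command dispatches to LocalAiService::prompt; this stub only
    // echoes the input so the sketch stays self-contained.
    let _ = (params.max_tokens, params.no_think);
    Ok(format!("(stub) prompt: {}", params.prompt))
}
```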
…e runtimes

- Added `backend_preference` field to `LocalAiConfig` for specifying preferred runtime.
- Implemented backend resolution logic to select between CPU, Metal, CUDA, and Vulkan based on user preference and feature availability (sketched after this list).
- Updated `LocalAiStatus` to include fields for tracking active backend and performance metrics.
- Introduced runtime backend enumeration and support functions to manage backend capabilities.
- Enhanced Cargo.toml files to include features for Metal, CUDA, and Vulkan support in both core and Tauri projects.
- Updated `LocalAiService` to include backend status tracking with `active_backend` and `backend_reason` fields.
- Improved inference methods to accept a `no_think` parameter, allowing for more flexible prompt processing.
- Enhanced latency and token metrics tracking during inference, providing better performance insights.
- Adjusted default token limits for various inference methods to optimize user experience.
- Updated inference methods in `LocalAiService` to include a `no_think` parameter for improved prompt processing.
- Added backend status tracking with `active_backend`, `backend_reason`, and performance metrics such as `last_latency_ms`, `prompt_toks_per_sec`, and `gen_toks_per_sec`.
- Enhanced the UI in `LocalModelPanel` to display backend status and performance metrics, improving user experience and transparency.
- Adjusted token limits for various inference methods to optimize functionality.
- Removed the `mistralrs` dependency and related configurations from `Cargo.toml` and `LocalAiConfig`.
- Changed default model provider from `mistralrs` to `ollama`, updating associated default values for model ID, download URL, and artifact name.
- Simplified backend handling in `LocalAiService`, ensuring consistent use of the `ollama` provider for model management and status reporting.
- Enhanced `LocalAiStatus` to reflect the new provider and model path format.
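
The backend resolution described above might look roughly like this sketch; the enum and fallback policy are assumptions based on this commit's bullets:

```rust
// Illustrative backend resolution between CPU, Metal, CUDA, and Vulkan.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RuntimeBackend { Cpu, Metal, Cuda, Vulkan }

fn resolve_backend(preference: Option<RuntimeBackend>) -> (RuntimeBackend, &'static str) {
    // Compile-time feature availability, mirroring the Cargo.toml feature flags.
    let available = |b: RuntimeBackend| match b {
        RuntimeBackend::Cpu => true,
        RuntimeBackend::Metal => cfg!(feature = "metal"),
        RuntimeBackend::Cuda => cfg!(feature = "cuda"),
        RuntimeBackend::Vulkan => cfg!(feature = "vulkan"),
    };
    match preference {
        // Honor the user's preference when the feature was compiled in.
        Some(b) if available(b) => (b, "user preference honored"),
        Some(_) => (RuntimeBackend::Cpu, "preferred backend unavailable; fell back to CPU"),
        None => (RuntimeBackend::Cpu, "no preference; defaulted to CPU"),
    }
}
```

The returned reason string corresponds to the `backend_reason` status field mentioned above.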
…ds and asset management

- Introduced new parameters and methods for vision prompting, embedding, transcription, and text-to-speech (TTS) functionalities in the Local AI service.
- Enhanced the LocalAiConfig structure to support additional model IDs and preload options for various capabilities.
- Updated the LocalModelPanel UI to facilitate user interaction with new AI features, including asset status and download triggers.
- Implemented backend commands for managing local AI assets and their statuses, improving overall functionality and user experience (a status sketch follows this list).
- Added methods for downloading STT and TTS models, including configuration for download URLs in LocalAiConfig.
- Enhanced the LocalAiService with a new `download_all_models` method to manage model downloads and status updates.
- Introduced error handling for model availability checks, marking the service as degraded if downloads fail.
- Updated dispatch logic to trigger full model downloads, improving the initialization process for local AI services.
- Implemented new commands for fetching recent vision summaries and flushing the vision queue in the accessibility module.
- Enhanced the AccessibilityEngine to manage vision state, including queue depth and last vision summary.
- Updated the AccessibilityPanel UI to display vision state and recent summaries, allowing users to trigger flush actions.
- Added Redux state management for vision-related data, including loading states and error handling.
- Expanded onboarding steps to include user consent for local model usage, improving privacy and resource management awareness.
- Introduced a new 'Channels' section in the MiniSidebar for navigating to messaging settings.
- Added a Messaging Channels panel in SettingsHome for configuring Telegram and Discord authentication modes.
- Implemented channel connection management in MessagingPanel, including connection status updates and error handling.
- Created a routing utility to resolve preferred authentication modes for messaging channels.
- Enhanced socket service to handle real-time updates for channel connection statuses.
- Added API service for managing channel connections, including connect and disconnect functionalities.
- Updated thread API to support outbound routing based on the selected messaging channel.
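
For the asset status commands mentioned earlier in this list, a plausible shape of the per-asset status surface; the states and capability keys are assumptions:

```rust
// Sketch of a per-asset status map, as openhuman.local_ai_assets_status
// might return it; all states and values here are illustrative.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum AssetState {
    Missing,
    Downloading { percent: u8 },
    Ready,
    Failed { reason: String },
}

fn assets_status() -> HashMap<&'static str, AssetState> {
    HashMap::from([
        ("chat", AssetState::Ready),
        ("vision", AssetState::Downloading { percent: 42 }),
        ("embedding", AssetState::Missing),
        ("stt", AssetState::Ready),
        ("tts", AssetState::Failed { reason: "download URL unreachable".into() }),
    ])
}
```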
…iption capabilities

- Changed the default vision model from `qwen2.5vl:3b` to `moondream:1.8b` in the local AI configuration.
- Implemented a new function `openhumanLocalAiTranscribeBytes` for transcribing audio from byte arrays, improving flexibility in audio input handling (sketched after this list).
- Enhanced the `Conversations` component to support voice recording and transcription, including state management for recording and playback.
- Added error handling and user feedback for audio transcription processes, improving overall user experience.
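
A sketch of a byte-array transcription entry point; this Rust command is a hypothetical counterpart to the `openhumanLocalAiTranscribeBytes` frontend binding, not the actual implementation:

```rust
// Hypothetical Tauri command accepting raw audio bytes for transcription.
#[tauri::command]
async fn openhuman_local_ai_transcribe_bytes(bytes: Vec<u8>) -> Result<String, String> {
    if bytes.is_empty() {
        return Err("empty audio buffer".into());
    }
    // The real implementation would hand the buffer to whisper-cli (e.g. via a
    // temp file); this stub reports the size to keep the sketch self-contained.
    Ok(format!("(stub) received {} bytes of audio", bytes.len()))
}
```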
…ccessibilityEngine

- Reformatted code in the AccessibilityEngine for better readability, including consistent indentation and line breaks.
- Enhanced the clarity of the `analyze_frame_with_vision` function signature by spreading parameters across multiple lines.
- Improved the readability of temporary path creation in the `capture_screen_image_ref` function.
…nhance accessibility features

- Added a new command to request specific accessibility permissions on macOS, including screen recording, accessibility, and input monitoring.
- Updated the AccessibilityPanel and onboarding steps to utilize the new permission request functionality, improving user experience and compliance with macOS requirements.
- Introduced a MemoryWorkspace component for managing memory documents, enhancing the intelligence features of the application.
- Refactored related Redux actions and state management to support the new permission handling and memory functionalities.
- Added synchronization summary text to the SkillsGrid and SkillCard components, providing users with insights on sync counts, local data size, and last sync time.
- Implemented a new function to derive skill sync summary text based on skill state and sync statistics (sketched after this list).
- Updated the SkillsGrid to display sync metrics in a new column, improving the visibility of synchronization status.
- Enhanced the SkillManager to manage sync statistics, including tracking sync durations and local data metrics.
- Refactored Redux state management to include persistent sync metrics for each skill, ensuring accurate reporting and user feedback.
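
The summary derivation might look roughly like this sketch; field names and formatting are illustrative only:

```rust
// Hypothetical skill sync summary line derived from persisted sync metrics.
fn sync_summary(sync_count: u64, local_bytes: u64, last_sync_secs_ago: Option<u64>) -> String {
    let size_mb = local_bytes as f64 / (1024.0 * 1024.0);
    match last_sync_secs_ago {
        Some(secs) => format!("{} syncs · {:.1} MB local · last sync {}s ago", sync_count, size_mb, secs),
        None => format!("{} syncs · {:.1} MB local · never synced", sync_count, size_mb),
    }
}
```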
- Added new commands for listing, updating, removing, running, and retrieving run history of cron jobs (sketched after this list).
- Introduced a dedicated CronJobsPanel for managing scheduled jobs, enhancing user interface for cron job configuration.
- Updated navigation components to include links to the new cron jobs settings.
- Enhanced the core server to support cron job operations, ensuring integration with existing functionality.
- Implemented error handling and user feedback for cron job actions, improving overall user experience.
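
A minimal sketch of the cron command surface described above; variants and dispatch behavior are assumptions for illustration:

```rust
// Illustrative cron job command enum and dispatcher; jobs are modeled as a
// simple id -> schedule map to keep the sketch self-contained.
use std::collections::HashMap;

#[derive(Debug)]
enum CronCommand {
    List,
    Update { id: String, schedule: String },
    Remove { id: String },
    Run { id: String },
    History { id: String },
}

fn dispatch(cmd: CronCommand, jobs: &mut HashMap<String, String>) -> Result<String, String> {
    match cmd {
        CronCommand::List => Ok(jobs.keys().cloned().collect::<Vec<_>>().join(", ")),
        CronCommand::Update { id, schedule } => {
            jobs.insert(id.clone(), schedule);
            Ok(format!("updated {}", id))
        }
        CronCommand::Remove { id } => jobs
            .remove(&id)
            .map(|_| format!("removed {}", id))
            .ok_or_else(|| format!("no such job: {}", id)),
        CronCommand::Run { id } => Ok(format!("ran {}", id)),
        CronCommand::History { id } => Ok(format!("history for {}", id)),
    }
}
```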
- Simplified error handling in cron job management functions by removing unnecessary line breaks.
- Enhanced the readability of the `dispatch` function in the core server by consolidating related code.
- Improved formatting in the AccessibilityEngine for better consistency and clarity in permission requests.
- Changed the image reference in README.md from JPG to PNG format for better compatibility.
- Removed the old JPG file and added the new PNG file to the documentation directory.
@senamakel senamakel marked this pull request as ready for review March 28, 2026 17:32
@senamakel senamakel merged commit b5037b0 into tinyhumansai:main Mar 28, 2026
1 check passed
@senamakel senamakel deleted the feat/local-llm-3 branch March 28, 2026 17:33