feat(local-ai): sequential multi-model downloads + multimodal local runtime #48

Merged
senamakel merged 23 commits into tinyhumansai:main from senamakel:feat/local-llm-3 on Mar 28, 2026

Conversation

@senamakel
Member

Summary

  • Reworked local AI runtime to support capability-specific assets: chat, vision, embeddings, STT (whisper.cpp), and TTS (piper).
  • Added sequential core download pipeline so bootstrap pulls models one-by-one in deterministic order.
  • Added per-asset download/status APIs and surfaced them in the Settings Local Model panel with direct download actions.
  • Extended Tauri/core/frontend command surfaces for vision prompt, embeddings, STT, TTS, and asset status.
  • Added local-ai config fields for per-capability model IDs, preload flags, quantization preference, and speech model download URLs.
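
A rough sketch of that per-capability config surface, for illustration only; the field names are assumptions, not the actual `LocalAiConfig` definition:

```rust
// Hypothetical sketch of per-capability local AI config fields.
// Field names are illustrative; the real LocalAiConfig may differ.
#[derive(Debug, Clone, Default)]
pub struct LocalAiConfig {
    /// Legacy single-model field, kept as the chat fallback.
    pub model_id: Option<String>,
    /// Per-capability model IDs.
    pub chat_model_id: Option<String>,
    pub vision_model_id: Option<String>,
    pub embedding_model_id: Option<String>,
    /// Preload flags per capability.
    pub preload_chat: bool,
    pub preload_vision: bool,
    /// Quantization preference, e.g. "q4_k_m".
    pub quantization: Option<String>,
    /// Speech model download URLs (whisper.cpp / piper).
    pub stt_model_download_url: Option<String>,
    pub tts_model_download_url: Option<String>,
}
```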

Problem

  • Local model bootstrap previously focused on a single model/runtime path and did not provide a complete flow for vision/embedding/speech assets.
  • Settings lacked controls to trigger capability-specific downloads and monitor their state.
  • Competing bootstrap calls could create inconsistent progress behavior.

Solution

  • Implemented capability-aware local runtime orchestration in Rust core with explicit model/asset states.
  • Added openhuman.local_ai_download_asset and openhuman.local_ai_assets_status RPC methods and Tauri bindings.
  • Added a sequential downloader in core (chat -> vision -> embedding -> stt -> tts) with progress/warning updates; a sketch follows this list.
  • Wired the Settings Local Model panel to display capability cards and trigger per-capability downloads.
  • Kept the STT implementation on whisper.cpp (whisper-cli) and TTS on piper, with workspace model paths and configurable download URLs.
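
A minimal sketch of the sequential order mentioned above; `Capability` and `download_asset` here are hypothetical stand-ins, not the actual core API:

```rust
// Illustrative sketch: pull core assets one-by-one in a fixed, deterministic order.
#[derive(Debug, Clone, Copy)]
enum Capability { Chat, Vision, Embedding, Stt, Tts }

fn download_all(download_asset: impl Fn(Capability) -> Result<(), String>) {
    // Deterministic order: chat -> vision -> embedding -> stt -> tts.
    for cap in [Capability::Chat, Capability::Vision, Capability::Embedding,
                Capability::Stt, Capability::Tts] {
        match download_asset(cap) {
            Ok(()) => println!("downloaded {:?}", cap),
            // A failed asset surfaces as a warning rather than aborting the rest.
            Err(e) => eprintln!("warning: {:?} download failed: {}", cap, e),
        }
    }
}
```

Serializing the downloads this way avoids the competing-bootstrap behavior called out in the Problem section, at the cost of total download time.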

Testing

  • yarn -s compile
  • cargo check --manifest-path src-tauri/Cargo.toml
  • Other checks run (listed below)
  • Manual validation completed (listed below)

Other checks run:

  • yarn -s tsc --noEmit
  • pre-push hooks (prettier --check, eslint, tsc --noEmit)

Manual validation completed:

  • Verified core compiles with new local AI config/runtime fields.
  • Verified Settings panel compiles with capability controls and test actions.
  • Verified branch push with pre-push hooks active.

Impact

  • Desktop/runtime: local AI now supports multi-capability asset management and sequential downloads via core.
  • UI: the Settings Local Model panel now exposes per-capability status and download actions.
  • Performance: reduced competing bootstrap/download behavior by serializing the download flow.
  • Compatibility: legacy local_ai.model_id remains supported as chat fallback.
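
The compatibility rule can be sketched in one line (field names assumed, as elsewhere in this description):

```rust
/// Hypothetical chat-model resolution: prefer the capability-specific ID,
/// fall back to the legacy local_ai.model_id.
fn resolve_chat_model(chat_model_id: Option<&str>, legacy_model_id: Option<&str>) -> Option<String> {
    chat_model_id.or(legacy_model_id).map(str::to_owned)
}
```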

Breaking Changes

  • None

Related

  • Issue(s): N/A
  • Follow-up PR(s)/TODOs:
    • Add automatic binary bootstrap for whisper.cpp/piper where absent (models now auto-download; binaries are still expected on PATH or via env override).
    • Improve per-capability progress UX for speech asset downloads.

…tation

- Removed references to the TinyHumans memory client and its dependencies from Cargo.toml and Cargo.lock.
- Introduced a new local memory client using SQLite for persistent storage, including methods for storing, querying, and managing memory documents and chunks (a sketch follows this list).
- Updated memory management commands to work with the new local implementation, ensuring compatibility with existing functionality.
- Enhanced error handling and logging for memory operations, improving overall reliability and user feedback.
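
As an illustration of the SQLite-backed store described above, a minimal sketch using `rusqlite`; the schema and method names are assumptions, not the actual client:

```rust
// Hypothetical SQLite memory store; table layout and API are illustrative only.
use rusqlite::{Connection, Result};

struct LocalMemoryClient {
    conn: Connection,
}

impl LocalMemoryClient {
    fn open(path: &str) -> Result<Self> {
        let conn = Connection::open(path)?;
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memory_chunks (
                 id INTEGER PRIMARY KEY,
                 doc_id TEXT NOT NULL,
                 content TEXT NOT NULL
             )",
            [],
        )?;
        Ok(Self { conn })
    }

    fn store_chunk(&self, doc_id: &str, content: &str) -> Result<()> {
        self.conn.execute(
            "INSERT INTO memory_chunks (doc_id, content) VALUES (?1, ?2)",
            rusqlite::params![doc_id, content],
        )?;
        Ok(())
    }

    fn query_chunks(&self, doc_id: &str) -> Result<Vec<String>> {
        let mut stmt = self
            .conn
            .prepare("SELECT content FROM memory_chunks WHERE doc_id = ?1")?;
        let rows = stmt.query_map([doc_id], |row| row.get::<_, String>(0))?;
        rows.collect()
    }
}
```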
…tion

- Changed the default download URL to the new GGUF format for Qwen3-1.7B.
- Updated the default artifact name to reflect the new naming convention for the model.
- Introduced a new `LocalAiPromptParams` struct for handling prompt requests.
- Implemented the `prompt` method in the `LocalAiService` to process prompts with optional token limits and a no-think mode.
- Updated the Tauri command `openhuman_local_ai_prompt` to invoke the new prompt functionality (see the sketch after this list).
- Enhanced the Local Model panel to include a UI for testing custom prompts, allowing users to input prompts and view responses.
- Added error handling for prompt execution in the UI, improving user feedback during interactions.
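
A sketch of the prompt plumbing this commit describes; struct fields and the command signature are inferred from the bullets above, not copied from the code:

```rust
// Hypothetical shape of the prompt request and its Tauri command.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct LocalAiPromptParams {
    pub prompt: String,
    /// Optional cap on generated tokens.
    pub max_tokens: Option<u32>,
    /// Skip the model's "thinking" preamble when true.
    pub no_think: bool,
}

#[tauri::command]
async fn openhuman_local_ai_prompt(params: LocalAiPromptParams) -> Result<String, String> {
    // The real command dispatches to LocalAiService::prompt; this stub only
    // echoes the input so the sketch stays self-contained.
    let _ = (params.max_tokens, params.no_think);
    Ok(format!("(stub) prompt: {}", params.prompt))
}
```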
…e runtimes

- Added `backend_preference` field to `LocalAiConfig` for specifying preferred runtime.
- Implemented backend resolution logic to select between CPU, Metal, CUDA, and Vulkan based on user preference and feature availability (sketched after this list).
- Updated `LocalAiStatus` to include fields for tracking active backend and performance metrics.
- Introduced runtime backend enumeration and support functions to manage backend capabilities.
- Enhanced Cargo.toml files to include features for Metal, CUDA, and Vulkan support in both core and Tauri projects.
- Updated `LocalAiService` to include backend status tracking with `active_backend` and `backend_reason` fields.
- Improved inference methods to accept a `no_think` parameter, allowing for more flexible prompt processing.
- Enhanced latency and token metrics tracking during inference, providing better performance insights.
- Adjusted default token limits for various inference methods to optimize user experience.
- Updated inference methods in `LocalAiService` to include a `no_think` parameter for improved prompt processing.
- Added backend status tracking with `active_backend`, `backend_reason`, and performance metrics such as `last_latency_ms`, `prompt_toks_per_sec`, and `gen_toks_per_sec`.
- Enhanced the UI in `LocalModelPanel` to display backend status and performance metrics, improving user experience and transparency.
- Adjusted token limits for various inference methods to optimize functionality.
- Removed the `mistralrs` dependency and related configurations from `Cargo.toml` and `LocalAiConfig`.
- Changed default model provider from `mistralrs` to `ollama`, updating associated default values for model ID, download URL, and artifact name.
- Simplified backend handling in `LocalAiService`, ensuring consistent use of the `ollama` provider for model management and status reporting.
- Enhanced `LocalAiStatus` to reflect the new provider and model path format.
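
The backend resolution described above might look roughly like this sketch; the enum and fallback policy are assumptions based on this commit's bullets:

```rust
// Illustrative backend resolution between CPU, Metal, CUDA, and Vulkan.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RuntimeBackend { Cpu, Metal, Cuda, Vulkan }

fn resolve_backend(preference: Option<RuntimeBackend>) -> (RuntimeBackend, &'static str) {
    // Compile-time feature availability, mirroring the Cargo.toml feature flags.
    let available = |b: RuntimeBackend| match b {
        RuntimeBackend::Cpu => true,
        RuntimeBackend::Metal => cfg!(feature = "metal"),
        RuntimeBackend::Cuda => cfg!(feature = "cuda"),
        RuntimeBackend::Vulkan => cfg!(feature = "vulkan"),
    };
    match preference {
        // Honor the user's preference when the feature was compiled in.
        Some(b) if available(b) => (b, "user preference honored"),
        Some(_) => (RuntimeBackend::Cpu, "preferred backend unavailable; fell back to CPU"),
        None => (RuntimeBackend::Cpu, "no preference; defaulted to CPU"),
    }
}
```

The returned reason string corresponds to the `backend_reason` status field mentioned above.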
…ds and asset management

- Introduced new parameters and methods for vision prompting, embedding, transcription, and text-to-speech (TTS) functionalities in the Local AI service.
- Enhanced the LocalAiConfig structure to support additional model IDs and preload options for various capabilities.
- Updated the LocalModelPanel UI to facilitate user interaction with new AI features, including asset status and download triggers.
- Implemented backend commands for managing local AI assets and their statuses, improving overall functionality and user experience (a status sketch follows this list).
- Added methods for downloading STT and TTS models, including configuration for download URLs in LocalAiConfig.
- Enhanced the LocalAiService with a new `download_all_models` method to manage model downloads and status updates.
- Introduced error handling for model availability checks, marking the service as degraded if downloads fail.
- Updated dispatch logic to trigger full model downloads, improving the initialization process for local AI services.
- Implemented new commands for fetching recent vision summaries and flushing the vision queue in the accessibility module.
- Enhanced the AccessibilityEngine to manage vision state, including queue depth and last vision summary.
- Updated the AccessibilityPanel UI to display vision state and recent summaries, allowing users to trigger flush actions.
- Added Redux state management for vision-related data, including loading states and error handling.
- Expanded onboarding steps to include user consent for local model usage, improving privacy and resource management awareness.
- Introduced a new 'Channels' section in the MiniSidebar for navigating to messaging settings.
- Added a Messaging Channels panel in SettingsHome for configuring Telegram and Discord authentication modes.
- Implemented channel connection management in MessagingPanel, including connection status updates and error handling.
- Created a routing utility to resolve preferred authentication modes for messaging channels.
- Enhanced socket service to handle real-time updates for channel connection statuses.
- Added API service for managing channel connections, including connect and disconnect functionalities.
- Updated thread API to support outbound routing based on the selected messaging channel.
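
For the asset status commands mentioned earlier in this list, a plausible shape of the per-asset status surface; the states and capability keys are assumptions:

```rust
// Sketch of a per-asset status map, as openhuman.local_ai_assets_status
// might return it; all states and values here are illustrative.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum AssetState {
    Missing,
    Downloading { percent: u8 },
    Ready,
    Failed { reason: String },
}

fn assets_status() -> HashMap<&'static str, AssetState> {
    HashMap::from([
        ("chat", AssetState::Ready),
        ("vision", AssetState::Downloading { percent: 42 }),
        ("embedding", AssetState::Missing),
        ("stt", AssetState::Ready),
        ("tts", AssetState::Failed { reason: "download URL unreachable".into() }),
    ])
}
```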
…iption capabilities

- Changed the default vision model from `qwen2.5vl:3b` to `moondream:1.8b` in the local AI configuration.
- Implemented a new function `openhumanLocalAiTranscribeBytes` for transcribing audio from byte arrays, improving flexibility in audio input handling (sketched after this list).
- Enhanced the `Conversations` component to support voice recording and transcription, including state management for recording and playback.
- Added error handling and user feedback for audio transcription processes, improving overall user experience.
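
A sketch of a byte-array transcription entry point; this Rust command is a hypothetical counterpart to the `openhumanLocalAiTranscribeBytes` frontend binding, not the actual implementation:

```rust
// Hypothetical Tauri command accepting raw audio bytes for transcription.
#[tauri::command]
async fn openhuman_local_ai_transcribe_bytes(bytes: Vec<u8>) -> Result<String, String> {
    if bytes.is_empty() {
        return Err("empty audio buffer".into());
    }
    // The real implementation would hand the buffer to whisper-cli (e.g. via a
    // temp file); this stub reports the size to keep the sketch self-contained.
    Ok(format!("(stub) received {} bytes of audio", bytes.len()))
}
```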
…ccessibilityEngine

- Reformatted code in the AccessibilityEngine for better readability, including consistent indentation and line breaks.
- Enhanced the clarity of the `analyze_frame_with_vision` function signature by spreading parameters across multiple lines.
- Improved the readability of temporary path creation in the `capture_screen_image_ref` function.
…nhance accessibility features

- Added a new command to request specific accessibility permissions on macOS, including screen recording, accessibility, and input monitoring.
- Updated the AccessibilityPanel and onboarding steps to utilize the new permission request functionality, improving user experience and compliance with macOS requirements.
- Introduced a MemoryWorkspace component for managing memory documents, enhancing the intelligence features of the application.
- Refactored related Redux actions and state management to support the new permission handling and memory functionalities.
- Added synchronization summary text to the SkillsGrid and SkillCard components, providing users with insights on sync counts, local data size, and last sync time.
- Implemented a new function to derive skill sync summary text based on skill state and sync statistics (sketched after this list).
- Updated the SkillsGrid to display sync metrics in a new column, improving the visibility of synchronization status.
- Enhanced the SkillManager to manage sync statistics, including tracking sync durations and local data metrics.
- Refactored Redux state management to include persistent sync metrics for each skill, ensuring accurate reporting and user feedback.
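
The summary derivation might look roughly like this sketch; field names and formatting are illustrative only:

```rust
// Hypothetical skill sync summary line derived from persisted sync metrics.
fn sync_summary(sync_count: u64, local_bytes: u64, last_sync_secs_ago: Option<u64>) -> String {
    let size_mb = local_bytes as f64 / (1024.0 * 1024.0);
    match last_sync_secs_ago {
        Some(secs) => format!("{} syncs · {:.1} MB local · last sync {}s ago", sync_count, size_mb, secs),
        None => format!("{} syncs · {:.1} MB local · never synced", sync_count, size_mb),
    }
}
```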
- Added new commands for listing, updating, removing, running, and retrieving run history of cron jobs (sketched after this list).
- Introduced a dedicated CronJobsPanel for managing scheduled jobs, enhancing user interface for cron job configuration.
- Updated navigation components to include links to the new cron jobs settings.
- Enhanced the core server to support cron job operations, ensuring integration with existing functionality.
- Implemented error handling and user feedback for cron job actions, improving overall user experience.
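
A minimal sketch of the cron command surface described above; variants and dispatch behavior are assumptions for illustration:

```rust
// Illustrative cron job command enum and dispatcher; jobs are modeled as a
// simple id -> schedule map to keep the sketch self-contained.
use std::collections::HashMap;

#[derive(Debug)]
enum CronCommand {
    List,
    Update { id: String, schedule: String },
    Remove { id: String },
    Run { id: String },
    History { id: String },
}

fn dispatch(cmd: CronCommand, jobs: &mut HashMap<String, String>) -> Result<String, String> {
    match cmd {
        CronCommand::List => Ok(jobs.keys().cloned().collect::<Vec<_>>().join(", ")),
        CronCommand::Update { id, schedule } => {
            jobs.insert(id.clone(), schedule);
            Ok(format!("updated {}", id))
        }
        CronCommand::Remove { id } => jobs
            .remove(&id)
            .map(|_| format!("removed {}", id))
            .ok_or_else(|| format!("no such job: {}", id)),
        CronCommand::Run { id } => Ok(format!("ran {}", id)),
        CronCommand::History { id } => Ok(format!("history for {}", id)),
    }
}
```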
- Simplified error handling in cron job management functions by removing unnecessary line breaks.
- Enhanced the readability of the `dispatch` function in the core server by consolidating related code.
- Improved formatting in the AccessibilityEngine for better consistency and clarity in permission requests.
- Changed the image reference in README.md from JPG to PNG format for better compatibility.
- Removed the old JPG file and added the new PNG file to the documentation directory.
@senamakel senamakel marked this pull request as ready for review March 28, 2026 17:32
@senamakel senamakel merged commit b5037b0 into tinyhumansai:main Mar 28, 2026
1 check passed
@senamakel senamakel deleted the feat/local-llm-3 branch March 28, 2026 17:33