feat(local-ai): sequential multi-model downloads + multimodal local runtime #48
Merged
Conversation
…tation
- Removed references to the TinyHumans memory client and its dependencies from Cargo.toml and Cargo.lock.
- Introduced a new local memory client using SQLite for persistent storage, including methods for storing, querying, and managing memory documents and chunks.
- Updated memory management commands to work with the new local implementation, ensuring compatibility with existing functionality.
- Enhanced error handling and logging for memory operations, improving overall reliability and user feedback.
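The memory client described above stores documents as chunks. The sketch below shows a minimal chunking step; the function name, the character-based chunk size, and the absence of overlap are assumptions, and the real client persists chunks to SQLite rather than returning them in memory.

```rust
// Minimal document-chunking sketch for a SQLite-backed memory client.
// `chunk_text` and `max_chars` are illustrative names, not the PR's API.
pub fn chunk_text(text: &str, max_chars: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(max_chars)           // split on character count, not bytes
        .map(|c| c.iter().collect()) // rebuild each chunk as an owned String
        .collect()
}

fn main() {
    for chunk in chunk_text("local memory stores documents as chunks", 16) {
        println!("{chunk:?}");
    }
}
```

Splitting on characters rather than bytes avoids panicking on multi-byte UTF-8 boundaries, which matters once user notes contain non-ASCII text.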
…tion
- Changed the default download URL to the new GGUF format for Qwen3-1.7B.
- Updated the default artifact name to reflect the new naming convention for the model.
- Introduced a new `LocalAiPromptParams` struct for handling prompt requests.
- Implemented the `prompt` method in the `LocalAiService` to process prompts with optional token limits and a no-think mode.
- Updated the Tauri command `openhuman_local_ai_prompt` to invoke the new prompt functionality.
- Enhanced the Local Model panel to include a UI for testing custom prompts, allowing users to input prompts and view responses.
- Added error handling for prompt execution in the UI, improving user feedback during interactions.
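The prompt path above can be sketched as a params struct plus per-request defaulting. `LocalAiPromptParams` and the `no_think` flag come from the PR; the `max_tokens` field name, the helper, and the default limit are assumptions for illustration.

```rust
// Hypothetical shape of the new prompt request parameters.
pub struct LocalAiPromptParams {
    pub prompt: String,
    pub max_tokens: Option<u32>, // optional per-request token limit
    pub no_think: bool,          // request the model skip its "thinking" phase
}

/// Resolve the effective token limit: the per-request value wins,
/// otherwise fall back to the service-wide default.
pub fn effective_max_tokens(params: &LocalAiPromptParams, default_limit: u32) -> u32 {
    params.max_tokens.unwrap_or(default_limit)
}

fn main() {
    let p = LocalAiPromptParams {
        prompt: "hello".into(),
        max_tokens: None,
        no_think: true,
    };
    println!("effective limit: {}", effective_max_tokens(&p, 512));
}
```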
…e runtimes
- Added `backend_preference` field to `LocalAiConfig` for specifying preferred runtime.
- Implemented backend resolution logic to select between CPU, Metal, CUDA, and Vulkan based on user preference and feature availability.
- Updated `LocalAiStatus` to include fields for tracking active backend and performance metrics.
- Introduced runtime backend enumeration and support functions to manage backend capabilities.
- Enhanced Cargo.toml files to include features for Metal, CUDA, and Vulkan support in both core and Tauri projects.
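The backend resolution described above (preference plus compiled-in availability) can be sketched as below. The enum and function names, and the accelerator priority order when no preference is usable, are assumptions, not the PR's exact identifiers.

```rust
// Illustrative runtime backend selection: honor the user's preference when
// that backend was compiled in / detected, else fall back to the best
// available accelerator, else CPU.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeBackend {
    Cpu,
    Metal,
    Cuda,
    Vulkan,
}

pub fn resolve_backend(
    preference: Option<RuntimeBackend>,
    available: &[RuntimeBackend],
) -> RuntimeBackend {
    match preference {
        // Preferred backend is actually available: use it.
        Some(b) if available.contains(&b) => b,
        // Otherwise scan accelerators in an assumed priority order.
        _ => {
            for b in [RuntimeBackend::Metal, RuntimeBackend::Cuda, RuntimeBackend::Vulkan] {
                if available.contains(&b) {
                    return b;
                }
            }
            RuntimeBackend::Cpu // always supported
        }
    }
}

fn main() {
    let available = [RuntimeBackend::Cpu, RuntimeBackend::Metal];
    println!("{:?}", resolve_backend(Some(RuntimeBackend::Cuda), &available));
}
```

In the real service the `available` set would be derived from Cargo feature flags (`metal`, `cuda`, `vulkan`) and runtime probing rather than passed in.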
- Updated `LocalAiService` to include backend status tracking with `active_backend` and `backend_reason` fields.
- Improved inference methods to accept a `no_think` parameter, allowing for more flexible prompt processing.
- Enhanced latency and token metrics tracking during inference, providing better performance insights.
- Adjusted default token limits for various inference methods to optimize user experience.
- Updated inference methods in `LocalAiService` to include a `no_think` parameter for improved prompt processing.
- Added backend status tracking with `active_backend`, `backend_reason`, and performance metrics such as `last_latency_ms`, `prompt_toks_per_sec`, and `gen_toks_per_sec`.
- Enhanced the UI in `LocalModelPanel` to display backend status and performance metrics, improving user experience and transparency.
- Adjusted token limits for various inference methods to optimize functionality.
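The metric fields named above (`last_latency_ms`, `prompt_toks_per_sec`, `gen_toks_per_sec`) can be derived from token counts and phase timings as sketched here; the struct layout, the `record_metrics` helper, and the split into prompt/generation phases are assumptions.

```rust
// Illustrative throughput bookkeeping for one inference call.
#[derive(Debug, Default)]
pub struct InferenceMetrics {
    pub last_latency_ms: u64,       // total wall time for the call
    pub prompt_toks_per_sec: f64,   // prefill throughput
    pub gen_toks_per_sec: f64,      // decode throughput
}

pub fn record_metrics(prompt_tokens: u64, gen_tokens: u64, prompt_ms: u64, gen_ms: u64) -> InferenceMetrics {
    // Guard against division by zero for degenerate timings.
    let per_sec = |tokens: u64, ms: u64| {
        if ms == 0 { 0.0 } else { tokens as f64 * 1000.0 / ms as f64 }
    };
    InferenceMetrics {
        last_latency_ms: prompt_ms + gen_ms,
        prompt_toks_per_sec: per_sec(prompt_tokens, prompt_ms),
        gen_toks_per_sec: per_sec(gen_tokens, gen_ms),
    }
}

fn main() {
    // 512 prompt tokens in 250 ms, 128 generated tokens in 2000 ms.
    println!("{:?}", record_metrics(512, 128, 250, 2000));
}
```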
- Removed the `mistralrs` dependency and related configurations from `Cargo.toml` and `LocalAiConfig`.
- Changed default model provider from `mistralrs` to `ollama`, updating associated default values for model ID, download URL, and artifact name.
- Simplified backend handling in `LocalAiService`, ensuring consistent use of the `ollama` provider for model management and status reporting.
- Enhanced `LocalAiStatus` to reflect the new provider and model path format.
…ds and asset management
- Introduced new parameters and methods for vision prompting, embedding, transcription, and text-to-speech (TTS) functionalities in the Local AI service.
- Enhanced the LocalAiConfig structure to support additional model IDs and preload options for various capabilities.
- Updated the LocalModelPanel UI to facilitate user interaction with new AI features, including asset status and download triggers.
- Implemented backend commands for managing local AI assets and their statuses, improving overall functionality and user experience.
- Added methods for downloading STT and TTS models, including configuration for download URLs in LocalAiConfig.
- Enhanced the LocalAiService with a new `download_all_models` method to manage model downloads and status updates.
- Introduced error handling for model availability checks, marking the service as degraded if downloads fail.
- Updated dispatch logic to trigger full model downloads, improving the initialization process for local AI services.
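The sequential download-and-degrade behavior above can be sketched as a simple loop: models download strictly one after another (the PR description lists the order chat -> vision -> embedding -> stt -> tts), failures are collected as warnings, and any failure leaves the service marked degraded rather than unavailable. The closure-based `download_one` and the `ServiceHealth` enum are placeholders for the real per-model download calls and status type.

```rust
// Illustrative sequential multi-model download with degraded-status handling.
#[derive(Debug, PartialEq)]
pub enum ServiceHealth {
    Ready,
    Degraded,
}

pub fn download_all_models<F>(assets: &[&str], mut download_one: F) -> (ServiceHealth, Vec<String>)
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut warnings = Vec::new();
    for asset in assets.iter().copied() {
        // Strictly sequential: the next download starts only after this one ends.
        if let Err(e) = download_one(asset) {
            warnings.push(format!("{asset}: {e}"));
        }
    }
    // Any failed download leaves the service usable but degraded.
    let health = if warnings.is_empty() { ServiceHealth::Ready } else { ServiceHealth::Degraded };
    (health, warnings)
}

fn main() {
    let order = ["chat", "vision", "embedding", "stt", "tts"];
    let (health, warnings) = download_all_models(&order, |a| {
        if a == "tts" { Err("network timeout".into()) } else { Ok(()) }
    });
    println!("{health:?} {warnings:?}");
}
```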
- Implemented new commands for fetching recent vision summaries and flushing the vision queue in the accessibility module.
- Enhanced the AccessibilityEngine to manage vision state, including queue depth and last vision summary.
- Updated the AccessibilityPanel UI to display vision state and recent summaries, allowing users to trigger flush actions.
- Added Redux state management for vision-related data, including loading states and error handling.
- Expanded onboarding steps to include user consent for local model usage, improving privacy and resource management awareness.
- Introduced a new 'Channels' section in the MiniSidebar for navigating to messaging settings.
- Added a Messaging Channels panel in SettingsHome for configuring Telegram and Discord authentication modes.
- Implemented channel connection management in MessagingPanel, including connection status updates and error handling.
- Created a routing utility to resolve preferred authentication modes for messaging channels.
- Enhanced socket service to handle real-time updates for channel connection statuses.
- Added API service for managing channel connections, including connect and disconnect functionalities.
- Updated thread API to support outbound routing based on the selected messaging channel.
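The auth-mode routing utility mentioned above might reduce to a per-channel override falling back to a global default, as in this sketch. The `AuthMode` variants and the fallback rule are entirely assumptions; the PR only states that a utility resolves the preferred authentication mode per channel (Telegram, Discord).

```rust
// Hypothetical auth-mode resolution for a messaging channel.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    Bot,  // assumed variant: authenticate via a bot token
    User, // assumed variant: authenticate as a user session
}

/// A channel-specific preference wins; otherwise use the global default.
pub fn resolve_auth_mode(channel_pref: Option<AuthMode>, global_default: AuthMode) -> AuthMode {
    channel_pref.unwrap_or(global_default)
}

fn main() {
    println!("{:?}", resolve_auth_mode(None, AuthMode::Bot));
    println!("{:?}", resolve_auth_mode(Some(AuthMode::User), AuthMode::Bot));
}
```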
…iption capabilities
- Changed the default vision model from `qwen2.5vl:3b` to `moondream:1.8b` in the local AI configuration.
- Implemented a new function `openhumanLocalAiTranscribeBytes` for transcribing audio from byte arrays, improving flexibility in audio input handling.
- Enhanced the `Conversations` component to support voice recording and transcription, including state management for recording and playback.
- Added error handling and user feedback for audio transcription processes, improving overall user experience.
…ccessibilityEngine
- Reformatted code in the AccessibilityEngine for better readability, including consistent indentation and line breaks.
- Enhanced the clarity of the `analyze_frame_with_vision` function signature by spreading parameters across multiple lines.
- Improved the readability of temporary path creation in the `capture_screen_image_ref` function.
…nhance accessibility features
- Added a new command to request specific accessibility permissions on macOS, including screen recording, accessibility, and input monitoring.
- Updated the AccessibilityPanel and onboarding steps to utilize the new permission request functionality, improving user experience and compliance with macOS requirements.
- Introduced a MemoryWorkspace component for managing memory documents, enhancing the intelligence features of the application.
- Refactored related Redux actions and state management to support the new permission handling and memory functionalities.
- Added synchronization summary text to the SkillsGrid and SkillCard components, providing users with insights on sync counts, local data size, and last sync time.
- Implemented a new function to derive skill sync summary text based on skill state and sync statistics.
- Updated the SkillsGrid to display sync metrics in a new column, improving the visibility of synchronization status.
- Enhanced the SkillManager to manage sync statistics, including tracking sync durations and local data metrics.
- Refactored Redux state management to include persistent sync metrics for each skill, ensuring accurate reporting and user feedback.
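The summary-text derivation above (sync counts, local data size, last sync time) can be sketched as a pure formatting function; the struct fields, units, and exact wording are assumptions, and the real implementation lives in the UI layer rather than Rust.

```rust
// Hypothetical derivation of the skill sync summary line.
pub struct SyncStats {
    pub sync_count: u64,
    pub local_bytes: u64,               // size of locally stored skill data
    pub last_sync_secs_ago: Option<u64>, // None = never synced
}

pub fn skill_sync_summary(stats: &SyncStats) -> String {
    let size_mb = stats.local_bytes as f64 / (1024.0 * 1024.0);
    match stats.last_sync_secs_ago {
        Some(secs) => format!(
            "{} syncs · {:.1} MB local · last sync {}s ago",
            stats.sync_count, size_mb, secs
        ),
        None => format!("{} syncs · {:.1} MB local · never synced", stats.sync_count, size_mb),
    }
}

fn main() {
    let stats = SyncStats { sync_count: 5, local_bytes: 2 * 1024 * 1024, last_sync_secs_ago: Some(60) };
    println!("{}", skill_sync_summary(&stats));
}
```

Keeping this a pure function of the stats makes it trivial to unit-test and to reuse in both SkillsGrid and SkillCard.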
- Added new commands for listing, updating, removing, running, and retrieving run history of cron jobs.
- Introduced a dedicated CronJobsPanel for managing scheduled jobs, enhancing the user interface for cron job configuration.
- Updated navigation components to include links to the new cron jobs settings.
- Enhanced the core server to support cron job operations, ensuring integration with existing functionality.
- Implemented error handling and user feedback for cron job actions, improving overall user experience.
- Simplified error handling in cron job management functions by removing unnecessary line breaks.
- Enhanced the readability of the `dispatch` function in the core server by consolidating related code.
- Improved formatting in the AccessibilityEngine for better consistency and clarity in permission requests.
- Changed the image reference in README.md from JPG to PNG format for better compatibility.
- Removed the old JPG file and added the new PNG file to the documentation directory.
Summary
Problem
Solution
- `openhuman.local_ai_download_asset` and `openhuman.local_ai_assets_status` RPC methods and Tauri bindings.
- Sequential downloads (chat -> vision -> embedding -> stt -> tts) with progress/warning updates.
- STT (`whisper-cli`) and TTS on `piper`, with workspace model paths and configurable download URLs.

Testing
- `yarn -s compile`
- `cargo check --manifest-path src-tauri/Cargo.toml`

Other checks run:
- `yarn -s tsc --noEmit`
- (`prettier --check`, `eslint`, `tsc --noEmit`)

Manual validation completed:
Impact
- `local_ai.model_id` remains supported as chat fallback.

Breaking Changes
Related