
feat: Phase 23 Multimodal, Local LLM, Security Hardening #9

Merged
diegofelipeee merged 1 commit into main from dev
Feb 18, 2026


@diegofelipeee
Collaborator

🚀 Phase 23 — Multimodal, Local LLM, Security Hardening

Summary

This PR brings 14 commits with major new capabilities: multimodal support (vision + voice + image generation), local LLM provider (Ollama), security hardening (audit integrity check, webhook alerts, RBAC enforcement), and full dashboard-driven configuration (no ENV files needed).


🎯 What's New

Multimodal Support

  • Vision Input — Send images to the agent via Telegram, WhatsApp, Discord. Base64 forwarded to GPT-4o / Claude for analysis.
  • Voice STT/TTS — Telegram and WhatsApp voice messages transcribed via Whisper, responses synthesized via ElevenLabs. Dedicated /api/chat/voice, /api/voice/synthesize, /api/voice/transcribe endpoints.
  • Image Generation — New image_generate tool (12th built-in) supporting DALL-E 3, Leonardo AI (with polling), and Stable Diffusion (AUTOMATIC1111 API).

Local LLM Support (9th Provider)

  • OllamaProvider — Supports Ollama, LM Studio, llama.cpp, and any OpenAI-compatible local server.
  • Auto-discovers installed models via /api/tags with 30s cache refresh.
  • OpenAI-compatible mode (/v1/chat/completions) for tool calling; native Ollama fallback (/api/chat).
  • Priority 11 in failover chain — if all cloud providers fail, local model picks up seamlessly.
  • Dashboard Settings: URL input (not password field), validated by fetching model list before saving.
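
The model auto-discovery described above could look roughly like the following. This is a minimal sketch, not the code in ollama.ts: `ModelCache`, `TagsFetcher`, and `fetchTags` are hypothetical names, and the fetcher and clock are injectable only so the 30s refresh can be exercised without a running server. The only external fact it relies on is that Ollama's GET /api/tags returns a `models` array whose items carry a `name`.

```typescript
// Relevant part of the response from Ollama's GET /api/tags.
interface OllamaTagsResponse {
  models?: Array<{ name: string }>;
}

// Hypothetical abstraction so the network call can be replaced in tests.
type TagsFetcher = (baseUrl: string) => Promise<OllamaTagsResponse>;

// Default fetcher, assuming a runtime with a global fetch (Node 18+).
const fetchTags: TagsFetcher = async (baseUrl) => {
  const res = await fetch(`${baseUrl.replace(/\/+$/, "")}/api/tags`);
  if (!res.ok) throw new Error(`/api/tags returned HTTP ${res.status}`);
  return (await res.json()) as OllamaTagsResponse;
};

class ModelCache {
  private models: string[] = [];
  private fetchedAt = -Infinity; // forces a fetch on first use
  private readonly baseUrl: string;
  private readonly ttlMs: number;
  private readonly fetcher: TagsFetcher;
  private readonly now: () => number;

  constructor(
    baseUrl: string,
    ttlMs: number = 30_000, // 30s refresh, as stated in the PR
    fetcher: TagsFetcher = fetchTags,
    now: () => number = Date.now,
  ) {
    this.baseUrl = baseUrl;
    this.ttlMs = ttlMs;
    this.fetcher = fetcher;
    this.now = now;
  }

  // Returns cached model names, refetching once the cache is older than ttlMs.
  async list(): Promise<string[]> {
    const t = this.now();
    if (t - this.fetchedAt < this.ttlMs) return this.models;
    const body = await this.fetcher(this.baseUrl);
    this.models = (body.models ?? []).map((m) => m.name);
    this.fetchedAt = t;
    return this.models;
  }
}
```

The same `list()` call could also back the Settings validation step: if it throws for a newly entered URL, the URL is rejected before saving.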

Security Hardening

  • Audit Hash Chain — SHA-256 linked hash chain for tamper detection. Migration 003 adds hash + previousHash columns.
  • Startup Integrity Check — Auto-verify hash chain on boot (deferred 10s). Logs OK or broadcasts critical WebSocket alert.
  • Generic Webhook Alerts — POST JSON to any custom URL (Slack, Discord, etc.) on every security alert. Configurable via dashboard.
  • RBAC Hard Enforcement — Authenticated non-admin tokens → 403 on admin mutation routes. RBAC_ENFORCE toggle blocks anonymous too.
  • Audit Log Rotation — 90-day retention, auto-daily cleanup, manual trigger via API.
  • Rate-Limited Telegram Alerts — 60s cooldown + batching to prevent 429 spam.
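
To illustrate the audit hash chain and the integrity check, here is a minimal sketch of a SHA-256 linked chain. It assumes each entry's hash covers its payload plus the previous entry's hash; the `event` and `timestamp` fields, the all-zero genesis value, and the function names are assumptions for illustration, not the schema from migration 003 (only `hash` and `previousHash` come from the PR).

```typescript
import { createHash } from "node:crypto";

interface AuditEntry {
  event: string;        // hypothetical payload field
  timestamp: string;    // hypothetical payload field
  previousHash: string; // column named in the PR
  hash: string;         // column named in the PR
}

// Assumed sentinel for the first entry's previousHash.
const GENESIS_HASH = "0".repeat(64);

// Hash the payload together with the previous hash so entries are linked.
function computeHash(event: string, timestamp: string, previousHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([event, timestamp, previousHash]))
    .digest("hex");
}

// Append a new entry linked to the current tail of the chain.
function appendEntry(chain: AuditEntry[], event: string, timestamp: string): AuditEntry {
  const last = chain[chain.length - 1];
  const previousHash = last ? last.hash : GENESIS_HASH;
  const entry: AuditEntry = {
    event,
    timestamp,
    previousHash,
    hash: computeHash(event, timestamp, previousHash),
  };
  chain.push(entry);
  return entry;
}

// Returns the index of the first broken entry, or -1 if the chain is intact.
function verifyChain(chain: readonly AuditEntry[]): number {
  let expectedPrevious = GENESIS_HASH;
  for (let i = 0; i < chain.length; i++) {
    const entry = chain[i];
    const recomputed = computeHash(entry.event, entry.timestamp, entry.previousHash);
    if (entry.previousHash !== expectedPrevious || entry.hash !== recomputed) return i;
    expectedPrevious = entry.hash;
  }
  return -1;
}
```

A startup check along these lines would call `verifyChain` once after boot and raise the critical alert if the result is not -1. Editing or deleting any row changes a hash that later rows depend on, which is what makes tampering detectable.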

Dashboard & Configuration

  • 17 pages (added Voice page with TTS/STT testing, voice listing, config).
  • All service keys via dashboard — Leonardo AI, ElevenLabs, Stable Diffusion URL, Voice toggle, Security Webhook URL, RBAC Enforce toggle, Ollama URL — all saved to encrypted Vault.
  • No ENV files required — everything configurable from frontend or terminal.
  • 15 dashboard screenshots added to README gallery.
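
Since the dashboard saves service keys to an encrypted Vault, and the Security Notes name AES-256-GCM as the cipher, a rough sketch of sealing one secret with Node's built-in crypto module may help. The `SealedSecret` shape and the function names are hypothetical, and key management (where the 32-byte key comes from) is deliberately left out; the Vault's real storage format is not shown in this PR.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical on-disk shape: all fields base64-encoded.
interface SealedSecret {
  iv: string;   // 12-byte nonce, unique per encryption
  tag: string;  // GCM authentication tag
  data: string; // ciphertext
}

// Encrypt a secret with AES-256-GCM. `key` must be exactly 32 bytes.
function sealSecret(plaintext: string, key: Buffer): SealedSecret {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypt and authenticate. Throws if the key is wrong or the data was altered.
function unsealSecret(sealed: SealedSecret, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(sealed.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}
```

GCM is an authenticated mode, so a modified ciphertext or tag fails at `decipher.final()` instead of silently yielding garbage.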

CI/CD & Docs

  • CI workflow fixed: artifact caching, lint job, dev branch triggers.
  • PR template improved with security type and better formatting.
  • README updated: 9 providers, 12 tools, 17 pages, Multimodal section, Security Hardening in completed roadmap.

📊 Stats

| Metric | Value |
| --- | --- |
| Commits | 14 |
| Files changed | 45 |
| Lines added | ~2,789 |
| Lines removed | ~119 |
| Tests | 53/53 passing |
| New packages/files | ollama.ts, image-generator.ts, Voice.tsx, migration 003 |

🧪 Testing

  • All 53 API tests pass (npx vitest run --reporter=verbose)
  • Tested: health, providers, tools, plugins, workflows, sessions, voice, webhooks, sandbox, rate limits, backup, IP filter, MCP, memory, RAG, planner, API keys, GDPR, GitHub, RSS, security summary/stats, audit integrity/export/rotation/events, image_generate tool, voice engine endpoints
  • Dashboard builds cleanly (Vite), all packages build (tsup)

🔐 Security Notes

  • Audit hash chain verified on every boot — tampering triggers critical alert
  • RBAC blocks authenticated non-admin users from admin routes (403)
  • Webhook alerts fire for rate limits, prompt injection, integrity failures
  • All secrets stored in AES-256-GCM encrypted Vault, never in plaintext ENV
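
Because alerts can fire in bursts (rate limits, prompt injection, integrity failures), the 60s cooldown plus batching mentioned under Security Hardening matters in practice. The sketch below shows one way to get that behaviour; `AlertThrottle`, `SecurityAlert`, and the injected `send` callback are hypothetical names, and the real delivery step (a Telegram message or a JSON POST to the webhook URL) is assumed to live inside `send`.

```typescript
interface SecurityAlert {
  type: string;    // e.g. "rate_limit", "prompt_injection" (illustrative values)
  message: string;
}

// Delivery is injected so the throttle itself stays free of network code.
type AlertSender = (batch: SecurityAlert[]) => void;

class AlertThrottle {
  private pending: SecurityAlert[] = [];
  private lastSentAt = -Infinity; // lets the very first alert through immediately
  private readonly send: AlertSender;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(send: AlertSender, cooldownMs: number = 60_000, now: () => number = Date.now) {
    this.send = send;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Queue an alert and deliver right away if the cooldown has elapsed.
  push(alert: SecurityAlert): void {
    this.pending.push(alert);
    this.flushIfReady();
  }

  // Deliver everything queued as one batch, but at most once per cooldown window.
  // Returns true if a batch was sent.
  flushIfReady(): boolean {
    if (this.pending.length === 0) return false;
    const t = this.now();
    if (t - this.lastSentAt < this.cooldownMs) return false;
    const batch = this.pending;
    this.pending = [];
    this.lastSentAt = t;
    this.send(batch);
    return true;
  }
}
```

In a long-running process, `flushIfReady` would also be called from a timer so that alerts queued during the cooldown still go out when no further alert arrives.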

📁 Key Files Changed

  • packages/agent/src/providers/ollama.ts — New OllamaProvider (336 lines)
  • packages/tools/src/tools/image-generator.ts — New ImageGeneratorTool (281 lines)
  • packages/dashboard/src/pages/Voice.tsx — New Voice dashboard page (397 lines)
  • packages/dashboard/src/pages/Settings.tsx — Service config UI (+398 lines)
  • packages/core/src/gateway/server.ts — Security hardening (+335 lines)
  • packages/core/src/gateway/chat-routes.ts — Voice/service/provider endpoints (+329 lines)
  • packages/security/src/audit-logger.ts — Hash chain, rotation, integrity (+237 lines)
  • packages/channels/src/telegram.ts — Vision + voice handlers
  • packages/channels/src/whatsapp.ts — Vision + voice handlers
  • tests/api.test.ts — New test suites (+132 lines)

- Hard enforcement: authenticated non-admin tokens get 403 on admin routes
  - POST/PUT/PATCH/DELETE to admin routes checked
  - Invalid tokens also blocked (treated as guest with hasToken=true)
  - Admin tokens pass through normally
- Anonymous requests (no token) still allowed through for backward compat
  - Dashboard doesn't have login yet, so enforcement stays soft for anonymous requests
- RBAC_ENFORCE toggle: set to true to block ALL non-admin requests (incl. anonymous)
  - Configurable via dashboard Settings > Security > RBAC Hard Enforcement
  - Stored in Vault, loaded on startup
  - Red toggle to signal restrictive nature
- Admin routes protected: vault backup, audit export/rotate/integrity,
  security stats, GDPR, provider keys, IP filter, pairing
- 53/53 tests passing
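
The enforcement rules in this commit message can be summarised as a small decision function. This is a sketch of the described behaviour under stated assumptions, not the middleware in server.ts: `AccessRequest`, `shouldReject`, and the "user" role are hypothetical, and route matching is reduced to an `isAdminRoute` flag.

```typescript
// "user" stands in for any authenticated non-admin role (assumption).
type Role = "admin" | "user" | "guest";

interface AccessRequest {
  method: string;
  isAdminRoute: boolean;
  hasToken: boolean; // true even when the token is invalid
  role: Role;        // invalid tokens resolve to "guest"
}

const MUTATING_METHODS = new Set(["POST", "PUT", "PATCH", "DELETE"]);

// Returns true when the request should be answered with 403.
function shouldReject(req: AccessRequest, rbacEnforce: boolean): boolean {
  // Only mutations on admin routes are checked.
  if (!req.isAdminRoute || !MUTATING_METHODS.has(req.method.toUpperCase())) return false;
  // Admin tokens pass through normally.
  if (req.role === "admin") return false;
  // Any other token, valid or invalid, is blocked.
  if (req.hasToken) return true;
  // Anonymous requests are only blocked when RBAC_ENFORCE is on.
  return rbacEnforce;
}
```

For example, an anonymous `POST` to an admin route is let through with the toggle off and rejected with it on, which matches the backward-compatibility note above.
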
@diegofelipeee diegofelipeee merged commit 7648a10 into main Feb 18, 2026
3 checks passed