Mins Bot

A floating desktop AI assistant built with Java 17, Spring Boot, and JavaFX. Think Jarvis for your PC — a swirling orb sits on your desktop, expanding into a full AI command center with voice, vision, proactive actions, browser automation, and connections to 10 messaging platforms.

Features

Desktop UI

Floating window — always-on-top, draggable orb with custom title bar
Tabbed interface — Chat, Browser, Agents, Integrations, Setup, Skills, Schedules, Todo, Directives, Personality, Knowledge, Voice, Calibration, Workflows, Templates, Marketplace, Dashboard, Multi-Agent, Automations
Command palette — Ctrl+K for quick access to all commands and tabs
Chat search — Ctrl+F to search through message history
Keyboard shortcuts — Ctrl+/ to view all shortcuts, Ctrl+L to clear chat
Smooth transitions — animated tab switching and message appearances
Sound effects — subtle audio feedback for sent/received/notification/error (toggleable)
Styled tooltips — hover over any toolbar icon for a descriptive tooltip
System tray — minimize to tray for background operation

AI & Chat

Multi-model support — OpenAI (GPT-5.1, GPT-4o), Google Gemini (2.5 Pro, 3 Flash), Anthropic Claude (Opus, Sonnet), and local models via Ollama
100+ built-in tools — the AI can invoke tools across files, browser, system, media, health, finance, GitHub, and more
Dynamic tool routing — AI classifier selects relevant tools per message (respects 128-tool API limit)
Task planning — numbered checklist before executing complex multi-step tasks
Autonomous mode — works on directives independently when you're idle
Chat memory — persistent transcript history across restarts
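
As a rough illustration of the dynamic tool routing described above, the sketch below collects tools from the categories a classifier picked and stops once the 128-tool limit is reached. The class name `ToolSelectionSketch`, the `select` method, and the `readFile`/`writeFile` tool names are hypothetical; the project's actual `ToolRouter` may work differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: gather tools from selected categories without exceeding a per-request cap. */
final class ToolSelectionSketch {
    // Limit quoted in the feature list; everything else here is illustrative.
    static final int MAX_TOOLS_PER_REQUEST = 128;

    /**
     * @param toolsByCategory category name mapped to its tool names
     * @param selected        categories chosen by a classifier, most relevant first
     * @return de-duplicated tool names in selection order, never more than the cap
     */
    static List<String> select(Map<String, List<String>> toolsByCategory, List<String> selected) {
        Set<String> picked = new LinkedHashSet<>();
        for (String category : selected) {
            for (String tool : toolsByCategory.getOrDefault(category, List.of())) {
                if (picked.size() >= MAX_TOOLS_PER_REQUEST) {
                    return new ArrayList<>(picked); // cap reached, drop the rest
                }
                picked.add(tool);
            }
        }
        return new ArrayList<>(picked);
    }

    public static void main(String[] args) {
        Map<String, List<String>> catalog = new LinkedHashMap<>();
        catalog.put("files", List.of("readFile", "writeFile"));                 // invented names
        catalog.put("browser", List.of("browserFillForm", "browserExecuteJs")); // named in this README
        System.out.println(select(catalog, List.of("browser")));
    }
}
```

Ordering the selected categories by relevance means that, if the cap is hit, the least relevant tools are the ones dropped.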

Jarvis Mode — Proactive Intelligence

Proactive Action Mode (lightning bolt icon) — continuously monitors your screen, pending tasks, and directives, then takes action automatically
- Screen check every 15s — detects dialogs, forms, errors, notifications and acts on them
- Task check every 30s — completes pending to-do items proactively
- Directive check every 60s — executes standing directives
- Safety: skips when you're actively working, 60s cooldown per action, speaks actions aloud via TTS
Jarvis Watch Mode (eye icon) — AI actively comments on your screen like a real assistant
- [COMMENT] — conversational tips, warnings, and observations appear as chat messages
- [REACT] — auto-types into forms, quizzes, and prompts
- [SILENT] — stays quiet when nothing interesting is happening
- 10-second cooldown between comments, semantic deduplication to avoid repetition
Auto-pilot (brain icon) — proactive screen help suggestions
Keyboard & mouse control (keyboard icon) — allow the bot to click and type on your behalf
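
The safety rules for Proactive Action Mode (skip while the user is active, 60-second cooldown per action) could be sketched roughly as follows. `ActionCooldownSketch`, `tryAcquire`, and the `"dismiss-dialog"` action key are hypothetical names, and how the real service detects user activity is not covered here.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a per-action cooldown gate with a user-activity check. */
final class ActionCooldownSketch {
    private final Duration cooldown;
    private final Map<String, Instant> lastRun = new HashMap<>();

    ActionCooldownSketch(Duration cooldown) {
        this.cooldown = cooldown;
    }

    /** Returns true, and records the run, only if the action is allowed to fire at {@code now}. */
    synchronized boolean tryAcquire(String actionKey, boolean userActive, Instant now) {
        if (userActive) {
            return false; // never interrupt someone who is actively working
        }
        Instant last = lastRun.get(actionKey);
        if (last != null && Duration.between(last, now).compareTo(cooldown) < 0) {
            return false; // same action fired too recently
        }
        lastRun.put(actionKey, now);
        return true;
    }

    public static void main(String[] args) {
        ActionCooldownSketch gate = new ActionCooldownSketch(Duration.ofSeconds(60));
        Instant t0 = Instant.parse("2025-01-01T09:00:00Z");
        System.out.println(gate.tryAcquire("dismiss-dialog", false, t0));                 // true
        System.out.println(gate.tryAcquire("dismiss-dialog", false, t0.plusSeconds(30))); // false
        System.out.println(gate.tryAcquire("dismiss-dialog", false, t0.plusSeconds(60))); // true
    }
}
```

Passing the clock in as a parameter keeps the gate easy to test without sleeping.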

Voice & Vision

Voice input — speech-to-text via Web Speech API and native microphone capture
Text-to-speech — ElevenLabs, Fish Audio, OpenAI TTS, or Windows native voice
Gemini Live — real-time bidirectional audio streaming with language translation
Screen analysis — live screen capture with Gemini vision + OCR before every AI response
Webcam — capture and analyze webcam feed
Audio listening — background audio capture and transcription with model selection

Browser & Automation

Chrome DevTools Protocol — control your real Chrome browser (navigate, click, extract data, fill forms)
JavaScript injection — browserFillForm and browserExecuteJs for instant form filling via CDP (~10ms)
Playwright — headless browser automation for web scraping
System control — execute system commands, manage processes, control applications
Window manager — arrange windows, switch apps, snap to edges (Win+Arrow)
Bot window control — tell the bot to move itself ("move yourself to the left")
Automations — custom trigger/action rules ("when message contains X, do Y")
Clipboard history — tracks last 200 clipboard entries, searchable by AI
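
A bounded, searchable clipboard history like the one described above could look roughly like this. The class `ClipboardHistorySketch` and its methods are hypothetical; only the 200-entry limit comes from this README, and skipping immediate repeats is an added assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: keep the most recent N clipboard entries and search them. */
final class ClipboardHistorySketch {
    private final int capacity;
    private final Deque<String> entries = new ArrayDeque<>(); // newest first

    ClipboardHistorySketch(int capacity) {
        this.capacity = capacity;
    }

    synchronized void add(String text) {
        if (text == null || text.isBlank()) {
            return;
        }
        if (text.equals(entries.peekFirst())) {
            return; // assumption: ignore an immediate repeat of the latest entry
        }
        entries.addFirst(text);
        if (entries.size() > capacity) {
            entries.removeLast(); // evict the oldest entry
        }
    }

    /** Case-insensitive substring search, newest matches first. */
    synchronized List<String> search(String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String entry : entries) {
            if (entry.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ClipboardHistorySketch history = new ClipboardHistorySketch(200);
        history.add("git clone https://github.com/ceasis/mins-bot.git");
        history.add("mvn spring-boot:run");
        System.out.println(history.search("MVN"));
    }
}
```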

Life Management Tools

Personal Profile & Memory

Life profile — 11 sections: Routines, Preferences, Relationships, Goals, Health, Finance, Locations, Vehicles, Pets, Important Dates, Notes
Episodic memory — stores life events as searchable JSON episodes with type, tags, people, mood, importance
Auto-memory extraction — automatically detects life facts from conversations and saves them
Personal config — name, birthdate, family, work info loaded into every AI response
Knowledge base — upload documents (PDF, Word, Excel, code, etc.) for AI reference

Health Tracker (11 tools)

Log water, meals, exercise, weight, mood, sleep, medications
Daily health summaries and multi-day trend analysis
Set and track health goals

Finance Tracker (13 tools)

Log expenses and income with categories
Monthly budgets with real-time tracking
Bill tracking with due date alerts
Debt overview and financial goal tracking
Monthly reports by category

Proactive Engine

Morning briefings, break reminders, hydration reminders
Meeting prep, bill reminders, relationship nudges
Weekly goal check-ins, weather alerts
Custom rules with quiet hours support

Developer & Productivity Tools

GitHub Integration (18 tools)

List repos, branches, README content, search repos
Create/list/comment on issues and pull requests
View notifications, activity feed, gists
Monitor CI/CD workflow runs

Video Creation (Remotion)

Scaffold and manage Remotion projects
Create custom React video compositions
Quick text videos with animated typography
Slideshow videos from images with crossfade transitions
Render to MP4 via CLI

Media & Entertainment

Music control — play/pause/skip, volume control via media keys (Spotify, Windows Media Player, any player)
Video downloader — yt-dlp wrapper for YouTube, TikTok, and 1000+ sites (auto-installs via winget)
Image tools — resize, crop, filter
Screen recording — capture and export MP4
QR codes — generate and scan

Social & Trend Intelligence

Social monitor — track contacts, birthdays, posts, and mentions across platforms
Trend scout — monitors YouTube and the web for topics you're interested in and surfaces what's trending
Habit detection — learns your routines from time/day patterns, persisted to habit_events.json
Feedback loop — rates suggestions, improves over time
Skill auto-creation — saves named workflows with trigger phrases and auto-executes them on match
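
To make the trigger-phrase idea concrete, here is a minimal sketch that maps phrases to workflow names and looks for a phrase inside an incoming message. `TriggerPhraseSketch`, `register`, `match`, and the sample phrase are hypothetical; the actual matching rules (exact, fuzzy, or semantic) are not documented in this README.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: match a chat message against registered trigger phrases. */
final class TriggerPhraseSketch {
    private final Map<String, String> workflowsByTrigger = new LinkedHashMap<>();

    void register(String triggerPhrase, String workflowName) {
        workflowsByTrigger.put(normalize(triggerPhrase), workflowName);
    }

    /** Returns the first workflow whose trigger phrase occurs in the message, if any. */
    Optional<String> match(String message) {
        String text = normalize(message);
        return workflowsByTrigger.entrySet().stream()
                .filter(entry -> text.contains(entry.getKey()))
                .map(Map.Entry::getValue)
                .findFirst();
    }

    // Lower-case, trim, and collapse runs of whitespace so small typing differences do not matter.
    private static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        TriggerPhraseSketch skills = new TriggerPhraseSketch();
        skills.register("start my day", "morning-routine");
        System.out.println(skills.match("Hey, START my   day please"));
        System.out.println(skills.match("what's the weather?"));
    }
}
```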

Other Tools

Email — send/read via SMTP/IMAP + Gmail API
Calendar — Google Calendar integration
Web search — Serper, SerpAPI, or DuckDuckGo
Web monitoring — track website changes
Code audit — clone repos, scan for SQL injection, hardcoded secrets, unused imports
File operations — read, write, search, download, export, Excel, Word, PDF
Utilities — calculator, unit conversion, hash, timers
Software management — install/uninstall via winget
Network diagnostics — ping, traceroute, port scan
Printer control — list printers, print documents

Pluggable Skills Library (81 self-contained skills)

Each skill is a self-contained sub-package (com.minsbot.skills.<name>) with its own Config, Service, and Controller. Skills are exposed over REST at /api/skills/<name>/* and are also available to the chat LLM as @Tool methods through 8 themed wrapper classes. All skills are disabled by default; enable one by setting app.skills.<name>.enabled=true in application.properties.
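
The property naming convention can be illustrated with plain `java.util.Properties`. This is only a sketch of the key pattern and the disabled-by-default behavior; the application itself presumably reads these flags through Spring configuration, and `SkillToggleSketch` and `isEnabled` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical sketch: resolve app.skills.<name>.enabled with a default of false. */
final class SkillToggleSketch {
    static boolean isEnabled(Properties props, String skillName) {
        String key = "app.skills." + skillName + ".enabled";
        return Boolean.parseBoolean(props.getProperty(key, "false")); // missing key means disabled
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Stand-in for a line in application.properties
        props.load(new StringReader("app.skills.passwordstrength.enabled=true\n"));
        System.out.println(isEnabled(props, "passwordstrength")); // true
        System.out.println(isEnabled(props, "hibpcheck"));        // false
    }
}
```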

Category	Skill count	Examples
Dev & Data	17	encoder, hashcalc, jsontools, yamltools, csvtools, sqlformatter, difftool, regextester, regexinferrer, markdowntools, dockerfilelint, loganalyzer, httptester, cronvalidator, fakedatagen, randomgen, unitconvert
SEO	6	metaanalyzer, keywordextractor, sitemapchecker, robotschecker, readability, sluggenerator
Marketing	5	utmbuilder, subjectanalyzer, charcounter, abtestcalc, hashtagsuggest
Security Analyst	10	passwordstrength, hibpcheck, jwtinspector, certinspector, headeraudit, dnslookup, cvelookup, hashidentifier, secretsscan, emailvalidator
Privacy/Encryption	3	piiredactor, exifstripper, encryptionaes
Science/Math	7	probabilitycalc, matrixops, physicscalc, geodistance, statsbasics, geometrycalc, langdetector
Finance	7	financecalc, taxcalc, realestatecalc, stockindicators, breakevencalc, depreciationcalc, cashflowforecast
Health/Fitness	5	bmicalc, macrocalc, pacecalc, heartratezones, medicalunits
Productivity	11	notes, reminders, timer, clipboardhistory, okrtracker, timezoneconvert, meetingcost, slacalc, pomodoroplanner, flashcardmaker, netinfo
Content/Writing	7	writingtools, headlineanalyzer, citationformatter, markdownhtml, numberwords, colortools, imagemeta
Culinary	1	recipescaler
Education	1	gradecalc
File system	1	diskscan

LLM wrapper classes in com.minsbot.agent.tools.* (each registered as a ToolRouter category):

SkillDevTools → dev_skills
SkillProductivityTools → productivity_skills
SkillSeoMarketingTools → seo_marketing_skills
SkillSecurityTools → security_skills
SkillProfessionTools → profession_skills
SkillDataToolsExtra → data_skills_extra
SkillCalcTools → calc_skills
SkillExtrasTools → extras_skills

The AI classifier routes user queries to the right category automatically (e.g. "is my password strong?" → security_skills, "what's 5% compound interest over 10 years?" → profession_skills / calc_skills).

Background Agents

Parallel agents — launch up to 24 concurrent AI agents on isolated missions
Per-agent model selection — pick GPT-5.4, GPT-4o-mini, Claude, or Gemini per agent
Agent dashboard — live progress bars, log stream, plan view, status badges
Download results — export agent output as a Markdown file
Cancel / remove — cancel running agents or clear finished ones
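
One way to enforce a cap such as the 24 concurrent agents mentioned above is a semaphore in front of a thread pool. The sketch below is an assumption about how this could be done, not the project's implementation. `AgentPoolSketch` and `launch` are hypothetical names, and rejecting a launch when all slots are busy (rather than queueing it) is a design choice made only for this example.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: run agent missions in parallel with a hard concurrency cap. */
final class AgentPoolSketch implements AutoCloseable {
    private final Semaphore slots;
    private final ExecutorService executor;

    AgentPoolSketch(int maxAgents) {
        this.slots = new Semaphore(maxAgents);
        this.executor = Executors.newFixedThreadPool(maxAgents);
    }

    /** Starts the mission if a slot is free; returns empty when the cap is already reached. */
    Optional<CompletableFuture<String>> launch(Supplier<String> mission) {
        if (!slots.tryAcquire()) {
            return Optional.empty();
        }
        CompletableFuture<String> run = CompletableFuture.supplyAsync(mission, executor)
                .whenComplete((result, error) -> slots.release()); // free the slot on success or failure
        return Optional.of(run);
    }

    @Override
    public void close() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        try (AgentPoolSketch pool = new AgentPoolSketch(24)) {
            pool.launch(() -> "summary of mission A")
                .ifPresent(future -> System.out.println(future.join()));
        }
    }
}
```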

Dashboard & Analytics

Usage metrics — token counts, response times, tool invocation counts
Module stats — which vision/audio models are active and how many calls they've handled
Status bar — 2-row bar shows vision engine, audio module, and live counts

Messaging Integrations (10 Platforms)

Connect the same AI to any combination of platforms — all share the same reply logic:

Platform	Webhook Endpoint	Config Prefix
Viber	`POST /api/viber/webhook`	`app.viber.*`
Telegram	`POST /api/telegram/webhook`	`app.telegram.*`
Discord	`POST /api/discord/interactions`	`app.discord.*`
Slack	`POST /api/slack/events`	`app.slack.*`
WhatsApp	`POST /api/whatsapp/webhook`	`app.whatsapp.*`
Messenger	`POST /api/messenger/webhook`	`app.messenger.*`
LINE	`POST /api/line/webhook`	`app.line.*`
Teams	`POST /api/teams/messages`	`app.teams.*`
WeChat	`POST /api/wechat/webhook`	`app.wechat.*`
Signal	`POST /api/signal/webhook`	`app.signal.*`

All integrations are disabled by default and conditionally loaded — disabled platforms don't consume memory.

Requirements

Java 17 (JDK 17 or later)
Maven 3.6+
Windows, macOS, or Linux
Node.js 18+ (optional — for Remotion video creation)
API keys for AI services you want to use (see Configuration)

Quick Start

1. Clone the repository

git clone https://github.com/ceasis/mins-bot.git
cd mins-bot

2. Configure your API keys

Create a file called application-secrets.properties in the project root (this file is gitignored):

# Required — at least one AI provider
spring.ai.openai.api-key=YOUR_OPENAI_API_KEY

# Optional — Gemini
gemini.api.key=YOUR_GEMINI_API_KEY

# Optional — Claude
ANTHROPIC_API_KEY=YOUR_ANTHROPIC_KEY

# Optional — GitHub
GITHUB_TOKEN=YOUR_GITHUB_TOKEN

# Optional — ElevenLabs TTS
app.elevenlabs.api-key=YOUR_ELEVENLABS_API_KEY
app.elevenlabs.voice-id=YOUR_VOICE_ID

# Optional — Email
spring.mail.host=smtp.gmail.com
spring.mail.username=YOUR_EMAIL
spring.mail.password=YOUR_APP_PASSWORD

3. Build

mvn clean package -DskipTests

4. Run

Option A — Maven (recommended)

mvn spring-boot:run

Option B — Batch script (Windows)

run.bat

Option C — JAR

java --add-modules javafx.controls,javafx.web,javafx.fxml \
     --add-opens java.base/java.lang=ALL-UNNAMED \
     -jar target/mins-bot-1.0.0-SNAPSHOT.jar

Option D — Windows Installer

build-installer.bat

Creates an MSI installer in target/installer/ (requires JDK 17+ and WiX Toolset).

5. Use

The chat panel appears on your desktop
Type a message and press Enter
Click the microphone for voice input
Press Ctrl+K to open the command palette
Toolbar icons: eye (watch), keyboard (control), headphones (listen), brain (autopilot), lightning (proactive)

Configuration

All configuration lives in src/main/resources/application.properties. Sensitive values go in application-secrets.properties (gitignored).

Window Settings

Property	Description	Default
`server.port`	HTTP port	`8765`
`app.window.expanded.width`	Chat panel width (px)	`456`
`app.window.expanded.height`	Chat panel height (px)	`520`
`app.window.always-on-top`	Keep window above all others	`true`

AI Models

Property	Description	Default
`spring.ai.openai.chat.options.model`	OpenAI chat model	`gpt-5.1`
`app.gemini.vision-model`	Gemini vision model	`gemini-3-flash-preview`
`app.gemini.reasoning-model`	Gemini reasoning model	`gemini-2.5-pro`
`app.claude.model`	Claude model	`claude-opus-4-6`
`app.tool-classifier.model`	Tool routing classifier	`gpt-4o-mini`

Feature Toggles

Property	Description	Default
`app.planning.enabled`	Task planning before execution	`true`
`app.autonomous.enabled`	Autonomous mode when idle	`true`
`app.chat.live-screen-on-message`	Auto-capture screen before replies	`true`
`app.proactive.enabled`	Proactive engine (briefings, reminders)	`false`
`app.proactive-action.enabled`	Proactive action mode (auto-act)	`false`
`app.cdp.enabled`	Chrome DevTools Protocol	`true`
`app.tray.enabled`	System tray icon	`true`

Proactive Mode Settings

Property	Description	Default
`app.proactive-action.screen-check-seconds`	Screen check interval	`15`
`app.proactive-action.task-check-seconds`	Task check interval	`30`
`app.proactive-action.directive-check-seconds`	Directive check interval	`60`
`app.proactive.check-interval-ms`	Proactive engine check interval	`300000`
`app.proactive.quiet-hours-start`	Quiet hours start	`22`
`app.proactive.quiet-hours-end`	Quiet hours end	`7`
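
With the defaults above, quiet hours start at 22 and end at 7, so the window wraps past midnight. A check for that kind of window might look like the sketch below. `QuietHoursSketch` and `isQuiet` are hypothetical names, and treating the end hour as exclusive is an assumption, since the README does not say how the boundary is handled.

```java
import java.time.LocalTime;

/** Hypothetical sketch: decide whether an hour of the day falls inside a quiet-hours window. */
final class QuietHoursSketch {
    /**
     * @param hour      hour of day, 0 to 23
     * @param startHour first quiet hour (inclusive)
     * @param endHour   first non-quiet hour (exclusive, by assumption)
     */
    static boolean isQuiet(int hour, int startHour, int endHour) {
        if (startHour == endHour) {
            return false; // assumption: identical bounds mean no quiet window
        }
        if (startHour < endHour) {
            return hour >= startHour && hour < endHour; // same-day window, e.g. 12 to 14
        }
        return hour >= startHour || hour < endHour;     // window wraps past midnight, e.g. 22 to 7
    }

    public static void main(String[] args) {
        System.out.println(isQuiet(23, 22, 7)); // true
        System.out.println(isQuiet(6, 22, 7));  // true
        System.out.println(isQuiet(7, 22, 7));  // false
        System.out.println("quiet right now? " + isQuiet(LocalTime.now().getHour(), 22, 7));
    }
}
```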

Keyboard Shortcuts

Shortcut	Action
`Ctrl+K`	Command palette
`Ctrl+/`	Shortcuts help
`Ctrl+F`	Search chat messages
`Ctrl+L`	Clear chat
`Enter`	Send message
`Arrow Up/Down`	Input history
`Esc`	Close overlay/palette

Documentation

Document	Description
docs/SETUP.md	Complete setup guide — prerequisites, env vars, platform setup, troubleshooting
docs/TOOLS.md	Full tool reference — 100+ tools organized by category
CLAUDE.md	Developer context for AI assistants

Project Structure

mins-bot/
├── pom.xml
├── LICENSE
├── CLAUDE.md                              # AI developer context
├── run.bat / build-installer.bat          # Windows launch scripts
├── docs/
│   ├── SETUP.md                           # Full setup guide
│   └── TOOLS.md                           # Tool reference (100+ tools)
│
├── src/main/java/com/minsbot/
│   ├── FloatingAppLauncher.java           # JavaFX entry point
│   ├── MinsbotApplication.java            # Spring Boot entry point
│   ├── ChatService.java                   # Core agent loop & AI orchestration
│   ├── ChatController.java                # REST API endpoints
│   │
│   ├── agent/
│   │   ├── ProactiveActionService.java    # Jarvis-like auto-action engine
│   │   ├── ProactiveEngineService.java    # Briefings, reminders, nudges
│   │   ├── EpisodicMemoryService.java     # Life event memory system
│   │   ├── AutoMemoryExtractor.java       # Auto-detect life facts from chat
│   │   ├── ScreenStateService.java        # Screen capture + AI analysis
│   │   ├── SystemContextProvider.java     # System prompt builder
│   │   └── tools/                         # 100+ tool implementations
│   │       ├── ToolRouter.java            # Dynamic tool selection
│   │       ├── ToolClassifierService.java # AI-powered tool categorization
│   │       ├── HealthTrackerTools.java    # Health logging & trends
│   │       ├── FinanceTrackerTools.java   # Expense/budget tracking
│   │       ├── GitHubTools.java           # GitHub API (18 tools)
│   │       ├── RemotionVideoTools.java    # Programmatic video creation
│   │       ├── LifeProfileTools.java      # Personal life profile
│   │       ├── EpisodicMemoryTools.java   # Memory recall & search
│   │       └── ...
│   │
│   ├── skills/                            # 81 pluggable skill packages (Config/Service/Controller each)
│   │   ├── encoder/ hashcalc/ jsontools/ regextester/ randomgen/
│   │   ├── unitconvert/ notes/ reminders/ timer/ clipboardhistory/
│   │   ├── netinfo/ metaanalyzer/ keywordextractor/ sitemapchecker/
│   │   ├── robotschecker/ readability/ sluggenerator/ utmbuilder/
│   │   ├── subjectanalyzer/ charcounter/ abtestcalc/ hashtagsuggest/
│   │   ├── passwordstrength/ hibpcheck/ jwtinspector/ certinspector/
│   │   ├── headeraudit/ dnslookup/ cvelookup/ hashidentifier/
│   │   ├── secretsscan/ emailvalidator/ colortools/ imagemeta/
│   │   ├── writingtools/ citationformatter/ statsbasics/ financecalc/
│   │   ├── taxcalc/ realestatecalc/ stockindicators/ meetingcost/
│   │   ├── timezoneconvert/ slacalc/ bmicalc/ medicalunits/
│   │   ├── recipescaler/ geometrycalc/ langdetector/ gradecalc/
│   │   ├── okrtracker/ cronvalidator/ csvtools/ difftool/
│   │   ├── yamltools/ sqlformatter/ markdowntools/ dockerfilelint/
│   │   ├── probabilitycalc/ matrixops/ physicscalc/ geodistance/
│   │   ├── breakevencalc/ depreciationcalc/ cashflowforecast/
│   │   ├── macrocalc/ pacecalc/ heartratezones/ headlineanalyzer/
│   │   ├── markdownhtml/ numberwords/ piiredactor/ exifstripper/
│   │   ├── encryptionaes/ fakedatagen/ flashcardmaker/ loganalyzer/
│   │   ├── pomodoroplanner/ httptester/ regexinferrer/ diskscan/
│   │   └── ...
│   └── [Platform]*.java                   # 10 messaging integrations
│
├── src/main/resources/
│   ├── application.properties
│   └── static/                            # Frontend (HTML/CSS/JS)
│
├── src/test/java/                         # 51 unit tests
│   └── com/minsbot/agent/
│       ├── EpisodicMemoryServiceTest.java
│       └── tools/
│           ├── HealthTrackerToolsTest.java
│           ├── FinanceTrackerToolsTest.java
│           └── LifeProfileToolsTest.java
│
└── ~/mins_bot_data/                       # Persistent data (user home)
    ├── personal_config.txt
    ├── life_profile.txt
    ├── directives.txt
    ├── health/                            # Health logs
    ├── finance/                           # Finance logs
    ├── episodic_memory/                   # Life event JSONs
    ├── knowledge_base/                    # Uploaded documents
    ├── remotion/                          # Video project
    └── videos/                            # Rendered videos

API Endpoints

Chat

Method	Endpoint	Description
`POST`	`/api/chat`	Send a message, get a reply
`GET`	`/api/chat/history`	Load recent chat history
`GET`	`/api/chat/status`	Poll tool execution updates
`POST`	`/api/chat/clear`	Clear memory and transcript
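
The chat endpoint can be called from any HTTP client. The sketch below uses the JDK's built-in `java.net.http` client against the default port 8765. The request body shape (`{"message": "..."}`) is an assumption, as are the names `ChatClientSketch`, `toJson`, and `buildChatRequest`; this README does not document the payload or response format, so check `ChatController.java` for the real contract.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical sketch: send one message to POST /api/chat on a locally running instance. */
final class ChatClientSketch {
    /** Builds an ASSUMED JSON body. Escapes only backslashes and quotes; use a JSON library for real input. */
    static String toJson(String message) {
        String escaped = message.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"message\":\"" + escaped + "\"}";
    }

    static HttpRequest buildChatRequest(String baseUrl, String message) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chat"))
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(message)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Requires the application to be running locally on the default server.port.
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildChatRequest("http://localhost:8765", "hello"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```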

Modes

Method	Endpoint	Description
`POST`	`/api/proactive-action/toggle`	Toggle proactive action mode
`GET`	`/api/status/proactive-action`	Proactive action status
`POST`	`/api/autopilot/toggle`	Toggle auto-pilot
`GET`	`/api/status/autopilot`	Auto-pilot status

System

Method	Endpoint	Description
`GET`	`/api/health`	System health check
`GET`	`/api/version`	App version
`POST`	`/api/briefing`	Generate daily briefing

Knowledge Base

Method	Endpoint	Description
`POST`	`/api/kb/upload`	Upload document
`GET`	`/api/kb/list`	List documents
`GET`	`/api/kb/read/{name}`	Read document
`DELETE`	`/api/kb/{name}`	Delete document

Roadmap

Recently Shipped

Near-term

Smart home integration — Home Assistant / MQTT for lights, thermostat, locks, cameras
Contact CRM — relationship tracking with last interaction, birthday alerts, gift ideas
Daily briefing dashboard — visual home screen with weather, calendar, tasks, health, budget
Sidebar navigation — collapsible icon sidebar replacing the horizontal tab bar
Rich message cards — structured cards for health logs, finance, weather, bills

Mid-term

Wake word detection — always-listening "Hey Mins" trigger
Workflow builder — visual drag-and-drop automation chains
Location awareness — GPS/IP-based triggers ("you're near the grocery store")
Subscription tracker — track all subscriptions, total cost, renewal reminders
Docker management — list/start/stop containers, view logs

Long-term

Plugin marketplace — community-created skills and tools
Multi-device sync — seamless handoff between desktop, phone, and wearables
Voice cloning — custom Jarvis voice via voice training
AR/camera integration — point phone camera for real-world AI assistance
Offline mode — local LLM fallback when internet is unavailable

Troubleshooting

Problem	Solution
`no jfxwebkit in java.library.path`	Use `mvn spring-boot:run` or add `-Djava.library.path=target/javafx-natives`
`JavaFX runtime components are missing`	Add `--add-modules javafx.controls,javafx.web,javafx.fxml` to VM options
Window doesn't appear	Check port 8765 is free; try `http://localhost:8765/` in a browser
Circular dependency on startup	Make sure ToolRouter injection points are annotated with `@Lazy`
Voice not working	Try opening `http://localhost:8765/` in Chrome instead of the JavaFX window
Messaging webhook not receiving	Ensure HTTPS (ngrok) and correct webhook URL
Ollama models not loading	Install Ollama from ollama.com and pull your model with `ollama pull <model>`
GitHub tools not working	Set `GITHUB_TOKEN` environment variable with a Personal Access Token
Remotion render fails	Ensure Node.js 18+ is installed; run `setupRemotion` first

License

This project is licensed under the GNU Affero General Public License v3.0 with the Commons Clause license condition.

You may:

Use, view, and modify the source code
Distribute copies under the same license terms
Use it for personal and internal purposes

You may not:

Sell the software or offer it as a paid hosted service
Distribute modified versions without sharing the source code
Remove license or copyright notices

See LICENSE for the full text.
