Releases: dortanes/atlas
v0.2.3
🛸 Atlas v0.2.3 — Voice Control, Settings Redesign & Smart Defaults
Talk to Atlas hands-free and enjoy a completely rebuilt settings experience.
What's new
- 🎙️ Local Speech-to-Text — offline voice recognition via Vosk, no cloud API required. Just say the persona's name and start talking
- 👂 Wake word activation — Atlas listens for the active persona's name and starts capturing your command automatically
- 🏝️ Listening Island — a floating UI shows your live transcript as you speak, so you know exactly what Atlas hears
- 🖥️ Full-page Settings — the settings panel has been rebuilt from scratch with sidebar navigation, auto-save, and a cleaner layout
- 🧠 Intelligence tab — LLM and generation parameters are now grouped under a single "Intelligence" tab
- 🔊 Voice tab — TTS and STT settings live together in one unified "Voice" tab
- 😀 Emoji personas — pick an emoji avatar for each persona with the new built-in emoji picker
- 🔄 Reset to defaults — every settings section now has a one-click reset button
- 🖱️ Multi-monitor screenshots — vision actions now target the correct display on multi-monitor setups
- 🔇 TTS interruption fix — voice playback stops immediately when you interrupt the agent
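
As a rough illustration of the wake word item above, the sketch below takes a transcript string (such as an offline recognizer might produce) and returns whatever was said after the persona's name. The function name `extractCommand` and its matching rules are assumptions made for this example, not Atlas's actual implementation; it only handles single-word persona names.

```typescript
// Hypothetical helper: not taken from the Atlas codebase.
// Returns the text spoken after the persona's name, or null when the
// name does not appear in the transcript.
function extractCommand(transcript: string, personaName: string): string | null {
  const words = transcript.toLowerCase().trim().split(/\s+/);
  const wake = personaName.toLowerCase().trim();

  // Assumption: the persona name is a single word in the transcript.
  const index = words.indexOf(wake);
  if (index === -1) return null;

  // Everything after the wake word is treated as the command.
  return words.slice(index + 1).join(" ");
}
```

With this sketch, "hey atlas open the browser" and a persona named "Atlas" would yield "open the browser", while a transcript without the name yields `null`.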
Full changelog
See CHANGELOG for a detailed list of all added, changed, and removed items.
⚠️ Windows only. macOS & Linux support is planned.
v0.2.2
🛸 Atlas v0.2.2 — File Logging, Session Tracing & Tray Fix
Persistent logs, clear session boundaries, and a smoother settings experience.
What's new
- 📝 Persistent file logging — all log output is now saved to disk, so you can review what happened even after closing the app
- 📏 Auto-rotation — the log file is automatically trimmed once it reaches 5 MB, keeping the last ~1 MB of recent context
- 🔖 Session banners — each request is wrapped in clear SESSION START / SESSION END markers for easy tracing
- 📄 "Open Log File" button — jump straight to the log file from Settings → General
- 🔒 Tray icon fix — clicking the tray icon while Settings is open no longer collapses the main window
- 📁 Folder naming — internal userData folders now follow Chromium naming conventions (capitalized)
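
The auto-rotation item above (trim at 5 MB, keep about the last 1 MB) can be sketched as a pure function over the log's bytes. This is a minimal sketch under assumptions: the name `trimLog`, the constants, and the choice to start the kept tail on a line boundary are all hypothetical and may not match how Atlas does it.

```typescript
// Hypothetical names and thresholds, taken from the release note wording only.
const MAX_LOG_BYTES = 5 * 1024 * 1024; // trim once the log reaches 5 MB
const KEEP_LOG_BYTES = 1024 * 1024; // keep roughly the last 1 MB

function trimLog(
  data: Uint8Array,
  maxBytes: number = MAX_LOG_BYTES,
  keepBytes: number = KEEP_LOG_BYTES,
): Uint8Array {
  // Under the limit: nothing to do.
  if (data.length < maxBytes) return data;

  let start = Math.max(0, data.length - keepBytes);

  // Assumption: avoid starting mid-line. If the cut falls inside a line,
  // skip forward to just past the next newline (byte 0x0a).
  if (start > 0 && data[start - 1] !== 0x0a) {
    const newline = data.indexOf(0x0a, start);
    if (newline !== -1 && newline + 1 < data.length) start = newline + 1;
  }

  return data.subarray(start);
}
```

Because the cut moves forward to a line boundary, the kept tail is slightly smaller than `keepBytes`, which is consistent with the "~1 MB" wording.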
Full changelog
See CHANGELOG for a detailed list of all added and changed items.
⚠️ Windows only. macOS & Linux support is planned.
v0.2.1
🛸 Atlas v0.2.1 — Context Caching, Cursor Animations & Alice TTS
Smarter computer use, real-time action overlay, new free TTS provider, and a bunch of under-the-hood improvements.
What's new
- 🎯 Agent cursor overlay — see the agent's mouse moving, clicking, typing, and scrolling on screen in real time
- 🗣 Alice TTS — new free text-to-speech provider, no API key required (Settings → TTS → Yandex Alice)
- ⚡ Gemini context caching — stable parts of the system prompt are cached, reducing token usage and speeding up responses
- 🖥 Computer Use optimization — fewer redundant screenshots, dynamic delays, smarter retry logic
- 💬 Better error handling — when an action fails, the agent explains what went wrong in plain language instead of showing raw errors
- 🚫 SendKeys blocked — text input via WScript.Shell is intercepted and redirected to robotjs for reliable non-Latin character support
- 🌐 Navigate action — the agent opens URLs in the current tab instead of spawning new ones
- 🌍 Same-language task planning — step descriptions now match the language of your command
- 🧹 Auto-clear on new command — old responses and search results are cleared when you send a new request
- 📁 OneDrive-safe file paths — shell commands resolve Desktop, Documents, etc. dynamically even when redirected by OneDrive
- ⚙️ Removed "Always on Top" setting — window management is now handled automatically
- 🔧 Prompt refinements — improved instructions across all prompts for better accuracy and language consistency
Full changelog
See CHANGELOG for a detailed list of all added, changed, and removed items.
Getting started
- Download the .exe below
- Launch Atlas → click tray icon → Settings
- Paste your Gemini API key → set recommended models in Settings → LLM tab:
  - Text: gemini-3.1-flash-lite-preview
  - Vision: gemini-3-flash-preview (paid) / gemini-3.1-flash-lite-preview (free)
- (Optional) Enable TTS: Settings → TTS → select Yandex Alice (free) or ElevenLabs (API key required)
- Press Ctrl+Space → go 🚀
⚠️ Windows only. macOS & Linux support is planned.
v0.2.0
🛸 Atlas v0.2.0 — Computer Use, Smart Actions & File Search
Major update: native screen control, fast action routing, file search, and a bunch of quality-of-life improvements.
What's new
- 🖥 Native computer control via Gemini Computer Use API — the agent sees the screen and performs actions (clicks, typing, scrolling, navigation)
- ⚡ Fast system actions without screenshots — opening/closing apps, volume control, system info
- 🧠 Three-way request classification — conversation, quick action, or screen interaction
- 📋 Task planning — the agent breaks down complex requests into steps with a progress bar in the UI
- 📂 Local file search — find files on disk with results displayed in Search Island (Open / Reveal actions)
- 🔄 Migrated from nut-js to robotjs for better performance and reliability
- 🔧 Session debug logger — detailed per-request logs with timers, togglable in Settings
- ⚠️ TTS error handling — dismissible warnings for quota, auth, and rate-limit errors with auto-disable on quota exhaustion
- 🔍 Search query normalization — strips paths, wildcards, and invalid patterns from LLM-generated queries
- 🎨 UI improvements — progress bar for multi-step tasks, file search results, new Settings options
and more
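
To illustrate the search query normalization item above, here is a small sketch of stripping paths, wildcards, and characters that are invalid in Windows file names from a query. The function name `normalizeSearchQuery` and the exact rules are assumptions for this example; the release notes do not describe which patterns Atlas actually removes.

```typescript
// Hypothetical helper: the real normalization rules in Atlas may differ.
function normalizeSearchQuery(raw: string): string {
  let query = raw.trim();

  // Drop quotes that wrap the whole query.
  query = query.replace(/^["']+|["']+$/g, "");

  // Strip any path prefix: keep only the text after the last separator.
  const lastSeparator = Math.max(query.lastIndexOf("/"), query.lastIndexOf("\\"));
  if (lastSeparator !== -1) query = query.slice(lastSeparator + 1);

  // Remove wildcards (* ?) and characters not allowed in Windows file names.
  query = query.replace(/[*?<>:"|]/g, "");

  // Collapse repeated whitespace left behind by the removals.
  return query.replace(/\s+/g, " ").trim();
}
```

Under these assumed rules, a quoted query such as "report*2024?.docx" becomes report2024.docx, and a query carrying a directory prefix is reduced to its final segment.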
Getting started
- Download the .exe below
- Launch Atlas → click tray icon → Settings
- Paste your Gemini API key → set recommended models in Settings → LLM tab:
  - Text: gemini-3.1-flash-lite-preview
  - Vision: gemini-3-flash-preview (paid) / gemini-3.1-flash-lite-preview (free)
- Press Ctrl+Space → go 🚀
⚠️ Windows only. macOS & Linux support is planned.
v0.1.0
🛸 Atlas v0.1.0 — First Public Release
The first MVP release of Atlas — an AI agent that lives on your desktop as a transparent overlay.
What's inside
- 🔮 The Orb — animated AI state indicator (idle → thinking → acting)
- 🏝 Islands — floating context panels: actions, responses, permissions, task queue
- 🧠 Gemini & OpenAI — multi-provider LLM with streaming responses
- 🖥 Screen vision — sees your display and controls mouse, keyboard, and terminal
- 🔍 Web search — built-in DuckDuckGo integration
- 🗣 Voice output — streaming TTS via ElevenLabs
- 🎭 Personas — multiple AI agents with isolated memory and custom prompts
- 🛡 Safety system — permission prompts before risky actions
Getting started
- Download the .exe below
- Launch Atlas → click tray icon → Settings
- Paste your Gemini API key → press Ctrl+Space → go 🚀
⚠️ Windows only. macOS & Linux support is planned.