A floating macOS AI assistant that sees your screen, remembers your session, and acts across every app — powered by Gemini 2.5 Flash.
Nova is a lightweight macOS overlay app that lives in your menu bar and activates on a global hotkey. Press Ctrl+Option from anywhere — Nova captures your screen, sends it to Gemini 2.5 Flash along with your query and a rolling memory of your session, and responds intelligently: speaking an answer out loud, animating a cursor to the relevant part of your screen, and sliding in a glassmorphic response panel when there's something worth reading.
It works across every app, every window, every context — without switching focus or breaking your flow.
📹 [Demo video link — coming soon]
Press Ctrl+Option from any app to open the Nova Command Bar at the top-center of your screen. Type your query and hit Enter — Nova handles the rest. No switching apps, no context loss.
On every query, Nova captures your screen using ScreenCaptureKit and passes it directly to Gemini's vision model. It always has context about what's in front of you — so you never need to describe your screen.
Nova's signature feature. After Gemini responds, the cursor flies to the most relevant UI element on screen: a button, a field, a data cell, anything. Gemini always provides a target coordinate, so the cursor is active on every single query. It's not just an answer; it's a visual pointer to where the answer is on screen.
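As a rough illustration of the pointing step, the sketch below maps a target coordinate to screen points and interpolates a cursor path toward it. It assumes the coordinate arrives in screenshot pixels with a top-left origin, which this README does not specify. `Point`, `toScreenPoints`, and `easedPath` are hypothetical names, not Nova's actual code.

```swift
import Foundation

/// Minimal 2D point so the sketch does not depend on CoreGraphics.
struct Point: Equatable {
    var x: Double
    var y: Double
}

/// Converts a screenshot-pixel coordinate to screen points.
/// Assumption: the model returns pixels with a top-left origin, and
/// `scale` is the display's backing scale factor (2.0 on Retina).
/// The result is clamped to the screen bounds.
func toScreenPoints(_ pixel: Point, scale: Double, screenSize: Point) -> Point {
    let x = min(max(pixel.x / scale, 0), screenSize.x)
    let y = min(max(pixel.y / scale, 0), screenSize.y)
    return Point(x: x, y: y)
}

/// Returns `steps + 1` positions from `start` to `end` along a
/// smoothstep ease-in-out curve, so the cursor accelerates and then settles.
func easedPath(from start: Point, to end: Point, steps: Int) -> [Point] {
    precondition(steps > 0, "steps must be positive")
    return (0...steps).map { i in
        let t = Double(i) / Double(steps)
        let eased = t * t * (3 - 2 * t) // smoothstep
        return Point(x: start.x + (end.x - start.x) * eased,
                     y: start.y + (end.y - start.y) * eased)
    }
}
```

Each position in the returned path could be applied on a timer tick to move the overlay cursor; clamping keeps a slightly off-screen model coordinate from sending the cursor out of view.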
When a response is worth reading — a list, a multi-step explanation, formatted text — a sleek frosted-glass panel slides in from the right side of the screen. It renders full Markdown, auto-dismisses when no longer needed, and never covers the center of your workspace.
Nova maintains a rolling in-memory log of your last ~15 interactions across the entire session. Every query you make is enriched with this history — so if you were analyzing an Excel model 10 minutes ago and you're now drafting an email, Nova already knows. No re-explaining, no context loss between apps.
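A rolling log like this can be sketched as a capped array that drops its oldest entries. The type and field names below (`Interaction`, `SessionMemory`, `contextSummary`) are illustrative assumptions, and the summary format is made up for the example; the real logic lives in `NovaSessionMemory.swift`.

```swift
import Foundation

/// One remembered interaction. Field names are illustrative.
struct Interaction {
    let id = UUID()
    let appName: String
    let query: String
    let spokenResponse: String
}

/// Keeps only the most recent `capacity` interactions in memory.
struct SessionMemory {
    let capacity: Int
    private(set) var entries: [Interaction] = []

    init(capacity: Int = 15) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    /// Appends an interaction and drops the oldest ones beyond capacity.
    mutating func append(_ interaction: Interaction) {
        entries.append(interaction)
        if entries.count > capacity {
            entries.removeFirst(entries.count - capacity)
        }
    }

    /// Renders the history as numbered lines that can be prepended to a prompt.
    func contextSummary() -> String {
        entries.enumerated()
            .map { "[\($0.offset)] \($0.element.appName): \($0.element.query) -> \($0.element.spokenResponse)" }
            .joined(separator: "\n")
    }

    /// Empties the log, as a "Clear Memory" action would.
    mutating func clear() { entries.removeAll() }
}
```

Numbering the lines in the summary gives the model a simple way to refer back to earlier interactions by index.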
Click the expand button in the Command Bar or menu panel to reveal the Context Web — a live graph showing every Nova interaction in your session as nodes, with edges drawn only when Gemini determines two queries are genuinely semantically related.
Each node displays:
- The app name (color-coded per app)
- A query title like "Gmail — investor email draft" or "Excel — Q3 margin analysis"
Edges are Gemini-decided — not time-proximity heuristics. Being opened 5 minutes apart is not a connection. Sharing the same financial model, topic, or project is. The graph evolves in real time as you work.
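One way to model such a graph is sketched below. It assumes the model names related queries by their index in the current node list and that those indices are resolved to UUIDs as soon as a node is added. `Node`, `Edge`, and `ContextGraph` are hypothetical types for illustration only.

```swift
import Foundation

/// A node in the graph: one interaction, labelled by app and query title.
struct Node {
    let id: UUID
    let appName: String
    let title: String
}

/// A link between two nodes, with the model's stated reason.
struct Edge {
    let from: UUID
    let to: UUID
    let reason: String
}

struct ContextGraph {
    private(set) var nodes: [Node] = []
    private(set) var edges: [Edge] = []

    /// Adds a node and links it to the earlier nodes named by index.
    /// Indices are resolved to UUIDs immediately, so links stay valid
    /// even if index positions shift later. Out-of-range indices are ignored.
    @discardableResult
    mutating func add(appName: String, title: String,
                      relatedToIndices: [Int], connectionReasons: [String]) -> UUID {
        let node = Node(id: UUID(), appName: appName, title: title)
        for (position, index) in relatedToIndices.enumerated()
        where nodes.indices.contains(index) {
            // Pair each index with its reason when one was supplied.
            let reason = position < connectionReasons.count ? connectionReasons[position] : ""
            edges.append(Edge(from: node.id, to: nodes[index].id, reason: reason))
        }
        nodes.append(node)
        return node.id
    }
}
```

No edge is ever created from timestamps here: a node with an empty `relatedToIndices` simply stays unconnected.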
Alongside the Context Web, the Session Pulse widget shows live session stats:
- Total queries made
- Number of cross-app context links Gemini identified
- Topic keywords extracted from the session
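The three stats above could be derived from the query history roughly as follows. This is a sketch under assumptions: `QueryRecord` and `sessionPulse(for:)` are hypothetical names, and a link is counted as cross-app only when the two linked queries came from different apps.

```swift
import Foundation

/// Minimal record of one query for stats purposes. Field names are illustrative.
struct QueryRecord {
    let appName: String
    let topics: [String]
    /// Indices of earlier queries this one was linked to.
    let relatedToIndices: [Int]
}

struct SessionPulse: Equatable {
    let totalQueries: Int
    let crossAppLinks: Int
    let topics: [String]
}

/// Derives the three stats from the query history.
func sessionPulse(for records: [QueryRecord]) -> SessionPulse {
    var links = 0
    var seen = Set<String>()
    var topics: [String] = []
    for record in records {
        // Count only links whose endpoints belong to different apps.
        for index in record.relatedToIndices where records.indices.contains(index) {
            if records[index].appName != record.appName { links += 1 }
        }
        // Keep topics in first-seen order, ignoring case-only duplicates.
        for topic in record.topics {
            if seen.insert(topic.lowercased()).inserted {
                topics.append(topic)
            }
        }
    }
    return SessionPulse(totalQueries: records.count, crossAppLinks: links, topics: topics)
}
```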
Nova lives in your macOS menu bar. Click the icon to open a compact control panel with:
- Session memory count
- Quick access to Expand/Collapse widgets
- Clear Memory button
- Quit
Every Gemini response is spoken aloud using macOS's built-in AVSpeechSynthesizer, which runs on-device: fast, no network round trip, no API cost. The spoken version is always concise; the full response lives in the panel.
| Component | Technology |
|---|---|
| AI Vision + Chat | Gemini 2.5 Flash (Google AI) |
| Screen Capture | ScreenCaptureKit |
| Text-to-Speech | AVSpeechSynthesizer (macOS native) |
| Cursor Animation | CGEvent + CGDisplay |
| UI Framework | SwiftUI + NSVisualEffectView |
| Language | Swift 5.10 |
| Platform | macOS 14+ |
Nova/
├── leanring-buddy/
│ ├── leanring_buddyApp.swift # App entry point, window wiring
│ ├── CompanionManager.swift # Central state + query pipeline
│ ├── GeminiAPI.swift # Gemini 2.5 Flash integration + JSON parsing
│ ├── NovaAIProvider.swift # Response model (NovaAIResponse)
│ ├── NovaSessionMemory.swift # Rolling session memory + context summary
│ ├── NovaCommandBar.swift # Global hotkey Command Bar UI
│ ├── NovaResponsePanel.swift # Glassmorphic sliding response panel
│ ├── NovaExpandedOverlay.swift # Context Web graph + Session Pulse widgets
│ ├── OverlayWindow.swift # Screen overlay + animated cursor (BlueCursorView)
│ ├── CompanionPanelView.swift # Menu bar dropdown panel
│ ├── MenuBarPanelManager.swift # Menu bar status item manager
│ └── NovaTTSManager.swift # Text-to-speech manager
User presses Ctrl+Option
↓
Nova Command Bar opens
↓
User types query → hits Enter
↓
ScreenCaptureKit captures current screen
↓
[Query + Screenshot + Session Memory Context] → Gemini 2.5 Flash
↓
Gemini returns structured JSON:
spokenResponse, displayText, showPanel,
pointAction (x, y, label),
sessionTopics, queryTitle,
relatedToIndices, connectionReasons
↓
├── AVSpeechSynthesizer speaks the short response
├── Cursor animates to pointAction coordinate
├── Response panel slides in (if showPanel = true)
└── Session memory appended with UUID-stable connections
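The structured JSON in the flow above could be decoded with `Codable` along these lines. The field names come from the flow; the types, the non-optional fields, and the names `PointAction`, `ResponseSketch`, and `decodeResponse` are assumptions for illustration. The project's own model is `NovaAIResponse` in `NovaAIProvider.swift`.

```swift
import Foundation

/// Where the cursor should point. Units and origin are not specified in
/// this README, so they are left uninterpreted here.
struct PointAction: Decodable {
    let x: Double
    let y: Double
    let label: String
}

/// Mirrors the fields listed in the flow above. Types and optionality are guesses.
struct ResponseSketch: Decodable {
    let spokenResponse: String
    let displayText: String
    let showPanel: Bool
    let pointAction: PointAction
    let sessionTopics: [String]
    let queryTitle: String
    let relatedToIndices: [Int]
    let connectionReasons: [String]
}

enum ResponseError: Error {
    case noJSONObject
}

/// Decodes the model's reply, tolerating extra text around the JSON object
/// by slicing from the first "{" to the last "}".
func decodeResponse(_ raw: String) throws -> ResponseSketch {
    guard let start = raw.firstIndex(of: "{"),
          let end = raw.lastIndex(of: "}"),
          start < end else {
        throw ResponseError.noJSONObject
    }
    let json = String(raw[start...end])
    return try JSONDecoder().decode(ResponseSketch.self, from: Data(json.utf8))
}
```

A caller would then branch on the decoded value: speak `spokenResponse`, animate toward `pointAction`, and show the panel only when `showPanel` is true.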
- macOS 14 (Sonoma) or later
- Xcode 15+
- A Google AI Studio API key (free tier)
1. Clone the repository: `git clone https://github.com/yourusername/Nova.git`, then `cd Nova`
2. Open the Xcode project: `open leanring-buddy.xcodeproj`
3. Add your Gemini API key:
   - Open `GeminiAPI.swift`
   - Replace the `apiKey` placeholder with your Google AI Studio key
4. Set your signing team:
   - In Xcode, select the `leanring-buddy` target
   - Go to Signing & Capabilities
   - Set Team to your Apple ID and update the Bundle Identifier
5. Build and run with `Cmd+R`
6. Grant permissions when prompted:
   - Screen Recording: required for ScreenCaptureKit
   - Accessibility: required for cursor animation
| Action | How |
|---|---|
| Activate Nova | Ctrl+Option from any app |
| Submit a query | Type in the Command Bar → Enter |
| Dismiss the response panel | Click anywhere outside, or press Escape |
| Open menu panel | Click the Nova icon in the menu bar |
| Toggle Context Web | Click the expand icon (↗) in the Command Bar or menu panel |
| Clear session memory | Menu panel → "Clear Memory" |
Nova is built around three principles:
- Non-intrusive: Every window is a floating overlay that never steals focus. The response panel slides in from the edge; the command bar appears and disappears. Your workflow is never interrupted, only augmented.
- Context-first: Instead of being a stateless Q&A tool, Nova builds a live map of your session. It knows what you were doing 10 queries ago and uses that to give smarter, more connected answers.
- Show, don't just tell: The animated cursor makes Nova's intelligence visible. When it points to something on screen, you can see it thinking. This isn't just a UX decision; it's the whole feel of the product.
CSBA × HBA Hack Day, UT Austin, April 12, 2026
Submitted for: Best Use of Gemini · Tech × Business Track · Overall
Nova is built on top of Clicky by Farza Majeed.
MIT — see LICENSE for details.