Skip to content

feat(desktop): add floating control bar with AI chat and screenshot capture#4792

Merged
kodjima33 merged 12 commits intomainfrom
feat/floating-control-bar
Feb 13, 2026
Merged

feat(desktop): add floating control bar with AI chat and screenshot capture#4792
kodjima33 merged 12 commits intomainfrom
feat/floating-control-bar

Conversation

@kodjima33
Copy link
Copy Markdown
Collaborator

Summary

  • Adds a floating, always-on-top control bar to the macOS desktop app with play/pause recording, "Ask omi" AI chat, and show/hide toggle
  • Captures a screenshot when opening AI input, shown as a thumbnail with remove/retake options
  • Global hotkeys: Cmd+\ to toggle bar visibility, Cmd+Enter to open AI input panel

Test plan

  • Build: cd desktop && xcrun swift build -c debug --package-path Desktop
  • After sign-in + onboarding, floating bar appears near top of screen
  • Drag bar to reposition; position persists across restarts
  • Click play/pause to toggle recording with timer
  • Press Cmd+Enter to open AI input with auto-captured screenshot
  • Type a question and press Enter for streaming markdown response
  • Press Escape to close AI panel
  • Press Cmd+\ to toggle bar visibility

🤖 Generated with Claude Code

kodjima33 and others added 12 commits February 13, 2026 01:25
…agement

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… dark blur

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ager

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kodjima33 kodjima33 merged commit 47ca7a6 into main Feb 13, 2026
1 check passed
@kodjima33 kodjima33 deleted the feat/floating-control-bar branch February 13, 2026 06:27
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant new feature: a floating control bar with AI chat and screenshot capabilities. The implementation is comprehensive, covering UI, state management, window handling, and global shortcuts. The code is well-structured into separate views and managers.

My review focuses on a few key areas for improvement:

  • Potential crash: There's a force unwrap that could lead to a crash.
  • Performance: A synchronous I/O call on the main thread could cause UI lag.
  • Maintainability: Some parts of the code use hardcoded values for layout calculations, which can be fragile. The logic for handling streaming AI responses is also a bit convoluted and could be simplified.
  • Privacy: A user's local file path is being sent to the AI service, which is a privacy concern.

Overall, this is a great addition. Addressing these points will make the feature more robust, performant, and secure.

Comment on lines +466 to +471
if !barWindow!.state.showingAIResponse {
withAnimation(.spring(response: 0.4, dampingFraction: 0.8)) {
barWindow?.state.showingAIResponse = true
}
barWindow?.resizeToResponseHeightPublic(animated: true)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Force unwrapping barWindow! here can lead to a crash if the weak reference becomes nil. It's crucial to safely unwrap optionals, especially weak ones inside closures.

Suggested change
if !barWindow!.state.showingAIResponse {
withAnimation(.spring(response: 0.4, dampingFraction: 0.8)) {
barWindow?.state.showingAIResponse = true
}
barWindow?.resizeToResponseHeightPublic(animated: true)
}
if let barWindow, !barWindow.state.showingAIResponse {
withAnimation(.spring(response: 0.4, dampingFraction: 0.8)) {
barWindow.state.showingAIResponse = true
}
barWindow.resizeToResponseHeightPublic(animated: true)
}

Comment on lines +112 to +121
private var needsExpansion: Bool {
let font = NSFont.systemFont(ofSize: 13)
let attributes = [NSAttributedString.Key.font: font]
let size = (userInput as NSString).boundingRect(
with: NSSize(width: 350, height: CGFloat.greatestFiniteMagnitude),
options: .usesLineFragmentOrigin,
attributes: attributes
).size
return size.height > font.pointSize * 1.5
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The needsExpansion property uses a hardcoded width of 350 to calculate whether the text needs to be expandable. This will not behave correctly if the window is resized, as the AI response view is resizable. The width should be determined dynamically from the view's geometry to ensure the expansion logic is always accurate.

}

// Screenshot thumbnail
if let url = screenshotURL, let nsImage = NSImage(contentsOf: url) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

NSImage(contentsOf: url) is a synchronous, blocking I/O call. When placed inside a view's body, it runs on the main thread and can cause UI stutters or unresponsiveness, especially with larger images or slower disk access. It's best practice to load images asynchronously.

You could introduce a @State private var thumbnailImage: NSImage? and load the image into it within a Task inside .onAppear and .onChange(of: screenshotURL).

Comment on lines +88 to +92
onHeightChange: { [weak state] height in
guard let state = state else { return }
let totalHeight = 60 + height + 24 + (state.screenshotURL != nil ? 60 : 0)
state.inputViewHeight = totalHeight
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The height calculation for the input view uses several magic numbers (60, 24, 60). This makes the code fragile and hard to maintain. If the layout of any subviews changes, this calculation will need to be manually updated, and it's easy to miss, leading to layout bugs.

It would be more robust to use SwiftUI's layout system to determine the height, for example by using PreferenceKeys to pass heights up the view hierarchy.

Comment on lines +479 to +482
var fullMessage = message
if let url = screenshotURL {
fullMessage = "[Screenshot of user's screen attached: \(url.path)]\n\n\(message)"
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The user's local file path for the screenshot is being included in the prompt sent to the AI. This is a privacy concern as it leaks information about the user's file system structure, including their username. The prompt should indicate that a screenshot is attached without revealing the local path. The image data itself should be sent if the API supports it, or this feature should be re-evaluated if it's text-only.

Suggested change
var fullMessage = message
if let url = screenshotURL {
fullMessage = "[Screenshot of user's screen attached: \(url.path)]\n\n\(message)"
}
var fullMessage = message
if screenshotURL != nil {
fullMessage = "[Screenshot of user's screen is attached]\n\n\(message)"
}

.sink { [weak barWindow] messages in
guard let lastMessage = messages.last, lastMessage.sender == .ai else { return }
if lastMessage.isStreaming {
barWindow?.updateAIResponse(type: "data", text: "")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This call to updateAIResponse(type: "data", text: "") is confusing. It seems to be used only for its side effects (updating isAILoading, showingAIResponse, and resizing the window), while the text: "" argument is ignored because the text is overwritten on the next line. This makes the code hard to follow and error-prone, as there are also redundant state updates in this block.

Consider refactoring to make the state transitions more explicit. For example, create a separate function like beginStreamingResponse() to handle the UI state changes, and then update the text directly. This would improve readability and maintainability.

ellaaicare pushed a commit to ellaaicare/omi that referenced this pull request Apr 12, 2026
…apture (BasedHardware#4792)

## Summary
- Adds a floating, always-on-top control bar to the macOS desktop app
with play/pause recording, "Ask omi" AI chat, and show/hide toggle
- Captures a screenshot when opening AI input, shown as a thumbnail with
remove/retake options
- Global hotkeys: Cmd+\ to toggle bar visibility, Cmd+Enter to open AI
input panel

## Test plan
- [ ] Build: `cd desktop && xcrun swift build -c debug --package-path
Desktop`
- [ ] After sign-in + onboarding, floating bar appears near top of
screen
- [ ] Drag bar to reposition; position persists across restarts
- [ ] Click play/pause to toggle recording with timer
- [ ] Press Cmd+Enter to open AI input with auto-captured screenshot
- [ ] Type a question and press Enter for streaming markdown response
- [ ] Press Escape to close AI panel
- [ ] Press Cmd+\ to toggle bar visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Glucksberg pushed a commit to Glucksberg/omi-local that referenced this pull request Apr 28, 2026
…apture (BasedHardware#4792)

## Summary
- Adds a floating, always-on-top control bar to the macOS desktop app
with play/pause recording, "Ask omi" AI chat, and show/hide toggle
- Captures a screenshot when opening AI input, shown as a thumbnail with
remove/retake options
- Global hotkeys: Cmd+\ to toggle bar visibility, Cmd+Enter to open AI
input panel

## Test plan
- [ ] Build: `cd desktop && xcrun swift build -c debug --package-path
Desktop`
- [ ] After sign-in + onboarding, floating bar appears near top of
screen
- [ ] Drag bar to reposition; position persists across restarts
- [ ] Click play/pause to toggle recording with timer
- [ ] Press Cmd+Enter to open AI input with auto-captured screenshot
- [ ] Type a question and press Enter for streaming markdown response
- [ ] Press Escape to close AI panel
- [ ] Press Cmd+\ to toggle bar visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant