feat(desktop): add floating control bar with AI chat and screenshot capture#4792
feat(desktop): add floating control bar with AI chat and screenshot capture#4792
Conversation
…agement Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… dark blur Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ager Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a significant new feature: a floating control bar with AI chat and screenshot capabilities. The implementation is comprehensive, covering UI, state management, window handling, and global shortcuts. The code is well-structured into separate views and managers.
My review focuses on a few key areas for improvement:
- Potential crash: There's a force unwrap that could lead to a crash.
- Performance: A synchronous I/O call on the main thread could cause UI lag.
- Maintainability: Some parts of the code use hardcoded values for layout calculations, which can be fragile. The logic for handling streaming AI responses is also a bit convoluted and could be simplified.
- Privacy: A user's local file path is being sent to the AI service, which is a privacy concern.
Overall, this is a great addition. Addressing these points will make the feature more robust, performant, and secure.
| if !barWindow!.state.showingAIResponse { | ||
| withAnimation(.spring(response: 0.4, dampingFraction: 0.8)) { | ||
| barWindow?.state.showingAIResponse = true | ||
| } | ||
| barWindow?.resizeToResponseHeightPublic(animated: true) | ||
| } |
There was a problem hiding this comment.
Force unwrapping barWindow! here can lead to a crash if the weak reference becomes nil. It's crucial to safely unwrap optionals, especially weak ones inside closures.
| if !barWindow!.state.showingAIResponse { | |
| withAnimation(.spring(response: 0.4, dampingFraction: 0.8)) { | |
| barWindow?.state.showingAIResponse = true | |
| } | |
| barWindow?.resizeToResponseHeightPublic(animated: true) | |
| } | |
| if let barWindow, !barWindow.state.showingAIResponse { | |
| withAnimation(.spring(response: 0.4, dampingFraction: 0.8)) { | |
| barWindow.state.showingAIResponse = true | |
| } | |
| barWindow.resizeToResponseHeightPublic(animated: true) | |
| } |
| private var needsExpansion: Bool { | ||
| let font = NSFont.systemFont(ofSize: 13) | ||
| let attributes = [NSAttributedString.Key.font: font] | ||
| let size = (userInput as NSString).boundingRect( | ||
| with: NSSize(width: 350, height: CGFloat.greatestFiniteMagnitude), | ||
| options: .usesLineFragmentOrigin, | ||
| attributes: attributes | ||
| ).size | ||
| return size.height > font.pointSize * 1.5 | ||
| } |
There was a problem hiding this comment.
The needsExpansion property uses a hardcoded width of 350 to calculate whether the text needs to be expandable. This will not behave correctly if the window is resized, as the AI response view is resizable. The width should be determined dynamically from the view's geometry to ensure the expansion logic is always accurate.
| } | ||
|
|
||
| // Screenshot thumbnail | ||
| if let url = screenshotURL, let nsImage = NSImage(contentsOf: url) { |
There was a problem hiding this comment.
NSImage(contentsOf: url) is a synchronous, blocking I/O call. When placed inside a view's body, it runs on the main thread and can cause UI stutters or unresponsiveness, especially with larger images or slower disk access. It's best practice to load images asynchronously.
You could introduce a @State private var thumbnailImage: NSImage? and load the image into it within a Task inside .onAppear and .onChange(of: screenshotURL).
| onHeightChange: { [weak state] height in | ||
| guard let state = state else { return } | ||
| let totalHeight = 60 + height + 24 + (state.screenshotURL != nil ? 60 : 0) | ||
| state.inputViewHeight = totalHeight | ||
| }, |
There was a problem hiding this comment.
The height calculation for the input view uses several magic numbers (60, 24, 60). This makes the code fragile and hard to maintain. If the layout of any subviews changes, this calculation will need to be manually updated, and it's easy to miss, leading to layout bugs.
It would be more robust to use SwiftUI's layout system to determine the height, for example by using PreferenceKeys to pass heights up the view hierarchy.
| var fullMessage = message | ||
| if let url = screenshotURL { | ||
| fullMessage = "[Screenshot of user's screen attached: \(url.path)]\n\n\(message)" | ||
| } |
There was a problem hiding this comment.
The user's local file path for the screenshot is being included in the prompt sent to the AI. This is a privacy concern as it leaks information about the user's file system structure, including their username. The prompt should indicate that a screenshot is attached without revealing the local path. The image data itself should be sent if the API supports it, or this feature should be re-evaluated if it's text-only.
| var fullMessage = message | |
| if let url = screenshotURL { | |
| fullMessage = "[Screenshot of user's screen attached: \(url.path)]\n\n\(message)" | |
| } | |
| var fullMessage = message | |
| if screenshotURL != nil { | |
| fullMessage = "[Screenshot of user's screen is attached]\n\n\(message)" | |
| } |
| .sink { [weak barWindow] messages in | ||
| guard let lastMessage = messages.last, lastMessage.sender == .ai else { return } | ||
| if lastMessage.isStreaming { | ||
| barWindow?.updateAIResponse(type: "data", text: "") |
There was a problem hiding this comment.
This call to updateAIResponse(type: "data", text: "") is confusing. It seems to be used only for its side effects (updating isAILoading, showingAIResponse, and resizing the window), while the text: "" argument is ignored because the text is overwritten on the next line. This makes the code hard to follow and error-prone, as there are also redundant state updates in this block.
Consider refactoring to make the state transitions more explicit. For example, create a separate function like beginStreamingResponse() to handle the UI state changes, and then update the text directly. This would improve readability and maintainability.
…apture (BasedHardware#4792) ## Summary - Adds a floating, always-on-top control bar to the macOS desktop app with play/pause recording, "Ask omi" AI chat, and show/hide toggle - Captures a screenshot when opening AI input, shown as a thumbnail with remove/retake options - Global hotkeys: Cmd+\ to toggle bar visibility, Cmd+Enter to open AI input panel ## Test plan - [ ] Build: `cd desktop && xcrun swift build -c debug --package-path Desktop` - [ ] After sign-in + onboarding, floating bar appears near top of screen - [ ] Drag bar to reposition; position persists across restarts - [ ] Click play/pause to toggle recording with timer - [ ] Press Cmd+Enter to open AI input with auto-captured screenshot - [ ] Type a question and press Enter for streaming markdown response - [ ] Press Escape to close AI panel - [ ] Press Cmd+\ to toggle bar visibility 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…apture (BasedHardware#4792) ## Summary - Adds a floating, always-on-top control bar to the macOS desktop app with play/pause recording, "Ask omi" AI chat, and show/hide toggle - Captures a screenshot when opening AI input, shown as a thumbnail with remove/retake options - Global hotkeys: Cmd+\ to toggle bar visibility, Cmd+Enter to open AI input panel ## Test plan - [ ] Build: `cd desktop && xcrun swift build -c debug --package-path Desktop` - [ ] After sign-in + onboarding, floating bar appears near top of screen - [ ] Drag bar to reposition; position persists across restarts - [ ] Click play/pause to toggle recording with timer - [ ] Press Cmd+Enter to open AI input with auto-captured screenshot - [ ] Type a question and press Enter for streaming markdown response - [ ] Press Escape to close AI panel - [ ] Press Cmd+\ to toggle bar visibility 🤖 Generated with [Claude Code](https://claude.com/claude-code)
Summary
Test plan
cd desktop && xcrun swift build -c debug --package-path Desktop🤖 Generated with Claude Code