Voice Input Integration in ChatPanel
1. Current State: No Voice Infrastructure Exists
A grep search across all .tsx files for every relevant audio/voice keyword (SpeechRecognition, webkitSpeechRecognition, MediaRecorder, AudioContext, whisper, deepgram, transcri) returns zero matches. There is no microphone, voice, or audio feature of any kind in the codebase today.
2. The Exact Integration Point Inside ChatPanel
The voice button belongs between the <Textarea> closing tag and the submit <Button> inside the relative flex items-start wrapper div. This is the absolute-positioned button zone on the right side of the input row.
Currently the layout has two absolute-positioned elements stacked on the right:
- The ArrowRight submit button sits at right-1 (mobile) / right-2 (desktop). 1
The microphone button should be inserted before that submit button at a position like right-9 or right-10, and the <Textarea>'s right padding (pr-12) would need to increase to pr-20 to accommodate two right-side icons. 2
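The layout changes above can be summarized in a small hypothetical helper (`inputRowClasses` is illustrative, not existing code; the right-9 / right-10 and pr-20 values are this issue's proposal):

```typescript
// Sketch of the proposed Tailwind class adjustments for the input row.
// Only pr-12 and right-1/right-2 exist today; the rest is the proposal.
function inputRowClasses(isMobile: boolean, hasMicButton: boolean) {
  return {
    // Textarea needs wider right padding once two icons share that edge
    textareaPadding: hasMicButton ? 'pr-20' : 'pr-12',
    // Existing submit button position (unchanged)
    submitRight: isMobile ? 'right-1' : 'right-2',
    // Proposed mic button slot, just left of the submit button
    micRight: isMobile ? 'right-9' : 'right-10'
  }
}
```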
For mobile, the pattern used by attachment and submit is that those actions are delegated to MobileIconsBar via the ChatPanelRef imperative handle. A startVoiceInput method should similarly be added to the ChatPanelRef interface and exposed via useImperativeHandle, and a Mic button added to MobileIconsBar. 3 4
3. The setInput / debouncedGetSuggestions Pipeline to Hook Into
The transcribed text must be fed through these two calls in sequence — exactly the same pattern used in the <Textarea>'s onChange handler: 5
debouncedGetSuggestions requires ≥ 2 words and a 500ms debounce window, then calls the getSuggestions server action with the value and current mapData, streaming back PartialRelated suggestions via onSuggestionsChange: 6
getSuggestions itself takes (query: string, mapData: MapData) and uses streamObject against the configured LLM: 7
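The pipeline above can be sketched as a minimal framework-free helper. `setInput` and `debouncedGetSuggestions` are the existing chat-panel.tsx functions; they are injected here so the sketch runs standalone:

```typescript
// Push a finished transcript through the same two calls the Textarea's
// onChange handler makes.
function applyTranscript(
  transcript: string,
  setInput: (value: string) => void,
  debouncedGetSuggestions: (value: string) => void
): void {
  const value = transcript.trim()
  if (!value) return // ignore empty recognition results
  setInput(value) // populate the input box
  debouncedGetSuggestions(value) // starts the >=2-word, 500ms-debounced suggestion stream
}
```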
4. Available Dependencies to Use
A. lucide-react — Already Installed, Has Mic / MicOff Icons
lucide-react ^0.507.0 is already a dependency. The existing icon imports in chat-panel.tsx demonstrate the pattern: 8
Simply add Mic and MicOff to that import line — no new package needed for the UI.
B. Web Speech API — Browser-native, Zero New Dependencies
The Web Speech API (window.SpeechRecognition / window.webkitSpeechRecognition) requires no npm packages. The new state refs fit exactly alongside the existing debounceTimeoutRef, inputRef, formRef, and fileInputRef refs already declared: 9
Add isRecording: boolean state alongside existing state vars: 10
The recognition.onresult callback feeds the transcript string directly into setInput(transcript) and debouncedGetSuggestions(transcript), then inputRef.current?.focus() (same pattern as the existing focus effect): 11
Limitations: No iOS Safari support, accuracy varies, requires HTTPS.
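A sketch of that onresult wiring, under a modeled `SpeechRecognitionLike` interface (the real object would come from `window.SpeechRecognition ?? window.webkitSpeechRecognition`; the event shape is simplified here):

```typescript
// Modeled subset of the browser SpeechRecognition API surface used below.
interface RecognitionResultEvent {
  results: { transcript: string }[][]
}

interface SpeechRecognitionLike {
  lang: string
  interimResults: boolean
  onresult: ((event: RecognitionResultEvent) => void) | null
  start(): void
  stop(): void
}

// Wire a recognition instance so final transcripts reach the input pipeline.
function wireRecognition(
  recognition: SpeechRecognitionLike,
  onTranscript: (transcript: string) => void
): void {
  recognition.lang = 'en-US'
  recognition.interimResults = false // only deliver final transcripts
  recognition.onresult = event => {
    // First alternative of the first result is the final transcript
    onTranscript(event.results[0][0].transcript)
  }
}
```

In the component, `onTranscript` would call setInput + debouncedGetSuggestions and then `inputRef.current?.focus()`.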
C. OpenAI Whisper — Highest Accuracy, OPENAI_API_KEY Already Configured
For a server-side transcription path using OpenAI Whisper, the OPENAI_API_KEY is already referenced by getModel(): 12 13
The MediaRecorder → Blob → FormData → API route pattern is already established in header-search-button.tsx (see its compressImage / FormData flow): 14 15
A new route app/api/transcribe/route.ts (parallel to the existing routes in app/api/) would receive the audio Blob, forward it to OpenAI's POST /v1/audio/transcriptions (whisper-1), and return the transcript text. The existing API route convention is: 16
What to add: The openai npm package (bun add openai) — note that @ai-sdk/openai ^1.3.24 is the Vercel AI SDK adapter and does not expose the Whisper transcription endpoint: 17
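A minimal sketch of the request the new route would forward, assuming the hypothetical filename `recording.webm` and the standard multipart fields the transcription endpoint expects (`file`, `model`):

```typescript
// Build the fetch arguments for OpenAI's POST /v1/audio/transcriptions.
// The 'recording.webm' filename is an assumption; match whatever MIME type
// the MediaRecorder actually produced. Node 18+ has Blob/FormData globals.
function buildTranscriptionRequest(audio: Blob, apiKey: string) {
  const body = new FormData()
  body.append('file', audio, 'recording.webm') // the recorded audio Blob
  body.append('model', 'whisper-1')
  return {
    url: 'https://api.openai.com/v1/audio/transcriptions',
    init: {
      method: 'POST' as const,
      headers: { Authorization: `Bearer ${apiKey}` },
      body
    }
  }
}
```

The app/api/transcribe/route.ts handler would read the Blob from the incoming request's FormData, call `fetch(url, init)` with these arguments (apiKey from OPENAI_API_KEY), and return the `text` field of the JSON response.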
5. Mobile Imperative Handle Wiring
On mobile, the ChatPanelRef exposes methods that chat.tsx calls imperatively. A startVoiceInput method should be added to ChatPanelRef and useImperativeHandle, then MobileIconsBar extended with an onVoiceClick prop alongside the existing onAttachmentClick / onSubmitClick: 18 19 20
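The wiring can be sketched as follows; `ChatPanelRefSketch`, `startVoiceInput`, `onVoiceClick`, and `makeVoiceClickHandler` are illustrative names mirroring the existing handleAttachmentClick / onAttachmentClick pair, not existing code:

```typescript
// Extended imperative surface exposed by ChatPanel via useImperativeHandle.
export interface ChatPanelRefSketch {
  handleAttachmentClick: () => void
  submitForm: () => void
  startVoiceInput: () => void // new
}

// MobileIconsBar gains a mic button driven by this new prop.
export interface MobileIconsBarPropsSketch {
  onAttachmentClick: () => void
  onSubmitClick: () => void
  onVoiceClick: () => void // new; chat.tsx passes the handler below
}

// chat.tsx side: forward the mobile bar's mic tap to the panel's handle,
// tolerating a null ref exactly like handleAttachment / handleMobileSubmit.
export function makeVoiceClickHandler(ref: { current: ChatPanelRefSketch | null }) {
  return () => ref.current?.startVoiceInput()
}
```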
Architecture Summary
flowchart TD
A["Mic Button (chat-panel.tsx, right of Textarea)"]
B["Web Speech API\n(window.SpeechRecognition)\nZero deps, browser-only"]
C["MediaRecorder Blob\n(browser API)"]
D["app/api/transcribe/route.ts\n(new — needs openai npm pkg)"]
E["OpenAI Whisper\nwhisper-1 model"]
F["transcript string"]
G["setInput(transcript)\n(chat-panel.tsx line 249)"]
H["debouncedGetSuggestions(transcript)\n(chat-panel.tsx line 250)"]
I["getSuggestions(value, mapData)\n(lib/actions/suggest.ts)"]
J["onSuggestionsChange → SuggestionsDropdown"]
A --> B
A --> C
B --> F
C --> D --> E --> F
F --> G
F --> H
H --> I --> J
Notes
- @ai-sdk/openai (^1.3.24) is the Vercel AI SDK provider adapter; it does not wrap the Whisper audio transcription endpoint. You would need to bun add openai to get the full OpenAI Node.js SDK, or use a raw fetch call to https://api.openai.com/v1/audio/transcriptions with the existing OPENAI_API_KEY.
- The Web Speech API approach is the fastest to integrate, with zero new dependencies: add Mic / MicOff from the already-installed lucide-react, add isRecording state, and a useCallback handler that calls setInput + debouncedGetSuggestions in its onresult callback.
- The breakpoint difference between ChatPanel (<= 1024px) and Chat (< 768px) for isMobile means the microphone button's desktop/mobile rendering logic should follow ChatPanel's own isMobile state, not the parent's. 21 22
Citations
File: components/chat-panel.tsx (L9-9)
import { ArrowRight, Plus, Paperclip, X, Sprout } from 'lucide-react'
File: components/chat-panel.tsx (L25-54)
export interface ChatPanelRef {
handleAttachmentClick: () => void
submitForm: () => void
}
export const ChatPanel = forwardRef<ChatPanelRef, ChatPanelProps>(({ messages, input, setInput, onSuggestionsChange }, ref) => {
const [, setMessages] = useUIState<typeof AI>()
const { submit, clearChat } = useActions()
const { mapProvider } = useSettingsStore()
const [isMobile, setIsMobile] = useState(false)
const [selectedFile, setSelectedFile] = useState<File | null>(null)
const [suggestions, setSuggestionsState] = useState<PartialRelated | null>(null)
const setSuggestions = useCallback((s: PartialRelated | null) => {
setSuggestionsState(s)
onSuggestionsChange?.(s)
}, [onSuggestionsChange, setSuggestionsState])
const { mapData } = useMapData()
const debounceTimeoutRef = useRef<NodeJS.Timeout | null>(null)
const inputRef = useRef<HTMLTextAreaElement>(null)
const formRef = useRef<HTMLFormElement>(null)
const fileInputRef = useRef<HTMLInputElement>(null)
useImperativeHandle(ref, () => ({
handleAttachmentClick() {
fileInputRef.current?.click()
},
submitForm() {
formRef.current?.requestSubmit()
}
}));
File: components/chat-panel.tsx (L57-64)
useEffect(() => {
const checkMobile = () => {
setIsMobile(window.innerWidth <= 1024)
}
checkMobile()
window.addEventListener('resize', checkMobile)
return () => window.removeEventListener('resize', checkMobile)
}, [])
File: components/chat-panel.tsx (L134-158)
const debouncedGetSuggestions = useCallback(
(value: string) => {
if (debounceTimeoutRef.current) {
clearTimeout(debounceTimeoutRef.current)
}
const wordCount = value.trim().split(/\s+/).filter(Boolean).length
if (wordCount < 2) {
setSuggestions(null)
return
}
debounceTimeoutRef.current = setTimeout(async () => {
const suggestionsStream = await getSuggestions(value, mapData)
for await (const partialSuggestions of readStreamableValue(
suggestionsStream
)) {
if (partialSuggestions) {
setSuggestions(partialSuggestions as PartialRelated)
}
}
}, 500) // 500ms debounce delay
},
[mapData, setSuggestions]
)
File: components/chat-panel.tsx (L160-162)
useEffect(() => {
inputRef.current?.focus()
}, [])
File: components/chat-panel.tsx (L232-247)
<Textarea
ref={inputRef}
name="input"
rows={1}
maxRows={isMobile ? 3 : 5}
tabIndex={0}
placeholder="Explore"
spellCheck={false}
value={input}
data-testid="chat-input"
className={cn(
'resize-none w-full min-h-12 rounded-fill border border-input pl-14 pr-12 pt-3 pb-1 text-sm ring-offset-background file:border-0 file:bg-transparent file:text-sm file:font-medium placeholder:text-muted-foreground focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 disabled:cursor-not-allowed disabled:opacity-50',
isMobile
? 'mobile-chat-input input bg-background'
: 'bg-muted'
)}
File: components/chat-panel.tsx (L248-251)
onChange={e => {
setInput(e.target.value)
debouncedGetSuggestions(e.target.value)
}}
File: components/chat-panel.tsx (L276-289)
<Button
type="submit"
size={'icon'}
variant={'ghost'}
className={cn(
'absolute top-1/2 transform -translate-y-1/2',
isMobile ? 'right-1' : 'right-2'
)}
disabled={input.length === 0 && !selectedFile}
aria-label="Send message"
data-testid="chat-submit"
>
<ArrowRight size={isMobile ? 18 : 20} />
</Button>
File: components/mobile-icons-bar.tsx (L23-65)
interface MobileIconsBarProps {
onAttachmentClick: () => void;
onSubmitClick: () => void;
}
export const MobileIconsBar: React.FC<MobileIconsBarProps> = ({ onAttachmentClick, onSubmitClick }) => {
const [, setMessages] = useUIState<typeof AI>()
const { clearChat } = useActions()
const { toggleCalendar } = useCalendarToggle()
const handleNewChat = async () => {
setMessages([])
await clearChat()
}
return (
<div className="mobile-icons-bar-content">
<Button variant="ghost" size="icon" onClick={handleNewChat} data-testid="mobile-new-chat-button">
<Plus className="h-[1.2rem] w-[1.2rem]" />
</Button>
<ProfileToggle />
<MapToggle />
<Button variant="ghost" size="icon" onClick={toggleCalendar} title="Open Calendar" data-testid="mobile-calendar-button">
<CalendarDays className="h-[1.2rem] w-[1.2rem] transition-all rotate-0 scale-100" />
</Button>
<Button variant="ghost" size="icon" data-testid="mobile-search-button">
<Search className="h-[1.2rem] w-[1.2rem] transition-all rotate-0 scale-100" />
</Button>
<a href="https://buy.stripe.com/14A3cv7K72TR3go14Nasg02" target="_blank" rel="noopener noreferrer">
<Button variant="ghost" size="icon">
<TentTree className="h-[1.2rem] w-[1.2rem] transition-all rotate-0 scale-100" />
</Button>
</a>
<Button variant="ghost" size="icon" onClick={onAttachmentClick} data-testid="mobile-attachment-button">
<Paperclip className="h-[1.2rem] w-[1.2rem] transition-all rotate-0 scale-100" />
</Button>
<Button variant="ghost" size="icon" data-testid="mobile-submit-button" onClick={onSubmitClick}>
<ArrowRight className="h-[1.2rem] w-[1.2rem] transition-all rotate-0 scale-100" />
</Button>
<History location="header" />
<ModeToggle />
</div>
)
File: lib/actions/suggest.ts (L9-45)
export async function getSuggestions(
query: string,
mapData: MapData
) {
const objectStream = createStreamableValue<PartialRelated>()
const systemPrompt = `As a helpful assistant, your task is to generate a set of three query suggestions based on the user's partial input. The user is currently interacting with a map, and the following data represents the current map view: ${JSON.stringify(mapData)}. Use this location context to provide relevant suggestions.
For instance, if the user's partial query is "best coffee near" and the map context is centered on San Francisco, your output should follow this format:
"{
"items": [
{ "query": "best coffee near downtown San Francisco" },
{ "query": "top-rated independent coffee shops in SF" },
{ "query": "coffee shops with outdoor seating in San Francisco" }
]
}"
Generate three queries that anticipate the user's needs, offering logical next steps for their search. The suggestions should be concise and directly related to the partial query and map context.`
;(async () => {
const result = await streamObject({
model: (await getModel()) as LanguageModel,
system: systemPrompt,
messages: [{ role: 'user', content: query }],
schema: relatedSchema
})
for await (const obj of result.partialObjectStream) {
if (obj && typeof obj === 'object' && 'items' in obj) {
objectStream.update(obj as PartialRelated)
}
}
objectStream.done()
})()
return objectStream.value
File: lib/utils/index.ts (L24-30)
export async function getModel(requireVision: boolean = false) {
const selectedModel = await getSelectedModel();
const xaiApiKey = process.env.XAI_API_KEY;
const gemini3ProApiKey = process.env.GEMINI_3_PRO_API_KEY;
const awsAccessKeyId = process.env.AWS_ACCESS_KEY_ID;
const awsSecretAccessKey = process.env.AWS_SECRET_ACCESS_KEY;
File: lib/utils/index.ts (L121-124)
const openai = createOpenAI({
apiKey: openaiApiKey,
});
return openai('gpt-4o');
File: components/header-search-button.tsx (L60-74)
let mapboxBlob: Blob | null = null;
let googleBlob: Blob | null = null;
if (mapProvider === 'mapbox' && map) {
// Capture Mapbox
const canvas = map.getCanvas()
const rawMapboxBlob = await new Promise<Blob | null>(resolve => {
canvas.toBlob(resolve, 'image/png')
})
if (rawMapboxBlob) {
mapboxBlob = await compressImage(rawMapboxBlob).catch(e => {
console.error('Failed to compress Mapbox image:', e);
return rawMapboxBlob;
});
}
File: components/header-search-button.tsx (L123-141)
const formData = new FormData()
if (mapboxBlob) formData.append('file_mapbox', mapboxBlob, 'mapbox_capture.png')
if (googleBlob) formData.append('file_google', googleBlob, 'google_capture.png')
// Keep 'file' for backward compatibility if needed, or just use the first available
formData.append('file', (mapboxBlob || googleBlob)!, 'map_capture.png')
formData.append('action', 'resolution_search')
formData.append('timezone', mapData.currentTimezone || 'UTC')
formData.append('drawnFeatures', JSON.stringify(mapData.drawnFeatures || []))
const center = mapProvider === 'mapbox' && map ? map.getCenter() : mapData.cameraState?.center;
if (center) {
formData.append('latitude', center.lat.toString())
formData.append('longitude', center.lng.toString())
}
const responseMessage = await actions.submit(formData)
setMessages((currentMessages: any[]) => [...currentMessages, responseMessage as any])
File: app/api/chat/route.ts (L1-15)
import { NextResponse, NextRequest } from 'next/server';
import { saveChat, createMessage, NewChat, NewMessage } from '@/lib/actions/chat-db';
import { getCurrentUserIdOnServer } from '@/lib/auth/get-current-user';
// import { generateUUID } from '@/lib/utils'; // Assuming generateUUID is in lib/utils as per PR context - not needed for PKs
// This is a simplified POST handler. PR #533's version might be more complex,
// potentially handling streaming AI responses and then saving.
// For now, this focuses on the database interaction part.
export async function POST(request: NextRequest) {
try {
const userId = await getCurrentUserIdOnServer();
if (!userId) {
return NextResponse.json({ error: 'Unauthorized' }, { status: 401 });
}
File: package.json (L19-22)
"@ai-sdk/amazon-bedrock": "^1.1.6",
"@ai-sdk/anthropic": "^1.2.12",
"@ai-sdk/google": "^1.2.22",
"@ai-sdk/openai": "^1.3.24",
File: components/chat.tsx (L42-50)
const chatPanelRef = useRef<ChatPanelRef>(null);
const handleAttachment = () => {
chatPanelRef.current?.handleAttachmentClick();
};
const handleMobileSubmit = () => {
chatPanelRef.current?.submitForm();
};
File: components/chat.tsx (L56-70)
useEffect(() => {
// Check if device is mobile
const checkMobile = () => {
setIsMobile(window.innerWidth < 768)
}
// Initial check
checkMobile()
// Add event listener for window resize
window.addEventListener('resize', checkMobile)
// Cleanup
return () => window.removeEventListener('resize', checkMobile)
}, [])
File: components/chat.tsx (L134-145)
<div className="mobile-icons-bar">
<MobileIconsBar onAttachmentClick={handleAttachment} onSubmitClick={handleMobileSubmit} />
</div>
<div className="mobile-chat-input-area">
<ChatPanel
ref={chatPanelRef}
messages={messages}
input={input}
setInput={setInput}
onSuggestionsChange={setSuggestions}
/>
</div>