Is your proposal related to a problem?
Cal Video currently has no live captioning support. For Deaf and hard-of-hearing users, this makes Cal Video inaccessible for professional meetings. One of our contributors is Deaf and encounters this barrier directly when using Cal Video for scheduled calls. Daily.co, which already powers Cal Video, natively supports real-time transcription via startTranscription() with less than 300ms latency — meaning this capability already exists in our infrastructure and requires no new video provider or external service dependency.
Describe the solution you'd like
Add an opt-in live captions toggle to Cal Video that activates Daily.co's built-in transcription service. The feature has four components totalling roughly 200 lines of production code:
1. Prisma schema (~5 lines)
Add liveCaptionsEnabled Boolean @default(false) to the User model so the preference persists across sessions.
2. tRPC endpoints (~80 lines)
viewer.getLiveCaptionsEnabled — reads the user's preference on room load
viewer.setLiveCaptionsEnabled — updates the preference when the toggle is clicked
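The read and write paths of these two endpoints could look roughly like the sketch below. It is deliberately dependency-free: `UserStore` and `createMemoryStore` are hypothetical stand-ins for the Prisma client, and `parseSetInput` does by hand what the real handler would do with a Zod schema. Only the endpoint names and the `liveCaptionsEnabled` field come from this proposal.

```typescript
// Hypothetical stand-in for the User table; the real handlers would go through Prisma.
type UserRow = { id: number; liveCaptionsEnabled: boolean };

interface UserStore {
  find(id: number): UserRow | undefined;
  update(id: number, patch: Partial<Omit<UserRow, "id">>): UserRow;
}

// In-memory store used only to make this sketch runnable.
function createMemoryStore(rows: UserRow[]): UserStore {
  const byId = new Map<number, UserRow>();
  for (const row of rows) byId.set(row.id, { ...row });
  return {
    find: (id) => byId.get(id),
    update: (id, patch) => {
      const current = byId.get(id);
      if (!current) throw new Error(`User ${id} not found`);
      const next = { ...current, ...patch };
      byId.set(id, next);
      return next;
    },
  };
}

// Hand-written validation; the proposal would express this as a Zod schema instead.
function parseSetInput(input: unknown): { enabled: boolean } {
  if (typeof input !== "object" || input === null) {
    throw new Error("Expected an object");
  }
  const enabled = (input as { enabled?: unknown }).enabled;
  if (typeof enabled !== "boolean") {
    throw new Error("`enabled` must be a boolean");
  }
  return { enabled };
}

// Read path: unknown users fall back to the schema default (false).
function getLiveCaptionsEnabled(store: UserStore, userId: number): boolean {
  return store.find(userId)?.liveCaptionsEnabled ?? false;
}

// Write path: validate, persist, and return the stored value.
function setLiveCaptionsEnabled(store: UserStore, userId: number, input: unknown): boolean {
  const { enabled } = parseSetInput(input);
  return store.update(userId, { liveCaptionsEnabled: enabled }).liveCaptionsEnabled;
}
```

Keeping validation separate from persistence means the toggle can never write a non-boolean value, which is the property the Zod schema is meant to guarantee.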
3. Daily.co domain configuration (~60 lines)
A single admin API route at apps/web/pages/api/integrations/dailyvideo/enable-transcription.ts that calls Daily's REST API to enable transcription on the domain. Called once at deployment by an admin. Uses the existing DAILY_API_KEY already in .env. An optional DEEPGRAM_API_KEY environment variable allows self-hosters to use their own Deepgram account, but Daily handles Deepgram internally if this is left blank.
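As a rough illustration of what the admin route would send, the sketch below only builds the request and performs no network call. Everything Daily-specific in it is an assumption to check against Daily's domain configuration reference before implementation: the endpoint URL, the `enable_transcription` property name, and the `deepgram:<key>` value format. The function name `buildEnableTranscriptionRequest` is hypothetical.

```typescript
interface DomainConfigRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) the domain configuration request.
// ASSUMPTION: URL, property name, and value format are unverified placeholders
// for whatever Daily's REST API reference specifies.
function buildEnableTranscriptionRequest(
  dailyApiKey: string,
  deepgramApiKey?: string
): DomainConfigRequest {
  if (!dailyApiKey) {
    throw new Error("DAILY_API_KEY is required");
  }

  const properties: Record<string, string> = {};
  if (deepgramApiKey) {
    // Self-hosters who bring their own Deepgram account.
    properties.enable_transcription = `deepgram:${deepgramApiKey}`;
  }
  // With no Deepgram key the property is omitted. The proposal states that Daily
  // handles Deepgram internally in that case; what must be sent, if anything,
  // is left open in this sketch.

  return {
    url: "https://api.daily.co/v1/",
    method: "POST",
    headers: {
      Authorization: `Bearer ${dailyApiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ properties }),
  };
}
```

Separating request construction from the actual `fetch` call keeps the key handling unit-testable without touching a live Daily domain.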
4. Cal Video UI (~55 lines)
A CC toggle button in the Cal Video control bar. Each speaker's captions appear in a distinct accessible color so Deaf users can follow multi-speaker conversations visually. Interim captions appear immediately as words are spoken with no buffering delay, then finalize when Deepgram marks them complete. The preference auto-starts transcription on rejoin so the user does not have to re-enable it every meeting.
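The per-speaker colors and the interim-then-final behavior described above can be modeled as a small pure reducer, sketched here. The `CaptionEvent` shape is hypothetical: the UI would map whatever fields Daily's transcription events carry onto it. The palette values and `MAX_LINES` are placeholders, and real colors would need a contrast check against the caption background.

```typescript
// Hypothetical normalized event; not Daily's actual event shape.
interface CaptionEvent {
  speakerId: string;
  speakerName: string;
  text: string;
  isFinal: boolean;
}

interface CaptionLine extends CaptionEvent {
  color: string;
}

interface CaptionState {
  lines: CaptionLine[];
  colors: Record<string, string>; // speakerId -> assigned color
}

// Placeholder palette; cycles if there are more speakers than colors.
const PALETTE = ["#FFD166", "#06D6A0", "#4CC9F0", "#F78C6B"];
const MAX_LINES = 3;

function applyCaption(state: CaptionState, event: CaptionEvent): CaptionState {
  // Assign a color the first time a speaker is seen, then keep it stable.
  const colors = { ...state.colors };
  if (!(event.speakerId in colors)) {
    colors[event.speakerId] = PALETTE[Object.keys(colors).length % PALETTE.length];
  }

  const line: CaptionLine = { ...event, color: colors[event.speakerId] };
  const lines = [...state.lines];

  // An interim line from the same speaker is overwritten in place, so text
  // grows as words arrive; once a line is final, the next event starts a new line.
  const interimIndex = lines.findIndex(
    (l) => l.speakerId === event.speakerId && !l.isFinal
  );
  if (interimIndex >= 0) {
    lines[interimIndex] = line;
  } else {
    lines.push(line);
  }

  // Keep only the most recent lines on screen.
  return { lines: lines.slice(-MAX_LINES), colors };
}
```

Because the reducer returns a new state object instead of mutating, it drops into a React `useReducer` or `useState` updater without extra work.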
Describe alternatives you've considered
Third-party caption service (e.g. Otter.ai, Rev): Rejected. Introduces an external paid dependency and requires users to leave Cal Video or install a separate tool.
Browser-native Web Speech API: Rejected. Only captions the local user's own microphone, cannot caption remote participants, and has inconsistent cross-browser support.
Daily Prebuilt enable_live_captions_ui flag: Not applicable. Cal Video uses Daily's call object, not Daily Prebuilt, so this flag has no effect.
The Daily.co native transcription approach is the only solution that works within the existing architecture, requires no new dependencies, and captions all participants simultaneously.
How the user experiences it
- User opens Cal Video settings and enables live captions
- In any Cal Video meeting, a CC button appears in the control bar
- Clicking CC starts real-time captions at the bottom of the screen
- Each speaker's name appears in a unique color alongside their words
- Captions begin appearing within roughly 300ms of speech, with no additional buffering
- On the next meeting, captions auto-start without any action from the user
Technical implementation notes
- No new npm packages required
- Uses DAILY_API_KEY already present in .env
- Follows Cal.com's existing tRPC handler and Zod validation patterns
- Avoids /packages/prisma/** beyond the single field addition
- Avoids /apps/web/lib/daily-webhook/** entirely
- All code passes yarn tsc --noEmit and yarn lint with no errors
- Daily.co connection verified against live domain prior to submission
Accessibility context
This feature directly addresses WCAG 2.1 Success Criterion 1.2.4 (Captions Live), which requires that live audio content in synchronized media has captions. Cal Video as a video conferencing product falls within scope of this criterion. This is also relevant to the European Accessibility Act compliance requirements raised in issue #21507.
PR plan
We plan to submit three sequential PRs once this issue is approved:
- PR 1: Prisma schema field only (~20 lines)
- PR 2: tRPC endpoints and Zod schemas (~80 lines)
- PR 3: Daily.co wiring, admin route, and Cal Video UI (~100 lines)
Each PR will reference this issue with fixes #XXXX.