feat(cal-video): add real-time color-coded live captions for Deaf and hard-of-hearing users via Daily.co native transcription #28784

@G3RRYGL3Z

Is your proposal related to a problem?

Cal Video currently has no live captioning support, which makes it inaccessible to Deaf and hard-of-hearing users in professional meetings. One of our contributors is Deaf and encounters this barrier directly when using Cal Video for scheduled calls. Daily.co, which already powers Cal Video, natively supports real-time transcription via startTranscription() with less than 300ms latency. The capability therefore already exists in our infrastructure and requires no new video provider or external service dependency.

Describe the solution you'd like

Add an opt-in live captions toggle to Cal Video that activates Daily.co's built-in transcription service. The feature has four components totalling roughly 200 lines of production code:

1. Prisma schema (~5 lines)
Add liveCaptionsEnabled Boolean @default(false) to the User model so the preference persists across sessions.

2. tRPC endpoints (~80 lines)

  • viewer.getLiveCaptionsEnabled — reads the user's preference on room load
  • viewer.setLiveCaptionsEnabled — updates the preference when the toggle is clicked
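
As a rough illustration of the two endpoints above, here is a minimal sketch of the handler logic only, without the tRPC router wiring. The UserPreferenceStore interface, the InMemoryPreferenceStore class, and the { enabled } input shape are hypothetical names chosen for this example; in the real PR the store would be a Prisma call on the User model and the hand-written check would be a Zod schema.

```typescript
// Hypothetical storage interface; the real handlers would query the User model via Prisma.
interface UserPreferenceStore {
  getLiveCaptionsEnabled(userId: number): Promise<boolean>;
  setLiveCaptionsEnabled(userId: number, enabled: boolean): Promise<void>;
}

// Assumed input shape for viewer.setLiveCaptionsEnabled.
interface SetLiveCaptionsInput {
  enabled: boolean;
}

// Hand-written runtime check standing in for a Zod schema.
function parseSetLiveCaptionsInput(raw: unknown): SetLiveCaptionsInput {
  if (typeof raw !== "object" || raw === null) {
    throw new TypeError("input must be an object");
  }
  const { enabled } = raw as { enabled?: unknown };
  if (typeof enabled !== "boolean") {
    throw new TypeError("enabled must be a boolean");
  }
  return { enabled };
}

// Reads the stored preference for the given user.
async function getLiveCaptionsEnabledHandler(
  store: UserPreferenceStore,
  userId: number
): Promise<{ enabled: boolean }> {
  return { enabled: await store.getLiveCaptionsEnabled(userId) };
}

// Validates the input, persists it, and echoes the stored value back.
async function setLiveCaptionsEnabledHandler(
  store: UserPreferenceStore,
  userId: number,
  rawInput: unknown
): Promise<{ enabled: boolean }> {
  const { enabled } = parseSetLiveCaptionsInput(rawInput);
  await store.setLiveCaptionsEnabled(userId, enabled);
  return { enabled };
}

// In-memory store for trying the handlers out; unknown users default to false,
// mirroring the @default(false) on the proposed schema field.
class InMemoryPreferenceStore implements UserPreferenceStore {
  private readonly prefs = new Map<number, boolean>();

  async getLiveCaptionsEnabled(userId: number): Promise<boolean> {
    return this.prefs.get(userId) ?? false;
  }

  async setLiveCaptionsEnabled(userId: number, enabled: boolean): Promise<void> {
    this.prefs.set(userId, enabled);
  }
}
```

Keeping the handlers independent of the router makes them easy to unit test with a fake store before they are wired into the viewer router.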

3. Daily.co domain configuration (~60 lines)
A single admin API route at apps/web/pages/api/integrations/dailyvideo/enable-transcription.ts calls Daily's REST API to enable transcription on the domain. An admin calls it once at deployment. It uses the DAILY_API_KEY already in .env. An optional DEEPGRAM_API_KEY environment variable allows self-hosters to use their own Deepgram account; if it is left blank, Daily handles Deepgram internally.

4. Cal Video UI (~55 lines)
A CC toggle button in the Cal Video control bar. Each speaker's captions appear in a distinct accessible color so Deaf users can follow multi-speaker conversations visually. Interim captions appear immediately as words are spoken with no buffering delay, then finalize when Deepgram marks them complete. The preference auto-starts transcription on rejoin so the user does not have to re-enable it every meeting.
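
The interim-then-final behaviour and the per-speaker colors described above could be modelled as a small pure state update, sketched below. Everything here is an assumption for illustration: the CaptionEvent shape is hypothetical (a real integration would map Daily's transcription events onto it), the palette values are placeholders that have not been contrast-checked, and the three-line window is arbitrary.

```typescript
// Hypothetical caption event; not Daily's actual event shape.
interface CaptionEvent {
  speakerId: string;
  speakerName: string;
  text: string;
  isFinal: boolean;
}

interface CaptionLine extends CaptionEvent {
  color: string;
}

interface CaptionState {
  lines: CaptionLine[];
  colorBySpeaker: Record<string, string>;
}

// Placeholder palette; real colors would need checking against the caption background.
const SPEAKER_COLORS = ["#FFD166", "#06D6A0", "#4CC9F0", "#F78C6B", "#E0AAFF"];
// Arbitrary number of caption lines kept on screen.
const MAX_LINES = 3;

const emptyCaptions: CaptionState = { lines: [], colorBySpeaker: {} };

function applyCaptionEvent(state: CaptionState, event: CaptionEvent): CaptionState {
  // Colors are assigned in order of first appearance and wrap around
  // if there are more speakers than palette entries.
  const known: string | undefined = state.colorBySpeaker[event.speakerId];
  const color =
    known ?? SPEAKER_COLORS[Object.keys(state.colorBySpeaker).length % SPEAKER_COLORS.length];
  const colorBySpeaker =
    known !== undefined
      ? state.colorBySpeaker
      : { ...state.colorBySpeaker, [event.speakerId]: color };

  const line: CaptionLine = { ...event, color };

  // An open interim line from the same speaker is overwritten in place,
  // so text updates as words arrive; otherwise a new line is appended.
  const openIndex = state.lines.findIndex(
    (l) => l.speakerId === event.speakerId && !l.isFinal
  );
  const lines =
    openIndex >= 0
      ? state.lines.map((l, i) => (i === openIndex ? line : l))
      : [...state.lines, line];

  return { lines: lines.slice(-MAX_LINES), colorBySpeaker };
}
```

Because the update is a pure function, it can sit behind a React reducer or any other state container, and a speaker keeps the same color for the whole call.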

Describe alternatives you've considered

Third-party caption service (e.g. Otter.ai, Rev): Rejected. Introduces an external paid dependency and requires users to leave Cal Video or install a separate tool.

Browser-native Web Speech API: Rejected. Only captions the local user's own microphone, cannot caption remote participants, and has inconsistent cross-browser support.

Daily Prebuilt enable_live_captions_ui flag: Not applicable. Cal Video uses Daily's call object, not Daily Prebuilt, so this flag has no effect.

Of the options considered, the Daily.co native transcription approach is the only one that works within the existing architecture, requires no new dependencies, and captions all participants simultaneously.

How the user experiences it

  1. User opens Cal Video settings and enables live captions
  2. In any Cal Video meeting, a CC button appears in the control bar
  3. Clicking CC starts real-time captions at the bottom of the screen
  4. Each speaker's name appears in a unique color alongside their words
  5. Captions begin appearing within about 300ms of speech
  6. On the next meeting, captions auto-start without any action from the user
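
The toggle and auto-start steps in the flow above might look roughly like the following sketch. It assumes only that the call object exposes startTranscription(), as the issue states; the TranscriptionStarter interface, the CaptionsSession shape, and the persist callback (which would wrap viewer.setLiveCaptionsEnabled) are hypothetical names for this example. Turning captions off here only hides them locally, because the issue does not say whether transcription should be stopped for the room.

```typescript
// Minimal slice of the call object used here; only startTranscription() is assumed.
interface TranscriptionStarter {
  startTranscription(): void;
}

// Hypothetical per-meeting UI state.
interface CaptionsSession {
  captionsVisible: boolean;
  transcriptionRequested: boolean;
}

// Shows captions, requesting transcription at most once per session.
function showCaptions(call: TranscriptionStarter, session: CaptionsSession): CaptionsSession {
  if (!session.transcriptionRequested) {
    call.startTranscription();
  }
  return { captionsVisible: true, transcriptionRequested: true };
}

// On joining a meeting: auto-start captions when the saved preference is on.
function onJoinedMeeting(call: TranscriptionStarter, savedPreference: boolean): CaptionsSession {
  const initial: CaptionsSession = { captionsVisible: false, transcriptionRequested: false };
  return savedPreference ? showCaptions(call, initial) : initial;
}

// On CC click: flip visibility and persist the new preference
// so the next meeting starts in the same state.
function onCcClick(
  call: TranscriptionStarter,
  session: CaptionsSession,
  persist: (enabled: boolean) => void
): CaptionsSession {
  const next = session.captionsVisible
    ? { ...session, captionsVisible: false }
    : showCaptions(call, session);
  persist(next.captionsVisible);
  return next;
}
```

Tracking transcriptionRequested separately from captionsVisible avoids calling startTranscription() again each time the user toggles the CC button within one meeting.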

Technical implementation notes

  • No new npm packages required
  • Uses DAILY_API_KEY already present in .env
  • Follows Cal.com's existing tRPC handler and Zod validation patterns
  • Avoids /packages/prisma/** beyond the single field addition
  • Avoids /apps/web/lib/daily-webhook/** entirely
  • All code passes yarn tsc --noEmit and yarn lint with no errors
  • Daily.co connection verified against live domain prior to submission

Accessibility context

This feature directly addresses WCAG 2.1 Success Criterion 1.2.4 (Captions (Live)), which requires captions for all live audio content in synchronized media. Cal Video, as a video conferencing product, falls within the scope of this criterion. The feature is also relevant to the European Accessibility Act compliance requirements raised in issue #21507.

PR plan

We plan to submit three sequential PRs once this issue is approved:

  • PR 1: Prisma schema field only (~20 lines)
  • PR 2: tRPC endpoints and Zod schemas (~80 lines)
  • PR 3: Daily.co wiring, admin route, and Cal Video UI (~100 lines)

Each PR will reference this issue with fixes #XXXX.

Labels: ✨ feature, 🚨 needs approval