fix: add video captions on the video detail page #3284

Merged
ahtesham-quraish merged 6 commits into main from ahtesham/video-transcript on May 6, 2026

Conversation

@ahtesham-quraish
Contributor

@ahtesham-quraish ahtesham-quraish commented May 4, 2026

What are the relevant tickets?

https://github.com/mitodl/hq/issues/11118

Description (What does it do?)

  • Transcripts will be uploaded to OVS, for example https://video.odl.mit.edu/videos/ba8f967b6e4540889e42c496482f7bc9/
  • After the ETL runs, the caption URLs should be available to the front-end via the resource.video.caption_urls array.
  • Additionally, it would be helpful to implement the captions in a way that exposes them to Google search indexing. This is probably out of scope, though.

Screenshots (if appropriate):

Screen.Recording.2026-05-04.at.7.36.12.PM.mov

How can this be tested?

If you don't have playlist data locally, add the following variable to the frontend env file: NEXT_PUBLIC_MITOL_API_BASE_URL="https://api.learn.mit.edu". This connects the frontend to the prod Learn backend.
Then enable the video-playlist-page flag and visit the following URL: http://open.odl.local:8062/video-playlist/detail/88444?playlist=88443

Additional Context

@ahtesham-quraish ahtesham-quraish force-pushed the ahtesham/video-transcript branch from 507606b to 61b2031 Compare May 4, 2026 14:26
@ahtesham-quraish ahtesham-quraish changed the title Ahtesham/video transcript fix: add video captions on the video detail page May 4, 2026
@github-actions

github-actions Bot commented May 4, 2026

OpenAPI Changes

No changes detected

View full changelog

Unexpected changes? Ensure your branch is up-to-date with main (consider rebasing).

@ahtesham-quraish ahtesham-quraish marked this pull request as ready for review May 5, 2026 06:52
Copilot AI review requested due to automatic review settings May 5, 2026 06:52
Contributor

Copilot AI left a comment

Pull request overview

This PR adds caption support to the video playback experience in the Video Playlist/Collection pages by wiring caption URLs from the API into the video.js player, and by exposing caption resources for indexing (plus adding VideoObject JSON-LD on the OCW series detail page).

Changes:

  • Pass video.video.caption_urls into VideoJsPlayer as remote text tracks (captions).
  • Add a visually-hidden list of caption (VTT) links to both video detail page variants so crawlers (and screen readers) can discover them.
  • Add VideoObject JSON-LD structured data on VideoSeriesDetailPage (including a captions accessibility signal).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File: Description
  • frontends/main/src/app-pages/VideoPlaylistCollectionPage/VideoSeriesDetailPage.tsx: Adds caption URL extraction, passes tracks to the player, adds caption link list, and emits VideoObject JSON-LD.
  • frontends/main/src/app-pages/VideoPlaylistCollectionPage/VideoJsPlayer.tsx: Extends the player wrapper to accept caption tracks and manage remote text tracks on init/update.
  • frontends/main/src/app-pages/VideoPlaylistCollectionPage/VideoDetailPage.tsx: Adds caption URL extraction, passes tracks to the player, and adds a caption link list.

Comment on lines +45 to +48
// Prevent the update effect from running on the very first mount —
// the init effect already handles the initial sources/tracks setup.
const isMountedRef = useRef(false)

Contributor Author

done

Comment on lines +124 to +132
  // Update sources / poster / tracks when props change without re-creating the player.
  // Skip on first mount — the init effect's ready callback already handled it.
  useEffect(() => {
    const player = playerRef.current
-   if (!player) return
+   if (!player || !isMountedRef.current) return
    player.src(sources)
    player.poster(poster ?? "")
- }, [sources, poster])
+   addTracks(player, tracks)
+ }, [sources, poster, tracks])
Contributor Author

done

Comment on lines +53 to +55
// TextTrackList is array-like at runtime but lacks an index signature in types
// eslint-disable-next-line @typescript-eslint/no-explicit-any
player.removeRemoteTextTrack((existing as any)[i])
Contributor Author

done
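
A minimal sketch of one way to avoid the per-access `any` cast when clearing remote text tracks, assuming the list really is array-like at runtime as the code comment says. The `TrackLike`, `TrackListLike`, and `PlayerLike` shapes are hypothetical stand-ins that only mirror the two player methods used in the snippet; they are not the video.js typings, and this is not necessarily what the PR ended up doing.

```typescript
// Hypothetical minimal shapes: only the members this sketch touches.
interface TrackLike {
  kind: string
  language: string
}

interface TrackListLike {
  readonly length: number
}

interface PlayerLike {
  remoteTextTracks(): TrackListLike
  removeRemoteTextTrack(track: TrackLike): void
}

// Narrow the list to an indexable view once, in one place,
// instead of casting to `any` at every element access.
const snapshotTracks = (list: TrackListLike): TrackLike[] =>
  Array.from(list as unknown as ArrayLike<TrackLike>)

// Copy first, then remove: the live list shrinks as tracks are removed,
// so iterating over a snapshot avoids skipping entries.
const clearRemoteTracks = (player: PlayerLike): number => {
  const tracks = snapshotTracks(player.remoteTextTracks())
  tracks.forEach((track) => player.removeRemoteTextTrack(track))
  return tracks.length
}
```

The cast still exists, but it is confined to a single helper, and the removal loop no longer depends on the index order of a list that mutates underneath it.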

Comment on lines +106 to +112
thumbnailUrl:
  video.video?.cover_image_url || video.image?.url || undefined,
contentUrl: video.url ?? undefined,
...(video.video?.duration ? { duration: video.video.duration } : {}),
...(captionUrls.length > 0
  ? { accessibilityFeature: ["captions"] }
  : {}),
Contributor Author

done

<p>Captions available for this video:</p>
<ul>
{captionUrls.map((track) => (
<li key={track.language}>
Contributor Author

done

<p>Captions available for this video:</p>
<ul>
{captionUrls.map((track) => (
<li key={track.language}>
Contributor Author

done

Comment on lines +122 to +130
{structuredData && (
  <script
    type="application/ld+json"
    // JSON.stringify does not escape </ by default; replace prevents
    // a malicious title/description from breaking out of the script tag.
    dangerouslySetInnerHTML={{
      __html: JSON.stringify(structuredData).replace(/<\//g, "<\\/"),
    }}
  />
Contributor Author

Done

@daniellefrappier18 daniellefrappier18 self-assigned this May 5, 2026
const structuredData =
!isLoading && video
? {
"@context": "https://schema.org",
Contributor

Looks like you are missing uploadDate. Google requires name, description, thumbnailUrl, and uploadDate for a VideoObject to qualify as a video rich result. Without uploadDate, the entire <script type="application/ld+json"> block is silently ignored for rich results.

Contributor Author

I have used the last_modified date as the upload date because the video resource object has no such field.
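
A rough sketch of the fallback described in this reply, together with a check for the four fields the review comment lists as required. The `VideoResource` shape is a hypothetical subset of the real resource type (the `title` field in particular is an assumption), and `buildVideoObject` and `missingRequiredFields` are illustrative names, not functions from the PR.

```typescript
// Hypothetical subset of the resource fields referenced in this thread.
interface VideoResource {
  title: string
  description?: string | null
  url?: string | null
  last_modified?: string | null
  image?: { url?: string | null } | null
  video?: { cover_image_url?: string | null; duration?: string | null } | null
}

type JsonLd = Record<string, unknown>

// Fields named in the review comment as required for a video rich result.
const REQUIRED_FOR_RICH_RESULTS = [
  "name",
  "description",
  "thumbnailUrl",
  "uploadDate",
] as const

const buildVideoObject = (video: VideoResource, hasCaptions: boolean): JsonLd => ({
  "@context": "https://schema.org",
  "@type": "VideoObject",
  name: video.title,
  description: video.description ?? undefined,
  thumbnailUrl: video.video?.cover_image_url || video.image?.url || undefined,
  // Stand-in for a true upload date: the resource has no dedicated field.
  uploadDate: video.last_modified ?? undefined,
  contentUrl: video.url ?? undefined,
  ...(video.video?.duration ? { duration: video.video.duration } : {}),
  ...(hasCaptions ? { accessibilityFeature: ["captions"] } : {}),
})

// Lists required keys that are absent or empty, e.g. for a dev-time warning.
const missingRequiredFields = (data: JsonLd): string[] =>
  REQUIRED_FOR_RICH_RESULTS.filter((key) => !data[key])
```

Keys whose value is `undefined` are dropped by `JSON.stringify`, so a missing `last_modified` would simply omit `uploadDate` from the emitted script rather than emit an invalid value.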

Contributor

@daniellefrappier18 daniellefrappier18 left a comment

Looks good but please see my comment regarding the missing uploadDate

@pdpinch
Member

pdpinch commented May 5, 2026

We would like to get the captions merged and deployed asap. The metadata is secondary and can be fixed up in a follow-up PR if necessary.

Contributor

@ChristopherChudzicki ChristopherChudzicki left a comment

I'm going to leave the testing / re-review to @daniellefrappier18 , but I added a few comments.

Overall suggestion: This PR adds some SEO work in addition to captions. It would be helpful to mention that in the title / PR description. (It could also have been a separate PR, which often helps things get merged faster.)

That said, thanks for teaching me about JSON-LD!

Comment on lines +106 to +112
thumbnailUrl:
  video.video?.cover_image_url || video.image?.url || undefined,
contentUrl: video.url ?? undefined,
...(video.video?.duration ? { duration: video.video.duration } : {}),
...(captionUrls.length > 0
  ? { accessibilityFeature: ["captions"] }
  : {}),
Contributor

@ahtesham-quraish my inclination is to trust the API... I expect the API duration strings are valid, just the OpenAPI spec doesn't reflect that. It might be worth checking. Not that validating here hurts, but we should in general trust the API / improve the spec rather than patch that sort of thing on the frontend.

Don't let this block you, captions are important to get out.
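
For reference, schema.org expects `duration` in ISO 8601 duration format. If a frontend check were kept despite the preference above for trusting the API, it could look roughly like this sketch. The regex is a simplified approximation of the ISO 8601 grammar, and `isIsoDuration` and `durationField` are hypothetical names, not code from the PR.

```typescript
// Simplified ISO 8601 duration pattern, e.g. "PT1H2M3S" or "P1DT12H".
// It rejects a bare "P" and a "T" with nothing after it.
const ISO_8601_DURATION =
  /^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$/

const isIsoDuration = (value: unknown): value is string =>
  typeof value === "string" && ISO_8601_DURATION.test(value)

// Returns an object suitable for spreading into the JSON-LD literal:
// { duration } when the value looks valid, otherwise nothing.
const durationField = (value: unknown): { duration?: string } =>
  isIsoDuration(value) ? { duration: value } : {}
```

Tightening the OpenAPI spec so the field is documented as an ISO 8601 duration would make a check like this unnecessary.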

<Styled.PageWrapper>
{structuredData && (
<script
type="application/ld+json"
Contributor

I was not familiar with JSON-LD objects. This seems like a good addition! 👍

Suggestion: It might be worth dropping a reference (here's the nextjs docs reference, https://nextjs.org/docs/app/guides/json-ld ... but maybe there's a more canonical one from google, since this certainly isn't nextjs-specific).

BTW: The nextjs docs use a slightly broader regex escape.
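
A sketch of the broader escape, on the understanding that the linked Next.js guide replaces every `<` with its `\u003c` escape rather than only the `</` sequence. `serializeJsonLd` is a hypothetical helper name, not part of the PR.

```typescript
// Escapes every "<" so no tag-like sequence ("</script", "<!--", ...)
// can appear in the inline script body. "\u003c" is a valid JSON string
// escape, so parsers still read it back as "<".
const serializeJsonLd = (data: Record<string, unknown>): string =>
  JSON.stringify(data).replace(/</g, "\\u003c")

// Usage in a component would then read:
//   dangerouslySetInnerHTML={{ __html: serializeJsonLd(structuredData) }}
```

Compared with replacing only `</`, this also covers sequences such as `<!--` that can affect how the HTML parser treats script content.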

Comment on lines +292 to +294
{/* Caption track links – visually hidden but present in the DOM so
Googlebot can follow each VTT URL and index the caption text,
associating the transcript content with this video page. */}
Contributor

Curious—Was there a source that encourages implementation of track data this way?

Request: I'm hesitant to add this—it could actually be detrimental to screenreader users. The purpose of sr-only visually hidden content is to assist screenreader users / assistive tech, not to make content available to Googlebot. Unless there's a good reference indicating this pattern, I think we should skip it for now.

Alternatives:

  1. Rely on the track elements added by VideoJS; SEO bots can see those already (though as far as I know, Google doesn't explicitly tell us they use them).
  2. Document the captions track in JSON-LD: https://schema.org/VideoObject
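
As a rough illustration of the second alternative: schema.org lets a VideoObject carry a `caption` property whose value can be a MediaObject, so each caption file could be described in the structured data instead of in hidden links. The `CaptionUrl` shape and the `toCaptionMediaObjects` helper are hypothetical, the `text/vtt` encoding format assumes the caption files are WebVTT, and whether search engines actually use this property is not established in this thread.

```typescript
// Hypothetical shape of one entry in resource.video.caption_urls.
interface CaptionUrl {
  url: string
  language: string
  language_name?: string | null
}

interface CaptionMediaObject {
  "@type": "MediaObject"
  contentUrl: string
  inLanguage: string
  encodingFormat: string
}

// Maps caption entries to schema.org MediaObject nodes for VideoObject.caption.
const toCaptionMediaObjects = (captions: CaptionUrl[]): CaptionMediaObject[] =>
  captions.map((track) => ({
    "@type": "MediaObject" as const,
    contentUrl: track.url,
    inLanguage: track.language,
    encodingFormat: "text/vtt", // assumption: captions are served as WebVTT
  }))

// Spreadable fragment: omits the key entirely when there are no captions.
const captionField = (captions: CaptionUrl[]): { caption?: CaptionMediaObject[] } =>
  captions.length > 0 ? { caption: toCaptionMediaObjects(captions) } : {}
```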

src: track.url,
srclang: track.language,
label: track.language_name || track.language,
default: index === 0,
Contributor

This turns captions on by default; I dunno if that's our intent.
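
If captions should be off unless explicitly requested, one option is to make the default an opt-in parameter rather than tying it to the first index. This is only a sketch: `CaptionUrl`, `TextTrackOptions`, and `toTextTracks` are hypothetical names, the `kind: "captions"` value is an assumption, and the field mapping simply follows the snippet above.

```typescript
// Hypothetical shape of one entry in resource.video.caption_urls.
interface CaptionUrl {
  url: string
  language: string
  language_name?: string | null
}

interface TextTrackOptions {
  kind: "captions"
  src: string
  srclang: string
  label: string
  default: boolean
}

// No track is marked default unless the caller names a language,
// so captions stay off until the viewer (or the page) opts in.
const toTextTracks = (
  captions: CaptionUrl[],
  defaultLanguage?: string,
): TextTrackOptions[] =>
  captions.map((track) => ({
    kind: "captions" as const,
    src: track.url,
    srclang: track.language,
    label: track.language_name || track.language,
    default: defaultLanguage !== undefined && track.language === defaultLanguage,
  }))
```

Calling `toTextTracks(captionUrls)` keeps every track available in the player menu without enabling any of them; `toTextTracks(captionUrls, "en")` would enable only the English track.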

@ahtesham-quraish ahtesham-quraish merged commit 6063a3d into main May 6, 2026
13 checks passed
@ahtesham-quraish ahtesham-quraish deleted the ahtesham/video-transcript branch May 6, 2026 12:33
@odlbot odlbot mentioned this pull request May 6, 2026
1 task
5 participants