fix: add video captions on the video detail page by ahtesham-quraish · Pull Request #3284 · mitodl/mit-learn

ahtesham-quraish · 2026-05-04T14:25:49Z

What are the relevant tickets?

https://github.com/mitodl/hq/issues/11118

Description (What does it do?)

Transcripts will be uploaded to OVS, for example https://video.odl.mit.edu/videos/ba8f967b6e4540889e42c496482f7bc9/
After the ETL runs, the caption URLs should be available to the front end via the resource.video.caption_urls array.
Additionally, it would be helpful to implement the captions in a way that exposes them to Google search indexing. This is probably out of scope, though.

Screenshots (if appropriate):

Screen.Recording.2026-05-04.at.7.36.12.PM.mov

How can this be tested?

If you don't have playlist data locally, add the following variable to the frontend env file so that it connects to the prod Learn backend: NEXT_PUBLIC_MITOL_API_BASE_URL="https://api.learn.mit.edu"
Then enable the video-playlist-page feature flag and visit http://open.odl.local:8062/video-playlist/detail/88444?playlist=88443

Additional Context

github-actions · 2026-05-04T14:34:20Z

OpenAPI Changes

No changes detected

View full changelog

Unexpected changes? Ensure your branch is up-to-date with main (consider rebasing).

Copilot

Pull request overview

This PR adds caption support to the video playback experience in the Video Playlist/Collection pages by wiring caption URLs from the API into the video.js player, and by exposing caption resources for indexing (plus adding VideoObject JSON-LD on the OCW series detail page).

Changes:

Pass video.video.caption_urls into VideoJsPlayer as remote text tracks (captions).
Add a visually-hidden list of caption (VTT) links to both video detail page variants so crawlers (and screen readers) can discover them.
Add VideoObject JSON-LD structured data on VideoSeriesDetailPage (including a captions accessibility signal).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File	Description
frontends/main/src/app-pages/VideoPlaylistCollectionPage/VideoSeriesDetailPage.tsx	Adds caption URL extraction, passes tracks to the player, adds caption link list, and emits VideoObject JSON-LD.
frontends/main/src/app-pages/VideoPlaylistCollectionPage/VideoJsPlayer.tsx	Extends the player wrapper to accept caption tracks and manage remote text tracks on init/update.
frontends/main/src/app-pages/VideoPlaylistCollectionPage/VideoDetailPage.tsx	Adds caption URL extraction, passes tracks to the player, and adds a caption link list.

ahtesham-quraish · 2026-05-05T09:25:07Z

+  // Prevent the update effect from running on the very first mount —
+  // the init effect already handles the initial sources/tracks setup.
+  const isMountedRef = useRef(false)
+


ahtesham-quraish · 2026-05-05T09:26:18Z

+  // Update sources / poster / tracks when props change without re-creating the player.
+  // Skip on first mount — the init effect's ready callback already handled it.
  useEffect(() => {
    const player = playerRef.current
-    if (!player) return
+    if (!player || !isMountedRef.current) return
    player.src(sources)
    player.poster(poster ?? "")
-  }, [sources, poster])
+    addTracks(player, tracks)
+  }, [sources, poster, tracks])


ahtesham-quraish · 2026-05-05T09:29:06Z

+      // TextTrackList is array-like at runtime but lacks an index signature in types
+      // eslint-disable-next-line @typescript-eslint/no-explicit-any
+      player.removeRemoteTextTrack((existing as any)[i])
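For readers following the track-management hunks above, here is a minimal sketch of what a helper like addTracks might do: clear the existing remote text tracks, then add the new ones. The interfaces PlayerLike, TextTrackListLike, and CaptionTrack and the function name replaceTracks are hypothetical stand-ins written for this example, not the video.js typings or the PR's actual code; only remoteTextTracks, removeRemoteTextTrack, and addRemoteTextTrack are taken from the video.js player API.

```typescript
// Hypothetical structural types covering only the player methods used here.
interface CaptionTrack {
  kind: "captions"
  src: string
  srclang: string
  label: string
}

interface TextTrackListLike {
  length: number
  [index: number]: unknown
}

interface PlayerLike {
  remoteTextTracks(): TextTrackListLike
  removeRemoteTextTrack(track: unknown): void
  addRemoteTextTrack(options: CaptionTrack, manualCleanup: boolean): unknown
}

// Replace every remote text track on the player with the given tracks.
function replaceTracks(player: PlayerLike, tracks: CaptionTrack[]): void {
  const existing = player.remoteTextTracks()
  // Walk backwards because the list shrinks as tracks are removed.
  for (let i = existing.length - 1; i >= 0; i--) {
    player.removeRemoteTextTrack(existing[i])
  }
  for (const track of tracks) {
    // Passing false asks video.js to clean the track up when the source changes.
    player.addRemoteTextTrack(track, false)
  }
}
```

Declaring the list type with a numeric index signature is one way to avoid the `any` cast mentioned in the hunk, assuming the real typings can be narrowed the same way.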


ahtesham-quraish · 2026-05-05T09:32:54Z

+          thumbnailUrl:
+            video.video?.cover_image_url || video.image?.url || undefined,
+          contentUrl: video.url ?? undefined,
+          ...(video.video?.duration ? { duration: video.video.duration } : {}),
+          ...(captionUrls.length > 0
+            ? { accessibilityFeature: ["captions"] }
+            : {}),



ahtesham-quraish · 2026-05-05T09:33:34Z

+              <p>Captions available for this video:</p>
+              <ul>
+                {captionUrls.map((track) => (
+                  <li key={track.language}>


ahtesham-quraish · 2026-05-05T09:33:59Z

+              <p>Captions available for this video:</p>
+              <ul>
+                {captionUrls.map((track) => (
+                  <li key={track.language}>


ahtesham-quraish · 2026-05-05T09:59:45Z

+      {structuredData && (
+        <script
+          type="application/ld+json"
+          // JSON.stringify does not escape </ by default; replace prevents
+          // a malicious title/description from breaking out of the script tag.
+          dangerouslySetInnerHTML={{
+            __html: JSON.stringify(structuredData).replace(/<\//g, "<\\/"),
+          }}
+        />


daniellefrappier18 · 2026-05-05T14:15:36Z

+  const structuredData =
+    !isLoading && video
+      ? {
+          "@context": "https://schema.org",


Looks like you are missing uploadDate. Google requires name, description, thumbnailUrl, and uploadDate for a VideoObject to qualify as a video rich result. Without uploadDate, the entire <script type="application/ld+json"> block is silently ignored for rich results.

I have used the last_modified date as uploadDate because the video resource object has no dedicated upload-date field.
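A rough sketch of how the VideoObject could be assembled with uploadDate taken from last_modified, as described in the reply above. The VideoResourceLike interface and the buildVideoObject name are hypothetical; the field names cover_image_url, duration, url, and last_modified come from this thread, while title and description are assumed names for the fields backing name and description.

```typescript
// Hypothetical input shape; not the generated API types.
interface VideoResourceLike {
  title: string // assumed field name
  description?: string | null // assumed field name
  url?: string | null
  last_modified?: string | null
  image?: { url?: string | null } | null
  video?: { cover_image_url?: string | null; duration?: string | null } | null
}

function buildVideoObject(
  video: VideoResourceLike,
  hasCaptions: boolean,
): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    name: video.title,
    description: video.description ?? undefined,
    thumbnailUrl: video.video?.cover_image_url || video.image?.url || undefined,
    // No dedicated upload date exists on the resource, so last_modified stands in.
    uploadDate: video.last_modified ?? undefined,
    contentUrl: video.url ?? undefined,
    // Optional keys are only added when a value is present.
    ...(video.video?.duration ? { duration: video.video.duration } : {}),
    ...(hasCaptions ? { accessibilityFeature: ["captions"] } : {}),
  }
}
```

Keys whose value is undefined are dropped by JSON.stringify, so they do not appear in the emitted script tag.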

daniellefrappier18

Looks good but please see my comment regarding the missing uploadDate

pdpinch · 2026-05-05T21:11:07Z

We would like to get the captions merged and deployed asap. The metadata is secondary and can be fixed up in a follow-up PR if necessary.

ChristopherChudzicki

I'm going to leave the testing / re-review to @daniellefrappier18 , but I added a few comments.

Overall suggestion: This PR adds some SEO work in addition to captions. It would be helpful to mention that in the title / PR description. (It could also have been a separate PR, which often helps things get merged faster.)

That said, thanks for teaching me about JSON-LD!

ChristopherChudzicki · 2026-05-06T00:22:12Z

+          thumbnailUrl:
+            video.video?.cover_image_url || video.image?.url || undefined,
+          contentUrl: video.url ?? undefined,
+          ...(video.video?.duration ? { duration: video.video.duration } : {}),
+          ...(captionUrls.length > 0
+            ? { accessibilityFeature: ["captions"] }
+            : {}),


@ahtesham-quraish my inclination is to trust the API... I expect the API duration strings are valid, just the OpenAPI spec doesn't reflect that. It might be worth checking. Not that validating here hurts, but we should in general trust the API / improve the spec rather than patch that sort of thing on the frontend.

Don't let this block you, captions are important to get out.

ChristopherChudzicki · 2026-05-06T00:28:17Z

    <Styled.PageWrapper>
+      {structuredData && (
+        <script
+          type="application/ld+json"


I was not familiar with JSON-LD objects. This seems like a good addition! 👍

Suggestion: It might be worth dropping a reference (here's the nextjs docs reference, https://nextjs.org/docs/app/guides/json-ld ... but maybe there's a more canonical one from google, since this certainly isn't nextjs-specific).

BTW: The nextjs docs use a slightly broader regex escape.
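To illustrate the broader escape mentioned here, a minimal sketch that escapes every "<" rather than only "</". The function name serializeJsonLd is hypothetical, and the exact replacement used in the Next.js guide should be checked against the linked page; this is believed to match it but is written from memory.

```typescript
// Serialize structured data for embedding in a <script type="application/ld+json"> tag.
function serializeJsonLd(data: Record<string, unknown>): string {
  // Escaping every "<" covers "</script>" as well as "<!--" and "<script".
  // "\u003c" is a valid JSON string escape, so parsers still read it as "<".
  return JSON.stringify(data).replace(/</g, "\\u003c")
}
```

The result would be passed as the __html value in place of the narrower /<\//g replacement shown in the hunk.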

ChristopherChudzicki · 2026-05-06T00:53:30Z

+          {/* Caption track links – visually hidden but present in the DOM so
+              Googlebot can follow each VTT URL and index the caption text,
+              associating the transcript content with this video page. */}


Curious—Was there a source that encourages implementation of track data this way?

Request: I'm hesitant to add this—it could actually be detrimental to screenreader users. The purpose of sr-only visually hidden content is to assist screenreader users / assistive tech, not to make content available to Googlebot. Unless there's a good reference indicating this pattern, I think we should skip it for now.

Alternatives:

Rely on the track elements added by VideoJS; SEO bots can see those already (though as far as I know, Google doesn't explicitly tell us they use them).

Document the captions track in JSON-LD: https://schema.org/VideoObject

The schema does have captions, but as far as I can tell, they are not mentioned in https://developers.google.com/search/docs/appearance/structured-data/video
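As a sketch of the JSON-LD alternative, the caption URLs could be mapped to schema.org MediaObject entries for the VideoObject caption property. The CaptionSource interface and buildCaptionEntries name are hypothetical, the text/vtt encoding format assumes the files are WebVTT, and nothing here implies that Google reads this property.

```typescript
// Hypothetical shape of one entry in resource.video.caption_urls.
interface CaptionSource {
  url: string
  language: string
  language_name?: string | null
}

// Build values for the schema.org VideoObject "caption" property.
function buildCaptionEntries(captionUrls: CaptionSource[]) {
  return captionUrls.map((c) => ({
    "@type": "MediaObject",
    contentUrl: c.url,
    encodingFormat: "text/vtt", // assumes WebVTT caption files
    inLanguage: c.language,
  }))
}
```

The resulting array could be spread into the structured data as `caption: buildCaptionEntries(captionUrls)` when the list is non-empty.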

ChristopherChudzicki · 2026-05-06T12:19:53Z

+          src: track.url,
+          srclang: track.language,
+          label: track.language_name || track.language,
+          default: index === 0,


This turns captions on by default; I dunno if that's our intent.
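One way to make that choice explicit is a flag on the mapping, sketched below. The names toTextTracks, CaptionUrlEntry, TextTrackOptions, and onByDefault are hypothetical; the mapped fields follow the hunk above, and the kind value is an assumption.

```typescript
// Hypothetical shape of one entry in resource.video.caption_urls.
interface CaptionUrlEntry {
  url: string
  language: string
  language_name?: string | null
}

interface TextTrackOptions {
  kind: "captions"
  src: string
  srclang: string
  label: string
  default: boolean
}

// Map API caption entries to text-track options; captions stay off unless requested.
function toTextTracks(
  captionUrls: CaptionUrlEntry[],
  onByDefault = false,
): TextTrackOptions[] {
  return captionUrls.map(
    (track, index): TextTrackOptions => ({
      kind: "captions",
      src: track.url,
      srclang: track.language,
      label: track.language_name || track.language,
      // Only the first track is ever marked default, and only when opted in.
      default: onByDefault && index === 0,
    }),
  )
}
```

Calling toTextTracks(captionUrls) keeps captions off until the viewer enables them; toTextTracks(captionUrls, true) reproduces the behavior in the hunk.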

Ahtesham Quraish and others added 2 commits May 4, 2026 12:06

fix: add captions of video

5833c26

fixed the transcript for odl videos

61b2031

ahtesham-quraish force-pushed the ahtesham/video-transcript branch from 507606b to 61b2031 Compare May 4, 2026 14:26

ahtesham-quraish changed the title ~~Ahtesham/video transcript~~ fix: add video captions on the video detail page May 4, 2026

rebase with main

3c027b9

ahtesham-quraish marked this pull request as ready for review May 5, 2026 06:52

Copilot AI review requested due to automatic review settings May 5, 2026 06:52

Copilot started reviewing on behalf of ahtesham-quraish May 5, 2026 06:53 View session

Copilot AI reviewed May 5, 2026

add unit test

9d438be

daniellefrappier18 self-assigned this May 5, 2026

daniellefrappier18 reviewed May 5, 2026

daniellefrappier18 requested changes May 5, 2026

ChristopherChudzicki requested changes May 6, 2026

Ahtesham Quraish added 2 commits May 6, 2026 12:02

address the feedback

1c359f1

fix the unit tests

cfa5bff

ChristopherChudzicki reviewed May 6, 2026

ChristopherChudzicki approved these changes May 6, 2026

daniellefrappier18 approved these changes May 6, 2026

ahtesham-quraish merged commit 6063a3d into main May 6, 2026
13 checks passed

ahtesham-quraish deleted the ahtesham/video-transcript branch May 6, 2026 12:33

odlbot mentioned this pull request May 6, 2026

Release 0.66.7 #3303

Merged

1 task
