Transcripts #3
Comments
Should support linking to an external document, to keep feed sizes down.
What are the existing standards?
Media RSS (mrss) covers subtitles and text: http://www.rssboard.org/media-rss#media-subtitle and http://www.rssboard.org/media-rss#media-text
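For illustration, here is a rough sketch of what an external transcript link could look like using the Media RSS `media:subTitle` element, built with Python's standard `xml.etree.ElementTree`. The episode title, the example.com URL, the `text/vtt` type, and the `add_subtitle_link` helper are all assumptions made up for this example, not something anyone in the thread proposed.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by Media RSS.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def add_subtitle_link(item, href, mime_type, lang):
    """Attach a media:subTitle element that points at an external file.

    Hypothetical helper for this sketch; the feed only carries the link,
    so the transcript itself does not add to the feed size.
    """
    sub = ET.SubElement(item, f"{{{MEDIA_NS}}}subTitle")
    sub.set("type", mime_type)
    sub.set("lang", lang)
    sub.set("href", href)
    return sub


item = ET.Element("item")
ET.SubElement(item, "title").text = "Episode 1"  # placeholder title
add_subtitle_link(item, "https://example.com/ep1.vtt", "text/vtt", "en-us")

xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

Only the link lives in the feed here, which is the point of the first comment: the transcript file can be as large as it needs to be without bloating the feed.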
A good summary of choices: https://en.wikipedia.org/wiki/Timed_text
One Very Good Thing™ to do would be to act as a neutral but opinionated body on which existing RSS standards should be supported. Media RSS has some very nice ideas, but the well-intentioned folks behind it left its nurturing to fate. That's not how you bootstrap a community standard.
Like the idea of including a transcript as a separate link. Also like encouraging the formatting of that transcript using one of the standards from https://en.wikipedia.org/wiki/Timed_text that @CharlesWiltgen mentioned. However, using timed text presents challenges similar to those discussed in the other issue on ad insertion: keeping the transcript in sync with the latest audio file it's pointed at.
On the bright side, we're standing on the shoulders of giants (ffmpeg, AV Foundation, etc.) that are really good at this kind of EDL-like manipulation. Even for folks that have to roll their own (web apps, maybe?), it's conceptually straightforward. For example, if a 0:10 ad is inserted at 2:30, any events that happen at or after 2:30 are simply offset by 0:10. Not saying it wouldn't be a PITA.
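The offset rule described above can be sketched in a few lines of Python. This is a minimal sketch, not a full EDL implementation: the `shift_cues` name and the `(start, end, text)` tuple layout are assumptions for this example, times are in seconds, and a cue that straddles the insertion point is left untouched, which a real tool would have to handle.

```python
def shift_cues(cues, insert_at, duration):
    """Offset timed-text cues after audio of `duration` seconds is inserted.

    `cues` is a list of (start, end, text) tuples with times in seconds
    (a layout assumed for this sketch). Cues that start at or after
    `insert_at` move later by `duration`; earlier cues are unchanged.
    Cues that straddle the insertion point are NOT adjusted here.
    """
    shifted = []
    for start, end, text in cues:
        if start >= insert_at:
            start += duration
            end += duration
        shifted.append((start, end, text))
    return shifted


# A 0:10 ad inserted at 2:30 (150 seconds).
cues = [
    (140.0, 145.0, "before the ad"),
    (150.0, 155.0, "right at the insertion point"),
    (200.0, 204.0, "well after the ad"),
]
result = shift_cues(cues, insert_at=150.0, duration=10.0)
print(result)
```

With several insertions, applying them from the latest insertion point to the earliest keeps every `insert_at` expressed in the original timeline.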
I would like to point the discussion to WebVTT. WebVTT is an existing spec with lots of possible time-based features (not only the text, but also speaker names, styling, or any other custom data such as GPS coordinates), and quite a few systems support it already (screen readers, (web) audio players with WebVTT display and search, software libraries, etc.).
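To make the WebVTT suggestion concrete, here is a small sketch that renders cues with speaker names into a WebVTT document, using the format's `<v Speaker>` voice span. The `render_webvtt` and `format_timestamp` helpers and the sample dialogue are assumptions for this example; the sketch also skips the escaping that cue text containing `&` or `<` would need.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def render_webvtt(cues):
    """Render (start, end, speaker, text) tuples as a WebVTT document.

    The tuple layout is an assumption of this sketch. Cue text is written
    as-is, so it must not contain characters that need escaping.
    """
    lines = ["WEBVTT", ""]
    for start, end, speaker, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(f"<v {speaker}>{text}")
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


vtt = render_webvtt([
    (0.0, 4.5, "Host", "Welcome to the show."),
    (150.0, 153.0, "Guest", "Thanks for having me."),
])
print(vtt)
```

Because the speaker name is part of each cue, a player can display or search by speaker without a separate metadata format.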
Once again, I think that as long as the metadata is tightly bound to the media file (e.g. through an ID3 tag or a Link header), we can do whatever. WebVTT is definitely the obvious choice, though I think it remains to be seen whether timed text is an important feature of these transcripts. If we make it a requirement, we may reduce the level of participation. If we don't make timed text a requirement, the data can be encoded directly in the feed.
Agreed, transcripts (untimed) and subtitles/captions (timed) are both useful. I'd like to see both defined, in the same way that MediaRSS sort of[1] does with media:text and media:subtitle. FWIW, I like WebVTT as the timed text format. It's supported in 82% of browsers in use worldwide and 96% in the USA, and there are apparently polyfills available for older browsers. The only potential downside is that I don't see any native iOS or Android parsers (iOS has one, but it only appears to work in HLS contexts), so that might create a bit of a chicken/egg problem initially. For the transcript format, it sure would be nice to be able to use Markdown (.md). If full HTML is supported, I think it's likely that enterprising people will use this for all kinds of things that go well beyond the intent. [1] IMO they didn't quite nail it, because both can be used for timed text. People consuming the spec shouldn't be wondering which to use when.