Upload a sub-rip title file to add captions to a recording #3864

Open
ffdixon opened this Issue Apr 28, 2017 · 1 comment

ffdixon commented Apr 28, 2017

Extend the API to enable a 3rd party front-end to POST a sub-rip title file (SRT file) to the BigBlueButton server along with the recording ID.

The BigBlueButton server should then reprocess the recording and use the SRT file to place subtitles in the playback.

browniecab commented May 25, 2018

We will need to add two new API calls to the server to get and upload captions, as described in the attached document.

API

Additional API calls will be added to the /bigbluebutton/api endpoint. These will be publicly exposed, so that an LMS integration could provide the ability to upload subtitles directly from a recording list, for example.

getRecordingTextTracks

Get a list of the caption/subtitle files currently available for a recording. It will include information about the captions (language, etc.), as well as a download link. This may be useful to retrieve live or automatically transcribed subtitles from a recording for manual editing.

GET parameters

recordID

Type: String, Required

A single recording ID to retrieve the available captions for. (Unlike other recording APIs, you cannot provide a comma-separated list of recordings.)

Response

An example response looks like the following:

<response>
 <returncode>SUCCESS</returncode>
 <tracks>
   <track
       kind="captions"
       lang="en-US"
       label="English"
       source="live"
       href="https://example.com/XXX/en-US-live.vtt"
     />
   <track …>
 </tracks>
</response>

The <track> tag has the following attributes:

kind

Indicates the intended use of the text track. The value will be one of the following strings:

  • subtitles
  • captions

The meaning of these values is defined by the HTML5 video element; see the MDN docs for details. Note that the HTML5 specification defines additional values which are not currently used here, but which may be added at a later date.

lang

The language of the text track, as a language tag. See RFC 5646 for details on the format, and the Language subtag lookup for assistance using them. It will usually consist of a 2 or 3 letter language code in lowercase, optionally followed by a dash and a region code (usually a 2 letter country code in uppercase; RFC 5646 also allows a 3 digit region code).

label

A human-readable label for the text track. This is the string displayed in the subtitle selection list during recording playback.

source

Indicates where the track came from. The value will be one of the following strings:

  • live - A caption track derived from live captioning performed in BigBlueButton.
  • automatic - A caption track generated automatically via computer voice recognition.
  • upload - A caption track uploaded by a 3rd party.

href

A link to download this text track file. The format will always be WebVTT (text/vtt mime type), which is similar to the SRT format.

The timing of the track will match the current recording playback video and audio files. Note that if the recording is edited (adjusting in/out markers), tracks from live or automatic sources will be re-created with the new timing. Uploaded tracks will be edited, but this may result in data loss if sections of the recording are removed during edits.

Errors

In addition to the standard BigBlueButton checksum error, this API call can return the following errors in <messageKey> when returncode is FAILED:

missingParameter

A required parameter is missing.

noRecordings

No recording was found matching the provided recording ID.
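
As an illustration of how a client might consume this response, here is a minimal Python sketch that parses the XML into a list of dictionaries and turns a FAILED response into an exception. The names `parse_text_tracks` and `TextTrackError` are hypothetical, and the sketch assumes that `<messageKey>` is a direct child of `<response>`, which the description above does not state explicitly.

```python
import xml.etree.ElementTree as ET

# Attributes documented for the <track> tag.
TRACK_ATTRIBUTES = ("kind", "lang", "label", "source", "href")


class TextTrackError(Exception):
    """Raised when the API reports returncode FAILED (hypothetical helper)."""

    def __init__(self, message_key):
        super().__init__(message_key)
        self.message_key = message_key


def parse_text_tracks(xml_text):
    """Return one dict per <track> element, or raise TextTrackError."""
    root = ET.fromstring(xml_text)
    if root.findtext("returncode") != "SUCCESS":
        # Assumption: <messageKey> sits directly under <response> and holds
        # values such as missingParameter or noRecordings.
        raise TextTrackError(root.findtext("messageKey", default="unknown"))
    return [
        {name: track.get(name) for name in TRACK_ATTRIBUTES}
        for track in root.iterfind("tracks/track")
    ]


SAMPLE = """<response>
 <returncode>SUCCESS</returncode>
 <tracks>
   <track kind="captions" lang="en-US" label="English" source="live"
          href="https://example.com/XXX/en-US-live.vtt"/>
 </tracks>
</response>"""

tracks = parse_text_tracks(SAMPLE)
```

The resulting dictionaries can be fed directly to a download step for manual editing, since `href` always points at a WebVTT file.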

putRecordingTextTrack

Upload a caption or subtitle file to add it to the recording. If there is any existing track with the same values for kind and lang, it will be replaced.

Note that this API requires using a POST request. The parameters listed as GET parameters must be included in the request URI, and the actual uploaded file must be included in the body of the request in the multipart/form-data format.

Note that the standard BigBlueButton checksum algorithm must be performed on the GET parameters, but that the body of the request (the subtitle file) is not checksummed.
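
To make the signing step concrete, the following Python sketch builds a signed URL. It assumes this call uses the usual BigBlueButton scheme, in which the checksum is the SHA-1 hex digest of the call name, the query string, and the shared secret concatenated in that order. The function name `signed_url`, the host, the recording ID, and the secret are all placeholders.

```python
import hashlib
from urllib.parse import urlencode


def signed_url(base_url, call_name, params, shared_secret):
    """Build an API URL whose checksum covers only the query parameters."""
    query = urlencode(params)
    # Assumed scheme: SHA-1 over call name + query string + shared secret.
    # The uploaded file in the POST body is not part of the checksum.
    digest = hashlib.sha1(
        (call_name + query + shared_secret).encode("utf-8")
    ).hexdigest()
    separator = "&" if query else ""
    return f"{base_url}/{call_name}?{query}{separator}checksum={digest}"


url = signed_url(
    "https://bbb.example.com/bigbluebutton/api",  # placeholder host
    "putRecordingTextTrack",
    {"recordID": "abc123", "kind": "subtitles", "lang": "en-US", "label": "English"},
    "not-a-real-secret",  # placeholder shared secret
)
```

Because the body is not checksummed, the same URL can be signed before the file to upload is known.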

This design allows a web application to generate a form with a signed URL and display it in the browser with a file upload selection box. When the user submits the form, the browser uploads the track directly to the recording API. The API may be used programmatically as well, of course.

This API is asynchronous. It can take several minutes for the uploaded file to be incorporated into the published recording, and if an uploaded file contains unrecoverable errors, it may never appear.
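
Since the upload is asynchronous, a client that needs confirmation could poll getRecordingTextTracks until the track shows up. The sketch below is one way to do that in Python; `wait_for_track` is a hypothetical helper, `fetch_xml` is a caller-supplied function that returns the getRecordingTextTracks response body, and the attempt count and delay are arbitrary defaults rather than recommended values.

```python
import time
import xml.etree.ElementTree as ET


def wait_for_track(fetch_xml, kind, lang, attempts=10, delay_seconds=30.0,
                   sleep=time.sleep):
    """Poll until an uploaded track with this kind and lang is listed.

    Returns the track's href, or None if it never appears.
    """
    for attempt in range(attempts):
        root = ET.fromstring(fetch_xml())
        for track in root.iterfind("tracks/track"):
            if (track.get("kind") == kind
                    and track.get("lang") == lang
                    and track.get("source") == "upload"):
                return track.get("href")
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before asking again
    return None
```

This only detects a track that was not there before. When replacing an existing uploaded track, the old entry matches immediately, so a client would have to compare the downloaded file contents instead. A `None` result is also ambiguous: the file may still be processing, or it may have been rejected for unrecoverable errors.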

GET Parameters

recordID

Type: String, Required

A single recording ID identifying the recording to add the text track to. (Unlike other recording APIs, you cannot provide a comma-separated list of recordings.)

kind

Type: String, Required

Indicates the intended use of the text track. See the getRecordingTextTracks description for details. Using a value other than one listed in this document will cause an error to be returned.

lang

Type: String, Required

The language of the text track, as a language tag. See the getRecordingTextTracks description for details. The API will check that the language tag is well-formed, and return an error if it is not.

label

Type: String, Optional

A human-readable label for the text track. If not specified, the system will automatically generate a label containing the name of the language identified by the lang parameter.
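
A client may want to check its parameters before sending them. The Python sketch below mirrors the documented error keys on the client side. It is not the server's validation: the regular expression accepts only a common subset of RFC 5646 (language, optional script, optional region), and the names `looks_like_language_tag` and `first_error` are hypothetical.

```python
import re

# Values of kind listed in this document.
ALLOWED_KINDS = {"subtitles", "captions"}

# Simplified shape: language[-Script][-REGION]. Real RFC 5646 tags allow more
# (variants, extensions, private use), so this is deliberately conservative.
_SIMPLE_TAG = re.compile(
    r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-(?:[A-Za-z]{2}|[0-9]{3}))?"
)


def looks_like_language_tag(tag):
    """Return True if tag matches the simplified language tag shape."""
    return _SIMPLE_TAG.fullmatch(tag) is not None


def first_error(params):
    """Return the messageKey the request would likely trigger, or None."""
    for name in ("recordID", "kind", "lang"):
        if not params.get(name):
            return "missingParameter"
    if params["kind"] not in ALLOWED_KINDS:
        return "invalidKind"
    if not looks_like_language_tag(params["lang"]):
        return "invalidLang"
    return None
```

For example, `first_error({"recordID": "abc123", "kind": "chapters", "lang": "en"})` returns `"invalidKind"`, because `chapters` is defined by HTML5 but is not one of the values listed here.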

POST Body

If the request has a body, the Content-Type header must specify multipart/form-data. The following parameters may be encoded in the POST body.

file

Type: Binary Data, Optional

Contains the uploaded subtitle or caption file. If this parameter is missing, or if the POST request has no body, then any existing text track matching the kind and lang specified will be deleted.

If known, the uploading application should set the Content-Type to a value appropriate to the file format. If Content-Type is unset, or does not match a known subtitle format, the uploaded file will be probed to automatically detect the type.

Multiple types of subtitles are accepted for upload, but they will be converted to the WebVTT format for display.

The size of the request is limited (TODO: determine the limit)

The following types of subtitle files are accepted:

  • SRT (SubRip Text), including basic formatting. SRT does not have a standard mime type, but application/x-subrip is accepted.
  • SSA or ASS (Sub Station Alpha, Advanced Sub Station). Most formatting will be discarded, but basic inline styles (bold, italic, etc.) may be preserved.
    SSA/ASS does not have a standard mime type.
  • WebVTT. Uploaded WebVTT files will be used as-is, but note that browser support varies, so including REGION or STYLE blocks is not recommended.
    The WebVTT mime type is text/vtt.
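
For programmatic uploads, the request body can be assembled by hand. The Python sketch below builds a multipart/form-data body with a single `file` part and sets the part's Content-Type only for the formats that have a type named above. The helper name `build_multipart` and the extension mapping are assumptions made for illustration. SSA/ASS files are sent without a part Content-Type so that the server probes them.

```python
import os
import uuid

# Mime types named in this document; SSA/ASS has none, so it is left out.
CONTENT_TYPES = {".srt": "application/x-subrip", ".vtt": "text/vtt"}


def build_multipart(filename, data, boundary=None):
    """Return (Content-Type header value, body bytes) for one 'file' part.

    Assumes the filename contains no double quotes or line breaks.
    """
    boundary = boundary or uuid.uuid4().hex
    name = os.path.basename(filename)
    headers = [
        f"--{boundary}",
        f'Content-Disposition: form-data; name="file"; filename="{name}"',
    ]
    content_type = CONTENT_TYPES.get(os.path.splitext(name)[1].lower())
    if content_type:
        headers.append(f"Content-Type: {content_type}")
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + data + tail


srt = b"1\n00:00:01,000 --> 00:00:02,000\nHello\n"
header, body = build_multipart("talk.srt", srt, boundary="XBOUNDARY")
```

The returned header value goes in the request's Content-Type header, and the body is sent as the POST payload to the signed putRecordingTextTrack URL.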

Errors

In addition to the standard BigBlueButton checksum error, this API call can return the following errors in <messageKey> when returncode is FAILED:

missingParameter

A required parameter is missing.

noRecordings

No recording was found matching the provided recording ID.

invalidKind

The kind parameter is not set to a permitted value.

invalidLang

The lang parameter is not a well-formed language tag.

The uploaded text track is not validated during upload. If it is invalid, it will be ignored and the existing subtitle will not be replaced.
