Copyright 2024 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# YouTube Tech Services Translation

This notebook translates existing YouTube videos. It covers automated caption translation and upload as well as automated speech synthesis (text-to-speech generation).

Requirements:


*   Google Drive
*   Google Cloud project with [YouTube Data API](https://console.cloud.google.com/apis/library/youtube.googleapis.com), [YouTube Content ID API](https://console.cloud.google.com/apis/library/youtubepartner.googleapis.com), [Google Translate API](https://console.cloud.google.com/apis/library/translate.googleapis.com) (billing account required) and [Google TextToSpeech API](https://console.cloud.google.com/apis/library/texttospeech.googleapis.com) (billing account required) enabled
*   Google Cloud Project Service Account with Key ([Help Center](https://cloud.google.com/iam/docs/service-accounts-create#iam-service-accounts-create-console))
*   YouTube CMS (External Content Owner ID) and YouTube Video ID

This script will:


*   Download the existing (auto-generated) captions to the Google Drive folder
*   Translate the existing captions into the configured languages, store them in the Google Drive folder, and [optional] upload them directly to your YouTube video
*   Synthesize the translated captions and store them in the Google Drive folder
*   Provide an option to test the synthesized audio against the existing video

Using the Google Translate API and the Google TextToSpeech API may incur costs at higher usage volumes. Please check the current pricing here:


*   https://cloud.google.com/translate/pricing
*   https://cloud.google.com/text-to-speech/pricing
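To get a rough feel for the volume involved, the sketch below counts the characters of the caption text in an SRT string and multiplies them by a price you supply. It assumes that both APIs bill by the number of characters sent, which should be confirmed on the pricing pages above. The function names and the `price_per_million_chars` parameter are hypothetical and not part of this notebook, and no real prices are included.

```python
import re


def count_caption_characters(srt_text):
  """Counts the characters of the caption text lines in an SRT string."""
  total = 0
  # SRT blocks are separated by blank lines
  for block in re.split(r'\n\s*\n', srt_text.strip()):
    lines = block.splitlines()
    # lines[0] is the index, lines[1] the time range, the rest is text
    total += sum(len(line) for line in lines[2:])
  return total


def estimate_cost(characters, number_of_languages, price_per_million_chars):
  """Rough estimate; the price has to be taken from the pricing pages."""
  return characters * number_of_languages * price_per_million_chars / 1_000_000


sample_srt = """1
00:00:01,000 --> 00:00:02,000
Hallo Welt

2
00:00:02,500 --> 00:00:04,000
Wie geht es dir?
"""
sample_characters = count_caption_characters(sample_srt)
print(sample_characters)
```

The same character count applies once per target language for translation and once per target language for synthesis, so the estimate should be computed for each API separately.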

# Install the required libraries

1. Library to use SRT (caption file format): **pysrt**

2. Library to use SSML (Speech Synthesis Markup Language): **pyssml**

3. Some libraries to support text analysis for Asian languages

4. Update the Google OAuth libraries to the newest version


In [None]:
!pip3 install 'pysrt'
!pip3 install 'pyssml'
!pip3 install 'jieba' # chinese
!pip3 install 'nagisa' # japanese
!pip3 install 'konlpy' # korean
!pip3 install 'pyvi' # vietnamese
!pip3 install 'google-api-python-client==2.84.0'
!pip3 install 'google-auth==2.17.3'
!pip3 install 'google-auth-httplib2==0.1.1'
!pip3 install 'google-auth-oauthlib==1.0.0'

# Configuration

Please adjust the settings below:

*  EXTERNAL_CONTENT_OWNER_ID: *ID of your YouTube Content Owner*
*  EXTERNAL_VIDEO_ID: *ID of the video (part of the URL)*
*  VOICE_START_AT_MSEC: *Time (in milliseconds) at which the first voice appears (offset). If you don't know it, note the second at which the first voice appears and multiply it by 1000*
*  BREAK_BETWEEN_SENTENCE: *Time (in milliseconds) between sentences*
*  CAPTION_ORIGINAL_LANGUAGE: *original language of the video*
*  CAPTION_TRANSLATION_LANGUAGE: *languages you want to translate into (please check supported languages [here](https://cloud.google.com/text-to-speech/docs/voices)), format example ['en', 'de', 'ja']*
*  TRANSLATION_VOICE_GENDER: *gender of the AI voice*
*  VOICE_SPEED: *Speed of the voice to better control the sync of the translation*
*  DRIVE_FOLDER: *Folder in your Google Drive to store captions and translated audio versions, if the Drive folder is not available it will be created*
*  JSON_FILE: *JSON file with the service account credentials; you need to upload this file to the configured Drive folder*
*  AUTO_UPLOAD_TRANSLATED_CAPTIONS: *should the script upload translated captions (default = false)*
*  AUTO_LABEL_ASSET: *should the script label the asset belonging to this video so that the usage of this script can be tracked (default = false)*
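Because a typo in one of these settings only surfaces late (for example when the TextToSpeech request is sent), it can help to check them up front. The following is a minimal sketch, not part of the original notebook: `validate_settings` is a hypothetical helper, and the allowed values are taken from the comments in the configuration cell below.

```python
ALLOWED_GENDERS = {'MALE', 'FEMALE', 'NEUTRAL'}
ALLOWED_SPEEDS = {'x-slow', 'slow', 'medium', 'fast', 'x-fast', 'default'}


def validate_settings(gender, speed, target_languages, voice_start_ms, break_ms):
  """Returns a list of human-readable problems (empty if all is fine)."""
  problems = []
  if gender not in ALLOWED_GENDERS:
    problems.append('TRANSLATION_VOICE_GENDER must be one of '
                    + ', '.join(sorted(ALLOWED_GENDERS)))
  if speed not in ALLOWED_SPEEDS:
    problems.append('VOICE_SPEED must be one of '
                    + ', '.join(sorted(ALLOWED_SPEEDS)))
  # the translation loop iterates over this value, so it has to be a list
  if not isinstance(target_languages, (list, tuple)) or not target_languages:
    problems.append('CAPTION_TRANSLATION_LANGUAGE must be a non-empty list')
  if voice_start_ms < 0 or break_ms < 0:
    problems.append('time values must not be negative')
  return problems


print(validate_settings('NEUTRAL', 'default', ['en'], 2000, 0))
```

In the notebook, the call would receive `TRANSLATION_VOICE_GENDER`, `VOICE_SPEED`, `CAPTION_TRANSLATION_LANGUAGE`, `VOICE_START_AT_MSEC` and `BREAK_BETWEEN_SENTENCE` after the configuration cell has run.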



In [None]:
# add the YouTube Content Owner ID
EXTERNAL_CONTENT_OWNER_ID = ''  # @param {type:"string"}

# YouTube video ID you want to translate
EXTERNAL_VIDEO_ID = ''  # @param {type:"string"}

# estimated time (in milliseconds) at which the first voice is heard in the video
# if you don't know it, stop the video on YouTube when the first voice
# appears and multiply the number of seconds by 1000
VOICE_START_AT_MSEC = 2  # @param {type:"integer"}

# break time between each sentence to better sync
BREAK_BETWEEN_SENTENCE = 0  # @param {type:"integer"}

# original language of the YouTube video you want to translate
CAPTION_ORIGINAL_LANGUAGE = 'de'  # @param {type:"string"}

# list of languages you want to translate into
# (please check supported languages here: https://cloud.google.com/text-to-speech/docs/voices)
# format example ['en', 'de', 'ja']
CAPTION_TRANSLATION_LANGUAGE = ['en']  # @param {type:"raw"}

# gender of the AI voice ['MALE','FEMALE','NEUTRAL']
TRANSLATION_VOICE_GENDER = 'NEUTRAL'  # @param {type:"string"}

# speed of the voice: "x-slow", "slow", "medium", "fast", "x-fast", or "default"
VOICE_SPEED = 'default'  # @param {type:"string"}

# Google Drive folder where the captions / audio files will be stored
DRIVE_FOLDER = 'PTMTranslate'  # @param {type:"string"}

# Service account authentication file
# if you download the JSON file from your Google Cloud project, the file has
# a different name, so you either need to rename the file itself or
# adjust the name here
# you need to upload this file to the configured Drive folder
JSON_FILE = 'client_secret.json'  # @param {type:"string"}

# automatically upload translated captions to YouTube
AUTO_UPLOAD_TRANSLATED_CAPTIONS = False  # @param {type:"boolean"}

# label the asset so that your YouTube PTM team can see which assets were updated by this script
# this is voluntary and the script works perfectly well without it, but it helps us to track
# the usage of the script
AUTO_LABEL_ASSET = False  # @param {type:"boolean"}


# Connect to Drive

Mount the Google Drive destination and create a specific folder for the script and per video.

**NOTE: If this is the first time you run the code, you need to upload your service account key file (`client_secret.json` by default, see `JSON_FILE`) to the Drive folder after it has been created.**
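A missing or wrong key file otherwise only shows up when the first API service is built. As a hedged sketch, the check below verifies that the file exists and looks like a service account key. `check_service_account_file` is a hypothetical helper that is not part of this notebook; the fields it checks (`type`, `client_email`, `private_key`) are standard fields of a downloaded service account key, and the demo uses placeholder values rather than a real key.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_FIELDS = ('client_email', 'private_key')


def check_service_account_file(path):
  """Returns None if the file looks usable, otherwise a short message."""
  path = Path(path)
  if not path.is_file():
    return 'file not found: ' + str(path)
  try:
    data = json.loads(path.read_text())
  except json.JSONDecodeError:
    return 'file is not valid JSON'
  if not isinstance(data, dict) or data.get('type') != 'service_account':
    return "expected a JSON object with type 'service_account'"
  missing = [field for field in REQUIRED_FIELDS if field not in data]
  if missing:
    return 'missing fields: ' + ', '.join(missing)
  return None


# demo with a throwaway file and placeholder values (not a real key)
with tempfile.TemporaryDirectory() as folder:
  demo_path = Path(folder) / 'client_secret.json'
  result_before_upload = check_service_account_file(demo_path)
  demo_path.write_text(json.dumps({
      'type': 'service_account',
      'client_email': 'demo@example.com',
      'private_key': 'placeholder',
  }))
  result_after_upload = check_service_account_file(demo_path)

print(result_before_upload)
print(result_after_upload)
```

In the notebook, the path to check would be `'/content/drive/My Drive/' + DRIVE_FOLDER + '/' + JSON_FILE`.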

In [None]:
from pathlib import Path
from google.colab import drive

drive.mount('/content/drive', force_remount=True)
DRIVE_DIRECTORY_PATH = (
    '/content/drive/My Drive/' + DRIVE_FOLDER + '/' + EXTERNAL_VIDEO_ID + '/'
)

# create folder if necessary
Path(DRIVE_DIRECTORY_PATH).mkdir(parents=True, exist_ok=True)


# Authentication and Google Service Usage

Define the Google API services used by this notebook and the helper functions to access them.

In [None]:
import json
import re
from google.oauth2 import service_account
from googleapiclient import discovery

# scope, name, version for YouTube data API
YOUTUBE_DATA_API_SERVICE_NAME = 'youtube'
YOUTUBE_DATA_API_VERSION = 'v3'
YOUTUBE_DATA_API_SCOPE = [
    'https://www.googleapis.com/auth/youtube.force-ssl',
    'https://www.googleapis.com/auth/youtubepartner',
]

YOUTUBE_PARTNER_API_SERVICE_NAME = 'youtubePartner'
YOUTUBE_PARTNER_API_VERSION = 'v1'
YOUTUBE_PARTNER_API_SCOPE = [
    'https://www.googleapis.com/auth/youtube.force-ssl',
    'https://www.googleapis.com/auth/youtubepartner',
]

# scope, name, version for Google Translate API
GOOGLE_TRANSLATE_API_SERVICE_NAME = 'translate'
GOOGLE_TRANSLATE_API_VERSION = 'v2'
GOOGLE_TRANSLATE_API_SCOPE = ['https://www.googleapis.com/auth/cloud-platform']

# scope, name, version for Google TextToSpeech API
GOOGLE_TEXT_TO_SPEECH_API_SERVICE_NAME = 'texttospeech'
GOOGLE_TEXT_TO_SPEECH_API_VERSION = 'v1'
GOOGLE_TEXT_TO_SPEECH_API_SCOPE = [
    'https://www.googleapis.com/auth/cloud-platform'
]


def get_credentials(scopes):
  # load the service account key from the configured Drive folder
  key_path = '/content/drive/My Drive/' + DRIVE_FOLDER + '/' + JSON_FILE
  with open(key_path) as f:
    service_account_json = json.load(f)
  credentials = service_account.Credentials.from_service_account_info(
      service_account_json, scopes=scopes
  )
  return credentials


def get_youtube_data_api_service():
  return discovery.build(
      YOUTUBE_DATA_API_SERVICE_NAME,
      YOUTUBE_DATA_API_VERSION,
      credentials=get_credentials(YOUTUBE_DATA_API_SCOPE),
  )


def get_youtube_partner_api_service():
  return discovery.build(
      YOUTUBE_PARTNER_API_SERVICE_NAME,
      YOUTUBE_PARTNER_API_VERSION,
      static_discovery=False,
      credentials=get_credentials(YOUTUBE_PARTNER_API_SCOPE),
  )


def get_google_translate_api_service():
  return discovery.build(
      GOOGLE_TRANSLATE_API_SERVICE_NAME,
      GOOGLE_TRANSLATE_API_VERSION,
      credentials=get_credentials(GOOGLE_TRANSLATE_API_SCOPE),
  )


def get_google_text_to_speech_api_service():
  return discovery.build(
      GOOGLE_TEXT_TO_SPEECH_API_SERVICE_NAME,
      GOOGLE_TEXT_TO_SPEECH_API_VERSION,
      credentials=get_credentials(GOOGLE_TEXT_TO_SPEECH_API_SCOPE),
  )


def is_text_in_brackets_or_empty(text):
  # is text empty
  if text == '':
    return True
  # is text latin characters in brackets []
  elif re.fullmatch(r'\[([A-Za-z0-9_]+)\]', text):
    return True
  # is text Devanagari (Unicode block U+0900-U+097F) or another word
  # in brackets []
  elif re.fullmatch(r'\[([\u0900-\u097F\w]+)\]', text):
    return True
  else:
    return False


def is_language_chinese(language_code):
  return language_code == 'zh'


def is_language_japanese(language_code):
  return language_code == 'ja'


def is_language_korean(language_code):
  return language_code == 'ko'


def is_language_vietnamese(language_code):
  return language_code == 'vi'


def is_language_without_space(language_code):
  return (
      is_language_chinese(language_code)
      or is_language_japanese(language_code)
      or is_language_korean(language_code)
      or is_language_vietnamese(language_code)
  )


# Download the caption file of the video

The authentication happens at the YouTube Content Owner level; therefore, the EXTERNAL_CONTENT_OWNER_ID needs to be configured (above). The video needs to be in the configured YouTube Content Owner, and your account needs access to this YouTube Content Owner. The caption file is stored in the configured Google Drive folder.
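The cell below takes the first caption track returned for the video. If a video has several tracks, a language-aware selection may be safer. The sketch below is one possible approach, under the assumption that each item carries `snippet.language` and `snippet.trackKind` as in the YouTube Data API caption resource. `pick_caption_id` is a hypothetical helper, the sample items are made up, and preferring the `standard` track over an auto-generated one is a design choice rather than something this notebook prescribes.

```python
def pick_caption_id(items, language, preferred_track_kind='standard'):
  """Returns the id of a caption track in `language`, or None."""

  def primary_subtag(code):
    # 'de-DE' and 'de' should both match 'de'
    return code.lower().split('-')[0]

  matches = [
      item for item in items
      if primary_subtag(item['snippet']['language']) == primary_subtag(language)
  ]
  if not matches:
    return None
  for item in matches:
    kind = item['snippet'].get('trackKind', '')
    if kind.lower() == preferred_track_kind.lower():
      return item['id']
  # fall back to any track in the requested language
  return matches[0]['id']


# made-up response items for illustration only
sample_items = [
    {'id': 'c1', 'snippet': {'language': 'en', 'trackKind': 'asr'}},
    {'id': 'c2', 'snippet': {'language': 'de', 'trackKind': 'asr'}},
    {'id': 'c3', 'snippet': {'language': 'de-DE', 'trackKind': 'standard'}},
]
print(pick_caption_id(sample_items, 'de'))
```

With the notebook's variables, the call would be `pick_caption_id(response['items'], CAPTION_ORIGINAL_LANGUAGE)`, and a `None` result would indicate that no caption track exists in the original language.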

In [None]:
from io import FileIO
from googleapiclient.http import MediaIoBaseDownload

# get YouTube data API
youtube_data = get_youtube_data_api_service()

# get caption of the original language
response = (
    youtube_data.captions()
    .list(part='snippet', videoId=EXTERNAL_VIDEO_ID)
    .execute()
)

# get caption ID of the original language
caption_id = response['items'][0]['id']

request = youtube_data.captions().download(
    id=caption_id,
    onBehalfOfContentOwner=EXTERNAL_CONTENT_OWNER_ID,
    # download captions in srt format
    tfmt='srt',
)

# output file in Google Drive
path_orig_caption_file = DRIVE_DIRECTORY_PATH + 'orig_caption.srt'
fh = FileIO(path_orig_caption_file, 'wb')

download = MediaIoBaseDownload(fh, request)
complete = False
while not complete:
  status, complete = download.next_chunk()
print('Download Complete!: ' + path_orig_caption_file)


# Data Structure for Translation

Processing the caption file, translating it, and generating the AI speech require a few data structures.
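The grouping idea behind these structures can be illustrated without any SRT objects: consecutive spoken captions are collected into one segment, and a bracketed caption such as `[Musik]` (or an empty one) closes the segment and forms a segment of its own. The sketch below is a simplified stand-in that works on plain strings; `is_marker` and `group_captions` are hypothetical names, and the regular expression is a reduced version of the check used in this notebook.

```python
import re


def is_marker(text):
  # empty caption or a single word in square brackets, e.g. [Musik]
  return text == '' or re.fullmatch(r'\[\w+\]', text) is not None


def group_captions(lines):
  """Groups caption lines into segments that are translated as one text."""
  groups = [[]]
  for line in lines:
    if is_marker(line):
      # the marker becomes its own segment and a fresh segment starts after it
      groups.append([line])
      groups.append([])
    else:
      groups[-1].append(line)
  return groups


captions = ['Hallo und', 'willkommen', '[Musik]', 'Los geht es']
print(group_captions(captions))
```

Translating a whole segment at once gives the translation service full sentences to work with, while the per-line timing is kept in the original captions for the later re-alignment.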

In [None]:
from pysrt import SubRipFile
from pysrt import SubRipItem
from pyssml.PySSML import PySSML


# class to map the entire text and the individual sentences
# this is mainly needed to sync sentences after the translation
class SentenceTextMapper:
  sentence = []
  text = ''

  def __init__(self):
    self.sentence = []
    self.text = ''

  def add_sentence(self, sentence):
    self.sentence.append(sentence)
    self.text = self.text + sentence.text.replace('\n', ' ') + ' '


# represents all data (srt file, language code, sentenceTextMap, ssml,
# base64 mp3) for one translated language
class Language:
  srtfile = None
  languageCode = ''
  sentenceTextMap = []
  ssml = None
  base64_mp3 = ''

  def __init__(self, languageCode):
    self.languageCode = languageCode
    self.srtfile = None
    self.sentenceTextMap = []
    self.base64_mp3 = ''
    self.ssml = None

  # split the srt file into segments
  def process_srt_file(self):
    sentenceTextMapper = SentenceTextMapper()
    self.sentenceTextMap.append(sentenceTextMapper)
    for sentence in self.srtfile:
      if is_text_in_brackets_or_empty(sentence.text):
        sentenceTextMapper = SentenceTextMapper()
        self.sentenceTextMap.append(sentenceTextMapper)
        sentenceTextMapper.add_sentence(sentence)
        sentenceTextMapper = SentenceTextMapper()
        self.sentenceTextMap.append(sentenceTextMapper)
      else:
        sentenceTextMapper.add_sentence(sentence)


# Translate Captions

This segment splits the captions into different parts and translates them into the configured languages. There is also some logic to do some mapping between the original captions and the resulting captions so that the timing of the translated captions is better aligned. The translated captions are stored in the configured Google Drive folder.
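The alignment step can be pictured as distributing the translated words across the original caption lines in proportion to each line's word count, carrying fractional words over to the next line. The following is a simplified sketch of that idea and not the exact logic of the cell below: `distribute_words` is a hypothetical function, and it deviates from the notebook by letting the last line take all remaining words.

```python
def distribute_words(original_lines, translated_words):
  """Splits translated_words into one chunk per original caption line."""
  original_total = sum(len(line.split()) for line in original_lines)
  # e.g. 5 original words and 7 translated words give a factor of 1.4
  factor = len(translated_words) / original_total if original_total else 1
  chunks = []
  index = 0
  carry = 0.0
  for position, line in enumerate(original_lines):
    exact = len(line.split()) * factor
    count = int(exact)
    # collect fractional words until they add up to a whole word
    carry += exact - count
    if carry >= 1:
      count += 1
      carry -= 1
    if position == len(original_lines) - 1:
      # simplification: the last line takes whatever is left
      count = len(translated_words) - index
    chunks.append(translated_words[index:index + count])
    index += count
  return chunks


original_lines = ['Hallo und willkommen', 'zum Video']
translated_words = 'Hello and welcome to the new video'.split()
for chunk in distribute_words(original_lines, translated_words):
  print(' '.join(chunk))
```

Each chunk would then reuse the start and end time of the original caption line it belongs to, which is why the translated captions stay roughly in sync with the video.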

In [None]:
import html
import jieba
from konlpy.tag import Komoran
import nagisa
from pysrt import SubRipFile, SubRipItem
from pyvi import ViPosTagger, ViTokenizer

# array for all different languages
languageArray = []


def get_original_caption():
  # create new language object for original language
  original = Language(CAPTION_ORIGINAL_LANGUAGE)
  # load caption file from drive
  path_original_caption_file = DRIVE_DIRECTORY_PATH + 'orig_caption.srt'
  original.srtfile = SubRipFile.open(path_original_caption_file)
  original.process_srt_file()
  return original


def get_translated_language(translate_service, language_code, original):
  dubed_language = Language(language_code)
  # for each segment of the original caption file, do a translation
  for sentenceTextMap in original.sentenceTextMap:
    # translate
    outputs = (
        translate_service.translations()
        .list(
            source=CAPTION_ORIGINAL_LANGUAGE,
            target=dubed_language.languageCode,
            q=sentenceTextMap.text,
        )
        .execute()
    )
    # unescape HTML entities (e.g. &#39;) in the translated text
    translate = html.unescape(outputs['translations'][0]['translatedText'])
    sentenceTextMapper = SentenceTextMapper()
    sentenceTextMapper.text = translate
    dubed_language.sentenceTextMap.append(sentenceTextMapper)
  return dubed_language


# map the translation onto the original caption lines
# (e.g. adjust the number of words per line)
def create_translation_caption(original, translation):
  part = 0
  translation.srtfile = SubRipFile()
  while part < len(original.sentenceTextMap):
    original_map = original.sentenceTextMap[part]
    original_word_array = original_map.text.split()
    translate_map = translation.sentenceTextMap[part]
    # use special word segmentation for languages that do not separate words
    # with spaces
    if is_language_chinese(translation.languageCode):
      translate_word_array = jieba.lcut(translate_map.text)
    elif is_language_japanese(translation.languageCode):
      temp = nagisa.tagging(translate_map.text)
      translate_word_array = temp.words
    elif is_language_korean(translation.languageCode):
      komoran = Komoran()
      translate_word_array = komoran.morphs(translate_map.text)
    elif is_language_vietnamese(translation.languageCode):
      temp = ViTokenizer.tokenize(translate_map.text)
      translate_word_array = temp.split()
    else:
      translate_word_array = translate_map.text.split()

    # factor between the length of the original text and the translation
    # e.g. original word count is 5 / translation word count is 10
    # text_length_factor: 10/5 = 2
    text_length_factor = 1
    if len(original_word_array) > 0:
      text_length_factor = len(translate_word_array) / len(original_word_array)

    # the remainder (variable remainer) is required if the resulting number of
    # words is not a whole number, e.g. for 3.5 words the remainder is 0.5
    remainer = 0
    word_array_index = 0
    translate_sentence = ''
    for sentence in original_map.sentence:
      original_sentence_length = len(sentence.text.split())
      translate_length = int(original_sentence_length * text_length_factor)
      remainer += (original_sentence_length * text_length_factor) % 1
      if remainer >= 1:
        translate_length = translate_length + 1
        remainer = remainer - 1
      if (
          word_array_index + translate_length == len(translate_word_array) - 1
          and original_map.sentence.index(sentence)
          == len(original_map.sentence) - 1
      ):
        translate_length += 1
      translate_sentence = translate_word_array[
          word_array_index : word_array_index + translate_length
      ]
      word_array_index += translate_length
      if is_language_without_space(translation.languageCode):
        translate_sentence = ''.join(translate_sentence)
      else:
        translate_sentence = ' '.join(translate_sentence)
      subRibItem = SubRipItem(
          start=sentence.start.to_time(),
          end=sentence.end.to_time(),
          text=translate_sentence,
      )
      translation.srtfile.append(subRibItem)

    part += 1


def save_translation_caption(translation):
  srtfile = translation.srtfile
  path_translated_caption_file = (
      DRIVE_DIRECTORY_PATH + translation.languageCode + '_caption.srt'
  )
  srtfile.save(path_translated_caption_file)
  print(
      'Captions for language {} are stored at {}. We recommend to review them.'
      .format(translation.languageCode, path_translated_caption_file)
  )


if __name__ == '__main__':
  # load original language of the video and add it to array
  original = get_original_caption()
  languageArray.append(original)

  # get Google translation API
  translate_service = get_google_translate_api_service()

  # iterate all languages to translate captions
  for language in CAPTION_TRANSLATION_LANGUAGE:
    print('Start Translation for Language: {}'.format(language))
    translation = get_translated_language(translate_service, language, original)
    # create the translation to the original caption file
    create_translation_caption(original, translation)
    # save translated caption file to Google Drive folder
    save_translation_caption(translation)
    # add translation to array
    languageArray.append(translation)
    print('Translation finished.')

  print(
      'All Translations are done. Please review the translated file(s) in your'
      ' Drive folder.'
  )


# Upload the translated caption to YouTube

If you wish, you can now upload the translated caption file directly to your YouTube video.

In [None]:
from googleapiclient.http import MediaFileUpload

if AUTO_UPLOAD_TRANSLATED_CAPTIONS:
  youtube_data_service = get_youtube_data_api_service()

  for language in languageArray:
    if language.languageCode != CAPTION_ORIGINAL_LANGUAGE:
      print(
          'Uploading translated caption file for language: '
          + language.languageCode
      )
      media_body = MediaFileUpload(
          DRIVE_DIRECTORY_PATH + language.languageCode + '_caption.srt',
          resumable=True,
      )

      request = youtube_data_service.captions().insert(
          part='snippet',
          onBehalfOfContentOwner=EXTERNAL_CONTENT_OWNER_ID,
          body={
              'snippet': {
                  'videoId': EXTERNAL_VIDEO_ID,
                  'language': language.languageCode,
                  'name': 'auto translated caption',
              }
          },
          media_body=media_body,
      )
      try:
        response = request.execute()
        print('Done.')
      except Exception as e:
        print(e)

  # printed once, after all languages have been processed
  print('All Done.')
else:
  print('Auto upload of translated captions is disabled.')


# Label Assets

If you wish, you can label the asset of the translated video to allow your YouTube team to track the usage of this script.
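When the cell is run more than once for the same video, the same label could end up in the list several times. A small, hedged sketch of a duplicate-safe merge is shown below; `with_label` is a hypothetical helper, not part of the YouTube API client or of this notebook.

```python
def with_label(existing_labels, new_label):
  """Returns a new label list that contains new_label exactly once."""
  # assets().get() may return no 'label' field at all, hence the fallback
  labels = list(existing_labels or [])
  if new_label not in labels:
    labels.append(new_label)
  return labels


print(with_label(None, 'TRANSLATE_COLAB_PTM'))
print(with_label(['TRANSLATE_COLAB_PTM'], 'TRANSLATE_COLAB_PTM'))
```

The result could be passed as `body={'label': ...}` to the patch call in place of the list built in the cell below.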

In [None]:
if AUTO_LABEL_ASSET:
  youtube_partner_service = get_youtube_partner_api_service()

  claims_search_response = (
      youtube_partner_service.claimSearch()
      .list(
          onBehalfOfContentOwner=EXTERNAL_CONTENT_OWNER_ID,
          partnerUploaded='true',
          videoId=EXTERNAL_VIDEO_ID,
          status='ACTIVE',
      )
      .execute()
  )

  # the video should have exactly one partner uploaded claim
  if len(claims_search_response['items']) == 1:
    claim = claims_search_response['items'][0]
    asset_id = claim['assetId']
    print(asset_id)
    # get existing labels for that asset
    get_asset_response = (
        youtube_partner_service.assets()
        .get(assetId=asset_id, onBehalfOfContentOwner=EXTERNAL_CONTENT_OWNER_ID)
        .execute()
    )

    # add new label to the list
    labels = get_asset_response.get('label')
    print(labels)
    if labels is None:
      labels = ['TRANSLATE_COLAB_PTM']
    else:
      labels.append('TRANSLATE_COLAB_PTM')

    print(labels)
    # patch asset with existing and new labels
    patch_asset_response = (
        youtube_partner_service.assets()
        .patch(
            assetId=asset_id,
            onBehalfOfContentOwner=EXTERNAL_CONTENT_OWNER_ID,
            body={'label': labels},
        )
        .execute()
    )

    print('Added new Label to Asset.')
else:
  print('Auto labeling of video asset is disabled.')


# Create SSML file from Caption File

Speech Synthesis Markup Language (SSML) is a markup format that tells the Google Cloud Text-to-Speech API how to generate speech, including pauses and speaking rate. The SSML is generated from the translated caption file.
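To make the resulting markup concrete, the sketch below builds a small SSML string by hand instead of using `pyssml`. It applies the same workaround as the cell below, which splits long pauses because of the 10 second pause limit noted there. `pause_tags` and `build_ssml` are hypothetical helpers for illustration; `<speak>` and `<break time="..."/>` are standard SSML elements.

```python
from xml.sax.saxutils import escape


def pause_tags(duration_ms, max_pause_ms=10000):
  """Splits one long pause into several <break> tags of at most max_pause_ms."""
  tags = []
  while duration_ms > max_pause_ms:
    tags.append('<break time="{}ms"/>'.format(max_pause_ms))
    duration_ms -= max_pause_ms
  if duration_ms > 0:
    tags.append('<break time="{}ms"/>'.format(duration_ms))
  return tags


def build_ssml(parts):
  """parts is a list of ('say', text) or ('pause', milliseconds) tuples."""
  body = []
  for kind, value in parts:
    if kind == 'pause':
      body.extend(pause_tags(value))
    else:
      # escape &, < and > so the text cannot break the markup
      body.append(escape(value))
  return '<speak>' + ''.join(body) + '</speak>'


print(build_ssml([('pause', 2000), ('say', 'Hello and welcome'), ('pause', 25000)]))
```

A 25 second gap becomes two 10 second breaks followed by a 5 second break, so the overall timing is preserved while each single break stays within the limit.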

In [None]:
for language in languageArray:
  if language.languageCode != CAPTION_ORIGINAL_LANGUAGE:
    print('Create SSML file for language: {}'.format(language.languageCode))
    language.ssml = PySSML()
    language.ssml.pause(str(VOICE_START_AT_MSEC) + 'ms')

    for sentence in language.srtfile:
      if is_text_in_brackets_or_empty(sentence.text):
        duration_ms = (
            int(sentence.duration.hours) * 3600000
            + int(sentence.duration.minutes) * 60000
            + int(sentence.duration.seconds) * 1000
            + int(sentence.duration.milliseconds)
        )
        duration_ms += 1000
        # workaround since pause limit is 10s
        while duration_ms > 10000:
          language.ssml.pause(str(10000) + 'ms')
          duration_ms -= 10000
        language.ssml.pause(str(duration_ms) + 'ms')
      else:
        if VOICE_SPEED == 'default':
          language.ssml.say(sentence.text)
        else:
          language.ssml.prosody(
              word=sentence.text, attributes={'rate': VOICE_SPEED}
          )
        language.ssml.pause(str(BREAK_BETWEEN_SENTENCE) + 'ms')
    print('Language done.')

print('All SSML files are generated.')


# Synthesize Translated Captions

Use the Google TextToSpeech API to synthesize all languages and store the files in the Google Drive folder.
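The synthesize response carries the audio as a base64-encoded string in `audioContent`, which has to be decoded before it is written to disk. The sketch below shows that step in isolation; `save_audio_content` is a hypothetical helper, and the response is faked with a few placeholder bytes rather than real audio.

```python
import base64
import os
import tempfile


def save_audio_content(response, path):
  """Decodes the base64 audio of a synthesize response and writes it to path."""
  audio_bytes = base64.b64decode(response['audioContent'])
  # the with-block closes the file even if writing fails
  with open(path, 'wb') as audio_file:
    audio_file.write(audio_bytes)
  return len(audio_bytes)


# placeholder bytes instead of a real API response
fake_response = {
    'audioContent': base64.b64encode(b'not real audio').decode('ascii')
}
with tempfile.TemporaryDirectory() as folder:
  target = os.path.join(folder, 'en_translation.mp3')
  bytes_written = save_audio_content(fake_response, target)
  size_on_disk = os.path.getsize(target)

print(bytes_written, size_on_disk)
```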

In [None]:
import base64
import os

text_to_speech_service = get_google_text_to_speech_api_service()

for language in languageArray:
  if language.languageCode != CAPTION_ORIGINAL_LANGUAGE:
    print(
        'Synthesize voice for language {} with voice gender {}'.format(
            language.languageCode, TRANSLATION_VOICE_GENDER
        )
    )
    body = {
        'input': {'ssml': language.ssml.ssml()},
        'voice': {
            'languageCode': language.languageCode,
            'ssmlGender': TRANSLATION_VOICE_GENDER,
        },
        'audioConfig': {
            'audioEncoding': 'MP3',
        },
    }
    response = text_to_speech_service.text().synthesize(body=body).execute()
    language.base64_mp3 = response['audioContent']
    print('Language done.')

    # write file to Google Drive as webm and use ffmpeg to convert to mp3
    filename_webm = (
        DRIVE_DIRECTORY_PATH + language.languageCode + '_translation.webm'
    )
    filename_mp3 = (
        DRIVE_DIRECTORY_PATH + language.languageCode + '_translation.mp3'
    )
    with open(filename_webm, 'wb') as file:
      file.write(base64.b64decode(language.base64_mp3))
    !ffmpeg -loglevel quiet -y -i '{filename_webm}' '{filename_mp3}'
    os.remove(filename_webm)

print('All Languages synthesized.')

# Visualize Results

Load the YouTube video (does not work if embedding is not allowed) and all synthesized audio tracks to check the quality of the results. If the results are good, navigate to your configured Drive folder and upload the additional audio tracks to your existing YouTube video via YouTube Multi Audio. Please check this [Help Center](https://support.google.com/youtube/answer/13338784?hl=en) article to understand how you can add additional audio tracks to your video.

In [None]:
import io
import IPython.display as display

html = """
<html>
  <body>
    <div style="width: 100%;">
      <div style="width: 650px; height: 400px; float: left;">
        <div id="player"></div>
      </div>
      <div style="margin-left: 650px; height: 400px;">
        <table border="1">
        <tr>
          <th>Language</th>
          <th>Audio</th>
          <th>Controls</th>
        </tr>
"""
for language in languageArray:
  if language.languageCode != CAPTION_ORIGINAL_LANGUAGE:
    html += """
    <tr>
      <th>{language}</th>
      <td><audio id="audio-{language}" controls="" src="data:audio/mpeg;base64,{base64_audio}"></audio></td>
      <td>
        <button onclick="play('{language}')">Play</button>
        <button onclick="pause('{language}')">Pause</button>
        <button onclick="rewind('{language}')">Rewind</button>
      </td>
    </tr>
    """.format(language=language.languageCode, base64_audio=language.base64_mp3)

html += (
    """
</table>
      </div>
    </div>
<script>
var tag = document.createElement('script');
tag.src = "https://www.youtube.com/iframe_api";
var firstScriptTag = document.getElementsByTagName('script')[0];
firstScriptTag.parentNode.insertBefore(tag, firstScriptTag);
var player;
function onYouTubeIframeAPIReady() {
  player = new YT.Player('player', {
    height: '390',
    width: '640',
    videoId: '"""
    + EXTERNAL_VIDEO_ID
    + """',
    playerVars: {
      'playsinline': 1
    },
    events: {
      'onReady': onPlayerReady,
      'onStateChange': onPlayerStateChange
    }
  });
}

function onPlayerReady(event) {
  console.log("onPlayerReady");
}
function onPlayerStateChange(event) {
  console.log("onPlayerStateChange");
}

function play(language) {
  rewind(language);
  //play audio
  document.getElementById("audio-" + language).play();
  //play video
  player.mute();
  player.playVideo();
}

function pause(language) {
  //pause audio
  document.getElementById("audio-" + language).pause();
  //pause video
  player.pauseVideo();
}
function rewind(language) {
  pause(language);
  //rewind audio
  document.getElementById("audio-" + language).currentTime = 0;
  //rewind video
  player.seekTo(0);
}
</script>
</body>
</html>
"""
)

display.display(display.HTML(html))