This notebook translates VTT subtitle files into fluent target-language text using the OpenAI API. It can be switched to DeepSeek for more cost-efficient processing by pointing the client at a different base URL.

To use it, clone the repo, point the folder and file paths at your VTT file, set a target language, and enter an OpenAI API key.

In [9]:
### pip installations
%pip install openai
%pip install webvtt-py

Note: you may need to restart the kernel to use updated packages.
Collecting webvtt-py
  Downloading webvtt_py-0.5.1-py3-none-any.whl (19 kB)
Installing collected packages: webvtt-py
Successfully installed webvtt-py-0.5.1


In [38]:
### Package imports
from openai import OpenAI
import os
import webvtt

In [69]:
### Setup variables
## Set file paths / language output 
folder_path = "/Users/connorwright/Downloads/GT.CS.CodeFiles/BuzzStudios/Assets/Subtitles/"
vtt_name = "frisbee-fables-cc.vtt"
trans_lang = "es"

## Set API Key 
api_key = ""

vtt_path = os.path.join(folder_path, vtt_name)
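
Hard-coding the key in the notebook makes it easy to commit by accident. A minimal sketch of an alternative, assuming the key has been exported as an environment variable named `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical and not part of the original notebook:

```python
import os


def load_api_key(env_var="OPENAI_API_KEY", fallback=""):
    """Return the API key from the environment, or `fallback` if it is unset.

    `load_api_key` and the default variable name are assumptions for this sketch.
    """
    key = os.environ.get(env_var, fallback).strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise ValueError(f"No API key found; set {env_var} or pass a fallback.")
    return key
```

With this in place, the setup cell could use `api_key = load_api_key()` instead of an empty string literal.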

In [70]:
### Turn original captions into single string for GPT input 
captions = []
for caption in webvtt.read(vtt_path):
    #print(caption.start)  # start timestamp in text format
    #print(caption.end)  # end timestamp in text format
    #print(caption.text)  # caption text
    #print(caption.voice)  # voice span if present
    captions.append(caption.text)

captions = "\n".join(captions)
print(captions)


['[Beat heavy, tense music playing]', 'ALFRED: The dragon draws back,', 'releasing a terrible roar', 'as it prepares to let out its fire breath.', 'You’re battered but you’re still standing.', 'You can do this.', 'The dragon’s horde glimmers in the darkness of the room.', 'What do you do?', 'EMMA: I draw my sword and aim for the dragon’s tail.', 'ALFRED: [muttering] Tail, okay.', 'MICHELLE: I use my wizard staff to...repel the dragon’s fire!', 'ALFRED: Okay, okay.', '[clatter of dice being rolled]', 'Okay!', 'The dragon is almost defeated.', 'As you prepare to attack-', '[sound of record scratch]', 'LEO: Hey,', 'what are you freaks doing?', 'ALFRED: Oh, uh.', 'Hey...Leo.', 'EMMA: We were about to beat the dragon before you got here.', 'LEO: [scoffing] No you weren’t.', '[sound of DND board being flipped]\nALFRED: HEY!', 'LEO: [mockingly] Are you mad?', 'You big baby!', 'This is why you can never make the ultimate frisbee team!', 'ALFRED: [stammering] Well, uh-', 'you’re...not gonna mak

In [71]:
### Setup OpenAI client and context
#client = OpenAI(api_key="", base_url="https://api.deepseek.com")
client = OpenAI(api_key=api_key)

## Build the system prompt; the f-string substitutes the target language
system_prompt = (
    "You are a professional subtitle translator. "
    "You will only receive a string transcription of a vtt file containing subtitles in English. "
    "You will only output a translation of the subtitles and bracketed actions. "
    "Do not add anything else to your reply. "
    "Do not merge sentences, translate each line individually. "
    "Return the translated subtitles in the same order and length as the input. "
    "Your steps are as follows: "
    "1. Parse the input subtitles "
    f"2. Translate the input subtitles into {trans_lang} "
    "3. Alter the translated subtitles into more fluent sentences "
    "4. Output the translated subtitles as plain text, one line per input line "
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": captions},
    ],
)

print(response)

ChatCompletion(id='chatcmpl-AwEMutRKy6D9PD9hX3LQNIFRfEKys', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='[Beat heavy, tense music playing]  \nALFRED: El dragón retrocede,  \nliberando un terrible rugido  \nmientras se prepara para soltar su aliento de fuego.  \nEstás golpeado, pero aún de pie.  \nPuedes hacerlo.  \nLa horda del dragón brilla en la oscuridad de la habitación.  \n¿Qué haces?  \nEMMA: Desenfundo mi espada y apunto a la cola del dragón.  \nALFRED: [murmurando] Cola, está bien.  \nMICHELLE: ¡Yo uso mi bastón de mago para...repeler el fuego del dragón!  \nALFRED: ¡Está bien, está bien!  \n[sonido de dados siendo lanzados]  \n¡Está bien!  \nEl dragón está casi derrotado.  \nMientras te preparas para atacar-  \n[sonido de rasguño en el disco]  \nLEO: Oye,  \n¿qué están haciendo ustedes, raros?  \nALFRED: Oh, eh.  \nHola...Leo.  \nEMMA: Estábamos a punto de vencer al dragón antes de que llegaras.  \nLEO: [despectivamente] 

In [72]:
### Save translated captions as new vtt file 

## Get GPT response as string, split into list
## (rstrip drops the trailing spaces the model appends to each line)
trans_str = response.choices[0].message.content or ""
trans_list = [line.rstrip() for line in trans_str.split("\n")]
print(trans_list)

## Edit caption files to match translations, accounting for multi-line texts 
trans_vtt = webvtt.read(vtt_path)
line_index = 0
for caption in trans_vtt:
    num_lines = len(caption.text.split("\n"))
    trans_lines = trans_list[line_index:line_index+num_lines]
    caption.text = "\n".join(trans_lines)
    line_index += num_lines

## Save as new file w/ specified language name 
trans_filename = f"{os.path.splitext(vtt_name)[0]}-{trans_lang}.vtt"
trans_path = os.path.join(folder_path, trans_filename)
trans_vtt.save(trans_path)

['[Beat heavy, tense music playing]  ', 'ALFRED: El dragón retrocede,  ', 'liberando un terrible rugido  ', 'mientras se prepara para soltar su aliento de fuego.  ', 'Estás golpeado, pero aún de pie.  ', 'Puedes hacerlo.  ', 'La horda del dragón brilla en la oscuridad de la habitación.  ', '¿Qué haces?  ', 'EMMA: Desenfundo mi espada y apunto a la cola del dragón.  ', 'ALFRED: [murmurando] Cola, está bien.  ', 'MICHELLE: ¡Yo uso mi bastón de mago para...repeler el fuego del dragón!  ', 'ALFRED: ¡Está bien, está bien!  ', '[sonido de dados siendo lanzados]  ', '¡Está bien!  ', 'El dragón está casi derrotado.  ', 'Mientras te preparas para atacar-  ', '[sonido de rasguño en el disco]  ', 'LEO: Oye,  ', '¿qué están haciendo ustedes, raros?  ', 'ALFRED: Oh, eh.  ', 'Hola...Leo.  ', 'EMMA: Estábamos a punto de vencer al dragón antes de que llegaras.  ', 'LEO: [despectivamente] No lo estaban.  ', '[sonido de tablero de DND siendo volteado]  ', 'ALFRED: ¡HEY!  ', 'LEO: [burlándose] ¿Estás eno
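
The loop above assumes the model returned exactly one line per input line. If the model merges, drops, or adds a line, every later cue shifts without any error. A minimal sketch of a guard, assuming a hypothetical helper `regroup_lines` (not part of the original notebook) that takes the number of text lines in each original cue and the flat list of translated lines:

```python
def regroup_lines(line_counts, translated_lines):
    """Group translated lines back into cues of the original sizes.

    line_counts: number of text lines in each original cue, in order.
    translated_lines: flat list of translated lines from the model.
    Raises ValueError if the totals differ, instead of misaligning cues.
    """
    expected = sum(line_counts)
    if len(translated_lines) != expected:
        raise ValueError(
            f"Expected {expected} translated lines, got {len(translated_lines)}"
        )
    grouped = []
    index = 0
    for count in line_counts:
        # Rejoin the lines belonging to one cue with newlines.
        grouped.append("\n".join(translated_lines[index:index + count]))
        index += count
    return grouped


# Placeholder data: the second cue spans two lines.
example = regroup_lines([1, 2, 1], ["line 1", "line 2a", "line 2b", "line 3"])
```

In the notebook, `line_counts` could be built as `[len(c.text.split("\n")) for c in trans_vtt]`, and each returned string assigned to the matching `caption.text` before saving.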