# VTT AI Translator
## About
This notebook translates WebVTT (`.vtt`) subtitle and caption files into fluent target-language text using the OpenAI API and the webvtt-py library. Because the OpenAI client accepts a custom base URL, it can later be pointed at DeepSeek for cheaper processing. Made by Connor Wright for Georgia Tech's Buzz Studios Filmmaking Club.
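
As a rough illustration of the provider switch, the sketch below builds the keyword arguments for `OpenAI(...)` from a small lookup table. The DeepSeek base URL comes from the commented-out line in the client setup cell; the `PROVIDERS` table, the `client_kwargs` helper, and the `deepseek-chat` model name are assumptions for illustration and should be checked against DeepSeek's own documentation.

```python
# Hypothetical provider table; only the OpenAI entry reflects what the notebook runs today.
PROVIDERS = {
    "openai": {"base_url": None, "model": "gpt-4o-mini"},
    # base_url matches the commented-out line in the client setup cell;
    # the model name is an assumption, not confirmed by this notebook.
    "deepseek": {"base_url": "https://api.deepseek.com", "model": "deepseek-chat"},
}


def client_kwargs(provider, api_key):
    """Build keyword arguments for OpenAI(...) for the chosen provider."""
    settings = PROVIDERS[provider]
    kwargs = {"api_key": api_key}
    if settings["base_url"]:
        # Only override the endpoint when the provider is not OpenAI itself
        kwargs["base_url"] = settings["base_url"]
    return kwargs
```

With this in place, the client could be created as `OpenAI(**client_kwargs("deepseek", api_key))` and the request sent with `model=PROVIDERS["deepseek"]["model"]`.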

## How to Use 
* Clone the repo
* Set `folder_path` and `vtt_name` to point at your VTT file
* Set `trans_lang` to the target language's ISO code and `language` to its name
* Set `api_key` to an OpenAI API key (or ask me for mine)
* Run all the cells
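
For the API key step, one alternative to pasting the key into the notebook is to read it from an environment variable. The sketch below assumes the variable is named `OPENAI_API_KEY`; `load_api_key` is a hypothetical helper and not part of the notebook.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY", fallback=""):
    """Return the key stored in env_var, or fallback if it is unset or blank."""
    value = os.environ.get(env_var, "").strip()
    return value if value else fallback


# Could stand in for the hard-coded `api_key = ""` assignment in the setup cell
api_key = load_api_key()
```

This keeps the key out of the saved notebook, which matters if the repo is shared.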

In [1]:
### pip installations
%pip install openai
%pip install webvtt-py

Note: you may need to restart the kernel to use updated packages.


In [2]:
### Package imports
from openai import OpenAI
import os
import webvtt
import copy

In [None]:
### Setup variables
## Set file paths / language output 
folder_path = "/Users/connorwright/Downloads/GT.CS.CodeFiles/BuzzStudios/Assets/Subtitles/"
vtt_name = "antr-English.vtt"
trans_lang = "es"
language = "Spanish"

## Set API Key 
api_key = ""

vtt_path = os.path.join(folder_path, vtt_name)

In [61]:
### Turn original captions into single string for GPT input 
captions_list = []
captions = []
captions_2 = []

curr_chars = 0
max_tokens = 8000 # 4o-mini limit is ~16000 tokens; halved for safety. Tokens are estimated as chars / 4 (so roughly 32000 chars)

vtt = webvtt.read(vtt_path)

# Preview a few captions to confirm the file parsed as expected
print(vtt[1].text.replace("\n", " ~ "))
print(vtt[2])
print(vtt[3])

for caption in vtt:
    #print(caption.start)  # start timestamp in text format
    #print(caption.end)  # end timestamp in text format
    #print(caption.text)  # caption text
    #print(caption.voice)  # voice span if present
    '''
    curr_chars += len(caption.text)
    curr_tokens = curr_chars / 4
    
    captions.append(caption.text)
    
    if (curr_tokens > max_tokens):
        captions_list.append(copy.deepcopy(captions))
        captions.clear()
        curr_chars = 0
    '''
    
    #if caption.text.strip():  # avoid blank lines
    #caption_text = caption.text.strip()
    #if "\n" in caption.text:
        #print(caption.text) 
    '''
    caption.text = caption.text.replace("\n", " ~ ")
    caption_text = caption.text

    curr_chars += len(caption_text)
    curr_tokens = curr_chars / 4
    captions.append(caption_text)

    if curr_tokens > max_tokens:
        captions_list.append(copy.deepcopy(captions))
        captions = []
        curr_chars = 0
    '''
    # Mark line breaks inside a caption with " ~ " so each caption stays on one line
    captions_2.append(caption.text.replace("\n", " ~ "))
    


if captions:
    captions_list.append(copy.deepcopy(captions))

captions_list = [
    "\n".join(c) if isinstance(c, list) else str(c)
    for c in captions_list
]

#print(captions_list[0])

# Join every caption into one string, separated by " | ", for the model input
captions = " | ".join(captions_2)
print(captions)



[Aaron] Sure you're not drinking ~ that juice tonight, Devin?
00:01:37.708 00:01:39.000 [Devin] That was a one-time thing.
00:01:39.000 00:01:40.708 [Devin] Graham, you want any?
[”Jazz Opener/Graham’s Theme” by Joshua Ancrademption] | [Aaron] Sure you're not drinking ~ that juice tonight, Devin? | [Devin] That was a one-time thing. | [Devin] Graham, you want any? | [Graham] Good, but thanks. | So, what's your take on ~ the project, Graham? | Man, I don’t know. | I just don't get why they expect us to ~ implement a full datapath in three weeks. | Me neither. | I'm just trying to keep my grade! | Please, keep that project out of here tonight. | I have no clue what a datapath is, | and honestly, I’m better off not knowing. | [Aaron] Fair enough. | How’s your weekend been, anyway? | I mean, it could’ve been worse. | I spent the week studying for that thermo ~ exam and practicing my batting. | But speaking of, I’d better ~ see you all at the game Friday. | My team has worked our ass off to
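
The cell above estimates tokens at roughly 4 characters each, and its commented-out drafts batch captions so that no request exceeds the budget. A minimal sketch of that batching idea follows; `chunk_captions` and its parameters are hypothetical names, and the 4-characters-per-token ratio is the notebook's own rough estimate rather than a real tokenizer count.

```python
def chunk_captions(texts, max_tokens=8000, chars_per_token=4, sep=" | "):
    """Group caption strings into joined batches under an estimated token budget.

    A single caption longer than the budget still gets a batch of its own.
    """
    max_chars = max_tokens * chars_per_token
    batches, current, current_chars = [], [], 0
    for text in texts:
        # Count the separator that joining will add after this caption
        cost = len(text) + len(sep)
        if current and current_chars + cost > max_chars:
            # Budget would be exceeded: close the current batch and start a new one
            batches.append(sep.join(current))
            current, current_chars = [], 0
        current.append(text)
        current_chars += cost
    if current:
        batches.append(sep.join(current))
    return batches
```

Each returned string could then be sent as its own request, with the replies joined by `" | "` again before the captions are rebuilt.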

In [62]:
### Setup OpenAI client and context
#client = OpenAI(api_key="", base_url="https://api.deepseek.com")
client = OpenAI(api_key=api_key)

system_message = f"""You are a professional subtitle translator. \
            You will only receive a string transcription of a vtt file containing subtitles in English. \
            You will only output a {language} translation of the subtitles and bracketed actions. \
            Do not add anything else to your reply.\
            Do not merge sentences, translate each line individually. \
            Return the translated subtitles in the same order and length as the input. \
            Your steps are as follows: \
            1. Parse the input subtitles \
            2. Translate each line into {language} with language code {trans_lang}. Do not change or remove any '~' or '|' character. If there is a '~' or a '|' mid-sentence, keep it mid-sentence. \
            3. Alter the translated subtitles into more fluent sentences \
            4. Output the translated subtitles as a single string, keeping the same ' | ' separators as the input.
"""

'''
responses = []
for captions in captions_list:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": captions}
        ]
    )
    responses.append(response)
'''
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": captions}
    ]
)
    
print(response)

ChatCompletion(id='chatcmpl-BZQbypcPFKQY0gSSQ05iPw2VU5ev5', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='["Jazz Opener/Tema de Graham" de Joshua Ancrademption] | [Aaron] ¿Seguramente no estás bebiendo ~ ese jugo esta noche, Devin? | [Devin] Eso fue algo que pasó una vez. | [Devin] Graham, ¿quieres algo? | [Graham] Estoy bien, pero gracias. | Entonces, ¿cuál es tu opinión sobre ~ el proyecto, Graham? | Hombre, no lo sé. | Simplemente no entiendo por qué esperan que ~ implementemos un datapath completo en tres semanas. | Yo tampoco. | Solo estoy tratando de mantener mi nota. | Por favor, mantén ese proyecto fuera de aquí esta noche. | No tengo ni idea de lo que es un datapath, | y honestamente, es mejor que no lo sepa. | [Aaron] Bien. | ¿Cómo ha sido tu fin de semana, de todos modos? | Quiero decir, podría haber sido peor. | Pasé la semana estudiando para ese examen de termo ~ y practicando mi bateo. | Pero hablando de eso, mejor ~ 
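
Since the prompt asks the model to keep every `|` and `~` in place, it may be worth checking the reply before writing a file. The sketch below compares the source and translated strings caption by caption; `find_mismatches` is a hypothetical helper, and it assumes both strings use `" | "` between captions as in the output above.

```python
def find_mismatches(source, translated, sep=" | ", line_break="~"):
    """Return a list of human-readable problems; an empty list means the structure matches."""
    src_parts = source.split(sep)
    out_parts = translated.split(sep)
    problems = []
    if len(src_parts) != len(out_parts):
        problems.append(
            f"caption count: expected {len(src_parts)}, got {len(out_parts)}"
        )
    # Compare line-break markers for the captions both sides have
    for i, (src, out) in enumerate(zip(src_parts, out_parts)):
        if src.count(line_break) != out.count(line_break):
            problems.append(f"caption {i}: line-break markers differ")
    return problems
```

For example, `find_mismatches(captions, response.choices[0].message.content)` would flag merged or dropped captions. A `finish_reason` of `'length'` instead of `'stop'` on the response would also suggest the reply was cut off.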

In [63]:
### Save translated captions as new vtt file 

## Get GPT response as string, split into list
'''
full_str = ''
for response in responses:
    trans_str = str(response.choices[0].message.content)
    full_str += trans_str

trans_list = full_str.split("\n")
print(trans_list)

## Edit caption files to match translations, accounting for multi-line texts 
trans_vtt = webvtt.read(vtt_path)
line_index = 0

for i, caption in enumerate(trans_vtt):
    #num_lines = len(caption.text.split("\n"))
    #trans_lines = trans_list[line_index:line_index+num_lines]
    caption.text = trans_list[i].replace(" ~ ", "\n")
    #caption.text = "\n".join(trans_lines)
    #line_index += num_lines
'''
'''
for i, caption in enumerate(trans_vtt):
    caption.text = trans_list[i] if i < len(trans_list) else "[Missing translation]"
'''
'''
## Save as new file w/ specified language name 
trans_filename = str(os.path.splitext(vtt_name)[0]) + '-' + str(trans_lang) + '.vtt'
trans_path = os.path.join(folder_path, trans_filename)
trans_vtt.save(trans_path)
'''

### Save translated captions as new vtt file 

## Get GPT response as string, split into list
trans_str = str(response.choices[0].message.content)
trans_list = trans_str.split(" | ")
print(trans_list)

## Edit caption files to match translations, accounting for multi-line texts 
# trans_list holds one entry per caption; " ~ " inside an entry marks a line break
trans_vtt = webvtt.read(vtt_path)
if len(trans_list) != len(trans_vtt.captions):
    print(f"Warning: {len(trans_vtt.captions)} captions but {len(trans_list)} translations")
for i, caption in enumerate(trans_vtt):
    if i < len(trans_list):
        caption.text = trans_list[i].replace(" ~ ", "\n")
    else:
        caption.text = "[Missing translation]"

## Save as new file w/ specified language name 
trans_filename = f"{os.path.splitext(vtt_name)[0]}-{trans_lang}.vtt"
trans_path = os.path.join(folder_path, trans_filename)
trans_vtt.save(trans_path)

['["Jazz Opener/Tema de Graham" de Joshua Ancrademption]', '[Aaron] ¿Seguramente no estás bebiendo ~ ese jugo esta noche, Devin?', '[Devin] Eso fue algo que pasó una vez.', '[Devin] Graham, ¿quieres algo?', '[Graham] Estoy bien, pero gracias.', 'Entonces, ¿cuál es tu opinión sobre ~ el proyecto, Graham?', 'Hombre, no lo sé.', 'Simplemente no entiendo por qué esperan que ~ implementemos un datapath completo en tres semanas.', 'Yo tampoco.', 'Solo estoy tratando de mantener mi nota.', 'Por favor, mantén ese proyecto fuera de aquí esta noche.', 'No tengo ni idea de lo que es un datapath,', 'y honestamente, es mejor que no lo sepa.', '[Aaron] Bien.', '¿Cómo ha sido tu fin de semana, de todos modos?', 'Quiero decir, podría haber sido peor.', 'Pasé la semana estudiando para ese examen de termo ~ y practicando mi bateo.', 'Pero hablando de eso, mejor ~ los veo a todos en el juego del viernes.', 'Mi equipo ha trabajado muy duro para llegar a ~ los playoffs, y realmente puedo usar el apoyo.', '
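
The save step depends on the model returning exact `" | "` and `" ~ "` spacing. A slightly more tolerant version is sketched below: it splits on the bare characters, trims whitespace, and pads missing captions. `split_translation` and `translated_filename` are hypothetical helpers, and the `[Missing translation]` placeholder is borrowed from the commented-out draft in the cell above.

```python
import os


def split_translation(translated, expected_count, sep="|", line_break="~",
                      placeholder="[Missing translation]"):
    """Turn the model's reply into one text per caption, with real newlines."""
    texts = []
    for part in translated.split(sep):
        # Restore in-caption line breaks, tolerating missing or extra spaces
        lines = [segment.strip() for segment in part.split(line_break)]
        texts.append("\n".join(lines))
    # Trim extras and pad gaps so the result lines up with the original captions
    texts = texts[:expected_count]
    texts += [placeholder] * (expected_count - len(texts))
    return texts


def translated_filename(vtt_name, lang_code):
    """Insert the language code before the extension of the original file name."""
    stem, ext = os.path.splitext(vtt_name)
    return f"{stem}-{lang_code}{ext}"
```

Each returned text could then be assigned to `caption.text` in order before calling `trans_vtt.save(...)`. Note that this assumes the subtitles themselves never contain a literal `|` or `~`.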