# How to translate a video

## Prepare a glossary file

If you want to use a glossary when translating, place all of your glossary files in one directory. The default is `data`.
Regardless of how many files you have, each one must be a `csv` file with at least 4 columns.

| Term | Translation | Definition | Example | 
| --- | --- | --- | --- | 
| ... | ... | ... | ... |


You can add more columns to provide some other information, but these 4 are required for this program to run.
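As a minimal sketch, a glossary file with the four required columns can be written and checked with Python's standard `csv` module. The file name `glossary_example.csv`, the sample row, and the helpers `write_glossary` and `has_required_columns` are hypothetical, and the check assumes the header names match the table above exactly.

```python
import csv
import os
import tempfile

REQUIRED_COLUMNS = ["Term", "Translation", "Definition", "Example"]


def write_glossary(path, rows):
    """Write glossary rows (dicts keyed by the required columns) to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=REQUIRED_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


def has_required_columns(path):
    """Return True if the CSV header contains all four required columns."""
    with open(path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    return all(col in header for col in REQUIRED_COLUMNS)


# Hypothetical sample entry, for illustration only.
sample_rows = [
    {
        "Term": "gradient descent",
        "Translation": "descenso de gradiente",
        "Definition": "An iterative optimization method.",
        "Example": "We train the model with gradient descent.",
    }
]

# A temporary directory stands in for the `data` directory here.
glossary_dir = tempfile.mkdtemp()
glossary_path = os.path.join(glossary_dir, "glossary_example.csv")
write_glossary(glossary_path, sample_rows)
print(has_required_columns(glossary_path))
```

Running a check like this on each file before translating can surface a misspelled header early.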

## Download or prepare an audio file
If you have something you wish to translate, replace the path below with the path to your own file.
If not, you can download a YouTube video to start with. 

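```python
from gat.utils import download_audio

# Download the audio track of a YouTube video into the `data` directory.
audio_file = download_audio("url-to-youtube-video", output_dir="data")
# Or point to your own file instead:
# audio_file = "relative-path-to-your-video-or-audio-file"
```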

## Transcribe the audio

There are two ways to transcribe the audio.
One is to run faster-whisper locally; the other is to call the OpenAI Whisper API.
Here, I will show you how to use the locally run model.

If your audio contains specific terms that you want transcribed in a particular way, you can supply them in a prompt.

```python
from gat.transcription import get_whisper_prompt, transcribe_whisper

# Build a prompt from the glossary files in `data`, then transcribe locally.
whisper_prompt = "You might encounter words like: " + get_whisper_prompt("data")
start, end, text = transcribe_whisper(audio_file, model_size="large-v3", whisper_prompt=whisper_prompt)
```
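Judging by how the results are used later, `start`, `end`, and `text` appear to be parallel sequences with one entry per transcribed segment. That is an assumption, as is the use of seconds for the timestamps. Under those assumptions, a quick sanity check before translating could look like the sketch below; the `preview_segments` helper and the sample values are hypothetical stand-ins.

```python
def preview_segments(start, end, text, limit=5):
    """Return printable lines for the first `limit` segments."""
    if not (len(start) == len(end) == len(text)):
        raise ValueError("start, end, and text must have the same length")
    lines = []
    for s, e, t in list(zip(start, end, text))[:limit]:
        lines.append(f"[{s:7.2f} -> {e:7.2f}] {t}")
    return lines


# Hypothetical values, assuming timestamps are given in seconds.
sample_start = [0.0, 2.5]
sample_end = [2.5, 5.0]
sample_text = ["Hello and welcome.", "Today we talk about glossaries."]

for line in preview_segments(sample_start, sample_end, sample_text):
    print(line)
```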

## Translate the text

Several translators are available, all of them backed by LLMs.
I will demonstrate how to use Ollama, but OpenAI and other OpenAI-compatible services are included as well.
The only difference is that, for those services, you need to provide your own API key in the `.env` file.

Before you run the next block of code, make sure Ollama is running and the model you wish to use has been downloaded.

```python
from gat.glossary_matcher import GlossaryMatcher
from gat.translators import OllamaTranslator

# Load every glossary file found in the `data` directory.
gm = GlossaryMatcher()
gm.load_from_dir("data")

# Translate the transcribed sentences with a local Ollama model.
translator = OllamaTranslator(matcher=gm, model="qwen2.5")
translated = translator.translate_sentences(text, n_history=3)
```
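The `n_history` argument presumably controls how many previous sentences are passed along as context for each translation; that reading is an assumption and is not confirmed here. The general idea of a sliding context window can be sketched as follows. The `iter_with_history` helper is hypothetical and is not part of `gat`.

```python
from collections import deque


def iter_with_history(sentences, n_history=3):
    """Yield (history, sentence) pairs, where history holds up to
    `n_history` of the sentences that came immediately before."""
    history = deque(maxlen=n_history)
    for sentence in sentences:
        # Copy the window so later appends do not change what was yielded.
        yield list(history), sentence
        history.append(sentence)


# Hypothetical sentences, for illustration only.
sample_sentences = ["First.", "Second.", "Third.", "Fourth.", "Fifth."]
for history, sentence in iter_with_history(sample_sentences, n_history=2):
    print(f"context={history} current={sentence!r}")
```

A bounded window like this keeps each request short while still giving the model some context for resolving pronouns and repeated terms.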

## What's next

You can now do whatever you want with your translated text.
For instance, you can save it as an SRT file for your video.

```python
from gat.utils import save_srt

# Write the translated segments, with their timestamps, to a subtitle file.
save_srt("your_subtitle_file.srt", start, end, translated)
```
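For reference, an SRT file is a sequence of numbered blocks, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the subtitle text and a blank line. The sketch below builds that text by hand. It is not the implementation of `save_srt`; the helpers `format_srt_timestamp` and `build_srt` are hypothetical, and it assumes the timestamps are in seconds.

```python
def format_srt_timestamp(seconds):
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(start, end, lines):
    """Build SRT text from parallel lists of start times, end times, and lines."""
    blocks = []
    for index, (s, e, line) in enumerate(zip(start, end, lines), start=1):
        time_range = f"{format_srt_timestamp(s)} --> {format_srt_timestamp(e)}"
        blocks.append(f"{index}\n{time_range}\n{line}\n")
    # Joining with a newline leaves one blank line between blocks.
    return "\n".join(blocks)


# Hypothetical segments, for illustration only.
print(build_srt([0.0, 2.5], [2.5, 5.0], ["Hola y bienvenidos.", "Hoy hablamos de glosarios."]))
```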