# Transcription Task

This is an example Python notebook for running transcription tasks using [Camb.ai](https://camb.ai)'s API.

# Index

- [Optional: Loading API Key from .env file](#optional-loading-api-key-from-env-file)
- [Transcribe Method](#transcribe-method)
  - [Optional: Transcribing a Public video (i.e. YouTube)](#optional-transcribing-a-public-video-ie-youtube)
- [Individual Methods](#individual-methods)
  - [Create Transcription Task](#create-transcription-task)
  - [Get Transcription Status](#get-transcription-status)
  - [Get Transcription Result](#get-transcription-result)

In [None]:
# !pip install cambai

To get started, import the `CambAI` class from the `cambai` module.

You also need a Camb AI API key, which you can get from https://studio.camb.ai

In [3]:
import os
from cambai import CambAI

In [5]:
client = CambAI(
    api_key=os.environ.get("CAMB_API_KEY") # Get your API key from https://studio.camb.ai
)
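
`os.environ.get()` returns `None` when the variable is not set, so a missing key may only surface later as an authentication error. As a minimal sketch, the helper below fails early with a clear message instead. The name `require_api_key` is hypothetical and not part of the `cambai` package.

```python
import os


def require_api_key(env=os.environ, name="CAMB_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank."""
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it or add it to your .env file."
        )
    return key


# Usage with the client shown above (hypothetical helper, not part of cambai):
# client = CambAI(api_key=require_api_key())
```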

### Optional: Loading API Key from .env file

While you can pass your key directly as the `api_key` argument, we recommend adding `CAMB_API_KEY="Your API Key"` to a `.env` file and loading it with [`python-dotenv`](https://pypi.org/project/python-dotenv/). As long as the `.env` file is excluded from source control, your API key is never committed.

In [None]:
# !pip install python-dotenv

In [None]:
from dotenv import load_dotenv

load_dotenv()

client = CambAI(
    api_key=os.environ.get("CAMB_API_KEY") # Get your API key from https://studio.camb.ai
)

----

To look up the numeric IDs of all available languages, use the `get_languages()` method.

To also save the list of languages to a file, set the `write_to_file` argument to `True`.

In [None]:
client.get_languages("source", write_to_file=True)
# client.get_languages("target", write_to_file=True)
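
Once you have the list, you may want to find the ID for a given language by name rather than reading it from the file. The sketch below assumes each entry is a dictionary with `id` and `language` keys. That shape is an assumption, so check the actual output of `get_languages()` first. The helper `find_language_id` and the sample entries are hypothetical.

```python
def find_language_id(languages, query):
    """Return the ID of the first entry whose name contains `query` (case-insensitive)."""
    needle = query.lower()
    for entry in languages:
        # Assumed keys: "language" (display name) and "id" (numeric ID).
        if needle in entry["language"].lower():
            return entry["id"]
    raise LookupError(f"No language matching {query!r}")


# Made-up sample data for illustration only; real IDs come from
# client.get_languages("source").
sample_languages = [
    {"id": 101, "language": "English (Example)"},
    {"id": 102, "language": "Spanish (Example)"},
]

language_id = find_language_id(sample_languages, "spanish")
```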

---
---

# Transcribe Method

The `transcribe()` method runs the entire transcription process in a single call, from submitting the file to the API to returning the transcription result.

In [None]:
transcription = client.transcribe(
    "path/to/audio/file",
    language=1,        # Get the language code from the file generated in the previous step
    save_to_file=True, # Optionally, save the transcription to a file
    debug=True,        # Optionally, you can set debug to True to see the print statements
)

**NOTE:** Transcription can take a while, depending on how long the video or audio is. For longer files, you may want to set the `polling_interval` argument to a higher value so that the status is checked less often.

---

### Optional: Transcribing a Public video (i.e. YouTube)

To transcribe a public video from YouTube or another website, first download it with [`yt-dlp`](https://pypi.org/project/yt-dlp/), then pass the downloaded file to `transcribe()`.

In [None]:
# !pip install yt-dlp

In [9]:
from yt_dlp import YoutubeDL

URLS = ["https://youtu.be/rlK-BySadHk"]

ydl_opts = {
    "outtmpl": "video.%(ext)s",
    "format": "mp4",
    "overwrites": True,  # Replace an existing video.mp4 (the yt-dlp key is "overwrites")
}

with YoutubeDL(ydl_opts) as ydl:
    error_code = ydl.download(URLS)

transcription = client.transcribe(
    audio_file="video.mp4",
    language=1,        # Get the language code from the file generated in the previous step
    save_to_file=True, # Optionally, save the transcription to a file
    debug=True,        # Optionally, you can set debug to True to see the print statements
    polling_interval=60
)

[youtube] Extracting URL: https://youtu.be/rlK-BySadHk
[youtube] rlK-BySadHk: Downloading webpage
[youtube] rlK-BySadHk: Downloading ios player API JSON
[youtube] rlK-BySadHk: Downloading m3u8 information
[info] rlK-BySadHk: Downloading 1 format(s): 18
[download] Destination: video.mp4
[download] 100% of    3.11MiB in 00:00:02 at 1.35MiB/s     


Waiting 60 seconds before checking status again: 100%|██████████| 60/60 [01:00<00:00,  1.00s/s]


In [None]:
transcription

---
---

# Individual Methods

You can also access individual methods to have more control over the transcription task.

- [Create Transcription Task](#create-transcription-task)
- [Get Transcription Status](#get-transcription-status)
- [Get Transcription Result](#get-transcription-result)

## Create Transcription Task

This method creates a transcription task and returns a task ID, which is used to check the status of the transcription task.

In [None]:
task = client.create_transcription("path/to/audio", language=123)

task_id = task["task_id"]

In [20]:
task_id

'e9a58959-3e81-4bf1-94cb-9067297d60ec'

----

## Get Transcription Status

This method returns the status of the transcription. The status can be one of the following:

- SUCCESS
- PENDING
- TIMEOUT
- ERROR
- PAYMENT_REQUIRED

In [17]:
transcription_status = client.get_task_status("transcription", task_id)
# transcription_status = client.get_transcription_status(task_id)

status = transcription_status["status"]
run_id = transcription_status["run_id"]

In [18]:
status, run_id

('SUCCESS', 28173)
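
When you use the individual methods, you have to poll the status yourself until the task finishes. The loop below is a minimal sketch of that pattern. The helper name `wait_for_run_id`, its parameters, and the decision to treat `TIMEOUT`, `ERROR`, and `PAYMENT_REQUIRED` as final failures are assumptions made for illustration, not behaviour documented by the API.

```python
import time

# Assumption: these statuses mean the task will not complete on its own.
FAILURE_STATUSES = {"TIMEOUT", "ERROR", "PAYMENT_REQUIRED"}


def wait_for_run_id(get_status, task_id, polling_interval=10, max_wait=600,
                    sleep=time.sleep):
    """Poll `get_status(task_id)` until SUCCESS, then return the run ID."""
    waited = 0
    while True:
        response = get_status(task_id)
        status = response["status"]
        if status == "SUCCESS":
            return response["run_id"]
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"Task {task_id} ended with status {status}")
        if waited >= max_wait:
            raise TimeoutError(f"Task {task_id} still {status} after {waited}s")
        # Still pending: wait before asking again.
        sleep(polling_interval)
        waited += polling_interval


# Usage with the client from this notebook:
# run_id = wait_for_run_id(
#     lambda tid: client.get_task_status("transcription", tid), task_id
# )
```

Passing the status function in as an argument keeps the loop independent of the client, which also makes it easy to try out with a stub.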

----

## Get Transcription Result

Use this method to fetch the transcription result for a given run ID.<br>
To also save the transcription to a file, set the `save_to_file` argument to `True`, as in the cell below.

In [19]:
transcription = client.get_transcription_result(run_id, save_to_file=True)
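
If you prefer to control where and how the result is written, you can save it yourself instead. The sketch below assumes the returned `transcription` object is JSON-serializable (for example, a list or dictionary of plain values), which this notebook does not confirm. The helper `save_transcription` is hypothetical.

```python
import json
from pathlib import Path


def save_transcription(result, path):
    """Write `result` to `path` as pretty-printed UTF-8 JSON and return the path."""
    out = Path(path)
    # ensure_ascii=False keeps non-English text readable in the file.
    out.write_text(
        json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out


# Usage (assumes `transcription` is JSON-serializable):
# save_transcription(transcription, "transcription.json")
```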

----

To learn more, check out Camb AI's [API Documentation](https://docs.camb.ai).