In [1]:
!pip install git+https://github.com/openai/whisper.git

Collecting git+https://github.com/openai/whisper.git
  Cloning https://github.com/openai/whisper.git to /tmp/pip-req-build-2agx3m8d
  Running command git clone --filter=blob:none --quiet https://github.com/openai/whisper.git /tmp/pip-req-build-2agx3m8d
  Resolved https://github.com/openai/whisper.git to commit 90db0de1896c23cbfaf0c58bc2d30665f709f170
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [3]:
!pip install transformers



In [4]:
# Import required libraries
import whisper
import warnings
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
warnings.filterwarnings('ignore')

In [5]:
# Load the Whisper model ('small' here; 'medium' or 'large' are more accurate but slower and need more memory)
model = whisper.load_model("small")

In [6]:
# Load the NLLB translation model and its tokenizer
nllb_model_name = "facebook/nllb-200-distilled-1.3B"
nllb_tokenizer = AutoTokenizer.from_pretrained(nllb_model_name)
nllb_model = AutoModelForSeq2SeqLM.from_pretrained(nllb_model_name)

In [7]:
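# Helper function to split long text into chunks of at most max_tokens tokens
def split_text_into_chunks(text, max_tokens, tokenizer):
    # Encode without special tokens so that language-code and end-of-sequence
    # markers do not count toward the chunk size
    tokens = tokenizer.encode(text, add_special_tokens=False)
    chunks = []
    for i in range(0, len(tokens), max_tokens):
        chunk_tokens = tokens[i:i + max_tokens]
        chunks.append(tokenizer.decode(chunk_tokens, skip_special_tokens=True))
    return chunks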
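
The helper above cuts at fixed token offsets, so a chunk boundary can fall in the middle of a sentence and hurt the translation of both halves. A minimal sketch of a sentence-aware alternative follows. The function name `split_into_sentence_chunks` and the `count_tokens` parameter are hypothetical and not part of the notebook. The regular expression is only a rough sentence splitter, and because token counts are summed per sentence, the total for a joined chunk is approximate.

```python
import re


def split_into_sentence_chunks(text, max_tokens, count_tokens):
    """Group whole sentences into chunks of roughly max_tokens tokens.

    count_tokens is any callable that returns the token count of a string
    (hypothetical parameter; with NLLB it could wrap the tokenizer).
    A single sentence longer than max_tokens becomes its own chunk.
    """
    # Split after '.', '!' or '?' followed by whitespace (rough heuristic).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Close the current chunk if adding this sentence would exceed the budget.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks


# A whitespace word count stands in for a real tokenizer in this demo.
demo = "It is obvious. Water is wet. The sun is hot. It is a no-brainer."
chunks = split_into_sentence_chunks(demo, 6, lambda s: len(s.split()))
print(chunks)
```

With the NLLB tokenizer, `count_tokens` could be something like `lambda s: len(nllb_tokenizer.encode(s, add_special_tokens=False))`, and the resulting chunks can be passed to `translate_chunks` unchanged.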

In [8]:
# Helper function to translate chunks of text
def translate_chunks(chunks, translation_function):
    translated_chunks = [translation_function(chunk) for chunk in chunks]
    return " ".join(translated_chunks)

In [9]:
# Wrapper function for large text translation
def translate_large_text(text, translation_function, tokenizer, max_tokens=512):
    chunks = split_text_into_chunks(text, max_tokens, tokenizer)
    translated_text = translate_chunks(chunks, translation_function)
    return translated_text

In [10]:
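def english_to_telugu(text):
    # NLLB language codes for the source and target languages
    src_lang = "eng_Latn"
    tgt_lang = "tel_Telu"
    nllb_tokenizer.src_lang = src_lang
    inputs = nllb_tokenizer(text, return_tensors="pt", padding=True)
    # Force the target-language token first, and set an explicit output budget
    # (512 is an assumed value) so long translations are not cut short by the
    # default generation length
    outputs = nllb_model.generate(
        **inputs,
        forced_bos_token_id=nllb_tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=512,
    )
    return nllb_tokenizer.decode(outputs[0], skip_special_tokens=True)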

In [11]:
def transcribe_and_translate(audio_path):
    # Step 1: Transcribe the audio file
    print("Transcribing audio...")
    result = model.transcribe(audio_path)
    transcribed_text = result["text"]
    print("Transcription Complete!")
    print("Transcribed Text:", transcribed_text)

    # Step 2: User Translation Choice
    print("\nSelect a translation option:")
    print("1. English to Telugu")

    try:
        choice = int(input("Enter your choice as 1: "))
    except ValueError:
        choice = None  # Non-numeric input falls through to the invalid-choice branch

    # Step 3: Perform Translation
    if choice == 1:
        translated_text = translate_large_text(transcribed_text, english_to_telugu, nllb_tokenizer)
    else:
        print("Invalid choice. Please try again.")
        return

    # Step 4: Display Translated Text
    print("Translated Text:")
    print(translated_text)
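
The menu in the function above has a single entry, and the choice is read straight from `input()`. One way to make the selection easier to extend and to test is to keep the options in a table and parse the raw input separately. This is a minimal sketch, not the notebook's method: `TRANSLATION_OPTIONS` and `parse_choice` are hypothetical names, and only the English to Telugu entry comes from the notebook.

```python
# Menu entries: choice number -> (label, NLLB source code, NLLB target code).
# Only entry 1 appears in the notebook; further entries would be added here.
TRANSLATION_OPTIONS = {
    1: ("English to Telugu", "eng_Latn", "tel_Telu"),
}


def parse_choice(raw, options=TRANSLATION_OPTIONS):
    """Return the option tuple for raw user input, or None if it is invalid."""
    try:
        choice = int(raw.strip())
    except ValueError:
        # Non-numeric input such as "abc" or "1.5" is treated as invalid.
        return None
    return options.get(choice)


print(parse_choice("1"))
print(parse_choice("abc"))
print(parse_choice("2"))
```

A caller could then print the labels from `TRANSLATION_OPTIONS`, pass `input(...)` to `parse_choice`, and report an invalid choice whenever the result is `None`.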

In [12]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [13]:
# Specify the audio file path
audio_path = "/content/drive/MyDrive/My Projects/Trizen/CBPT1SGA/audio.wav"  # Replace with your file path
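
Because the path points into a mounted Drive folder, a typo or an unmounted drive only shows up once transcription starts. A small check before running the pipeline can surface that earlier. This is a sketch under the assumption that failing fast is wanted; `check_audio_path` is a hypothetical helper, not part of the notebook.

```python
from pathlib import Path


def check_audio_path(path):
    """Return the path as a Path object, or raise if no file exists there."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Audio file not found: {p}")
    return p


# Example with a path that does not exist (hypothetical file name).
try:
    check_audio_path("missing_audio.wav")
except FileNotFoundError as err:
    print(err)
```

In the notebook, calling `check_audio_path(audio_path)` before the pipeline cell would be enough.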

In [14]:
print(audio_path)

/content/drive/MyDrive/My Projects/Trizen/CBPT1SGA/audio.wav


In [15]:
# Run the pipeline
transcribe_and_translate(audio_path)

Transcribing audio...
Transcription Complete!
Transcribed Text:  Today I'm going to teach you one expression, just one, because I want you to memorize it and start using it every time you speak English. This expression is extremely useful and native speakers use it all the time. Every day we come across something that is very easy or obvious to do or understand. You don't even need to think about it. It's obvious that water is wet, then the sun is hot. And it's obvious that a cat will always land on its feet. Try your own risk. So when you don't have to consider something for a long time, it's just because it's a no-brainer. It's a no-brainer that you will improve your English if you follow me here.

Select a translation option:
1. English to Telugu
Enter your choice as 1: 1
Translated Text:
ఈ రోజు నేను మీకు ఒక వ్యక్తీకరణ నేర్పబోతున్నాను, ఒక్కటి మాత్రమే, ఎందుకంటే మీరు ఆంగ్లంలో మాట్లాడే ప్రతిసారీ దాన్ని గుర్తుంచుకోవాలని మరియు ఉపయోగించడం ప్రారంభించాలని నేను కోరుకుంటున్నాను. ఈ వ్యక్తీకరణ 