## Model: Whisper

In [1]:
#%pip install git+https://github.com/openai/whisper.git
%pip install langchain-community --quiet
%pip install langchain-pinecone --quiet
%pip install ffmpeg-python --quiet
%pip install torch --quiet

Note: you may need to restart the kernel to use updated packages.


In [8]:
import os
import re
import whisper
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
from langchain_community.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Pinecone as LangchainPinecone
from pinecone import Pinecone, ServerlessSpec

In [6]:
model_whisper = whisper.load_model("base")

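# STEP 3: Clean the transcribed question
def clean_question(raw_text: str) -> str:
    """Strip filler words, normalize whitespace, and add a trailing '?' when the text looks like a question."""
    # Note: "like", "so", and "well" are removed even where they carry meaning.
    fillers = ["uh", "um", "you know", "like", "i mean", "so", "well"]
    pattern = r'\b(?:' + '|'.join(fillers) + r')\b'
    cleaned = re.sub(pattern, '', raw_text, flags=re.IGNORECASE)
    # Collapse the gaps left behind by the removed fillers.
    cleaned = re.sub(r'\s+', ' ', cleaned).strip()
    # Capitalize the first character (empty strings pass through unchanged).
    cleaned = cleaned[0].upper() + cleaned[1:] if cleaned else cleaned
    # Append "?" when a question word appears anywhere in the text.
    if not cleaned.endswith("?") and any(w in cleaned.lower() for w in ["what", "how", "why", "when", "where", "who"]):
        cleaned += "?"
    return cleaned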
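
The question check in `clean_question` uses substring matching, so a word such as "whole" or "show" counts as containing "who" or "how". A minimal sketch of a whole-word variant follows; `looks_like_question` and `QUESTION_WORDS` are hypothetical names introduced here for illustration and are not part of the notebook.

```python
import re

# Same question words that clean_question checks for.
QUESTION_WORDS = ("what", "how", "why", "when", "where", "who")

def looks_like_question(text: str) -> bool:
    """Return True only if a question word appears as a whole word."""
    pattern = r"\b(?:" + "|".join(QUESTION_WORDS) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# "whole" and "show" contain "who" and "how" only as substrings.
print(looks_like_question("The whole show was great"))
print(looks_like_question("Who is the guest"))
```

Swapping this in for the `any(w in cleaned.lower() ...)` test would change which inputs get a trailing "?", so the recorded outputs in this notebook would no longer match exactly.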

In [10]:
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
pc = Pinecone(api_key=PINECONE_API_KEY)
index = pc.Index("youtube-transcripts")
embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
vectorstore = LangchainPinecone(index=index, embedding=embeddings, text_key="text")
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5})
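
The retriever above asks for the `k=5` most similar vectors. As a rough illustration of what a top-k similarity lookup does, here is a toy version in plain Python. It assumes cosine similarity (the actual metric depends on how the Pinecone index was created) and uses made-up two-dimensional vectors instead of real `bge-small-en-v1.5` embeddings; `cosine` and `top_k` are hypothetical helpers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, docs, k=5):
    """Return the k (score, text) pairs with the highest similarity."""
    scored = [(cosine(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy corpus: (text, embedding) pairs with invented vectors.
toy_docs = [
    ("agentic AI frameworks", [1.0, 0.0]),
    ("cooking pasta", [0.0, 1.0]),
    ("AI ops products", [0.9, 0.1]),
]
hits = top_k([1.0, 0.0], toy_docs, k=2)
for score, text in hits:
    print(f"{score:.3f}  {text}")
```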

In [12]:
from huggingface_hub import login
HF_TOKEN = os.getenv("HF_TOKEN")
login(token=HF_TOKEN)

In [15]:
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    device_map="auto",
    trust_remote_code=True
)

llm_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
llm = HuggingFacePipeline(pipeline=llm_pipeline)

qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)

Some parameters are on the meta device because they were offloaded to the cpu and disk.
Device set to use cpu


In [16]:
audio_path = "../audio_files/7WJ6lmxa1WQ.mp3"  # Replace with the path to your audio file
assert os.path.exists(audio_path), "Audio file not found."

result = model_whisper.transcribe(audio_path, language="en", verbose=True)
raw_question = result["text"]
print("🔍 Raw:", raw_question)



[00:00.000 --> 00:09.740]  Hello, everyone. Welcome to another episode of our podcast. Today we're going to dive
[00:09.740 --> 00:15.280]  again into an eGente AI framework and I have another technology leader with me. Ronnie
[00:15.280 --> 00:19.960]  Zeus joins us from Israel. Ronnie, welcome. How are you?
[00:19.960 --> 00:22.920]  Good, good. Thank you. How are you doing?
[00:22.920 --> 00:27.600]  Yes, great. And we're going to dive into awesome stuff today. He's a leader building
[00:27.600 --> 00:31.920]  products for AI ops for a long time bringing a lot of new innovations. I'm really excited
[00:31.920 --> 00:37.140]  to have him on this podcast. Before we dive in, Ronnie, tell me a little bit about the you
[00:37.140 --> 00:43.080]  use Gini TVI in your world. Of course you do. You're making new products, but anything
[00:43.080 --> 00:45.480]  interesting that you do with it.
[00:45.480 --> 00:55.040]  Every day, they are all in our, both exploring different technologies in
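
Besides `result["text"]`, the dictionary returned by `transcribe` carries a `segments` list whose entries include `start`, `end`, and `text`. The sketch below rebuilds timestamped lines like the ones printed above from that list. `format_timestamp` and `format_segments` are hypothetical helpers, and `fake_result` is a hand-written stand-in so the example runs without loading Whisper.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS.mmm, matching the verbose output style."""
    total_ms = round(seconds * 1000)
    minutes, remainder_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(remainder_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"

def format_segments(result: dict) -> list[str]:
    """Build one '[start --> end] text' line per transcript segment."""
    return [
        f"[{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in result.get("segments", [])
    ]

# Stand-in for a real transcribe() result (assumed shape, values invented).
fake_result = {
    "text": " Hello, everyone. Good, good.",
    "segments": [
        {"start": 0.0, "end": 9.74, "text": " Hello, everyone."},
        {"start": 19.96, "end": 22.92, "text": " Good, good."},
    ],
}
lines = format_segments(fake_result)
for line in lines:
    print(line)
```

Working from segments rather than the full `text` makes it possible to pass a single utterance downstream instead of the whole episode.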

In [17]:
cleaned_question = clean_question(raw_question)
print("🧼 Cleaned:", cleaned_question)

🧼 Cleaned: Hello, everyone. Welcome to another episode of our podcast. Today we're going to dive again into an eGente AI framework and I have another technology leader with me. Ronnie Zeus joins us from Israel. Ronnie, welcome. How are you? Good, good. Thank you. How are you doing? Yes, great. And we're going to dive into awesome stuff today. He's a leader building products for AI ops for a long time bringing a lot of new innovations. I'm really excited to have him on this podcast. Before we dive in, Ronnie, tell me a little bit about the you use Gini TVI in your world. Of course you do. You're making new products, but anything interesting that you do with it. Every day, they are all in our, both exploring different technologies in Gini and personally using. It's really helped me to arrange my thoughts and put together my thoughts into structure ways to make sure that the messages I'm trying to convey are clear and concise. it's super helpful for me. Yeah, that's ready. I started to us

In [19]:
response = qa_chain.invoke("What is the use of Gini?")
print("Answer:", response["result"])

Found document with no `text` key. Skipping.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


KeyboardInterrupt: 
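
The warning `Found document with no `text` key. Skipping.` suggests that at least one retrieved vector has metadata without the field named by `text_key="text"`. A small sketch of how such matches could be separated for inspection is shown below. It assumes each match is dict-like with `id` and `metadata` entries; `split_by_text_key` and the sample data are hypothetical and do not query the real index.

```python
def split_by_text_key(matches, text_key="text"):
    """Separate matches whose metadata has text_key from those that lack it."""
    usable, skipped_ids = [], []
    for match in matches:
        metadata = match.get("metadata") or {}
        if text_key in metadata:
            usable.append(metadata[text_key])
        else:
            # These are the entries the vector store would skip.
            skipped_ids.append(match.get("id"))
    return usable, skipped_ids

# Invented matches: the second one stores its text under a different key.
sample_matches = [
    {"id": "chunk-1", "metadata": {"text": "Agentic AI frameworks overview"}},
    {"id": "chunk-2", "metadata": {"content": "Stored under another key"}},
    {"id": "chunk-3", "metadata": None},
]
usable, skipped_ids = split_by_text_key(sample_matches)
print("usable:", usable)
print("skipped:", skipped_ids)
```

If the skipped IDs turn out to use a different metadata field, either re-upserting them with a `text` field or changing `text_key` would be the likely fixes.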