[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/henalon0)

# **LLaMA, LangChain & Pinecone - Chat with Youtube Videos**

In [None]:
from IPython import display

## **Step 1: Install All the Required Packages**

In [None]:
# Whisper
!python -m pip install -U openai-whisper -qq
!python -m pip install -U yt-dlp -qq
!sudo apt update -qq && sudo apt install ffmpeg -qq

# LangChain, LLaMA, Pinecone
!python -m pip install langchain -qq
!python -m pip install pinecone-client -qq
!python -m pip install sentence_transformers -qq
!python -m pip install xformers -qq
!python -m pip install bitsandbytes -qq
!python -m pip install accelerate -qq
!python -m pip install transformers -qq

display.clear_output()

# **Step 2: Import All the Required Libraries**

In [None]:
import warnings
warnings.filterwarnings('ignore')

import sys
import os
from glob import glob
import yt_dlp
import whisper

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Pinecone
import pinecone
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import pipeline

from langchain import HuggingFacePipeline, PromptTemplate
from langchain.chains import RetrievalQA

# **Step 3: Download, Transcribe and Load the Data**

In [None]:
video_url = "https://www.youtube.com/watch?v=U_s0ekwPK5g"

# Download the best available audio stream. The ".mp3" suffix in outtmpl only
# names the file (nothing is re-encoded); Whisper reads it through ffmpeg,
# which detects the actual codec from the file contents.
with yt_dlp.YoutubeDL({"format": "bestaudio", "outtmpl": "%(title)s.mp3"}) as video:
    # extract_info(download=True) already saves the file, so a separate
    # video.download() call would only repeat the work.
    info_dict = video.extract_info(video_url, download=True)
    video_title = info_dict["title"]

file_name = glob("*.mp3")[0]

[youtube] Extracting URL: https://www.youtube.com/watch?v=U_s0ekwPK5g
[youtube] U_s0ekwPK5g: Downloading webpage
[youtube] U_s0ekwPK5g: Downloading ios player API JSON
[youtube] U_s0ekwPK5g: Downloading android player API JSON
[youtube] U_s0ekwPK5g: Downloading m3u8 information
[info] U_s0ekwPK5g: Downloading 1 format(s): 251
[download] Destination: 81 Minutes of Business Advice For Every Entrepreneur.mp3
[download] 100% of   63.51MiB in 00:00:01 at 44.76MiB/s


In [None]:
model = whisper.load_model("medium")
result = model.transcribe(file_name)

with open(file_name.replace(".mp3", ".txt"), "w", encoding="utf-8") as file:
    file.write(result["text"])

100%|█████████████████████████████████████| 1.42G/1.42G [00:21<00:00, 69.7MiB/s]
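Besides `text`, the dictionary returned by Whisper's `transcribe` also carries a `segments` list whose entries have `start`, `end` and `text` fields. The sketch below shows one possible way to turn those segments into timestamped lines, which can help when an answer needs to point back to a moment in the video. The `sample_result` dictionary and both helper names are hypothetical stand-ins, not part of the notebook.

```python
def format_timestamp(seconds: float) -> str:
    """Render a number of seconds as H:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def segments_to_lines(result: dict) -> list:
    """Turn a Whisper-style result into '[start - end] text' lines."""
    lines = []
    for segment in result.get("segments", []):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"[{start} - {end}] {segment['text'].strip()}")
    return lines


# Hypothetical stand-in for the dictionary returned by model.transcribe(...)
sample_result = {
    "text": " Hello there. Welcome back.",
    "segments": [
        {"start": 0.0, "end": 2.4, "text": " Hello there."},
        {"start": 2.4, "end": 5.1, "text": " Welcome back."},
    ],
}

for line in segments_to_lines(sample_result):
    print(line)
```

In the notebook, `segments_to_lines(result)` could be called on the real `result` right after transcription.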


In [None]:
# Read the transcript with the same encoding it was written with
loader = TextLoader(file_name.replace(".mp3", ".txt"), encoding="utf-8")
data = loader.load()

# **Step 4: Split the Text into Chunks**

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
docs = text_splitter.split_documents(data)

In [None]:
len(docs)

210

In [None]:
docs[0]

Document(page_content="My audience asked me questions for 10 hours and here are the best moments. Let's say my only goal was just to get rich AF. What is rich AF? Just so we define it. A hundred million. Why not? It's a good number. Cool. You're willing to work hard, but you have limited skills and experience. What business would you start like right now in 2023? And I have no money, right? I would probably build a boring business in services. And I would probably pick something around the body. I wouldn't just start", metadata={'source': '81 Minutes of Business Advice For Every Entrepreneur.txt'})
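To get a feel for what `chunk_size` and `chunk_overlap` mean, here is a simplified sliding-window splitter in plain Python. It is only an illustration: `RecursiveCharacterTextSplitter` first tries to break on separators such as paragraphs, lines and spaces, so its chunk boundaries will differ from this sketch. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 20) -> list:
    """Cut text into fixed-size windows where each window repeats the
    last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap keeps a little shared text between neighbouring chunks, so a sentence cut at a boundary is less likely to lose its context entirely.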

# **Step 5: Setup the Environment**

In [None]:
PINECONE_API_KEY = os.environ.get("PINECONE_API_KEY", "PUT-YOUR-API-KEY-HERE")
PINECONE_API_ENV = os.environ.get("PINECONE_API_ENV", "PUT-YOUR-ENV-KEY-HERE")
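Because the defaults above are placeholder strings, a forgotten key only shows up later as an authentication error. A small guard like the following sketch can fail earlier with a clearer message. The helper `require_setting` and the `PLACEHOLDERS` set are hypothetical names introduced for illustration.

```python
# Placeholder defaults used in the cell above
PLACEHOLDERS = {"PUT-YOUR-API-KEY-HERE", "PUT-YOUR-ENV-KEY-HERE"}


def require_setting(name: str, value: str) -> str:
    """Return the value, or raise if it is empty or still a placeholder."""
    if not value or value in PLACEHOLDERS:
        raise RuntimeError(f"{name} is not set; export it before running the notebook")
    return value


# Demonstration with a placeholder value, which is rejected
try:
    require_setting("PINECONE_API_KEY", "PUT-YOUR-API-KEY-HERE")
except RuntimeError as error:
    print(error)
```

In the notebook this would be called as `require_setting("PINECONE_API_KEY", PINECONE_API_KEY)` before initializing Pinecone.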

# **Step 6: Download the Embeddings**

In [None]:
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# **Step 7: Initialize Pinecone**

In [None]:
# initialize pinecone
pinecone.init(
    api_key=PINECONE_API_KEY,     # find at app.pinecone.io
    environment=PINECONE_API_ENV  # next to api key in console
)

index_name = "langchain-llama2"   # put in the name of your pinecone index here
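The index named above has to exist before anything is written to it, and its dimension has to match the embedding model: `all-MiniLM-L6-v2` produces 384-dimensional vectors. The sketch below assumes the legacy module-level client API that `pinecone.init` belongs to (`list_indexes` and `create_index`), and assumes a cosine metric. `ensure_index` and `FakePineconeClient` are hypothetical names; the fake client only exists so the example runs without network access.

```python
def ensure_index(client, name: str, dimension: int = 384, metric: str = "cosine") -> bool:
    """Create the index if it is missing. Returns True when it was created."""
    if name in client.list_indexes():
        return False
    client.create_index(name, dimension=dimension, metric=metric)
    return True


class FakePineconeClient:
    """Hypothetical in-memory stand-in for the pinecone module."""

    def __init__(self):
        self.indexes = {}

    def list_indexes(self):
        return list(self.indexes)

    def create_index(self, name, dimension, metric="cosine"):
        self.indexes[name] = {"dimension": dimension, "metric": metric}


fake = FakePineconeClient()
print(ensure_index(fake, "langchain-llama2"))  # first call creates the index
print(ensure_index(fake, "langchain-llama2"))  # second call finds it and does nothing
```

In the notebook, the `pinecone` module itself would be passed as `client`, for example `ensure_index(pinecone, index_name)`.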

# **Step 8: Create Embeddings for Each Text Chunk**

In [None]:
docsearch = Pinecone.from_texts([t.page_content for t in docs], embeddings, index_name=index_name)

# If you already have an index, you can load it like this


In [None]:
# docsearch = Pinecone.from_existing_index(index_name, embeddings)

# **Step 9: Similarity Search**

In [None]:
query = "What is the best way to generate leads?"

In [None]:
docs = docsearch.similarity_search(query, k=2)

In [None]:
docs

[Document(page_content="question that influences all aspects of the business and is also a mental exercise that people can feel like they're making progress on their offer a lot faster. Leads is a lot more about activity and implementation. And so people can get ideas from like the lead magnet chapter, but then you have to start reaching out to people, posting content or running ads to start getting leads in the door. And so there's a little bit bigger of a hurdle with leads than there is offers, but it's also the"),
 Document(page_content="get 70% of people who read the book to get leads versus 10% of people to get leads. If I just cut out some of the steps and made assumptions about their skill level, the last 10 or 20%, the people were experts are still going to benefit from. I'm just also going to include turn on your computer. Here's how you do that. And from the software perspective with what you're doing, that's also how you get a way larger percentage of your clients to activat
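Conceptually, `similarity_search(query, k=2)` embeds the query and returns the `k` stored chunks whose vectors are closest to it. The sketch below reproduces that idea in plain Python with tiny made-up vectors, assuming the index uses cosine similarity. The three-dimensional vectors and the names `cosine_similarity` and `top_k` are hypothetical; the real embeddings have 384 dimensions and the ranking happens inside Pinecone.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up vectors standing in for chunk embeddings
stored = {
    "leads need outreach": [0.9, 0.1, 0.0],
    "pricing your offer": [0.1, 0.9, 0.1],
    "running ads for leads": [0.8, 0.2, 0.1],
}

for score, text in top_k([1.0, 0.0, 0.0], stored, k=2):
    print(f"{score:.3f}  {text}")
```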

# **Step 10: Creating a Llama2 Model Wrapper**

In [None]:
from huggingface_hub import notebook_login
import torch

In [None]:
notebook_login()


In [None]:
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", use_auth_token=True)


In [None]:
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf",
                                             device_map='auto',
                                             torch_dtype=torch.float16,
                                             use_auth_token=True,
                                             load_in_8bit=True
                                             )
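A rough back-of-the-envelope calculation shows why `load_in_8bit=True` matters on a single GPU. The sketch assumes roughly 6.74 billion parameters for Llama 2 7B and counts the weights only; real memory use is higher because of activations, the key/value cache, and layers that stay in higher precision. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_parameters: float, bits_per_parameter: int) -> float:
    """Approximate size of the model weights alone, in gigabytes (10**9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


LLAMA_2_7B_PARAMETERS = 6.74e9  # approximate parameter count (assumption)

for label, bits in [("float16", 16), ("8-bit", 8)]:
    size = weight_memory_gb(LLAMA_2_7B_PARAMETERS, bits)
    print(f"{label}: ~{size:.1f} GB of weights")
```

Halving the bits per weight roughly halves the weight footprint, which is what lets the model fit on GPUs that could not hold the float16 weights.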


In [None]:
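pipe = pipeline("text-generation",
                model=model,          # already loaded in 8-bit and placed on devices above
                tokenizer=tokenizer,
                max_new_tokens=512,
                do_sample=True,
                temperature=0.1,      # sampling settings are passed to the pipeline itself
                top_k=30,
                num_return_sequences=1,
                eos_token_id=tokenizer.eos_token_id
                )

In [None]:
# model_kwargs is only used when LangChain builds the pipeline itself
# (HuggingFacePipeline.from_model_id), so temperature is set on `pipe` above.
llm = HuggingFacePipeline(pipeline=pipe)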

# **Step 11: Create a Prompt Template**

In [None]:
SYSTEM_PROMPT = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer."""

In [None]:
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

In [None]:
SYSTEM_PROMPT = B_SYS + SYSTEM_PROMPT + E_SYS

In [None]:
instruction = """
{context}

Question: {question}
"""

In [None]:
template = B_INST + SYSTEM_PROMPT + instruction + E_INST

In [None]:
prompt = PromptTemplate(template=template, input_variables=["context", "question"])
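To see what the model actually receives, the template can be rendered by hand with `str.format`, which fills the `{context}` and `{question}` placeholders the same way a default f-string-style `PromptTemplate` would. This sketch assumes the standard Llama 2 chat markers `[INST]`, `<<SYS>>` and `<</SYS>>`; the context string is a made-up example, and the system prompt is shortened.

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# Shortened system prompt, for illustration only
system_prompt = "Use the following pieces of context to answer the question at the end."
instruction = "\n{context}\n\nQuestion: {question}\n"

template = B_INST + B_SYS + system_prompt + E_SYS + instruction + E_INST

rendered = template.format(
    context="Leads come from outreach, content, and ads.",  # hypothetical retrieved chunk
    question="What is the best way to generate leads?",
)
print(rendered)
```

Printing the rendered prompt is a quick way to check that the system block is wrapped in its markers and that both placeholders were filled.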

In [None]:
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(search_kwargs={"k": 2}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)
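With `chain_type="stuff"`, the retrieved chunks are simply concatenated and placed into the prompt's `{context}` slot in a single call to the model. The sketch below imitates that step, assuming a blank line as the separator between chunks. `Chunk` is a hypothetical stand-in for a LangChain `Document`, and `stuff_context` is not a LangChain function.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


def stuff_context(chunks, separator="\n\n"):
    """Join the text of all retrieved chunks into one context string."""
    return separator.join(chunk.page_content for chunk in chunks)


retrieved = [
    Chunk("Leads is a lot more about activity."),
    Chunk("Start reaching out to people."),
]
context = stuff_context(retrieved)
print(context)
```

Since everything goes into one prompt, `k` and `chunk_size` together decide how much of the model's context window the retrieved text consumes.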

In [None]:
while True:
    query = input("Query: ")

    if query == "q":
        break
    else:
        result = qa_chain(query)
        print("\nAnswer: " + result['result'].strip().replace(". ",".\n"))
        print("\n" + "-" * 10 + "\n")

Query: What is the best way to generate leads?

Answer: Thank you for providing the context.
Based on the information provided, the best way to generate leads is a multi-step process that involves several activities and strategies.
Here are some of the key actions that can help generate leads:

1.
Create landing pages: Designing landing pages that are optimized for conversion can help attract potential customers and encourage them to take action.
2.
Write compelling copy: Crafting persuasive and engaging copy can help grab the attention of potential customers and convince them to take action.
3.
Develop follow-up sequences: Having a well-defined follow-up sequence can help nurture leads and turn them into paying customers.
4.
Work the leads: Actively engaging with leads and providing them with valuable content and offers can help build trust and establish a relationship.
5.
Use scheduling tools: Utilizing scheduling tools can help automate the lead generation process and ensure that le
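The numbered items above end up split across two lines because `replace(". ", ".\n")` also breaks after list markers such as `1.`. Since the chain was built with `return_source_documents=True`, each result also carries the retrieved chunks under `source_documents`. The sketch below shows one possible way to format both. The helper names and the sample result are hypothetical, and the regular expression is a simple heuristic that will also skip sentences ending in a digit.

```python
import re
from types import SimpleNamespace


def format_answer(text: str) -> str:
    """Put each sentence on its own line without splitting '1. ' list markers."""
    return re.sub(r"(?<!\d)\. ", ".\n", text.strip())


def source_previews(result: dict, width: int = 60) -> list:
    """Return the first `width` characters of each source chunk."""
    return [doc.page_content[:width] for doc in result.get("source_documents", [])]


# Hypothetical stand-in for the dictionary returned by qa_chain(query)
sample = {
    "result": " 1. Create landing pages. 2. Write compelling copy. ",
    "source_documents": [
        SimpleNamespace(page_content="Leads is a lot more about activity and implementation."),
    ],
}

print(format_answer(sample["result"]))
for preview in source_previews(sample):
    print("Source:", preview)
```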