Use these commands to install the required packages.

In [None]:
!pip install google-cloud-aiplatform
!pip install langchain
!pip install chromadb
!pip install pytube
!pip install youtube-transcript-api
!pip install gradio
from google.cloud import aiplatform

Your account should have the Vertex AI User role (roles/aiplatform.user).

In [None]:
from google.colab import auth as google_auth
google_auth.authenticate_user()

In [None]:
import vertexai
PROJECT_ID = "gunas1"  # Enter your project ID here
vertexai.init(project=PROJECT_ID)

Use these commands to import the required libraries.

In [None]:
from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import VertexAI

Initialize the Vertex AI LLM model
Use this code snippet to initialize the Vertex AI LLM. It sets "llm" to the Text-Bison model (text-bison@001) of Vertex AI.

In [None]:
llm = VertexAI(
    model_name="text-bison@001",
    max_output_tokens=256,
    # Temperature controls the randomness of the LLM's output: higher values
    # give more varied, creative text; lower values give more deterministic,
    # focused text.
    temperature=0.1,
    top_p=0.8,
    top_k=40,
    verbose=True,
)
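
To make the roles of temperature, top_k, and top_p more concrete, here is a minimal pure-Python sketch of how such parameters can reshape a token distribution. The function name `sampling_distribution` and the toy logits are made up for illustration, and the order of the filters here is an assumption; the actual implementation inside Vertex AI may differ.

```python
import math

def sampling_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return the token probabilities left after temperature, top-k, and top-p."""
    # Temperature rescales the logits before the softmax: values below 1
    # sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top_k keeps only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # top_p keeps the smallest prefix whose cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in ranked)
    return {i: probs[i] / mass for i in ranked}

toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four tokens
cold = sampling_distribution(toy_logits, temperature=0.1, top_k=40, top_p=0.8)
warm = sampling_distribution(toy_logits, temperature=1.0, top_k=40, top_p=0.8)
print(cold)
print(warm)
```

With the low temperature used in this notebook (0.1), almost all probability mass lands on the single most likely token, which is why the output is close to deterministic.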

Initiate Embeddings

In [None]:
from langchain.embeddings import VertexAIEmbeddings

# Embedding
EMBEDDING_QPM = 100  # Maximum embedding requests per minute
EMBEDDING_NUM_BATCH = 5  # Number of texts sent in each embedding request
embeddings = VertexAIEmbeddings(
    requests_per_minute=EMBEDDING_QPM,
    num_instances_per_batch=EMBEDDING_NUM_BATCH,
)
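
The two embedding settings can be pictured as a batch size and a request quota. The sketch below is only an illustration of that idea, using hypothetical helpers `make_batches` and `min_seconds_between_requests`; it is not how the LangChain wrapper is implemented internally.

```python
def make_batches(texts, batch_size):
    """Split a list of texts into consecutive batches of at most batch_size."""
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

def min_seconds_between_requests(requests_per_minute):
    """Spacing that keeps a steady stream of requests under the quota."""
    return 60.0 / requests_per_minute

chunks = [f"chunk {n}" for n in range(12)]  # stand-in for 12 transcript chunks
batches = make_batches(chunks, batch_size=5)
print([len(b) for b in batches])
print(min_seconds_between_requests(100))
```

Under these assumptions, 12 chunks with a batch size of 5 need three embedding requests, well within a quota of 100 requests per minute.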

Load the video transcript

In [27]:
# You can replace this with any YouTube link.
loader = YoutubeLoader.from_youtube_url("https://youtu.be/8foQERR0mc0?si=iG0aU2D8ld3G89ZR", add_video_info=True)
result = loader.load()

Split the transcript into chunks

In [28]:
# chunk_size is the maximum number of characters in each chunk; each chunk gets its own embedding.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=0)
docs = text_splitter.split_documents(result)
print(f"# of documents = {len(docs)}")

# of documents = 12
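
To see what chunk_size and chunk_overlap mean, here is a simplified sketch that cuts text into fixed-size character windows. This is a plainer technique than RecursiveCharacterTextSplitter, which tries to break on separators such as paragraphs and sentences first; the helper `split_fixed` and the sample text are hypothetical.

```python
def split_fixed(text, chunk_size, chunk_overlap=0):
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

transcript = "word " * 800  # stand-in transcript, 4000 characters long
pieces = split_fixed(transcript, chunk_size=1500, chunk_overlap=0)
print([len(p) for p in pieces])
```

With no overlap, joining the pieces reproduces the original text exactly; a non-zero overlap repeats the tail of each chunk at the head of the next so that sentences cut at a boundary still appear whole in one chunk.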


Store and retrieve

Store your documents
For this exercise, we use ChromaDB; Vertex AI Vector Search is an alternative. Store your documents in ChromaDB and index them as a vector store. ChromaDB stores and retrieves vector embeddings for use with LLMs and supports semantic search over the data.




In [29]:
db = Chroma.from_documents(docs, embeddings)
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 2})
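
The retriever with k=2 returns the two chunks whose embeddings are closest to the question's embedding. The sketch below illustrates that idea with cosine similarity over hand-written toy vectors. The vectors, texts, and the helper `top_k` are invented for illustration, and Chroma's actual distance metric and index are configurable and may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=2):
    """Return the texts of the k entries most similar to query_vec."""
    scored = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]

# Toy index of (text, embedding) pairs; real embeddings have hundreds of dimensions.
toy_index = [
    ("chunk about embeddings", [0.9, 0.1, 0.0]),
    ("chunk about pricing", [0.0, 0.2, 0.9]),
    ("chunk about vector search", [0.8, 0.3, 0.1]),
]
matches = top_k([1.0, 0.2, 0.0], toy_index, k=2)
print(matches)
```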

Create a retriever chain
Create a retriever chain to answer the question. This step connects the Vertex AI Text-Bison LLM to the retriever, which fetches the most similar chunks from ChromaDB.

In [30]:
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True)
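
The "stuff" chain type places all retrieved chunks into a single prompt and sends it to the LLM in one call. A rough sketch of that idea follows. The helper `build_stuff_prompt` and the template wording are hypothetical and are not LangChain's actual prompt.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate every retrieved chunk into one prompt for a single LLM call."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

retrieved = ["First retrieved chunk.", "Second retrieved chunk."]  # stand-in chunks
stuffed = build_stuff_prompt("What is the video about?", retrieved)
print(stuffed)
```

Because everything goes into one prompt, this approach only works while the retrieved chunks fit in the model's context window, which is one reason to keep k and chunk_size small.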

Define your prompt

In [31]:
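def sm_ask(question, print_results=True):
  # Run the retrieval chain; it returns the chain's answer ("result") and
  # the retrieved transcript chunks ("source_documents").
  video_subset = qa({"query": question})
  # Use only the text of the retrieved chunks as context, not the raw dict.
  context = "\n\n".join(doc.page_content for doc in video_subset["source_documents"])
  prompt = f"""
  Answer the following question in a detailed manner, using information from the text below. If the answer is not in the text, say "I don't know" and do not generate your own response.

  Question:
  {question}
  Text:
  {context}

  Question:
  {question}

  Answer:
  """
  parameters = {
    "temperature": 0.1,
    "max_output_tokens": 256,
    "top_p": 0.8,
    "top_k": 40,
  }
  response = llm.predict(prompt, **parameters)
  if print_results:
    print(response)
  return {"answer": response}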

Integrate the LLM application
Integrate the LLM application with Gradio to give it a visual front end.

In [32]:
import gradio as gr
def get_response(input_text):
  response = sm_ask(input_text)
  # Return just the answer string so the Gradio text output shows plain text.
  return response["answer"]

grapp = gr.Interface(fn=get_response, inputs="text", outputs="text")
grapp.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://fabc20b5764af22d1b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


