# Lab 5: Tie it all together

In this lab, you combine what you built in the previous labs into a RAG application that:

* Uses the transcripts from all the Boston Azure YouTube videos
* Uses the version of the transcripts that is chunked in 5-minute increments
* Uses Azure AI Search as the vector store
* Creates citations with the YouTube video title and a URL that points to the 5-minute chunk

> NOTE: Soon after the GAB, I will be taking down the Azure AI Search indexes that are used in this lab.

### Step 1:

Look over the following code and run it to get ready for the lab:

In [None]:
import os
from dotenv import load_dotenv
from langchain_openai import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_community.vectorstores.azuresearch import AzureSearch

load_dotenv()

llm = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_deployment=os.getenv("AZURE_OPENAI_MODEL_DEPLOYMENT_NAME"),
)

embeddings = AzureOpenAIEmbeddings()

parser = StrOutputParser()

def format_docs(docs):
    # Render each retrieved chunk as "title~url~text"; the URL links into the video via ?t=<seconds>.
    return "\n\n".join(
        f"{d.metadata['title']}~https://youtu.be/{d.metadata['videoId']}?t={d.metadata['seconds']}~{d.page_content}"
        for d in docs
    )

vectorstore_address = os.getenv("AZURE_SEARCH_ENDPOINT")
vectorstore_password = os.getenv("AZURE_SEARCH_KEY")

index_name: str = "boston-azured-transcripts"
vectorstore: AzureSearch = AzureSearch(
    azure_search_endpoint=vectorstore_address,
    azure_search_key=vectorstore_password,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)
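
To see what `format_docs` hands to the prompt, the sketch below runs the same formatting logic on two made-up chunks. `FakeDoc`, the sample titles, the video ID `abc123`, and the assumption that `seconds` holds the chunk's start offset (0, 300, ... for 5-minute chunks) are illustrative stand-ins, not values from the real index.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    # Same logic as the lab's format_docs: one "title~url~text" entry per chunk.
    return "\n\n".join(
        f"{d.metadata['title']}~https://youtu.be/{d.metadata['videoId']}?t={d.metadata['seconds']}~{d.page_content}"
        for d in docs
    )


# Made-up chunks; the real ones come back from the Azure AI Search retriever.
sample_docs = [
    FakeDoc("First chunk of transcript text.",
            {"title": "Sample Talk", "videoId": "abc123", "seconds": 0}),
    FakeDoc("Second chunk of transcript text.",
            {"title": "Sample Talk", "videoId": "abc123", "seconds": 300}),
]

context = format_docs(sample_docs)
print(context)
```

Each entry is separated by a blank line, and the `~` characters split it into the title, URL, and transcript text that the prompt template refers to.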


Next, define a new prompt template that handles the video title and the YouTube URL.

Run the following to create it (you may find you want to modify it to get better results):

In [None]:
prompt_template_with_citations = ChatPromptTemplate.from_messages(
    [
        ("system", """Assistant helps people with their questions about the content of video transcripts. Be brief in your answers.
        Answer ONLY with the facts listed in the list of sources below. If there isn't enough information below, say you don't know. 
        Do not generate answers that don't use the sources below.
        Each source has this format: title~url~source
        Always include the source title and url for each fact you use in the response.         
         Place the title and url in square brackets after your answer, for example:
            [Source Title: https://source.url]
         Don't combine sources, list each source separately, for example:
            [Source Title Video 1: https://source.url1]
            [Source Title Video 2: https://source.url2]
         
            Context: {context}
         """),
        ("human", "{question}")
    ],
)

Create a retriever and a chain that queries the vector store, formats the results, and calls the LLM:

In [None]:
retriever = vectorstore.as_retriever()

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_template_with_citations
    | llm
    | parser
)
chain.invoke("What is Azure Container Apps?")

Try a few different questions:

In [None]:
chain.invoke("What is RAG?")

In [None]:
chain.invoke("How do you evaluate RAG applications?")

In [None]:
chain.invoke("What did Bill Wilder present?")

In [None]:
chain.invoke("Who presented on SQL Server?")
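
Because the system prompt asks for citations in the form `[Source Title: https://source.url]`, you can post-process an answer to pull the citations out. The following is a minimal sketch, not part of the lab code: `extract_citations`, `CITATION_PATTERN`, and the sample answer are hypothetical, and the model is not guaranteed to follow the requested format every time.

```python
import re

# Matches "[Some Title: https://some.url]", the format requested in the system prompt.
CITATION_PATTERN = re.compile(
    r"\[(?P<title>[^\[\]]+?):\s*(?P<url>https?://[^\s\]]+)\]"
)


def extract_citations(answer):
    """Return (title, url) pairs found in the answer, in order, without duplicates."""
    seen = []
    for match in CITATION_PATTERN.finditer(answer):
        pair = (match.group("title").strip(), match.group("url"))
        if pair not in seen:
            seen.append(pair)
    return seen


# Made-up answer text standing in for the string returned by chain.invoke(...).
sample_answer = (
    "RAG combines retrieval with generation.\n"
    "[Sample Talk 1: https://youtu.be/abc123?t=300]\n"
    "[Sample Talk 2: https://youtu.be/def456?t=0]"
)

citations = extract_citations(sample_answer)
for title, url in citations:
    print(f"{title} -> {url}")
```

In the notebook you would pass the result of `chain.invoke(...)` instead of `sample_answer`. An empty list is a hint that the answer came back without citations, which may be a reason to adjust the prompt.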

### That is it! You made it through to the end. Congratulations!