# RAG Example (small but complete)

For more information, see https://python.langchain.com/docs/tutorials/rag/ 

- We are building a complete RAG pipeline on top of the vector database we built earlier.

## Specify embedding model and vector store

In [None]:
from langchain_chroma import Chroma
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

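# Use the same embedding model that was used when the database was built
embedding_model = HuggingFaceEmbeddings()
database_loc = "./chroma_db_test1"

# Open the persisted Chroma database from disk
vectorstore = Chroma(
    persist_directory=database_loc,
    embedding_function=embedding_model,
)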

## Specify the LLM

We are going to use Ollama to keep it simple.

In [None]:
from langchain_ollama import OllamaLLM

# Assumes an Ollama server is running locally and the llama3 model has been pulled
llm = OllamaLLM(model="llama3:latest")

## Create a simple prompt template

A prompt can be written in many ways, from very simple to quite complex.

In [None]:
from langchain_core.prompts import PromptTemplate

template = """
You are a counselor for Lafayette High School in Lexington, KY.
Students will ask a question based on their interests and career plans.
Your job is to advise them on which courses to register for,
based on the context provided below.

Context: {context}

Question: {question}

Answer: I would recommend the following courses: """

prompt = PromptTemplate.from_template(template)
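
To make the placeholder substitution concrete, here is a minimal sketch that fills a shortened copy of the template with plain `str.format` instead of `PromptTemplate`, so it runs without LangChain. The passages and the question are made up for illustration and do not come from the course catalog.

```python
# Shortened stand-in for the notebook's template (assumption: same two placeholders).
sketch_template = """Context: {context}

Question: {question}

Answer: I would recommend the following courses: """

# Hypothetical retrieved passages and student question, for illustration only.
sketch_passages = [
    "Course A: introduction to drafting.",
    "Course B: construction basics.",
]
sketch_question = "I am interested in building homes"

# Join the passages the same way generate() does, then fill both placeholders.
filled_prompt = sketch_template.format(
    context="\n\n".join(sketch_passages),
    question=sketch_question,
)
print(filled_prompt)
```

The text that reaches the model is just this filled string, which is why the template ends with the start of the answer.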

## Create a chain of steps

For more information, see https://python.langchain.com/docs/tutorials/rag/ 

In [None]:
from typing_extensions import List, TypedDict
from langchain_core.documents import Document

# Define state for application
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

# Define application steps
def retrieve(state: State):
    retrieved_docs = vectorstore.similarity_search(state["question"], k=5)
    return {"context": retrieved_docs}

def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response}
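
Each step above returns only the keys it changes, and the graph merges that partial update into the shared state. The following plain-Python sketch imitates that flow under simplifying assumptions: `fake_retrieve`, `fake_generate`, and `run_sequence` are hypothetical stand-ins (keyword overlap instead of vector search, a canned string instead of an LLM call), not LangGraph code.

```python
from typing import Callable, Dict, List

# Tiny made-up corpus standing in for the vector store.
SKETCH_CORPUS = [
    "Biology I covers cells and genetics.",
    "Woodworking covers framing and joinery.",
]

def fake_retrieve(state: Dict) -> Dict:
    # Stand-in for similarity_search: keep passages sharing a word with the question.
    words = set(state["question"].lower().split())
    hits = [doc for doc in SKETCH_CORPUS if words & set(doc.lower().split())]
    return {"context": hits}

def fake_generate(state: Dict) -> Dict:
    # Stand-in for the LLM call: report how much context was available.
    return {"answer": f"Based on {len(state['context'])} passage(s): ..."}

def run_sequence(steps: List[Callable[[Dict], Dict]], state: Dict) -> Dict:
    # Run the steps in order, merging each partial update into a copy of the state.
    state = dict(state)
    for step in steps:
        state.update(step(state))
    return state

sketch_result = run_sequence(
    [fake_retrieve, fake_generate],
    {"question": "Which biology course fits me?"},
)
print(sketch_result)
```

After the run, the state holds `question`, `context`, and `answer` together, which mirrors what `graph.invoke` returns later in the notebook.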

## Connect the steps and compile them as a LangGraph graph

This API is quite new. Until recently, chains were created by piping components together.
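
For readers who have not seen the piping style, the toy sketch below shows the idea of composing steps with the `|` operator. The `Step` class is hypothetical and is not LangChain's implementation; it only illustrates how `a | b` can build a new step that feeds the output of `a` into `b`.

```python
from typing import Any, Callable

class Step:
    """Hypothetical toy wrapper used only to illustrate pipe-style composition."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new step that runs a first, then b on a's output.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

# Stand-ins for a prompt builder and a model call.
build_prompt = Step(lambda q: f"Question: {q}\nAnswer:")
fake_llm = Step(lambda p: p + " (model output would go here)")

chain = build_prompt | fake_llm
piped_output = chain.invoke("I want to be an AI researcher")
print(piped_output)
```

The graph approach used in this notebook expresses the same ordering, but with named steps and an explicit shared state.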

In [None]:
from langgraph.graph import START, StateGraph

# Build the graph (retrieve, then generate) and compile it
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

In [None]:
from IPython.display import Image, display

display(Image(graph.get_graph().draw_mermaid_png()))

## Now comes the fun part!

In [None]:
# Keep exactly one question uncommented (or write your own)
student_question = "I am interested in building homes"
# student_question = "I want to be an ambassador to Japan"
# student_question = "My goal is to find a cure for cancer"
# student_question = "I love rattlesnakes"
# student_question = "I want to be an AI researcher"

In [None]:
response = graph.invoke({"question": student_question})

print(f'Answer: {response["answer"]}\n\n')
print("*" * 80)
print("For more information, see the following pages in the Lafayette course catalog:")

pages = []
for context in response["context"]:
    if context.metadata['page'] not in pages:
        pages.append(context.metadata['page'])
print(f"pages: {pages}")
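
The loop above assumes every retrieved document has a `page` entry in its metadata. As a hedged alternative, the sketch below collects unique pages in first-seen order and skips documents without one. `FakeDoc` and `unique_pages` are hypothetical names introduced for illustration; `FakeDoc` only mimics the `metadata` attribute of a LangChain `Document`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List

@dataclass
class FakeDoc:
    # Stand-in for a retrieved Document; only the metadata attribute is modeled.
    metadata: Dict[str, Any] = field(default_factory=dict)

def unique_pages(docs: Iterable[Any]) -> List[Any]:
    # A dict preserves insertion order, so it deduplicates while keeping first-seen order.
    seen: Dict[Any, None] = {}
    for doc in docs:
        page = doc.metadata.get("page")  # None when the key is missing
        if page is not None:
            seen.setdefault(page, None)
    return list(seen)

sketch_docs = [
    FakeDoc({"page": 12}),
    FakeDoc({"page": 7}),
    FakeDoc({"page": 12}),
    FakeDoc({}),  # no page recorded
]
sketch_pages = unique_pages(sketch_docs)
print(f"pages: {sketch_pages}")
```

In the notebook, the same function could be called as `unique_pages(response["context"])`.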