# LangGraph Advanced with Llama 3

This notebook uses [Llama 3 (8B Model)](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3) through [Ollama](https://python.langchain.com/docs/integrations/chat/ollama/) to build [LangGraph](https://python.langchain.com/docs/langgraph/) multi-agent RAG systems.

Make sure `Ollama` is running and that you have pulled the `llama3:8b` model.

In [34]:
import os
from langchain_community.chat_models import ChatOllama
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import YoutubeLoader
from langchain_community.vectorstores import Chroma
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAIEmbeddings
from langchain_community.retrievers import TavilySearchAPIRetriever

Inspired by the [LangChain videos](https://www.youtube.com/watch?v=-ROS6gfYIts) and a [previous notebook](https://github.com/jzamalloa1/langchain_learning/blob/main/langgraph_testing.ipynb).

<p>
<img src="ILLUSTRATIONS/langgraph_advanced_flow.png" 
      width="65%" height="auto"
      style="display: block; margin: 0 auto" />

Illustration [reference](https://github.com/jzamalloa1/langchain_learning/blob/main/langgraph_testing.ipynb)

In [9]:
# OpenAI API Key solely to use embedding model
os.environ["OPENAI_API_KEY"] = ""
os.environ["TAVILY_API_KEY"] = ""

In [3]:
llama_model = "llama3:8b"

#### Source for Vector Stores

In [5]:
llm_urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
    
]

plotly_yt_urls = [
    "https://www.youtube.com/watch?v=Qx5eFVUdDxk&list=PLYD54mj9I2JevdabetHsJ3RLCeMyBNKYV&index=1",
    "https://www.youtube.com/watch?v=Z9YUejzkFa0&list=PLYD54mj9I2JevdabetHsJ3RLCeMyBNKYV&index=2",
    "https://www.youtube.com/watch?v=4bP66rRxVBw&list=PLYD54mj9I2JevdabetHsJ3RLCeMyBNKYV&index=3",
    "https://www.youtube.com/watch?v=a1qzu5GKIf0&list=PLYD54mj9I2JevdabetHsJ3RLCeMyBNKYV&index=4",
    "https://www.youtube.com/watch?v=Fm7DC-Z5R7A&list=PLYD54mj9I2JevdabetHsJ3RLCeMyBNKYV&index=5",
    "https://www.youtube.com/watch?v=4jcWJ30HqSY&list=PLYD54mj9I2JevdabetHsJ3RLCeMyBNKYV&index=6"
]

#### Build Retrievers - One for LLM docs and one for Plotly docs

In [21]:
#### LLM DOCS VECTOR STORE #####
llm_docs = [WebBaseLoader(url).load() for url in llm_urls] # Each loader returns a list of one element
llm_docs = [doc for docs in llm_docs for doc in docs] # Flatten the one-element lists into a single list

text_splitter_class = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=250, chunk_overlap=0
)

chunks = text_splitter_class.split_documents(llm_docs)
print(f"Split into {len(chunks)} chunks")

embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

llm_llw_vectorstore = Chroma.from_documents(
    documents = chunks,
    embedding = embedding_model,
    collection_name = "chroma_llm_llw"
)

llm_llw_retriever = llm_llw_vectorstore.as_retriever()


#### YOUTUBE PLOTLY VECTOR STORE #####
yt_docs = [YoutubeLoader.from_youtube_url(url, add_video_info=False).load() for url in plotly_yt_urls] # Each loader returns a list of one element
yt_docs = [doc for docs in yt_docs for doc in docs] # Flatten the one-element lists into a single list

text_splitter_class = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=250, chunk_overlap=0
)

chunks = text_splitter_class.split_documents(yt_docs)
print(f"Split into {len(chunks)} chunks")

plotly_yt_vectorstore = Chroma.from_documents(
    documents = chunks,
    embedding = embedding_model,
    collection_name = "chroma_plotly_yt"
)

plotly_yt_retriever = plotly_yt_vectorstore.as_retriever()

Split into 194 chunks
Split into 109 chunks
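
The nested comprehension used above to flatten the per-URL loader results can also be written with `itertools.chain.from_iterable`. A minimal sketch, using plain strings as stand-ins for the `Document` objects the loaders return (the `flatten` helper name is hypothetical):

```python
from itertools import chain

def flatten(list_of_lists):
    """Flatten one level of nesting, preserving order."""
    return list(chain.from_iterable(list_of_lists))

# Each loader call returns a list; stand-in strings replace Document objects here.
per_url_results = [["doc_a"], ["doc_b"], ["doc_c1", "doc_c2"]]
flat_docs = flatten(per_url_results)
print(flat_docs)
```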


#### Build Router Functions for Conditional Graph Nodes

**Ollama** supports `format="json"`, which ensures that the LLM output is JSON. Recall that in the [ChatOpenAI implementation](https://github.com/jzamalloa1/langchain_learning/blob/main/langgraph_testing.ipynb) we tested before, we had to create a Pydantic object and bind it to our ChatOpenAI LLM to get structured output. We don't have to do that here.

Note: An alternative would have been to pass `model_kwargs={"response_format":{"type":"json_object"}}` to **ChatOpenAI**, but it is not clear whether this is fully reliable (or whether it works as well as Ollama's JSON mode).
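
JSON mode constrains the syntax of the output, not which keys or values appear in it, so it can help to validate a router's answer before branching on it. A minimal sketch using only the standard library; `parse_route`, `ALLOWED_ROUTES` and the `web_search` fallback are illustrative assumptions, not part of LangChain or Ollama:

```python
import json

# The two choices the first router below is asked to pick from.
ALLOWED_ROUTES = {"web_search", "vectorstore"}

def parse_route(raw_output, default="web_search"):
    """Return the 'datasource' value if it is an allowed route, else `default`."""
    try:
        # Accept either a raw JSON string or an already-parsed dict.
        data = json.loads(raw_output) if isinstance(raw_output, str) else raw_output
    except json.JSONDecodeError:
        return default
    if not isinstance(data, dict):
        return default
    route = data.get("datasource")
    if isinstance(route, str) and route in ALLOWED_ROUTES:
        return route
    return default

print(parse_route('{"datasource": "vectorstore"}'))  # valid route
print(parse_route('{"source": "vectorstore"}'))      # wrong key, falls back
print(parse_route("not json"))                        # invalid JSON, falls back
```

Falling back to web search is only one possible policy; raising an error or retrying the router would be equally reasonable choices.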

In [26]:
#### FIRST ROUTER TO DECIDE IF WE SHOULD USE WEB SEARCH OR INTERNAL VECTOR STORES #####
llm = ChatOllama(model=llama_model, format="json", temperature=0,
                 callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

prompt = PromptTemplate(
    template="""
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|> 
    You are an expert at routing a user question to a vectorstore or web search. 
    Use the vectorstore for questions on LLM agents, 
    prompt engineering, adversarial attacks and developing charts with Plotly. 
    You do not need to be stringent with the keywords in the question related to these topics. 
    Otherwise, use web-search. Give a binary choice 'web_search' 
    or 'vectorstore' based on the question. Return a JSON with a single key 'datasource' and 
    no preamble or explanation. 
    Question to route: {question} <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>
    """,
    input_variables=["question"],
)

initial_router_chain = prompt | llm | JsonOutputParser()

initial_router_chain.invoke({"question":"How can I build a histogram using plotly?"})

{"datasource": "vectorstore"}

{'datasource': 'vectorstore'}

In [28]:
#### SECOND ROUTER (IF VECTORSTORE IS CHOSEN) TO DECIDE IF WE SHOULD USE THE LLM OR THE PLOTLY RETRIEVER #####
llm = ChatOllama(model=llama_model, format="json", temperature=0,
                 callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

prompt = PromptTemplate(
    template="""
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|> 
    You are an expert at routing a user question to an LLM documentation vectorstore or a Plotly vectorstore. 
    Use the LLM vectorstore for questions on LLM agents, 
    prompt engineering and adversarial attacks. 
    You do not need to be stringent with the keywords in the question related to these topics. 
    Otherwise, use the plotly vectorstore for things like Plotly charts. Give a binary choice 'llm_agent' 
    or 'plotly' based on the question. Return a JSON with a single key 'datasource' and 
    no preamble or explanation. 
    Question to route: {question} <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>
    """,
    input_variables=["question"],
)

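# NOTE: this reuses the name `initial_router_chain`, so it replaces the first router defined earlier
initial_router_chain = prompt | llm | JsonOutputParser()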

initial_router_chain.invoke({"question":"what are adversarial agents?"})

{"datasource": "llm_agent"}

{'datasource': 'llm_agent'}
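
The two routers can be composed into a single routing decision for a conditional edge. A minimal sketch, assuming each router is a callable that takes `{"question": ...}` and returns a dict with a `datasource` key, as the chains above do. The stub routers, the `route_question` name and the returned node labels are hypothetical:

```python
def route_question(question, first_router, second_router):
    """Return 'web_search', 'llm_retriever' or 'plotly_retriever' for a question."""
    first = first_router({"question": question}).get("datasource")
    if first != "vectorstore":
        # Anything other than 'vectorstore' falls back to web search.
        return "web_search"
    second = second_router({"question": question}).get("datasource")
    return "llm_retriever" if second == "llm_agent" else "plotly_retriever"

# Stub routers standing in for the LLM chains (keyword rules, for illustration only).
def stub_first(inputs):
    q = inputs["question"].lower()
    known = ("plotly", "agent", "prompt", "adversarial")
    return {"datasource": "vectorstore" if any(k in q for k in known) else "web_search"}

def stub_second(inputs):
    q = inputs["question"].lower()
    return {"datasource": "plotly" if "plotly" in q else "llm_agent"}

for q in ["How can I build a histogram using plotly?",
          "What are adversarial agents?",
          "Who won the match last weekend?"]:
    print(q, "->", route_question(q, stub_first, stub_second))
```

With the real chains, the `.invoke` method of each router chain could be passed in place of the stubs, provided the two chains are bound to different names.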

#### Build Retrieval Grader

In [16]:
# Retrieval Grader
llm = ChatOllama(model=llama_model, format="json", temperature=0,
                 callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

prompt = PromptTemplate(
    template="""
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|>
    You are a grader assessing relevance of a retrieved document to a user question. 
    If the document contains keywords related to the user question, 
    grade it as relevant. It does not need to be a stringent test. The goal is to filter out erroneous retrievals. \n
    Give a binary 'yes' or 'no' score to indicate whether the document is relevant to the question. \n
    Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.
    <|eot_id|>
    
    <|start_header_id|>user<|end_header_id|>
    Here is the retrieved document: \n\n {document} \n\n
    Here is the user question: {question} \n <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>
    """,

    input_variables=["question", "document"],
)

retrieval_grader = prompt | llm | JsonOutputParser()
question = "langchain memory use"
docs = llm_llw_retriever.invoke(question)
doc_txt = docs[1].page_content

grader_output = retrieval_grader.invoke({"question": question, "document": doc_txt})


{"score": "yes"}

In [19]:
print(type(grader_output))
grader_output

<class 'dict'>


{'score': 'yes'}
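
Because the grader returns a plain dict, filtering a list of retrieved documents is a short loop. A minimal sketch, assuming the grader is a callable that takes `{"question": ..., "document": ...}` and returns a dict with a `score` key; `filter_relevant` and the keyword-based `stub_grader` are hypothetical stand-ins for the LLM chain:

```python
def filter_relevant(question, documents, grader):
    """Keep only the documents the grader scores as 'yes'."""
    kept = []
    for doc in documents:
        result = grader({"question": question, "document": doc})
        # Normalise the score so 'Yes' or ' yes ' are still accepted.
        if str(result.get("score", "")).strip().lower() == "yes":
            kept.append(doc)
    return kept

# Stub grader: relevant if any word of the question appears in the document text.
def stub_grader(inputs):
    words = inputs["question"].lower().split()
    hit = any(w in inputs["document"].lower() for w in words)
    return {"score": "yes" if hit else "no"}

sample_docs = ["Agents can use short-term and long-term memory.", "Bananas are yellow."]
print(filter_relevant("agent memory", sample_docs, stub_grader))
```

With the real chain, `retrieval_grader.invoke` could be passed as `grader`, and each document's `page_content` as the text to grade.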

#### Build Retrieval Generator

In [33]:
# LLM (Without json mode since we want inference text)
llm = ChatOllama(model=llama_model, temperature=0,
                 callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

# Prompt
prompt = PromptTemplate(
    template="""
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|> 
    You are an assistant for question-answering tasks. 
    Use the following pieces of retrieved context to answer the question. 
    If you don't know the answer, just say that you don't know. 
    Keep the answer concise and to the point. <|eot_id|>
    
    <|start_header_id|>user<|end_header_id|>
    Question: {question} 
    Context: {context} 
    Answer: <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>""",
    
    input_variables=["question", "context"],
)

# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Chain
rag_chain = prompt | llm | StrOutputParser()

In [31]:
# Example run
question = "how can I create a lineplot?"
docs = plotly_yt_retriever.invoke(question)
rag_chain.invoke({"context":format_docs(docs), "question":question})

To create a lineplot, you can use Plotly Express in Dash. You would do this by using the `px.line` function and specifying the x-axis and y-axis data. For example:

```
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

app = dash.Dash()

app.layout = html.Div([
    dcc.Graph(id='line-plot'),
])

@app.callback(Output('line-plot', 'figure'), [Input('dropdown', 'value')])
def update_line_plot(value):
    fig = px.line(x=value['x'], y=value['y'])
    return fig

if __name__ == '__main__':
    app.run_server(debug=True)
```

This code creates a simple line plot with the x-axis and y-axis data specified in the `update_line_plot` function.

"To create a lineplot, you can use Plotly Express in Dash. You would do this by using the `px.line` function and specifying the x-axis and y-axis data. For example:\n\n```\nimport dash\nimport dash_core_components as dcc\nimport dash_html_components as html\nfrom dash.dependencies import Input, Output\n\napp = dash.Dash()\n\napp.layout = html.Div([\n    dcc.Graph(id='line-plot'),\n])\n\n@app.callback(Output('line-plot', 'figure'), [Input('dropdown', 'value')])\ndef update_line_plot(value):\n    fig = px.line(x=value['x'], y=value['y'])\n    return fig\n\nif __name__ == '__main__':\n    app.run_server(debug=True)\n```\n\nThis code creates a simple line plot with the x-axis and y-axis data specified in the `update_line_plot` function."

In [32]:
# Example run with unrelated question, info not found in vector store
question = "what is the One Piece?"
docs = plotly_yt_retriever.invoke(question)
rag_chain.invoke({"context":format_docs(docs), "question":question})

I don't know what the One Piece is. The context provided seems to be about Dash, a Python framework for building web applications, and Plotly Express, a library for creating interactive visualizations. It doesn't appear to mention anything related to "One Piece".

'I don\'t know what the One Piece is. The context provided seems to be about Dash, a Python framework for building web applications, and Plotly Express, a library for creating interactive visualizations. It doesn\'t appear to mention anything related to "One Piece".'

#### Build Websearch Tool

In [37]:
web_tool = TavilySearchAPIRetriever(k=5)

In [39]:
web_tool.invoke("A que equipo derrotaron Los Chankas el fin de semana pasado?")

[Document(page_content='Los Chankas derrotaron 2-0 a Santos FC en el cierre del Torneo Apertura de la Liga 2. Con este resultado, el equipo dirigido por Gustavo Cisneros terminó ubicado en el tercer lugar de las clasificaciones con 20 puntos; mientras que los nasqueños finalizaron en el sexto casillero con 16 unidades. ... que no tuvo acción el pasado fin de semana ...', metadata={'title': 'Los Chankas derrotaron 2-0 a Santos FC en el cierre del Torneo Apertura ...', 'source': 'https://www.futbolperuano.com/liga-2/noticias/los-chankas-vs-santos-fc-en-vivo-online-por-la-fecha-13-del-torneo-apertura-de-la-liga-2-358470', 'score': 0.97505, 'images': None}),
 Document(page_content='Los Chankas derrotó 2-0 a Sport Boys en Andahuaylas por la fecha 6 del Torneo Apertura 2024. Los goles del equipo local fueron obra de Carlos López y Ángel Ledesma.', metadata={'title': 'En Andahuaylas: Los Chankas venció 2-0 a Sport Boys por el Torneo Apertura', 'source': 'https://depor.com/futbol-peruano/desce

#### Build Answer Grader

In [40]:
llm = ChatOllama(model=llama_model, format="json", temperature=0,
                 callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

# Prompt
prompt = PromptTemplate(
    template="""
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|> 
    You are a grader assessing whether an 
    answer is useful to resolve a question. Give a binary score 'yes' or 'no' to indicate whether the answer is 
    useful to resolve a question. Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.
     <|eot_id|>
     
     <|start_header_id|>user<|end_header_id|> 
     Here is the answer:
    \n ------- \n
    {generation} 
    \n ------- \n
    Here is the question: 
    {question} 
    <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>""",

    input_variables=["generation", "question"],
)

answer_grader = prompt | llm | JsonOutputParser()

In [41]:
# Test answer grader
answer_grader.invoke({"question": "How do we build graphs in plotly?",
                      "generation": "The One Piece is the best manga in the world"})

{"score": "no"}

{'score': 'no'}
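
The answer grade can drive the final branch of the graph: accept the answer or try again. A minimal sketch, assuming the grader is a callable that returns a dict with a `score` key. The `decide_after_generation` name, the returned labels and the retry limit are hypothetical choices, not something the notebook has defined:

```python
def decide_after_generation(question, generation, grader, attempts, max_attempts=2):
    """Return 'useful', 'retry_web_search' or 'give_up' based on the answer grade."""
    grade = grader({"question": question, "generation": generation})
    if str(grade.get("score", "")).strip().lower() == "yes":
        return "useful"
    # Not useful: retry through web search until the attempt budget is spent.
    return "retry_web_search" if attempts < max_attempts else "give_up"

# Stub grader standing in for the LLM chain: 'yes' only if the answer mentions plotly.
def stub_answer_grader(inputs):
    return {"score": "yes" if "plotly" in inputs["generation"].lower() else "no"}

question = "How do we build graphs in plotly?"
print(decide_after_generation(question, "Use plotly.express, e.g. px.line.", stub_answer_grader, attempts=0))
print(decide_after_generation(question, "The One Piece is the best manga in the world", stub_answer_grader, attempts=0))
print(decide_after_generation(question, "The One Piece is the best manga in the world", stub_answer_grader, attempts=2))
```

Capping the number of retries keeps the graph from looping indefinitely when no source can answer the question.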