# Local Agents with LangGraph and Ollama
This notebook looks at the issues involved in getting agents to work locally. Some videos on LangGraph have convinced me that it may be a more
flexible way to implement my own agentic architectures, so I explore it here.

In [1]:
# Displaying final output format
from IPython.display import display, Markdown, Latex
# LangChain Dependencies
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser, StrOutputParser
from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from langgraph.graph import END, StateGraph
from langchain_core.prompts import ChatPromptTemplate

# For State Graph 
from typing_extensions import TypedDict
import os

In [2]:
# Environment Variables
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ["LANGCHAIN_PROJECT"] = "Ollama Agent"

In [3]:
# Defining LLM
local_llm = 'phi3:14b-medium-4k-instruct-q8_0'
model = ChatOllama(model=local_llm, temperature=0, base_url="http://192.168.86.2:11434", keep_alive=-1)
model_json = ChatOllama(model=local_llm, format='json', temperature=0, base_url="http://192.168.86.2:11434", keep_alive=-1)

In [4]:
# Web Search Tool

wrapper = DuckDuckGoSearchAPIWrapper(max_results=25)
web_search_tool = DuckDuckGoSearchRun(api_wrapper=wrapper)

# Test Run
# resp = web_search_tool.invoke("home depot news")
# resp

In [32]:
# Router
from langchain_core.messages import SystemMessage

router_prompt = ChatPromptTemplate.from_messages(

    [("system", """
    You are an expert at routing a user question to either the generation stage or web search. 
    Use the web search for questions that require more context for a better answer, or recent events.
    Otherwise, you can skip and go straight to the generation phase to respond.
    You do not need to be stringent with the keywords in the question related to these topics.
    Give a binary choice 'web_search' or 'generate' based on the question. 
    Return the JSON with a single key 'choice' with no preamble or explanation. 
    
    Question to route: {question} 
    """
    ),]
)

# Chain
question_router = router_prompt | model_json | JsonOutputParser()

# Test Run
question = "What's up?"
print(question_router.invoke({"question": question}))

{'choice': 'generate'}


In [34]:
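# Chain without the JSON parser, to inspect the raw model response.
# It gets its own name so the parsed `question_router` used by the graph is not overwritten.
raw_router = router_prompt | model

# Test Run
question = "What's up?"
response = raw_router.invoke({"question": question})
print(response)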

content='```json\n\n{ "choice": "generate" }\n\n```' response_metadata={'model': 'phi3:14b-medium-4k-instruct-q8_0', 'created_at': '2024-05-24T07:54:55.370119971Z', 'message': {'role': 'assistant', 'content': ''}, 'done': True, 'total_duration': 442952241, 'load_duration': 2046245, 'prompt_eval_count': 11, 'prompt_eval_duration': 117445000, 'eval_count': 16, 'eval_duration': 319574000} id='run-e2d99d64-948d-4edd-b259-8cfd54c9c8d5-0'
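Without `format='json'`, the reply above arrives wrapped in a Markdown code fence, which `json.loads` rejects as-is. Here is a minimal standard-library sketch of one way to recover the dictionary from such a reply. The helper name `extract_json` is hypothetical and not part of LangChain.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example stays readable

# Matches text wrapped in a code fence, with an optional "json" language tag.
_FENCED = re.compile(
    rf"^\s*{FENCE}(?:json)?\s*(.*?)\s*{FENCE}\s*$",
    re.DOTALL | re.IGNORECASE,
)

def extract_json(text: str) -> dict:
    """Parse JSON from a model reply, stripping a surrounding code fence if present."""
    match = _FENCED.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)

# Same shape as the `content` field printed above.
raw = f'{FENCE}json\n\n{{ "choice": "generate" }}\n\n{FENCE}'
print(extract_json(raw))
```

Unfenced JSON passes through unchanged, so the same helper can sit behind either model configuration.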


In [35]:
print(response.response_metadata['prompt_eval_count'])

11
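The `response_metadata` shown above also carries timing fields. Ollama reports durations in nanoseconds, so generation throughput can be estimated from `eval_count` and `eval_duration`. This is a minimal sketch with a hypothetical helper name, using the numbers from the response above.

```python
def tokens_per_second(metadata: dict) -> float:
    """Estimate generation speed from Ollama-style response metadata.

    Assumes `eval_duration` is expressed in nanoseconds.
    """
    seconds = metadata["eval_duration"] / 1e9
    return metadata["eval_count"] / seconds

# Values copied from the recorded response above.
metadata = {"eval_count": 16, "eval_duration": 319574000}
print(f"{tokens_per_second(metadata):.1f} tokens/s")
```

The same calculation with `prompt_eval_count` and `prompt_eval_duration` would give a rough prompt-processing rate.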


In [33]:
print(router_prompt.invoke("Whats up?"))

messages=[SystemMessage(content="\n    You are an expert at routing a user question to either the generation stage or web search. \n    Use the web search for questions that require more context for a better answer, or recent events.\n    Otherwise, you can skip and go straight to the generation phase to respond.\n    You do not need to be stringent with the keywords in the question related to these topics.\n    Give a binary choice 'web_search' or 'generate' based on the question. \n    Return the JSON with a single key 'choice' with no preamble or explanation. \n    \n    Question to route: Whats up? \n    ")]


In [15]:
from langchain_core.prompts import ChatPromptTemplate
system_template = """
        You are an AI assistant for Research Question Tasks, that synthesizes web search results. 
        Strictly use the following pieces of web search context to answer the question. If you don't know the answer, just say that you don't know. 
        keep the answer concise, but provide all of the details you can in the form of a research report. 
        Only make direct references to material if provided in the context."""
user_template = """
    Question: {question} 
    Web Search Context: {context} 
    Answer: """

generate_prompt = ChatPromptTemplate.from_messages([
    ('system', system_template),
    ('human', user_template),
])

generate_result = generate_prompt.invoke( {
    "question": "What's been up with Macom recently?", "context": "" }
)
print(generate_result)

# Chain (uses `model`, the ChatOllama instance defined above)
generate_chain = generate_prompt | model | StrOutputParser()

# Test Run
# question = "How are you?"
# context = ""
#generation = generate_chain.invoke({"context": context, "question": question})
#print(generation)

messages=[SystemMessage(content="\n        You are an AI assistant for Research Question Tasks, that synthesizes web search results. \n        Strictly use the following pieces of web search context to answer the question. If you don't know the answer, just say that you don't know. \n        keep the answer concise, but provide all of the details you can in the form of a research report. \n        Only make direct references to material if provided in the context."), HumanMessage(content="\n    Question: What's been up with Macom recently? \n    Web Search Context:  \n    Answer: ")]


In [31]:
# Query Transformation

query_prompt = ChatPromptTemplate.from_messages(
    # from_messages expects a list of messages, not a single (role, text) tuple
    [('system', """
    You are an expert at crafting web search queries for research questions.
    More often than not, a user will ask a basic question that they wish to learn more about, however it might not be in the best format. 
    Reword their query to be the most effective web search string possible.
    Return the JSON with a single key 'query' with no preamble or explanation. 
    
    Question to transform: {question} 
    """)]
)

# Chain
query_chain = query_prompt | model_json | JsonOutputParser()

# Test Run
question = "What's happened recently with Macom?"
print(query_chain.invoke({"question": question}))

In [10]:
# Graph State
class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: question
        generation: LLM generation
        search_query: revised question for web search
        context: web_search result
    """
    question : str
    generation : str
    search_query : str
    context : str

# Node - Generate

def generate(state):
    """
    Generate answer

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    
    print("Step: Generating Final Response")
    question = state["question"]
    context = state["context"]

    # Answer Generation
    generation = generate_chain.invoke({"context": context, "question": question})
    return {"generation": generation}

# Node - Query Transformation

def transform_query(state):
    """
    Transform user question to web search

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Appended search query
    """
    
    print("Step: Optimizing Query for Web Search")
    question = state['question']
    gen_query = query_chain.invoke({"question": question})
    search_query = gen_query["query"]
    return {"search_query": search_query}


# Node - Web Search

def web_search(state):
    """
    Web search based on the question

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Appended web results to context
    """

    search_query = state['search_query']
    print(f'Step: Searching the Web for: "{search_query}"')
    
    # Web search tool call
    search_result = web_search_tool.invoke(search_query)
    return {"context": search_result}


# Conditional Edge, Routing

def route_question(state):
    """
    route question to web search or generation.

    Args:
        state (dict): The current graph state

    Returns:
        str: Next node to call
    """

    print("Step: Routing Query")
    question = state['question']
    output = question_router.invoke({"question": question})
    if output.get("choice") == "web_search":
        print("Step: Routing Query to Web Search")
        return "websearch"
    # Treat 'generate' and any unexpected value as generation, so the
    # function never returns None (which is not a key in the edge mapping).
    print("Step: Routing Query to Generation")
    return "generate"

In [11]:
# Build the nodes
workflow = StateGraph(GraphState)
workflow.add_node("websearch", web_search)
workflow.add_node("transform_query", transform_query)
workflow.add_node("generate", generate)

# Build the edges
workflow.set_conditional_entry_point(
    route_question,
    {
        "websearch": "transform_query",
        "generate": "generate",
    },
)
workflow.add_edge("transform_query", "websearch")
workflow.add_edge("websearch", "generate")
workflow.add_edge("generate", END)

# Compile the workflow
local_agent = workflow.compile()
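To make the control flow explicit: the conditional entry point picks the first node, each node returns only the keys it changes, and those keys are merged into the shared state before the next node runs. The following plain-Python sketch mirrors that flow, with stub nodes standing in for the chains and the search tool. It illustrates the wiring above and is not how LangGraph is implemented. Every `fake_*` name, the `run_flow` helper, and the extra `path` key are hypothetical.

```python
from typing import Callable, Dict

State = Dict[str, str]

# Stub nodes: each returns only the keys it updates, like the real nodes above.
def fake_transform_query(state: State) -> State:
    return {"search_query": state["question"].rstrip("?") + " news"}

def fake_web_search(state: State) -> State:
    return {"context": f"results for: {state['search_query']}"}

def fake_generate(state: State) -> State:
    context = state.get("context", "")
    return {"generation": f"answer to {state['question']!r} using {context!r}"}

def fake_route(state: State) -> str:
    # Stand-in for the LLM router: search only when the question mentions "recent".
    return "websearch" if "recent" in state["question"].lower() else "generate"

NODES: Dict[str, Callable[[State], State]] = {
    "transform_query": fake_transform_query,
    "websearch": fake_web_search,
    "generate": fake_generate,
}
# Same wiring as the workflow: router label -> first node, then fixed edges.
ENTRY = {"websearch": "transform_query", "generate": "generate"}
EDGES = {"transform_query": "websearch", "websearch": "generate", "generate": None}

def run_flow(question: str) -> State:
    state: State = {"question": question}
    node = ENTRY[fake_route(state)]
    visited = []
    while node is not None:
        state.update(NODES[node](state))  # merge the node's partial update
        visited.append(node)
        node = EDGES[node]
    state["path"] = " -> ".join(visited)  # record the route, for illustration only
    return state

print(run_flow("What happened recently with Macom?")["path"])
print(run_flow("What's up?")["path"])
```

A question routed straight to generation never gains a `context` key in this sketch, so a node that reads it has to tolerate its absence.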

In [12]:
def run_agent(query):
    output = local_agent.invoke({"question": query})
    print("=======")
    display(Markdown(output["generation"]))

In [14]:
# Test it out!
run_agent("What's been up with Macom recently?")

Step: Routing Query
Step: Routing Query to Web Search
Step: Optimizing Query for Web Search
Step: Searching the Web for: "Macom recent news updates"
Step: Generating Final Response


Based on the available web search context, there is no recent information about Macom to provide. Further research may be required using different sources or keywords for more updated details regarding Macom's activities and developments.

Let's see whether ChatOllama accepts LangChain message objects and inserts the correct role tags.

In [16]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You are a helpful assistant that always responds with a Shakespearean sonnet."),
    HumanMessage(
        content="What color is the sky at different times of the day?"
    )
]

chat_model_response = model.invoke(messages)
print(chat_model_response)


content="When dawn doth break, and night's dark veil recedes,  \nThe eastern skies blush pink in morning light;  \nA canvas painted by Aurora's deeds,  \nAs stars retreat from Apollo's rising might.\n\nAt noon the azure heavens stretch so wide,  \nWith cotton clouds that drift on gentle breeze;  \nThe sun at zenith in its lofty pride,  \nBestows a warmth upon the earth with ease.\n\nAs day doth wane and shadows start to grow,  \nA tapestry of orange hues takes flight;  \nEvening's approach is marked by amber glow,  \nAnd twilight whispers softly into night.\n\nIn darkness deep, the moon her vigil keeps,  \nWhile stars above in silent splendor peep." response_metadata={'model': 'phi3:14b-medium-4k-instruct-q8_0', 'created_at': '2024-05-23T07:30:45.448378157Z', 'message': {'role': 'assistant', 'content': ''}, 'done': True, 'total_duration': 4263597286, 'load_duration': 1611359, 'prompt_eval_count': 36, 'prompt_eval_duration': 119083000, 'eval_count': 196, 'eval_duration': 4139918000} id=

OK, so that worked correctly. Let's rework the prompts above to use the standard prompt templates.

Let's now make a chatbot from phi3.

In [17]:
from langchain_core.prompts import MessagesPlaceholder
from langchain_community.chat_message_histories import ChatMessageHistory

chat_prompt = ChatPromptTemplate.from_messages([ 
    ('system', 'You are a helpful assistant.' ),
    MessagesPlaceholder(variable_name="chat_history"),
    ('human', '{input}' ),
])

history = ChatMessageHistory()
history.add_user_message("Hi. My name is John. How are you?")

chat_chain = chat_prompt | model | StrOutputParser()
print(chat_chain.invoke({'input':"Do you remember my name?", 'chat_history':history.messages}))




In [22]:
for m in history.messages:
    print(f'{m.type}: {m.content}')

human: Hi. My name is John. How are you?
