# Local Web Research Agent with Llama 3 8B

### [Llama 3 Release](https://llama.meta.com/llama3/)

### [Ollama Llama 3 Model](https://ollama.com/library/llama3)
---

![diagram](local_agent_diagram.png)

---
[Llama 3 Prompt Format](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)

### Special Tokens used with Meta Llama 3
* **<|begin_of_text|>**: This is equivalent to the BOS token.
* **<|eot_id|>**: This signifies the end of the message in a turn.
* **<|start_header_id|>{role}<|end_header_id|>**: These tokens enclose the role for a particular message. The possible roles can be: system, user, assistant.
* **<|end_of_text|>**: This is equivalent to the EOS token. On generating this token, Llama 3 will cease to generate more tokens.

A prompt should contain a single system message, can contain multiple alternating user and assistant messages, and always ends with the last user message followed by the assistant header.
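
The rule above can be sketched as a small string builder. This is only an illustration, not part of the notebook's pipeline: `build_llama3_prompt` and `header` are hypothetical helper names, and the blank line after each header follows Meta's published template rather than anything this notebook enforces.

```python
from typing import List, Tuple

BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def header(role: str) -> str:
    """Wrap a role name in the header tokens, followed by a blank line."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_llama3_prompt(system: str, turns: List[Tuple[str, str]]) -> str:
    """Assemble one system message, the given turns, then the assistant header.

    `turns` is a list of (role, text) pairs; the last one must be a user message.
    """
    if not turns or turns[-1][0] != "user":
        raise ValueError("the last turn must come from the user")
    parts = [BOS, header("system"), system.strip(), EOT]
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role}")
        parts += [header(role), text.strip(), EOT]
    # Ending on the assistant header cues the model to write the next reply.
    parts.append(header("assistant"))
    return "".join(parts)


prompt = build_llama3_prompt(
    "You are a helpful assistant.",
    [("user", "What is LangGraph?")],
)
print(prompt)
```

The prompt templates later in this notebook are written out by hand instead, which keeps each one readable in a single cell.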

In [20]:
!pip install --quiet feedparser langchain_community langgraph duckduckgo-search


In [21]:
# Displaying final output format
from IPython.display import display, Markdown
import os
from urllib.parse import urlencode

import feedparser

# LangChain Dependencies
# Imported from langchain_core, which langchain_community installs as a dependency
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import BaseTool
from langchain_core.output_parsers import JsonOutputParser, StrOutputParser
from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from langgraph.graph import END, StateGraph
# For State Graph
from typing_extensions import TypedDict


In [22]:
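# Environment Variables
# LangSmith tracing needs an API key; without one every run logs a 401 error,
# so tracing is only switched on when a key is already present in the environment.
if os.environ.get("LANGCHAIN_API_KEY") or os.environ.get("LANGSMITH_API_KEY"):
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_PROJECT"] = "L3 Research Agent"
    os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
else:
    os.environ["LANGCHAIN_TRACING_V2"] = "false"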

In [23]:
# Defining LLM
# The tag must already be pulled into the local Ollama instance (`ollama pull llama3.2`)
local_llm = 'llama3.2'
llama3 = ChatOllama(model=local_llm, temperature=0)
llama3_json = ChatOllama(model=local_llm, format='json', temperature=0)

In [24]:
# Web Search Tool

wrapper = DuckDuckGoSearchAPIWrapper(max_results=25)
web_search_tool = DuckDuckGoSearchRun(api_wrapper=wrapper)

# Test Run
# resp = web_search_tool.invoke("home depot news")
# resp

In [25]:
def arxiv_search(query: str, max_results: int = 5):
    """
    Query arXiv’s API and return the top N most recent papers.
    """
    base_url = "http://export.arxiv.org/api/query"
    params = {
        "search_query": query,
        "start":        0,
        "max_results":  max_results,
        "sortBy":       "submittedDate",
        "sortOrder":    "descending",
    }
    url = f"{base_url}?{urlencode(params)}"
    feed = feedparser.parse(url)

    results = []
    for entry in feed.entries:
        results.append({
            "title": entry.title,
            "summary": entry.summary,
            "authors": [a.name for a in entry.authors],
            "published": entry.published,
            # .get() avoids an AttributeError on links that carry no "type" field
            "pdf_url": next((l.href for l in entry.links if l.get("type") == "application/pdf"), None)
        })
    return results

class ArxivAPIWrapper(BaseTool):
    name: str = "research-search"
    description: str = "Use this to fetch the latest arXiv papers on a topic."

    def _run(self, query: str):
        papers = arxiv_search(query, max_results=5)
        return "\n\n".join(
            f"{p['title']} ({p['published']})\n"
            f"{p['summary'][:300]}…\nPDF: {p['pdf_url']}"
            for p in papers
        )

    async def _arun(self, query: str):
        return self._run(query)


# instantiate
research_tool = ArxivAPIWrapper()

tools = [
    web_search_tool,
    research_tool,        
]
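
The request that `arxiv_search` sends can be inspected without touching the network. The sketch below is an assumption-laden variant, not the notebook's code: `build_arxiv_url` is a hypothetical helper, and it adds a field prefix (`all`, `ti`, `au`, `abs`, `cat`) and quotes the terms as a phrase, which narrows results compared with the raw query string used above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_url(terms: str, field: str = "all", max_results: int = 5) -> str:
    """Return the request URL for a field-prefixed phrase query, newest first."""
    params = {
        # Double quotes group the words into a single phrase for the given field.
        "search_query": f'{field}:"{terms}"',
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_arxiv_url("graph neural networks", field="ti", max_results=3)
print(url)

# Decode the query string again to check what the server would receive.
decoded = parse_qs(urlparse(url).query)
print(decoded["search_query"][0])
```

Whether a title-only phrase match is preferable to the broader default depends on the questions the agent is expected to handle, so treat the prefix as a tuning knob.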


In [26]:
# Generation Prompt

generate_prompt = PromptTemplate(
    template="""
    
    <|begin_of_text|>
    
    <|start_header_id|>system<|end_header_id|> 
    
    You are an AI assistant for research question tasks that synthesizes web search results.
    Strictly use the following pieces of web search context to answer the question. If you don't know the answer, just say that you don't know.
    Keep the answer concise, but provide all of the details you can in the form of a research report.
    Only make direct references to material if it is provided in the context.
    
    <|eot_id|>
    
    <|start_header_id|>user<|end_header_id|>
    
    Question: {question} 
    Web Search Context: {context} 
    Answer: 
    
    <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>""",
    input_variables=["question", "context"],
)

# Chain
generate_chain = generate_prompt | llama3 | StrOutputParser()

# Test Run
# question = "How are you?"
# context = ""
# generation = generate_chain.invoke({"context": context, "question": question})
# print(generation)


In [27]:
# Router

router_prompt = PromptTemplate(
    template="""
    
    <|begin_of_text|>
    <|start_header_id|>system<|end_header_id|>
    You are an expert at routing a user question into one of three categories:
    1. RESEARCH – user needs the latest peer-reviewed papers (scholarly queries).
    2. NEWS     – user is asking about current events or recent news.
    3. GENERAL  – user should get a direct answer from the LLM without external lookups.

    Return JSON with a single key "choice" whose value is exactly one of RESEARCH, NEWS, or GENERAL (no extra text).

    Question to route: {question}
    <|eot_id|>
    <|start_header_id|>assistant<|end_header_id|>
    
    """,
    input_variables=["question"],
)

# Chain
question_router = router_prompt | llama3_json | JsonOutputParser()

# Test Run
question = "What's up?"
print(question_router.invoke({"question": question}))

{'choice': 'GENERAL'}
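
A small model in JSON mode can still return a differently cased value, stray whitespace, or an unexpected key. A minimal sketch of a guard, assuming the parser hands back a plain dict as in the output above; `normalize_choice` and `VALID_ROUTES` are hypothetical names, not part of the notebook.

```python
VALID_ROUTES = {"RESEARCH", "NEWS", "GENERAL"}


def normalize_choice(parsed, default: str = "GENERAL") -> str:
    """Map raw router output onto one of the three tags, falling back to `default`."""
    if not isinstance(parsed, dict):
        return default
    choice = str(parsed.get("choice", "")).strip().upper()
    return choice if choice in VALID_ROUTES else default


print(normalize_choice({"choice": "news "}))       # lower case plus trailing space
print(normalize_choice({"category": "RESEARCH"}))  # wrong key
print(normalize_choice(None))                      # parser returned nothing usable
```

Falling back to `GENERAL` mirrors the final `else` branch of the routing function defined later in the notebook.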


In [28]:
# Query Transformation

query_prompt = PromptTemplate(
    template="""
    
    <|begin_of_text|>
    
    <|start_header_id|>system<|end_header_id|> 
    
    You are an expert at crafting web search queries for research questions.
    Users often ask a basic question about a topic they want to learn more about, but it may not be phrased well for a search engine.
    Reword their query to be the most effective web search string possible.
    Return JSON with a single key 'query' and no preamble or explanation.
    
    Question to transform: {question} 
    
    <|eot_id|>
    
    <|start_header_id|>assistant<|end_header_id|>
    
    """,
    input_variables=["question"],
)

# Chain
query_chain = query_prompt | llama3_json | JsonOutputParser()

# Test Run
question = "What's happened recently with Macom?"
print(query_chain.invoke({"question": question}))



{'query': 'Macom recent news'}


In [29]:
# Graph State
class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: the user's original question
        generation: the LLM's final answer
        search_query: the question rewritten for web search
        context: web search or arXiv results passed to the generator
    """
    question : str
    generation : str
    search_query : str
    context : str

# Node - Generate

def generate(state):
    """
    Generate answer

    Args:
        state (dict): The current graph state

    Returns:
        dict: New key added to state, generation, that contains LLM generation
    """
    print("Step: Generating Final Response")
    question = state["question"]
    # Safely grab context (empty if none)
    context = state.get("context", "")

    # Answer Generation
    generation = generate_chain.invoke({"context": context, "question": question})
    return {"generation": generation}


# Node - Query Transformation

def transform_query(state):
    """
    Transform user question to web search

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Appended search query
    """
    
    print("Step: Optimizing Query for Web Search")
    question = state['question']
    gen_query = query_chain.invoke({"question": question})
    search_query = gen_query["query"]
    return {"search_query": search_query}


# Node - Web Search

def web_search(state):
    """
    Web search based on the question

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Appended web results to context
    """

    search_query = state['search_query']
    print(f'Step: Searching the Web for: "{search_query}"')
    
    # Web search tool call
    search_result = web_search_tool.invoke(search_query)
    return {"context": search_result}

def research_node(state):
    """
    Call the ArXiv tool and stash results in state['context'].
    """
    print("Step: Running Research Tool")
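    # GraphState has no "query" key; use the rewritten search query if one exists, else the question
    q = state.get("search_query") or state["question"]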
    papers = research_tool.invoke(q)
    # research_tool.invoke returns a single string; if it were a list you'd join it
    return {"context": papers}



# Conditional Edge, Routing

def route_question(state):
    """
    Route question to one of the tags: RESEARCH, NEWS, or GENERAL.
    """
    print("Step: Routing Query")
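    # Routing is the entry point, so only "question" is set in the state at this stage
    q = state["question"]
    # .get() avoids a KeyError if the model omits "choice"; unknown values fall through to GENERAL
    choice = question_router.invoke({"question": q}).get("choice", "GENERAL")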

    # Just return the tag
    if choice == "RESEARCH":
        print("→ Tag: RESEARCH")
        return "RESEARCH"
    elif choice == "NEWS":
        print("→ Tag: NEWS")
        return "NEWS"
    else:
        print("→ Tag: GENERAL")
        return "GENERAL"



In [30]:
# Build the nodes
workflow = StateGraph(GraphState)
workflow.add_node("websearch",       web_search)
workflow.add_node("transform_query", transform_query)
workflow.add_node("research",        research_node)     
workflow.add_node("generate",        generate)

# Build the edges
workflow.set_conditional_entry_point(
    route_question,
    {
        "RESEARCH": "research",       
        "NEWS":     "transform_query",  
        "GENERAL":  "generate",       
    },
)
workflow.add_edge("transform_query", "websearch")
workflow.add_edge("websearch",        "generate")
workflow.add_edge("research",         "generate")      
workflow.add_edge("generate",         END)

# Compile the workflow
local_agent = workflow.compile()
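
To see which nodes each routing tag visits without starting Ollama, the wiring above can be mimicked in plain Python. This is a dependency-free sketch, not LangGraph itself: `trace`, `EDGES`, `ENTRY`, and the stub nodes are hypothetical stand-ins, and the dict merge assumes the default behaviour where a node's returned keys overwrite the matching keys in the state.

```python
from typing import Callable, Dict, List, Tuple

# Static edges copied from the workflow definition; "END" marks termination.
EDGES: Dict[str, str] = {
    "transform_query": "websearch",
    "websearch": "generate",
    "research": "generate",
    "generate": "END",
}
# Conditional entry point: routing tag -> first node.
ENTRY: Dict[str, str] = {
    "RESEARCH": "research",
    "NEWS": "transform_query",
    "GENERAL": "generate",
}


def trace(tag: str, nodes: Dict[str, Callable[[dict], dict]], state: dict) -> Tuple[List[str], dict]:
    """Walk the graph from the entry node for `tag`, merging each node's partial update."""
    path = []
    current = ENTRY[tag]
    while current != "END":
        path.append(current)
        state = {**state, **nodes[current](state)}  # returned keys overwrite existing ones
        current = EDGES[current]
    return path, state


# Stub nodes standing in for the real LLM and search calls.
stubs = {
    "transform_query": lambda s: {"search_query": s["question"].rstrip("?")},
    "websearch": lambda s: {"context": f"results for {s['search_query']}"},
    "research": lambda s: {"context": f"papers on {s['question']}"},
    "generate": lambda s: {"generation": f"answer using: {s.get('context', '')}"},
}

path, final = trace("NEWS", stubs, {"question": "What happened today?"})
print(path)
print(final["generation"])
```

The `GENERAL` route reaches `generate` with no `context` key at all, which is why the real `generate` node reads it with `state.get("context", "")`.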


In [31]:
def run_agent(query):
    output = local_agent.invoke({"question": query})
    print("=======")
    display(Markdown(output["generation"]))

In [None]:
# Test it out!
# run_agent("What's up with lope recently?")
#run_agent("How are you?")
run_agent("Find me the latest arXiv papers on graph neural networks?")

Step: Routing Query


→ Tag: RESEARCH
Step: Running Research Tool
Step: Generating Final Response

