## Welcome to Week 4, Day 4

This is the start of an AWESOME project! Really simple and very effective.

### First - a heads up for Windows PC users

While executing this notebook, you might hit a problem with the Playwright browser raising a NotImplementedError.

This should work once we move to Python modules, but it can cause problems on Windows in a notebook.

If you hit this error and would like to run the notebook, you need to make a small change, which seems quite hacky!

1. Right-click on `.venv` in the File Explorer on the left and select "Find in folder"
2. Search for `asyncio.set_event_loop_policy(WindowsSelectorEventLoopPolicy())`  
3. That line should appear in a file called `kernelapp.py`
4. Comment out that line in that file, then save the file. (In fact, student William Lapa tells me that he needed to comment out the entire else statement that this line is part of.)
5. Restart the kernel by pressing the "Restart" button above

Thank you to student Nicolas for finding this, and to Yaki, Zibin and Bhaskar for confirming that this worked for them!
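
To check which kind of event loop your environment creates, a small diagnostic like the one below may help. It is only a sketch: the helper name `default_loop_name` is made up for illustration. The background, as documented for asyncio, is that the selector-based event loop on Windows does not support subprocesses, and launching a browser requires one; that is the likely source of the NotImplementedError.

```python
import asyncio
import sys


def default_loop_name() -> str:
    """Return the class name of the event loop the current policy would create."""
    loop = asyncio.new_event_loop()
    try:
        return type(loop).__name__
    finally:
        # Always close the throwaway loop so no resources are leaked
        loop.close()


loop_name = default_loop_name()
print(f"platform={sys.platform}, event loop={loop_name}")
```

On Windows, a name containing "Selector" suggests the workaround above is needed, while a name containing "Proactor" suggests subprocesses should work.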

In [None]:
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from dotenv import load_dotenv
from IPython.display import Image, display
import gradio as gr
from langgraph.prebuilt import ToolNode, tools_condition
import requests
import os
from langchain.agents import Tool

from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from langgraph.checkpoint.memory import MemorySaver

In [None]:
load_dotenv(override=True)

### Asynchronous LangGraph

To run a tool:  
Sync: `tool.run(inputs)`  
Async: `await tool.arun(inputs)`

To invoke the graph:  
Sync: `graph.invoke(state)`  
Async: `await graph.ainvoke(state)`
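
The sync/async pairing above can be illustrated without any LangChain dependency. The sketch below uses a hypothetical `EchoTool` class (not a LangChain class) that exposes `run` and `arun` in the same spirit, using only the standard library.

```python
import asyncio


class EchoTool:
    """Hypothetical stand-in for a tool with sync and async entry points."""

    def run(self, inputs: dict) -> str:
        return f"visited {inputs['url']}"

    async def arun(self, inputs: dict) -> str:
        # Yield control to the event loop, as real network I/O would
        await asyncio.sleep(0)
        return self.run(inputs)


async def main() -> list:
    tool = EchoTool()
    sync_result = tool.run({"url": "https://example.com"})
    async_result = await tool.arun({"url": "https://example.com"})
    # Async calls can also be scheduled concurrently
    both = await asyncio.gather(
        tool.arun({"url": "https://example.com/a"}),
        tool.arun({"url": "https://example.com/b"}),
    )
    return [sync_result, async_result, *both]


results = asyncio.run(main())
print(results)
```

In a script, `asyncio.run(main())` starts the event loop. In a notebook cell, where a loop is already running, you would typically write `await main()` instead.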

In [None]:
class State(TypedDict):
    
    messages: Annotated[list, add_messages]


graph_builder = StateGraph(State)

In [None]:
pushover_token = os.getenv("PUSHOVER_TOKEN")
pushover_user = os.getenv("PUSHOVER_USER")
pushover_url = "https://api.pushover.net/1/messages.json"

def push(text: str):
    """Send a push notification to the user"""
    requests.post(pushover_url, data = {"token": pushover_token, "user": pushover_user, "message": text})

tool_push = Tool(
        name="send_push_notification",
        func=push,
        description="useful for when you want to send a push notification"
    )

### Next: Install Playwright

On Windows and macOS:  
`playwright install`

On Linux:  
`playwright install --with-deps chromium`

### Introducing nest_asyncio

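By default, Python's asyncio allows only one running "event loop" per thread to process asynchronous events.

The `nest_asyncio` library patches asyncio to lift this restriction. It is meant for special situations, such as a notebook that already runs its own event loop, where you need to run a nested event loop.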
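
To see the restriction itself, the sketch below tries to start a second event loop from inside a running one, using only the standard library and no `nest_asyncio`. The function names `inner` and `outer` are made up for illustration.

```python
import asyncio


async def inner() -> str:
    return "done"


async def outer() -> str:
    coro = inner()
    try:
        # Starting a new event loop while one is already running is refused
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"
    return "no error"


result = asyncio.run(outer())
print(result)
```

This is the situation `nest_asyncio.apply()` is designed to work around: once the patch is applied, nested calls like the one in `outer` are permitted.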



In [None]:
import nest_asyncio
nest_asyncio.apply()

### The LangChain community

One of the remarkable things about LangChain is the rich community around it.

Check this out:


In [None]:
from langchain_community.agent_toolkits import PlayWrightBrowserToolkit
from langchain_community.tools.playwright.utils import create_async_playwright_browser

# If you get a NotImplementedError here or later, see the Heads Up at the top of the notebook

async_browser = create_async_playwright_browser(headless=False)  # headful mode
toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=async_browser)
tools = toolkit.get_tools()

In [None]:
for tool in tools:
    print(f"{tool.name}={tool}")

In [None]:
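# Index the tools by name so individual tools can be looked up directly
tool_dict = {tool.name: tool for tool in tools}

navigate_tool = tool_dict.get("navigate_browser")
extract_text_tool = tool_dict.get("extract_text")

# Top-level await works in a notebook because a loop is already running
await navigate_tool.arun({"url": "https://www.cnn.com"})
text = await extract_text_tool.arun({})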

In [None]:
import textwrap
print(textwrap.fill(text))

In [None]:
all_tools = tools + [tool_push]

In [None]:
# The model must support tool calling, otherwise calls made through bind_tools fail
llm = ChatOllama(
    model="gpt-oss:120b",  # or another model you have pulled locally
    temperature=0,
    # other params...
)

llm_with_tools = llm.bind_tools(all_tools)


def chatbot(state: State):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}


In [None]:

graph_builder = StateGraph(State)
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_node("tools", ToolNode(tools=all_tools))
graph_builder.add_conditional_edges("chatbot", tools_condition, "tools")
graph_builder.add_edge("tools", "chatbot")
graph_builder.add_edge(START, "chatbot")

memory = MemorySaver()
graph = graph_builder.compile(checkpointer=memory)
display(Image(graph.get_graph().draw_mermaid_png()))

In [None]:
config = {"configurable": {"thread_id": "10"}}

async def chat(user_input: str, history):
    result = await graph.ainvoke({"messages": [{"role": "user", "content": user_input}]}, config=config)
    return result["messages"][-1].content


gr.ChatInterface(chat, type="messages").launch()

In [18]:
import asyncio
from langchain_ollama import ChatOllama
from langchain.output_parsers import OutputFixingParser
from langchain.output_parsers.json import SimpleJsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

# ----------------------------
# 1. Define LLM (Ollama model)
# ----------------------------
# Make sure you have `ollama serve` running and model pulled (e.g. `ollama pull llama3`)
llm = ChatOllama(
    model="gpt-oss:120b",   # change to the model you have locally, like "mistral", "llama3.1", etc.
    temperature=0
)

# ----------------------------
# 2. Define schema (expected JSON fields)
# ----------------------------
schema = {
    "score": "An integer score between 1 and 10",
    "reasoning": "A short explanation for the score"
}

# ----------------------------
# 3. Prompt Template
# ----------------------------
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an evaluator.
Return ONLY valid JSON with two fields:
- score: an integer from 1 (very poor answer) to 10 (perfectly correct answer).
- reasoning: a short explanation of why you gave that score.
Do not include extra text."""),
    ("user", "Evaluate the following answer:\n\n{answer}")
])


# ----------------------------
# 4. Parser with auto-fix
# ----------------------------
parser = OutputFixingParser.from_llm(
    llm=llm,
    parser=SimpleJsonOutputParser()
)

# ----------------------------
# 5. Full pipeline (prompt -> llm -> parser)
# ----------------------------
chain = prompt | llm | parser

# ----------------------------
# 6. Run evaluator
# ----------------------------
async def main():
    test_answer = "The capital of France is Paris."
    result = await chain.ainvoke({"answer": test_answer})
    print("EVALUATOR RESULT:", result)

if __name__ == "__main__":
    asyncio.run(main())


EVALUATOR RESULT: {'score': 10, 'reasoning': 'The answer correctly states that Paris is the capital of France, which is accurate and complete.'}


In [23]:
from typing import Annotated, TypedDict, List, Dict, Any, Optional
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_community.agent_toolkits import PlayWrightBrowserToolkit
from langchain_community.tools.playwright.utils import create_async_playwright_browser
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import ToolNode
from langgraph.graph.message import add_messages
from pydantic import BaseModel, Field
import gradio as gr
import uuid
import nest_asyncio
import re
import json
from dotenv import load_dotenv

# -------------------------------
# Load environment
# -------------------------------
load_dotenv(override=True)
nest_asyncio.apply()

# -------------------------------
# Structured evaluator output
# -------------------------------
class EvaluatorOutput(BaseModel):
    feedback: str = Field(description="Feedback on the assistant's response")
    success_criteria_met: bool = Field(description="Whether the success criteria have been met")
    user_input_needed: bool = Field(description="True if more input is needed from the user")

# -------------------------------
# State definition
# -------------------------------
class State(TypedDict):
    messages: Annotated[List[Any], add_messages]
    success_criteria: str
    feedback_on_work: Optional[str]
    success_criteria_met: bool
    user_input_needed: bool

# -------------------------------
# Playwright & tools setup
# -------------------------------
async_browser = create_async_playwright_browser(headless=False)
toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=async_browser)
tools = toolkit.get_tools()

# -------------------------------
# Ollama LLM setup
# -------------------------------
from langchain_ollama import ChatOllama

worker_llm = ChatOllama(model="gpt-oss:120b", temperature=0)
worker_llm_with_tools = worker_llm.bind_tools(tools)

evaluator_llm = ChatOllama(model="gpt-oss:120b", temperature=0)

# -------------------------------
# Safe evaluator wrapper
# -------------------------------
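def extract_json_safe(text: str) -> Dict[str, Any]:
    """Extract a JSON object from LLM output, or fall back to a default."""
    try:
        # Greedy match from the first "{" to the last "}" so nested objects survive
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match:
            return json.loads(match.group(0))
    except (json.JSONDecodeError, TypeError):
        # Malformed JSON or non-string content: fall through to the default
        pass
    return {
        "feedback": "Evaluator could not parse LLM output. Needs user input.",
        "success_criteria_met": False,
        "user_input_needed": True
    }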

def safe_evaluator_invoke(messages: List[Any]) -> EvaluatorOutput:
    """Call evaluator LLM and safely parse JSON"""
    system_prompt = (
        "You are an evaluator. ONLY output valid JSON with keys: "
        "feedback (string), success_criteria_met (bool), user_input_needed (bool). "
        "Do NOT include any extra text or markdown."
    )
    evaluator_messages = [SystemMessage(content=system_prompt)] + messages
    raw_output = evaluator_llm.invoke(evaluator_messages)
    data = extract_json_safe(raw_output.content)
    return EvaluatorOutput(**data)

# -------------------------------
# Worker node
# -------------------------------
def worker(state: State) -> Dict[str, Any]:
    system_msg = f"You are a helpful assistant. Complete the task or ask the user a question.\nSuccess criteria:\n{state['success_criteria']}"
    if state.get("feedback_on_work"):
        system_msg += f"\nFeedback from previous attempt:\n{state['feedback_on_work']}"
    messages = [SystemMessage(content=system_msg)] + state["messages"]
    response = worker_llm_with_tools.invoke(messages)
    return {"messages": [response]}

def worker_router(state: State) -> str:
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return "evaluator"

# -------------------------------
# Evaluator node
# -------------------------------
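def evaluator(state: State) -> Dict[str, Any]:
    last_response = state["messages"][-1].content
    # Pass the success criteria too, so the evaluator has something to judge against
    user_msg = HumanMessage(
        content=f"Success criteria:\n{state['success_criteria']}\n\nAssistant's last response:\n{last_response}"
    )
    eval_result = safe_evaluator_invoke([user_msg])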
    return {
        "messages": [AIMessage(content=f"Evaluator Feedback: {eval_result.feedback}")],
        "feedback_on_work": eval_result.feedback,
        "success_criteria_met": eval_result.success_criteria_met,
        "user_input_needed": eval_result.user_input_needed,
    }

def route_based_on_evaluation(state: State) -> str:
    if state["success_criteria_met"] or state["user_input_needed"]:
        return "END"
    return "worker"

# -------------------------------
# Graph setup
# -------------------------------
graph_builder = StateGraph(State)
graph_builder.add_node("worker", worker)
graph_builder.add_node("tools", ToolNode(tools=tools))
graph_builder.add_node("evaluator", evaluator)

graph_builder.add_conditional_edges("worker", worker_router, {"tools": "tools", "evaluator": "evaluator"})
graph_builder.add_edge("tools", "worker")
graph_builder.add_conditional_edges("evaluator", route_based_on_evaluation, {"worker": "worker", "END": END})
graph_builder.add_edge(START, "worker")

memory = MemorySaver()
graph = graph_builder.compile(checkpointer=memory)

# -------------------------------
# Gradio helpers
# -------------------------------
def make_thread_id() -> str:
    return str(uuid.uuid4())

async def process_message(message, success_criteria, history, thread):
    state = {
        "messages": [HumanMessage(content=message)],
        "success_criteria": success_criteria,
        "feedback_on_work": None,
        "success_criteria_met": False,
        "user_input_needed": False,
    }
    result = await graph.ainvoke(state, config={"configurable": {"thread_id": thread}})
    user = {"role": "user", "content": message}
    reply = {"role": "assistant", "content": result["messages"][-2].content}
    feedback = {"role": "assistant", "content": result["messages"][-1].content}
    return history + [user, reply, feedback]

async def reset():
    return "", "", None, make_thread_id()

# -------------------------------
# Gradio UI
# -------------------------------
with gr.Blocks(theme=gr.themes.Default(primary_hue="emerald")) as demo:
    gr.Markdown("## Sidekick Personal Co-worker")
    thread = gr.State(make_thread_id())

    with gr.Row():
        chatbot = gr.Chatbot(label="Sidekick", height=300, type="messages")
    with gr.Group():
        with gr.Row():
            message = gr.Textbox(show_label=False, placeholder="Your request to your sidekick")
        with gr.Row():
            success_criteria = gr.Textbox(show_label=False, placeholder="What are your success criteria?")
    with gr.Row():
        reset_button = gr.Button("Reset", variant="stop")
        go_button = gr.Button("Go!", variant="primary")

    message.submit(process_message, [message, success_criteria, chatbot, thread], [chatbot])
    success_criteria.submit(process_message, [message, success_criteria, chatbot, thread], [chatbot])
    go_button.click(process_message, [message, success_criteria, chatbot, thread], [chatbot])
    reset_button.click(reset, [], [message, success_criteria, chatbot, thread])

demo.launch()


* Running on local URL:  http://127.0.0.1:7866
* To create a public link, set `share=True` in `launch()`.




In [None]:
from typing import Annotated, List, Dict, Any, Optional
from typing_extensions import TypedDict
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_community.agent_toolkits import PlayWrightBrowserToolkit
from langchain_community.tools.playwright.utils import create_async_playwright_browser
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import ToolNode
from langgraph.graph.message import add_messages
from pydantic import BaseModel, Field
import gradio as gr
import uuid
import nest_asyncio
from dotenv import load_dotenv

# Load environment variables
load_dotenv(override=True)
nest_asyncio.apply()

# -------------------------------
# Structured output for evaluator
# -------------------------------
class EvaluatorOutput(BaseModel):
    feedback: str = Field(description="Feedback on the assistant's response")
    success_criteria_met: bool = Field(description="Whether the success criteria have been met")
    user_input_needed: bool = Field(description="True if more input is needed from the user, or clarifications")

# -------------------------------
# Define state
# -------------------------------
class State(TypedDict):
    messages: Annotated[List[Any], add_messages]
    success_criteria: str
    feedback_on_work: Optional[str]
    success_criteria_met: bool
    user_input_needed: bool

# -------------------------------
# Setup Playwright tools
# -------------------------------
async_browser = create_async_playwright_browser(headless=False)
toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=async_browser)
tools = toolkit.get_tools()

# -------------------------------
# Setup Ollama LLMs
# -------------------------------
from langchain_ollama import ChatOllama

worker_llm = ChatOllama(model="llama3.3", temperature=0)
worker_llm_with_tools = worker_llm.bind_tools(tools)

evaluator_llm = ChatOllama(model="llama3.3", temperature=0)
evaluator_llm_with_output = evaluator_llm.with_structured_output(EvaluatorOutput)

# -------------------------------
# Worker node
# -------------------------------
def worker(state: State) -> Dict[str, Any]:
    system_message = f"""You are a helpful assistant that can use tools to complete tasks.
You keep working on a task until either you have a question or clarification for the user, or the success criteria is met.
This is the success criteria:
{state['success_criteria']}
You should reply either with a question for the user or with the final answer.
"""

    if state.get("feedback_on_work"):
        system_message += f"""
Previously your response was rejected because the success criteria was not met:
{state['feedback_on_work']}
Please continue, ensuring that you meet the success criteria or ask a question for the user.
"""

    # Update or insert system message
    messages = state["messages"]
    found_system = False
    for i, msg in enumerate(messages):
        if isinstance(msg, SystemMessage):
            messages[i].content = system_message
            found_system = True
            break
    if not found_system:
        messages = [SystemMessage(content=system_message)] + messages

    # Invoke LLM with tools safely
    try:
        response = worker_llm_with_tools.invoke(messages)
    except Exception as e:
        print(f"Worker LLM failed: {e}")
        response = AIMessage(content="I'm unable to process the request. Can you clarify?")

    return {"messages": [response]}

# -------------------------------
# Worker router
# -------------------------------
def worker_router(state: State) -> str:
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return "evaluator"

# -------------------------------
# Format conversation helper
# -------------------------------
def format_conversation(messages: List[Any]) -> str:
    conversation = "Conversation history:\n\n"
    for msg in messages:
        if isinstance(msg, HumanMessage):
            conversation += f"User: {msg.content}\n"
        elif isinstance(msg, AIMessage):
            conversation += f"Assistant: {msg.content or '[Tool used]'}\n"
    return conversation

# -------------------------------
# Evaluator node (safe)
# -------------------------------
def evaluator(state: State) -> Dict[str, Any]:  # returns a partial state update
    last_response = state["messages"][-1].content

    system_msg = "You are an evaluator that determines if the assistant's task is completed successfully."
    user_msg = f"""Conversation:
{format_conversation(state['messages'])}

Success criteria:
{state['success_criteria']}

Assistant's last response:
{last_response}
"""

    if state.get("feedback_on_work"):
        user_msg += f"\nNote: Prior feedback was: {state['feedback_on_work']}"

    messages = [SystemMessage(content=system_msg), HumanMessage(content=user_msg)]

    try:
        eval_result = evaluator_llm_with_output.invoke(messages)
        feedback = eval_result.feedback
        success_met = eval_result.success_criteria_met
        user_needed = eval_result.user_input_needed
    except Exception as e:
        print(f"Evaluator LLM failed: {e}")
        feedback = "Evaluator could not parse output."
        success_met = False
        user_needed = True

    return {
        "messages": [AIMessage(content=f"Evaluator Feedback: {feedback}")],
        "feedback_on_work": feedback,
        "success_criteria_met": success_met,
        "user_input_needed": user_needed
    }

# -------------------------------
# Route evaluator based on result
# -------------------------------
def route_based_on_evaluation(state: State) -> str:
    if state["success_criteria_met"] or state["user_input_needed"]:
        return "END"
    return "worker"

# -------------------------------
# Build graph
# -------------------------------
graph_builder = StateGraph(State)
graph_builder.add_node("worker", worker)
graph_builder.add_node("tools", ToolNode(tools=tools))
graph_builder.add_node("evaluator", evaluator)

graph_builder.add_conditional_edges("worker", worker_router, {"tools": "tools", "evaluator": "evaluator"})
graph_builder.add_edge("tools", "worker")
graph_builder.add_conditional_edges("evaluator", route_based_on_evaluation, {"worker": "worker", "END": END})
graph_builder.add_edge(START, "worker")

memory = MemorySaver()
graph = graph_builder.compile(checkpointer=memory)

# -------------------------------
# Helpers for Gradio
# -------------------------------
def make_thread_id() -> str:
    return str(uuid.uuid4())

async def process_message(message, success_criteria, history, thread):
    config = {"configurable": {"thread_id": thread}}
    state = {
        "messages": [HumanMessage(content=message)],
        "success_criteria": success_criteria,
        "feedback_on_work": None,
        "success_criteria_met": False,
        "user_input_needed": False,
    }

    result = await graph.ainvoke(state, config=config)
    user_msg = {"role": "user", "content": message}
    reply_msg = {"role": "assistant", "content": result["messages"][-2].content}
    feedback_msg = {"role": "assistant", "content": result["messages"][-1].content}

    return history + [user_msg, reply_msg, feedback_msg]

async def reset():
    return "", "", None, make_thread_id()

# -------------------------------
# Gradio UI
# -------------------------------
with gr.Blocks(theme=gr.themes.Default(primary_hue="emerald")) as demo:
    gr.Markdown("## Sidekick Personal Co-worker")
    thread = gr.State(make_thread_id())

    with gr.Row():
        chatbot = gr.Chatbot(label="Sidekick", height=300, type="messages")
    with gr.Group():
        with gr.Row():
            message = gr.Textbox(show_label=False, placeholder="Your request to your sidekick")
        with gr.Row():
            success_criteria = gr.Textbox(show_label=False, placeholder="What are your success criteria?")
    with gr.Row():
        reset_button = gr.Button("Reset", variant="stop")
        go_button = gr.Button("Go!", variant="primary")

    message.submit(process_message, [message, success_criteria, chatbot, thread], [chatbot])
    success_criteria.submit(process_message, [message, success_criteria, chatbot, thread], [chatbot])
    go_button.click(process_message, [message, success_criteria, chatbot, thread], [chatbot])
    reset_button.click(reset, [], [message, success_criteria, chatbot, thread])

demo.launch()


* Running on local URL:  http://127.0.0.1:7868
* To create a public link, set `share=True` in `launch()`.




Worker LLM failed: registry.ollama.ai/library/deepseek-r1:70b does not support tools (status code: 400)
Worker LLM failed: registry.ollama.ai/library/deepseek-r1:70b does not support tools (status code: 400)
Worker LLM failed: registry.ollama.ai/library/deepseek-r1:70b does not support tools (status code: 400)
Worker LLM failed: registry.ollama.ai/library/deepseek-r1:70b does not support tools (status code: 400)
