<p> <center> <a href="../start_here.ipynb.ipynb">Home Page</a> </center> </p>

<div>
    <span style="float: left; width: 33%; text-align: left;"><a href="03_low_level_mcp.ipynb">Previous Notebook</a></span>
    <span style="float: left; width: 34%; text-align: center;">
        <a href="01_inference_endpoint.ipynb">1</a>
        <a href="02_introduction_mcp.ipynb">2</a>
        <a href="03_low_level_mcp.ipynb">3</a>
        <a >4</a>
        <a href="05_challenge.ipynb">5</a>
    </span>
    <span style="float: left; width: 33%; text-align: right;"><a href="05_challenge.ipynb">Next Notebook</a></span>
</div>

## Learning objectives

By the end of this notebook, you will be able to:
- Define LangGraph State schemas and build chatbot workflows using StateGraph, nodes, and edges
- Connect NVIDIA NIM endpoints as the LLM backend using the `nvidia` model provider
- Stream graph responses using `graph.stream()` for real-time output
- Implement structured output with Pydantic models for parseable LLM responses

## Setup Environment 

In the first notebook, we generated an NVIDIA API key. This notebook requires that key in the `NVIDIA_API_KEY` environment variable so that it can authenticate with NVIDIA NIM. The cell below also copies the key to `NGC_API_KEY`, which is used when pulling the NIM Docker images of your choice. If you don't have a key yet, visit the NVIDIA NIM API [homepage](https://build.nvidia.com/explore/discover) and generate one. Run the cell below, paste your key into the prompt, and press Enter.

In [None]:
import os
import getpass

if not os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    nvapi_key = getpass.getpass("Enter your NVIDIA API key: ")
    assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key
    os.environ["NGC_API_KEY"] = nvapi_key
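
In scripts or automated runs there is no prompt to type into, so it can help to check for the key without calling `getpass`. The following is a minimal sketch, assuming the same `nvapi-` prefix convention used in the cell above; the helper name `has_nvidia_key` is hypothetical and not part of any library.

```python
import os
from typing import Mapping


def has_nvidia_key(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when NVIDIA_API_KEY is set and starts with 'nvapi-'.

    Hypothetical helper: it only checks the prefix, not whether the key works.
    """
    return env.get("NVIDIA_API_KEY", "").startswith("nvapi-")


# Explicit mappings are used here so that no real secret is involved.
print(has_nvidia_key({"NVIDIA_API_KEY": "nvapi-example"}))  # True
print(has_nvidia_key({}))  # False
```

Calling `has_nvidia_key()` with no argument checks the current process environment.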

## Introduction to LangGraph

In [None]:
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from pydantic import BaseModel
from typing import Literal

In [None]:
from langchain.chat_models import init_chat_model

The next cell creates a LangChain chat model backed by NVIDIA NIM. `init_chat_model()` selects the matching chat model integration from the `model_provider` argument, so you only need to specify the model ID and the provider.

In [None]:
model_id = 'meta/llama-3.2-3b-instruct'
llm = init_chat_model(model=model_id, model_provider="nvidia")

In [None]:
# llm.get_available_models()

In this section, we'll construct a simple agentic workflow using LangGraph's StateGraph. Here's what we'll do:

1. Create a `StateGraph` with the state schema
2. Add nodes using `add_node(name, function)`
3. Add edges using `add_edge(source, target)`
4. Compile the graph before execution

In [None]:
class State(TypedDict):
    """
    Graph state schema.
    - messages: List of conversation messages with automatic append behavior
    """
    messages: Annotated[list, add_messages]

The State holds the conversation history. Because `messages` is annotated with the `add_messages` reducer, a node's returned messages are appended to the existing list instead of overwriting it.
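
To make the reducer idea concrete, here is a dependency-free sketch of what a message reducer does: it takes the existing list and an update, and returns the merged list. This is a simplified stand-in, not LangGraph's implementation. The function name `merge_messages` and the dict-based message format are assumptions for illustration. Like `add_messages`, it replaces a message whose ID already exists and appends otherwise.

```python
from typing import Any


def merge_messages(
    existing: list[dict[str, Any]], new: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Simplified stand-in for a message reducer (not LangGraph's code)."""
    merged = list(existing)  # copy so the caller's list is left untouched
    index_by_id = {m["id"]: i for i, m in enumerate(merged) if "id" in m}
    for message in new:
        message_id = message.get("id")
        if message_id is not None and message_id in index_by_id:
            # Same ID: replace the earlier message in place.
            merged[index_by_id[message_id]] = message
        else:
            # New message: append it and remember its position.
            if message_id is not None:
                index_by_id[message_id] = len(merged)
            merged.append(message)
    return merged


history = [{"id": "1", "role": "user", "content": "hi"}]
history = merge_messages(history, [{"id": "2", "role": "assistant", "content": "hello"}])
print(len(history))  # 2

history = merge_messages(history, [{"id": "2", "role": "assistant", "content": "hello there"}])
print(len(history), history[-1]["content"])  # 2 hello there
```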

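Nodes are Python functions that receive the current state, perform some work, and return a dictionary of state updates. LangGraph merges that dictionary into the state using each key's reducer.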

In [None]:
def chatbot(state: State):
    """
    Chatbot node that invokes the LLM with conversation history.
    Returns updated state with the assistant's response.
    """
    return {"messages": [llm.invoke(state["messages"])]}
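
Because a node is just a function from state to a state update, its logic can be exercised without a network call. The sketch below assumes a hypothetical `EchoModel` stand-in that exposes an `invoke()` method, and a hypothetical factory `make_chatbot_node` that passes the model in explicitly instead of reading a global. Neither name comes from LangGraph or LangChain.

```python
from typing import Any, Callable


class EchoModel:
    """Hypothetical offline stand-in with an invoke() method like a chat model."""

    def invoke(self, messages: list[dict[str, str]]) -> dict[str, str]:
        last = messages[-1]["content"]
        return {"role": "assistant", "content": f"You said: {last}"}


def make_chatbot_node(model: Any) -> Callable[[dict], dict]:
    """Build a node that calls model.invoke() and returns a partial state update."""

    def node(state: dict) -> dict:
        # Only the new message is returned; the reducer handles the merge.
        return {"messages": [model.invoke(state["messages"])]}

    return node


offline_node = make_chatbot_node(EchoModel())
result = offline_node({"messages": [{"role": "user", "content": "ping"}]})
print(result["messages"][0]["content"])  # You said: ping
```

Passing the model in as an argument is a design choice that makes it easy to swap the stub for a real chat model later.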

In [None]:
graph_builder = StateGraph(State)

# Add the chatbot node
graph_builder.add_node("chatbot", chatbot)

# Connect START -> chatbot (entry point)
graph_builder.add_edge(START, "chatbot")

# Compile the graph
graph = graph_builder.compile()

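Use `graph.stream()` to receive output incrementally while the graph runs. With the default settings, each streamed event carries the state update from one node once that node finishes, so the loop below prints the assistant's complete reply rather than individual tokens.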

In [None]:
def stream_graph_updates(user_input: str):
    """Stream responses from the graph for real-time output."""
    for event in graph.stream({"messages": [{"role": "user", "content": user_input}]}):
        for value in event.values():
            print("Assistant:", value["messages"][-1].content)

In [None]:
stream_graph_updates("what is the meaning of life?")
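
The nested loop in `stream_graph_updates` relies on each streamed event being a dictionary keyed by node name, whose value is that node's state update. The sketch below imitates that shape with a hypothetical generator, `fake_stream`, so the loop can be studied offline. One simplification is assumed: messages here are plain dicts, whereas the real graph yields message objects with a `.content` attribute.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[dict]:
    """Hypothetical generator imitating per-node update events."""
    yield {"chatbot": {"messages": [{"role": "assistant", "content": "42"}]}}


def collect_replies(events: Iterable[dict]) -> list[str]:
    """Walk events the same way stream_graph_updates does, keeping node names."""
    replies = []
    for event in events:
        for node_name, update in event.items():
            replies.append(f"{node_name}: {update['messages'][-1]['content']}")
    return replies


print(collect_replies(fake_stream()))  # ['chatbot: 42']
```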

## Structured Output with Pydantic

Applications often need LLM responses in parseable formats (e.g., JSON) for downstream processing. NVIDIA NIM supports structured generation using guided JSON schemas. We use Pydantic's `BaseModel` to define the expected output structure. The `Literal` type restricts the output to specific values.

In [None]:
from pydantic import BaseModel
from typing import Literal

class UserIntent(BaseModel):
    """The user's current intent in the conversation"""
    intent: Literal["naruto", "bleach"]
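
To see what the `Literal` constraint buys you, the following dependency-free sketch performs the same membership check by hand on a JSON string. It is an illustration only: the names `AllowedIntent` and `parse_intent` are hypothetical, and Pydantic performs this validation (and more) for you when it builds a `UserIntent`.

```python
import json
from typing import Literal, get_args

# Same allowed values as the UserIntent schema above.
AllowedIntent = Literal["naruto", "bleach"]


def parse_intent(raw_json: str) -> str:
    """Parse a JSON object and check that 'intent' is one of the allowed values."""
    payload = json.loads(raw_json)
    intent = payload.get("intent")
    allowed = get_args(AllowedIntent)
    if intent not in allowed:
        raise ValueError(f"intent must be one of {allowed}, got {intent!r}")
    return intent


print(parse_intent('{"intent": "naruto"}'))  # naruto
```

Any value outside the two allowed strings raises `ValueError`, which is the kind of failure a schema-constrained response is meant to prevent.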

Reference: [NIM Structured Generation Docs](https://docs.nvidia.com/nim/large-language-models/latest/structured-generation.html)

In [None]:
llm_structured = init_chat_model(model=model_id, model_provider="nvidia").with_structured_output(
    UserIntent, strict=True
)

Use `.with_structured_output()` to enforce the Pydantic schema on LLM responses.

In [None]:
# Test: Classify user intent based on anime question
res = llm_structured.invoke([
    {'role':'system','content':'You are an anime encyclopedia. Classify if the user is asking a question on naruto or bleach.'},
    {'role':'user','content':'who is sasuke?'}
])

In [None]:
print(f'intent: {res}')

## Links and Resources

- [LangGraph](https://github.com/langchain-ai/langgraph)
- [LangChain NVIDIA](https://github.com/langchain-ai/langchain-nvidia)

---

## Licensing

Copyright © 2025 OpenACC-Standard.org. This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0). These materials include references to hardware and software developed by other entities; all applicable licensing and copyrights apply.
