# LangChain + LangGraph Quickstart

This notebook verifies that LangChain can reach Ollama from your dev sandbox, then walks through the core patterns: chains, streaming, tools, agents, and RAG.

**Prerequisites:** Ollama running on your Mac with at least one model pulled (e.g. `ollama pull llama3.1:8b-instruct-q8_0`).

## 1. Verify Environment

In [None]:
import os, importlib, httpx

# Your docker-compose sets this
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://host.docker.internal:11434")

# Check Ollama (fail fast with a clear error if the server is unreachable)
r = httpx.get(f"{OLLAMA_HOST}/api/tags", timeout=10)
r.raise_for_status()
models = r.json().get("models", [])
print(f"Ollama at {OLLAMA_HOST} — {len(models)} model(s):")
for m in models:
    print(f"  {m['name']}  ({m.get('size', 0) / 1e9:.1f} GB)")

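# Check packages (report missing ones instead of stopping at the first failure)
print()
for pkg in ["langchain", "langgraph", "chromadb", "openai"]:
    try:
        mod = importlib.import_module(pkg)
        print(f"  {pkg}: {getattr(mod, '__version__', 'ok')}")
    except ImportError as e:
        print(f"  {pkg}: MISSING ({e})")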

In [None]:
# ── Set the model you want to use throughout this notebook ──
# Change this to match whatever you've pulled in Ollama
MODEL = "llama3.1:8b-instruct-q8_0"
EMBED_MODEL = "nomic-embed-text"  # pull with: ollama pull nomic-embed-text

## 2. Basic LLM Call via LangChain

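LangChain's `ChatOpenAI` works with Ollama because Ollama exposes an OpenAI-compatible API under `/v1`. The client requires an `api_key` value, but Ollama ignores it, so any placeholder string works.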

In [None]:
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(
    base_url=f"{OLLAMA_HOST}/v1",
    api_key="not-needed",
    model=MODEL,
    temperature=0.7,
)

response = llm.invoke([
    SystemMessage(content="You are a helpful assistant. Be concise."),
    HumanMessage(content="What is retrieval-augmented generation in one paragraph?")
])

print(response.content)
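
To see what "OpenAI-compatible" means in practice, the sketch below builds the kind of JSON request that an OpenAI-style client posts to `/v1/chat/completions`. `build_chat_request` is a hypothetical helper written for this illustration, not part of LangChain, and a real client sends additional fields. Nothing is sent over the network here.

```python
import json

def build_chat_request(host: str, model: str, system: str, user: str,
                       temperature: float = 0.7):
    """Return (url, body) for an OpenAI-style chat completion call.

    Hypothetical helper for illustration only; it does not send anything.
    """
    url = f"{host.rstrip('/')}/v1/chat/completions"
    body = {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }
    return url, body

# Same host, model, and messages as the cell above, written out literally
url, body = build_chat_request(
    "http://host.docker.internal:11434",
    "llama3.1:8b-instruct-q8_0",
    "You are a helpful assistant. Be concise.",
    "What is retrieval-augmented generation in one paragraph?",
)
print(url)
print(json.dumps(body, indent=2))
```

Posting `body` to `url` with any HTTP client should return an OpenAI-style response whose text sits under `choices[0].message.content`, which is what `ChatOpenAI` unpacks into `response.content`.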

## 3. Prompt Templates & Chains

The LCEL (LangChain Expression Language) pipe syntax lets you compose prompt → model → parser into a reusable chain.

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Explain clearly and concisely."),
    ("human", "{question}")
])

chain = prompt | llm | StrOutputParser()

result = chain.invoke({
    "domain": "distributed systems",
    "question": "What is the CAP theorem?"
})
print(result)
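
If the `|` syntax looks like magic, it is ordinary Python operator overloading. The toy below is a sketch of the idea only, not LangChain's actual implementation: `Step` and the `toy_*` names are made up for this example, and the "model" just upper-cases its input.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # a | b builds a new Step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))

# Three toy stages mirroring prompt -> model -> parser
toy_prompt = Step(lambda d: f"[{d['domain']}] {d['question']}")
toy_model = Step(lambda text: {"content": text.upper()})  # pretend LLM
toy_parser = Step(lambda msg: msg["content"])

toy_chain = toy_prompt | toy_model | toy_parser
toy_result = toy_chain.invoke({
    "domain": "distributed systems",
    "question": "What is the CAP theorem?",
})
print(toy_result)  # [DISTRIBUTED SYSTEMS] WHAT IS THE CAP THEOREM?
```

Each stage only needs to accept the previous stage's output, which is why a prompt, a model, and a parser can be swapped independently in the real chain.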

## 4. Streaming

For long responses, `chain.stream()` yields the output chunk by chunk as the model generates it, instead of waiting for the full reply.

In [None]:
for chunk in chain.stream({"domain": "Python", "question": "Explain decorators with a short example"}):
    print(chunk, end="", flush=True)
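
The loop above prints the chunks but discards them. If you also want the full text afterwards, a small wrapper can do both. This is a minimal sketch: `stream_and_collect` is a hypothetical helper, and the list of strings stands in for `chain.stream(...)`, which yields plain strings when the chain ends in `StrOutputParser`.

```python
from typing import Iterable

def stream_and_collect(chunks: Iterable[str]) -> str:
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # final newline once the stream ends
    return "".join(parts)

# Stand-in for chain.stream(...): any iterable of strings works
full_text = stream_and_collect(iter(["A decorator ", "wraps ", "a function."]))
```

In the notebook you would call it as `stream_and_collect(chain.stream({...}))`.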

## 5. Tools & Function Calling

Define Python functions as tools the LLM can invoke.

In [None]:
from langchain_core.tools import tool
import math

@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression. Examples: '2 + 3', 'math.sqrt(144)', 'math.pi * 5**2'"""
    try:
        # eval with builtins removed and only `math` exposed.
        # This is NOT a real sandbox; use it only with trusted input.
        result = eval(expression, {"__builtins__": {}}, {"math": math})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

@tool
def get_word_count(text: str) -> str:
    """Count the number of words in a text string."""
    return str(len(text.split()))

# Test the tools directly
print(calculator.invoke("math.sqrt(144) + 10"))
print(get_word_count.invoke("hello world this is a test"))
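
Removing `__builtins__` does not make `eval` safe, because attribute access on ordinary objects can still reach arbitrary code. One alternative is to parse the expression with `ast` and evaluate only a whitelist of node types. The sketch below is a hypothetical `safe_eval` helper, assuming you only need numbers, basic arithmetic operators, and `math.<name>` constants and functions. It does not guard against very expensive inputs such as huge exponents.

```python
import ast
import math
import operator

_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def safe_eval(expression: str):
    """Evaluate arithmetic using a whitelist of AST nodes (hypothetical helper)."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # math.pi, math.e, math.sqrt, ... (public names on the math module only)
        if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
                and node.value.id == "math" and not node.attr.startswith("_")):
            return getattr(math, node.attr)
        # Calls are allowed only on math.<name>, with positional arguments
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
                and not node.keywords):
            fn = walk(node.func)
            return fn(*[walk(arg) for arg in node.args])
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("math.sqrt(144) + 10"))  # 22.0
print(safe_eval("math.pi * 5**2"))
```

To use it in the tool, replace the `eval(...)` call inside `calculator` with `safe_eval(expression)`; the surrounding `try`/`except` already turns the `ValueError` into an error string for the model.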

## 6. LangGraph — ReAct Agent

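LangGraph's `create_react_agent` builds a reasoning loop: the LLM decides which tools to call, observes the results, and repeats until it has an answer. The model must support tool calling (Llama 3.1 does); a model without it cannot emit the tool calls this loop depends on.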

In [None]:
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(
    model=llm,
    tools=[calculator, get_word_count]
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "What is the square root of 256 plus 42?"}]
})

for msg in result["messages"]:
    role = getattr(msg, 'type', 'unknown')
    content = getattr(msg, 'content', '')
    if content:
        print(f"[{role}] {content}")
    # Also show tool calls if present
    tool_calls = getattr(msg, 'tool_calls', [])
    for tc in tool_calls:
        print(f"[{role} → tool] {tc['name']}({tc['args']})")
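
The control flow behind an agent like this can be sketched in a few lines of plain Python. This is a toy illustration of the call-model, run-tool, repeat pattern, not LangGraph's implementation: `run_tool_loop`, the message dictionaries, and the scripted model are all made up for this example.

```python
from typing import Callable

def run_tool_loop(model: Callable[[list], dict], tools: dict,
                  question: str, max_steps: int = 5) -> str:
    """Toy ReAct-style loop: ask the model, run any requested tool, repeat.

    `model` is a hypothetical callable that returns either
    {"tool": name, "args": {...}} or {"answer": text}.
    """
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = model(messages)
        if "answer" in decision:
            return decision["answer"]
        # Run the requested tool and record the observation for the next turn
        observation = tools[decision["tool"]](**decision["args"])
        messages.append({
            "role": "tool",
            "name": decision["tool"],
            "content": str(observation),
        })
    raise RuntimeError("No final answer within max_steps")

# Scripted stand-in for an LLM: request a tool first, then answer
def scripted_model(messages: list) -> dict:
    if messages[-1]["role"] == "tool":
        return {"answer": f"The result is {messages[-1]['content']}."}
    return {"tool": "add", "args": {"a": 16, "b": 42}}

loop_answer = run_tool_loop(
    scripted_model, {"add": lambda a, b: a + b}, "What is 16 plus 42?"
)
print(loop_answer)  # The result is 58.
```

The `max_steps` cap is the important design choice: without it, a model that keeps requesting tools would loop forever.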

## 7. RAG — Retrieval-Augmented Generation

Store documents in ChromaDB, retrieve relevant ones, and pass them to the LLM as context.

**Note:** This uses Ollama for embeddings too. Make sure you've pulled the embed model:  
`ollama pull nomic-embed-text`

In [None]:
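# Note: recent LangChain releases deprecate these community imports in favor of
# the `langchain-ollama` and `langchain-chroma` packages, if you have them installed.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma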
from langchain_core.runnables import RunnablePassthrough

# Embeddings via Ollama (runs on your Mac's GPU)
embeddings = OllamaEmbeddings(
    base_url=OLLAMA_HOST,
    model=EMBED_MODEL
)

# Some sample documents
docs = [
    "LangChain is a framework for building applications with large language models.",
    "LangGraph extends LangChain with stateful, multi-step agent workflows using a graph abstraction.",
    "ChromaDB is an open-source embedding database for AI applications.",
    "Ollama runs large language models locally on your machine with GPU acceleration.",
    "RAG (Retrieval-Augmented Generation) grounds LLM responses in your own data.",
    "LCEL (LangChain Expression Language) uses pipe syntax to compose chains.",
]

# Create vector store (persisted to the chroma-data volume).
# Re-running this cell adds the same texts again, so retrieval may return duplicates.
vectorstore = Chroma.from_texts(
    docs, embeddings,
    persist_directory="/home/developer/.chroma/quickstart"
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

print(f"Stored {len(docs)} documents. Testing retrieval...")
results = retriever.invoke("How do I build an agent?")
for r in results:
    print(f"  → {r.page_content}")
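
Retrieval boils down to comparing the query's embedding against each stored embedding and keeping the closest matches. The sketch below shows that idea with cosine similarity on made-up 3-dimensional vectors. `cosine`, `top_k`, and the toy data are illustrative assumptions; the metric ChromaDB actually uses depends on how the collection is configured.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, store, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Made-up "embeddings"; real ones have hundreds of dimensions
toy_store = [
    ("doc about agents", [0.9, 0.1, 0.0]),
    ("doc about databases", [0.0, 0.2, 0.9]),
    ("doc about graphs", [0.7, 0.6, 0.1]),
]
toy_hits = top_k([1.0, 0.2, 0.0], toy_store, k=2)
print(toy_hits)  # ['doc about agents', 'doc about graphs']
```

The `k` argument plays the same role as `search_kwargs={"k": 2}` on the retriever.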

In [None]:
# Full RAG chain: retrieve → format → prompt → LLM → parse

rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based only on the following context. If the context doesn't contain the answer, say so.\n\nContext:\n{context}"),
    ("human", "{question}")
])

def format_docs(docs):
    return "\n".join(d.page_content for d in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What is LangGraph and how does it relate to LangChain?")
print(answer)

## 8. LangGraph — Custom Agent with State

Beyond `create_react_agent`, you can define custom graph topologies. Here's a simple two-step agent: classify a question, then answer it differently based on the classification.

In [None]:
from langgraph.graph import StateGraph, START, END
from typing import TypedDict

# Define the state that flows through the graph
class AgentState(TypedDict):
    question: str
    category: str
    answer: str

# Node 1: classify the question
def classify(state: AgentState) -> AgentState:
    classify_chain = (
        ChatPromptTemplate.from_messages([
            ("system", "Classify this question as exactly one of: technical, conceptual, opinion. Reply with just the one word."),
            ("human", "{question}")
        ])
        | llm
        | StrOutputParser()
    )
    category = classify_chain.invoke({"question": state["question"]}).strip().lower()
    return {**state, "category": category}

# Node 2: answer based on category
def answer(state: AgentState) -> AgentState:
    style = {
        "technical": "Give a precise, detailed technical answer with code examples if relevant.",
        "conceptual": "Explain the concept clearly using an analogy.",
    }.get(state["category"], "Share a balanced perspective.")
    
    answer_chain = (
        ChatPromptTemplate.from_messages([
            ("system", style),
            ("human", "{question}")
        ])
        | llm
        | StrOutputParser()
    )
    result = answer_chain.invoke({"question": state["question"]})
    return {**state, "answer": result}

# Build the graph
graph = StateGraph(AgentState)
graph.add_node("classify", classify)
graph.add_node("answer", answer)
graph.add_edge(START, "classify")
graph.add_edge("classify", "answer")
graph.add_edge("answer", END)

app = graph.compile()

# Run it
result = app.invoke({"question": "How does Python's GIL work?", "category": "", "answer": ""})
print(f"Category: {result['category']}")
print(f"\nAnswer:\n{result['answer']}")
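
The classifier's reply is free text, so it may come back as `Technical.` or `Category: conceptual` rather than a bare label, and the dictionary lookup in `answer` would then silently fall through to the default style. A small normalizer makes the routing more predictable. `normalize_category` is a hypothetical helper for this notebook, and defaulting to `opinion` is an assumption that matches the fallback style above.

```python
import re

VALID_CATEGORIES = ("technical", "conceptual", "opinion")

def normalize_category(raw: str, default: str = "opinion") -> str:
    """Map a free-text model reply onto one of the expected labels."""
    # Return the first recognized label found in the reply, ignoring case and punctuation
    for word in re.findall(r"[a-z]+", raw.lower()):
        if word in VALID_CATEGORIES:
            return word
    return default

print(normalize_category("Technical."))              # technical
print(normalize_category("Category: conceptual\n"))  # conceptual
print(normalize_category("I am not sure"))           # opinion
```

In `classify`, you could wrap the chain output as `normalize_category(classify_chain.invoke(...))` in place of `.strip().lower()`.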

## Next Steps

You now have all the building blocks working. Some ideas to explore:

- **RAG over your own docs**: load PDFs or markdown files from `~/projects/` and build a Q&A system
- **Multi-tool agents**: add web search, file I/O, or database tools to a LangGraph agent
- **Conversation memory**: use LangGraph's checkpointing to build agents with persistent memory
- **Compare models**: swap `MODEL` to try different Ollama models and compare quality/speed
- **LangSmith tracing**: sign up at smith.langchain.com (free tier) to visualize agent execution