# LangGraph with Tools and Retrieval

### ✅ Restored Data Context

You're building a **LangGraph-based multi-agent web app using Streamlit** that accepts a software development task (e.g., “Scrape all links from a webpage”), then executes multiple AI agents (planner, researcher, coder, reviewer, tester) in a **graph-defined order** using LangChain and OpenAI models.

You previously encountered and fixed:

* A `KeyError` caused by referencing the wrong output key
* Issues with `.bind(system=...)` in `ChatOpenAI`
* Missing docstrings on `@tool` functions (the decorator requires a docstring or an explicit description)
* Missing `state_schema` in `StateGraph`

Now you're asking how LangGraph works with tools, LangChain toolkits, and retrieval.



### 📦 LangGraph + Tools + LangChain Toolkits — Explained

LangGraph provides **structured, graph-based control flow** for agents and chains, while LangChain provides the **modular components** (chains, tools, prompts, agents) that run inside that graph. Combining them lets you build multi-agent systems whose execution order is explicit.



### 🛠️ What Are Tools in LangChain?

**Tools** in LangChain are callable functions wrapped so that LLM agents can use them.

```python
from langchain.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers together."""
    return a * b
```

Tools can be bound to any chat model that supports tool calling (e.g., `ChatOpenAI` via `bind_tools`), and the agent built on that model decides when to call them.



### 🔄 How LangGraph Works with Tools

You can run a `tool` inside a **LangGraph node** (for example through the prebuilt `ToolNode`), so agents can invoke tools as one step of graph execution. Example use cases:

* API calls
* Calculations
* File operations
* Database lookups



### 🔗 LangChain Toolkit Overview

LangChain offers modular **building blocks**:

| Component              | Purpose                                                             |
| ---------------------- | ------------------------------------------------------------------- |
| **Chains**             | Sequential steps (prompt → LLM → output)                            |
| **Agents**             | Dynamic tool selection and reasoning (ReAct or OpenAI tool calling) |
| **Memory**             | State carried across steps or turns                                 |
| **Tools**              | Plug-in functions such as search, math, or scraping                 |
| **ChatPromptTemplate** | Templates for formatting chat messages                              |
| **Retrievers**         | Fetching relevant documents for document-based QA                   |



### 🔁 LangGraph + LangChain Integration Flow

**Typical pattern:**

1. Use LangChain tools (`@tool`) for actions.
2. Define agents in LangGraph as **nodes**.
3. Attach tools to each agent's model (for example with `llm.bind_tools(tools)`).
4. Control execution using LangGraph’s graph structure.
5. Use `invoke()` or `astream()` to run the graph.



### ✅ Example Use Case

Let’s say the user wants:

> “Generate a script to scrape image URLs from a webpage.”

**Agents & Tools Setup:**

* **Planner**: breaks task into steps.
* **Researcher**: finds best libraries.
* **Coder**: writes code (optionally uses tools).
* **Tool**: validates or executes code (e.g., `run_code`).
* **Reviewer**: refines code.



### 📚 Adding **Retrieval-Augmented Generation (RAG)** to LangGraph

**RAG (Retrieval-Augmented Generation)** enhances LLM responses by combining:

* **Retrieval**: fetch relevant documents from a knowledge base.
* **Generation**: LLM generates an answer based on those docs.
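
The two stages can be illustrated without any library. The sketch below is a toy: it ranks documents by shared words instead of embeddings, and it stops at building the prompt rather than calling a model. The function names and the sample knowledge base are invented for the example.

```python
def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Retrieval step: rank documents by how many question words they share."""
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(doc.lower().split())), doc) for doc in documents
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Generation step, first half: ground the model in the retrieved text."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


knowledge_base = [
    "requests is a Python library for making HTTP calls",
    "BeautifulSoup parses HTML and extracts tags such as img and a",
    "pytest is a framework for writing Python tests",
]

question = "Which library parses HTML tags"
top_docs = retrieve(question, knowledge_base)
prompt = build_prompt(question, top_docs)
print(prompt)
```

A real system replaces the word-overlap score with vector similarity and sends `prompt` to the LLM, but the shape of the pipeline is the same.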



### 🧠 Why RAG with LangGraph?

In a **LangGraph agent system**, RAG:

* Lets agents (like Researcher, Coder) answer from **custom docs**.
* Provides **contextual grounding** to reduce hallucination.
* Enables question-answering over **private or domain-specific data**.



### 🧱 Core Components for RAG

1. **Vector Store** (e.g., FAISS, Chroma, Weaviate): for storing document embeddings.
2. **Embeddings Model**: converts text into vector form (e.g., `OpenAIEmbeddings`, `HuggingFaceEmbeddings`).
3. **Retriever**: queries the vector DB to get relevant chunks.
4. **Document Chain** or **RAG Chain**: formats input + retrieved data, feeds to LLM.



### ✅ Integration Steps in LangGraph

#### 1. **Load & Embed Documents**

```python
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Load the raw text file into LangChain Document objects.
loader = TextLoader("data/dev_docs.txt")
docs = loader.load()

# Embed the documents (requires OPENAI_API_KEY) and index them in FAISS.
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever()
```



#### 2. **Create a Retrieval Chain (RAG Chain)**

```python
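from langchain.chains import RetrievalQA  # legacy chain, deprecated in recent LangChain releases
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
# `retriever` comes from step 1. The default chain type ("stuff") puts all retrieved chunks into one prompt.
rag_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)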
```



#### 3. **Use RAG Chain as a LangGraph Node**

```python
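from langchain_core.messages import HumanMessage

def research_node(state):
    # Use the latest message as the question for the RAG chain.
    question = state["messages"][-1].content
    result = rag_chain.invoke({"query": question})["result"]
    # Return a new list rather than mutating the incoming state in place.
    # With an `add_messages` reducer, return only the new message instead.
    return {"messages": state["messages"] + [HumanMessage(content=result)]}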
```

Then connect it like:

```python
builder.add_node("researcher", research_node)
builder.add_edge("planner", "researcher")
```



### 🧠 Use Cases

* 👨‍⚕️ Medical research agents reading journals
* 📄 Legal assistant agents referencing case law
* 🏢 Corporate AI agents trained on company SOPs or internal docs



### 💡 Best Practices

| Tip        | Description                                             |
| ---------- | ------------------------------------------------------- |
| Chunking   | Split docs into chunks of roughly 500–1000 tokens for better relevance |
| Metadata   | Store source/file names for traceability                |
| Filter     | Use metadata filters in the retriever when the corpus is large or spans multiple domains |
| Updateable | Keep vectorstore updateable (e.g., `add_documents`)     |



### 🔄 **Graph Nodes for Search, Vector Store Lookup, and Summarization in LangGraph**

In LangGraph, each **node** represents a distinct **step** or **behavior** in the agent's reasoning pipeline. When integrating **search**, **vector store lookup**, and **summarization**, each of these can be designed as dedicated nodes.



## 🧠 What Are Nodes in LangGraph?

- A **node** is a function or a chain that:
  - Takes in a `state` (like message history or task info)
  - Processes it
  - Returns an updated state (or only the keys that changed)



## 📌 1. **Search Node** (Web or API Search)

### ✅ Purpose:
To allow agents to fetch **real-time info** from the web or external APIs.

### 🔧 Implementation:
```python
from langchain_core.tools import Tool

def web_search_tool(query: str) -> str:
    # Placeholder: swap in a real search API such as SerpAPI or Tavily.
    return f"Search results for: {query}"  # simulated results

# Note: despite its name, `search_node` is a Tool object, not yet a graph node.
search_node = Tool.from_function(
    func=web_search_tool,
    name="WebSearch",
    description="Useful for retrieving real-time information from the internet."
)
```

### 👇 In Graph:
```python
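# `safe_node` is not defined in this section; it is assumed to be the app's own wrapper
# that turns the tool into a function taking and returning graph state.
builder.add_node("search", safe_node(search_node, "Search"))
builder.add_edge("researcher", "search")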
```



## 📌 2. **Vector Store Lookup Node** (Retrieval / RAG)

### ✅ Purpose:
Fetch relevant documents from a **local or hosted vector database** using user queries.

### 🔧 Implementation:
```python
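from langchain_community.vectorstores import FAISS
from langchain_core.messages import HumanMessage
from langchain_openai import OpenAIEmbeddings

# Recent versions require opting in to pickle deserialization;
# only do this for index files you created yourself.
retriever = FAISS.load_local(
    "vector_db", OpenAIEmbeddings(), allow_dangerous_deserialization=True
).as_retriever()

def vector_lookup(state):
    query = state["messages"][-1].content
    docs = retriever.invoke(query)  # replaces the deprecated get_relevant_documents()
    # Keep the top three chunks and pass them on as a single message.
    results = "\n".join(doc.page_content for doc in docs[:3])
    return {"messages": state["messages"] + [HumanMessage(content=results)]}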
```

### 👇 In Graph:
```python
builder.add_node("vector_lookup", vector_lookup)
builder.add_edge("planner", "vector_lookup")
```



## 📌 3. **Summarization Node**

### ✅ Purpose:
To generate a concise summary of long retrieved results or a multi-turn conversation.

### 🔧 Implementation:
```python
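from langchain.chains.summarize import load_summarize_chain
from langchain_core.documents import Document
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# "stuff" sends the whole text in a single prompt, so it only suits short inputs.
summary_chain = load_summarize_chain(ChatOpenAI(model="gpt-4"), chain_type="stuff")

def summarization_node(state):
    text = state["messages"][-1].content
    # The chain expects a list of Documents and returns its summary under "output_text".
    output = summary_chain.invoke({"input_documents": [Document(page_content=text)]})
    return {"messages": state["messages"] + [HumanMessage(content=output["output_text"])]}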
```

### 👇 In Graph:
```python
builder.add_node("summarizer", summarization_node)
builder.add_edge("vector_lookup", "summarizer")
```



### 🔁 Example Pipeline:

```mermaid
graph LR
Start --> Planner
Planner --> Search
Search --> Vector_Lookup
Vector_Lookup --> Summarizer
Summarizer --> Coder
Coder --> Reviewer --> Tester --> END
```



## 💡 Tips

| Feature        | Suggestion                                      |
|----------------|--------------------------------------------------|
| Vector Store   | Use `Chroma`, `FAISS`, or `Weaviate` for local/dev |
| Search Tool    | Use `Tavily`, `SerpAPI`, or `Bing Web Search`     |
| Summarization  | Use `map_reduce` or `refine` for large docs       |
| Streamlit UI   | Allow user to toggle: 🔍 Search / 📚 RAG / 📝 Summary |
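
The `map_reduce` suggestion in the table can be illustrated in plain Python. This is a sketch of the idea rather than LangChain's implementation: each chunk is summarized independently (map), then the partial summaries are combined in one final step (reduce). The two callables stand in for LLM calls and are invented for the example.

```python
from typing import Callable


def map_reduce_summarize(
    chunks: list[str],
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[str], str],
) -> str:
    """Summarize each chunk on its own, then combine the partial summaries."""
    partial_summaries = [map_fn(chunk) for chunk in chunks]  # map step
    return reduce_fn(" ".join(partial_summaries))  # reduce step


def first_sentence(text: str) -> str:
    # Stand-in for a per-chunk LLM summary: keep the text up to the first period.
    return text.split(".")[0].strip() + "."


def combine(text: str) -> str:
    # Stand-in for the final LLM call that merges the partial summaries.
    return "Summary: " + text


chunks = [
    "FAISS stores vectors locally. It is fast for small corpora.",
    "Chroma persists to disk. It supports metadata filters.",
]

summary = map_reduce_summarize(chunks, first_sentence, combine)
print(summary)  # Summary: FAISS stores vectors locally. Chroma persists to disk.
```

Because no single call ever sees the whole document, this approach stays within the model's context window where `stuff` would not.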
