# ToolRag Agent with LangChain, Granite and watsonx.ai

This notebook shows how to build a Retrieval-Augmented Generation (RAG) agent with a large, semantically-searchable toolset, powered by LangChain’s agent framework, IBM’s Granite LLM, and watsonx embeddings. You’ll see credential setup, tool semantic indexing, and agent orchestration for robust research and engineering workflows.

## Prerequisites
- Python 3.10+ environment (e.g., Jupyter, Colab, or watsonx.ai).
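- Libraries (installed in Step 3): langchain-ibm, langgraph, langgraph-bigtool, ibm-watsonx-ai, chromadb, langchain-community, langchain-chroma. Optional: langchain-huggingface, transformers and torch for the local alternative below, and python-dotenv if you load credentials from a `.env` file.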
- IBM watsonx.ai credentials (API key, project ID) for Granite model access. Alternatively, use local Granite via transformers.

Let's build a scalable ToolRag Agent!

# Steps

## Step 1. Set up your environment

While you can choose from several tools, this recipe is best suited for a Jupyter Notebook. Jupyter Notebooks are widely used within data science to combine code with various data sources such as text, images and data visualizations. 

You can run this notebook in [Colab](https://colab.research.google.com/), or download it to your system and [run the notebook locally](https://github.com/ibm-granite-community/granite-kitchen/blob/main/recipes/Getting_Started_with_Jupyter_Locally/Getting_Started_with_Jupyter_Locally.md). 

To avoid Python package dependency conflicts, we recommend setting up a [virtual environment](https://docs.python.org/3/library/venv.html).

Note: this notebook is compatible with Python 3.12 as well as Python 3.11, the default in Colab at the time of publishing this recipe. To check your Python version, run `!python --version` in a code cell.
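
If you prefer to check the interpreter from Python itself, a minimal sketch follows. The `TESTED_VERSIONS` set and the `is_tested_version` helper are hypothetical names introduced here for illustration; they simply encode the two versions named above.

```python
import sys

# Versions this recipe names as compatible (assumption: taken from the note above).
TESTED_VERSIONS = {(3, 11), (3, 12)}


def is_tested_version(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) pair is one the recipe names."""
    return tuple(version_info[:2]) in TESTED_VERSIONS


print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}")
if not is_tested_version():
    print("This version is not one the recipe names; it may still work.")
```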


## Step 2. Set up a watsonx.ai instance

See [Getting Started with IBM watsonx](https://github.com/ibm-granite-community/granite-kitchen/blob/main/recipes/Getting_Started/Getting_Started_with_WatsonX.ipynb) for information on getting ready to use watsonx.ai. 

You will need three credentials from the watsonx.ai setup to add to your environment: `WATSONX_URL`, `WATSONX_APIKEY`, and `WATSONX_PROJECT_ID`.
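
Before going further, it can help to confirm that all three variables are visible to the kernel. The sketch below uses only the standard library; `missing_watsonx_vars` is a hypothetical helper, not part of any IBM package.

```python
import os

REQUIRED_VARS = ("WATSONX_URL", "WATSONX_APIKEY", "WATSONX_PROJECT_ID")


def missing_watsonx_vars(env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_watsonx_vars()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All three watsonx.ai credentials are set.")
```

If any are missing, the `get_env_var` helper used in Step 3 is the recipe's own way of supplying them.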

## Step 3. Install relevant libraries and set up credentials and the Granite model

We'll need a few libraries for this recipe. We use the LangGraph and LangChain libraries to call Granite on watsonx.ai, and Chroma as the vector store for the tool index.

In [None]:
# Install core libraries
%pip install -qU langchain-ibm langgraph langgraph-bigtool ibm-watsonx-ai

# Install RAG components (Vector Store and utilities)
%pip install -q chromadb langchain-community langchain-chroma

# Install IBM specific utility for easy credentials load
%pip install -q "git+https://github.com/ibm-granite-community/utils.git"

## Authentication and model initialization 

The next step involves initialization of the watsonx LLM (used for the agent's reasoning) and watsonx Embeddings (for Tool-RAG semantic search).

**Note:** Ensure your environment variables (`WATSONX_URL`, `WATSONX_APIKEY`, `WATSONX_PROJECT_ID`) are set before running this cell.

In [None]:
import os
import math
import types
import uuid
from getpass import getpass
from typing_extensions import Annotated
from langchain.chat_models import init_chat_model
from langchain_ibm import WatsonxEmbeddings
from langchain_core.utils.utils import convert_to_secret_str
from ibm_granite_community.notebook_utils import get_env_var
from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames
from langchain_core.documents import Document
from langchain_chroma import Chroma
from langgraph_bigtool import create_agent
from langgraph.store.memory import InMemoryStore as LangGraphStore
from langchain_core.messages import HumanMessage

# --- Configuration ---
model = "ibm/granite-3-3-8b-instruct"

llm_params = {
    "temperature": 0,
    "max_completion_tokens": 200,
    "repetition_penalty": 1.05,
}

# --- 1. LLM Initialization (Agent's Brain) ---
llm = init_chat_model(
    model=model,
    model_provider="ibm",
    url=convert_to_secret_str(get_env_var("WATSONX_URL")),
    apikey=convert_to_secret_str(get_env_var("WATSONX_APIKEY")),
    project_id=get_env_var("WATSONX_PROJECT_ID"),
    params=llm_params,
)
print(f"LLM initialized: {model}")


# --- 2. Embeddings Initialization (Tool-RAG Indexer) ---
watsonx_embedding = WatsonxEmbeddings(
    model_id="ibm/granite-embedding-107m-multilingual",
    url=get_env_var("WATSONX_URL"),
    apikey=get_env_var("WATSONX_APIKEY"),
    project_id=get_env_var("WATSONX_PROJECT_ID"),
    params={
        # Truncate over-long inputs to 512 tokens, the model's maximum sequence length
        EmbedTextParamsMetaNames.TRUNCATE_INPUT_TOKENS: 512
    }
)
print("Embeddings initialized: ibm/granite-embedding-107m-multilingual")

## Defining Tools (The "Big Tool" Set)

To demonstrate the "Big Tool" concept, where an agent must select a few relevant tools from a very large set, we use the entire public API of Python's built-in `math` module. The module contains roughly 50 functions (logarithmic, trigonometric, hyperbolic, power, and number-theoretic), which is large enough to exercise the retrieval mechanism.
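
To see how large this toolset is on your interpreter, the following standard-library sketch lists the built-in functions in `math`. The exact count varies by Python version, so no specific number is assumed; `math_function_names` is a name introduced here for illustration.

```python
import math
import types

# Collect the names of all built-in functions exposed by the math module.
# Constants such as math.pi are skipped because they are not functions.
math_function_names = [
    name
    for name in dir(math)
    if isinstance(getattr(math, name), types.BuiltinFunctionType)
]

print(f"math exposes {len(math_function_names)} built-in functions")
print(", ".join(math_function_names[:10]), "...")
```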

In the next cell, we iterate over every function in the `math` module, convert each one into a LangChain tool, and register it under a unique ID. A later step stores the tool descriptions in a Chroma vector store, creating the indexed "Big Tool" registry that the Tool-RAG approach relies on.

In [None]:
from langgraph_bigtool.utils import convert_positional_only_function_to_tool

# 1. Collect all functions from `math` as an example set of "Big Tools"
all_tools = []
for function_name in dir(math):
    function = getattr(math, function_name)
    if not isinstance(function, types.BuiltinFunctionType):
        continue
    # Convert functions to LangChain tools
    lc_tool = convert_positional_only_function_to_tool(function)
    # Keep only tools that converted successfully (the helper returns None otherwise)
    if lc_tool is not None:
        all_tools.append(lc_tool)

print(f"Total tools collected and successfully converted: {len(all_tools)}")

# 2. Create the tool registry dictionary (id -> tool) for BigTool
tool_registry = {}
for t in all_tools:
    # Assign a unique ID to each tool
    tool_registry[str(uuid.uuid4())] = t

## Step 4. The Tool-RAG Recipe (Custom Retrieval Function)

This section implements the **Tool-RAG** mechanism. We index tool descriptions into a **Chroma** vector store using **watsonx Embeddings**, and define a custom function to retrieve the most relevant tools based on the user query.

Tool-RAG combines two concepts, tool calling and Retrieval-Augmented Generation (RAG):

1. Tool Indexing (The RAG Step): Tool metadata (here, each tool's name and description) is treated as a document and embedded into a vector store (in our case, Chroma).
2. Tool Retrieval: When a user asks a question, a retrieval step runs first. The user's query is used to perform a semantic search against the vector store.
3. Dynamic Binding: Only the top K most semantically relevant tools are retrieved and dynamically bound to the LLM's prompt. This keeps the prompt concise and relevant.
4. Tool Execution: Because the LLM sees only a handful of relevant tools, it can more reliably select and call the correct one.

This approach lets the agent scale to far more tools than would fit in a single prompt, while keeping each prompt short.
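
The index, retrieve, and bind steps can be sketched without any external service. The toy below is not the recipe's implementation: a bag-of-words cosine similarity stands in for watsonx embeddings and Chroma, and `toy_index`, `retrieve_tool_names`, and `bound_tools` are hypothetical names used only for illustration.

```python
import math
import re
import types
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# 1. Tool indexing: one "document" (name + docstring) per math function.
toy_index = {}
for name in dir(math):
    fn = getattr(math, name)
    if isinstance(fn, types.BuiltinFunctionType):
        toy_index[name] = Counter(tokenize(f"{name}: {fn.__doc__ or ''}"))


# 2. Tool retrieval: rank all tools against the query and keep the top k.
def retrieve_tool_names(query: str, k: int = 2) -> list[str]:
    q = Counter(tokenize(query))
    ranked = sorted(toy_index, key=lambda n: cosine(q, toy_index[n]), reverse=True)
    return ranked[:k]


# 3. Dynamic binding: only the retrieved tools would be exposed to the LLM.
selected = retrieve_tool_names("greatest common divisor of two integers")
bound_tools = {name: getattr(math, name) for name in selected}
print("Tools bound for this query:", list(bound_tools))
```

Word overlap is a much weaker signal than a learned embedding, which is why the recipe uses the Granite embedding model for the real index.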


In [None]:
from langchain_core.documents import Document
from langchain_chroma import Chroma

TOOL_ID_KEY = "tool_id"

# 1. Initialize the Chroma Vector Store
vectorstore = Chroma(
    embedding_function=watsonx_embedding,
    collection_name="tool_rag_index_watsonx_v4_clean",
)

# Start from a clean, initialized collection so re-running the cell does not duplicate entries.
try:
    vectorstore.reset_collection()
except Exception as e:
    print(f"Warning during collection reset: {e}")

# Index tools into Chroma
tool_documents = []
for tool_id, t in tool_registry.items():
    tool_documents.append(
        Document(
            # The page_content is what gets embedded for RAG search
            page_content=f"{t.name}: {t.description}",
            metadata={TOOL_ID_KEY: tool_id}
        )
    )

doc_ids = vectorstore.add_documents(tool_documents)
print(f"Successfully indexed {len(doc_ids)} tool descriptions in Chroma.")


# 2. Define the Custom Tool Retrieval Function (The RAG Logic)
def retrieve_tools_from_chroma(query: str) -> list[str]:
    """
    Retrieve tool IDs from the Chroma vector store based on the user's query.
    This function implements the 'Tool-RAG' step.
    """
    # Use Chroma's similarity search to find the top 2 most relevant tool descriptions
    results = vectorstore.similarity_search(query, k=2)
    
    # Extract the original tool_id from the metadata
    tool_ids = [doc.metadata[TOOL_ID_KEY] for doc in results]
    return tool_ids

print("Custom tool retrieval function defined, connecting BigTool to Chroma/watsonx RAG.")

## Step 5. Building and Compiling the ToolRag Agent

This cell constructs the core logic of our agent using LangGraph, the state machine layer for LangChain. LangGraph allows us to define the specific steps and decision points in our agent's workflow. We use `langgraph_bigtool.create_agent` and pass the custom RAG function to build the final agent graph.

In [None]:
from langgraph_bigtool import create_agent
from langgraph.store.memory import InMemoryStore as LangGraphStore

# Create the BigTool Agent Builder, passing the custom retrieval function
builder = create_agent(
    llm, 
    tool_registry,
    # This plugs the custom RAG logic into the agent's workflow
    retrieve_tools_function=retrieve_tools_from_chroma 
)

# Compile the graph with an in-memory store (`LangGraphStore` is an alias for `InMemoryStore`); its contents are lost when the kernel restarts.
agent = builder.compile(store=LangGraphStore()) 

print("ToolRag Agent (BigTool + watsonx + Chroma RAG) compiled successfully.")
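
Once compiled, the agent can be queried like any LangGraph graph. The helper below is a minimal sketch, assuming the agent accepts a `{"messages": [...]}` input and returns a state whose `messages` list ends with the final answer; `ask` is a hypothetical name, and the commented-out call requires the `agent` object built above.

```python
def ask(agent, question: str) -> str:
    """Send one user question to the agent and return the final message text."""
    # Assumption: the graph state has a "messages" key, as in LangGraph chat agents.
    result = agent.invoke({"messages": [("user", question)]})
    last_message = result["messages"][-1]
    # Message objects expose `.content`; fall back to str() for anything else.
    return getattr(last_message, "content", str(last_message))


# Example usage (needs the compiled `agent` and valid watsonx.ai credentials):
# print(ask(agent, "Use the available tools to calculate the arc cosine of 0.5."))
```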