<a href="https://colab.research.google.com/github/frank-morales2020/MLxDL/blob/main/MYVERSION_OnDemandLoaderTool.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a href="https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/examples/tools/OnDemandLoaderTool.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OnDemandLoaderTool Tutorial

Our `OnDemandLoaderTool` is a powerful agent tool that allows for "on-demand" data querying from any data source on LlamaHub.

This tool takes in a `BaseReader` data loader, and when called will 1) load data, 2) index data, and 3) query the data.
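
To make the three steps concrete, here is a minimal, dependency-free sketch of the load, index, and query flow. It is only an illustration under stated assumptions: `ToyReader`, `build_index`, `query_index`, and `on_demand_query` are hypothetical names, the "index" is a plain list of sentences, and the "query" is a word-overlap match. None of this is how LlamaIndex implements the tool.

```python
from typing import Dict, List


class ToyReader:
    """Hypothetical stand-in for a `BaseReader`; serves canned text, not Wikipedia."""

    _PAGES: Dict[str, str] = {
        "montreal": "Montreal is the largest city in Quebec. It hosts a yearly jazz festival.",
    }

    def load_data(self, pages: List[str]) -> List[str]:
        # 1) Load: return one document per requested page.
        return [self._PAGES[page] for page in pages]


def build_index(documents: List[str]) -> List[str]:
    # 2) Index: split documents into sentences (a real index would embed them).
    return [s.strip() for doc in documents for s in doc.split(".") if s.strip()]


def query_index(index: List[str], query_str: str) -> str:
    # 3) Query: return the sentence sharing the most words with the query.
    query_words = set(query_str.lower().split())
    return max(index, key=lambda s: len(query_words & set(s.lower().split())))


def on_demand_query(reader: ToyReader, pages: List[str], query_str: str) -> str:
    """Run load -> index -> query in a single call."""
    documents = reader.load_data(pages)
    return query_index(build_index(documents), query_str)


answer = on_demand_query(ToyReader(), ["montreal"], "Which festival does it host?")
print(answer)
```

The point of the sketch is the shape of the call: the caller passes both the loader arguments and the question, and nothing is loaded or indexed until that call happens.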

In this walkthrough, we show how to use the `OnDemandLoaderTool` to convert our Wikipedia data loader into an accessible search tool for a LangChain agent.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [1]:
%pip install llama-index-readers-wikipedia

Collecting llama-index-readers-wikipedia
  Downloading llama_index_readers_wikipedia-0.1.3-py3-none-any.whl (2.1 kB)
Collecting llama-index-core<0.11.0,>=0.10.1 (from llama-index-readers-wikipedia)
  Downloading llama_index_core-0.10.12-py3-none-any.whl (15.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.3/15.3 MB 24.0 MB/s eta 0:00:00
Collecting dataclasses-json (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-wikipedia)
  Downloading dataclasses_json-0.6.4-py3-none-any.whl (28 kB)
Collecting deprecated>=1.2.9.3 (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-wikipedia)
  Downloading Deprecated-1.2.14-py2.py3-none-any.whl (9.6 kB)
Collecting dirtyjson<2.0.0,>=1.0.8 (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-wikipedia)
  Downloading dirtyjson-1.0.8-py3-none-any.whl (25 kB)
Collecting httpx (from llama-index-core<0.11.0,>=0.10.1->llama-index-readers-wikipedia)
  Downloading httpx-0.27.0-py3-none-any.w

In [2]:
!pip install llama-index

Collecting llama-index
  Downloading llama_index-0.10.12-py3-none-any.whl (5.6 kB)
Collecting llama-index-agent-openai<0.2.0,>=0.1.4 (from llama-index)
  Downloading llama_index_agent_openai-0.1.5-py3-none-any.whl (12 kB)
Collecting llama-index-cli<0.2.0,>=0.1.2 (from llama-index)
  Downloading llama_index_cli-0.1.5-py3-none-any.whl (25 kB)
Collecting llama-index-embeddings-openai<0.2.0,>=0.1.5 (from llama-index)
  Downloading llama_index_embeddings_openai-0.1.6-py3-none-any.whl (6.0 kB)
Collecting llama-index-indices-managed-llama-cloud<0.2.0,>=0.1.2 (from llama-index)
  Downloading llama_index_indices_managed_llama_cloud-0.1.3-py3-none-any.whl (6.6 kB)
Collecting llama-index-legacy<0.10.0,>=0.9.48 (from llama-index)
  Downloading llama_index_legacy-0.9.48-py3-none-any.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 26.7 MB/s eta 0:00:00
Collecting llama-index-llms-openai<0.2.0,>=0.1.5 (from llama-index)
  Downloading

In [3]:
from llama_index.core.tools.ondemand_loader_tool import OnDemandLoaderTool
from llama_index.readers.wikipedia import WikipediaReader
from typing import List

from pydantic import BaseModel

### Define Tool

We first define the `WikipediaReader`. Note that the `load_data` interface to `WikipediaReader` takes in a list of `pages`. By default, the reader queries the Wikipedia search endpoint, which will autosuggest the relevant pages.

We then wrap it into our `OnDemandLoaderTool`.

Since we don't specify the `index_cls`, a simple vector store index is initialized by default.

In [4]:
!pip install wikipedia
!pip install langchain

Collecting wikipedia
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11678 sha256=7151f0c81f8a3aefbeeebcd582d5a5c4a0d280cb798b1e4aef7900c3426722ef
  Stored in directory: /root/.cache/pip/wheels/5e/b6/c5/93f3dec388ae76edc830cb42901bb0232504dfc0df02fc50de
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0
Collecting langchain
  Downloading langchain-0.1.9-py3-none-any.whl (816 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 817.0/817.0 kB 8.3 MB/s eta 0:00:00
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting langchain-community<0.1,>=0.0.21 (from langchain)
  Downloading langchain_commu

In [5]:
reader = WikipediaReader()

In [6]:
tool = OnDemandLoaderTool.from_defaults(
    reader,
    name="Wikipedia Tool",
    description="A tool for loading and querying articles from Wikipedia",
)

#### Testing

We can try running the tool by itself (or as a LangChain tool), just to showcase what the interface is like!

Note that besides the arguments required by the data loader, the tool also takes a `query_str`, which is the query run against the index.
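
As a rough illustration of that argument split, the sketch below separates `query_str` from the remaining keyword arguments, which would go to the loader's `load_data`. The helper `split_tool_kwargs` is a hypothetical name introduced here for illustration and is not part of the LlamaIndex API.

```python
from typing import Any, Dict, Tuple


def split_tool_kwargs(
    kwargs: Dict[str, Any], query_key: str = "query_str"
) -> Tuple[Dict[str, Any], str]:
    """Separate the query string from the arguments meant for the data loader."""
    if query_key not in kwargs:
        raise ValueError(f"Missing required argument: {query_key!r}")
    # Everything except the query key is treated as a loader argument.
    loader_kwargs = {k: v for k, v in kwargs.items() if k != query_key}
    return loader_kwargs, kwargs[query_key]


loader_kwargs, query_str = split_tool_kwargs(
    {"pages": ["montreal"], "query_str": "What is the best restaurant in montreal?"}
)
print(loader_kwargs)  # arguments forwarded to the loader
print(query_str)      # question asked against the freshly built index
```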

In [None]:
#added by Frank Morales(FM) 22/02/2024
%pip install openai  --root-user-action=ignore
%pip install colab-env --upgrade --quiet --root-user-action=ignore

In [8]:
#added by Frank Morales(FM) 22/02/2024
import warnings
warnings.filterwarnings('ignore')

import colab_env
import openai
import os
openai.api_key = os.getenv("OPENAI_API_KEY")

from openai import OpenAI
client = OpenAI()

Mounted at /content/gdrive


In [None]:
# run tool by itself
#added by Frank Morales(FM) 22/02/2024
tool(["montreal"], query_str="What is the best restaurant in montreal?")

In [10]:
# run tool as langchain structured tool
lc_tool = tool.to_langchain_structured_tool(verbose=True)

In [None]:
lc_tool.run(
    tool_input={
        "pages": ["montreal"],
        "query_str": "What is the best restaurant in montreal?",
    }
)

### Initialize LangChain Agent

For tutorial purposes, the agent has access to just one tool: the Wikipedia reader.

Note that we need to use Structured Tools from LangChain.
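
A likely reason is that our tool takes two named arguments (`pages` and `query_str`), while a classic single-input LangChain tool receives one string. The sketch below shows what a structured, multi-argument input looks like once validated. `WikipediaToolInput` and `parse_tool_input` are hypothetical names used only for illustration; they are not LangChain or LlamaIndex classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class WikipediaToolInput:
    """Hypothetical schema: the two named arguments the tool expects."""

    pages: List[str]
    query_str: str


def parse_tool_input(tool_input: Dict[str, Any]) -> WikipediaToolInput:
    """Validate a structured (multi-argument) tool input."""
    pages = tool_input.get("pages")
    query_str = tool_input.get("query_str")
    if not isinstance(pages, list) or not all(isinstance(p, str) for p in pages):
        raise TypeError("'pages' must be a list of strings")
    if not isinstance(query_str, str):
        raise TypeError("'query_str' must be a string")
    return WikipediaToolInput(pages=pages, query_str=query_str)


parsed = parse_tool_input(
    {"pages": ["montreal"], "query_str": "What is the best restaurant in montreal?"}
)
print(parsed)
```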

In [19]:
from langchain.agents import initialize_agent
from langchain.chat_models import ChatOpenAI

In [18]:
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo", streaming=True)

In [17]:
agent = initialize_agent(
    [lc_tool],
    llm=llm,
    agent="structured-chat-zero-shot-react-description",
    verbose=False,
)

### Now let's run some queries!

The `OnDemandLoaderTool` lets the agent, in a single tool call, 1) load the data from Wikipedia and 2) query that data.

In [21]:
agent.run("What is the best restaurant in montreal?")


Observation: The best restaurants in Canada include a variety of establishments across different provinces and cities. Some notable ones are Alo and St. Lawrence in Toronto, Au Pied de Cochon and Joe Beef in Montreal, and Toqué! in Quebec City. These restaurants are recognized for their exceptional dining experiences and high-quality cuisine.
Thought:
Observation: Au Pied de Cochon, Joe Beef, Toqué!, Alo, and Schwartz's are some of the best restaurants in Canada.
Thought:

"Some of the best restaurants in Montreal are Au Pied de Cochon, Joe Beef, Toqué!, Alo, and Schwartz's."