# Building an Exa Search Powered Data Agent

<a href="https://colab.research.google.com/github/run-llama/llama-hub/blob/main/llama_hub/tools/notebooks/exa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This tutorial walks through using the LLM tools provided by the [Exa API](https://exa.ai) to allow LLMs to use semantic queries to search for and retrieve rich web content from the internet.

To get started, you will need an [OpenAI API key](https://platform.openai.com/account/api-keys) and an [Exa API key](https://dashboard.exa.ai/api-keys).
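The cells in this notebook read both keys from the environment variables `OPENAI_API_KEY` and `EXA_API_KEY`. As a minimal sketch, a check like the one below can surface a missing key before any API call is made. `missing_keys` is a hypothetical helper written for illustration; it is not part of LlamaIndex or Exa.

```python
import os

# Environment variable names used by the cells in this notebook
REQUIRED_KEYS = ("OPENAI_API_KEY", "EXA_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the names of required variables that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required keys are set.")
```

Passing the mapping in as an argument (rather than reading `os.environ` inside the function) keeps the helper easy to test with a plain dict.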

We will import the relevant agents and tools and pass them our keys here:

In [1]:
# Install the relevant LlamaIndex packages, incl. core and Exa tool
!pip install llama-index llama-index-core llama-index-tools-exa

Collecting llama-index
  Using cached llama_index-0.10.54-py3-none-any.whl.metadata (11 kB)
Collecting llama-index-core
  Using cached llama_index_core-0.10.53.post1-py3-none-any.whl.metadata (2.5 kB)
Collecting llama-index-tools-exa
  Using cached llama_index_tools_exa-0.1.3-py3-none-any.whl.metadata (2.1 kB)
Collecting llama-index-agent-openai<0.3.0,>=0.1.4 (from llama-index)
  Using cached llama_index_agent_openai-0.2.8-py3-none-any.whl.metadata (729 bytes)
Collecting llama-index-cli<0.2.0,>=0.1.2 (from llama-index)
  Using cached llama_index_cli-0.1.12-py3-none-any.whl.metadata (1.5 kB)
Collecting llama-index-embeddings-openai<0.2.0,>=0.1.5 (from llama-index)
  Using cached llama_index_embeddings_openai-0.1.10-py3-none-any.whl.metadata (604 bytes)
Collecting llama-index-indices-managed-llama-cloud>=0.2.0 (from llama-index)
  Using cached llama_index_indices_managed_llama_cloud-0.2.4-py3-none-any.whl.metadata (3.8 kB)
Collecting llama-index-legacy<0.10.0,>=0.9.48 (from llama-index)


In [2]:
# Get OS for environment variables
import os
# Set up OpenAI
from llama_index.agent.openai import OpenAIAgent

# NOTE:
# You must have an OpenAI API key in the environment variable OPENAI_API_KEY
# You must have an Exa API key in the environment variable EXA_API_KEY

# Set up the Exa search tool
from llama_index.tools.exa import ExaToolSpec

# Instantiate the Exa tool spec
exa_tool = ExaToolSpec(
    api_key=os.environ["EXA_API_KEY"],
    # max_characters=2000   # this is the default
)

# Get the list of tools to see what Exa offers
exa_tool_list = exa_tool.to_tool_list()
for tool in exa_tool_list:
    print(tool.metadata.name)

search
retrieve_documents
search_and_retrieve_documents
search_and_retrieve_highlights
find_similar
current_date


## Testing the Exa tools

We've imported our OpenAI agent, set up the API keys, and initialized our tool, checking the methods that it has available. Let's test out the tool before setting up our Agent.

All of the Exa search tools use the `AutoPrompt` option, in which Exa passes the query through an LLM to refine it in line with Exa query best practices.

In [5]:
exa_tool.search_and_retrieve_documents("machine learning transformers", num_results=3)

[Exa Tool] Autoprompt: Here is a comprehensive guide to machine learning transformers:


[Document(id_='348691a5-01af-4d94-9279-789038d8fa63', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Xavier Amatriain Mar 26, 2022 1 min read I no longer maintain my Medium blog. Find the newest (2023) version of the Tranformer catalog in my personal blog here.', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'),
 Document(id_='6b42dfce-e05b-4c1d-83fe-138367f11aac', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='“The preeminent book for the preeminent transformers library—a model of clarity!”\n—Jeremy Howard, cofounder of fast.ai and professor at University of Queensland\n\n“A wonderfully clear and incisive guide to modern NLP’s most essential library. Recommended!”\n—Christopher Manning, Thomas M. Siebel Professor in Machine Learn

In [6]:
exa_tool.find_similar(
    "https://www.mihaileric.com/posts/transformers-attention-in-disguise/"
)

[{'title': 'Transformers: a Primer',
  'url': 'http://www.columbia.edu/~jsl2239/transformers.html',
  'id': 'http://www.columbia.edu/~jsl2239/transformers.html'},
 {'title': 'Illustrated Guide to Transformers- Step by Step Explanation',
  'url': 'https://towardsdatascience.com/illustrated-guide-to-transformers-step-by-step-explanation-f74876522bc0?gi=8fe76db5c4d9',
  'id': 'https://towardsdatascience.com/illustrated-guide-to-transformers-step-by-step-explanation-f74876522bc0?gi=8fe76db5c4d9'},
 {'title': 'The Transformer Attention Mechanism - MachineLearningMastery.com',
  'url': 'https://machinelearningmastery.com/the-transformer-attention-mechanism/',
  'id': 'https://machinelearningmastery.com/the-transformer-attention-mechanism/'}]

In [13]:
exa_tool.search_and_retrieve_documents(
    "This is a summary of recent research around diffusion models:", num_results=1
)

[Exa Tool] Autoprompt: Here is a summary of recent research around diffusion models:


[Document(id_='3e7c3b3d-e21b-4b7e-96b4-d5d8ccec12ab', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Getting-started-with-Diffusion-Literature\nSummary of the most important papers and blogs about diffusion models for students to learn about diffusion models. Also contains an overview of all published robotics diffusion papers\nLearning about Diffusion models\nWhile there exist many tutorials for Diffusion models, below you can find an overview of some introduction blog posts and video, which I found the most intuitive and useful:\n\nWhat are Diffusion Models?: an introduction video, which introduces the general idea of diffusion models and some high-level math about how the model works\n Generative Modeling by Estimating Gradients of the Data Distribution: blog post from the one of the most influential authors in this area, which introduces diffusion models from the score-based perspective\n What are Diffusion Model

While `search_and_retrieve_documents` returns raw text from the source document, `search_and_retrieve_highlights` returns relevant curated snippets.

In [11]:
exa_tool.search_and_retrieve_highlights(
    "This is a summary of recent research around diffusion models:", num_results=1
)

[Exa Tool] Autoprompt: Here is a summary of recent research around diffusion models:


[Document(id_='c0fde261-f3b7-41b6-be5c-931ce5134242', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Diffusion models are a type of generative model and in this field, the main focus are vision based applications, thus all theory papers mentioned in the text below are mostly focused on image synthesis or similar tasks related to it. Diffusion models can be viewed from two perspectives: one is based on the initial idea of of Sohl-Dickstein et al., (2015) and the other is based on a different direction of research: score-based generative models. Song & Ermon, (2019) introduced the score-based generative models category. They presented the noise-conditioned score network (NCSN), which is a predecessor to diffusion model. The main idea of the paper is to learn the score function of the unknown data distribution with a neural network.', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadat

### Exploring other Exa functionalities
You can pass many additional parameters to the Exa methods to narrow results to only what you need, most notably domain and date filters.

In [23]:
# Example 1: Calling search_and_retrieve_documents with date filters
exa_tool.search_and_retrieve_documents(
    "Advancements in quantum computing",
    num_results=3,
    start_published_date="2024-01-01",
    end_published_date="2024-07-10"
)

[Exa Tool] Autoprompt: Here is a recent advancement in quantum computing:


[Document(id_='609ae891-42b3-4dcb-b6dd-4704ade56357', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Insider Brief\n\nQuantinuum and University of Colorado researchers successfully entangled four logical qubits with better fidelity than their physical counterparts.\nThe advance demonstrates improved error protection and operational reliability, essential steps toward developing practical and scalable quantum computers.\nThe achievement also shows Quantinuum’s commitment to making quantum computing more accessible and reliable, combining the advanced H2 quantum processor with innovative error-correcting codes.\n\nQuantinuum and the University of Colorado say they have — for the first time — entangled four error-protected logical qubits that have better fidelity than their physical counterparts. It’s not just an academic exercise, though. They report in a paper they posted on the pre-print server ArXiv that the system 

In [24]:
# Example 2: Calling search_and_retrieve_documents with include_domains filters
exa_tool.search_and_retrieve_documents(
    "Climate change mitigation strategies",
    num_results=3,
    include_domains=["www.nature.com", "www.sciencemag.org", "www.pnas.org"]
)

[Exa Tool] Autoprompt: Here is a link to a climate change mitigation strategy:


[Document(id_='c87bcad1-80eb-403e-825c-c2baed082500', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="November 14, 2006 103 (46) 17184-17189 Abstract If it were to become apparent that dangerous changes in global climate were inevitable, despite greenhouse gas controls, active methods to cool the Earth on an emergency basis might be desirable. The concept considered here is to block 1.8% of the solar flux with a space sunshade orbited near the inner Lagrange point (L1), in-line between the Earth and sun. Following the work of J. Early [Early, JT (1989) J Br Interplanet Soc 42:567–569], transparent material would be used to deflect the sunlight, rather than to absorb it, to minimize the shift in balance out from L1 caused by radiation pressure. Three advances aimed at practical implementation are presented. First is an optical design for a very thin refractive screen with low reflectivity, leading to a total sunshade m
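
The date filters in Example 1 are plain ISO `YYYY-MM-DD` strings, so a rolling window can be computed rather than hard-coded. The sketch below assumes only the two keyword names used in Example 1 (`start_published_date` and `end_published_date`); `published_date_window` is a hypothetical helper, not an Exa or LlamaIndex function.

```python
from datetime import date, timedelta


def published_date_window(today, days_back):
    """Build ISO-formatted date filters covering the last `days_back` days."""
    start = today - timedelta(days=days_back)
    return {
        "start_published_date": start.isoformat(),
        "end_published_date": today.isoformat(),
    }


# Fixed date used here so the result is reproducible
filters = published_date_window(date(2024, 7, 10), days_back=30)
print(filters)
# {'start_published_date': '2024-06-10', 'end_published_date': '2024-07-10'}
```

The resulting dict could then be unpacked into a call in the style of Example 1, for instance `exa_tool.search_and_retrieve_documents(query, num_results=3, **filters)`.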

We have tools to search for results, retrieve their contents, find pages similar to a given web page, and combine search and document retrieval in a single call. We will test them out in LLM Agents below:

### Using the Search and Retrieve documents tools in an Agent

We can create an agent with access to the above tools and start testing it out:

In [39]:
# For this first test, pass the full (unwrapped) Exa tool list to the Agent
agent = OpenAIAgent.from_tools(
    exa_tool_list,
    verbose=True,
)

In [10]:
print(agent.chat("What are the best resturants in toronto?"))

Added user message to memory: What are the best resturants in toronto?
=== Calling Function ===
Calling function: search with args: {"query":"best restaurants in Toronto","num_results":5}
[Exa Tool] Autoprompt: One of the best restaurants in Toronto is:
Got output: [{'title': 'Via Allegro Ristorante - Toronto Fine Dining Restaurant', 'url': 'https://viaallegroristorante.com/', 'id': 'https://viaallegroristorante.com/'}, {'title': 'Sophisticated Dining - Toronto Restaurant | Scaramouche', 'url': 'https://www.scaramoucherestaurant.com/', 'id': 'https://www.scaramoucherestaurant.com/'}, {'title': "Where To Eat In Toronto - Barberian's Steak House", 'url': 'https://www.elitesportstours.ca/blog/where-to-eat-in-toronto-barberians-steakhouse', 'id': 'https://www.elitesportstours.ca/blog/where-to-eat-in-toronto-barberians-steakhouse'}, {'title': 'Where To Eat In Toronto - STK Toronto', 'url': 'https://www.elitesportstours.ca/blog/where-to-eat-in-toronto-stk-toronto', 'id': 'https://www.elitesp

In [11]:
print(agent.chat("tell me more about Osteria Giulia"))

Added user message to memory: tell me more about Osteria Giulia
=== Calling Function ===
Calling function: search with args: {"query":"Osteria Giulia restaurant Toronto","num_results":1}
[Exa Tool] Autoprompt: A fantastic restaurant to try in Toronto is Osteria Giulia.
Got output: [{'title': 'Giulietta', 'url': 'https://giu.ca/', 'id': 'https://giu.ca/'}]

It seems that there might be a slight confusion in the search results. The search returned information about a restaurant named "Giulietta" instead of "Osteria Giulia." Would you like me to search again for more specific information about Osteria Giulia in Toronto?


## Avoiding Context Window Issues

The above example shows the core uses of the Exa tool. We can easily retrieve a clean list of links related to a query, and then fetch the content of each article as a cleaned-up HTML extract. Alternatively, the `search_and_retrieve_documents` tool directly returns the documents from our search results.

The content of the articles can be long and may overflow the LLM's context window. There are two ways to address this:

1. Use `search_and_retrieve_highlights`: This is an endpoint offered by Exa that directly retrieves relevant highlight snippets from the web, instead of full web articles. As a result you don't need to worry about indexing/chunking offline yourself!

2. Wrap `search_and_retrieve_documents` with `LoadAndSearchToolSpec`: We set up and use a "wrapper" tool from LlamaIndex that allows us to load text from any tool into a VectorStore, and query it for retrieval. This is where the `search_and_retrieve_documents` tool becomes particularly useful. The Agent can make a single query to retrieve a large number of documents, using a very small number of tokens, and then make queries to retrieve specific information from the documents.
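
To get a feel for when retrieved text becomes a problem, the rough sketch below estimates token usage from character counts. The ratio of four characters per token is a common rule of thumb assumed here, not an exact tokenizer, and both helper functions are hypothetical. The 2000-character document length mirrors the `max_characters` default noted when the tool was instantiated.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate; chars_per_token is a heuristic, not a real tokenizer."""
    return -(-len(text) // chars_per_token)  # ceiling division


def fits_in_budget(texts, token_budget, chars_per_token=4):
    """Return (fits, total_estimated_tokens) for a list of document texts."""
    total = sum(estimate_tokens(t, chars_per_token) for t in texts)
    return total <= token_budget, total


# Five documents of 2000 characters each against a 2000-token budget
docs = ["x" * 2000] * 5
ok, total = fits_in_budget(docs, token_budget=2000)
print(ok, total)  # False 2500
```

For accurate counts, a real tokenizer for the target model would be needed; this estimate is only meant for quick sanity checks.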

### 1. Using `search_and_retrieve_highlights`

The easiest option is to use `search_and_retrieve_highlights` from Exa. This is essentially a "web RAG" endpoint: Exa handles the chunking and embedding under the hood.

In [12]:
tools = exa_tool.to_tool_list(
    spec_functions=["search_and_retrieve_highlights", "current_date"]
)

In [13]:
agent = OpenAIAgent.from_tools(
    tools,
    verbose=True,
)

In [14]:
response = agent.chat("Tell me more about the recent news on semiconductors")
print(f"Response: {str(response)}")

Added user message to memory: Tell me more about the recent news on semiconductors
=== Calling Function ===
Calling function: current_date with args: {}
Got output: 2024-07-10

=== Calling Function ===
Calling function: search_and_retrieve_highlights with args: {"query":"semiconductors","num_results":5,"start_published_date":"2024-06-01"}
[Exa Tool] Autoprompt: A cutting-edge company in the field of semiconductors is:
Got output: [Document(id_='f0cb0faf-7506-4c44-b952-031ae92953ae', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Silicon carbide is a critical material for high-power, high-temperature applications, and is extremely difficult to produce. onsemi is one of the only companies in the world with the ability to manufacture SiC-based semiconductors from crystal growth to advanced packaging solutions. onsemi’s plan to expand SiC manufacturing with a multi-year brownfield investment of up to $2 billion (44 billi

### 2. Using `LoadAndSearchToolSpec`

Here we wrap the `search_and_retrieve_documents` functionality with `LoadAndSearchToolSpec`.

In [32]:
from llama_index.core.tools.tool_spec.load_and_search import (
    LoadAndSearchToolSpec,
)

# Select the search_and_retrieve_documents tool by name; the resulting list has a single entry
search_and_retrieve_docs_tool = exa_tool.to_tool_list(
    spec_functions=["search_and_retrieve_documents"]
)[0]
date_tool = exa_tool.to_tool_list(spec_functions=["current_date"])[0]
wrapped_retrieve = LoadAndSearchToolSpec.from_defaults(search_and_retrieve_docs_tool)

Our wrapped retrieval tool splits loading and reading into two interfaces. We use `load` to load the documents into the vector store, and `read` to query the vector store. Let's try it out:

In [33]:
wrapped_retrieve.load("This is the best explanation for machine learning transformers:")
print(wrapped_retrieve.read("what is a transformer"))
print(wrapped_retrieve.read("who wrote the first paper on transformers"))

[Exa Tool] Autoprompt: Here is the best explanation for machine learning transformers:
A transformer is a type of neural network architecture that is well-suited for tasks involving processing sequences as inputs. It is designed to create a numerical representation for each element within a sequence, capturing essential information about the element and its surrounding context. Transformers have been instrumental in revolutionizing natural language processing tasks, such as translation and autocomplete services, by leveraging their ability to understand and generate natural language text.
The first paper on transformers was written in 2017.
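
The load/read split can be illustrated with a toy stand-in. The sketch below is not the LlamaIndex implementation: `ToyLoadAndRead` and `fake_fetch` are hypothetical names, and ranking uses simple word overlap instead of a vector index. It only shows the pattern, where `load` fetches once and returns a short status string, and `read` answers follow-up queries from what is already stored.

```python
class ToyLoadAndRead:
    """Toy stand-in for the load/read split; NOT the LlamaIndex implementation."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: query -> list of text strings
        self._store = []     # in-memory stand-in for a vector store

    def load(self, query):
        """Fetch documents once and keep them; return only a short status string."""
        texts = self._fetch(query)
        self._store.extend(texts)
        return f"Loaded {len(texts)} document(s)."

    def read(self, question, top_k=1):
        """Return the stored texts sharing the most words with the question."""
        words = set(question.lower().split())
        ranked = sorted(
            self._store,
            key=lambda text: len(words & set(text.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


def fake_fetch(query):
    # Hypothetical fetcher standing in for a web search call
    return [
        "A transformer is a neural network architecture for sequences.",
        "Diffusion models generate images by reversing a noising process.",
    ]


tool = ToyLoadAndRead(fake_fetch)
print(tool.load("machine learning transformers"))
print(tool.read("what is a transformer"))
```

Because `load` returns only a status string, the caller never sees the full document text, which is what keeps token usage low in this pattern.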


## Creating the Agent

We are now ready to create an Agent that can use Exa's services to their full potential. We will give the agent our wrapped load and read tools, as well as the `current_date` utility, and test it out below:

In [36]:
# Just pass the wrapped tools and the current_date utility
agent = OpenAIAgent.from_tools(
    [*wrapped_retrieve.to_tool_list(), date_tool],
    verbose=True,
)

In [None]:
print(
    agent.chat(
        "Can you summarize everything published in the last month regarding news on superconductors"
    )
)

We asked the agent to summarize superconductor news published in the last month. In a typical run, it uses the `current_date` tool to determine today's date and then applies publication-date filters when calling the wrapped `search_and_retrieve_documents` tool, which loads the results into the vector store. It then queries the loaded documents with the wrapped read tool (`read_search_and_retrieve_documents`).

We can make another query to the vector store to read from it again, now that the articles are loaded: