# Building an Exa Search Powered Data Agent

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-exa/examples/exa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This tutorial walks through using the LLM tools provided by the [Exa API](https://exa.ai) to allow LLMs to use semantic queries to search for and retrieve rich web content from the internet.

To get started, you will need an [OpenAI API key](https://platform.openai.com/account/api-keys) and an [Exa API key](https://dashboard.exa.ai/api-keys).
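Before running the cells, it can help to confirm that both keys are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_api_keys` is hypothetical and not part of LlamaIndex or Exa, and the variable names match the ones this tutorial reads from the environment.

```python
import os

# Environment variable names used in this tutorial
REQUIRED_KEYS = ("OPENAI_API_KEY", "EXA_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_api_keys()
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to try without touching your real environment.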

We will import the relevant agents and tools and pass them our keys here:

In [None]:
# Install the relevant LlamaIndex packages, incl. core and Exa tool
!pip install llama-index llama-index-core llama-index-tools-exa

In [None]:
# Get OS for environment variables
import os

# Set up OpenAI
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

# NOTE:
# You must have an OpenAI API key in the environment variable OPENAI_API_KEY
# You must have an Exa API key in the environment variable EXA_API_KEY

# Set up the Exa search tool
from llama_index.tools.exa import ExaToolSpec

# Instantiate
exa_tool = ExaToolSpec(
    api_key=os.environ["EXA_API_KEY"],
    # max_characters=2000   # this is the default
)

# Get the list of tools to see what Exa offers
exa_tool_list = exa_tool.to_tool_list()
for tool in exa_tool_list:
    print(tool.metadata.name)

search
retrieve_documents
search_and_retrieve_documents
search_and_retrieve_highlights
find_similar
current_date


## Testing the Exa tools

We've imported our OpenAI agent, set up the API keys, and initialized our tool, checking the methods that it has available. Let's test out the tool before setting up our Agent.

By default, the Exa search tools use the autoprompt option: Exa passes the query through an LLM to refine it in line with Exa's query best practices.

In [None]:
exa_tool.search_and_retrieve_documents("machine learning transformers", num_results=3)

[Exa Tool] Autoprompt: Here is a comprehensive guide on machine learning transformers:


[Document(id_='af546f08-f706-40d3-a4fd-6c8613b0bab6', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='The famous paper “Attention is all you need” in 2017 changed the way we were thinking about attention. With enough data, matrix multiplications, linear layers, and layer normalization we can perform state-of-the-art-machine-translation. Nonetheless, 2020 was definitely the year of transformers! From natural language now they are into computer vision tasks. How did we go from attention to self-attention? Why does the transformer work so damn well? What are the critical components for its success? Read on and find out! In my opinion, transformers are not so hard to grasp. It\'s the combination of all the surrounding concepts that may be confusing, including attention. That’s why we will slowly build around all the fundamental concepts. With Recurrent Neural Networks (RNN’s) we used to treat sequences sequentially to kee

In [None]:
exa_tool.find_similar(
    "https://www.mihaileric.com/posts/transformers-attention-in-disguise/"
)

[{'title': 'Transformers: a Primer',
  'url': 'http://www.columbia.edu/~jsl2239/transformers.html',
  'id': 'http://www.columbia.edu/~jsl2239/transformers.html'},
 {'title': 'Illustrated Guide to Transformers- Step by Step Explanation',
  'url': 'https://towardsdatascience.com/illustrated-guide-to-transformers-step-by-step-explanation-f74876522bc0?gi=8fe76db5c4d9',
  'id': 'https://towardsdatascience.com/illustrated-guide-to-transformers-step-by-step-explanation-f74876522bc0?gi=8fe76db5c4d9'},
 {'title': 'The Transformer Attention Mechanism - MachineLearningMastery.com',
  'url': 'https://machinelearningmastery.com/the-transformer-attention-mechanism/',
  'id': 'https://machinelearningmastery.com/the-transformer-attention-mechanism/'}]

In [None]:
exa_tool.search_and_retrieve_documents(
    "This is a summary of recent research around diffusion models:", num_results=1
)

[Exa Tool] Autoprompt: Here is a summary of recent research around diffusion models:


[Document(id_='855f04cf-ec0c-462e-8fa7-c8cc23e54fc4', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Getting-started-with-Diffusion-Literature\nSummary of the most important papers and blogs about diffusion models for students to learn about diffusion models. Also contains an overview of all published robotics diffusion papers\nLearning about Diffusion models\nWhile there exist many tutorials for Diffusion models, below you can find an overview of some introduction blog posts and video, which I found the most intuitive and useful:\n\nWhat are Diffusion Models?: an introduction video, which introduces the general idea of diffusion models and some high-level math about how the model works\n Generative Modeling by Estimating Gradients of the Data Distribution: blog post from the one of the most influential authors in this area, which introduces diffusion models from the score-based perspective\n What are Diffusion Model

While `search_and_retrieve_documents` returns raw text from the source document, `search_and_retrieve_highlights` returns relevant curated snippets.

In [None]:
exa_tool.search_and_retrieve_highlights(
    "This is a summary of recent research around diffusion models:", num_results=1
)

[Exa Tool] Autoprompt: Here is a summary of recent research around diffusion models:


[Document(id_='38203997-c7f6-49d7-b977-ae3d57e22d26', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Diffusion models are a type of generative model and in this field, the main focus are vision based applications, thus all theory papers mentioned in the text below are mostly focused on image synthesis or similar tasks related to it. Diffusion models can be viewed from two perspectives: one is based on the initial idea of of Sohl-Dickstein et al., (2015) and the other is based on a different direction of research: score-based generative models. Song & Ermon, (2019) introduced the score-based generative models category. They presented the noise-conditioned score network (NCSN), which is a predecessor to diffusion model. The main idea of the paper is to learn the score function of the unknown data distribution with a neural network.', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadat

### Exploring other Exa functionalities
There are additional parameters that you can pass to Exa methods.

You can filter results by the date on which each result was published:

In [None]:
# Example 1: Calling search_and_retrieve_documents with date filters
exa_tool.search_and_retrieve_documents(
    "Advancements in quantum computing",
    num_results=3,
    start_published_date="2024-01-01",
    end_published_date="2024-07-10",
)

[Exa Tool] Autoprompt: Here is a recent advancement in quantum computing:


[Document(id_='9f6e1924-ba7f-46a4-b0ea-269625fde0e1', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'),
 Document(id_='e08b7309-7f59-43c3-a11b-43a581c74f15', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='References Goldin, G. A., Menikoff, R. & Sharp, D. H. Comments on ‘general theory for quantum statistics in two dimensions’. Phys. Rev. Lett. 54, 603–603 (1985). Article \n ADS \n MathSciNet \n CAS \n PubMed\n\nGoogle Scholar \n Moore, G. & Seiberg, N. Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989). Article \n ADS \n MathSciNet\n\nGoogle Scholar \n Moore, G. & Read, N. Nonabelions in the fractional quantum Hall effect. Nucl. Phys. B 360, 362–
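The date filters above are plain `YYYY-MM-DD` strings, so a rolling window such as "the last 30 days" has to be computed before the call. A small sketch of that computation follows; the helper `published_date_window` is a hypothetical name introduced here for illustration, and it only builds the strings without calling Exa.

```python
from datetime import date, timedelta


def published_date_window(days_back, today=None):
    """Return (start, end) as YYYY-MM-DD strings covering the last `days_back` days."""
    end = today or date.today()
    start = end - timedelta(days=days_back)
    return start.isoformat(), end.isoformat()


start, end = published_date_window(30)
print(start, end)
# These strings could then be passed as start_published_date / end_published_date.
```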

You can constrain results to specified domains (or exclude domains):

In [None]:
# Example 2: Calling search_and_retrieve_documents with include_domains filters
exa_tool.search_and_retrieve_documents(
    "Climate change mitigation strategies",
    num_results=3,
    include_domains=["www.nature.com", "www.sciencemag.org", "www.pnas.org"],
)

[Exa Tool] Autoprompt: Here is a comprehensive climate change mitigation strategy:


[Document(id_='de6db381-d546-4eef-9fc4-af19585b61be', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="Published: 17 June 2012\n\nNature Climate Change \n volume 2, pages 471–474 (2012)Cite this article\n\n3119 Accesses\n\n51 Citations\n\n141 Altmetric\n\nMetrics details\n\nSubjects\n\nTwenty-one coherent major initiatives could together stimulate sufficient reductions by 2020 to bridge the global greenhouse-gas emissions gap.\n\nWe propose a new approach — which we call 'wedging the gap' — consisting of 21 coherent major initiatives that together would trigger greenhouse-gas emission reductions of around 10 gigatonnes of carbon dioxide equivalent (Gt CO2e) by 2020, plus the benefits of enhanced reductions in air-pollutant emissions. This supports and goes substantially beyond the emission reductions proposed by national governments under the United Nations Framework Convention on Climate Change (UNFCCC). The approach 
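If you already have a list of trusted article URLs, the domain list for a filter like the one above can be derived from them rather than typed by hand. This is a sketch using only the standard library; `domains_from_urls` is a hypothetical helper, not an Exa or LlamaIndex function.

```python
from urllib.parse import urlparse


def domains_from_urls(urls):
    """Return unique hostnames in first-seen order, suitable as a domain filter list."""
    seen = []
    for url in urls:
        host = urlparse(url).hostname  # None when the string has no scheme/netloc
        if host and host not in seen:
            seen.append(host)
    return seen


print(
    domains_from_urls(
        [
            "https://www.nature.com/articles/a",
            "https://www.pnas.org/doi/1",
            "https://www.nature.com/b",
        ]
    )
)
```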

You can turn off autoprompt, which gives you more direct, fine-grained control over how Exa is queried.

In [None]:
# Example 3: Calling search_and_retrieve_documents with autoprompt off
exa_tool.search_and_retrieve_documents(
    "Here is an article on the advancements of quantum computing",
    num_results=3,
    use_autoprompt=False,
)

[Exa Tool] Autoprompt: None


[Document(id_='dae22188-b0a2-4941-bf06-282f7a99f435', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='14 June 2023\n\n‘Benchmark’ experiment suggests quantum computers could have useful real-world applications within two years.\n\nFour years ago, physicists at Google claimed their quantum computer could outperform classical machines — although only at a niche calculation with no practical applications. Now their counterparts at IBM say they have evidence that quantum computers will soon beat ordinary ones at useful tasks, such as calculating properties of materials or the interactions of elementary particles.\n\nAccess options\n Access Nature and 54 other Nature Portfolio journals Get Nature+, our best-value online-access subscription cancel any time Subscribe to this journal Receive 51 print issues and online access $199.00 per year only $3.90 per issue Rent or buy this article Prices vary by article type $1.95 $39.9

Exa also has an option to do standard keyword-based search by specifying `type="keyword"`.

In [None]:
# Example 4: Calling search_and_retrieve_highlights with keyword search type
exa_tool.search_and_retrieve_highlights(
    "Advancements in quantum computing", num_results=3, type="keyword"
)

[Exa Tool] Autoprompt: None


[Document(id_='f360e2cc-ddae-42f6-81f8-3ce268826ba3', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Data Acquisition, Preparation, and the Need for Application Knowledge Data is acquired in a\xa0classical form; the quantum properties are then applied to the data in the quantum computer. Data that will be used in subsequent quantum computations has a\xa0limited lifetime; the information degrades with time. The current state of quantum computing is referred to as the noisy intermediate-scale quantum (NISQ) era. These processors are sensitive to their environment, prone to quantum decoherence, and not yet capable of continuous quantum error correction. This is improving significantly with advancements in materials, and there are techniques that can be applied to refresh the information, such as, “Dynamical Decoupling”.', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{co

Finally, Magic Search is a new feature in Exa that routes each query to the better-suited search type: either Exa's proprietary neural search or the industry-standard keyword search mentioned above.

In [None]:
# Example 5: Calling search_and_retrieve_highlights with magic search (explicitly)
exa_tool.search_and_retrieve_highlights(
    "Advancements in quantum computing", num_results=3, type="magic"
)

[Exa Tool] Autoprompt: Here is a recent advancement in quantum computing:


[Document(id_='4d458de1-18c8-4e51-9dd9-d5f0be08c219', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Four years ago, physicists at Google claimed their quantum computer could outperform classical machines — although only at a niche calculation with no practical applications. Now their counterparts at IBM say they have evidence that quantum computers will soon beat ordinary ones at useful tasks, such as calculating properties of materials or the interactions of elementary particles. Access Nature and 54 other Nature Portfolio journals Get Nature+, our best-value online-access subscription cancel any time Subscribe to this journal Receive 51 print issues and online access $199.00 per year only $3.90 per issue Rent or buy this article Prices vary by article type $1.95 $39.95 Prices may be subject to local taxes which are calculated during checkout', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_tem

We can see we have different tools to search for results, retrieve the results, find similar results to a web page, and finally a tool that combines search and document retrieval into a single tool. We will test them out in LLM Agents below:

### Using the Search and Retrieve documents tools in an Agent

We can create an agent with access to the above tools and start testing it out:

In [None]:
# Pass the full list of Exa tools to the Agent
agent = FunctionAgent(
    tools=exa_tool_list,
    llm=OpenAI(model="gpt-4.1"),
)

In [None]:
print(await agent.run("What are the best restaurants in Toronto?"))

## Avoiding Context Window Issues

The above example shows the core uses of the Exa tool. We can easily retrieve a clean list of links related to a query, and then fetch the content of each article as a cleaned-up HTML extract. Alternatively, the `search_and_retrieve_documents` tool directly returns the documents from our search result.

The content of the articles can be long and may overflow current LLM context windows. There are two ways to handle this:

1. Use `search_and_retrieve_highlights`: This is an endpoint offered by Exa that directly retrieves relevant highlight snippets from the web, instead of full web articles. As a result you don't need to worry about indexing/chunking offline yourself!

2. Wrap `search_and_retrieve_documents` with `LoadAndSearchToolSpec`: We set up and use a "wrapper" tool from LlamaIndex that allows us to load text from any tool into a VectorStore, and query it for retrieval. This is where the `search_and_retrieve_documents` tool becomes particularly useful. The Agent can make a single query to retrieve a large number of documents, using a very small number of tokens, and then make queries to retrieve specific information from the documents.
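To get a feel for when the context window becomes a problem, a back-of-the-envelope estimate is often enough. The sketch below assumes roughly four characters per token, which is only a rule of thumb and not the tokenizer's real count; both helper names are hypothetical.

```python
def estimate_tokens(texts, chars_per_token=4):
    """Rough token estimate: total characters divided by an assumed chars-per-token ratio."""
    total_chars = sum(len(t) for t in texts)
    return -(-total_chars // chars_per_token)  # ceiling division


def fits_in_context(texts, context_tokens, reserved_tokens=0):
    """True if the estimate fits once `reserved_tokens` are set aside for prompt and answer."""
    return estimate_tokens(texts) <= context_tokens - reserved_tokens


# Three documents at the tool's default limit of 2000 characters each
docs = ["x" * 2000] * 3
print(estimate_tokens(docs), fits_in_context(docs, context_tokens=1000))
```

For an accurate count you would use the model's own tokenizer; this estimate is only meant to show how quickly several full documents add up.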

### 1. Using `search_and_retrieve_highlights`

The easiest option is to use `search_and_retrieve_highlights` from Exa. This is essentially a "web RAG" endpoint: Exa handles chunking and embedding under the hood.

In [None]:
tools = exa_tool.to_tool_list(
    spec_functions=["search_and_retrieve_highlights", "current_date"]
)

In [None]:
agent = FunctionAgent(
    tools=tools,
    llm=OpenAI(model="gpt-4.1"),
)

In [None]:
response = await agent.run("Tell me more about the recent news on semiconductors")
print(f"Response: {str(response)}")

### 2. Using `LoadAndSearchToolSpec`

Here we wrap the `search_and_retrieve_documents` functionality with the `LoadAndSearchToolSpec`.

In [None]:
from llama_index.core.tools.tool_spec.load_and_search import (
    LoadAndSearchToolSpec,
)

# Select the search_and_retrieve_documents tool by name; the returned list has a single entry
search_and_retrieve_docs_tool = exa_tool.to_tool_list(
    spec_functions=["search_and_retrieve_documents"]
)[0]
date_tool = exa_tool.to_tool_list(spec_functions=["current_date"])[0]
wrapped_retrieve = LoadAndSearchToolSpec.from_defaults(search_and_retrieve_docs_tool)

Our wrapped retrieval tools split loading and reading into separate interfaces. We use `load` to load the documents into the vector store, and `read` to query the vector store. Let's try it out again.

In [None]:
wrapped_retrieve.load("This is the best explanation for machine learning transformers:")
print(wrapped_retrieve.read("what is a transformer"))
print(wrapped_retrieve.read("who wrote the first paper on transformers"))

[Exa Tool] Autoprompt: Here is the best explanation for machine learning transformers:
A transformer is a type of neural network architecture that is well-suited for tasks involving processing sequences as inputs. It is designed to create a numerical representation for each element within a sequence, capturing essential information about the element and its neighboring context. Transformers have been instrumental in revolutionizing natural language processing tasks, such as translation and autocomplete services, by leveraging their capabilities in understanding and generating natural language text.
The first paper on transformers was written in 2017.
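The load/read split can be pictured with a toy stand-in. The class below is not LlamaIndex's implementation: it replaces the vector store with a plain list and embedding similarity with word overlap, and every name in it is hypothetical. It only illustrates why the pattern saves tokens: `load` returns a short status message instead of the documents, and `read` returns just the best-matching text.

```python
class ToyLoadAndSearch:
    """Toy stand-in for the load/read split; not LlamaIndex's implementation."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: query -> list of text strings
        self._store = []  # loaded texts live here between calls

    def load(self, query):
        """Fetch texts once and keep them; return only a short status message."""
        texts = self._fetch(query)
        self._store.extend(texts)
        return f"Loaded {len(texts)} document(s)."

    def read(self, question, top_k=1):
        """Return the stored texts sharing the most words with the question."""
        words = set(question.lower().split())
        ranked = sorted(
            self._store,
            key=lambda text: len(words & set(text.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


# A fake fetcher stands in for the web search so the sketch runs offline
toy = ToyLoadAndSearch(
    lambda query: [
        "A transformer is a neural network architecture.",
        "Diffusion models generate images.",
    ]
)
print(toy.load("machine learning transformers"))
print(toy.read("what is a transformer"))
```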


## Creating the Agent

We are now ready to create an Agent that can use Exa's services to their full potential. We will give the agent our wrapped load and read tools, as well as the `current_date` utility, and test it out below:

In [None]:
# Just pass the wrapped tools and the current_date utility
agent = FunctionAgent(
    tools=[*wrapped_retrieve.to_tool_list(), date_tool],
    llm=OpenAI(model="gpt-4.1"),
)

In [None]:
print(
    await agent.run(
        "Can you summarize everything published in the last month regarding news on superconductors"
    )
)

We asked the agent to summarize documents about superconductors published in the last month. It used the `current_date` tool to determine today's date, then called the wrapped `search_and_retrieve_documents` tool with publication-date filters to load the documents into the vector store, and read from them using `read_search_and_retrieve_documents`.

We can make another query to the vector store to read from it again, now that the articles are loaded: