# NimbleRetriever

`NimbleRetriever` gives RAG applications and AI agents access to live web data.

It calls Nimble's API to execute search queries and retrieve web page content, and it has two modes:


*   **Search & Retrieve**: Execute a search query, get the top result URLs, and retrieve the text from those URLs.
*   **Retrieve**: Provide a list of URLs and retrieve the text or data from those URLs.



If you are new to retrievers, see the [retrievers conceptual guide](/docs/concepts/#retrievers).

Detailed documentation:  [API reference](https://api.python.langchain.com/en/latest/retrievers/langchain_nimble.retrievers.Nimble.NimbleRetriever.html).


## Setup

To get automated tracing of individual queries, set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting the lines below:

In [None]:
# import getpass
# import os
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

### Installation

This retriever lives in the `langchain-community` package.

In [None]:
%pip install -qU langchain-community

To use `NimbleRetriever`, you need a Nimble API key:


*   Sign up for [Nimble](https://app.nimbleway.com/signup).
*   Go to Account Settings -> API Keys.



In [None]:
import getpass
import os

os.environ["NIMBLE_API_KEY"] = getpass.getpass()

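If you re-run the notebook often, you may prefer to be prompted only when the key is missing. The following is a minimal sketch of that pattern; `ensure_api_key` is a hypothetical helper written for this page, not part of LangChain or Nimble, and its `env` and `prompt` parameters exist only so the logic can be exercised without a real prompt.

```python
import getpass
import os
from typing import Callable, MutableMapping


def ensure_api_key(
    name: str = "NIMBLE_API_KEY",
    env: MutableMapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
) -> str:
    """Return the key stored under `name`, prompting only when it is missing.

    Hypothetical helper: `env` and `prompt` are injectable so the function
    can be tested with a plain dict and a fake prompt.
    """
    if not env.get(name):
        # Nothing stored yet (or an empty string): ask once and remember it.
        env[name] = prompt(f"Enter your {name}: ")
    return env[name]
```

In a notebook cell, calling `ensure_api_key()` with no arguments prompts once and then reuses the value held in `os.environ` for the rest of the session.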


## Instantiation

Now we can instantiate our retriever:


In [None]:
from langchain_community.retrievers import NimbleRetriever

retriever = NimbleRetriever(k=3)

## Usage

`NimbleRetriever` has these arguments:


*   `k` (optional) integer - Number of results to return (at most 20).
*   `api_key` (optional) string - Your Nimble API key. Pass it directly when instantiating the retriever, or set the `NIMBLE_API_KEY` environment variable.
*   `search_engine` (optional) string - The search engine that executes your query. Choose one of:
    * `google_search` (default value) - Google's search engine
    * `google_sge` - Google’s search generative experience (read more [here](https://www.nimbleway.com/blog/google-sge))
    * `bing_search` - Bing's search engine
    * `yandex_search` - Yandex's search engine
*   `render` (optional) boolean - Enables or disables JavaScript rendering on the target page. Enabling it may make results slower to return.
*   `locale` (optional) string - LCID standard locale used for the URL request. Alternatively, use `auto` to pick the locale automatically based on country targeting.
*   `country` (optional) string - Country used to access the target URL, given as an ISO Alpha-2 country code, e.g. US, DE, GB.
*   `parsing_type` (optional) string - The text structure of the returned `page_content`
    * `markdown` - Markdown format
    * `simplified_html` (default value) - Compressed version of the original HTML document (~8% of the original HTML size)
    * `plain_text` - Extracts just the text from the HTML
*   `links` (optional) array of strings - URLs of the pages to scrape. If supplied, the retriever returns the raw HTML content of these pages **(THIS WILL ACTIVATE THE SECOND MODE)**

You can read more about each argument in [Nimble's docs](https://docs.nimbleway.com/nimble-sdk/web-api/vertical-endpoints/serp-api/real-time-search-request#request-options).
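Because several arguments accept only a fixed set of values, it can help to check them before making a request. The sketch below encodes the values listed above; `check_nimble_kwargs` is a hypothetical helper for illustration, and the retriever or the Nimble API may apply different or additional validation.

```python
from typing import Any

# Allowed values copied from the argument list above.
SEARCH_ENGINES = {"google_search", "google_sge", "bing_search", "yandex_search"}
PARSING_TYPES = {"markdown", "simplified_html", "plain_text"}


def check_nimble_kwargs(**kwargs: Any) -> dict[str, Any]:
    """Hypothetical pre-flight check; returns kwargs unchanged if they pass."""
    k = kwargs.get("k")
    if k is not None and k > 20:
        raise ValueError("k must be at most 20")

    engine = kwargs.get("search_engine")
    if engine is not None and engine not in SEARCH_ENGINES:
        raise ValueError(f"unknown search_engine: {engine!r}")

    parsing = kwargs.get("parsing_type")
    if parsing is not None and parsing not in PARSING_TYPES:
        raise ValueError(f"unknown parsing_type: {parsing!r}")

    country = kwargs.get("country")
    if country is not None and not (len(country) == 2 and country.isalpha()):
        raise ValueError("country must be a two-letter ISO Alpha-2 code")

    return kwargs
```

The validated dictionary can then be unpacked into the constructor, for example `NimbleRetriever(**check_nimble_kwargs(k=3, search_engine="bing_search"))`.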



### Example of the first mode, where we supply a query but no links


In [None]:
query = "Latest trends in artificial intelligence"

retriever.invoke(query)

 Document(metadata={'title': 'Five Trends in AI and Data Science for 2025', 'snippet': 'Jan 8, 2025 — 1. Leaders will grapple with both the promise and hype around agentic AI. · 2. The time has come to measure results from generative AI\xa0...', 'url': 'https://sloanreview.mit.edu/article/five-trends-in-ai-and-data-science-for-2025/', 'position': 2, 'entity_type': 'OrganicResult'}, page_content='Five Trends in AI and Data Science for 2025\nMobile Menu\nMenu\nSearch\nTopics\n< Back to Menu\nData, AI, & Machine Learning\nInnovation\nLeadership\nManaging Technology\nMarketing\nOperations\nSocial Responsibility\nStrategy\nWorkplace, Teams, & Culture\nAll Topics\nTrending\nAI & Machine Learning\nOrganizational Culture\nHybrid Work\nOur Research\n< Back to Menu\nBig ideas Research Projects\nArtificial Intelligence and Business Strategy\nResponsible AI\nFuture of the Workforce\nFuture of Leadership\nAll Research Projects\nSpotlight\n< Back to Menu\nMost Popular\nAI in Action\nHybrid Work\nCoa

A single document looks like this:

In [19]:
import json

example_doc = retriever.invoke(query)[0]
print("Page Content: \n", json.dumps(example_doc.page_content, indent=2))
print("Metadata: \n", json.dumps(example_doc.metadata, indent=2))

Page Content: 
Metadata: 
 {
  "title": "8 AI and machine learning trends to watch in 2025",
  "snippet": "Jan 3, 2025 \u2014 1. Hype gives way to more pragmatic approaches \u00b7 2. Generative AI moves beyond chatbots \u00b7 3. AI agents are the next frontier \u00b7 4. Generative AI\u00a0...",
  "url": "https://www.techtarget.com/searchenterpriseai/tip/9-top-AI-and-machine-learning-trends",
  "position": 1,
  "entity_type": "OrganicResult"
}


### Example of the second mode, where we supply links to scrape

In [26]:
retriever = NimbleRetriever(links=["example.com"])
retriever.invoke(input="")

[Document(metadata={'title': None, 'snippet': None, 'url': 'https://example.com', 'position': None, 'entity_type': 'HtmlContent'}, page_content='<!doctype html>\n<html>\n<head>\n    <title>Example Domain</title>\n\n    <meta charset="utf-8" />\n    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />\n    <meta name="viewport" content="width=device-width, initial-scale=1" />\n    <style type="text/css">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 2em;\n        background-color: #fdfdff;\n        border-radius: 0.5em;\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700
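Since this mode returns raw HTML in `page_content`, you may want to reduce it to visible text before passing it to a model. The following is a rough sketch that uses only Python's standard-library `html.parser`; `TextExtractor` and `html_to_text` are illustrative names introduced here, not part of `NimbleRetriever`, and real pages may need a more robust parser.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the bodies of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep non-empty text that is not inside a script or style element.
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Return the text nodes of `html`, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

With a retrieved document `doc`, `html_to_text(doc.page_content)` yields a plain-text version of the page, including the contents of its `<title>` element.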

## Use within a chain

Like other retrievers, NimbleRetriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).

We will need an LLM or chat model:


In [None]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)

In [None]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the context provided.

Context: {context}

Question: {question}"""
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


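# Re-create the retriever in search mode; the second-mode example above
# replaced it with a links-only retriever.
retriever = NimbleRetriever(k=3)

chain = (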
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
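Retrieved pages can be long, so joining full `page_content` values may exceed the model's context window. As a hedged alternative to `format_docs`, the sketch below trims each page and prefixes its source URL. `format_docs_capped`, the `max_chars_per_doc` default, and the `FakeDoc` stand-in (used so the example runs without LangChain installed) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document: same two attributes, no dependencies."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_capped(docs, max_chars_per_doc: int = 2000) -> str:
    """Join documents like format_docs, but cap each page and cite its URL."""
    parts = []
    for doc in docs:
        url = doc.metadata.get("url", "unknown source")
        # Slice the page so one long document cannot crowd out the others.
        parts.append(f"Source: {url}\n{doc.page_content[:max_chars_per_doc]}")
    return "\n\n".join(parts)
```

Because it only reads `page_content` and `metadata`, it can replace `format_docs` in the chain above: `{"context": retriever | format_docs_capped, "question": RunnablePassthrough()}`.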

In [None]:
chain.invoke("Who is the CEO of Nimbleway?")

'The CEO of Nimble Way is Uriel Knorovich.'

## API reference

For detailed documentation of all `NimbleRetriever` features and configurations, head to the [API reference](https://api.python.langchain.com/en/latest/retrievers/langchain_nimble.retrievers.Nimble.NimbleRetriever.html).