# Custom Search Tools for Retrieval Agents


In [1]:
%pip install -Uq langchain huggingface_hub

Note: you may need to restart the kernel to use updated packages.


## [@tool decorator](https://python.langchain.com/docs/modules/agents/tools/custom_tools "Learn more")

The `@tool` decorator is the simplest way to define a custom tool. By default it uses the function name as the tool name; to override that, pass a string as the first argument. The decorator also uses the function's docstring as the tool's description, so a **docstring** ***MUST*** be provided.

For example:

```python
@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b
```

We can access the tool's properties directly:
```python
print(multiply.name)
print(multiply.description)
print(multiply.args)
```

```shell
multiply
multiply(a: int, b: int) -> int - Multiply two numbers.
{'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}
```
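
To make the naming rules above concrete, here is a minimal standard-library sketch of how a decorator of this kind could derive a tool's name and description. `simple_tool` and `SimpleTool` are hypothetical names made up for illustration; this is not LangChain's implementation, only an approximation of the behavior described.

```python
import inspect
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool object (not LangChain's class)."""

    name: str
    description: str
    func: Callable

    def run(self, *args, **kwargs):
        # Delegate to the wrapped function.
        return self.func(*args, **kwargs)


def simple_tool(arg: Union[str, Callable, None] = None):
    """Illustrative decorator usable as @simple_tool or @simple_tool("name")."""

    def build(func: Callable, name: Optional[str] = None) -> SimpleTool:
        doc = inspect.getdoc(func)
        if not doc:
            # Mirrors the rule above: a docstring is required.
            raise ValueError(f"{func.__name__} needs a docstring")
        return SimpleTool(name=name or func.__name__, description=doc, func=func)

    if callable(arg):
        # Used bare: @simple_tool
        return build(arg)
    # Used with a custom name: @simple_tool("custom_name")
    return lambda func: build(func, name=arg)


@simple_tool
def multiply_demo(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


@simple_tool("adder")
def add_demo(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


print(multiply_demo.name, "|", multiply_demo.description)
print(add_demo.name, "|", add_demo.run(2, 3))
```

The first tool takes its name from the function, while the second uses the string passed to the decorator.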

Later, we'll see how the output format differs depending on whether we call a `@tool`-decorated function directly, like a regular function, or through its tool interface with `.run()`.

In [2]:
from langchain.tools import tool, BaseTool

## LangSmith
If you have access to LangSmith, it becomes extremely helpful as projects grow more complex, because it traces each step taken during execution.

That is especially true here, since we're building custom tools, creating an agent with a model from the Hugging Face Hub, and tying everything together in a runnable chain.

It matters even more once you add memory, vector store search, and so on.

In [3]:
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<YOUR_LANGCHAIN_API_KEY>"
os.environ["LANGCHAIN_PROJECT"] = "build-custom-tools"
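
If you would rather not paste keys into the notebook, one option is to prompt for them only when they are missing. The sketch below uses only the standard library; `ensure_env` is a hypothetical helper name, not part of LangChain or LangSmith.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_env(name: str, ask: Callable[[str], str] = getpass) -> str:
    """Return os.environ[name], asking for the value once if it is not set."""
    value = os.environ.get(name)
    if not value:
        # getpass hides the typed value, so the key does not end up in the output.
        value = ask(f"Enter {name}: ")
        os.environ[name] = value
    return value


# Non-secret settings can still be set directly.
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")
os.environ.setdefault("LANGCHAIN_PROJECT", "build-custom-tools")
```

Calling `ensure_env("LANGCHAIN_API_KEY")` in a cell would then prompt for the key once; the same helper could be reused for the other API keys in this notebook.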

## [Arxiv Search](https://python.langchain.com/docs/integrations/tools/arxiv "Learn more")

In [4]:
%pip install -Uq arxiv

Note: you may need to restart the kernel to use updated packages.


In [5]:
from langchain.utilities import ArxivAPIWrapper

In [6]:
@tool("arxiv_search")
def arxiv_search(query: str) -> str:
    """
    This function uses the ArxivAPIWrapper to search for scientific papers on Arxiv.

    Parameters:
    query (str): The search term to use in the query.

    Returns:
    str: The search results from Arxiv.
    """
    # Create an instance of the ArxivAPIWrapper
    arxiv = ArxivAPIWrapper()

    # Perform the search query
    results = arxiv.run(query)

    # Return the search results
    return results

In [7]:
print(arxiv_search("2306.11984"))

Published: 2023-06-21
Title: TauPETGen: Text-Conditional Tau PET Image Synthesis Based on Latent Diffusion Models
Authors: Se-In Jang, Cristina Lois, Emma Thibault, J. Alex Becker, Yafei Dong, Marc D. Normandin, Julie C. Price, Keith A. Johnson, Georges El Fakhri, Kuang Gong
Summary: In this work, we developed a novel text-guided image synthesis technique
which could generate realistic tau PET images from textual descriptions and the
subject's MR image. The generated tau PET images have the potential to be used
in examining relations between different measures and also increasing the
public availability of tau PET datasets. The method was based on latent
diffusion models. Both textual descriptions and the subject's MR prior image
were utilized as conditions during image generation. The subject's MR image can
provide anatomical details, while the text descriptions, such as gender, scan
time, cognitive test scores, and amyloid status, can provide further guidance
regarding where the tau 

In [8]:
print(
        arxiv_search.name,
        "\n\n",
        arxiv_search.description,
        "\n\n",
        arxiv_search.args
    )

arxiv_search 

 arxiv_search(query: str) -> str - This function uses the ArxivAPIWrapper to search for scientific papers on Arxiv.

    Parameters:
    query (str): The search term to use in the query.

    Returns:
    str: The search results from Arxiv. 

 {'query': {'title': 'Query', 'type': 'string'}}


In [9]:
# Call the function as a tool
arxiv_search.run("2306.11984")

"Published: 2023-06-21\nTitle: TauPETGen: Text-Conditional Tau PET Image Synthesis Based on Latent Diffusion Models\nAuthors: Se-In Jang, Cristina Lois, Emma Thibault, J. Alex Becker, Yafei Dong, Marc D. Normandin, Julie C. Price, Keith A. Johnson, Georges El Fakhri, Kuang Gong\nSummary: In this work, we developed a novel text-guided image synthesis technique\nwhich could generate realistic tau PET images from textual descriptions and the\nsubject's MR image. The generated tau PET images have the potential to be used\nin examining relations between different measures and also increasing the\npublic availability of tau PET datasets. The method was based on latent\ndiffusion models. Both textual descriptions and the subject's MR prior image\nwere utilized as conditions during image generation. The subject's MR image can\nprovide anatomical details, while the text descriptions, such as gender, scan\ntime, cognitive test scores, and amyloid status, can provide further guidance\nregarding w

## [Google Jobs](https://python.langchain.com/docs/integrations/tools/google_jobs "Learn more")

In [10]:
import os

os.environ["SERPAPI_API_KEY"] = "<YOUR_SERPAPI_API_KEY>"

In [11]:
%pip install -Uq google-search-results

Note: you may need to restart the kernel to use updated packages.


In [12]:
from langchain.tools import tool

In [13]:
from langchain.tools.google_jobs import GoogleJobsQueryRun
from langchain.utilities.google_jobs import GoogleJobsAPIWrapper

In [14]:
@tool("google_job_search")
def google_job_search(query: str) -> str:
    """
    This function uses the GoogleJobsAPIWrapper to search for jobs on Google Jobs.

    Parameters:
    query (str): The search term to use in the query.

    Returns:
    str: The search results from Google Jobs.
    """
    # Create a GoogleJobsQueryRun tool backed by the GoogleJobsAPIWrapper
    gjs = GoogleJobsQueryRun(api_wrapper=GoogleJobsAPIWrapper())

    # Perform the search query
    results = gjs.run(query)

    # return the results
    return results

In [15]:
# Print the results of a search, as a regular function
print(google_job_search("Software Engineer"))


_______________________________________________
Job Title: Software Engineer
Company Name: VR Visions
Location:   Canada   
Description: Job Description

Are you excited about Virtual Reality and Haptic technology? Do you love to make lifelike VR models and to integrate them with cutting edge haptic system that provides realistic force feedback? You have a unique opportunity to join a team of industry-leading researchers and experienced engineers with diverse backgrounds in haptics, VR, robotics, AI, computer hardware and software to develop haptic and VR based surgical training simulation systems. These systems will be used by a vast number of hospitals, medical schools and surgical training institutes and to significantly improve the efficiency of surgical training.

Your responsibilities include but not limited to:
• Analyze, design, develop VR deformable models and collision detection systems for advanced haptic systems
• Design, develop and debug real-time VR software with integr

In [16]:
# Call the function as a tool
google_job_search.run("Software Engineer")

'\n_______________________________________________\nJob Title: Software Engineer\nCompany Name: VR Visions\nLocation:   Canada   \nDescription: Job Description\n\nAre you excited about Virtual Reality and Haptic technology? Do you love to make lifelike VR models and to integrate them with cutting edge haptic system that provides realistic force feedback? You have a unique opportunity to join a team of industry-leading researchers and experienced engineers with diverse backgrounds in haptics, VR, robotics, AI, computer hardware and software to develop haptic and VR based surgical training simulation systems. These systems will be used by a vast number of hospitals, medical schools and surgical training institutes and to significantly improve the efficiency of surgical training.\n\nYour responsibilities include but not limited to:\n• Analyze, design, develop VR deformable models and collision detection systems for advanced haptic systems\n• Design, develop and debug real-time VR software

## [Wikipedia Search](https://python.langchain.com/docs/integrations/tools/wikipedia)

In [17]:
%pip install -Uq wikipedia

Note: you may need to restart the kernel to use updated packages.


In [18]:
from langchain.tools import WikipediaQueryRun
from langchain.utilities import WikipediaAPIWrapper

In [19]:
@tool("wiki_search")
def wiki_search(query: str) -> str:
    """
    This function uses the WikipediaAPIWrapper to search for information on Wikipedia.

    Parameters:
    query (str): The search term to use in the query.

    Returns:
    str: The search results from Wikipedia.
    """
    # Create a WikipediaQueryRun tool backed by the WikipediaAPIWrapper
    wikisearch = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

    # Perform the search query
    results = wikisearch.run(query)

    # return the results
    return results

In [20]:
# Print the results of a search, as a regular function
print(wiki_search("The Body Keeps the Score"))

Page: The Body Keeps the Score
Summary: The Body Keeps the Score: Brain, Mind, and Body in the Healing of Trauma is a 2014 book by Bessel van der Kolk about the effects of psychological trauma, also known as traumatic stress. The book describes van der Kolk's research and experiences on how individuals are affected by traumatic stress, and its effects on the mind and body. It is based on his 1994 Harvard Review of Psychiatry article "The body keeps the score: memory and the evolving psychobiology of posttraumatic stress".The Body Keeps the Score has been published in 36 languages. As of July 2021 the book had spent more than 141 weeks on the New York Times Bestseller List for nonfiction, with 27 of those weeks spent in the No. 1 position.

Page: Bessel van der Kolk
Summary: Bessel van der Kolk (born 1943) is a psychiatrist, author, researcher and educator based in Boston, United States. Since the 1970s his research has been in the area of post-traumatic stress. He is the author of The 

In [21]:
# Call the function as a tool
wiki_search.run("The Body Keeps the Score")

'Page: The Body Keeps the Score\nSummary: The Body Keeps the Score: Brain, Mind, and Body in the Healing of Trauma is a 2014 book by Bessel van der Kolk about the effects of psychological trauma, also known as traumatic stress. The book describes van der Kolk\'s research and experiences on how individuals are affected by traumatic stress, and its effects on the mind and body. It is based on his 1994 Harvard Review of Psychiatry article "The body keeps the score: memory and the evolving psychobiology of posttraumatic stress".The Body Keeps the Score has been published in 36 languages. As of July 2021 the book had spent more than 141 weeks on the New York Times Bestseller List for nonfiction, with 27 of those weeks spent in the No. 1 position.\n\nPage: Bessel van der Kolk\nSummary: Bessel van der Kolk (born 1943) is a psychiatrist, author, researcher and educator based in Boston, United States. Since the 1970s his research has been in the area of post-traumatic stress. He is the author o
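
For each of the three tools above, the two outputs look different largely because of how they are displayed: the first cell wraps the call in `print`, which renders newlines, while the `.run(...)` cell leaves a bare expression that the notebook echoes using its `repr`. A small sketch with a made-up string (not real tool output) shows the same effect:

```python
# A made-up, two-line string standing in for a tool's return value.
demo_result = "Published: 2023-06-21\nTitle: Example paper"

# print() renders the newline, as in the `print(...)` cells above.
print(demo_result)

# repr() gives the quoted, escaped form a notebook shows for a bare expression.
print(repr(demo_result))
```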

# Create an Agent using [HuggingFaceHub](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfacehub "Learn more")

The `ChatHuggingFace` wrapper used later in this notebook works with *HuggingFaceTextGenInference*, *HuggingFaceEndpoint*, and *HuggingFaceHub* LLMs.

When the class is instantiated, the `model_id` is resolved from the URL provided to the LLM, and the appropriate tokenizer is loaded from the Hugging Face Hub.

See the [Python API Documentation for LangChain](https://api.python.langchain.com/en/stable/chat_models/langchain_community.chat_models.huggingface.ChatHuggingFace.html?highlight=chathuggingface#langchain-community-chat-models-huggingface-chathuggingface "Learn more") for more detailed information.

In [22]:
from langchain_community.llms import HuggingFaceHub

# Instantiate the HuggingFaceHub LLM
llm = HuggingFaceHub(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    model_kwargs={
        "max_new_tokens": 1024,
        "top_k": 50,
        "temperature": 0.01,
        "repetition_penalty": 1.05,
    },
    huggingfacehub_api_token="<HUGGINGFACEHUB_API_TOKEN>",
    )



## Apply LangChain Chat templates via `ChatHuggingFace`

In [23]:
from huggingface_hub import login

login("<HUGGINGFACEHUB_API_TOKEN>")

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to C:\Users\dae\.cache\huggingface\token
Login successful


In [24]:
from langchain.schema import (
    HumanMessage,
    SystemMessage,
)
from langchain_community.chat_models.huggingface import ChatHuggingFace

messages = [
    SystemMessage(content="You are a goofy AI assistant who's as smart as she is quirky."),
    HumanMessage(content="'CHUGMA' TELL ME WHAT IS THAT!"),
]

chat_model = ChatHuggingFace(llm=llm)

                    repo_id was transferred to model_kwargs.
                    Please confirm that repo_id is what you intended.
                    task was transferred to model_kwargs.
                    Please confirm that task is what you intended.
                    huggingfacehub_api_token was transferred to model_kwargs.
                    Please confirm that huggingfacehub_api_token is what you intended.


In [25]:
# Let's see what the chat messages look like once they are formatted into a single prompt string for the LLM call.
chat_model._to_chat_prompt(messages)

"<|system|>\nYou are a goofy AI assistant who's as smart as she is quirky.</s>\n<|user|>\n'CHUGMA' TELL ME WHAT IS THAT!</s>\n<|assistant|>\n"

In [26]:
# Call the model
res = chat_model.invoke(messages)
print(res.content)

I'm not sure what you're asking me about specifically. Please provide more context so I can better understand your query.

If you're asking about the term "chugma," it's a Korean word that refers to a type of traditional Korean soup made with beef, vegetables, and rice cakes. The name "chugma" comes from the sound of the rice cakes (called "tteok") as they cook and expand in the soup, which sounds like "chug-chug."

I hope that helps clarify things for you! Let me know if you have any other questions.
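
The prompt string returned by `chat_model._to_chat_prompt(messages)` above follows a simple pattern: a role tag, the message content, an end-of-sequence marker, and finally an open assistant tag. The sketch below rebuilds that string by hand to show the layout. `render_zephyr_style` is a hypothetical helper written only from the output shown above; the real formatting comes from the chat template that ships with the model's tokenizer.

```python
from typing import List, Tuple


def render_zephyr_style(pairs: List[Tuple[str, str]]) -> str:
    """Render (role, content) pairs in the layout shown in the output above."""
    parts = [f"<|{role}|>\n{content}</s>\n" for role, content in pairs]
    # The trailing assistant tag cues the model to write its reply next.
    parts.append("<|assistant|>\n")
    return "".join(parts)


demo_pairs = [
    ("system", "You are a goofy AI assistant who's as smart as she is quirky."),
    ("user", "'CHUGMA' TELL ME WHAT IS THAT!"),
]
demo_prompt_text = render_zephyr_style(demo_pairs)
print(repr(demo_prompt_text))
```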


## Create an Agent

**What's an agent?** In the context of LLMs and LangChain, an agent is a software component that uses a large language model (LLM) to perform tasks and interact with users. It acts as an intermediary between the user and the LLM, deciding which actions to take and in what order. Agents can call external tools, use memory, and apply various strategies to reach their goals.
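
The "decide, act, observe" cycle described above can be sketched in a few lines of plain Python. This is a toy loop under stated assumptions, not LangChain's `AgentExecutor`: `run_agent`, `scripted_decide`, and `demo_tools` are hypothetical names, and the "model" is a scripted function rather than an LLM.

```python
from typing import Callable, Dict, List, Tuple

# A decision is either ("call", tool_name, tool_input) or ("finish", answer, "").
Decision = Tuple[str, str, str]


def run_agent(
    question: str,
    decide: Callable[[str, List[Tuple[str, str]]], Decision],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 5,
) -> str:
    """Ask `decide` for the next step until it finishes or the limit is hit."""
    steps: List[Tuple[str, str]] = []  # (tool_name, observation) history
    for _ in range(max_iterations):
        kind, first, second = decide(question, steps)
        if kind == "finish":
            return first
        # Run the chosen tool and record what it returned.
        observation = tools[first](second)
        steps.append((first, observation))
    return "Agent stopped due to iteration limit."


def scripted_decide(question: str, steps: List[Tuple[str, str]]) -> Decision:
    """Stand-in for the LLM: look something up once, then answer."""
    if not steps:
        return ("call", "lookup", question)
    return ("finish", f"Based on: {steps[-1][1]}", "")


demo_tools = {"lookup": lambda query: f"stub result for '{query}'"}
print(run_agent("What is LangChain?", scripted_decide, demo_tools))
```

If the decision function never returns a "finish" step, the loop gives up after `max_iterations` rounds instead of running forever.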


In [27]:
from langchain import hub
from langchain.agents import AgentExecutor, load_tools
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import (
    ReActJsonSingleInputOutputParser,
)
from langchain.tools.render import render_text_description
from langchain.utilities import SerpAPIWrapper

In [28]:
# Define the tools.
# `load_tools` looks up built-in tools by name and doesn't support custom tools,
# so we call it with an empty list of names...
tools = load_tools([], llm=llm)
# ...and then append our custom tools.
tools = tools + [arxiv_search, google_job_search, wiki_search]

In [29]:
# setup ReAct style prompt
prompt = hub.pull("hwchase17/react-json")
prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)

# define the agent
chat_model_with_stop = chat_model.bind(stop=["\nObservation"])
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | chat_model_with_stop
    | ReActJsonSingleInputOutputParser()
)

# instantiate AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
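
The last stage of the chain above, `ReActJsonSingleInputOutputParser`, has to turn the model's free text into either a tool call or a final answer. The sketch below approximates that idea with the standard library only. `parse_react_json` is a hypothetical function and is not the library's parser; it assumes the model wraps a JSON object with `action` and `action_input` keys in a fenced block, or writes `Final Answer:`.

```python
import json
import re
from typing import Tuple

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def parse_react_json(text: str) -> Tuple[str, str]:
    """Return (tool_name, tool_input) or ("final", answer) from model text."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match:
        # A fenced JSON blob means the model wants to call a tool.
        blob = json.loads(match.group(1))
        return blob["action"], str(blob["action_input"])
    if "Final Answer:" in text:
        return "final", text.split("Final Answer:", 1)[1].strip()
    raise ValueError("No action block or final answer found")


demo_model_text = (
    "Thought: I should look this up.\n"
    "Action:\n"
    f"{FENCE}\n"
    '{\n  "action": "wiki_search",\n  "action_input": "Leo DiCaprio girlfriend"\n}\n'
    f"{FENCE}"
)
print(parse_react_json(demo_model_text))
```

Binding `stop=["\nObservation"]` to the chat model fits this design: generation is cut off before the model can write its own observation, so the text handed to the parser ends at the action block and the observation comes from the tool instead.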

In [30]:
agent_executor.invoke(
    {
        "input": "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
    }
)



> Entering new AgentExecutor chain...
Question: Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?

Thought: To find out who Leo DiCaprio's current girlfriend is, we can search for recent news articles or celebrity gossip websites. However, since this is a hypothetical scenario, let's assume that Leo DiCaprio is currently single.

Action:
```
{
  "action": "wiki_search",
  "action_input": "Leo DiCaprio girlfriend"
}
```
Page: Leonardo DiCaprio
Summary: Leonardo Wilhelm DiCaprio (; Italian: [diˈkaːprjo]; born November 11, 1974) is an American actor and film producer. Known for his work in biographical and period films, he is the recipient of numerous accolades, including an Academy Award, a British Academy Film Award, and three Golden Globe Awards. As of 2019, his films have grossed over $7.2 billion worldwide, and he has been placed eight times in annual rankings of the world's highest-paid actors.
Born in Los 



  lis = BeautifulSoup(html).find_all('li')


Page: John Connolly (FBI)
Summary: John Joseph Connolly Jr. (born August 1, 1940) is an American former FBI agent who was convicted of racketeering, obstruction of justice, and murder charges stemming from his relationship with James "Whitey" Bulger, Steve Flemmi, and the Winter Hill Gang.
State and federal officers had been trying to imprison Bulger for years, but he evaded capture until 2011. As the FBI handler for Bulger and Flemmi, Connolly (who was neighbors with the Bulgers in the Old Harbor Housing Project) had been protecting them from prosecution by supplying Bulger with information about possible attempts to catch them. Connolly was indicted on December 22, 1999, on charges of alerting Bulger and Flemmi to investigations, falsifying FBI reports to cover their crimes, and accepting bribes.In 2000, Connolly was charged with additional racketeering-related offenses. He was convicted in 2002 and sentenced to ten years in federal prison. In 2008, Connolly was conv

{'input': "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?",
 'output': 'Agent stopped due to iteration limit or time limit.'}

# The Trace
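Wow. The agent picked a sensible tool for its first step, but then it went off track: a later observation is about an unrelated Wikipedia page, and the run ended at the iteration limit without a final answer.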

You can check out the chain's trace [here](https://smith.langchain.com/public/3f7cd76d-b279-4e6b-a6ec-0efe916ab4ed/r "AgentExecutor: Full LangSmith trace").