# Exploring the capabilities of LangChain's Agents

## The Search Agents

Trying out what we can achieve with search agents and RAG combined. Let's start by following the LangChain Agents [Quickstart](https://python.langchain.com/docs/modules/agents/quick_start/).

In [1]:
!pip install -U google-api-python-client
!pip install -U langchain-openai


In [1]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [10]:
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_core.tools import Tool
from langchain_openai import ChatOpenAI
from langchain.agents import load_tools, initialize_agent, AgentType, AgentExecutor, create_tool_calling_agent
from langchain import hub

# from langchain.prompts import ChatPromptTemplate

### Creating the Google search agent

- Using OpenAI's GPT-3.5 Turbo (`gpt-3.5-turbo`)

In [15]:
search = GoogleSearchAPIWrapper()

google_search = Tool(
    name="google_search",
    description="Search Google for recent results. Use this tool to get up to date information on any topic.",
    func=search.run,
)
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")

In [16]:
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")

In [17]:
agent = create_tool_calling_agent(llm, [google_search], prompt)

In [18]:
agent_executor = AgentExecutor(agent=agent, tools=[google_search], verbose=True)

In [38]:
result = agent_executor.invoke({"input": "What time is the next flight from Sofia to Amsterdam and which airlines?"})


> Entering new AgentExecutor chain...

Invoking: `google_search` with `next flight from Sofia to Amsterdam`


It's easy to use Google Flights to find the cheapest days to fly from Sofia to Amsterdam. Just click the Departure label near the top of the page to open the ... Find flights to Amsterdam from $70. Fly from Sofia on Air Serbia, Austrian Airlines, Bulgaria Air and more. Search for Amsterdam flights on KAYAK now to ... Flights from Amsterdam to Sofia. Use Google Flights to plan your next trip and find cheap one way or round trip flights from Amsterdam to Sofia. Find flights to Sofia from $117. Fly from Amsterdam on Air Serbia, Bulgaria Air, Norwegian and more. Search for Sofia flights on KAYAK now to find the best ... Flights between Sofia, Bulgaria and Amsterdam, Netherlands starting at £61. Choose between Wizz Air, easyJet, or KLM Royal Dutch Airlines to find the best ... Book one-way or return flights from Sofia to Amsterdam with no chan

In [39]:
print(result["output"])

The next flight from Sofia to Amsterdam is available on airlines such as Air Serbia, Austrian Airlines, and Bulgaria Air. You can find more details and book your flight on platforms like KAYAK.
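The trace above shows one pass through the agent loop: the model picks a tool, the tool runs, and the observation is fed back until the model produces a final answer. The following is a minimal pure-Python sketch of that loop, not LangChain's actual `AgentExecutor` implementation. `fake_llm`, `fake_search` and `run_agent` are hypothetical stand-ins introduced only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Scratchpad = List[Tuple[str, str]]  # (tool name, observation) pairs


def fake_llm(question: str, scratchpad: Scratchpad) -> Dict[str, str]:
    """Hypothetical stand-in for the chat model: search once, then answer."""
    if not scratchpad:
        return {"action": "google_search", "action_input": question}
    last_observation = scratchpad[-1][1]
    return {"action": "Final Answer", "action_input": f"Based on search: {last_observation}"}


def fake_search(query: str) -> str:
    """Hypothetical stand-in for GoogleSearchAPIWrapper().run."""
    return f"results for '{query}'"


def run_agent(
    question: str,
    llm: Callable[[str, Scratchpad], Dict[str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Ask the model for an action, run the tool, and repeat until it answers."""
    scratchpad: Scratchpad = []
    for _ in range(max_steps):
        decision = llm(question, scratchpad)
        if decision["action"] == "Final Answer":
            return decision["action_input"]
        # Run the chosen tool and remember what it returned.
        observation = tools[decision["action"]](decision["action_input"])
        scratchpad.append((decision["action"], observation))
    return "Agent stopped: step limit reached."


answer = run_agent(
    "next flight from Sofia to Amsterdam",
    fake_llm,
    {"google_search": fake_search},
)
print(answer)
```

The step limit matters in practice: a model that never emits a final answer would otherwise keep calling tools indefinitely.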


- Using an open LLM, in this case Mistral 7B

In [2]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

from langchain_community.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, pipeline
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_core.tools import Tool
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain.agents import initialize_agent, AgentType
import torch

# Local path to the downloaded Mistral-7B-Instruct-v0.2 weights
model_name = "../ext_models/Mistral-7B-Instruct-v0.2"

In [3]:
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)


In [3]:
generation_config = GenerationConfig.from_pretrained(model_name)
generation_config.max_new_tokens = 512
generation_config.temperature = 0.01
generation_config.top_p = 0.9
generation_config.do_sample = True
generation_config.repetition_penalty = 1.1
generation_config.pad_token_id = tokenizer.eos_token_id
open_llm = HuggingFacePipeline(pipeline=pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=True,
    generation_config=generation_config,
))

In [4]:
search = GoogleSearchAPIWrapper()

google_search = Tool(
    name="google_search",
    description="Search Google for recent results. Use this tool to get up to date information on any topic.",
    func=search.run,
)

In [17]:
agent = initialize_agent(
    [google_search],  # the search tool defined in the previous cell
    open_llm,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=True,
)



In [24]:
agent.run("<s> [INST] How to add custom prompt template to huggingface transformer pipeline? [/INST]")


> Entering new AgentExecutor chain...
Thought: I should search Google to find information on how to add a custom prompt template to the Hugging Face Transformer pipeline.
Action:
```
{
  "action": "google_search",
  "action_input": "How to add custom prompt template to Hugging Face Transformer pipeline"
}
```
Observation: Oct 18, 2023 ... ... Hugging Face with the transformers.pipeline ... prompt to instruct the model to use ... prompt format into the code, and change some parameters in ... Hugging Face's logo Hugging Face ... How to add a pipeline to Transformers? Testing ... examples of the custom chat prompt template also make use of this format. Nov 14, 2023 ... We are going be using Hugging Face to load our quantized Mistral-7B model. ... text_generation_pipeline = transformers.pipeline( ... # Create prompt ... Hugging Face's logo Hugging Face ... examples Community resources Custom Tools and Prompts ... pipeline, let's explore how you can u

'I could not find a specific and actionable answer on how to add a custom prompt template to the Hugging Face Transformer pipeline.'

> So it seems that the open LLM does not search correctly for some reason. We need to research further.
- After some additional research, it seems we need to apply the Mistral prompt template for it to work properly.
  - A quick hack is to include the template markers directly in the prompt.
- The `pipeline` wrapper does not let us customise the prompt.
- We can extend `LLM` by creating a `CustomLLM` class, as described [here](https://medium.com/@jorgepardoserrano/building-a-langchain-agent-with-a-self-hosted-mistral-7b-a-step-by-step-guide-85eda2fbf6c2).
- Or we could try creating the agent via `create_tool_calling_agent(llm, tools, prompt)`, which takes a `prompt` argument.
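To make the "quick hack" concrete, here is a minimal sketch of wrapping a single user message in the `[INST] ... [/INST]` markers that Mistral's instruct models expect. The helper name `to_mistral_instruct` and its `add_bos` flag are hypothetical. Whether the literal `<s>` should be included depends on whether the tokenizer already prepends a beginning-of-sequence token, so treat that part as an assumption to verify against your setup.

```python
def to_mistral_instruct(user_message: str, add_bos: bool = True) -> str:
    """Wrap one user turn in Mistral-Instruct style markers.

    add_bos controls whether the literal "<s>" token is prepended; set it to
    False if the tokenizer adds the beginning-of-sequence token on its own.
    """
    prefix = "<s>" if add_bos else ""
    return f"{prefix}[INST] {user_message.strip()} [/INST]"


print(to_mistral_instruct("How to add custom prompt template to huggingface transformer pipeline?"))
```

This only covers a single-turn prompt. For multi-turn conversations, the tokenizer's `apply_chat_template` method is the safer route because it builds the markers from the model's own template.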

In [5]:
!pip install langchainhub
!pip install -U langchain
!pip install numexpr
!pip install -U wikipedia


In [7]:
from langchain.agents import create_react_agent, create_json_chat_agent, load_tools
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.language_models.llms import LLM
from transformers.models.mistral.modeling_mistral import MistralForCausalLM
from transformers.models.llama.tokenization_llama_fast import LlamaTokenizerFast

In [11]:
open_llm_template = """<s>[INST]You are designed to solve tasks using tools.

These are the tools you can use: {tool_names}.

These are the tools descriptions:

{tools}

If you have enough information to answer the query, use the tool "Final Answer". Its parameter is the solution.
If there is not enough information, keep trying.

This is my query="{input}".
{agent_scratchpad}
[/INST]
"""
prompt = PromptTemplate(
    template=open_llm_template,
    input_variables=["input", "tools", "tool_names", "agent_scratchpad"],
)
# from langchain import hub
# prompt = hub.pull("hwchase17/react-chat")

#### Let's try to define a CustomLLM

> Note: Using the pipeline as the LLM is not sufficient to work with the custom prompts below.
> The class is taken from this [notebook](https://colab.research.google.com/drive/1e5gJaUtGVvzJP_Nr-sBEL4J2JubC5YrE), which accompanies the CustomLLM tutorial linked above.

In [8]:
class CustomLLMMistral(LLM):
    model: MistralForCausalLM
    tokenizer: LlamaTokenizerFast

    @property
    def _llm_type(self) -> str:
        return "custom"

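    def _call(self, prompt: str, stop=None, run_manager=None) -> str:
        # Send the prompt as a single user turn so the tokenizer applies the Mistral chat template
        messages = [
            {"role": "user", "content": prompt},
        ]

        encodeds = self.tokenizer.apply_chat_template(messages, return_tensors="pt")
        model_inputs = encodeds.to(self.model.device)

        generated_ids = self.model.generate(model_inputs,
                                            max_new_tokens=512,
                                            do_sample=True,
                                            # use the instance's tokenizer, not the notebook global
                                            pad_token_id=self.tokenizer.eos_token_id,
                                            top_k=3, temperature=0.01, top_p=0.9)
        decoded = self.tokenizer.batch_decode(generated_ids)

        # Keep only the text generated after the last instruction marker
        output = decoded[0].split("[/INST]")[-1].replace("</s>", "").strip()

        # Truncate at the first stop sequence, if any were given
        if stop is not None:
            for word in stop:
                output = output.split(word)[0].strip()

        # Close an unterminated code fence so the JSON block can be parsed
        while not output.endswith("```"):
            output += "`"

        return output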

    @property
    def _identifying_params(self):
        return {"model": self.model}
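The post-processing inside `_call` is easy to get wrong, so it helps to test it without loading the model. The sketch below isolates those steps in a plain function. `clean_generation` and `FENCE` are hypothetical names, and the function keeps whatever follows the final `[/INST]` marker, which is an assumption about how the decoded text is laid out.

```python
from typing import List, Optional

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def clean_generation(decoded: str, stop: Optional[List[str]] = None) -> str:
    """Strip the prompt, apply stop sequences, and close a dangling code fence."""
    # Keep only the text after the last instruction marker and drop the EOS token.
    output = decoded.split("[/INST]")[-1].replace("</s>", "").strip()

    # Cut the text at the first occurrence of each stop sequence.
    for word in stop or []:
        output = output.split(word)[0].strip()

    # Append backticks until the text ends with a full fence.
    while not output.endswith(FENCE):
        output += "`"
    return output


raw = (
    "<s> [INST] hi [/INST] "
    + FENCE
    + 'json\n{"action": "Final Answer", "action_input": "hello"}\n'
    + FENCE
    + "</s>"
)
print(clean_generation(raw))
```

Note that the final loop also appends a fence to plain-text replies that contain no code block at all. That suits the JSON agent prompt used here, but it may not be what you want elsewhere.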

In [9]:
open_llm = CustomLLMMistral(model=model, tokenizer=tokenizer)

In [10]:
# Additional tools
tools = load_tools(["llm-math", "wikipedia"], open_llm)

In [12]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

system = '''Assistant is a large language model.

Assistant is designed to be able to assist with a wide range of tasks, from answering
simple questions to providing in-depth explanations and discussions on a wide range of
topics. As a language model, Assistant is able to generate human-like text based on
the input it receives, allowing it to engage in natural-sounding conversations and
provide responses that are coherent and relevant to the topic at hand.
'''

human = '''TOOLS
------
Assistant can ask the user to use tools to look up information that may be helpful in answering the users original question.
The tools the human can use are:

{tools}

RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please output a response in one of two formats:

**Option 1:**
Use this if you want the human to use a tool.
Markdown code snippet formatted in the following schema:

```json
{{
    "action": string, \\ The action to take. Must be one of {tool_names}
    "action_input": string \\ The input to the action
}}
```

**Option #2:**
Use this if you want to respond directly to the human. Markdown code snippet formatted in the following schema:

```json
{{
    "action": "Final Answer",
    "action_input": string \\ You should put what you want to return to use here
}}
```

USER'S INPUT
--------------------
Here is the user's input (remember to respond with a markdown code snippet of a json blob with a single action, and NOTHING else):

User: {input}'''

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        MessagesPlaceholder("chat_history", optional=True),
        ("human", human),
        MessagesPlaceholder("agent_scratchpad")
    ]
)
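The prompt asks the model to reply with a fenced JSON blob containing `action` and `action_input`. LangChain ships its own output parser for this agent type. The sketch below is not that parser, only a simplified illustration of what extracting the blob involves. `parse_action` and `FENCE` are hypothetical names.

```python
import json
import re
from typing import Tuple

FENCE = "`" * 3  # three backticks

# Match a fenced block, optionally tagged "json", and capture the JSON object inside.
_BLOCK = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def parse_action(text: str) -> Tuple[str, str]:
    """Return (action, action_input) from a fenced JSON blob in the model reply."""
    match = _BLOCK.search(text)
    if match is None:
        raise ValueError("no fenced JSON block found in model output")
    blob = json.loads(match.group(1))
    return blob["action"], blob["action_input"]


reply = (
    FENCE
    + 'json\n{\n    "action": "wikipedia",\n    "action_input": "Particle physics"\n}\n'
    + FENCE
)
print(parse_action(reply))
```

If the model forgets the closing fence or emits invalid JSON, a parser like this raises an error, which is the kind of failure that `handle_parsing_errors=True` on the executor is meant to absorb.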

In [13]:
agent = create_json_chat_agent(
    tools = tools + [google_search],
    llm = open_llm,
    prompt = prompt,
)

In [14]:
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools + [google_search], verbose=True, handle_parsing_errors=True)

In [20]:
import langchain
langchain.debug = False
# agent_executor.invoke({"input": "Calculate the hypotenuse of an isosceles right angle triangle if the side is 2"})
agent_executor.invoke({"input": "What is particle physics? Is it the same as quantum physics?"})


> Entering new AgentExecutor chain...
```json
{
    "action": "wikipedia",
    "action_input": "Particle physics, also known as high-energy physics, is a branch of physics that studies the nature of particles that are the constituents of what is often called matter and radiation. It is the study of particles that are the composites of quarks and leptons, as well as the force carriers and fields. Is it the same as quantum physics? No, particle physics is a subfield of quantum physics. Quantum physics, also known as quantum mechanics, is a fundamental theory in physics which describes nature at the smallest scales of energy levels of atoms and subatomic particles."
}
```
Page: Particle physics
Summary: Particle physics or high-energy physics is the study of fundamental particles and forces that constitute matter and radiation. The field also studies combinations of elementary particles up to the scale of protons and neutrons, while the study of comb

{'input': 'What is particle physics? Is it the same as quantum physics?',
 'output': 'Particle physics and quantum physics are related fields in physics. Particle physics is the study of particles that make up matter and radiation, while quantum physics is a fundamental theory that describes nature at the smallest scales. The Higgs boson is an elementary particle in the Standard Model of particle physics that gives mass to other particles. It was proposed by Peter Higgs and other scientists in the 1960s.'}