## <b><font color='darkblue'>Preface</font></b>
([article source](https://towardsdatascience.com/building-a-math-application-with-langchain-agents-23919d09a4d3)) <b><font size='3ptx'>A tutorial on why LLMs struggle with math, and how to resolve these limitations using LangChain Agents, OpenAI and Chainlit</font></b>

<b>In this tutorial, I will demonstrate how to use [LangChain](https://www.langchain.com/) agents to create a custom Math application utilising OpenAI’s GPT3.5 model</b>. For the application frontend, I will be using [**Chainlit**](https://chainlit.io/), an easy-to-use open-source Python framework. This generative math application, <b>let’s call it “Math Wiz”, is designed to help users with their math or reasoning/logic questions</b>.
![Wiz diagram](images/fig1.PNG)

## <b><font color='darkblue'>Why do LLMs struggle with Math?</font></b>
<b><font size='3ptx'>Large Language Models (LLMs) are known to be quite bad at Math, as well as reasoning tasks, and this is a common trait among many language models.</font></b>

There are a few different reasons for this:
* **Lack of training data:** One reason is the limitation of their training data. Language models, trained on vast text datasets, may lack sufficient mathematical problems and solutions. This can lead to misinterpretations of numbers, forgetting important calculation steps, and a lack of quantitative reasoning skills.
* **Lack of numeric representations**: Another reason is that LLMs are designed to understand and generate text, operating on tokens instead of numeric values. Most text-based tasks can have multiple reasonable answers. However, math problems typically have only one correct solution.
* **Generative nature**: Due to the generative nature of these language models, generating consistently accurate and precise answers to math questions can be challenging for LLMs.
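
To make the token point concrete, here is a toy sketch. The `toy_tokenize` function is hypothetical and does not reproduce any real tokenizer; it only assumes that a long digit string reaches the model as several short text chunks rather than as one quantity.

```python
def toy_tokenize(number_text: str, chunk_size: int = 3) -> list[str]:
    """Hypothetical stand-in for a tokenizer: cut a digit string into chunks."""
    return [number_text[i:i + chunk_size]
            for i in range(0, len(number_text), chunk_size)]


tokens = toy_tokenize("1234567")
print(tokens)              # ['123', '456', '7'] -> pieces of text, not a number
print(int("1234567") + 1)  # 1234568 -> ordinary code works on the value itself
```

The model has to predict the next chunk of text, whereas a calculator works on the numeric value directly.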

This makes the “Math problem” the perfect candidate for utilising LangChain agents. Agents are systems that use a language model to interact with other tools to break down a complex problem (more on this later). The code for this tutorial is available on my [**GitHub**](https://github.com/tahreemrasul/math_app_langchain).

## <b><font color='darkblue'>Application Flow</font></b>
<b><font size='3ptx'>The application flow for Math Wiz is outlined in the flowchart below. The agent in our pipeline will have a set of tools at its disposal that it can use to answer a user query. </font></b>

<b>The Large Language Model (LLM) serves as the “brain” of the agent, guiding its decisions</b>. When a user submits a question, the agent uses the LLM to select the most appropriate tool or a combination of tools to provide an answer. If the agent determines it needs multiple tools, it will also specify the order in which the tools are used.

![App flow](images/fig2.PNG)

The agent for our Math Wiz app will be using the following tools:
* **Wikipedia Tool**: this tool will be responsible for fetching the latest information from Wikipedia using the Wikipedia API. While there are paid tools and APIs available that can be integrated inside LangChain, I will be using Wikipedia as the app’s online source of information.
* **Calculator Tool**: this tool will be responsible for solving a user’s math queries. This includes anything involving numerical calculations. For example, if a user asks what the square root of 4 is, this tool would be appropriate.
* **Reasoning Tool**: the final tool in our application setup will be a reasoning tool, responsible for tackling logical/reasoning-based user queries. Any mathematical word problems should also be handled with this tool.

Now that we have a rough application design, we can begin thinking about building this application.

### <b><font color='darkgreen'>Understanding LangChain Agents</font></b>
<b><font size='3ptx'>LangChain agents enhance the interaction with language models by providing an interface for more complex and interactive tasks.</font></b>

<b>We can think of an agent as an intermediary between users and a large language model</b>. Agents seek to break down a seemingly complex user query, one that our LLM might not be able to tackle on its own, into easier, actionable steps.

In our application flow, we defined a few different tools that we would like to use for our math application. <b>Based on the user input, the agent should decide which of these tools to use. If a tool is not required, it should not be used. LangChain agents can simplify this for us</b>. These agents use a language model to choose a sequence of actions to take.

<b>Essentially, the LLM acts as the “brain” of the agent, guiding it on which tool to use for a particular query, and in which order.</b> This is different from LangChain chains, where the sequence of actions is hardcoded. LangChain offers a wide set of tools that can be integrated with an agent. These tools include, but are not limited to, online search tools, API-based tools and chain-based tools.
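
To illustrate the idea, here is a minimal sketch of such a decide-act-observe loop in plain Python. It is not LangChain's implementation: `run_agent`, `scripted_llm` and the one-entry tool table are hypothetical, and the "LLM" is a scripted stub so the example runs without an API key.

```python
import re
from typing import Callable

ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")
FINAL_RE = re.compile(r"Final Answer:\s*(.+)")


def run_agent(llm: Callable[[str], str],
              tools: dict[str, Callable[[str], str]],
              question: str,
              max_steps: int = 5) -> str:
    """Toy agent loop: ask the LLM, run the tool it names, feed the result back."""
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(scratchpad)
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            raise ValueError(f"Could not parse LLM reply: {reply!r}")
        tool_name, tool_input = action.group(1).strip(), action.group(2).strip()
        observation = tools[tool_name](tool_input)
        scratchpad += f"{reply}\nObservation: {observation}\nThought:"
    raise RuntimeError("No final answer within max_steps")


def scripted_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: the reply depends on the scratchpad."""
    if "Observation:" not in prompt:
        return "I should use a calculator.\nAction: Calculator\nAction Input: 2 + 2"
    return "I now know the final answer.\nFinal Answer: 4"


tools = {"Calculator": lambda expr: "4" if expr == "2 + 2" else "unsupported"}
answer = run_agent(scripted_llm, tools, "What is 2 + 2?")
print(answer)  # 4
```

In a chain, the call to the calculator would be written directly in code; here the tool name comes out of the model's reply, which is the difference the paragraph above describes.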

For more information on LangChain agents and their types, see the [LangChain agents documentation](https://python.langchain.com/docs/modules/agents/).

### <b><font color='darkgreen'>Step-by-Step Implementation</font></b>

#### <b>Step 1</b>
Create a <font color='olive'>chatbot.py</font> script and import the necessary dependencies:
```python
from langchain_openai import OpenAI
from langchain.chains import LLMMathChain, LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.utilities import WikipediaAPIWrapper
from langchain.agents.agent_types import AgentType
from langchain.agents import Tool, initialize_agent
from dotenv import load_dotenv


load_dotenv()
```

#### <b>Step 2</b>
Next, we will define our OpenAI-based language model. LangChain offers the `langchain-openai` package, which can be used to define an instance of the OpenAI model. We will be using the `gpt-3.5-turbo-instruct` model from OpenAI. The [**dotenv**](https://pypi.org/project/python-dotenv/) package has already loaded the API key into the environment, so you do not need to pass it explicitly here:
```python
llm = OpenAI(model='gpt-3.5-turbo-instruct',
             temperature=0)
```

We will use this LLM both within our math and reasoning chains and as the decision maker for our agent.
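
Because the key is picked up from the environment, a missing or empty value tends to surface only later as an authentication error. A small optional guard can fail earlier with a clearer message. The `require_env` helper below is hypothetical and not part of LangChain or python-dotenv; the demonstration uses a plain dict in place of `os.environ`.

```python
import os


def require_env(name: str, env=os.environ) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value


# Demonstration with a dict standing in for the real environment.
fake_env = {"OPENAI_API_KEY": "sk-placeholder"}
print(require_env("OPENAI_API_KEY", fake_env)[:3])  # sk-
```

In `chatbot.py`, one would call `require_env("OPENAI_API_KEY")` right after `load_dotenv()`.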

#### <b>Step 3</b>
<b>When constructing your own agent, you will need to provide it with a list of tools that it can use</b>. Besides the actual function that is called, a Tool takes a few other parameters:
* **name (str)**: required, and must be unique within the set of tools provided to an agent.
* **description (str)**: optional but recommended, as the agent uses it to decide when to use the tool.
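
To see why both fields matter, consider how an agent prompt might list the available tools. The sketch below is an illustration only: `SimpleTool` and `render_tools` are hypothetical stand-ins, not LangChain classes, and the exact prompt wording LangChain uses is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical, stripped-down stand-in for a LangChain Tool."""
    name: str
    func: Callable[[str], str]
    description: str = ""


def render_tools(tools: list[SimpleTool]) -> str:
    """Build a tool listing of the kind that is placed in an agent's prompt."""
    names = [tool.name for tool in tools]
    if len(set(names)) != len(names):
        raise ValueError("Tool names must be unique")
    return "\n".join(f"{tool.name}: {tool.description}" for tool in tools)


demo_tools = [
    SimpleTool("Calculator", lambda q: "...", "Useful for math expressions."),
    SimpleTool("Wikipedia", lambda q: "...", "Useful for general facts."),
]
listing = render_tools(demo_tools)
print(listing)
```

The model only sees this text, so a vague or missing description leaves it guessing which tool fits a query, and duplicate names would make its choice ambiguous.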

We will now create our three tools. The first one will be the online tool using the Wikipedia API wrapper:
```python
# Wrap the Wikipedia API so the agent can call it as a tool.
wikipedia = WikipediaAPIWrapper()
wikipedia_tool = Tool(
    name="Wikipedia",
    func=wikipedia.run,
    description=(
        "A useful tool for searching the Internet to find information on "
        "world events, issues, dates, years, etc. Worth using for general "
        "topics. Use precise questions."
    ),
)
```

In the code above, we have defined an instance of the Wikipedia API wrapper. Afterwards, we have wrapped it inside a LangChain <font color='blue'><b>Tool</b></font>, with the name, function and description.

<b>Next, let’s define the tool that we will be using for calculating any numerical expressions</b>. LangChain offers the [**LLMMathChain**](https://api.python.langchain.com/en/latest/chains/langchain.chains.llm_math.base.LLMMathChain.html#langchain.chains.llm_math.base.LLMMathChain) which uses the [**numexpr**](https://pypi.org/project/numexpr/) Python library to calculate mathematical expressions. It is also important that we clearly define what this tool would be used for. The `description` can be helpful for the agent in deciding which tool to use from a set of tools for a particular user query. For our chain-based tools, we will be using the <font color='blue'><b>Tool</b>.from_function()</font> method:
```python
# LLMMathChain turns a math question into an expression and evaluates it.
problem_chain = LLMMathChain.from_llm(llm=llm)
math_tool = Tool.from_function(
    name="Calculator",
    func=problem_chain.run,
    description=(
        "Useful for when you need to answer questions about math. This tool "
        "is only for math questions and nothing else. Only input math "
        "expressions."
    ),
)
```
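
To give a feel for what a calculator tool does once it has a clean expression, here is a small stand-in written with Python's `ast` module. This is not how `LLMMathChain` or numexpr are implemented; `safe_eval` is a hypothetical helper that evaluates plain arithmetic deterministically. It expects Python syntax, so a cube root is written `625 ** (1 / 3)` rather than `625^(1/3)`.

```python
import ast
import operator

# Arithmetic operators this toy evaluator accepts.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def safe_eval(expression: str) -> float:
    """Evaluate a purely numeric expression without calling eval()."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("Unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


print(safe_eval("625 ** (1 / 3)"))
print(safe_eval("(81 ** (1 / 3)) * 13.27 - 5"))
```

Anything that is not plain arithmetic, such as a function call or free text, is rejected with a `ValueError`, which mirrors the instruction in the tool description to only input math expressions.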

<b>Finally, we will define the tool for logic/reasoning-based queries</b>. We will first create a prompt that instructs the model to execute the specific task. Then we will create a simple LLMChain for this tool, passing it the LLM and the prompt:
```python
word_problem_template = """You are a reasoning agent tasked with solving 
the user's logic-based questions. Logically arrive at the solution, and be 
factual. In your answers, clearly detail the steps involved and give the 
final answer. Provide the response in bullet points. 
Question  {question} Answer"""

math_assistant_prompt = PromptTemplate(input_variables=["question"],
                                       template=word_problem_template
                                       )
word_problem_chain = LLMChain(llm=llm,
                              prompt=math_assistant_prompt)
word_problem_tool = Tool.from_function(
    name="Reasoning Tool",
    func=word_problem_chain.run,
    description="Useful for when you need to answer logic-based/reasoning questions.",
)
```
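
Before the chain calls the model, the `{question}` placeholder in the template is replaced with the user's question. As a rough plain-Python analogy (an assumption for illustration, not LangChain's code), `str.format` performs the same kind of substitution. The `render_prompt` helper and the short demo template are hypothetical.

```python
def render_prompt(template: str, **variables: str) -> str:
    """Fill a template's {placeholders}; fail loudly if one is missing."""
    try:
        return template.format(**variables)
    except KeyError as missing:
        raise ValueError(f"Missing prompt variable: {missing}") from None


demo_template = "Solve step by step.\nQuestion: {question}\nAnswer:"
prompt_text = render_prompt(demo_template, question="What is 2 + 2?")
print(prompt_text)
```

Printing the rendered prompt like this is a quick way to check what the model will actually receive.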

#### <b>Step 4</b>
We will now initialize our agent with the tools we have created above. We will also specify the LLM to help it choose which tools to use and in what order:
```python
agent = initialize_agent(
    tools=[wikipedia_tool, math_tool, word_problem_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=False,
    handle_parsing_errors=True
)

question = (
    "I have 3 apples and 4 oranges. I give half of my oranges away and buy "
    "two dozen new ones, along with three packs of strawberries. Each pack "
    "of strawberry has 30 strawberries. How many total pieces of fruit do I "
    "have at the end?"
)
print(agent.invoke({"input": question}))
```

In [1]:
#!python3 chatbot.py

### <b><font color='darkgreen'>Creating Chainlit Application</font></b>
<b><font size='3ptx'>We will be using [Chainlit](https://chainlit.io/), an open-source Python framework, to build our application. </font></b>

With Chainlit, you can build conversational AI applications with a few simple lines of code. To get a deeper understanding of Chainlit functionalities and how the app is set up, you can take a look at my [**article here**](https://medium.com/@tahreemrasul/building-a-chatbot-application-with-chainlit-and-langchain-3e86da0099a6?source=post_page-----23919d09a4d3--------------------------------).

## <b><font color='darkblue'>Testing and Validation</font></b>
<b><font size='3ptx'>Let’s now validate the performance of our bot.</font></b>

We have not integrated any memory into our bot, so each query is handled as its own independent call. Let’s ask our app a few math questions. For comparison, I am attaching screenshots of each response for the same query from both ChatGPT 3.5 and our Math Wiz app.
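
The cells below call `chatbot.input_quesiton`, which is not shown in the steps above. Judging only from how it is called (a plain question goes to the LLM, a question prefixed with `A:` goes to Math Wiz), such a helper might look roughly like the sketch below. `make_router` and the stub callables are hypothetical and stand in for the real LLM and agent.

```python
from typing import Callable

AGENT_PREFIX = "A:"  # assumed marker, inferred from how the cells below call the helper


def make_router(ask_llm: Callable[[str], str],
                ask_agent: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that sends 'A:'-prefixed questions to the agent."""
    def route(question: str) -> str:
        if question.startswith(AGENT_PREFIX):
            return ask_agent(question[len(AGENT_PREFIX):].strip())
        return ask_llm(question)
    return route


# Stub callables stand in for the real LLM and agent.
route = make_router(lambda q: f"LLM saw: {q}", lambda q: f"Agent saw: {q}")
print(route("What is 2 + 2?"))
print(route("A:What is 2 + 2?"))
```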

In [1]:
import chatbot
import os
import openai
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.environ['OPENAI_API_KEY']



### <b><font color='darkgreen'>Arithmetic Questions</font></b>

#### <b>Question 1</b>
```
Question: What is the cube root of 625?

Response:
The cube root of 625 is 8.549879733383484.
```

![fig3](images/fig3.PNG)

In [2]:
question = "What is the cube root of 625?"

In [3]:
# By purely LLM
chatbot.input_quesiton(question)

'The cube root of 625 is 5.'

In [4]:
# By Math Wiz
chatbot.input_quesiton("A:" + question)



> Entering new AgentExecutor chain...
I should use a calculator to solve this math problem.
Action: Calculator
Action Input: 625^(1/3)
Observation: Answer: 8.549879733383484
Thought: I now know the final answer.
Final Answer: The cube root of 625 is 8.549879733383484.

> Finished chain.


'The cube root of 625 is 8.549879733383484.'

<b>Our Math Wiz app was able to correctly answer this question, whereas ChatGPT’s response is incorrect</b>. It not only complicated the reasoning steps unnecessarily but also failed to reach the correct answer. On a separate occasion, ChatGPT did answer this question correctly, which shows how unreliable its answers to such questions are.
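
One plausible explanation for the wrong answer of 5 (an assumption, since the model's reasoning is not visible) is a mix-up between powers: 625 is 5 to the fourth power, not 5 cubed. A quick check in plain Python:

```python
# 5 cubed is 125, while 5 to the fourth power is 625.
print(5 ** 3, 5 ** 4)  # 125 625

cube_root = 625 ** (1 / 3)
print(cube_root)

# Cubing the result should land back on 625, up to floating-point error.
assert abs(cube_root ** 3 - 625) < 1e-9
```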

#### <b>Question 2</b>
```
what is cube root of 81? Multiply with 13.27, and subtract 5.

# correct answer ≈ 52.4160
```

In [5]:
math_question2 = "what is cube root of 81? Multiply with 13.27, and subtract 5."

In [6]:
# By purely LLM
print(chatbot.input_quesiton(math_question2))

The cube root of 81 is 4.

4 * 13.27 = 53.08

53.08 - 5 = 48.08

Therefore, the result of multiplying the cube root of 81 by 13.27 and subtracting 5 is 48.08.


In [7]:
# By Math Wiz
print(chatbot.input_quesiton("A:" + math_question2))



> Entering new AgentExecutor chain...
I should use the calculator tool to solve this math problem.
Action: Calculator
Action Input: (81^(1/3))*13.27-5
Observation: Answer: 52.415955393937914
Thought: I now know the final answer.
Final Answer: 52.415955393937914

> Finished chain.
52.415955393937914


Our Math Wiz app was able to correctly answer this question too. However, once again, ChatGPT’s response isn’t correct. Occasionally, ChatGPT can answer math questions correctly, but this is subject to prompt engineering and multiple inputs.
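
The plain LLM's first step can be checked directly: it took the cube root of 81 to be 4, but 4 cubed is 64. The snippet below redoes the calculation in plain Python as a sanity check.

```python
llm_cube_root = 4          # value the plain LLM assumed
print(llm_cube_root ** 3)  # 64, so 4 is the cube root of 64, not of 81

cube_root_81 = 81 ** (1 / 3)
result = cube_root_81 * 13.27 - 5
print(round(result, 4))    # 52.416
```

Because the error happens in the first step, every later step in the LLM's answer inherits it.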

### <b><font color='darkgreen'>Reasoning Questions</font></b>
Let’s ask our app a few reasoning/logic questions. Some of these questions have an arithmetic component to them. I’d expect the agent to decide which tool to use in each case.

#### <b>Question 1</b>
```
I have 3 apples and 4 oranges. I give half of my oranges away and buy two 
dozen new ones, along with three packs of strawberries. Each pack of 
strawberry has 30 strawberries. How many total pieces of fruit do I have at 
the end?

# correct answer = 3 + 2 + 24 + 90 = 119
```

In [13]:
reason_question1 = '''I have 3 apples and 4 oranges. I give half of my oranges away and buy two 
dozen new ones, along with three packs of strawberries.
Each pack of strawberry has 30 strawberries.
How many total pieces of fruit do I have at  the end?'''

In [9]:
# By purely LLM
print(chatbot.input_quesiton(reason_question1))

After giving away half of my oranges, I have 2 oranges left. 

Then I buy two dozen new oranges, which is 24 oranges. 

So, I now have a total of 2 + 24 = 26 oranges. 

I also have 3 packs of strawberries, with each pack having 30 strawberries. 

So, I have 3 * 30 = 90 strawberries. 

Adding the apples, oranges, and strawberries together, I have a total of 3 (apples) + 26 (oranges) + 90 (strawberries) = 119 pieces of fruit. 

Therefore, I have 119 pieces of fruit at the end.


In [14]:
# By Math Wiz
print(chatbot.input_quesiton("A:" + reason_question1))



> Entering new AgentExecutor chain...
I need to figure out how many oranges I have left after giving half away and how many strawberries I have in total.
Action: Calculator
Action Input: 4/2 + (3*24) + (3*30)
Observation: Answer: 164.0
Thought: I now know how many total pieces of fruit I have.
Final Answer: 164 pieces of fruit.

> Finished chain.
164 pieces of fruit.
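
Note that the agent's answer of 164 does not match the expected 119. Comparing the expression it sent to the Calculator with the intended calculation shows where it went wrong: the 3 apples are left out, and "two dozen" is entered as `3*24` instead of 24.

```python
apples = 3
oranges_left = 4 // 2   # half of the 4 oranges are given away
new_oranges = 2 * 12    # two dozen new oranges
strawberries = 3 * 30   # three packs of 30 strawberries

correct_total = apples + oranges_left + new_oranges + strawberries
agent_total = 4 / 2 + (3 * 24) + (3 * 30)  # the expression the agent sent

print(correct_total)  # 119
print(agent_total)    # 164.0
```

The Calculator evaluated the expression correctly; the mistake was made when the LLM translated the word problem into an expression.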


#### <b>Question 2</b>
```
Steve's sister is 10 years older than him. Steve was born when the cold war 
ended. When was Steve's sister born?

# correct answer = 1991 - 10 = 1981
```

In [15]:
reason_question2 = "Steve's sister is 10 years older than him. Steve was born when the cold war ended. When was Steve's sister born?"

In [16]:
# By purely LLM
chatbot.input_quesiton(reason_question2)

"Steve was born in 1991 when the Cold War ended. Therefore, Steve's sister was born in 1981, which is 10 years before Steve's birth."

In [17]:
# By Math Wiz
chatbot.input_quesiton("A:" + reason_question2)



> Entering new AgentExecutor chain...
I need to find out the year the cold war ended and then calculate 10 years before that.
Action: Calculator
Action Input: 2021 - 10
Observation: Answer: 2011
Thought: Now I know the year Steve's sister was born.
Final Answer: Steve's sister was born in 2011.

> Finished chain.


"Steve's sister was born in 2011."

#### <b>Question 3</b>
```
give me the year when Tom Cruise's Top Gun released raised to the power 2

# correct answer = 1986**2 = 3944196
```

In [2]:
reason_question3 = "give me the year when Tom Cruise's Top Gun released raised to the power 2"

In [3]:
# By purely LLM
chatbot.input_quesiton(reason_question3)

'1986^2 = 3,944'

In [4]:
# By Math Wiz
chatbot.input_quesiton("A:" + reason_question3)



> Entering new AgentExecutor chain...
I need to find the year when Top Gun was released and then raise it to the power of 2
Action: Wikipedia
Action Input: "Top Gun (film)"
Observation: Page: Top Gun
Summary: Top Gun is a 1986 American action drama film directed by Tony Scott and produced by Don Simpson and Jerry Bruckheimer, with distribution by Paramount Pictures. The screenplay was written by Jim Cash and Jack Epps Jr., and was inspired by an article titled "Top Guns", written by Ehud Yonay and published in California magazine three years earlier. It stars Tom Cruise as Lieutenant Pete "Maverick" Mitchell, a young naval aviator aboard the aircraft carrier USS Enterprise. He and his radar intercept officer, Lieutenant (junior grade) Nick "Goose" Bradshaw (Anthony Edwards), are given the chance to train at the United States Navy's Fighter Weapons School (Top Gun) at Naval Air Station Miramar in San Diego, California. Kelly McGillis, Val Kilmer

'3944196'

Our Math Wiz app was able to correctly answer this question. ChatGPT’s response is once again incorrect. Even though it was able to correctly figure out the release date of the film, the final calculation was incorrect.

## <font color='darkblue'><b>Conclusion & Next Steps</b></font>
<font size='3ptx'><b>In this tutorial, we used LangChain agents and tools to create a math solver that could also tackle a user’s reasoning/logic questions</b></font>. In the runs shown above, our `Math Wiz` app answered both arithmetic questions and the Top Gun question correctly, where the plain LLM did not. On the first two reasoning questions, however, the agent sent a wrong expression to the Calculator and returned 164 and 2011 instead of 119 and 1981, while the plain LLM answered both correctly. This is a great first step in building the tool, but the results depend on the agent choosing the right tool and the right input. The **LLMMathChain** can also fail if the input we provide contains string-based text. This can be tackled in a few different ways, such as by creating error handling utilities for your code, adding post-processing logic for the **LLMMathChain**, or using custom prompts. You could also increase the efficacy of the tool by including a search tool for more sophisticated and accurate results, since Wikipedia might not always have up-to-date information. You can find the code from this tutorial on my [**GitHub**](https://github.com/tahreemrasul/math_app_langchain).

## <b><font color='darkblue'>Supplement</font></b>
* [LangChain doc - Tool Calling with LangChain](https://blog.langchain.dev/tool-calling-with-langchain/)