# Function Calling in LLM Models

## Introduction
LLM models excel at various NLP tasks such as:

- **Question & Answering (Q&A)**
- **Pattern-based problem-solving**
- **Language translation**
- **Other NLP-related capabilities**

With the right **prompt engineering techniques**, many companies have replaced some of their traditional NLP-based solutions. These conventional solutions typically involve:

- **Data collection** 📊
- **Data processing** 🔄
- **Model training & testing** 🏋️‍♂️

Such tasks often require **a skilled data scientist**, making the process resource-intensive.

## Why Function Calling?
**Function Calling** answers questions like these:

- 🤔 What if we want an LLM model to perform a specific task **in a customized way**, beyond its training?
- 🤖 What if we need the model to complete a task **it wasn't explicitly trained for**?
- ⏳ What if the context provided to the LLM is **real-time and dynamic**?

This is where **Function Calling** plays a crucial role! It extends the capabilities of LLMs by enabling them to:

- Interact with external systems 📡
- Process dynamic inputs 📥
- Retrieve real-time data ⏳
- Execute specific functions 🛠️

## Exploring Function Calling
Now that we understand the purpose, let's explore **Function Calling** in more detail!



# Function vs Tool in Python and LLMs

- In **Python**, a reusable block of code that we can call is a **function**.
- In **LLM terminology**, a function exposed to the model is called a **tool**.
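
What turns a function into a tool is a machine-readable description that the model can read: a name, a docstring, and the parameters. The sketch below builds a simplified description of that kind using only the standard library. `describe_function` and `add` are hypothetical names for illustration, and this is not how LangChain's `@tool` decorator is implemented (the real one produces a richer JSON Schema).

```python
import inspect
from typing import get_type_hints


def describe_function(func):
    """Build a simplified, tool-like description of a plain Python function."""
    hints = get_type_hints(func)
    parameters = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        hint = hints.get(name)
        # Use the readable type name when one exists, e.g. "int" or "str".
        parameters[name] = getattr(hint, "__name__", str(hint))
        # Parameters without a default value must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
        "required": required,
    }


def add(a: int, b: int = 0) -> int:
    """Add two integers."""
    return a + b


spec = describe_function(add)
print(spec)
```

The model only ever sees a description like `spec`; it never sees or runs the function body.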


In [3]:
## Let's create a simple function

from langchain_core.tools import tool
from typing import Optional

@tool
def greet(name: str, role: Optional[str] = None) -> str:  # I don't know whether the company below exists; it is used strictly for educational purposes
    """
    Greet a person by name in the style of WiseTechGaint's company culture.

    Args:
        name (str): name of the employee in WiseTechGaint
        role (str, optional): role of the employee in WiseTechGaint. Defaults to None

    Returns:
        str: greeting message to the person

    Examples:
        >>> greet("sweet")
        Hello sweet, we are proud to invite you to WiseTechGaint. From today you are part of WSTians.

        >>> greet("chole","software developer")
        Hello chole, we are proud to invite you to WiseTechGaint. Good Luck with your software developer role. From today you are part of WSTians
    """

    if role:
        return f"Hello {name}, we are proud to invite you to WiseTechGaint. Good Luck with your {role} role. From today you are part of WSTians"
    return f"Hello {name}, we are proud to invite you to WiseTechGaint. From today you are part of WSTians."

In [13]:
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv

load_dotenv() # loads the Groq API key from a .env file, if one exists, into the environment

tools = [greet] # we will add all our tools here
llm = init_chat_model("deepseek-r1-distill-llama-70b",model_provider="groq")
llm_with_tools = llm.bind_tools(tools) # bind your tools here

In [14]:
model_resp = llm_with_tools.invoke("what is the result of 2 multiplied with 100") # for this math question, the LLM understands that it does not need to call the greet function created above

In [16]:
model_resp.content

'The result of multiplying 2 by 100 is 200.\n\nStep-by-Step Explanation:\n1. **Understand the Problem**: We need to calculate 2 multiplied by 100.\n2. **Multiplication as Repeated Addition**: Multiplying 2 by 100 is the same as adding 100 two times: 100 + 100.\n3. **Perform the Addition**: Adding 100 and 100 gives 200.\n4. **Verification**: Another way to think about it is that multiplying by 100 adds two zeros to the number, turning 2 into 200.\n5. **Conclusion**: Therefore, 2 × 100 equals 200.\n\nAnswer: 200'

In [18]:
model_resp.tool_calls # you can see the tool call is empty

[]

In [22]:
model_resp = llm_with_tools.invoke("a new employee named andrew joined our company as a technical associate. kindly greet him.") # let's observe now

In [24]:
model_resp.content # now you can observe that the LLM did not respond with a message

''

In [26]:
model_resp.tool_calls # instead, the LLM understood the context and returned the function to call along with its arguments

[{'name': 'greet',
  'args': {'name': 'andrew', 'role': 'technical associate'},
  'id': 'call_2tcq',
  'type': 'tool_call'}]

An LLM can't run the function by itself (**LangGraph solves this problem differently; we will explore it later in Agents Concepts**). It can, however, return the name and arguments of the function that we need to call.

In [28]:
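tools_by_name = {t.name: t for t in tools} # look tools up by name instead of calling eval() on text produced by the model

for each_tool in model_resp.tool_calls:
    function_call = tools_by_name[each_tool["name"]]
    print("model resp: ",function_call.invoke(each_tool["args"])) # a @tool object is run with .invoke(args), not called like a normal function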

model resp:  Hello andrew, we are proud to invite you to WiseTechGaint. Good Luck with your technical associate role. From today you are part of WSTians
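
Because the tool name and arguments come from the model, it is safer to resolve them through an explicit registry and to handle bad requests gracefully. The following is a minimal sketch under stated assumptions: `greet_stub`, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names, and the stub stands in for the real `greet` tool so the example runs without LangChain or an API key. The tool-call dictionaries mirror the shape shown in the output above.

```python
def greet_stub(name: str, role: str = "") -> str:
    """Stand-in for the `greet` tool so the sketch runs without LangChain."""
    suffix = f" Good luck with your {role} role." if role else ""
    return f"Hello {name}.{suffix}"


# Only the functions registered here can ever be executed.
TOOL_REGISTRY = {"greet": greet_stub}


def run_tool_calls(tool_calls, registry):
    """Execute each requested tool call and collect one result per call id."""
    results = []
    for call in tool_calls:
        func = registry.get(call["name"])
        if func is None:
            # The model asked for a tool we never offered.
            results.append({"id": call["id"], "error": f"unknown tool: {call['name']}"})
            continue
        try:
            output = func(**call["args"])
        except TypeError as exc:
            # Raised when the model supplies missing or unexpected arguments.
            results.append({"id": call["id"], "error": str(exc)})
            continue
        results.append({"id": call["id"], "output": output})
    return results


sample_calls = [
    {"name": "greet", "args": {"name": "andrew", "role": "technical associate"}, "id": "call_1", "type": "tool_call"},
    {"name": "delete_everything", "args": {}, "id": "call_2", "type": "tool_call"},
]
results = run_tool_calls(sample_calls, TOOL_REGISTRY)
for item in results:
    print(item)
```

Returning an error entry instead of raising keeps one bad tool call from aborting the others.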


### Now Let's Learn Structured Outputs

In [32]:
text_analysis_response = llm.invoke("""
Hi my name is Nichole, i am currently working in ABC tech company in Vadodara. I have been working in this place for mrore than 
5 years. This place is nice but my family is in London. Since we have an office in London, i am requesting you to pack and ship me
over there permenantly. i will continue to work productively as if i am right now. thank you

---
Task: Find names, locations, company_names in above sentence. Also, give a short 15 words summary of the above sentence. Also, provide sentiment of the sentence

Note: i only need JSON response. I dont want your explanation
Sample Response : {"names":[],"locations":[],"company_names":[],"summary":"","sentiment":""}
""")

In [37]:
text_analysis_response.content

'<think>\nAlright, I need to help the user extract specific information from their query. They provided a sentence and want me to find names, locations, and company names. Let me read through the sentence carefully.\n\nFirst, looking for names. I see "Nichole" mentioned at the beginning. That\'s a name. I don\'t see any other names in the sentence.\n\nNext, locations. There are two places mentioned: "Vadodara" and "London." Both are cities, so those are the locations.\n\nFor company names, the sentence says "ABC tech company." So that\'s the company name. I should include that.\n\nNow, the summary needs to be 15 words. The main points are Nichole working in Vadodara for ABC Tech, wanting to move to London because her family is there, and the office exists there. I\'ll condense that into a concise sentence.\n\nLastly, the sentiment. The overall tone is positive because she\'s expressing gratitude and willingness to continue working productively. So the sentiment is positive.\n\nI should

The model did return the requested JSON, but the reply also begins with its `<think>` reasoning text, so we would still need to write extra logic to extract the JSON reliably.
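
To make that extra logic concrete, here is a rough sketch of the manual post-processing. The `raw_reply` string is a hypothetical, shortened reply (the real output above is truncated, so its exact ending is unknown), and `extract_json` is an illustrative helper rather than a library function.

```python
import json
import re

# Hypothetical reply: a reasoning block followed by the JSON we asked for.
raw_reply = (
    "<think>\nThe user wants names, locations and company names...\n</think>\n\n"
    '{"names": ["Nichole"], "locations": ["Vadodara", "London"], '
    '"company_names": ["ABC Tech"], "summary": "Nichole requests a transfer to London.", '
    '"sentiment": "neutral"}'
)


def extract_json(reply: str) -> dict:
    """Strip the <think> block, then parse the outermost JSON object."""
    # DOTALL lets "." match newlines inside the reasoning block.
    without_think = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    start = without_think.find("{")
    end = without_think.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(without_think[start : end + 1])


parsed = extract_json(raw_reply)
print(parsed["names"], parsed["locations"])
```

Even this small helper is fragile: it breaks if the model adds commentary containing braces or returns malformed JSON.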


Now let's see how **Structured Output** solves this problem.

In [67]:
from pydantic import BaseModel,Field
from typing import List, Literal,Union,Any

class text_result(BaseModel):
    """
    For NLP Tasks related to extracting entities, intent, summary and sentiment
    """
    names:List[str] = Field(...,description="List of person names")
    locations:List[str] = Field(...,description="List of geographical locations")
    company_names:List[str] = Field(...,description="List of any type of company name")
    summary:str = Field(...,description="summary of the conversation/text")
    sentiment:Literal["positive","negative","neutral"] = Field(...,description="sentiment of the conversation/text")

Using Pydantic is not mandatory.


We can also use a plain `json schema` or the standard `typing` module. The advantage of `Pydantic` is that it validates the response, so the declared data types are strictly enforced.
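
For comparison, the same structure can be sketched without Pydantic. `TextResultDict` and `text_result_schema` are hypothetical names that mirror the `text_result` model above. To my understanding of recent LangChain versions, either object can be passed to `llm.with_structured_output(...)`, but treat that as an assumption to check against your installed version. The last lines show why Pydantic is stricter: a `TypedDict` performs no validation at runtime.

```python
from typing import Annotated, List, Literal, TypedDict, get_type_hints


class TextResultDict(TypedDict):
    """For NLP tasks related to extracting entities, summary and sentiment."""

    names: Annotated[List[str], ..., "List of person names"]
    locations: Annotated[List[str], ..., "List of geographical locations"]
    company_names: Annotated[List[str], ..., "List of any type of company name"]
    summary: Annotated[str, ..., "summary of the conversation/text"]
    sentiment: Annotated[Literal["positive", "negative", "neutral"], ..., "sentiment of the conversation/text"]


def string_list(description):
    """Small helper to keep the JSON Schema below readable."""
    return {"type": "array", "items": {"type": "string"}, "description": description}


text_result_schema = {
    "title": "text_result",
    "description": "For NLP tasks related to extracting entities, summary and sentiment",
    "type": "object",
    "properties": {
        "names": string_list("List of person names"),
        "locations": string_list("List of geographical locations"),
        "company_names": string_list("List of any type of company name"),
        "summary": {"type": "string", "description": "summary of the conversation/text"},
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"],
            "description": "sentiment of the conversation/text",
        },
    },
    "required": ["names", "locations", "company_names", "summary", "sentiment"],
}

# Both definitions describe the same five fields.
same_fields = set(get_type_hints(TextResultDict)) == set(text_result_schema["properties"])

# A TypedDict is only a type hint: wrong types and values are accepted at runtime.
unchecked = TextResultDict(names="Nichole", locations=[], company_names=[], summary="", sentiment="excited")
print(same_fields, type(unchecked).__name__)
```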

In [46]:
structured_llm = llm.with_structured_output(text_result)
structured_resp = structured_llm.invoke("""
Hi my name is Nichole, i am currently working in ABC tech company in Vadodara. I have been working in this place for mrore than 
5 years. This place is nice but my family is in London. Since we have an office in London, i am requesting you to pack and ship me
over there permenantly. i will continue to work productively as if i am right now. thank you

---
Task: Find names, locations, company_names in above sentence. Also, give a short 15 words summary of the above sentence. Also, provide sentiment of the sentence
""")

In [55]:
structured_resp # This is the beauty of structured response

text_result(names=['Nichole'], locations=['Vadodara', 'London'], company_names=['ABC Tech'], summary='Nichole requests a transfer to London from Vadodara for family reasons.', sentiment='neutral')

In [57]:
structured_resp.names # we can directly access extracted fields without any post-processing

['Nichole']

#### What if you want structured output for specific queries and a normal response for the rest?

In [69]:
class DefaultResponse(BaseModel):
    """
    For conversations other than NLP related tasks
    """
    response:str

class LLMResponse(BaseModel):
    modelresp: Union[text_result,DefaultResponse]

In [75]:
complex_structured_llm = llm.with_structured_output(LLMResponse,include_raw=False) # include_raw=True would also return the raw LLM message alongside the parsed output

structured_resp = complex_structured_llm.invoke("""
Hi my name is Nichole, i am currently working in ABC tech company in Vadodara. I have been working in this place for mrore than 
5 years. This place is nice but my family is in London. Since we have an office in London, i am requesting you to pack and ship me
over there permenantly. i will continue to work productively as if i am right now. thank you

---
Task: Find names, locations, company_names in above sentence. Also, give a short 15 words summary of the above sentence. Also, provide sentiment of the sentence
""")

In [76]:
structured_resp.modelresp # see we got a structured output for above query

text_result(names=['Nichole'], locations=['Vadodara', 'London'], company_names=['ABC Tech'], summary='Nichole requests transfer to London to be closer to family while continuing productive work.', sentiment='positive')

In [72]:
structured_resp = complex_structured_llm.invoke("what is the result if 5 is multiplied with 123")

In [74]:
structured_resp.modelresp # As you can see here, for other queries the response structure is different

DefaultResponse(response='The result of multiplying 5 by 123 is 615.')
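
Since `modelresp` can hold either model, the calling code usually has to branch on the type it received. The sketch below shows that branching with `isinstance`. It uses dataclasses as stand-ins so it runs without Pydantic or an LLM; `TextResult`, `DefaultReply`, and `render` are hypothetical names, and the same check works on the Pydantic classes defined above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TextResult:
    """Stand-in for the Pydantic `text_result` model."""

    names: List[str] = field(default_factory=list)
    sentiment: str = "neutral"


@dataclass
class DefaultReply:
    """Stand-in for the Pydantic `DefaultResponse` model."""

    response: str = ""


def render(modelresp: Union[TextResult, DefaultReply]) -> str:
    """Route the parsed response to the right handling code."""
    if isinstance(modelresp, TextResult):
        return f"[analysis] {', '.join(modelresp.names)}: {modelresp.sentiment}"
    return f"[chat] {modelresp.response}"


print(render(TextResult(names=["Nichole"], sentiment="positive")))
print(render(DefaultReply(response="The result of multiplying 5 by 123 is 615.")))
```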

`Note`: We can't use a single **LLM instance** to perform both **function calling** and **structured output**.

# 🚀 A Quick Recap on What We Have Learned So Far

## 📌 Key Takeaways

1. **Using Python Functions as Tools for LLM Models**  
   - We learned how to pass Python functions as tools to LLM models.  
   - We can add multiple tools, but adding **30 or more** can be overwhelming. ⚠️ Too many tools might confuse the model.

2. **Structuring LLM Responses**  
   - We can structure LLM responses in a desired format using:
     - `pydantic` (✅ Preferred method)
     - JSON Schema
     - Standard `typing` library
   - If multiple structured response models are needed, we can wrap them in a **single parent Pydantic model**.
   - The `Union` type from the `typing` module lets one field of that parent model accept any of the response models.

3. **Tool Calls vs. Structured Responses**  
   - ❌ We **cannot** use both **tool calls** and **structured output responses** simultaneously for a **single LLM model instance**.

# ⚡ Model Response Caching

## 🤔 Why Do We Need This?
- If you are using an LLM hosted by a provider such as **OpenAI, Cohere, or Groq**, you are **paying** for each input and output token processed per request. 💰
- By default, even if a question has been answered before, the LLM still processes it again, **costing you unnecessarily**.
- **Solution:** Caching model results helps **save costs** and **improves response times**. ⚡
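
The idea behind the points above can be sketched in a few lines: key the cache on the model name plus the prompt, and only call the model on a miss. `FakeLLM` and `PromptCache` are hypothetical stand-ins written for this illustration; this is not how LangChain's cache classes are implemented.

```python
import hashlib


class FakeLLM:
    """Stand-in for a paid LLM endpoint; counts how many billable calls happen."""

    def __init__(self):
        self.calls = 0

    def invoke(self, prompt: str) -> str:
        self.calls += 1
        return f"answer to: {prompt}"


class PromptCache:
    """In-memory cache keyed on a hash of the model name and the prompt."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(model_name: str, prompt: str) -> str:
        return hashlib.sha256(f"{model_name}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(self, model_name: str, prompt: str, llm: FakeLLM) -> str:
        key = self._key(model_name, prompt)
        if key not in self._store:
            # Cache miss: this is the only place where tokens are paid for.
            self._store[key] = llm.invoke(prompt)
        return self._store[key]


llm_stub = FakeLLM()
cache = PromptCache()
first = cache.get_or_call("demo-model", "tell me about Agents in 25 words", llm_stub)
second = cache.get_or_call("demo-model", "tell me about Agents in 25 words", llm_stub)
print(llm_stub.calls)  # 1: the second request was served from the cache
```

Including the model name in the key prevents an answer from one model being returned for another.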

## 📍 Where Is This Cache Stored?
1. **In-Memory Caching** (🚀 Fast but temporary)
   - Limited by the process lifecycle: the cache **disappears** when the process restarts.

2. **Database-Based Caching** (✅ Persistent & Reliable)
   - Stores cache **across runtime restarts**.
   - Popular options include:
     - **SQLite** (Local file-based persistence 🗂️)
     - **Redis Cache** (Local & cloud-based database solution 🌍)
     - **AstraDB Cache** (Cloud-based database solution ☁️)
   - ...and many more!
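
To illustrate what "persistent" means here, the sketch below stores responses in a SQLite file and reads them back through a new connection, which simulates a process restart. It uses only the standard library. `SQLitePromptCache` and the `llm_cache` table are hypothetical names for this example and do not reflect the schema LangChain's own SQLite cache uses.

```python
import os
import sqlite3
import tempfile


class SQLitePromptCache:
    """Tiny file-backed cache: prompt text in, stored response out."""

    def __init__(self, path: str):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS llm_cache "
            "(prompt TEXT PRIMARY KEY, response TEXT NOT NULL)"
        )
        self._conn.commit()

    def lookup(self, prompt: str):
        row = self._conn.execute(
            "SELECT response FROM llm_cache WHERE prompt = ?", (prompt,)
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt: str, response: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO llm_cache (prompt, response) VALUES (?, ?)",
            (prompt, response),
        )
        self._conn.commit()

    def close(self) -> None:
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "llm_cache.db")

first_run = SQLitePromptCache(db_path)
first_run.update("tell me about Agents in 25 words", "cached answer")
first_run.close()  # simulate the process shutting down

second_run = SQLitePromptCache(db_path)  # simulate a fresh process
restored = second_run.lookup("tell me about Agents in 25 words")
missing = second_run.lookup("a prompt never seen before")
second_run.close()
print(restored, missing)
```

An in-memory dictionary would have lost the entry at the simulated restart; the file-backed table keeps it.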

---

### Redis Cloud Storage Cache

In [88]:
# I created a free Redis server for demo purposes; it will be deleted by the time you read this.
# The password is read from an environment variable instead of being hard-coded (REDIS_PASSWORD is an assumed name; set it in your .env file).

import os
import redis

redis_config = redis.Redis(
    host='redis-13626.c266.us-east-1-3.ec2.redns.redis-cloud.com',
    port=13626,
    decode_responses=True,
    username="default",
    password=os.environ["REDIS_PASSWORD"],
)

In [99]:
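from langchain_core.globals import set_llm_cache
from langchain_community.cache import RedisCache ## needed for the Redis-backed cache used in the next cell
from langchain.cache import InMemoryCache ## use this for in memory cache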

In [89]:
set_llm_cache(RedisCache(redis_=redis_config)) # I have set Redis Cloud Server for cache

In [107]:
%%time
resp = llm.invoke("tell me about Agents in 25 words") # you can see the response time for first hit is 2 seconds

CPU times: total: 15.6 ms
Wall time: 2.07 s


In [108]:
resp.content

"<think>\nOkay, so I need to explain what agents are in 25 words. Let me think about what an agent is. From what I remember, in computing, an agent is like a program that does tasks on your behalf. They can be autonomous, meaning they work on their own without much interference. They might have different roles, like a user agent in a web browser or a software agent handling specific tasks.\n\nI think agents can be smart, using AI to make decisions or learn. They might interact with environments, like in robotics or chatbots. Also, there are different types, such as intelligent agents that adapt and learn. So, putting this together, I should mention that agents are autonomous, can be AI-driven, operate in various environments, and handle tasks like information retrieval or decision-making.\n\nLet me count the words to make sure it's around 25. I'll try to be concise and cover the key points: autonomy, AI, environments, tasks, adaptability, learning, and examples like chatbots and robots

In [110]:
%%time
resp = llm.invoke("tell me about Agents in 25 words") # same question, now it took 349 ms. proof of caching in case you didn't believe me

CPU times: total: 0 ns
Wall time: 349 ms


You can see that the data has been inserted into the Redis server.

![Redis Server Cache](redis_cache.PNG)

### InMemory Cache

In [111]:
from langchain_core.globals import set_llm_cache
from langchain.cache import InMemoryCache

set_llm_cache(InMemoryCache())

In [112]:
%%time
resp = llm.invoke("tell me about Agents in 25 words") # since we switched to a new, empty InMemory cache, this is a cache miss and it again took 2.9 seconds

CPU times: total: 46.9 ms
Wall time: 2.9 s


In [114]:
resp.content # the content also differs from the Redis-cached answer: this call was a cache miss, and an LLM generally won't give the same response to the same query without caching

'<think>\nOkay, so I need to figure out what an agent is, and then describe it in 25 words. Hmm, where do I start? I\'ve heard the term "agent" used in different contexts, so maybe I should break it down.\n\nFirst, in computing, I think an agent is like a program that does tasks automatically. Like, maybe a software agent that helps with something. I remember hearing about chatbots being a type of agent. They interact with users, answer questions, maybe even help with customer service. So, that\'s one aspect.\n\nThen there\'s the idea of agents in a more general sense. I think in AI, an intelligent agent is something that perceives its environment and takes actions to achieve goals. That makes sense. So, an agent could be a program or a system that operates with some degree of autonomy.\n\nWait, autonomy is a key point here. Agents don\'t just follow explicit instructions every time; they can make decisions based on their environment. But how exactly do they do that? I guess they use s

In [118]:
%%time
resp = llm.invoke("tell me about Agents in 25 words") # this is even faster (1.12 ms) because the cache is in memory, whereas previously we used a cloud Redis DB

CPU times: total: 0 ns
Wall time: 1.12 ms


In [119]:
resp.content

'<think>\nOkay, so I need to figure out what an agent is, and then describe it in 25 words. Hmm, where do I start? I\'ve heard the term "agent" used in different contexts, so maybe I should break it down.\n\nFirst, in computing, I think an agent is like a program that does tasks automatically. Like, maybe a software agent that helps with something. I remember hearing about chatbots being a type of agent. They interact with users, answer questions, maybe even help with customer service. So, that\'s one aspect.\n\nThen there\'s the idea of agents in a more general sense. I think in AI, an intelligent agent is something that perceives its environment and takes actions to achieve goals. That makes sense. So, an agent could be a program or a system that operates with some degree of autonomy.\n\nWait, autonomy is a key point here. Agents don\'t just follow explicit instructions every time; they can make decisions based on their environment. But how exactly do they do that? I guess they use s