
xLAM: function calling with Salesforce's Large Action Model

Background: https://blog.salesforceairesearch.com/large-action-models/

In [None]:
import os
import requests
from google.colab import drive

# Mount Google Drive and work from the project folder
drive.mount('/content/drive')
os.chdir('/content/drive/MyDrive/YouTube/')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
!python --version

Python 3.10.12


In [None]:
! pip install transformers==4.41.0 datasets==2.19.1 tokenizers==0.19.1 flask==2.2.5



In [None]:
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fix the random seed for reproducibility
torch.random.manual_seed(0)

# Load the xLAM function-calling model and its tokenizer from the Hugging Face Hub
model_name = "Salesforce/xLAM-7b-fc-r"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

##### Model and Tokenizer Loaded

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [None]:
# Please use our provided instruction prompt for best performance
task_instruction = """
You are an expert in composing functions. You are given a question and a set of possible functions.
Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
If none of the functions can be used, point it out and refuse to answer.
If the given question lacks the parameters required by the function, also point it out.
""".strip()

format_instruction = """
The output MUST strictly adhere to the following JSON format, and NO other text MUST be included.
The example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please make tool_calls an empty list '[]'.
```
{
    "tool_calls": [
    {"name": "func_name1", "arguments": {"argument1": "value1", "argument2": "value2"}},
    ... (more tool calls as required)
    ]
}
```
""".strip()

In [None]:
from pprint import pprint
get_weather_api = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, New York"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The unit of temperature to return"
            }
        },
        "required": ["location"]
    }
}

search_api = {
    "name": "search",
    "description": "Search for information on the internet",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query, e.g. 'latest news on AI'"
            }
        },
        "required": ["query"]
    }
}

openai_format_tools = [get_weather_api, search_api]

# Helper function to convert openai format tools to our more concise xLAM format
def convert_to_xlam_tool(tools):
    """Convert one tool (dict) or a list of tools from OpenAI format to xLAM format."""
    if isinstance(tools, dict):
        return {
            "name": tools["name"],
            "description": tools["description"],
            # Keep only the per-parameter definitions; "type" and "required" are dropped
            "parameters": {k: v for k, v in tools["parameters"].get("properties", {}).items()}
        }
    elif isinstance(tools, list):
        return [convert_to_xlam_tool(tool) for tool in tools]
    else:
        return tools

# Helper function to build the input prompt for our model
def build_prompt(task_instruction: str, format_instruction: str, tools: list, query: str):
    prompt = f"[BEGIN OF TASK INSTRUCTION]\n{task_instruction}\n[END OF TASK INSTRUCTION]\n\n"
    # Use the tools passed in as an argument rather than a global variable
    prompt += f"[BEGIN OF AVAILABLE TOOLS]\n{json.dumps(tools)}\n[END OF AVAILABLE TOOLS]\n\n"
    prompt += f"[BEGIN OF FORMAT INSTRUCTION]\n{format_instruction}\n[END OF FORMAT INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF QUERY]\n{query}\n[END OF QUERY]\n\n"
    return prompt

xlam_format_tools = convert_to_xlam_tool(openai_format_tools)
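
To make the conversion concrete, here is a minimal, self-contained sketch that repeats the dict branch of the helper in compact form and applies it to a tiny tool. The names `to_xlam` and `echo_api` are hypothetical and exist only for illustration. It shows that the xLAM format keeps the per-parameter definitions and drops the outer `type` and `required` fields.

```python
import json

def to_xlam(tool: dict) -> dict:
    # Compact copy of the dict branch of convert_to_xlam_tool above
    return {
        "name": tool["name"],
        "description": tool["description"],
        "parameters": dict(tool["parameters"].get("properties", {})),
    }

# Hypothetical tool in OpenAI format, used only for this illustration
echo_api = {
    "name": "echo",
    "description": "Return the text unchanged",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {"type": "string", "description": "Text to echo"}
        },
        "required": ["text"],
    },
}

xlam_tool = to_xlam(echo_api)
print(json.dumps(xlam_tool, indent=2))
```

Because the `required` list is not carried over, the model only sees which parameters exist, not which ones are mandatory.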

01. DEFAULT EXAMPLE

In [None]:
def custom_func_def(query):
  """Build the prompt for `query`, run the model, and return the generated text."""
  content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query)

  messages=[
      { 'role': 'user', 'content': content}
  ]
  # Apply the model's chat template and move the token ids to the model's device
  inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

  # tokenizer.eos_token_id is the id of <|EOT|> token
  outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
  # Decode only the newly generated tokens, skipping the prompt
  res = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
  return res

In [None]:
# Define the input query and run the model on it

query = "What's the weather like in New York in fahrenheit?"
print(custom_func_def(query))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:100015 for open-end generation.



{"tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}


In [None]:
query = "give latest news on startups"
print(custom_func_def(query))





{"tool_calls": [{"name": "search", "arguments": {"query": "latest news on startups"}}]}
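
The model returns its tool calls as a JSON string, so downstream code has to parse it before use. The following is a minimal sketch of a defensive parser, assuming the response follows the format instruction above. The function name `parse_tool_calls` is hypothetical. It returns an empty list when the text is not valid JSON or does not contain well-formed tool calls.

```python
import json

def parse_tool_calls(raw: str) -> list:
    """Return the tool calls found in a model response, or [] if none can be parsed."""
    try:
        payload = json.loads(raw.strip())
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    calls = payload.get("tool_calls", [])
    if not isinstance(calls, list):
        return []
    # Keep only entries that have a string name and a dict of arguments
    return [
        c for c in calls
        if isinstance(c, dict)
        and isinstance(c.get("name"), str)
        and isinstance(c.get("arguments"), dict)
    ]

# Sample taken from the weather query output above
sample = '{"tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}'
calls = parse_tool_calls(sample)
print(calls[0]["name"], calls[0]["arguments"])

# Free text instead of JSON yields an empty list rather than an exception
print(parse_tool_calls("Sorry, I cannot help with that."))
```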


02. OWN API USING FLASK

In [None]:
from flask import Flask, request, jsonify
import threading

app = Flask(__name__)

# In-memory stand-in for a customer database, keyed by customer_id (as a string)
dummy_data = {
    '1001': {'customer_id': '1001', 'name': 'John Doe', 'status': 'active', 'balance': 250.00},
    '1002': {'customer_id': '1002', 'name': 'Jane Smith', 'status': 'inactive', 'balance': 0.00},
    '1003': {'customer_id': '1003', 'name': 'Alice Brown', 'status': 'active', 'balance': 1250.50},
}

@app.route("/")
def main():
    # Health-check endpoint
    return jsonify(status="healthy")

@app.route('/customer', methods=['GET'])
def get_customer():
    # Query-string values arrive as strings, so they match the string keys above
    customer_id = request.args.get('customer_id')
    if not customer_id:
        return jsonify({"error": "customer_id is required"}), 400
    customer_data = dummy_data.get(customer_id, {"error": "Customer not found"})
    return jsonify(customer_data)

@app.route('/send_email', methods=['GET'])
def send_email():
    email = request.args.get('email')
    if not email:
        return jsonify({"error": "email is required"}), 400
    # Dummy endpoint: no email is actually sent
    return jsonify({"status": "sent"})

if __name__ == "__main__":
    # Run the server in a background thread so the notebook stays responsive
    threading.Thread(target=lambda: app.run(debug=True, use_reloader=False)).start()

In [None]:
! curl http://localhost:5000/

INFO:werkzeug:127.0.0.1 - - [08/Sep/2024 12:33:22] "GET / HTTP/1.1" 200 -


{
  "status": "healthy"
}


In [None]:
! curl -X GET "http://localhost:5000/customer?customer_id=1002"

INFO:werkzeug:127.0.0.1 - - [08/Sep/2024 12:34:05] "GET /customer?customer_id=1002 HTTP/1.1" 200 -


{
  "balance": 0.0,
  "customer_id": "1002",
  "name": "Jane Smith",
  "status": "inactive"
}


In [None]:
! curl -X GET "http://localhost:5000/send_email?email=abc@abc.com"

INFO:werkzeug:127.0.0.1 - - [08/Sep/2024 12:34:19] "GET /send_email?email=abc@abc.com HTTP/1.1" 200 -


{
  "status": "sent"
}


03. CALLS TO API

In [None]:
from pprint import pprint
customer_api = {
    "name": "customer",
    "description": "Get the customer_id as input and return results",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {
                "type": "integer",
                "description": " the customer_id"
            }
        },
        "required": ["customer_id"]
    }
}

send_email_api = {
    "name": "send_email",
    "description": "Send email to email address given in function",
    "parameters": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "emailid"
            }
        },
        "required": ["email"]
    }
}

openai_format_tools = [customer_api, send_email_api]
xlam_format_tools = convert_to_xlam_tool(openai_format_tools)
openai_format_tools

[{'name': 'customer',
  'description': 'Get the customer_id as input and return results',
  'parameters': {'type': 'object',
   'properties': {'customer_id': {'type': 'integer',
     'description': ' the customer_id'}},
   'required': ['customer_id']}},
 {'name': 'send_email',
  'description': 'Send email to email address given in function',
  'parameters': {'type': 'object',
   'properties': {'email': {'type': 'string', 'description': 'emailid'}},
   'required': ['email']}}]

In [None]:
# Define the input query and available tools
# Build the input and start the inference



query = "get results for customer id 1001"
print(custom_func_def(query))

query = "send email to address abc@abc.com"
print(custom_func_def(query))





{"tool_calls": [{"name": "customer", "arguments": {"customer_id": 1001}}]}

{"tool_calls": [{"name": "send_email", "arguments": {"email": "abc@abc.com"}}]}
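
Before forwarding a generated call to a real API, it can help to check the arguments against the tool's OpenAI-format schema. The sketch below is an assumption-laden illustration, not part of xLAM. The names `validate_arguments` and `JSON_TYPES` are hypothetical, and only a few JSON Schema types are handled.

```python
# Map a few JSON Schema type names to Python types (not exhaustive)
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the arguments match the schema."""
    schema = tool["parameters"]
    properties = schema.get("properties", {})
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unknown argument: {name}")
            continue
        type_name = properties[name].get("type")
        expected = JSON_TYPES.get(type_name)
        # bool is a subclass of int in Python, so reject it for non-boolean types
        is_bool_mismatch = isinstance(value, bool) and expected is not bool
        if expected is not None and (is_bool_mismatch or not isinstance(value, expected)):
            problems.append(f"argument {name} should be of type {type_name}")
    return problems

# Compact copy of the customer tool definition from the cell above
customer_tool = {
    "name": "customer",
    "parameters": {
        "type": "object",
        "properties": {"customer_id": {"type": "integer"}},
        "required": ["customer_id"],
    },
}

print(validate_arguments(customer_tool, {"customer_id": 1001}))
print(validate_arguments(customer_tool, {"customer_id": "1001"}))
print(validate_arguments(customer_tool, {}))
```

A non-empty result could be used to skip the API call or to ask the user for the missing information.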


In [None]:
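def base_call_api(query):
  """Ask the model which tool to call, then forward that call to the local Flask API."""
  base_url = "http://localhost:5000/"
  json_response_from_model = json.loads(custom_func_def(query))
  tool_calls = json_response_from_model.get("tool_calls", [])
  if not tool_calls:
    # The format instruction asks for an empty list when no tool applies
    return {"error": "model returned no tool call"}

  # The tool name doubles as the route name, e.g. "customer" -> /customer
  api_url = base_url + tool_calls[0]["name"]

  # The arguments are sent as query-string parameters
  params = tool_calls[0]["arguments"]
  response = requests.get(api_url, params=params, timeout=10)
  return response.json()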

In [None]:

query = "get results for customer id 1001"
print(base_call_api(query))

query = "send email to address abc@abc.com"
print(base_call_api(query))


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:100015 for open-end generation.
INFO:werkzeug:127.0.0.1 - - [08/Sep/2024 11:46:39] "GET /customer?customer_id=1001 HTTP/1.1" 200 -
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:100015 for open-end generation.


{'balance': 250.0, 'customer_id': '1001', 'name': 'John Doe', 'status': 'active'}


INFO:werkzeug:127.0.0.1 - - [08/Sep/2024 11:46:41] "GET /send_email?email=abc@abc.com HTTP/1.1" 200 -


{'status': 'sent'}
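
Since the tool name from the model is turned directly into a URL path, one possible safeguard is to dispatch only names on an allowlist, and to handle every entry in `tool_calls` rather than just the first. The sketch below illustrates that idea under stated assumptions. The names `ALLOWED_TOOLS`, `dispatch_tool_calls`, and `fake_fetch` are hypothetical, and `fake_fetch` stands in for `requests.get(url, params=params).json()` so the example runs without a server.

```python
from urllib.parse import urljoin

# Routes exposed by the Flask app above; anything else is refused
ALLOWED_TOOLS = {"customer", "send_email"}

def dispatch_tool_calls(tool_calls, fetch, base_url="http://localhost:5000/"):
    """Call `fetch(url, params)` for each allowed tool call and collect the results."""
    results = []
    for call in tool_calls:
        name = call.get("name")
        if name not in ALLOWED_TOOLS:
            results.append({"tool": name, "error": "tool not allowed"})
            continue
        url = urljoin(base_url, name)
        results.append({"tool": name, "result": fetch(url, call.get("arguments", {}))})
    return results

def fake_fetch(url, params):
    # Stand-in for an HTTP GET: echoes the request instead of sending it
    return {"url": url, "params": params}

calls = [
    {"name": "customer", "arguments": {"customer_id": 1001}},
    {"name": "delete_everything", "arguments": {}},
]
results = dispatch_tool_calls(calls, fake_fetch)
for item in results:
    print(item)
```

Passing the fetch function as a parameter keeps the dispatch logic testable; in the notebook it could be replaced by a small wrapper around `requests.get`.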
