# **OpenFunctions from Gorilla - Hosted Locally** 🚀

[![GitHub](https://badges.aleen42.com/src/github.svg)](https://github.com/ShishirPatil/gorilla)  [![arXiv](https://img.shields.io/badge/arXiv-2305.15334-<COLOR>.svg?style=flat-square)](https://arxiv.org/abs/2305.15334)   [![Discord](https://img.shields.io/discord/1111172801899012102?label=Discord&logo=discord&logoColor=green&style=flat-square)](https://discord.gg/SwTyuTAxX3)  [![Twitter](https://img.shields.io/twitter/url?url=https://twitter.com/shishirpatil_/status/1661780076277678082)](https://twitter.com/shishirpatil_/status/1661780076277678082)

🦍 Try out Gorilla for yourself! Here, we show how you can download the OpenFunctions model from Gorilla and run inference locally on Google Colab's FREE T4 GPU! 🤖

👾 WARNING: running Gorilla locally requires some patience, as the model must be downloaded. Expect to wait up to 15 minutes for the model to load, possibly more depending on Colab usage. ⏳


🟢 Now with Apache-2.0! Gorilla is commercially usable with no obligations 🚀


💃 If you want to use Gorilla or build on top of it: Feel absolutely free to do so - we believe in open source research and you don't even have to tell us! In case you choose to do something with it, we have a vibrant community in [Discord](https://discord.gg/se2NgJxtJc)! Stop by and say Hi 👋


<img src="https://github.com/ShishirPatil/gorilla/blob/gh-pages/assets/img/logo.png?raw=true" width=20% height=20%>

# First, some setup and downloading the model:

### 🚨**[IMPORTANT]🚨 Model will not load without this step:**
Before you begin, click on the "Runtime" tab in the toolbar and select "Change Runtime Type". Select the free "T4 GPU" option and save. (Alternatively, click "Connect" in the top right corner, which may automatically connect you to the GPU if you have used it in the past).

In [1]:
# Install and import necessary libraries
!pip install -qU sentencepiece accelerate bitsandbytes
import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig


In [63]:
# This function will format prompts correctly
def get_prompt(user_query: str, functions: list = None) -> str:
    """
    Generates a conversation prompt based on the user's query and a list of functions.

    Parameters:
    - user_query (str): The user's query.
    - functions (list): An optional list of functions to include in the prompt.

    Returns:
    - str: The formatted conversation prompt.
    """
    # Avoid a mutable default argument; None or an empty list means "no functions"
    if not functions:
        return f"USER: <<question>> {user_query}\nASSISTANT: "
    functions_string = json.dumps(functions)
    return f"USER: <<question>> {user_query} <<function>> {functions_string}\nASSISTANT: "

# This function will return Gorilla's response
def get_gorilla_response(query: str, functions: list = None) -> str:
    prompt = get_prompt(query, functions)
    # The generated text includes the prompt, so keep only its last line
    output = pipe(prompt)[0]['generated_text'].splitlines()[-1]
    # Drop the leading "ASSISTANT:" label
    return output[output.index(":")+1:].strip()
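To see what the post-processing in `get_gorilla_response` does, the sketch below applies the same two steps to a hand-written string. The `extract_call` helper and the sample text are illustrative assumptions, not real pipeline output; the only thing assumed about the pipeline is that the generated text contains the prompt followed by the completion, so the last line starts with `ASSISTANT:`.

```python
def extract_call(generated_text: str) -> str:
    """Mirror the post-processing in get_gorilla_response (hypothetical helper)."""
    last_line = generated_text.splitlines()[-1]           # keep only the final line
    return last_line[last_line.index(":") + 1:].strip()   # drop the "ASSISTANT:" label

# Hand-written stand-in for pipe(prompt)[0]["generated_text"]
sample = (
    "USER: <<question>> What's the weather like in Boston?\n"
    'ASSISTANT: get_current_weather(location="Boston")'
)
extracted = extract_call(sample)
print(extracted)
```

Note that only the final line is kept, so a completion that spans several lines would be truncated to its last one.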

In [9]:
# Device Setup
device: str = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Quantization Setup
# This configuration compresses the model and makes it possible to use with Colab's GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype
)
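As a rough illustration of why 4-bit quantization helps on a T4, the sketch below estimates the memory needed for the weights alone at different precisions. The parameter count of about 7 billion is an assumption made for the arithmetic, and the estimate ignores activations, the KV cache, and quantization overhead, so treat the numbers as a lower bound.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7e9  # assumption: a model of roughly 7 billion parameters

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # half precision
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit quantized weights
print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {nf4_gb:.1f} GB")
```

Under these assumptions the half-precision weights alone come close to the T4's 16 GB of memory, while the 4-bit weights need about a quarter of that.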

### Warning: This cell will take a long time to run (up to 15 minutes, possibly more)

In [8]:
# Model and tokenizer setup
model_id: str = "gorilla-llm/gorilla-openfunctions-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    quantization_config=bnb_config,
    low_cpu_mem_usage=True,
    device_map="auto"
)

# Pipeline setup
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=128,
    batch_size=16,
    torch_dtype=torch_dtype,
)


---

# 🚀 Using Gorilla is as easy as calling `get_gorilla_response()` with your prompt! Try out Gorilla, and share your interesting findings in `#showcase` on our [Discord](https://discord.gg/3apqwwME)! 🤩

# Gorilla OpenFunctions

## Gorilla OpenFunctions is a drop-in replacement for the OpenAI Function Calling API!

## Introduction
**OpenFunctions** is designed to extend Large Language Model (LLM) Chat Completion features by formulating executable API calls from natural language instructions and API context. With it, LLMs can fill in parameters for a diverse range of services, from Instagram and DoorDash, to tools like Google Calendar and Stripe, to enterprise services such as Salesforce and Datadog. With OpenFunctions, even those unfamiliar with API calls or programming can use the model to generate the desired API calls. Trained on a curated collection of API documentation and associated Q&A pairs, OpenFunctions is another step in the Gorilla paradigm's ongoing evolution, aiming for higher quality and accuracy in function call generation.


## Code Function Calling API vs. REST API
During data collection, we found that API calls fall broadly into two categories:

1. **Code Function Calling APIs**:
    - Predominantly found in external Python packages such as NumPy and scikit-learn.
    - Characterized by well-defined and easily formatted calls.
    - Knowing the `api_name` (e.g., `numpy.sum()`) and the `arguments` specification is enough to construct an executable function call.
    - Because of their consistent format and fixed locality, fine-tuning the model requires relatively little data.

2. **REST APIs**:
    - Traditional `GET` and `POST` requests.
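To make the first category concrete, here is a minimal sketch of how an `api_name` and a set of arguments are enough to assemble a call string. The `format_call` helper is hypothetical and only illustrates the idea; it is not part of Gorilla.

```python
def format_call(api_name: str, arguments: dict) -> str:
    """Render a function-call string from an API name and keyword arguments."""
    rendered = ", ".join(f"{key}={value!r}" for key, value in arguments.items())
    return f"{api_name}({rendered})"

# numpy.sum accepts the array as `a` and an optional `axis`
call = format_call("numpy.sum", {"a": [1, 2, 3], "axis": 0})
print(call)  # numpy.sum(a=[1, 2, 3], axis=0)
```

A REST request has no such fixed shape (method, URL, headers, and body all vary), which is one way to read the distinction drawn above.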

## How to use OpenFunctions
Using **Gorilla OpenFunctions** is straightforward:

1. **Define Your Functions**:
    - Provide a JSON-style description of your custom functions.
    - Each function should include the fields `name`, `api_call`, `description`, and `parameters`.
    - Below is an example of API documentation suitable for OpenFunctions:
      ```python
      function_documentation = {  
          "name" : "Order Food on Uber",
          "api_call": "uber.eat.order",
          "description": "Order food on uber eat, specifying items and their quantities",
          "parameters": [
              {
                  "name": "restaurants",
                  "description": "The chosen restaurant"
              },
              {
                  "name": "items",
                  "description": "List of selected items"
              },
              {
                  "name": "quantities",
                  "description": "Quantities corresponding to the chosen items"
              }
          ]
      }
      ```

2. **Ask Your Question**:
    - Frame your requirement conversationally.
    - For instance: *I want to order five burgers and six chicken wings from McDonald's.*

3. **Get Your Function Call**:
    - The model interprets your request and responds with a Python function call.
    - This lets both developers and non-programmers use complex functionality without writing much code.
      ```python
      # Input:
      get_gorilla_response(query="I want to order five burgers and six chicken wings from McDonald's.", functions=[function_documentation])

      # Output:
      # uber.eat.order(restaurants="McDonald", items=["chicken wings", "burgers"], quantities=[6,5])
      ```
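Before sending a function description to the model, it can help to check that the fields listed in step 1 are present. The sketch below is one possible check; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names introduced here, and the model itself does not enforce this schema (some examples later in this notebook use other layouts).

```python
REQUIRED_FIELDS = ("name", "api_call", "description", "parameters")

def missing_fields(function_doc: dict) -> list:
    """Return the required fields that are absent from a function description."""
    return [field for field in REQUIRED_FIELDS if field not in function_doc]

example_doc = {
    "name": "Order Food on Uber",
    "api_call": "uber.eat.order",
    "description": "Order food on uber eat, specifying items and their quantities",
    "parameters": [{"name": "items", "description": "List of selected items"}],
}

print(missing_fields(example_doc))             # []
print(missing_fields({"name": "incomplete"}))  # ['api_call', 'description', 'parameters']
```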

---

## Here are some examples. Just run the cells to see how Gorilla performs!

You can edit any of the blocks in-place, and try it out yourself. Also try adding your own cells below!

### Weather:

In [64]:
query = "What's the weather like in Boston, in degrees celsius?"

# Example dummy function, hard-coded to return the same weather
# In production, this could be your backend API or an external API
def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    weather_info = {
        "location": location,
        "temperature": "72",
        "unit": unit,
        "forecast": ["sunny", "windy"],
    }
    return json.dumps(weather_info)


functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]

In [66]:
output = get_gorilla_response(query, functions)
print(output)

get_current_weather(location="Boston", unit="celsius")
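The returned string can be turned into a real call to the dummy function. The sketch below is one cautious way to do that: it parses the text with `ast` instead of `eval()`, accepts only keyword arguments with literal values, and looks the function up in an allow-list. The names `AVAILABLE_FUNCTIONS` and `run_call` are hypothetical, and the weather function is re-declared as a shortened stub so the block runs on its own.

```python
import ast
import json

def get_current_weather(location, unit="fahrenheit"):
    """Stub with the same signature as the notebook's dummy function."""
    return json.dumps({"location": location, "temperature": "72", "unit": unit})

# Only functions listed here may be called (hypothetical allow-list)
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

def run_call(call_text: str):
    """Parse a call such as f(a="x") and dispatch it without using eval()."""
    node = ast.parse(call_text, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"not a simple function call: {call_text!r}")
    func = AVAILABLE_FUNCTIONS[node.func.id]  # KeyError for unknown names
    # literal_eval accepts only literals, so arbitrary expressions are rejected
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return func(**kwargs)

result = run_call('get_current_weather(location="Boston", unit="celsius")')
print(result)
```

Positional arguments are ignored in this sketch, and output that is not valid Python raises a `SyntaxError`, which a caller would need to handle.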


### Uber:

In [67]:
query: str = "Call me an Uber ride type \"Plus\" in Berkeley at zipcode 94704 in 10 minutes"
functions = [
    {
        "name": "Uber Carpool",
        "api_name": "uber.ride",
        "description": "Find suitable ride for customers given the location, type of ride, and the amount of time the customer is willing to wait as parameters",
        "parameters":  [
            {"name": "loc", "description": "Location of the starting place of the Uber ride"},
            {"name": "type", "enum": ["plus", "comfort", "black"], "description": "Types of Uber ride user is ordering"},
            {"name": "time", "description": "The amount of time in minutes the customer is willing to wait"}
        ]
    }
]

In [68]:
output = get_gorilla_response(query, functions)
print(output)

uber.ride(loc="94704", type="plus", time=10)


### Uber Eats:

In [69]:
query = "I want to order five burgers and six chicken wings from McDonald's."
function_documentation = {
    "name" : "Order Food on Uber",
    "api_call": "uber.eat.order",
    "description": "Order food on uber eat, specifying items and their quantities",
    "parameters": [
        {
            "name": "restaurants",
            "description": "The chosen restaurant"
        },
        {
            "name": "items",
            "description": "List of selected items"
        },
        {
            "name": "quantities",
            "description": "Quantities corresponding to the chosen items"
        }
    ]
}

In [70]:
output = get_gorilla_response(query, functions=[function_documentation])
print(output)

uber.eat.order(restaurants="McDonald's", items=["burger", "chicken wing"], quantities=5, 6)


### AWS:

In [71]:
query = "I want to list the exports for my bot with the bot id \"my-bot-id\" and the bot version \"v2\"."
functions = [
    {
        "domain": "Cloud Infrastructure",
        "framework": "aws",
        "functionality": "Lists the exports for a bot, bot locale, or custom vocabulary. Exports are kept in the list for 7 days.",
        "api_name": "aws.lexv2-models.list-exports",
        "api_arguments": [
            {
                "name": "bot-id",
                "description": "\nThe unique identifier that Amazon Lex assigned to the bot."
            },
            {
                "name": "bot-version",
                "description": "\nThe version of the bot to list exports for."
            }
        ],
        "python_environment_requirements": [
            "aws"
        ],
        "example_code": [],
        "output": {
            "botId -> (string)": "\nThe unique identifier assigned to the bot by Amazon Lex.",
            "botVersion -> (string)": "\nThe version of the bot that was exported.",
            "exportSummaries -> (list)": "\nSummary information for the exports that meet the filter criteria specified in the request. The length of the list is specified in the maxResults parameter. If there are more exports available, the nextToken field contains a token to get the next page of results.\n(structure)\n\nProvides summary information about an export in an export list.\nexportId -> (string)\n\nThe unique identifier that Amazon Lex assigned to the export.\nresourceSpecification -> (structure)\n\nInformation about the bot or bot locale that was exported.\nbotExportSpecification -> (structure)\n\nParameters for exporting a bot.\nbotId -> (string)\n\nThe identifier of the bot assigned by Amazon Lex.\nbotVersion -> (string)\n\nThe version of the bot that was exported. This will be either DRAFT or the version number.\n\nbotLocaleExportSpecification -> (structure)\n\nParameters for exporting a bot locale.\nbotId -> (string)\n\nThe identifier of the bot to create the locale for.\nbotVersion -> (string)\n\nThe version of the bot to export.\nlocaleId -> (string)\n\nThe identifier of the language and locale to export. The string must match one of the locales in the bot.\n\ncustomVocabularyExportSpecification -> (structure)\n\nThe parameters required to export a custom vocabulary.\nbotId -> (string)\n\nThe identifier of the bot that contains the custom vocabulary to export.\nbotVersion -> (string)\n\nThe version of the bot that contains the custom vocabulary to export.\nlocaleId -> (string)\n\nThe locale of the bot that contains the custom vocabulary to export.\n\ntestSetExportSpecification -> (structure)\n\nSpecifications for the test set that is exported as a resource.\ntestSetId -> (string)\n\nThe unique identifier of the test set.\n\n\nfileFormat -> (string)\n\nThe file format used in the export files.\nexportStatus -> (string)\n\nThe status of the export. When the status is Completed the export is ready to download.\ncreationDateTime -> (timestamp)\n\nThe date and time that the export was created.\nlastUpdatedDateTime -> (timestamp)\n\nThe date and time that the export was last updated.\n\n",
            "nextToken -> (string)": "\nA token that indicates whether there are more results to return in a response to the ListExports operation. If the nextToken field is present, you send the contents as the nextToken parameter of a ListExports operation request to get the next page of results.",
            "localeId -> (string)": "\nThe locale specified in the request."
        },
        "api_arguments_all": {
            "--bot-id ": "\nThe unique identifier that Amazon Lex assigned to the bot.",
            "--bot-version ": "\nThe version of the bot to list exports for.",
            "--sort-by ": "\nDetermines the field that the list of exports is sorted by. You can sort by the LastUpdatedDateTime field in ascending or descending order.\nattribute -> (string)\n\nThe export field to use for sorting.\norder -> (string)\n\nThe order to sort the list.\n",
            "--filters ": "\nProvides the specification of a filter used to limit the exports in the response to only those that match the filter specification. You can only specify one filter and one string to filter on.\n(structure)\n\nFilters the response form the ListExports operation\nname -> (string)\n\nThe name of the field to use for filtering.\nvalues -> (list)\n\nThe values to use to filter the response. The values must be Bot , BotLocale , or CustomVocabulary .\n(string)\n\noperator -> (string)\n\nThe operator to use for the filter. Specify EQ when the ListExports operation should return only resource types that equal the specified value. Specify CO when the ListExports operation should return resource types that contain the specified value.\n\n",
            "--max-results ": "\nThe maximum number of exports to return in each page of results. If there are fewer results than the max page size, only the actual number of results are returned.",
            "--next-token ": "\nIf the response from the ListExports operation contains more results that specified in the maxResults parameter, a token is returned in the response.\nUse the returned token in the nextToken parameter of a ListExports request to return the next page of results. For a complete set of results, call the ListExports operation until the nextToken returned in the response is null.\n",
            "--locale-id ": "\nSpecifies the resources that should be exported. If you don't specify a resource type in the filters parameter, both bot locales and custom vocabularies are exported."
        }
    }
]

In [72]:
output = get_gorilla_response(query, functions)
print(output)

aws.lexv2-models.list-exports(bot-id="my-bot-id", bot-version="v2")
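Because of the hyphens, this output is not valid Python; its names follow the AWS CLI style used in the function description (for example `--bot-id`). One possible way to use it, shown as a sketch only, is to translate the string into a CLI argument list. The `to_cli_args` helper and the mapping from dotted name to CLI subcommands are assumptions made here, and the command is printed rather than executed.

```python
import re
import shlex

def to_cli_args(call_text: str) -> list:
    """Turn 'a.b.c(x="1", y="2")' into ['a', 'b', 'c', '--x', '1', '--y', '2']."""
    match = re.fullmatch(r'([\w.\-]+)\((.*)\)', call_text.strip())
    if match is None:
        raise ValueError(f"unrecognized call format: {call_text!r}")
    api_name, arg_text = match.groups()
    args = api_name.split(".")
    # Accept only name="value" pairs with double-quoted values
    for name, value in re.findall(r'([\w\-]+)="([^"]*)"', arg_text):
        args += [f"--{name}", value]
    return args

args = to_cli_args('aws.lexv2-models.list-exports(bot-id="my-bot-id", bot-version="v2")')
command = shlex.join(args)  # quote each argument for safe display in a shell
print(command)
```

Passing the list form (`args`) to something like `subprocess.run` avoids shell quoting issues, but review any generated command before running it.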


# More on Function Calling from OpenAI

Function calling lets you get structured data back from the model more reliably.

## Use Cases

- **Chatbots with API calls**: Create chatbots that answer questions by calling external APIs (similar to ChatGPT Plugins)
  - e.g. `send_email(to: string, body: string)`
  - e.g. `get_current_weather(location: string, unit: 'celsius' | 'fahrenheit')`
  
- **Natural Language to API Conversion**: Convert natural language into API calls
  - e.g. Convert "Who are my top customers?" to `get_customers(min_revenue: int, created_before: string, limit: int)` and call your internal API
  
- **Data Extraction**: Extract structured data from text
  - e.g. `extract_data(name: string, birthday: string)`
  - e.g. `sql_query(query: string)`

## Workflow

The basic sequence of steps for function calling is as follows:

1. **User Query**: Call the model with the user query and a set of functions defined in the `functions` parameter.
2. **Function Invocation**: The model can choose to call a function; if so, the content will be a stringified JSON object adhering to your custom schema (note: the model may generate invalid JSON or hallucinate parameters).
3. **Parse and Execute**: Parse the string into JSON in your code, and call your function with the provided arguments if they exist.
4. **Response**: Call the model again by appending the function response as a new message, and let the model summarize the results back to the user.
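Step 3 deserves some care, since the model may return invalid JSON or arguments that your function does not accept. The sketch below shows one defensive way to parse and execute; `send_email` is a hypothetical local function and `safe_invoke` is a helper name introduced here, not part of any API.

```python
import inspect
import json

def send_email(to: str, body: str) -> str:
    """Hypothetical local function standing in for a real backend call."""
    return f"email to {to}: {body}"

def safe_invoke(func, raw_arguments: str) -> dict:
    """Parse model-produced JSON arguments and call func with the known ones only."""
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as err:
        return {"error": f"invalid JSON from model: {err.msg}"}
    if not isinstance(arguments, dict):
        return {"error": "arguments must be a JSON object"}
    accepted = inspect.signature(func).parameters
    known = {k: v for k, v in arguments.items() if k in accepted}
    dropped = sorted(set(arguments) - set(known))  # hallucinated parameters
    try:
        return {"result": func(**known), "dropped": dropped}
    except TypeError as err:  # e.g. a required argument is missing
        return {"error": str(err)}

ok = safe_invoke(send_email, '{"to": "a@example.com", "body": "hi", "priority": 1}')
bad = safe_invoke(send_email, '{"to": "a@example.com", ')
print(ok)
print(bad)
```

Returning the error as data rather than raising makes it straightforward to append the outcome as a new message in step 4 and let the model retry or explain the failure.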