# *Outlines Notebook - WIP*

Built by [Trelis](https://trelis.com).

Built with help from: https://outlines-dev.github.io/outlines/quickstart

---
## Getting Set Up
### Colab Setup
- You can run training on a free Google Colab Notebook for 7B models if you use quantization.
- Save a copy of this notebook: go to File -> Save a copy in Drive (optional, but needed if you want to make changes).
- Go to the menu -> Runtime -> Change Runtime Type and select GPU (T4). (Make sure to comment out flash attention when loading the model if you are using a T4, as Flash Attention is only supported on newer GPUs.)
- Then go to Runtime -> Run all.
- It takes about 2-5 mins* for the installation (which all happens in the cloud in this notebook).
- Once all cells have run, you'll find the chat interface at the bottom.
- *Optionally, uncomment the code below to mount Google Drive. This saves the downloaded model to your Google Drive, bringing the total start time down to about 3 mins.

### Setup on an Ampere or newer GPU (A40, A6000, A100, H100) with CUDA 12.1 and PyTorch 2.2.1 - RECOMMENDED.
Ampere and newer GPUs support Flash Attention, which provides a speed-up. Otherwise (for example on a T4), you need to use fp16 instead of bf16.
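
The choice between bf16 with Flash Attention and fp16 without it can be derived from the GPU's CUDA compute capability. The following is a minimal sketch, assuming the usual mapping (a T4 reports 7.5, Ampere and newer report 8.0 or higher). The helper name `pick_precision` and the keys it returns are illustrative and not part of any library.

```python
def pick_precision(compute_capability: tuple[int, int]) -> dict:
    """Suggest dtype and attention settings from a CUDA compute capability.

    Assumption: bf16 and Flash Attention 2 need compute capability 8.0+
    (Ampere or newer); older cards such as the T4 (7.5) fall back to fp16.
    """
    major, _minor = compute_capability
    if major >= 8:
        return {"dtype": "bfloat16", "attn_implementation": "flash_attention_2"}
    return {"dtype": "float16", "attn_implementation": "eager"}


# On a real machine the capability would come from
# torch.cuda.get_device_capability(); it is hard-coded here so the sketch
# runs without a GPU.
print(pick_precision((7, 5)))  # T4
print(pick_precision((8, 6)))  # A6000 / A40
```

The returned values could then be passed on when loading the model, for example as the dtype and attention implementation arguments.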

For the best reproducibility, run this script on an A6000 using a one-click template from Runpod ([affiliate link for sign up here](https://runpod.io/?ref=jmfkcdio), supports Trelis' YouTube channel) or VastAI ([affiliate link for sign up here](https://cloud.vast.ai/?ref_id=98762), supports Trelis' YouTube channel):
- Runpod one-click template [here](https://runpod.io/gsc?template=ifyqsvjlzj&ref=jmfkcdio) - easier setup.
- Vast.ai one-click template [here](https://cloud.vast.ai/?ref_id=98762&creator_id=98762&name=Fine-tuning%20Notebook%20by%20Trelis%20-%20Cuda%2012.1) - offers smaller GPUs (which are cheaper to run).

In [1]:
!pip install outlines -q -U
!pip install transformers -q -U
!pip install hf_transfer -q -U
!pip install datasets -q -U
!pip install accelerate -q -U


In [1]:
%env HF_HUB_ENABLE_HF_TRANSFER=True

env: HF_HUB_ENABLE_HF_TRANSFER=True
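
Outside a notebook the `%env` magic is not available. A minimal plain-Python equivalent is sketched below. It assumes the variable has to be set before `huggingface_hub` (or `transformers`) is first imported, since the flag is read when the library loads.

```python
import os

# Set the flag before huggingface_hub / transformers are imported,
# because it is read when the library is first loaded (assumption).
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

print(os.environ["HF_HUB_ENABLE_HF_TRANSFER"])
```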


In [1]:
import outlines
from transformers import AutoTokenizer

# model_slug = "microsoft/phi-2"
model_slug = "openchat/openchat_3.5"
tokenizer = AutoTokenizer.from_pretrained(model_slug)

# Load the model through outlines so that generation can be constrained
model = outlines.models.transformers(model_slug, device="cuda")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.




## Forced Function Calling

In [27]:
from pydantic import BaseModel, Field, constr
from typing import Optional
from enum import Enum

# Define an Enum for units
class Units(str, Enum):
    metric = "metric"
    imperial = "imperial"

class WeatherQuery(BaseModel):
    function_call: constr(max_length=20) = Field(default="get_weather", description="The function to call.")
    city: constr(max_length=50) = Field(..., description="The city for which to get the weather.")
    units: Optional[Units] = Field(default=Units.metric, description="The units for temperature (metric or imperial).")

def serialize_function_for_prompt(function):
    """
    Serializes a function's signature into a descriptive string for use as a pre-prompt.
    
    Args:
        function (Callable): The function to serialize.
    
    Returns:
        str: The serialized function description.
    """
    from inspect import signature, Parameter
    
    params = signature(function).parameters.values()
    param_descriptions = []
    
    for param in params:
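        # Fall back to the annotation's string form when it has no __name__ (some typing constructs)
        param_type = getattr(param.annotation, "__name__", str(param.annotation))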
        default = param.default
        if default is Parameter.empty:
            param_desc = f"{param.name}: {param_type} (required)"
        else:
            default_value = f"'{default}'" if isinstance(default, str) else default
            param_desc = f"{param.name}: {param_type} (optional, default: {default_value})"
        
        param_descriptions.append(param_desc)
    
    func_desc = f"Function: {function.__name__}\nParameters:\n- " + "\n- ".join(param_descriptions)
    return func_desc

# Define a stub get_weather function for the model to call
def get_weather(city: str, units: str = "metric") -> str:
    weather = "Partly cloudy" if city.strip().lower() == "dublin" else "Sunny"
    units_suffix = "C" if units == "metric" else "F"
    return f"{weather}, 15 {units_suffix}"

# Generate the function description
func_description = serialize_function_for_prompt(get_weather)
# print(func_description)

generator = outlines.generate.json(model, WeatherQuery)

# Updating the preprompt with instructions for the response format
preprompt = "You have access to the following functions, if required. Please respond in the format of a function call, specifying the function name and required parameters in a structured JSON-like syntax:"
prompt = preprompt + '\n\n' + func_description + '\n\n' + "Generate a function call to check the weather in Dublin."

# Format the prompt for chat-based generation
messages = [
    {"role": "user", "content": prompt}
]
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Display the formatted prompt
print('---')
print(formatted_prompt)
print('---')

# Generate a response constrained to the WeatherQuery JSON schema
result = generator(formatted_prompt)

# Display the generated response
print(result)
print('---')

# Simulate calling the `get_weather` function based on the model's output
if result.function_call == "get_weather":
    # `units` is Optional, so fall back to metric if the model emitted null
    units = result.units.value if result.units is not None else Units.metric.value
    weather_description = get_weather(result.city, units)
    print(f"The weather in {result.city} is: {weather_description}")
else:
    print("No valid function call generated.")

---
<s>GPT4 Correct User: You have access to the following functions, if required. Please respond in the format of a function call, specifying the function name and required parameters in a structured JSON-like syntax:

Function: get_weather
Parameters:
- city: str (required)
- units: str (optional, default: 'metric')

Generate a function call to check the weather in Dublin.<|end_of_turn|>GPT4 Correct Assistant:
---
function_call='get_weather' city='Dublin' units=<Units.metric: 'metric'>
---
The weather in Dublin is: Partly cloudy, 15 C


## Function Calling Only If Relevant

In [11]:
import outlines
from transformers import AutoTokenizer

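# Define a regex pattern with two alternatives:
#   1. a call: `function_call:<name>,` followed by free-form argument text
#   2. plain text: up to 500 characters, none of them an underscore
#      (so the plain-text branch cannot spell out `function_call`)
regex_pattern = r'(function_call:\w+,.*?[^{}]{1,500}|[^_]{1,500})'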

# Use the outlines.generate.regex function with the defined pattern
generator = outlines.generate.regex(model, regex_pattern)

# Construct your preprompt with a generic function call description
preprompt = """
In this system, you can call functions to perform specific tasks. To call a function, use the format:
`function_call:function_name,arg1:value1,arg2:value2,...`
For example: `function_call:get_weather,city:Dublin,units:metric`

If no function call is needed, simply provide a plain text response without any special formatting.
"""

# Your actual prompt asking to get the weather in Dublin
user_prompt = "Get the weather in Dublin."

# Combine the preprompt with the user's prompt
prompt = f"{preprompt}\n{user_prompt}"

# Format the prompt for chat-based generation
messages = [
    {"role": "user", "content": prompt}
]
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Display the formatted prompt
print('---')
print(formatted_prompt)
print('---')

# Generate the response according to the regex pattern
result = generator(formatted_prompt)

# Display the generated response
print(result)
print('---')

KeyboardInterrupt: 
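
Once a response in the `function_call:function_name,arg1:value1,...` format is available, it still has to be split into a function name and arguments. The following is a minimal sketch using only the standard library. The helper `parse_response` and the pattern `CALL_RE` are hypothetical names, and the parser assumes argument values contain no commas.

```python
import re
from typing import Optional

# Matches the call format described in the preprompt:
# function_call:<name>,<arg>:<value>,<arg>:<value>,...
CALL_RE = re.compile(r"function_call:(\w+)((?:,\w+:[^,]+)*)")


def parse_response(text: str) -> tuple[Optional[str], dict]:
    """Return (function_name, args) for a call, or (None, {}) for plain text."""
    match = CALL_RE.fullmatch(text.strip())
    if match is None:
        return None, {}
    name, raw_args = match.group(1), match.group(2)
    args = {}
    # raw_args starts with a comma, so drop the empty first piece
    for pair in filter(None, raw_args.split(",")):
        key, value = pair.split(":", 1)
        args[key] = value.strip()
    return name, args


print(parse_response("function_call:get_weather,city:Dublin,units:metric"))
print(parse_response("Hello! How can I help?"))
```

A `None` function name signals that the text should be shown to the user as-is.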

## Forced Function Calling with Regex
The idea here (not built yet) is to have a "direct response" function that simply passes its input argument through as the response. This way, the model always calls a function and only has to choose between `get_weather` and the direct response function.
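
One way the "direct response" idea could look on the dispatch side is sketched below. This is an illustration only, not the notebook's implementation: `direct_response`, `FUNCTIONS`, and `dispatch` are hypothetical names, and the call is assumed to arrive already decoded as a dictionary with `function_call` and `args` keys.

```python
from typing import Callable


def get_weather(city: str, units: str = "metric") -> str:
    """Stub weather lookup, mirroring the one used in this notebook."""
    weather = "Partly cloudy" if city.strip().lower() == "dublin" else "Sunny"
    return f"{weather}, 15 {'C' if units == 'metric' else 'F'}"


def direct_response(text: str) -> str:
    """Hypothetical pass-through function: returns its argument unchanged."""
    return text


# Registry of everything the model is allowed to call.
FUNCTIONS: dict[str, Callable[..., str]] = {
    "get_weather": get_weather,
    "direct_response": direct_response,
}


def dispatch(call: dict) -> str:
    """Run a decoded call of the form {"function_call": name, "args": {...}}."""
    func = FUNCTIONS.get(call["function_call"])
    if func is None:
        raise ValueError(f"Unknown function: {call['function_call']}")
    return func(**call["args"])


print(dispatch({"function_call": "get_weather", "args": {"city": "Dublin"}}))
print(dispatch({"function_call": "direct_response", "args": {"text": "Hi there!"}}))
```

Because every output is a function call, the caller never has to guess whether the text is a call or a plain answer.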

In [2]:
import re
from typing import Optional
from enum import Enum
from outlines import generate

# Define an Enum for units
class Units(str, Enum):
    metric = "metric"
    imperial = "imperial"

def serialize_function_for_prompt(function):
    """
    Serializes a function's signature into a descriptive string for use as a pre-prompt.
    
    Args:
        function (Callable): The function to serialize.
    
    Returns:
        str: The serialized function description.
    """
    from inspect import signature, Parameter
    
    params = signature(function).parameters.values()
    param_descriptions = []
    
    for param in params:
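        # Fall back to the annotation's string form when it has no __name__ (some typing constructs)
        param_type = getattr(param.annotation, "__name__", str(param.annotation))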
        default = param.default
        if default is Parameter.empty:
            param_desc = f"{param.name}: {param_type} (required)"
        else:
            default_value = f"'{default}'" if isinstance(default, str) else default
            param_desc = f"{param.name}: {param_type} (optional, default: {default_value})"
        
        param_descriptions.append(param_desc)
    
    func_desc = f"Function: {function.__name__}\nParameters:\n- " + "\n- ".join(param_descriptions)
    return func_desc

# Define the get_weather function as before
def get_weather(city: str, units: str = "metric") -> str:
    weather = "Partly cloudy" if city.strip().lower() == "dublin" else "Sunny"
    units_suffix = "C" if units == "metric" else "F"
    return f"{weather}, 15 {units_suffix}"

# Generate the function description
func_description = serialize_function_for_prompt(get_weather)

# Updating the preprompt with instructions for the response format
preprompt = "You have access to the following functions, if required. Please respond in the format of a function call, specifying the function name and required parameters in a structured JSON-like syntax:"
prompt = preprompt + '\n\n' + func_description + '\n\n' + "Generate a function call to check the weather in Dublin."

# Format the prompt for chat-based generation
messages = [
    {"role": "user", "content": prompt}
]
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Display the formatted prompt
print('---')
print(formatted_prompt)
print('---')

# Regex for a JSON function call of the form {"function_call": "<name>", "args": {...}}
# (anchored at the opening brace and closed with the outer brace)
function_call_pattern = r'\{\s*"function_call"\s*:\s*"([^"]+)"\s*,\s*"args"\s*:\s*\{([^}]*)\}\s*\}'

# Build a generator constrained to that pattern
generator = generate.regex(model, function_call_pattern)

# Generate the response according to the regex pattern
result = generator(formatted_prompt)

# Display the generated response
print(result)
print('---')

---
<s>GPT4 Correct User: You have access to the following functions, if required. Please respond in the format of a function call, specifying the function name and required parameters in a structured JSON-like syntax:

Function: get_weather
Parameters:
- city: str (required)
- units: str (optional, default: 'metric')

Generate a function call to check the weather in Dublin.<|end_of_turn|>GPT4 Correct Assistant:
---


KeyboardInterrupt: 
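
Since constrained generation can run for a long time, it may help to sanity-check a pattern offline first. The sketch below uses Python's `re` module on hand-written samples. The pattern is an assumed variant of the one above (anchored at the opening brace, with a closing outer brace). Outlines compiles patterns with its own machinery, so this only checks the pattern's intent, not generation behaviour.

```python
import re

# Assumed variant of the notebook's pattern: anchored at the opening brace
# and with a closing brace for the outer object.
pattern = r'\{\s*"function_call"\s*:\s*"([^"]+)"\s*,\s*"args"\s*:\s*\{([^}]*)\}\s*\}'

samples = [
    '{"function_call": "get_weather", "args": {"city": "Dublin"}}',
    '{"function_call": "get_weather"}',  # missing "args"
    'Sure! {"function_call": "get_weather", "args": {}}',  # leading chatter
]

for sample in samples:
    match = re.fullmatch(pattern, sample)
    # Print whether the sample is accepted and, if so, the captured groups
    print(bool(match), match.groups() if match else None)
```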