# Tutorial 8: Async StrictJSON

- This is an async version of Tutorial 0: StrictJSON using fully async functions and classes

- We use `AsyncFunction` and `strict_json_async`
    - These are the async equivalents of `Function` and `strict_json`
    
- Using async lets independent LLM calls run concurrently instead of one after another, which can make a workflow much faster
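As a rough illustration of the speed-up, here is a minimal sketch that uses only the standard library. `fake_llm_call` is a hypothetical stand-in for any awaitable request (it is not part of TaskGen); it sleeps to mimic network latency so the example runs without an API key.

```python
import asyncio
import time

async def fake_llm_call(prompt):
    """Hypothetical stand-in for an awaitable LLM request; sleeps to mimic latency."""
    await asyncio.sleep(0.2)
    return {'Echo': prompt}

async def main():
    prompts = ['first', 'second', 'third']
    start = time.perf_counter()
    # gather starts all three coroutines together and returns results in input order
    results = await asyncio.gather(*(fake_llm_call(p) for p in prompts))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results)
print(f'Elapsed: {elapsed:.2f}s')  # close to one call's latency, not three
```

Inside a notebook an event loop is already running, so `asyncio.run` raises an error there; use `results, elapsed = await main()` instead.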

# Setup Guide

## Step 1: Install TaskGen

In [1]:
# !pip install taskgen-ai

## Step 2: Set up OpenAI API Key

In [2]:
#Python way to set up OpenAI API Keys
import os
os.environ['OPENAI_API_KEY'] = '<YOUR_API_KEY_HERE>'

## Step 3: Import required functions

In [3]:
from taskgen import *

# 1. Basic Generation

- **system_prompt**: Write in whatever you want GPT to become. "You are a \<purpose in life\>"
- **user_prompt**: The user input. Later, when we use it as a function, this is the function input
- **output_format**: JSON of output variables in a dictionary, with the key as the output key, and the value as the output description
    - The output keys will be preserved exactly, while GPT will generate content to match the description of the value as best as possible

#### Example Usage
```python
res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'})
                                    
print(res)
```

#### Example Output
```{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}```

In [4]:
### Async
res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'})
print(res)

{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}


## Easy to split into corresponding elements

In [5]:
res['Sentiment'], res['Adjectives'], res['Words']

('Positive', ['beautiful', 'sunny'], 7)

# 2. Type forcing output variables
- Generally, ```strict_json_async``` will infer the data type automatically for you for the output fields
- However, if you would like very specific data types, you can do data forcing using ```type: <data_type>``` at the last part of the output field description
- ```<data_type>``` must be of the form `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Array[]`, `Enum[]`, `bool` for type checking to work
- `code` removes all unicode escape characters that might interfere with normal code running
- `Enum` and `List` are not case sensitive, so `enum` and `list` work just as well
- For `Enum[list_of_category_names]`, it is best to give an "Other" category in case the LLM fails to classify correctly with the other options.
- If `list` or `List[]` is not formatted correctly in LLM's output, we will correct it by asking the LLM to list out the elements line by line
- For `dict`,  we can further check whether keys are present using `Dict[list_of_key_names]`
- Other types will first be forced by rule-based conversion; any remaining errors will be fed into the LLM's error feedback mechanism
- If `<data_type>` is not the specified data types, it can still be useful to shape the output for the LLM. However, no type checking will be done.
- Note: GPT understands the word `Array` better than `List` since `Array` is the official JSON type name, so in the backend any type containing the word `List` will be converted to `Array`. It is also recommended that you use `Array` instead of `List` in your `output_format` free-text description
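To make the idea of a `type:` suffix followed by rule-based conversion concrete, here is a minimal sketch. It is not the library's implementation: `split_type`, `force_type` and the `CONVERTERS` table are hypothetical names, and only a few simple types are covered.

```python
import re

# Assumed, illustrative converters; the library's own rules may differ
CONVERTERS = {
    'int': int,
    'float': float,
    'str': str,
    'bool': lambda v: v if isinstance(v, bool) else str(v).strip().lower() == 'true',
}

def split_type(description):
    """Split 'Number of words, type: int' into ('Number of words', 'int')."""
    match = re.search(r',?\s*type:\s*(.+)$', description)
    if match is None:
        return description, None
    return description[:match.start()], match.group(1).strip()

def force_type(value, description):
    """Convert value when the description names a known type, else return it unchanged."""
    _, data_type = split_type(description)
    converter = CONVERTERS.get(data_type)
    return value if converter is None else converter(value)

print(split_type('Number of words, type: int'))
print(force_type('7', 'Number of words, type: int'))
print(force_type('true', 'Whether sentence is in English, type: bool'))
```

A description without a recognised `type:` suffix passes through untouched, which mirrors the point above that unknown types shape the prompt but are not checked.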

### LLM-based checks
- If you would like the LLM to ensure that the type is being met, use `type: ensure <requirement>`
- This will run an LLM to check whether the requirement is met. If it is not, the LLM will generate what needs to be done to meet the requirement, and this feedback will be fed into the error-correcting loop of `strict_json_async`

#### Example Usage 1
```python
res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment, type: Enum["Pos", "Neg", "Other"]',
                                    'Adjectives': 'Array of adjectives, type: List[str]',
                                    'Words': 'Number of words, type: int',
                                    'In English': 'Whether sentence is in English, type: bool'})
                                    
print(res)
```

#### Example Output 1
```{'Sentiment': 'Pos', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7, 'In English': True}```



In [6]:
# Async
res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment, type: Enum["Pos", "Neg", "Other"]',
                                    'Adjectives': 'Array of Adjectives, type: List[str]',
                                    'Words': 'Number of words, type: int',
                                    'In English': 'Whether sentence is in English, type: bool'})

res

{'Sentiment': 'Pos',
 'Adjectives': ['beautiful', 'sunny'],
 'Words': 7,
 'In English': True}

# 4. Functions
- Enhances ```strict_json_async()``` with a function-like interface for repeated use of modular LLM-based functions (or wraps external functions)
- Use angle brackets <> to enclose input variable names. First input variable name to appear in `fn_description` will be first input variable and second to appear will be second input variable. For example, `fn_description = 'Adds up two numbers, <var1> and <var2>'` will result in a function with first input variable `var1` and second input variable `var2`
- (Optional) If you would like greater specificity in your function's input, you can describe the variable after the : in the input variable name, e.g. `<var1: an integer from 10 to 30>`. Here, `var1` is the input variable and `an integer from 10 to 30` is the description.
- (Optional) If your description of the variable is one of `int`, `float`, `str`, `dict`, `list`, `array`, `Dict[]`, `List[]`, `Array[]`, `Enum[]`, `bool`, we will enforce type checking when generating the function inputs in the `get_next_subtask` method of the `Agent` class. Example: `<var1: int>`. Refer to Section 2. Type forcing output variables for details.
- Inputs (primary):
    - **fn_description**: String. Function description to describe process of transforming input variables to output variables. Variables must be enclosed in <> and listed in order of appearance in function input.
        - New feature: If `external_fn` is provided and no `fn_description` is provided, then we will automatically parse out the fn_description based on docstring of `external_fn`. Only requirement is that the docstring must contain the names of all compulsory input variables
    - **output_format**: Dict. Dictionary containing output variables names and description for each variable.
    
- Inputs (optional):
    - **examples** - Dict or List[Dict]. Examples in Dictionary form with the input and output variables (list if more than one)
    - **external_fn** - Python Function. If defined, instead of using LLM to process the function, we will run the external function. 
        If there are multiple outputs of this function, we will map it to the keys of `output_format` in a one-to-one fashion
    - **fn_name** - String. If provided, this will be the name of the function. Otherwise, if `external_fn` is provided, it will be the name of `external_fn`. Otherwise, we will use LLM to generate a function name from the `fn_description`
    - **kwargs** - Dict. Additional arguments you would like to pass on to the `strict_json_async` function
        
- Outputs:
    JSON of output variables in a dictionary (similar to ```strict_json_async```)
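
The angle-bracket convention described above can be sketched with a small regular expression. This is an illustration only, not the library's parser; `parse_inputs` is a hypothetical helper name.

```python
import re

def parse_inputs(fn_description):
    """Hypothetical helper: return [(name, description_or_None), ...] in order of first appearance."""
    seen = []
    for raw in re.findall(r'<([^<>]+)>', fn_description):
        # Text before the first ':' is the variable name, the rest is its description
        name, _, desc = raw.partition(':')
        name = name.strip()
        if name not in [n for n, _ in seen]:
            seen.append((name, desc.strip() or None))
    return seen

print(parse_inputs('Adds up two numbers, <var1: an integer from 10 to 30> and <var2>'))
# [('var1', 'an integer from 10 to 30'), ('var2', None)]
```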
    
#### Example Usage 1 (Description only)
```python
# basic configuration with variable names (in order of appearance in fn_description)
fn = AsyncFunction(fn_description = 'Output a sentence with <obj> and <entity> in the style of <emotion>', 
                     output_format = {'output': 'sentence'})
# If fn_name is missing from definition and you want llm to autogenerate it then call async_init on that function
await fn.async_init()

# Use the function
await fn('ball', 'dog', 'happy') #obj, entity, emotion
```

#### Example Output 1
```{'output': 'The happy dog chased the ball.'}```

In [7]:
# If fn_name is missing from definition and you want llm to autogenerate it then call async_init on that function

fn =  AsyncFunction(fn_description = 'Output a sentence with <obj> and <entity> in the style of <emotion>', 
                     output_format = {'output': 'sentence'})
await fn.async_init()
res= await fn('ball', 'dog', 'happy') #obj, entity, emotion

res

{'output': 'The dog happily played with the ball.'}

## External Function Examples

In [8]:
def consecutive_sum(x):
    return x, x+1, x+2

# Async
fn_async = AsyncFunction(fn_description = 'Given input <x: int>, output x, x+1, x+2', 
            output_format = {'output1': 'x', 'output2': 'x+1', 'output3': 'x+2'},
            external_fn = consecutive_sum)

await fn_async.async_init()

# Use the function
res =await fn_async(4) #x
res

{'output1': 4, 'output2': 5, 'output3': 6}

In [9]:
# Async External function

import asyncio

# Async external function
async def consecutive_sum_async(x):
    await asyncio.sleep(1)  # simulate some async operation like I/O
    return x, x+1, x+2

# Async
# an external function with multiple output variables
fn_async = AsyncFunction(fn_description = 'Given input <x: int>, output x, x+1, x+2', 
            output_format = {'output1': 'x', 'output2': 'x+1', 'output3': 'x+2'},
            external_fn = consecutive_sum_async)

# Use the function
res =await fn_async(4) #x
res

{'output1': 4, 'output2': 5, 'output3': 6}
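
The one-to-one mapping from an external function's return values to the keys of `output_format` can be pictured as a simple `zip`. The sketch below is an assumption about the general idea, not the library's code; `map_outputs` is a hypothetical helper.

```python
def map_outputs(result, output_format):
    """Hypothetical helper: pair returned values with output_format keys in order."""
    # Treat a single non-tuple return value as a one-element tuple
    values = result if isinstance(result, tuple) else (result,)
    if len(values) != len(output_format):
        raise ValueError('number of outputs does not match output_format')
    return dict(zip(output_format, values))

def consecutive_sum(x):
    return x, x+1, x+2

output_format = {'output1': 'x', 'output2': 'x+1', 'output3': 'x+2'}
print(map_outputs(consecutive_sum(4), output_format))
# {'output1': 4, 'output2': 5, 'output3': 6}
```

Because the pairing is positional, the order of keys in `output_format` should match the order of the values the function returns.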

## Example inferring of fn_description from docstring and type hints

In [10]:
# Docstring must provide all input variables
# We will ignore shared_variables, *args and **kwargs
def add_number_to_list(num1: int, num_list: list, *args, **kwargs):
    '''Adds num1 to num_list'''
    num_list.append(num1)
    return num_list

# Async
fn_async = AsyncFunction(external_fn = add_number_to_list, 
    output_format = {'num_list': 'Array of numbers'})


print(str(fn_async))

# Use the function
res_async = await fn_async(3, [2, 4, 5])
res_async

Description: Adds <num1: int> to <num_list: list>
Input: ['num1', 'num_list']
Output: {'num_list': 'Array of numbers'}



{'num_list': [2, 4, 5, 3]}
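
One way to picture how a description like the one printed above could be derived from a docstring and type hints is sketched below with the standard `inspect` module. This is an assumption for illustration, not TaskGen's actual logic; `describe_from_docstring` is a hypothetical helper.

```python
import inspect
import re

def describe_from_docstring(fn):
    """Hypothetical helper: wrap each parameter named in the docstring as <name: type>."""
    description = inspect.getdoc(fn) or ''
    for name, param in inspect.signature(fn).parameters.items():
        # Skip *args and **kwargs, which are not treated as input variables
        if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
            continue
        if param.annotation is inspect.Parameter.empty:
            tag = f'<{name}>'
        else:
            type_name = getattr(param.annotation, '__name__', str(param.annotation))
            tag = f'<{name}: {type_name}>'
        # Replace only the first whole-word mention of the parameter
        description = re.sub(rf'\b{re.escape(name)}\b', tag, description, count=1)
    return description

def add_number_to_list(num1: int, num_list: list, *args, **kwargs):
    '''Adds num1 to num_list'''
    num_list.append(num1)
    return num_list

print(describe_from_docstring(add_number_to_list))
# Adds <num1: int> to <num_list: list>
```

The sketch also shows why the docstring needs to mention every compulsory input by name: a parameter that never appears in the text has nothing to be wrapped.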

# 5. Integrating with your own LLM
- StrictJSON has native support for OpenAI LLMs (you can put the LLM API parameters inside `strict_json_async` or `AsyncFunction` directly)
- If your LLM is not from OpenAI, it is really easy to integrate with your own Custom LLM
- Simply pass your custom LLM function inside the `llm` parameter of `strict_json_async` or `AsyncFunction`
    - Inputs:
        - system_prompt: String. Write in whatever you want the LLM to become. e.g. "You are a \<purpose in life\>"
        - user_prompt: String. The user input. Later, when we use it as a function, this is the function input
    - Output:
        - res: String. The response of the LLM call

#### Example Custom LLM
```python

async def custom_llm_async(system_prompt: str, user_prompt: str):
    ''' Here, we use OpenAI for illustration, you can change it to your own LLM '''
    # ensure your LLM imports are all within this function
    from openai import AsyncOpenAI
    
    # define your own LLM here
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model='gpt-3.5-turbo',
        temperature = 0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content
```

#### Example Usage with `strict_json_async`
```python

res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'},
                                     llm = custom_llm_async) # set this to your own LLM                                     

print(res)
```

#### Example Output
```{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}```

In [11]:
from openai import AsyncOpenAI

async def custom_llm_async(system_prompt: str, user_prompt: str):
    ''' Here, we use OpenAI for illustration, you can change it to your own LLM '''
    # ensure your LLM imports are all within this function
    
    # define your own LLM here
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model='gpt-3.5-turbo',
        temperature = 0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content

### Executing Custom LLMs

In [12]:
# Async
llm = custom_llm_async

res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'},
                                     llm = llm) # set this to your own LLM
print(res)

{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}
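
The contract for a custom LLM is just an async function taking `system_prompt` and `user_prompt` and returning a string. A minimal offline stub with that shape might look like the sketch below; `stub_llm_async` is a hypothetical name, and the JSON it returns is only an echo for inspection, not a format the library is known to expect.

```python
import asyncio
import json

async def stub_llm_async(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical stub: same (system_prompt, user_prompt) -> str contract, no network call."""
    await asyncio.sleep(0)  # yield control once, as a real awaitable request would
    return json.dumps({'system': system_prompt, 'user': user_prompt})

reply = asyncio.run(stub_llm_async('You are a classifier', 'It is a beautiful and sunny day'))
print(reply)
```

In a notebook, replace the `asyncio.run(...)` call with `await stub_llm_async(...)`, since an event loop is already running there.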
