# Understanding Session 2: Customization and Rate-Limited Concurrency

The `Session` object can be fully customized, including the model, model parameters and rate limits, to accommodate various use cases.

The two most useful options to customize are:

- `service`: the rate-limited API service
- `llmconfig`: the default model parameters for API calls in the session

By default, a session is given an `OpenAI` service with the following default config settings:

```python
{
    "model": "gpt-4-1106-preview",
    "frequency_penalty": 0,
    "max_tokens": None,
    "n": 1,
    "presence_penalty": 0, 
    "response_format": {"type": "text"}, 
    "seed": None, 
    "stop": None,
    "stream": False,
    "temperature": 0.7, 
    "top_p": 1, 
    "tools": None,
    "tool_choice": "none", 
    "user": None
}
```
    

To change the default behavior of a session, you can:

- create a new API `service`, or
- modify the `llmconfig` settings

## Service

We currently support `OpenAI` and `OpenRouter`; more service options are under construction.

In [1]:
```python
# quick start
import lionagi as li
service_openai = li.Services.OpenAI()

# or
service_openrouter = li.Services.OpenRouter()
```

To configure the service, you can pass in:

- `api_key`: If your API key is already stored in your environment, no explicit input is required. Otherwise, you can pass your API key here.
- `key_scheme`: By default, we assume that the environment variable storing your API key is named `"OPENAI_API_KEY"` (or `"OPENROUTER_API_KEY"` for `OpenRouter`). If the variable name is different, you can specify it here.
- `token_encoding_name`: The name of the token encoding. Defaults to `"cl100k_base"`.
- `max_tokens`: The maximum number of tokens allowed per interval. Defaults to 100000.
- `max_requests`: The maximum number of requests allowed per interval. Defaults to 1000.
- `interval`: The time interval in seconds for replenishing capacities. Defaults to 60.

Currently, we only support the `chat/completions` endpoint. More endpoints will be available in the future.
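To make `max_requests`, `max_tokens` and `interval` concrete, here is a minimal sketch of an interval-based limiter. It is an illustration only and is not taken from the lionagi source: the class name `IntervalRateLimiter` and its `acquire` method are hypothetical, and the real service may replenish capacity differently.

```python
import asyncio
import time


class IntervalRateLimiter:
    """Hypothetical limiter: both capacities refill to their maximum once per interval."""

    def __init__(self, max_requests=1000, max_tokens=100_000, interval=60.0):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.interval = interval
        self._requests_left = max_requests
        self._tokens_left = max_tokens
        self._window_start = time.monotonic()

    def _replenish_if_due(self):
        now = time.monotonic()
        if now - self._window_start >= self.interval:
            self._requests_left = self.max_requests
            self._tokens_left = self.max_tokens
            self._window_start = now

    async def acquire(self, tokens):
        """Wait until one request and `tokens` tokens are available, then consume them."""
        if tokens > self.max_tokens:
            raise ValueError("request needs more tokens than one interval allows")
        while True:
            # No lock is needed: there is no await between the check and the update,
            # so no other coroutine can run in between on a single event loop.
            self._replenish_if_due()
            if self._requests_left >= 1 and self._tokens_left >= tokens:
                self._requests_left -= 1
                self._tokens_left -= tokens
                return
            remaining = self.interval - (time.monotonic() - self._window_start)
            await asyncio.sleep(max(remaining, 0))


async def demo():
    # Tiny limits so the effect is visible: 2 requests per 0.2 seconds.
    limiter = IntervalRateLimiter(max_requests=2, max_tokens=100, interval=0.2)
    start = time.monotonic()
    finish_times = []

    async def call():
        await limiter.acquire(tokens=10)
        finish_times.append(time.monotonic() - start)

    await asyncio.gather(*(call() for _ in range(5)))
    return finish_times


# In a notebook, use `finish_times = await demo()` instead of asyncio.run.
finish_times = asyncio.run(demo())
print([f"{t:0.2f}" for t in finish_times])
```

With these toy limits, five simultaneous calls are admitted two per interval, so the last call has to wait for the third window.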

## `llmconfig`

`llmconfig` can be set when creating the session or updated later.

In [2]:
```python
llmconfig_ = {...}

# pass in the llmconfig as a dict
session1 = li.Session("you are a helpful assistant", llmconfig=llmconfig_)

# or
session1.llmconfig.update(llmconfig_)
```
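How session-level settings and per-call keyword arguments combine can be pictured with plain dictionaries. The sketch below assumes that per-call values take precedence over the session's `llmconfig` without changing it. That precedence and the helper name `resolve_config` are assumptions for illustration, not confirmed library behavior.

```python
# A few of the default settings listed at the top of this page.
default_llmconfig = {
    "model": "gpt-4-1106-preview",
    "temperature": 0.7,
    "max_tokens": None,
}


def resolve_config(session_config, **call_overrides):
    """Hypothetical helper: merge per-call overrides on top of the session config."""
    return {**session_config, **call_overrides}


# Session-level change, analogous to session1.llmconfig.update(...)
session_config = dict(default_llmconfig)
session_config.update({"temperature": 0.2})

# Per-call change: only this one call sees temperature=0.5
per_call_config = resolve_config(session_config, temperature=0.5)

print(session_config["temperature"])   # session default after the update
print(per_call_config["temperature"])  # value used for the single call
```

Keys that are not overridden, such as `model`, carry through unchanged from the session config.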

## Concurrency

In [3]:
```python
# we will use numpy to generate random numbers for this part
# %pip install numpy
```

In [4]:
```python
# let us use a simple conditional calculator session as an example
# the instruction has two steps:
#   1. choose between sum and diff, based on a case number
#   2. choose between times and plus, based on the sign of the step 1 result

system = "You are asked to perform as a calculator. Return only a numeric value, i.e. int or float, no text."

instruct1 = {
    "sum the absolute values": "provided with 2 numbers, return the sum of their absolute values. i.e. |x|+|y|",
}

instruct2 = {
    "diff the absolute values": "provided with 2 numbers, return the difference of absolute values. i.e. |x|-|y|",
}

instruct3 = {
    "if previous response is positive": "times 2. i.e. *2",  # step 1 result is positive
    "else": "plus 2. i.e. +2",                               # step 1 result is zero or negative
}
```

In [5]:
```python
# create a case and context
case = 0
context = {"x": 7, "y": 3}
instruct = instruct1 if case == 0 else instruct2
```

In [6]:
```python
from timeit import default_timer as timer
start = timer()
calculator = li.Session(system)

step1 = await calculator.initiate(instruct, context=context)
step2 = await calculator.followup(instruct3, temperature=0.5)     # you can also modify parameters for each API call

print(f"step1 result: {step1}")
print(f"step2 result: {step2}")

elapsed_time = timer() - start
print(f"run clock time: {elapsed_time:0.2f} seconds")
```

```
step1 result: 10
step2 result: {"response": "20"}
run clock time: 1.70 seconds
```


In [7]:
```python
# now let us run 10 scenarios concurrently
import numpy as np

num_iterations = 10
```

In [8]:
```python
# generate random numbers
ints1 = np.random.randint(-10, 10, size=num_iterations)
ints2 = np.random.randint(0, 10, size=num_iterations)
cases = np.random.randint(0, 2, size=num_iterations)

# define a simple function that builds the context for scenario i
f = lambda i: {"x": str(ints1[i]), "y": str(ints2[i]), "case": str(cases[i])}
```

In [9]:
```python
contexts = li.lcall(range(num_iterations), f)

li.lcall(range(num_iterations), lambda i: print(contexts[i]));
```

```
{'x': '3', 'y': '4', 'case': '0'}
{'x': '9', 'y': '9', 'case': '0'}
{'x': '2', 'y': '3', 'case': '1'}
{'x': '3', 'y': '1', 'case': '0'}
{'x': '2', 'y': '6', 'case': '0'}
{'x': '-8', 'y': '4', 'case': '0'}
{'x': '9', 'y': '0', 'case': '1'}
{'x': '-8', 'y': '5', 'case': '1'}
{'x': '-10', 'y': '7', 'case': '0'}
{'x': '1', 'y': '6', 'case': '1'}
```


In [10]:
```python
async def calculator_workflow(context):

    calculator = li.Session(system)       # construct a session instance
    context = context.copy()              # copy so the caller's dict keeps its "case" key
    case = int(context.pop("case"))

    instruct = instruct1 if case == 0 else instruct2
    res1 = await calculator.initiate(instruct, context=context)    # run the steps
    res2 = await calculator.followup(instruct3, temperature=0.5)

    return (res1, res2)
```

In [11]:
```python
start = timer()
outs = await li.alcall(contexts, calculator_workflow)
elapsed_time = timer() - start
print(f"num_workload: {num_iterations}")
print(f"run clock time: {elapsed_time:0.2f} seconds")
```

```
num_workload: 10
run clock time: 3.25 seconds
```
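Ten workflows finish in about twice the time of a single one because the waiting on API responses overlaps. The sketch below reproduces that effect with the standard library only. `fake_workflow` is a hypothetical stand-in that sleeps instead of calling an API, and `asyncio.gather` is used here as a familiar way to run coroutines concurrently; it is not a claim about how `li.alcall` is implemented.

```python
import asyncio
from timeit import default_timer as timer


async def fake_workflow(context, delay=0.02):
    """Hypothetical stand-in for calculator_workflow: two awaited 'API calls'."""
    await asyncio.sleep(delay)   # stands in for the first step
    await asyncio.sleep(delay)   # stands in for the follow-up step
    return context["x"]


async def run_sequential(contexts):
    # Each workflow starts only after the previous one has finished.
    return [await fake_workflow(c) for c in contexts]


async def run_concurrent(contexts):
    # All workflows are started together; their waiting time overlaps.
    return list(await asyncio.gather(*(fake_workflow(c) for c in contexts)))


def timed(runner, contexts):
    start = timer()
    # In a notebook, await the coroutine directly instead of using asyncio.run.
    result = asyncio.run(runner(contexts))
    return result, timer() - start


demo_contexts = [{"x": i} for i in range(10)]
seq_result, seq_time = timed(run_sequential, demo_contexts)
con_result, con_time = timed(run_concurrent, demo_contexts)

print(f"sequential: {seq_time:0.2f} seconds")
print(f"concurrent: {con_time:0.2f} seconds")
```

Both runs return the same results in the same order; only the wall-clock time differs.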


In [12]:
```python
outs
```

```
[('7', '{"response": "14"}'),
 ('18', '{"response": "36"}'),
 ('1.0', '{"response": "2.0"}'),
 ('4', '{"response": "8"}'),
 ('8', '16'),
 ('12', '{"response": "24"}'),
 ('9.0', '18.0'),
 ('3.0', '6.0'),
 ('17', '{"response": "34"}'),
 ('-5', '{"response": "-3"}')]
```
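The responses arrive in mixed formats (bare numbers and `{"response": ...}` JSON strings), and the model is not guaranteed to do the arithmetic correctly. Because the instructions are plain arithmetic, the outputs can be checked locally. The sketch below copies the contexts and outputs printed above; the helper names `normalize` and `expected` are hypothetical and not part of lionagi.

```python
import json


def normalize(raw):
    """Turn '14', '9.0' or '{"response": "14"}' into a float."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        value = raw
    if isinstance(value, dict):
        value = value["response"]
    return float(value)


def expected(context):
    """Apply the two instruction steps directly in Python."""
    x, y, case = int(context["x"]), int(context["y"]), int(context["case"])
    step1 = abs(x) + abs(y) if case == 0 else abs(x) - abs(y)
    step2 = step1 * 2 if step1 > 0 else step1 + 2
    return float(step1), float(step2)


# Contexts and outputs copied from the cells above.
contexts_seen = [
    {"x": "3", "y": "4", "case": "0"}, {"x": "9", "y": "9", "case": "0"},
    {"x": "2", "y": "3", "case": "1"}, {"x": "3", "y": "1", "case": "0"},
    {"x": "2", "y": "6", "case": "0"}, {"x": "-8", "y": "4", "case": "0"},
    {"x": "9", "y": "0", "case": "1"}, {"x": "-8", "y": "5", "case": "1"},
    {"x": "-10", "y": "7", "case": "0"}, {"x": "1", "y": "6", "case": "1"},
]
outs_seen = [
    ("7", '{"response": "14"}'), ("18", '{"response": "36"}'),
    ("1.0", '{"response": "2.0"}'), ("4", '{"response": "8"}'),
    ("8", "16"), ("12", '{"response": "24"}'),
    ("9.0", "18.0"), ("3.0", "6.0"),
    ("17", '{"response": "34"}'), ("-5", '{"response": "-3"}'),
]

# Indices where the model's answers differ from the local calculation.
mismatches = [
    i for i, (ctx, out) in enumerate(zip(contexts_seen, outs_seen))
    if tuple(normalize(r) for r in out) != expected(ctx)
]
print(mismatches)  # → [2]
```

The one flagged scenario is `x=2, y=3, case=1`: the local calculation gives |2|-|3| = -1 and then -1+2 = 1, whereas the model returned 1.0 and 2.0.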

## Customized API service concurrent calls

By default, each session creates its own new default service object.

If you would like one service, and therefore one set of rate limits, to be used across sessions, you need to pass in the **same** service object when constructing each session.
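The difference between per-session services and one shared service can be shown with a toy model. Both classes below are hypothetical stand-ins written for this illustration (they make no API calls and do not mirror lionagi internals); the only point is that sessions holding the same object draw on the same budget.

```python
class BudgetedService:
    """Hypothetical stand-in for a rate-limited service with one request budget."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.requests_made = 0

    def record_request(self):
        self.requests_made += 1
        return self.requests_made <= self.max_requests  # False once over budget


class ToySession:
    """Hypothetical stand-in for a session: builds its own service unless given one."""

    def __init__(self, service=None):
        self.service = service if service is not None else BudgetedService(max_requests=3)

    def call(self):
        return self.service.record_request()


# Default: every session has its own budget of 3, so 2 calls per session all pass.
independent_sessions = [ToySession() for _ in range(2)]
independent_ok = [s.call() for s in independent_sessions for _ in range(2)]

# Shared: the same 4 calls now count against a single budget of 3.
shared_service = BudgetedService(max_requests=3)
shared_sessions = [ToySession(service=shared_service) for _ in range(2)]
shared_ok = [s.call() for s in shared_sessions for _ in range(2)]

print(independent_ok)  # → [True, True, True, True]
print(shared_ok)       # → [True, True, True, False]
```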

In [13]:
```python
# create one service and share it across all sessions, so that they draw on the same rate limit
# to change the rate limit, pass max_requests, max_tokens or interval (described above) when creating the service
service = li.Services.OpenAI()

async def calculator_workflow(context):

    calculator = li.Session(system, service=service)       # construct a session instance with the shared service
    context = context.copy()
    case = int(context.pop("case"))
    instruct = instruct1 if case == 0 else instruct2

    res1 = await calculator.initiate(instruct, context=context)    # run the steps
    res2 = await calculator.followup(instruct3)

    return (res1, res2)
```

In [14]:
```python
start = timer()

outs = await li.alcall(contexts, calculator_workflow)

elapsed_time = timer() - start
print(f"num_workload: {num_iterations}")
print(f"run clock time: {elapsed_time:0.2f} seconds")
```

```
num_workload: 10
run clock time: 4.87 seconds
```


In [15]:
```python
outs
```

```
[('7', '{"response": "14"}'),
 ('18', '{"response": "36"}'),
 ('1', '2'),
 ('4', '{"response": "8"}'),
 ('8', '16'),
 ('12', '{"response": "24"}'),
 ('9.0', '18.0'),
 ('3', '6'),
 ('17', '{"response": "34"}'),
 ('-5', '-3')]
```