# SimpleAIChat

Here are some fun, hackable examples of how simpleaichat works:

- Creating a [Python coding assistant](examples/notebooks/simpleaichat_coding.ipynb) without any unnecessary accompanying output, allowing 5x faster generation at 1/3rd the cost. ([Colab](https://colab.research.google.com/github/minimaxir/simpleaichat/blob/main/examples/notebooks/simpleaichat_coding.ipynb))
- Allowing simpleaichat to [provide inline tips](examples/notebooks/chatgpt_inline_tips.ipynb) following ChatGPT usage guidelines. ([Colab](https://colab.research.google.com/github/minimaxir/simpleaichat/blob/main/examples/notebooks/chatgpt_inline_tips.ipynb))
- Async interface for [conducting many chats](examples/notebooks/simpleaichat_async.ipynb) in the time it takes to receive one AI message. ([Colab](https://colab.research.google.com/github/minimaxir/simpleaichat/blob/main/examples/notebooks/simpleaichat_async.ipynb))
- Create your own Tabletop RPG (TTRPG) setting and campaign by using [advanced structured data models](examples/notebooks/schema_ttrpg.ipynb). ([Colab](https://colab.research.google.com/github/minimaxir/simpleaichat/blob/main/examples/notebooks/schema_ttrpg.ipynb))

In [1]:
from dotenv import load_dotenv
load_dotenv()
import os
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

True

In [2]:
import json
from pprint import PrettyPrinter

# One printer that keeps dict keys in insertion order instead of sorting them.
_printer = PrettyPrinter(sort_dicts=False)
pprint = _printer.pprint
pformat = _printer.pformat

def pretty_print(json_object):
    # Values that json cannot serialize fall back to their pformat string.
    print(json.dumps(json_object, indent=2, sort_keys=False, default=pformat))

In [14]:
from simpleaichat import AIChat
import inspect

SOURCE_CODE = {}
def add_source_code(obj):
    obj_name = obj.__name__
    source_code = inspect.getsource(obj)
    SOURCE_CODE[obj_name] = source_code

def get_source_code():
    return SOURCE_CODE
    
add_source_code(AIChat)
# SOURCE_CODE["AIChat"] = inspect.getsource(AIChat)
get_source_code()

{'AIChat': 'class AIChat(BaseModel):\n    client: Union[Client, AsyncClient]\n    default_session: Optional[ChatSession]\n    sessions: Dict[Union[str, UUID], ChatSession] = {}\n\n    class Config:\n        arbitrary_types_allowed = True\n        json_loads = orjson.loads\n        json_dumps = orjson_dumps\n\n    def __init__(\n        self,\n        character: str = None,\n        character_command: str = None,\n        system: str = None,\n        id: Union[str, UUID] = uuid4(),\n        prime: bool = True,\n        default_session: bool = True,\n        console: bool = True,\n        **kwargs,\n    ):\n\n        client = Client()\n        system_format = self.build_system(character, character_command, system)\n\n        sessions = {}\n        new_default_session = None\n        if default_session:\n            new_session = self.new_session(\n                return_session=True, system=system_format, id=id, **kwargs\n            )\n\n            new_default_session = new_session\n  

In [15]:
# from simpleaichat import AIChat

# ai = AIChat(system="Write a fancy GitHub README based on the user-provided project name.")
# ai("simpleaichat")
# print(ai)

## Building AI-based Apps


What sets the new chat-based models apart from earlier iterations of GPT-3 is the system prompt: a separate class of prompt that guides the AI's behavior throughout the entire conversation. In fact, the chat demos above actually use [system prompt tricks](https://github.com/minimaxir/simpleaichat/blob/main/PROMPTS.md#interactive-chat) behind the scenes! OpenAI has also released an official guide to [system prompt best practices](https://platform.openai.com/docs/guides/gpt-best-practices) for building AI apps.
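To make the role of the system prompt concrete, here is a minimal sketch of how the message list for a chat turn is typically assembled: the system prompt goes first and is sent again with every turn. The `build_messages` helper is hypothetical and not part of simpleaichat; the `role` and `content` keys follow the message format visible in the session dumps later in this document.

```python
from typing import Dict, List


def build_messages(
    system: str, history: List[Dict[str, str]], user_input: str
) -> List[Dict[str, str]]:
    """Assemble the message list for one chat turn (hypothetical helper)."""
    # The system prompt always comes first, so it steers every turn.
    messages = [{"role": "system", "content": system}]
    # Earlier user/assistant messages follow in chronological order.
    messages.extend(history)
    # The newest user input goes last.
    messages.append({"role": "user", "content": user_input})
    return messages


history = [
    {"role": "user", "content": "What is the capital of California?"},
    {"role": "assistant", "content": "The capital of California is Sacramento."},
]
messages = build_messages("You are a helpful assistant.", history, "When was it founded?")
for message in messages:
    print(f'{message["role"]}: {message["content"]}')
```

Because the system prompt is just the first entry of the list, swapping it changes the behavior of the whole conversation without touching the history.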

Developers can create a programmatic instance of `AIChat` by explicitly specifying a system prompt, or by disabling the console.

In [5]:
ai = AIChat(system="You are a helpful assistant.")
print('## AIChat(system="You are a helpful assistant."):\n')
print(ai)
# pprint(ai.get_session().dict())

ai = AIChat(console=False)  # same as above
print('\n## AIChat(console=False):\n')
print(ai)
# pprint(ai.get_session().dict())

## AIChat(system="You are a helpful assistant."):

{
  "id": "9454965b-52bc-4721-9373-ed0624bf7861",
  "created_at": "2023-06-20T02:19:43.984394+00:00",
  "auth": {
    "api_key": "**********"
  },
  "model": "gpt-3.5-turbo",
  "system": "You are a helpful assistant.",
  "params": {
    "temperature": 0.7
  },
  "messages": [],
  "input_fields": [
    "content",
    "role",
    "name"
  ],
  "save_messages": true,
  "total_prompt_length": 0,
  "total_completion_length": 0,
  "total_length": 0
}

## AIChat(console=False):

{
  "id": "9454965b-52bc-4721-9373-ed0624bf7861",
  "created_at": "2023-06-20T02:19:43.997186+00:00",
  "auth": {
    "api_key": "**********"
  },
  "model": "gpt-3.5-turbo",
  "system": "You are a helpful assistant.",
  "params": {
    "temperature": 0.7
  },
  "messages": [],
  "input_fields": [
    "content",
    "role",
    "name"
  ],
  "save_messages": true,
  "total_prompt_length": 0,
  "total_completion_length": 0,
  "total_length": 0
}


You can also pass in a `model` parameter, such as `model="gpt-4"` if you have access to GPT-4, or `model="gpt-3.5-turbo-16k"` for a larger-context-window ChatGPT.

In [6]:
ai = AIChat(
    console=False,
    # save_messages=False,  # with schema I/O, messages are never saved
    model="gpt-3.5-turbo-0613",
    params={"temperature": 0.0},
)
print('\n## AIChat(console=False):\n')
print(ai)


## AIChat(console=False):

{
  "id": "9454965b-52bc-4721-9373-ed0624bf7861",
  "created_at": "2023-06-20T02:19:56.539307+00:00",
  "auth": {
    "api_key": "**********"
  },
  "model": "gpt-3.5-turbo-0613",
  "system": "You are a helpful assistant.",
  "params": {
    "temperature": 0.0
  },
  "messages": [],
  "input_fields": [
    "content",
    "role",
    "name"
  ],
  "save_messages": true,
  "total_prompt_length": 0,
  "total_completion_length": 0,
  "total_length": 0
}



You can then feed user input to the new `ai` object, and it will return and save the response from ChatGPT:

In [7]:
response = ai("What is the capital of California?")
print(response)
print(ai)

The capital of California is Sacramento.


Alternatively, you can stream responses by token with a generator if the text generation itself is too slow:

In [13]:
for chunk in ai.stream("What is the capital of California?", params={"max_tokens": 5}):
    # Each chunk is a dict with "delta" (the new token) and "response" (the text so far).
    delta = chunk["delta"]
    response = chunk["response"]
    # Bracket only the newest token; str.replace would also bracket earlier repeats of it.
    print(f"{response[: len(response) - len(delta)]}[{delta}]")
print(ai)

[The]
The[ capital]
The capital[ of]
The capital of[ California]
The capital of California[ is]
{
  "id": "9454965b-52bc-4721-9373-ed0624bf7861",
  "created_at": "2023-06-20T02:22:50.412686+00:00",
  "auth": {
    "api_key": "**********"
  },
  "model": "gpt-3.5-turbo",
  "system": "You are a helpful assistant.",
  "params": {
    "temperature": 0.7
  },
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of California?",
      "received_at": "2023-06-20T02:22:50.413529+00:00"
    },
    {
      "role": "assistant",
      "content": "The capital of California is",
      "received_at": "2023-06-20T02:22:51.418384+00:00"
    }
  ],
  "input_fields": [
    "content",
    "role",
    "name"
  ],
  "save_messages": true,
  "total_prompt_length": 0,
  "total_completion_length": 0,
  "total_length": 0
}
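The loop above relies on each streamed chunk carrying both the newest token (`delta`) and the text accumulated so far (`response`). Here is a minimal sketch of that pattern with a plain generator. `fake_stream` is a hypothetical stand-in that needs no API call, and the real `ai.stream` may differ in its details.

```python
from typing import Dict, Iterable, Iterator


def fake_stream(tokens: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield chunks shaped like the ones consumed above (hypothetical stand-in)."""
    response = ""
    for token in tokens:
        response += token
        # "delta" is the new token; "response" is everything received so far.
        yield {"delta": token, "response": response}


chunks = list(fake_stream(["The", " capital", " of", " California", " is"]))
for chunk in chunks:
    # The newest token is always the tail of the response, so slice it off
    # and print it in brackets.
    head = chunk["response"][: len(chunk["response"]) - len(chunk["delta"])]
    print(f'{head}[{chunk["delta"]}]')
```

Since every chunk includes the full text so far, a consumer can redraw the whole response on each step instead of tracking state itself.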


Further calls to the `ai` object will continue the chat, automatically incorporating previous information from the conversation.

In [20]:
ai = AIChat(console=False, model="gpt-3.5-turbo-0301")

response = ai("What is the capital of California?")
print(response)
# print(ai)
response = ai("When was it founded?")
print(response)
print(ai)
# ai.default_session.messages

The capital of California is Sacramento.
Sacramento was founded on February 27, 1850. It was named after the Sacramento River, which runs through the city.
{
  "id": "183d84c2-77cf-461d-a14c-9ed66d4a5050",
  "created_at": "2023-06-20T02:31:58.325404+00:00",
  "auth": {
    "api_key": "**********"
  },
  "model": "gpt-3.5-turbo-0301",
  "system": "You are a helpful assistant.",
  "params": {
    "temperature": 0.7
  },
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of California?",
      "received_at": "2023-06-20T02:31:58.329111+00:00"
    },
    {
      "role": "assistant",
      "content": "The capital of California is Sacramento.",
      "received_at": "2023-06-20T02:31:59.247147+00:00",
      "finish_reason": "stop",
      "prompt_length": 26,
      "completion_length": 7,
      "total_length": 33
    },
    {
      "role": "user",
      "content": "When was it founded?",
      "received_at": "2023-06-20T02:31:59.248271+00:00"
    },
    {
      "

You can also save chat sessions (as CSV or JSON) and load them later. The API key is not saved, so you will have to provide it when loading.

In [28]:
# CSV, will only save messages
ai.save_session("chat_session.csv", format="csv")  # CSV
ai.load_session("chat_session.csv")
print("\n## CSV\n")
print(ai)

# JSON
ai.save_session("chat_session.json", format="json", minify=True)  # JSON
ai.load_session("chat_session.json")
print("\n## JSON\n")
print(ai)


## CSV

{
  "id": "183d84c2-77cf-461d-a14c-9ed66d4a5050",
  "created_at": "2023-06-20T02:31:58.325404+00:00",
  "auth": {
    "api_key": "**********"
  },
  "model": "gpt-3.5-turbo-0301",
  "system": "You are a helpful assistant.",
  "params": {
    "temperature": 0.7
  },
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of California?",
      "received_at": "2023-06-20T02:31:58.329111+00:00"
    },
    {
      "role": "assistant",
      "content": "The capital of California is Sacramento.",
      "received_at": "2023-06-20T02:31:59.247147+00:00",
      "finish_reason": "stop",
      "prompt_length": 26,
      "completion_length": 7,
      "total_length": 33
    },
    {
      "role": "user",
      "content": "When was it founded?",
      "received_at": "2023-06-20T02:31:59.248271+00:00"
    },
    {
      "role": "assistant",
      "content": "Sacramento was founded on February 27, 1850. It was named after the Sacramento River, which runs through the 

In [38]:
ai = AIChat(
    console=False,
    save_messages=False,  # with schema I/O, messages are never saved
    model="gpt-3.5-turbo-0613",
    params={"temperature": 0.0},
)
response = ai("What is the capital of California?")
print(response)
ai.get_session().dict()

{'id': UUID('affeba06-5b1b-4059-b65f-0e54208605d9'),
 'created_at': datetime.datetime(2023, 6, 20, 1, 24, 15, 454767, tzinfo=datetime.timezone.utc),
 'auth': {'api_key': SecretStr('**********')},
 'api_url': 'https://api.openai.com/v1/chat/completions',
 'model': 'gpt-3.5-turbo-0613',
 'system': 'You are a helpful assistant.',
 'params': {'temperature': 0.0},
 'messages': [],
 'input_fields': {'content', 'name', 'role'},
 'recent_messages': None,
 'save_messages': False,
 'total_prompt_length': 46,
 'total_completion_length': 26,
 'total_length': 72,
 'title': None}

### Functions

A large number of popular venture-capital-funded ChatGPT apps don't actually use the "chat" part of the model. Instead, they just use the system prompt/first user prompt as a form of natural language programming. You can emulate this behavior by passing a new system prompt when generating text, and not saving the resulting messages.

In [18]:
# Named json_input so it does not shadow the json module imported earlier.
json_input = '{"title": "An array of integers.", "array": [-1, 0, 1]}'

params = {"temperature": 0.0, "max_tokens": 100}  # a temperature of 0.0 makes output as deterministic as possible

# We namespace the function by `id` so it doesn't affect other chats.
# Settings set during session creation will apply to all generations from the session,
# but you can change them per-generation, as is the case with the `system` prompt here.
ai = AIChat(console=False, id="function", params=params, save_messages=False)
output = ai(json_input, id="function", system="Format the user-provided JSON as YAML.")
print(output)


title: "An array of integers."
array:
  - -1
  - 0
  - 1


In [21]:
functions = [
    "Format the user-provided JSON as YAML.",
    "Write a limerick based on the user-provided JSON.",
    "Translate the user-provided JSON from English to French.",
]
for function in functions:
    output = ai(json_input, id="function", system=function)
    print(output)

print(ai)

title: "An array of integers."
array:
  - -1
  - 0
  - 1


KeyError: "No AI generation: {'error': {'message': 'That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID a780fc22e32699984b145cf4bbf2f539 in your message.)', 'type': 'server_error', 'param': None, 'code': None}}"
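The `KeyError` above comes from a transient server-side overload, and the error message itself suggests retrying. Here is a minimal sketch of a retry wrapper with exponential backoff. `call_with_retry` is a hypothetical helper, not part of simpleaichat, and catching `KeyError` is an assumption based solely on the error shown above.

```python
import time
from typing import Any, Callable, Dict, List


def call_with_retry(
    fn: Callable[[], Any],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> Any:
    """Call fn(), retrying on KeyError with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except KeyError:
            if attempt == retries:
                raise  # out of retries: surface the original error
            sleep(base_delay * 2**attempt)  # wait 1s, 2s, 4s, ...


# Stand-in for a flaky API call: fails twice, then succeeds.
attempts: Dict[str, int] = {"count": 0}


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise KeyError("No AI generation: model overloaded")
    return "title: An array of integers."


delays: List[float] = []
# Record the delays instead of actually sleeping, to keep the demo instant.
result = call_with_retry(flaky_call, sleep=delays.append)
print(result, delays)
```

In the loop above, each generation could be wrapped the same way, for example `call_with_retry(lambda: ai(prompt, id="function", system=function))`, where `prompt` stands for the user input.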

#### Function Calling

Newer versions of ChatGPT also support "[function calling](https://platform.openai.com/docs/guides/gpt/function-calling)", but the real benefit of that feature is the ability for ChatGPT to support structured input and/or output, which now opens up a wide variety of applications! simpleaichat streamlines the workflow to allow you to just pass an `input_schema` and/or an `output_schema`.

You can construct a schema using a [pydantic](https://docs.pydantic.dev/latest/) BaseModel.


In [22]:
from pydantic import BaseModel, Field

ai = AIChat(
    console=False,
    save_messages=False,  # with schema I/O, messages are never saved
    model="gpt-3.5-turbo-0613",
    params={"temperature": 0.0},
)

class get_event_metadata(BaseModel):
    """Event information"""

    description: str = Field(description="Description of event")
    city: str = Field(description="City where event occurred")
    year: int = Field(description="Year when event occurred")
    month: str = Field(description="Month when event occurred")

# returns a dict, with keys ordered as in the schema
ai("First iPhone announcement", output_schema=get_event_metadata)

{'description': 'The first iPhone was announced by Apple Inc.',
 'city': 'San Francisco',
 'year': 2007,
 'month': 'January'}
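Because the call returns a plain dict, it can be worth checking the result before passing it on. Here is a stdlib-only sketch that checks key order and value types against a dataclass mirror of the schema. `EventMetadata` and `check_output` are hypothetical names used for illustration. With pydantic available, instantiating the model from the dict performs similar validation.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class EventMetadata:
    """Stdlib mirror of the get_event_metadata schema (hypothetical)."""

    description: str
    city: str
    year: int
    month: str


def check_output(output: Dict[str, Any]) -> EventMetadata:
    """Verify key order and value types, then return a typed object."""
    expected = [f.name for f in fields(EventMetadata)]
    if list(output) != expected:
        raise ValueError(f"expected keys {expected}, got {list(output)}")
    for f in fields(EventMetadata):
        if not isinstance(output[f.name], f.type):
            raise TypeError(f"{f.name} should be {f.type.__name__}")
    return EventMetadata(**output)


# The dict returned in the example above.
output = {
    "description": "The first iPhone was announced by Apple Inc.",
    "city": "San Francisco",
    "year": 2007,
    "month": "January",
}
event = check_output(output)
print(event.city, event.year)
```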

See the [TTRPG Generator Notebook](examples/notebooks/schema_ttrpg.ipynb) for a more elaborate demonstration of schema capabilities.


### Tools

One of the most recent aspects of interacting with ChatGPT is the ability for the model to use "tools." As popularized by [LangChain](https://github.com/hwchase17/langchain), tools allow the model to decide when to use custom functions, which can extend beyond just the chat AI itself, for example retrieving recent information from the internet not present in the chat AI's training data. This workflow is analogous to ChatGPT Plugins.

Parsing the model output to invoke tools typically requires a number of shenanigans, but simpleaichat uses [a neat trick](https://github.com/minimaxir/simpleaichat/blob/main/PROMPTS.md#tools) to make it fast and reliable! Additionally, the specified tools return a `context` for ChatGPT to draw from for its final response, and tools you specify can return a dictionary which you can also populate with arbitrary metadata for debugging and postprocessing. Each generation returns a dictionary with the `response` and the `tool` function used, which can be used to set up workflows akin to [LangChain](https://github.com/hwchase17/langchain)-style Agents, e.g. recursively feeding input to the model until it determines it does not need to use any more tools.

You will need to specify functions with docstrings which provide hints for the AI to select them:


In [23]:
from simpleaichat.utils import wikipedia_search, wikipedia_search_lookup

# This uses the Wikipedia Search API.
# Its results are nondeterministic; your mileage may vary.
def search(query):
    """Search the internet."""
    wiki_matches = wikipedia_search(query, n=3)
    return {"context": ", ".join(wiki_matches), "titles": wiki_matches}

def lookup(query):
    """Lookup more information about a topic."""
    page = wikipedia_search_lookup(query, sentences=3)
    return page

params = {"temperature": 0.0, "max_tokens": 100}
ai = AIChat(params=params, console=False)

ai("San Francisco tourist attractions", tools=[search, lookup])

{'context': "Fisherman's Wharf, San Francisco, Tourist attractions in the United States, Lombard Street (San Francisco)",
 'titles': ["Fisherman's Wharf, San Francisco",
  'Tourist attractions in the United States',
  'Lombard Street (San Francisco)'],
 'tool': 'search',
 'response': "There are many popular tourist attractions in San Francisco, including Fisherman's Wharf and Lombard Street. Fisherman's Wharf is a bustling waterfront area known for its seafood restaurants, souvenir shops, and sea lion sightings. Lombard Street, on the other hand, is a famous winding street with eight hairpin turns that offers stunning views of the city. Both of these attractions are must-sees for anyone visiting San Francisco."}

In [24]:
ai("Lombard Street?", tools=[search, lookup])

{'context': 'Lombard Street is an east–west street in San Francisco, California that is famous for a steep, one-block section with eight hairpin turns. Stretching from The Presidio east to The Embarcadero (with a gap on Telegraph Hill), most of the street\'s western segment is a major thoroughfare designated as part of U.S. Route 101. The famous one-block section, claimed to be "the crookedest street in the world", is located along the eastern segment in the Russian Hill neighborhood.',
 'tool': 'lookup',
 'response': 'Lombard Street is a famous street in San Francisco, California known for its steep, one-block section with eight hairpin turns. It stretches from The Presidio to The Embarcadero and is part of U.S. Route 101. The one-block section, located in the Russian Hill neighborhood, is claimed to be "the crookedest street in the world" and is a popular tourist attraction.'}

In [25]:
ai("Thanks for your help!", tools=[search, lookup])

{'response': "You're welcome! If you have any more questions or need further assistance, feel free to ask.",
 'tool': None}
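The three calls above suggest how an agent-style loop could work: keep calling the model until the returned `tool` is `None`. Here is a minimal sketch of one such loop. `run_until_no_tool` and `stub_ai` are hypothetical, the stub replaces the real `AIChat` object so no API call is made, and feeding the previous `response` back in as the next prompt is only one possible design.

```python
from typing import Any, Callable, Dict, List


def run_until_no_tool(
    ai: Callable[..., Dict[str, Any]],
    prompt: str,
    tools: List[Callable],
    max_steps: int = 5,
) -> List[Dict[str, Any]]:
    """Feed each response back in until no tool is selected (hypothetical helper)."""
    steps = []
    for _ in range(max_steps):  # hard cap so the loop always terminates
        result = ai(prompt, tools=tools)
        steps.append(result)
        if result["tool"] is None:
            break  # the model answered without a tool: done
        prompt = result["response"]  # next input is the previous response
    return steps


# Stub standing in for AIChat: picks "search", then "lookup", then no tool.
scripted = iter(["search", "lookup", None])


def stub_ai(prompt: str, tools: List[Callable]) -> Dict[str, Any]:
    return {"response": f"answer to: {prompt[:20]}", "tool": next(scripted)}


steps = run_until_no_tool(stub_ai, "San Francisco tourist attractions", tools=[])
print([step["tool"] for step in steps])
```

The `max_steps` cap matters in practice: without it, a model that keeps selecting tools would loop, and spend tokens, indefinitely.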