<a href="https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_teachability.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Making OpenAI Assistants Teachable

Conversational assistants based on LLMs can remember the current chat with the user, and can even demonstrate in-context learning of things that the user teaches the assistant during the chat. But these memories and learnings are lost once the chat is over, or when a single chat grows too long for the LLM to handle effectively. In subsequent chats, the user is forced to repeat any necessary instructions over and over.

The optional agent capability called `Teachability` addresses these limitations by persisting user teachings across chat boundaries in long-term memory (a vector database). Memories (called memos) are created and saved to disk throughout a conversation, then loaded from disk later. Instead of copying all the memos into the context window, which would eat up valuable space, individual memos are retrieved into context only as needed. This allows the user to teach many facts, preferences and skills to the teachable agent just once, and have it remember them in later chats.
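
The retrieve-only-what-is-relevant idea can be illustrated with a toy example. This is a minimal sketch under loud assumptions, not the `Teachability` implementation: `ToyMemoStore`, `embed`, and `cosine` are hypothetical names, the "embedding" is a plain bag-of-words count rather than a real embedding model, and memos are kept in memory instead of in a vector database on disk.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyMemoStore:
    """Hypothetical in-memory memo store; the real capability persists memos to disk."""

    def __init__(self):
        self.memos = []  # list of (input_text, output_text) pairs

    def add(self, input_text, output_text):
        self.memos.append((input_text, output_text))

    def retrieve(self, query, max_results=2, min_similarity=0.2):
        """Return up to max_results memo outputs whose inputs resemble the query."""
        q = embed(query)
        scored = [(cosine(q, embed(inp)), out) for inp, out in self.memos]
        scored = [pair for pair in scored if pair[0] >= min_similarity]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [out for _, out in scored[:max_results]]


store = ToyMemoStore()
store.add("How should lists be formatted?", "When listing things, put them in a table.")
store.add("What is the user's favorite color?", "The user's favorite color is teal.")

# Only the memo about list formatting is similar enough to be pulled into context.
relevant = store.retrieve("Please show the lists formatted nicely")
print(relevant)
```

Only memos that clear the similarity threshold are returned, so unrelated memos never occupy space in the context window.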

In making decisions about memo storage and retrieval, `Teachability` calls an instance of `TextAnalyzerAgent` to analyze pieces of text in several different ways. This adds extra LLM calls involving a relatively small number of tokens. These calls can add a few seconds to the time a user waits for a response.

This notebook demonstrates how `Teachability` can be added to instances of `GPTAssistantAgent`
so that they can learn facts, preferences, and skills from users. As explained [here](https://microsoft.github.io/autogen/blog/2023/11/13/OAI-assistants), each instance of `GPTAssistantAgent` wraps an OpenAI Assistant that can be given a set of tools including functions, code interpreter, and retrieval. Assistants with these tools are demonstrated in separate standalone sections below, which can be run independently.

## Requirements

AutoGen requires `Python>=3.8`. To run this notebook, install AutoGen with the `teachable` option:
```bash
pip install "pyautogen[teachable]"
```

In [None]:
%%capture --no-stderr
# %pip install "pyautogen[teachable]"

## Set your API Endpoint

The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a json file.

In [1]:
import autogen
from autogen.agentchat.contrib.capabilities.teachability import Teachability

config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
    file_location=".",
    filter_dict={
        "model": ["gpt-4", "gpt-4-1106-preview", "gpt4", "gpt-4-32k"],
    },
)

print(config_list[0]["model"])

gpt-4-1106-preview


The function first looks for the environment variable `OAI_CONFIG_LIST`, which must contain a valid JSON string. If that variable is not found, it looks for a JSON file named `OAI_CONFIG_LIST`. It then filters the configs by model (you can filter by other keys as well). After the filter shown above is applied, only the GPT-4 models are considered.

The config list may look like the following:
```python
config_list = [
    {
        'model': 'gpt-4-1106-preview',
        'api_key': '<your OpenAI API key here>',
    },
    {
        'model': 'gpt-4',
        'api_key': '<your OpenAI API key here>',
    },
    {
        'model': 'gpt-4',
        'api_key': '<your Azure OpenAI API key here>',
        'base_url': '<your Azure OpenAI API base here>',
        'api_type': 'azure',
        'api_version': '2023-06-01-preview',
    },
    {
        'model': 'gpt-4-32k',
        'api_key': '<your Azure OpenAI API key here>',
        'base_url': '<your Azure OpenAI API base here>',
        'api_type': 'azure',
        'api_version': '2023-06-01-preview',
    },
]
```
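
The lookup order and filtering described above can be sketched in plain Python. This is an illustration of the described behavior only, not the library's code: `load_and_filter` and the `DEMO_CONFIG_LIST` variable are hypothetical names introduced for this example.

```python
import json
import os


def load_and_filter(env_or_file, filter_dict=None):
    """Hypothetical stand-in for the lookup order and filtering described above."""
    raw = os.environ.get(env_or_file)
    if raw is None:
        # Fall back to a JSON file with the same name in the current directory.
        with open(env_or_file) as f:
            raw = f.read()
    configs = json.loads(raw)
    if not filter_dict:
        return configs
    # Keep a config only if, for every filtered key, its value is one of the accepted values.
    return [
        cfg for cfg in configs
        if all(cfg.get(key) in accepted for key, accepted in filter_dict.items())
    ]


# The variable name below is made up for this demonstration.
os.environ["DEMO_CONFIG_LIST"] = json.dumps([
    {"model": "gpt-4", "api_key": "<key>"},
    {"model": "gpt-3.5-turbo", "api_key": "<key>"},
])
filtered = load_and_filter("DEMO_CONFIG_LIST", {"model": ["gpt-4", "gpt-4-32k"]})
print([cfg["model"] for cfg in filtered])
```

Each key in the filter maps to a list of accepted values, so a config survives only when it matches on every filtered key.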

If you open this notebook in Colab, you can upload your files by clicking the file icon in the left panel and then choosing the "upload file" icon.

You can set the value of config_list in other ways if you prefer, e.g., loading from a YAML file.

## Teachable OpenAI Assistant with Functions
This section is based on [agentchat_oai_assistant_function_call.ipynb](https://github.com/microsoft/autogen/blob/teach_cap/notebook/agentchat_oai_assistant_function_call.ipynb), which demonstrates how to use OSS Insight (Open Source Software Insight) for advanced GitHub data analysis.

### Function schema and implementation

This section provides the function schema definition and its implementation. The function is tailored to fetch and process data from GitHub using OSS Insight's capabilities.

In [2]:
import logging
import os
import requests

logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)

ossinsight_api_schema = {
  "name": "ossinsight_data_api",
  "parameters": {
    "type": "object",
    "properties": {
      "question": {
        "type": "string",
        "description": (
            "Enter your GitHub data question in the form of a clear and specific question to ensure the returned data is accurate and valuable. "
            "For optimal results, specify the desired format for the data table in your request."
        ),
      }
    },
    "required": [
      "question"
    ]
  },
  "description": "This is an API endpoint allowing users (analysts) to input a question about GitHub in text format to retrieve the related and structured data."
}

def get_ossinsight(question):
    """
    Send a natural-language question about GitHub data to the OSS Insight
    explorer API and return the answer as a plain-text report.
    """
    url = "https://api.ossinsight.io/explorer/answer"
    headers = {"Content-Type": "application/json"}
    data = {
        "question": question,
        "ignoreCache": True
    }

    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 200:
        answer = response.json()
    else:
        return f"Request to {url} failed with status code: {response.status_code}"

    report_components = []
    report_components.append(f"Question: {answer['question']['title']}")
    if answer['query']['sql'] != "":
        report_components.append(f"querySQL: {answer['query']['sql']}")

    if answer.get('result', None) is None or len(answer['result']['rows']) == 0:
        result = "Result: N/A"
    else:
        result = "Result:\n  " + "\n  ".join([str(row) for row in answer['result']['rows']])
    report_components.append(result)

    if answer.get('error', None) is not None:
        report_components.append(f"Error: {answer['error']}")
    return "\n\n".join(report_components) + "\n\n"

### Construct the assistant with function calling as a tool

In [3]:
from autogen import config_list_from_json
from autogen.agentchat.contrib.gpt_assistant_agent import GPTAssistantAgent
from autogen import UserProxyAgent

assistant_id = os.environ.get("ASSISTANT_ID", None)
config_list = config_list_from_json("OAI_CONFIG_LIST")
llm_config = {
    "config_list": config_list,
    "assistant_id": assistant_id,
    "tools": [
        {
            "type": "function",
            "function": ossinsight_api_schema,
        }
    ]
}

oss_analyst = GPTAssistantAgent(
    name="OSS Analyst",
    instructions=(
        "Hello, Open Source Project Analyst. You'll conduct comprehensive evaluations of open source projects or organizations on the GitHub platform, "
        "analyzing project trajectories, contributor engagements, open source trends, and other vital parameters. "
        "Please carefully read the context of the conversation to identify the current analysis question or problem that needs addressing."
    ),
    llm_config=llm_config,
    verbose=True,
)
oss_analyst.register_function(
    function_map={
        "ossinsight_data_api": get_ossinsight,
    }
)

GPT Assistant only supports one OpenAI client. Using the first client in the list.
assistant OSS Analyst does not exist, creating a new assistant
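
The `function_map` passed to `register_function` ties the tool name used in the schema to a Python callable. Conceptually, when the assistant requests a tool call, the name is looked up and the function is invoked with the decoded arguments. The following is a minimal sketch of that idea and not AutoGen's actual internals; `dispatch_tool_call` and `fake_ossinsight` are hypothetical names introduced here.

```python
import json


def dispatch_tool_call(function_map, name, arguments_json):
    """Look up a registered function by name and call it with JSON-encoded arguments."""
    func = function_map.get(name)
    if func is None:
        return f"Error: function {name} not found."
    try:
        kwargs = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return f"Error: could not parse arguments: {exc}"
    return func(**kwargs)


def fake_ossinsight(question):
    """Hypothetical stand-in so the sketch runs without network access."""
    return f"Question: {question}\n\nResult: N/A\n\n"


demo_map = {"ossinsight_data_api": fake_ossinsight}
reply = dispatch_tool_call(demo_map, "ossinsight_data_api", '{"question": "Top 10 developers"}')
missing = dispatch_tool_call(demo_map, "unknown_tool", "{}")
print(reply)
print(missing)
```

The key in the map must match the `name` field of the schema, which is why both use `ossinsight_data_api` in the cell above.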


### Give the assistant a task involving function calls

In [4]:
user_proxy = UserProxyAgent(name="user_proxy",
    code_execution_config={
        "work_dir": "coding"
    },
    is_termination_msg=lambda msg: "TERMINATE" in msg["content"],
    human_input_mode="NEVER",
    max_consecutive_auto_reply=0)

user_proxy.initiate_chat(oss_analyst, message="Top 10 developers with the most followers")

user_proxy (to OSS Analyst):

Top 10 developers with the most followers

--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION ossinsight_data_api...
Input arguments: {'question': 'Who are the top 10 developers with the most followers on GitHub?'}
Output:
Question: Who are the top 10 developers with the most followers on GitHub?

querySQL: SELECT `login` AS `user_login`, `followers` AS `followers` FROM `github_users` ORDER BY `followers` DESC LIMIT 10

Result:
  {'followers': 196404, 'user_login': 'torvalds'}
  {'followers': 96850, 'user_login': 'yyx990803'}
  {'followers': 85337, 'user_login': 'gaearon'}
  {'followers': 77032, 'user_login': 'ruanyf'}
  {'followers': 76251, 'user_login': 'gustavoguanabara'}
  {'followers': 68210, 'user_login': 'bradtraversy'}
  {'followers': 66512, 'user_login': 'JakeWharton'}
  {'followers': 60972, 'user_login': 'peng-zhihui'}
  {'followers': 60311, 'user_login': 'sindreso

### Make the assistant teachable
We can make any `ConversableAgent` teachable by adding a `Teachability` object to it.

In [5]:
teachability = Teachability(reset_db=True, llm_config={"config_list": config_list})
teachability.add_to_agent(oss_analyst)

CLEARING MEMORY


### Test the teachable assistant
This time let's teach the assistant a more specific way of formatting lists.

In [8]:
user_proxy.initiate_chat(oss_analyst, message="List the top 10 developers with the most followers. When listing things, please put them in a table.")

user_proxy (to OSS Analyst):

List the top 10 developers with the most followers. When listing things, please put them in a table.

--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION ossinsight_data_api...
Input arguments: {'question': "List the top 10 developers with the most followers on GitHub. Please provide the data in a table format including the developers' usernames and the number of followers they have."}
Output:
Question: List the top 10 developers with the most followers on GitHub. Please provide the data in a table format including the developers' usernames and the number of followers they have.

querySQL: SELECT `login` AS `Developer`, `followers` AS `Followers` FROM `github_users` ORDER BY `followers` DESC LIMIT 10

Result:
  {'Developer': 'torvalds', 'Followers': 196404}
  {'Developer': 'yyx990803', 'Followers': 96850}
  {'Developer': 'gaearon', 'Followers': 85337}
  {'Developer': 'ruanyf'

Finally, let's see whether the assistant can remember to use our preferred formatting in a new chat.

In [9]:
user_proxy.initiate_chat(oss_analyst, message="List the top 10 developers with the most followers.", clear_history=True)

user_proxy (to OSS Analyst):

List the top 10 developers with the most followers.

--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION ossinsight_data_api...
Input arguments: {'question': 'List the top 10 developers with the most followers. Please provide the data in a table format.'}
Output:
Question: List the top 10 developers with the most followers. Please provide the data in a table format.

querySQL: SELECT login AS user_login, followers FROM github_users ORDER BY followers DESC LIMIT 10

Result:
  {'followers': 196404, 'user_login': 'torvalds'}
  {'followers': 96850, 'user_login': 'yyx990803'}
  {'followers': 85337, 'user_login': 'gaearon'}
  {'followers': 77032, 'user_login': 'ruanyf'}
  {'followers': 76251, 'user_login': 'gustavoguanabara'}
  {'followers': 68210, 'user_login': 'bradtraversy'}
  {'followers': 66512, 'user_login': 'JakeWharton'}
  {'followers': 60972, 'user_login': 'peng-zhihui'}
 

### Delete the teachable assistant
All OpenAI Assistants can be created, viewed and deleted manually through OpenAI's [website](https://platform.openai.com/assistants). They can also be deleted through the `GPTAssistantAgent` instance.

In [10]:
oss_analyst.delete_assistant()

Permanently deleting assistant...
