<a href="https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/agentchat_web_info.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AutoGen Agents with Retrieval Augmented Generation

**`AutoGen`** is a versatile framework that facilitates the creation of LLM applications by employing multiple agents capable of interacting with one another to tackle tasks. These AutoGen agents can be tailored to specific needs, engage in conversations, and seamlessly integrate human participation. They are adaptable to different operation modes that encompass the utilization of LLMs, human inputs, and various tools.

**`RAG`** stands for `Retrieval Augmented Generation`, a natural language processing (NLP) technique that combines two essential components: **retrieval** and **generation**.

The previous tutorial [AutoGen + LangChain = Super AI Agents](https://github.com/sugarforever/LangChain-Advanced/blob/main/Integrations/AutoGen/autogen_langchain_uniswap_ai_agent.ipynb) introduced how to build an AutoGen application that can execute tasks requiring specific documents knowledge. This is a typical RAG use case, aka. document based chatbot.

The latest **AutoGen** version already supports RAG natively with the feature package `retrievechat`.

In this tutorial, we are going to rebuild the same feature demonstrated in the previous tutorial. We will utilize `AutoGen` `retrievechat` feature.

This tutorial is inspired by the [Blog - Retrieval-Augmented Generation (RAG) Applications with AutoGen](https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat) of [Li Jiang](https://github.com/thinkall).

Credits go to Li Jiang! 🙌

Let's roll! 🚴🏻‍♀️ 🚴🏻 🚴🏻‍♂️

## Use Case



In this tutorial, I will create the retrieval augmented agents with the following document:

[RETRIEVAL AUGMENTED GENERATION AND REPRESENTATIVE
VECTOR SUMMARIZATION FOR LARGE UNSTRUCTURED
TEXTUAL DATA IN MEDICAL EDUCATION](https://arxiv.org/pdf/2308.00479.pdf)

You should be able to see the agents are able to perform retrieval augmented generation based on the document above and answer question.

### Environment Preparation

In [1]:
%pip install pyautogen[retrievechat] langchain "chromadb<0.4.15" -q

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/67.3 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m67.3/67.3 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.6/50.6 kB[0m [31m1.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m294.6/294.6 kB[0m [31m14.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.0/1.0 MB[0m [31m31.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m448.1/448.1 kB[0m [31m24.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.4/2.4 MB[0m [31m59.3 MB/s[0m eta [36m0:00:

In [1]:
!pip install --upgrade chromadb
%pip install pyautogen[retrievechat] langchain



In [2]:
import os
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [1]:
import autogen

config_list = autogen.config_list_from_json(
    "/content/drive/MyDrive/surya/Foundation Models and Generative AI_Minor/OAI_CONFIG_LIST.json",
    filter_dict={
        "model": ["gpt-4"],
    },
)

Dask dataframe query planning is disabled because dask-expr is not installed.

You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.



### Steps

#### 1. Configure Embedding Function

We will use OpenAI embedding function.

In [2]:
from chromadb.utils import embedding_functions

openai_embedding_function = embedding_functions.OpenAIEmbeddingFunction(api_key = config_list[0]["api_key"])

#### 2. Configure Text Splitter

LangChain has done a great job in text splitting, so we will use its components.

In [3]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(separators=["\n\n", "\n", "\r", "\t"])

#### 3. Configure Vector Store

By default, the AutoGen retrieval augmented agents use `chromadb` as vector store.

Developers can configure preferred vector store by extending the class `RetrieveUserProxyAgent` and overriding function `retrieve_docs`.

AutoGen also supports simple configuration items to customize Chromadb storage.

In this demo, we will specify the collection name by `retreive_config` item `collection_name`. You should be able to see it in step 4.


#### 4. Create Retrieval Augmented Agents

In [8]:
!pip install --upgrade autogen


Collecting autogen
  Using cached autogen-0.3.1-py3-none-any.whl.metadata (27 kB)
Using cached autogen-0.3.1-py3-none-any.whl (350 kB)
Installing collected packages: autogen
Successfully installed autogen-0.3.1


In [4]:
from autogen.agentchat.contrib.retrieve_assistant_agent import RetrieveAssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

llm_config = {
    "request_timeout": 600,
    "config_list": config_list,
    "temperature": 0
}

assistant = RetrieveAssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config=llm_config,
)

rag_agent = RetrieveUserProxyAgent(
    human_input_mode="NEVER",
    retrieve_config={
        "task": "qa",
        "docs_path": "/content/drive/MyDrive/surya/Foundation Models and Generative AI_Minor/rag.pdf",
        "collection_name": "rag_collection",
        "embedding_function": openai_embedding_function,
        "custom_text_split_function": text_splitter.split_text,
    },
)





  assistant = RetrieveAssistantAgent(


In [7]:
pip install openai==0.27.0


Collecting openai==0.27.0
  Downloading openai-0.27.0-py3-none-any.whl.metadata (13 kB)
Downloading openai-0.27.0-py3-none-any.whl (70 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m70.1/70.1 kB[0m [31m2.2 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.52.2
    Uninstalling openai-1.52.2:
      Successfully uninstalled openai-1.52.2
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
autogen 0.3.1 requires openai>=1.3, but you have openai 0.27.0 which is incompatible.
pyautogen 0.3.0 requires openai>=1.3, but you have openai 0.27.0 which is incompatible.[0m[31m
[0mSuccessfully installed openai-0.27.0


In [13]:
import time
import openai

openai.api_key = "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"

def get_response(prompt):
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message["content"]
    except openai.error.RateLimitError:
        print("Rate limit exceeded. Waiting and retrying...")
        time.sleep(10)  # Wait before retrying
        return get_response(prompt)

print(get_response("Hello, world!"))


Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit exceeded. Waiting and retrying...
Rate limit

KeyboardInterrupt: 

### It's time to start a chat with the RAG agent.

In [12]:
assistant.reset()
rag_agent.initiate_chat(assistant, problem="What is the workflow in docGPT?", n_results=2)

>What is the workflow in docGPT?
RetrieveChatAgent (to assistant):

What is the workflow in docGPT?

--------------------------------------------------------------------------------


TypeError: Completions.create() got an unexpected keyword argument 'request_timeout'