<!--
tags: ["RAG"]
description: |
    Explore the use of AutoGen's RagAgent for tasks like code generation from docstrings, answering complex questions with human feedback, and exploiting features like Update Context, custom prompts, and few-shot learning.
-->

# Using RagAgent for Retrieval-Augmented Code Generation and Question Answering

This notebook introduces `RagAgent`, an agent that performs Retrieval-Augmented Generation (RAG) on the messages it receives.

When it receives a message, the agent retrieves documents relevant to that message, then generates a reply from both the retrieved documents and the message itself. It can also update its retrieval context during the conversation, either automatically or at the user's request.
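The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not `RagAgent`'s implementation: the toy corpus `DOCS`, the word-overlap scoring in `retrieve`, and the prompt layout in `build_prompt` are all hypothetical stand-ins for a real vector store and LLM call.

```python
from collections import Counter

# Hypothetical toy corpus; a real knowledge base would be built from `docs_path`.
DOCS = [
    "FLAML is a lightweight library for automated machine learning.",
    "Spark can parallelize training jobs across a cluster.",
    "AutoGen enables multi-agent conversations with LLMs.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def retrieve(message, docs, top_k=2):
    """Rank documents by how many word occurrences they share with the message."""
    query = Counter(tokenize(message))
    scored = []
    for doc in docs:
        overlap = sum((query & Counter(tokenize(doc))).values())
        scored.append((overlap, doc))
    # The sort is stable, so ties keep their original corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(message, context_docs):
    """Combine the retrieved documents and the message into a single prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {message}"


message = "How does Spark parallelize training?"
prompt = build_prompt(message, retrieve(message, DOCS))
print(prompt)
```

In a real setup the prompt would be sent to the model configured in `llm_config`; here it is only printed so the two stages stay visible.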

## Table of Contents
We'll demonstrate two examples of using RagAgent for code generation and question answering:

- [Example 1: Use RagAgent to help generate code](#example-1)
- [Example 2: Use RagAgent to answer a multi-hop question](#example-2)


````{=mdx}
:::info Requirements
Some extra dependencies are needed for this notebook, which can be installed via pip:

```bash
pip install pyautogen[rag] flaml[automl]
```

For more information, please refer to the [installation guide](/docs/installation/).
:::
````

## Set your API Endpoint

The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file.
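For orientation, the configuration is a JSON list of dictionaries, one per model endpoint. The sketch below shows that shape and a simplified loader. The `load_config_list` helper, the `EXAMPLE_CONFIG_LIST` variable name, and the placeholder keys are assumptions for illustration only; the real `config_list_from_json` has more options.

```python
import json
import os

# Placeholder entries for illustration only; substitute real model names and keys.
example_configs = [
    {"model": "gpt-35-turbo", "api_key": "<your-api-key>"},
    {"model": "gpt-35-turbo-0613", "api_key": "<your-api-key>"},
]


def load_config_list(env_or_file):
    """Simplified stand-in: read JSON from an env var, else from a file of that name."""
    raw = os.environ.get(env_or_file)
    if raw is None:
        with open(env_or_file) as f:
            raw = f.read()
    return json.loads(raw)


# Hypothetical variable name, used here so the sketch does not touch OAI_CONFIG_LIST.
os.environ["EXAMPLE_CONFIG_LIST"] = json.dumps(example_configs)
loaded = load_config_list("EXAMPLE_CONFIG_LIST")
print([config["model"] for config in loaded])
```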


In [1]:
import json
import os
import logging
import autogen
from autogen.agentchat.contrib.rag import RagAgent, logger
from autogen.agentchat.contrib.rag.splitter import SUPPORTED_EXTENSIONS

logger.setLevel(logging.DEBUG)

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST")

assert len(config_list) > 0
print("models to use: ", [config["model"] for config in config_list])


models to use:  ['gpt-35-turbo', 'gpt-35-turbo-0613']


````{=mdx}
:::tip
Learn more about configuring LLMs for agents [here](/docs/llm_configuration).
:::
````

## Construct agents for RAG

We start by initializing the `RagAgent` and `UserProxyAgent`. The `RagAgent` will be responsible for replying to `UserProxyAgent` with retrieval-augmented generation based on the given knowledge base contained in the `docs_path`.

In [2]:
print("Accepted file formats for `docs_path`:")
print(SUPPORTED_EXTENSIONS)

Accepted file formats for `docs_path`:
['.txt', '.json', '.csv', '.tsv', '.md', '.html', '.htm', '.rtf', '.rst', '.jsonl', '.log', '.xml', '.yaml', '.yml', '.pdf']
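When `docs_path` points at a directory, only files with one of these extensions are candidates for indexing. A minimal sketch of that kind of filtering follows. The `list_supported_files` helper is hypothetical, and the case-insensitive comparison is an assumption rather than confirmed `RagAgent` behavior.

```python
import os
import tempfile

# Copied from the output above.
SUPPORTED = ['.txt', '.json', '.csv', '.tsv', '.md', '.html', '.htm', '.rtf',
             '.rst', '.jsonl', '.log', '.xml', '.yaml', '.yml', '.pdf']


def list_supported_files(root, extensions):
    """Walk `root` and return sorted paths whose extension is in `extensions`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: extensions are compared case-insensitively.
            if os.path.splitext(name)[1].lower() in extensions:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Build a throwaway directory with two supported files and one unsupported file.
with tempfile.TemporaryDirectory() as root:
    for name in ("guide.md", "notes.TXT", "model.bin"):
        open(os.path.join(root, name), "w").close()
    found = [os.path.basename(p) for p in list_supported_files(root, SUPPORTED)]

print(found)
```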


In [3]:
llm_config = {
    "timeout": 60,
    "config_list": config_list,
    # "cache_seed": None,  # set to None if you want to disable caching
}


def termination_msg(x):
    return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()


userproxy = autogen.UserProxyAgent(
    name="userproxy",
    is_termination_msg=termination_msg,
    human_input_mode="NEVER",
    code_execution_config=False,  # {"use_docker":False, "work_dir":"./tmp"},
    # default_auto_reply="Reply `TERMINATE` if the task is done.",
    description="The boss who asks questions and gives tasks.",
)

# Check the docstring of RagAgent for more details on the configuration.
rag_config = {
    "docs_path": [
        "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md",
        "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md",
        os.path.join(os.path.abspath(""), "..", "website", "docs"),
    ],
    "overwrite": True,  # set to False to avoid overwriting the existing documents
}

rag = RagAgent(
    name="rag",
    is_termination_msg=termination_msg,
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    llm_config=llm_config,
    rag_config=rag_config,
    code_execution_config=False,
    description="Assistant that can answer questions and generate code based on retrieved documents.",
)

2024-02-26 06:11:01,692 - utils.py:   16 - DEBUG - Processing file: ./tmp/download/Integrate%20-%20Spark.md, url: https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md
2024-02-26 06:11:02,055 - utils.py:   16 - DEBUG - Split ./tmp/download/Integrate%20-%20Spark.md into 2 chunks.
2024-02-26 06:11:02,055 - utils.py:   16 - DEBUG - Processing file: ./tmp/download/Research.md, url: https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md
2024-02-26 06:11:02,058 - utils.py:   16 - DEBUG - Split ./tmp/download/Research.md into 2 chunks.
2024-02-26 06:11:02,058 - utils.py:   16 - DEBUG - Processing file: /datadrive/autogen/notebook/../website/docs/Research.md, url: None
2024-02-26 06:11:02,060 - utils.py:   16 - DEBUG - Split /datadrive/autogen/notebook/../website/docs/Research.md into 1 chunks.
2024-02-26 06:11:02,060 - utils.py:   16 - DE
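The `termination_msg` predicate passed to both agents decides when the conversation stops: it accepts only dictionary messages whose content ends with `TERMINATE`, compared case-insensitively. The function below is copied from the cell above; the sample messages are made up to show the edge cases.

```python
def termination_msg(x):
    return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()


# The last nine characters must spell TERMINATE, in any letter case.
print(termination_msg({"content": "All done. TERMINATE"}))   # ends with the keyword
print(termination_msg({"content": "finished, terminate"}))   # lowercase also matches
# Trailing punctuation shifts the nine-character window, so this does not match.
print(termination_msg({"content": "TERMINATE."}))
# A missing or None content, or a non-dict message, never terminates the chat.
print(termination_msg({"content": None}))
print(termination_msg("TERMINATE"))
```

Because the check looks only at the final nine characters, a reply that appends anything after the keyword, even a period or a space, keeps the conversation going.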

### Example 1

[Back to top](#table-of-contents)

Use RagAgent to help generate sample code, then automatically run the code and fix any errors.

Problem: How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.
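The cell that follows sets `rag.rag_filter_document = {"$contains": "spark"}` so that only documents containing "spark" are considered. The sketch below illustrates what such a filter does, assuming a plain case-sensitive substring match. The `matches_filter` helper and the sample chunks are hypothetical and are not part of `RagAgent`.

```python
def matches_filter(text, doc_filter):
    """Return True if `text` passes the filter; a filter of None lets everything through."""
    if doc_filter is None:
        return True
    # Assumption: `$contains` means a case-sensitive substring match.
    return doc_filter["$contains"] in text


# Hypothetical document chunks for illustration.
chunks = [
    "Use spark to parallelize FLAML training.",
    "FLAML supports classification and regression tasks.",
    "Set use_spark=True to run trials on a spark cluster.",
]

spark_only = [c for c in chunks if matches_filter(c, {"$contains": "spark"})]
unfiltered = [c for c in chunks if matches_filter(c, None)]
print(len(spark_only), len(unfiltered))
```

Setting the filter back to `None`, as Example 2 does, restores retrieval over the whole knowledge base.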

In [4]:
code_problem = "How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached."
rag.rag_filter_document = {"$contains": "spark"}  # filter documents that contain "spark"
_ = userproxy.initiate_chat(rag, message=code_problem)

userproxy (to rag):

How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.

--------------------------------------------------------------------------------


2024-02-26 06:11:04,725 - utils.py:   16 - DEBUG - is_code_execution_result: False
2024-02-26 06:11:04,726 - utils.py:   16 - DEBUG - Input message: How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.
2024-02-26 06:11:04,726 - utils.py:   16 - DEBUG - Received raw message: How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.
2024-02-26 06:11:04,727 - utils.py:   16 - DEBUG - Performing RAG for message: How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.
2024-02-26 06:11:04,753 - utils.py:   16 - DEBUG - Task predicted: `code`
2024-02-26 06:11:04,754 - utils.py:   16 - DEBUG - __call__ took 0.03 seconds.
2024-02-26 06:11:04,754 - ut

rag (to userproxy):

You can use FLAML for a classification task and train the model in 30 seconds while using Spark to parallelize the training. Force cancel jobs if the time limit is reached using the following code snippet:

```python
import pandas as pd
from pyspark.sql import SparkSession
from flaml import AutoML

spark = SparkSession.builder \
    .appName("FLAML") \
    .config("spark.sql.execution.arrow.enabled","true") \
    .getOrCreate()

train_data = spark.read.format("csv").option("header", "true").option(
    "inferSchema", "true").load("train.csv")

train_pd = train_data.toPandas()
X_train = train_pd.drop(columns=['target'])
y_train = train_pd['target']

automl = AutoML()

settings = {
    "time_budget": 30,
    "metric": 'accuracy',
    "task": 'classification',
    "n_jobs": -1,
    "model_history": True,
    "log_file_name": "log.txt",
    "estimator_list": ['lgbm', 'rf', 'xgboost'],
    "use_spark": True,
    "force_cancel": True,
    "num_samples": -1,
```

2024-02-26 06:11:04,977 - utils.py:   16 - DEBUG - History message: {'content': 'How to use FLAML for a classification task and train the model in 30 seconds. Use spark to parallel the training. Force cancel jobs if time limit is reached.', 'role': 'user'}
2024-02-26 06:11:04,977 - utils.py:   16 - DEBUG - History message: {'content': 'You can use FLAML for a classification task and train the model in 30 seconds while using Spark to parallelize the training. Force cancel jobs if the time limit is reached using the following code snippet:\n\n```python\nimport pandas as pd\nfrom pyspark.sql import SparkSession\nfrom flaml import AutoML\n\nspark = SparkSession.builder \\\n    .appName("FLAML") \\\n    .config("spark.sql.execution.arrow.enabled","true") \\\n    .getOrCreate()\n\ntrain_data = spark.read.format("csv").option("header", "true").option(\n    "inferSchema", "true").load("train.csv")\n\ntrain_pd = train_data.toPandas()\nX_train = train_pd.drop(columns=[\'target\'])\ny_train =

### Example 2

[Back to top](#table-of-contents)

Use RagAgent to answer a multi-hop question.

Problem: The common authors of the paper FLAML and paper AutoGen.
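The debug log for this run shows the question being classified as multi-hop and refined into three sub-questions before the database is queried. A minimal sketch of that pattern follows: retrieve for each sub-question separately, then merge the results without duplicates. The sub-questions are taken from the log, while the toy documents, the word-overlap scoring, and both helper functions are hypothetical stand-ins for what `RagAgent` does internally.

```python
# Hypothetical toy documents; they deliberately contain no real author names.
DOCS = {
    "flaml.md": "FLAML paper authors are listed in this file",
    "autogen.md": "AutoGen paper authors are listed in this file",
    "spark.md": "Spark integration guide for parallel training",
}

# Sub-questions as they appear in the debug log of this run.
SUB_QUERIES = [
    "Who are the authors of the paper FLAML",
    "Who are the authors of the paper AutoGen",
    "Are there any authors common to both papers",
]


def retrieve(query, docs, top_k=1):
    """Return the names of the top_k documents sharing the most distinct words with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(text.lower().split())), name)
              for name, text in docs.items()]
    # Stable sort: ties keep the insertion order of `docs`.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


def multihop_retrieve(sub_queries, docs, top_k=1):
    """Retrieve per sub-question and merge results, keeping first-seen order."""
    seen, merged = set(), []
    for query in sub_queries:
        for name in retrieve(query, docs, top_k):
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


merged = multihop_retrieve(SUB_QUERIES, DOCS)
print(merged)
```

Querying per sub-question lets each hop pull in the document it needs, which a single query over the original question might miss.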

In [5]:
# Reset the assistant. Always reset it before starting a new conversation.
rag.reset()

qa_problem = "The common authors of the paper FLAML and paper AutoGen."
rag.rag_filter_document = None  # clear filter
_ = userproxy.initiate_chat(rag, message=qa_problem)

userproxy (to rag):

The common authors of the paper FLAML and paper AutoGen.

--------------------------------------------------------------------------------


2024-02-26 06:11:04,989 - utils.py:   16 - DEBUG - is_code_execution_result: False
2024-02-26 06:11:04,989 - utils.py:   16 - DEBUG - Input message: The common authors of the paper FLAML and paper AutoGen.
2024-02-26 06:11:04,990 - utils.py:   16 - DEBUG - Received raw message: The common authors of the paper FLAML and paper AutoGen.
2024-02-26 06:11:04,990 - utils.py:   16 - DEBUG - Performing RAG for message: The common authors of the paper FLAML and paper AutoGen.
2024-02-26 06:11:05,016 - utils.py:   16 - DEBUG - Task predicted: multihop
2024-02-26 06:11:05,017 - utils.py:   16 - DEBUG - __call__ took 0.03 seconds.
2024-02-26 06:11:05,017 - utils.py:   16 - DEBUG - Refined message for db query: ['Who are the authors of the paper FLAML', 'Who are the authors of the paper AutoGen', 'Are there any authors common to both papers']
2024-02-26 06:11:05,080 - utils.py:   16 - DEBUG - retrieve_docs took 0.06 seconds.
2024-02-26 06:11:

rag (to userproxy):

The common authors of the paper FLAML and paper AutoGen are Qingyun Wu, Erkang Zhu, and Chi Wang.

Source: https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md /datadrive/autogen/notebook/../website/docs/Research.md /datadrive/autogen/notebook/../website/docs/Ecosystem.md

--------------------------------------------------------------------------------
userproxy (to rag):



--------------------------------------------------------------------------------


2024-02-26 06:11:05,137 - utils.py:   16 - DEBUG - History message: {'content': 'The common authors of the paper FLAML and paper AutoGen.', 'role': 'user'}
2024-02-26 06:11:05,138 - utils.py:   16 - DEBUG - History message: {'content': 'The common authors of the paper FLAML and paper AutoGen are Qingyun Wu, Erkang Zhu, and Chi Wang.\n\nSource: https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md /datadrive/autogen/notebook/../website/docs/Research.md /datadrive/autogen/notebook/../website/docs/Ecosystem.md', 'role': 'assistant'}
2024-02-26 06:11:05,138 - utils.py:   16 - DEBUG - History message: {'content': 'The common authors of the paper FLAML and paper AutoGen are Qingyun Wu, Erkang Zhu, and Chi Wang.\n\nSource: https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md /datadrive/autogen/notebook/../website/docs/Research.md /datadrive/autogen/notebook/../website/docs/Ecosystem.md', 'role': 'assistant'}
2024-02-26 06:11:05,