# RetrieveChat-based FinRobot-RAG

In this demo, we showcase the RAG use case of FinRobot, which inherits from AutoGen's RetrieveChat implementation.

In [6]:
import autogen
from finrobot.agents.workflow import SingleAssistantRAG

For the OpenAI configuration, rename `OAI_CONFIG_LIST_sample` to `OAI_CONFIG_LIST` and replace the API keys with your own.

In [3]:
# Read OpenAI API keys from a JSON file
llm_config = {
    "config_list": autogen.config_list_from_json(
        "../OAI_CONFIG_LIST",
        filter_dict={"model": ["gpt-4-0125-preview"]},
    ),
    "timeout": 120,
    "temperature": 0,
}
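
For readers unfamiliar with the file format: `OAI_CONFIG_LIST` is a JSON list with one entry per model. The sketch below writes a sample file with placeholder keys and filters it by model name using plain Python. The `filter_config_list` helper is hypothetical and only mimics the effect of `filter_dict`; it is not autogen's implementation, and the second model entry is an assumption added for illustration.

```python
import json
import os
import tempfile

# Hypothetical file contents: one entry per model, with placeholder API keys.
sample_config = [
    {"model": "gpt-4-0125-preview", "api_key": "<your OpenAI API key here>"},
    {"model": "gpt-3.5-turbo", "api_key": "<your OpenAI API key here>"},
]


def filter_config_list(config_list, filter_dict):
    """Keep entries whose value for every key in filter_dict is in the allowed list."""
    return [
        cfg
        for cfg in config_list
        if all(cfg.get(key) in allowed for key, allowed in filter_dict.items())
    ]


# Write the sample to a temporary file and read it back, standing in for OAI_CONFIG_LIST.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "OAI_CONFIG_LIST")
    with open(path, "w") as f:
        json.dump(sample_config, f)
    with open(path) as f:
        loaded = json.load(f)

selected = filter_config_list(loaded, {"model": ["gpt-4-0125-preview"]})
print([cfg["model"] for cfg in selected])
```

If the filter matches no entry, the resulting list is empty, so it is worth checking that the model name in the file matches the one in `filter_dict`.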

From `finrobot.agents.workflow` we import `SingleAssistantRAG`, which takes a `retrieve_config` as input.
For `docs_path`, we first use the PDF report generated in [this notebook](./agent_annual_report.ipynb).

For more configuration, refer to [autogen's documentation](https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/retrieve_user_proxy_agent)
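
As a rough illustration of what a `retrieve_config` looks like, the sketch below builds one with a small helper that checks a few values up front. The helper `make_retrieve_config` is hypothetical and not part of FinRobot or autogen. The allowed `task` values are an assumption based on autogen's documentation, and any extra keys are passed through unchanged.

```python
from typing import Any, Dict, List


def make_retrieve_config(
    docs_path: List[str],
    chunk_token_size: int = 2000,
    task: str = "qa",
    **extra: Any,
) -> Dict[str, Any]:
    """Build a retrieve_config dict, failing early on obviously invalid values."""
    # Assumed set of task values; check autogen's documentation for your version.
    if task not in {"code", "qa", "default"}:
        raise ValueError(f"unexpected task: {task!r}")
    if not docs_path:
        raise ValueError("docs_path must contain at least one file path or URL")
    if chunk_token_size <= 0:
        raise ValueError("chunk_token_size must be positive")
    config: Dict[str, Any] = {
        "task": task,
        "docs_path": list(docs_path),
        "chunk_token_size": chunk_token_size,
    }
    # Extra keys (for example get_or_create or overwrite) are passed through as given.
    config.update(extra)
    return config


config = make_retrieve_config(
    ["../report/Microsoft_Annual_Report_2023.pdf"],
    get_or_create=True,
)
print(config)
```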

Then, let's do a simple Q&A.

In [4]:
assistant = SingleAssistantRAG(
    "Data_Analyst",
    llm_config,
    human_input_mode="ALWAYS",
    retrieve_config={
        "task": "qa",
        "docs_path": [
            "../report/Microsoft_Annual_Report_2023.pdf",
        ],
        "chunk_token_size": 2000,
        "get_or_create": True,
    },
)
assistant.chat("How's msft's 2023 income? Provide with some analysis.")

Trying to create collection.


2024-05-29 16:04:17,881 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 1 chunks.
2024-05-29 16:04:17,884 - autogen.agentchat.contrib.vectordb.chromadb - INFO - No content embedding is provided. Will use the VectorDB's embedding function to generate the content embedding.
Number of requested results 20 is greater than number of elements in index 1, updating n_results = 1


VectorDB returns doc_ids:  [['395fb8b9']]
Adding content of doc 395fb8b9 to context.
User_Proxy (to Data_Analyst):

You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
You must give as short an answer as possible.

User's question is: How's msft's 2023 income? Provide with some analysis.

Context is: Equity Research Report: Microsoft Corporation
Income Summarization
The company experienced a 7% Year-over-Year increase in revenue, driven by
significant contributions from its Intelligent Cloud and Productivity and Business
Processes segments, indicating a strong demand for cloud-based solutions and
productivity software. Despite the revenue growth, the Cost of Goods Sold (COGS)
increased by 5%, suggesting a need for closer cost control measures to improve cost
efficiency and mainta

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Data_Analyst (to User_Proxy):

TERMINATE

--------------------------------------------------------------------------------
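
The `huggingface/tokenizers` warning in the output above names its own remedy: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming you are happy to disable parallelism and that this runs at the top of the notebook, before any library that loads the tokenizers is imported:

```python
import os

# Set this before importing libraries that load huggingface tokenizers,
# so the value is already in place when the process forks.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```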


Next, we try a more complex case: we pass in MSFT's 10-K report.

Let's see how the agent works this out.
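
A filing this long is split into many chunks bounded by `chunk_token_size`. The sketch below gives a rough idea of such splitting. It is an illustration under stated assumptions, not autogen's implementation: whitespace-separated words stand in for real tokens, and `split_into_chunks` is a hypothetical helper. Lines are packed greedily, and a single line that exceeds the budget is broken into budget-sized pieces.

```python
from typing import List


def split_into_chunks(text: str, max_tokens: int) -> List[str]:
    """Greedily pack lines into chunks of at most max_tokens whitespace tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    chunks: List[str] = []
    current: List[str] = []  # lines collected for the chunk being built
    current_tokens = 0
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue  # skip empty lines
        # A line larger than the budget is broken into budget-sized pieces.
        pieces = [words[i : i + max_tokens] for i in range(0, len(words), max_tokens)]
        for piece in pieces:
            # Start a new chunk when this piece would overflow the current one.
            if current and current_tokens + len(piece) > max_tokens:
                chunks.append("\n".join(current))
                current, current_tokens = [], 0
            current.append(" ".join(piece))
            current_tokens += len(piece)
    if current:
        chunks.append("\n".join(current))
    return chunks


sample_text = "a b c\nd e\nf g h i j k l"
for chunk in split_into_chunks(sample_text, max_tokens=4):
    print(repr(chunk))
```

With a budget of 4, the seven-word last line cannot fit in one chunk, so it is split across two.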

In [5]:
assistant = SingleAssistantRAG(
    "Data_Analyst",
    llm_config,
    human_input_mode="ALWAYS",
    retrieve_config={
        "task": "qa",
        "docs_path": [
            # "../report/Microsoft_Annual_Report_2023.pdf",
            "https://www.sec.gov/Archives/edgar/data/789019/000095017023035122/msft-20230630.htm",
        ],
        "chunk_token_size": 2000,
        "overwrite": True,
    },
)
assistant.chat("How's msft's 2023 income? Provide with some analysis.")

Trying to create collection.


max_tokens is too small to fit a single line of text. Breaking this line:
	 ...
Failed to split docs with must_break_at_empty_line being True, set to False.
max_tokens is too small to fit a single line of text. Breaking this line:
	0000789019falseFY--06-30P2YP5YP3YP1YP1Yhttp://fasb.org/us-gaap/2023#DerivativeAssetshttp://fasb.org/ ...
max_tokens is too small to fit a single line of text. Breaking this line:
	89019falseFY--06-30P2YP5YP3YP1YP1Yhttp://fasb.org/us-gaap/2023#DerivativeAssetshttp://fasb.org/us-ga ...
max_tokens is too small to fit a single line of text. Breaking this line:
	falseFY--06-30P2YP5YP3YP1YP1Yhttp://fasb.org/us-gaap/2023#DerivativeAssetshttp://fasb.org/us-gaap/20 ...
max_tokens is too small to fit a single line of text. Breaking this line:
	FY--06-30P2YP5YP3YP1YP1Yhttp://fasb.org/us-gaap/2023#DerivativeAssetshttp://fasb.org/us-gaap/2023#De ...
max_tokens is too small to fit a single line of text. Breaking this line:
	6-30P2YP5YP3YP1YP1Yhttp://fasb.org/us-gaap/2023#

VectorDB returns doc_ids:  [['eea01a55', '740b762b', '66d1c800', 'fbf08d25', '4192dee7', '84c5d64c', '66f0b48d', '9887b38f', 'd7c05c64', '965f6567', '5af22498', '25347883', '73ecd3b8', 'f926aed7', 'ffd75c82', '09235865', '31b1bc84', '86497d1b', '698a4f6f', '90cab907']]
Adding content of doc eea01a55 to context.
Skip doc_id 740b762b as it is too long to fit in the context.
Adding content of doc 66d1c800 to context.
User_Proxy (to Data_Analyst):

You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
You must give as short an answer as possible.

User's question is: How's msft's 2023 income? Provide with some analysis.

Context is: msft:
| 10.19\* |  | [Microsoft Corporation Executive Incentive Plan](https://www.sec.gov/Archives/edgar/data/789019/000119312516742796/d