# Building a Financial Analyst Assistant Using Conversational RAG

In this notebook we will build a financial analyst assistant using the AI21 Conversational RAG endpoint.
If you're interested in learning more about this endpoint, you can read this [blog post](https://www.ai21.com/blog/conversational-ai-with-rag) or watch a [video intro](https://www.youtube.com/watch?v=-gZ7W6E0cGc).

The diagram below shows the architecture behind AI21's Conversational RAG endpoint: the Execution Engine decides whether to route each incoming query to the Retrieval Engine or directly to the LLM.

![title](convrag_architecture.png)

## Imports & Setup

We will use the AI21 SDK throughout this notebook.

In [1]:
!pip install -U ai21


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.3.1[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3.9 -m pip install --upgrade pip[0m


In [2]:
import os
import requests
import json
import getpass
import ai21
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
from ai21.models.responses.conversational_rag_response import ConversationalRagResponse

Enter your AI21 API key. You can get it from your [account page](https://studio.ai21.com/account/api-key).

In [3]:
AI21_API_KEY = getpass.getpass(prompt='Enter your AI21 api key:')

Enter your AI21 api key:········


This notebook uses the AI21 SDK for the entire process. First, we create an AI21 client.

In [4]:
client = AI21Client(api_key=AI21_API_KEY)

## Data Preparation

To simulate organizational data for finance, we will download five of Amazon's 10-K forms from its [Investor Relations](https://ir.aboutamazon.com/sec-filings/default.aspx) page.

In [None]:
# Get the data - download Amazon's (AMZN) 10-K forms for the last five years
os.makedirs("data", exist_ok=True)  # does not fail if the folder already exists
!wget 'https://d18rn0p25nwr6d.cloudfront.net/CIK-0001018724/c7c14359-36fa-40c3-b3ca-5bf7f3fa0b96.pdf' -O 'data/amazon_2023.pdf'
!wget 'https://d18rn0p25nwr6d.cloudfront.net/CIK-0001018724/d2fde7ee-05f7-419d-9ce8-186de4c96e25.pdf' -O 'data/amazon_2022.pdf'
!wget 'https://d18rn0p25nwr6d.cloudfront.net/CIK-0001018724/f965e5c3-fded-45d3-bbdb-f750f156dcc9.pdf' -O 'data/amazon_2021.pdf'
!wget 'https://d18rn0p25nwr6d.cloudfront.net/CIK-0001018724/336d8745-ea82-40a5-9acc-1a89df23d0f3.pdf' -O 'data/amazon_2020.pdf'
!wget 'https://d18rn0p25nwr6d.cloudfront.net/CIK-0001018724/4d39f579-19d8-4119-b087-ee618abf82d6.pdf' -O 'data/amazon_2019.pdf'
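
A failed download can leave behind an empty file or an HTML error page saved under a `.pdf` name. The following is a minimal sketch of a sanity check, assuming only that a valid PDF starts with the bytes `%PDF-`; the helper name `find_invalid_pdfs` is hypothetical and not part of the AI21 SDK.

```python
from pathlib import Path

# Every valid PDF file begins with this marker.
PDF_MAGIC = b"%PDF-"


def find_invalid_pdfs(directory):
    """Return the names of files in `directory` that do not start with the PDF header."""
    invalid = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            header = handle.read(len(PDF_MAGIC))
        if header != PDF_MAGIC:
            invalid.append(path.name)
    return invalid


# In this notebook the check would be run on the download folder:
# print(find_invalid_pdfs("data"))
```

An empty list means every file at least looks like a PDF; any names it returns are worth downloading again before uploading.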

Upload the files to the RAG Engine. Note that we attach a label to every file. Labels organize the library and let us restrict later searches to the relevant documents, which makes retrieval more efficient and more accurate.

In [None]:
file_list = [os.path.join("data", f) for f in os.listdir("data")]
for file in file_list:
    response = client.library.files.create(file_path=file, labels=['10k_example'])
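
The loop above stops at the first upload that raises, and it also tries to upload any stray non-PDF file in the folder. Below is a hedged sketch of a more forgiving variant. The helper `upload_pdfs` is hypothetical; it takes the upload function as a parameter (here, `client.library.files.create` with the same `file_path` and `labels` keyword arguments used above), so it can also be exercised without network access.

```python
import os


def upload_pdfs(directory, upload_fn, labels):
    """Upload every .pdf in `directory` via `upload_fn`; return (uploaded, failed)."""
    uploaded, failed = [], []
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(".pdf"):
            continue  # skip stray files that are not PDFs
        path = os.path.join(directory, name)
        try:
            upload_fn(file_path=path, labels=labels)
            uploaded.append(path)
        except Exception as err:  # broad on purpose: no SDK exception types are assumed
            failed.append((path, str(err)))
    return uploaded, failed


# In this notebook the call would look like:
# uploaded, failed = upload_pdfs("data", client.library.files.create, ["10k_example"])
```

Collecting failures instead of aborting makes it easy to retry only the files that did not go through.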

## Setting up Conversational RAG

Now let’s set up the system. The endpoint has the same interface as the chat API; the retrieval process is transparent to the user and visible only through extra fields in the response. There are several ways to work with chat APIs. Here we keep the conversation history in a global variable. We also define a default answer for the case where the question should be answered from the documents but the information is not there:

In [5]:
# Global variable for maintaining the history
conversation_history = []
DEFAULT_RESPONSE = "I'm sorry, I cannot answer your questions based on the documents I have access to."

def call_convrag(message, labels):
    # Add the user's message to the history so the full conversation is sent to the API
    conversation_history.append(ChatMessage(content=message, role="user"))

    try:
        chat_response = client.beta.conversational_rag.create(
            messages=conversation_history,
            labels=labels
        )
        
    except Exception as err:
        print(f"Error occurred: {err}")
        conversation_history.pop()
        return
    
    if chat_response.context_retrieved and not chat_response.answer_in_context:
        conversation_history.append(ChatMessage(content=DEFAULT_RESPONSE, role="assistant"))
    else:
        conversation_history.append(ChatMessage(content=chat_response.choices[0].content, role="assistant"))

    return chat_response


def print_convrag_response(chat_response: ConversationalRagResponse):
    # Use the function argument rather than the global `response` variable
    if chat_response.context_retrieved and not chat_response.answer_in_context:
        print(DEFAULT_RESPONSE)
    else:
        print(chat_response.choices[0].content)
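
A global history works for a single conversation in a notebook. As an alternative, here is a minimal sketch that keeps the history inside an object, so several independent conversations can coexist. The class `ConvRagSession` and its parameters are hypothetical. The sketch assumes the response object exposes `context_retrieved`, `answer_in_context`, and `choices[0].content` as used above, and it builds messages with plain `dict` by default; in the notebook you would pass `ChatMessage` as `message_factory` and `client.beta.conversational_rag.create` as `create_fn`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class ConvRagSession:
    create_fn: Callable[..., Any]            # e.g. client.beta.conversational_rag.create
    labels: List[str]
    message_factory: Callable[..., Any] = dict  # e.g. ChatMessage in the notebook
    default_response: str = (
        "I'm sorry, I cannot answer your questions based on the documents I have access to."
    )
    history: List[Any] = field(default_factory=list)

    def ask(self, message: str) -> Optional[Any]:
        """Send `message` with the history so far; return the raw response, or None on error."""
        self.history.append(self.message_factory(content=message, role="user"))
        try:
            response = self.create_fn(messages=self.history, labels=self.labels)
        except Exception as err:
            print(f"Error occurred: {err}")
            self.history.pop()  # drop the unanswered user message
            return None

        if response.context_retrieved and not response.answer_in_context:
            reply = self.default_response
        else:
            reply = response.choices[0].content
        self.history.append(self.message_factory(content=reply, role="assistant"))
        return response


# In this notebook a session could be created as:
# session = ConvRagSession(client.beta.conversational_rag.create, ["10k_example"],
#                          message_factory=ChatMessage)
```

Passing the API call in as a parameter is a design choice made for this sketch: it keeps the class easy to test with a stand-in function.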

Now let's interact with our assistant. Note that we pass the same label in every call, so the search covers only the relevant documents and is faster and more accurate. We start with a simple hello:

In [6]:
message = "Hello, how are you?"

response = call_convrag(message=message, labels=['10k_example'])

print_convrag_response(response)

 Hello! I'm here and ready to assist you. How can I help you today?


As you can see, we get a generic answer: there is no need to retrieve any of our organizational data to answer this message. Now let's ask a question that does require the documents:

In [11]:
message = "I want to do some research about Amazon in the last couple of years. Let's start with an easy one - how many employees did Amazon have by the end of 2023?"

response = call_convrag(message=message, labels=['10k_example'])

print_convrag_response(response)

 1,525,000.


As you can see, in this case the model went through the retrieval process, and generated an answer based on our organizational data. You can also look at the full response:

In [12]:
response.dict()

{'id': 'c161940a-fe82-40df-8dda-8afbe7e6cf64',
 'choices': [{'role': 'assistant', 'content': ' 1,525,000.'}],
 'search_queries': ['How many employees did Amazon have by the end of 2023?'],
 'context_retrieved': True,
 'answer_in_context': True,
 'sources': [{'text': 'Our employees are critical to our mission of being Earth’s most customer-centric company. As of December 31, 2020, we employed approximately 1,298,000 full-time and part-time employees. Additionally, we utilize independent contractors and temporary personnel to supplement our workforce. Competition for qualified personnel has historically been intense, particularly for software engineers, computer scientists, and other technical staff.\nWe focus on investment and innovation, inclusion and diversity, safety, and engagement to hire and develop the best talent. We rely on numerous and evolving initiatives to implement these objectives and invent mechanisms for talent development, including industry-leading pay and benefits, s

Two fields in the response are worth a closer look:

**context_retrieved**: indicates whether the execution engine decided to route the message to the Retrieval Engine and use the top retrieved segments as context (_True_), or the answer is determined solely by the LLM (_False_).

**answer_in_context**: only relevant when **context_retrieved** = _True_. If _False_, the model cannot answer the user's question based on the retrieved context. This means that either your documents do not contain the needed information, or you should adjust some of the retrieval parameters.
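
Taken together, the two flags describe three possible outcomes. The sketch below spells them out; the function `classify_response` and its label strings are hypothetical, and the only assumption about the response object is that it exposes the two boolean attributes described above.

```python
def classify_response(response):
    """Map the two response flags to one of three outcomes."""
    if not response.context_retrieved:
        # The query was answered by the LLM alone, without retrieval.
        return "llm_only"
    if response.answer_in_context:
        # Retrieval ran and the answer is grounded in the retrieved segments.
        return "grounded"
    # Retrieval ran, but the retrieved segments do not contain the answer.
    return "not_in_documents"
```

Only the last outcome should trigger the default "cannot answer" message; in the first case `answer_in_context` carries no information and is ignored.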

Let’s continue the conversation with a more complex question:

In [24]:
message = "Thanks. Any major stock events I should know about, including values and splits?"

response = call_convrag(message=message, labels=['10k_example'])

print_convrag_response(response)

 In 2022, Amazon.com, Inc. announced a 20-for-1 stock split of its common stock, which was also accompanied by an increase in the number of authorized shares of common stock. This stock split was reflected in all share, restricted stock unit ("RSU"), and per share or per RSU information throughout the company's Annual Report on Form 10-K.


Looking good! Note that this answer combines knowledge from several different sources in our database. You can look at the **sources** field in the response to see them.
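
Source passages can be long, so a short preview is often easier to scan. Here is a minimal sketch that works on the dictionary returned by `response.dict()`. It assumes only what the output above shows, namely a `sources` list whose items have a `text` key; the helper `preview_sources` is hypothetical.

```python
def preview_sources(response_dict, max_chars=120):
    """Return one short, single-line preview per retrieved source."""
    previews = []
    for index, source in enumerate(response_dict.get("sources") or [], start=1):
        # Collapse newlines and repeated spaces so each preview fits on one line.
        text = " ".join(source.get("text", "").split())
        if len(text) > max_chars:
            text = text[:max_chars].rstrip() + "..."
        previews.append(f"[{index}] {text}")
    return previews


# In this notebook:
# for line in preview_sources(response.dict()):
#     print(line)
```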

Feeling emboldened by this success, we may be tempted to ask more questions:

In [25]:
message = "How does it compare to Google's stock?"

response = call_convrag(message=message, labels=['10k_example'])

print_convrag_response(response)

I'm sorry, I cannot answer your questions based on the documents I have access to.


Why does the system answer like this? Because the documents in our library are Amazon's 10-K forms for 2019 to 2023, and they contain no information about Google's stock. The API clearly indicates that the answer is not in any of the documents, which makes this case easy to detect and lets you return a predefined "no answer" message.

You can see the full conversation history below, as stored in our global variable:

In [26]:
conversation_history

[ChatMessage(role='user', content='Hello, how are you?'),
 ChatMessage(role='assistant', content=" Hello! I'm here and ready to assist you. How can I help you today?"),
 ChatMessage(role='user', content="I want to do some research about Amazon in the last couple of years. Let's start with an easy one - how many employees did Amazon have by the end of 2023?"),
 ChatMessage(role='assistant', content=' 1,525,000.'),
 ChatMessage(role='user', content='Thanks. Any major stock events I should know about, including values and splits?'),
 ChatMessage(role='assistant', content=' In 2022, Amazon.com, Inc. announced a 20-for-1 stock split of its common stock, which was also accompanied by an increase in the number of authorized shares of common stock. This stock split was reflected in all share, restricted stock unit ("RSU"), and per share or per RSU information throughout the company\'s Annual Report on Form 10-K.'),
 ChatMessage(role='user', content="How does it compare to Google's stock?"),
 C