# RAG with Unstructured, LangChain, & KDB.AI
##### Note: This example requires a KDB.AI endpoint and API key. Sign up for a free [KDB.AI account](https://kdb.ai/get-started).

> [KDB.AI](https://kdb.ai/) is a powerful knowledge-based vector database and search engine that allows you to build scalable, reliable AI applications, using real-time data, by providing advanced search, recommendation and personalization.

PDFs and other complex document types are notoriously difficult to work with, yet they are among the most common formats for publishing important business-related information. Because these file types are so common, it is key to be able to parse and ingest them swiftly and accurately while cleanly extracting embedded entities such as images, tables, and graphs. If extracted correctly, the data held in a complex document like a PDF can be ingested into a RAG workflow to generate accurate, contextual responses for users and the business.

This sample will illustrate how to use Unstructured, a complex document parsing technology, to ingest complex documentation, partition it into useful elements, perform chunking and embedding, and finally store the embeddings in KDB.AI. After this, we can complete a RAG pipeline with LangChain and query the KDB.AI vector database to retrieve the most relevant elements and pass them to an LLM to generate a response.

We will focus on enriching table elements with context and standardized formatting to improve retrieval and generation.

Agenda:
1. Dependencies, Imports & Setup
2. Use Unstructured to Process Complex PDF Documentation
3. Embed Extracted Elements with OpenAI Embedding Model
4. Define KDB.AI Session
5. Create Schema and KDB.AI Table
6. Use LangChain and KDB.AI to Perform RAG!

## 1. Dependencies, Imports & Setup

In order to successfully run this sample, note the following steps depending on where you are running this notebook:

- ***Run Locally / Private Environment:*** The [Setup](https://github.com/KxSystems/kdbai-samples/blob/main/README.md#setup) steps in the repository's `README.md` will guide you on prerequisites and how to run this with Jupyter.

- ***Colab / Hosted Environment:*** Open this notebook in Colab and run through the cells.

In [None]:
!apt-get -qq install poppler-utils tesseract-ocr
%pip install -q --user --upgrade pillow
%pip install -q --upgrade unstructured["all-docs"]
%pip install pymupdf
%pip install kdbai_client
%pip install langchain-openai
%pip install langchain
%pip install langchain-community
%pip install --upgrade nltk

In [None]:
from unstructured.partition.pdf import partition_pdf
from unstructured.partition.auto import partition
from unstructured.embed.openai import OpenAIEmbeddingConfig, OpenAIEmbeddingEncoder
import fitz
from langchain_openai import OpenAIEmbeddings
import kdbai_client as kdbai
from langchain_community.vectorstores import KDBAI
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

Get OpenAI API key here:
- [OpenAI](https://platform.openai.com/api-keys)

In [None]:
import os
from getpass import getpass
# Set the OpenAI API key: reuse it from the environment if it is already there
if "OPENAI_API_KEY" in os.environ:
    OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
else:
    # Prompt the user to enter the API key
    OPENAI_API_KEY = getpass("OPENAI API KEY: ")
    # Save the API key as an environment variable for the current session
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

OPENAI API KEY: ··········
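The same "environment variable first, prompt second" pattern is reused for the KDB.AI credentials later in this notebook. As a minimal sketch, it could be wrapped in a small helper. The name `get_secret` is hypothetical (it is not part of any library used here), and the `prompt_fn` parameter exists only so the helper can be exercised without interactive input.

```python
import os
from getpass import getpass


def get_secret(name, prompt_fn=getpass):
    """Return os.environ[name], prompting once and caching it if it is missing.

    `get_secret` is a hypothetical helper for this notebook, not a library API.
    """
    value = os.environ.get(name)
    if not value:
        # Nothing usable in the environment: ask the user (getpass hides the input)
        value = prompt_fn(f"{name}: ")
        # Cache for the rest of the session so later cells can read it
        os.environ[name] = value
    return value
```

With this in place, a cell could simply call `OPENAI_API_KEY = get_secret("OPENAI_API_KEY")`.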




#### Download Earnings Report

In [None]:
!wget 'https://s21.q4cdn.com/399680738/files/doc_news/Meta-Reports-Second-Quarter-2024-Results-2024.pdf' -O './doc1.pdf'

--2024-08-30 13:53:10--  https://s21.q4cdn.com/399680738/files/doc_news/Meta-Reports-Second-Quarter-2024-Results-2024.pdf
Resolving s21.q4cdn.com (s21.q4cdn.com)... 199.254.199.17, 2605:6440:8000:1:199:254:199:17
Connecting to s21.q4cdn.com (s21.q4cdn.com)|199.254.199.17|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 195613 (191K) [application/pdf]
Saving to: ‘./doc1.pdf’


2024-08-30 13:53:10 (1.72 MB/s) - ‘./doc1.pdf’ saved [195613/195613]



## 2. Use Unstructured to Process Complex PDF Documentation

1. Read in data
2. Partition using the 'hi_res' strategy
3. Chunk

In [None]:
!mkdir './tables'

In [None]:
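elements = partition_pdf(
    './doc1.pdf',
    strategy="hi_res",             # uses a layout detection model to identify elements such as tables
    chunking_strategy="by_title",  # start a new chunk at each detected title
    max_characters=2500,           # hard limit on chunk length
    new_after_n_chars=2300,        # soft limit: prefer to close a chunk once it reaches this length
)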
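In the call above, `max_characters` acts as a hard cap on chunk size, while `new_after_n_chars` is a soft cap: once a chunk has reached that length, a new chunk is started instead of adding more text. The toy sketch below illustrates only this soft/hard interplay on plain strings. It is not Unstructured's implementation: `pack_chunks` is a hypothetical function, and it ignores title boundaries and the splitting of oversized elements.

```python
def pack_chunks(pieces, max_characters=2500, new_after_n_chars=2300, sep="\n\n"):
    """Greedily pack text pieces into chunks (toy model, not Unstructured's code).

    A chunk is closed when adding the next piece would exceed the hard limit,
    or when the chunk has already reached the soft limit.
    """
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        too_big = len(candidate) > max_characters
        soft_full = len(current) >= new_after_n_chars
        if current and (too_big or soft_full):
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


pieces = ["a" * 1000, "b" * 1000, "c" * 1000, "d" * 200]
for chunk in pack_chunks(pieces):
    print(len(chunk))
```

Here the third piece would push the first chunk past 2500 characters, so it opens a second chunk instead.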

#### Explore the extracted elements

In [None]:
from collections import Counter
display(Counter(type(element) for element in elements))

Counter({unstructured.documents.elements.CompositeElement: 17,
         unstructured.documents.elements.Table: 10})

In [None]:
for element in elements:
  print(type(element))

In [None]:
for element in elements:
  if element.to_dict()['type'] == 'Table':
    print(element.text)

Three Months Ended June 30, In millions, except percentages and per share amounts 2024 2023 % Change Revenue Costs and expenses Income from operations Operating margin Provision for income taxes Effective tax rate Net income $ 39,071 24,224 $ 14,847 38 % $ 1,641 11 % $ 13,465 $ 5.16 $ 31,999 22,607 $ 9,392 29 % $ 1,505 16 % $ 7,788 $ 2.98 22 % 7 % 58 % 9 % 73 % Diluted earnings per share (EPS) 73 %
2024 2023 2024 2023 Revenue Costs and expenses: Cost of revenue Research and development Marketing and sales General and administrative (1) Total costs and expenses Income from operations Interest and other income (expense), net Income before provision for income taxes Provision for income taxes Net income $ 39,071 7,308 10,537 2,721 3,658 24,224 14,847 259 15,106 1,641 $ 13,465 5,945 9,344 3,154 4,164 13,948 20,515 5,285 7,114 22,607 46,862 9,392 (99) 9,293 1,505 28,665 624 29,289 3,455
Diluted Weighted-average shares used to compute earnings per share: Basic $ 5.31 $ 5.16 2,534 2,610 2,568

#### What a table element looks like after extraction:

In [None]:
print(elements[-2])

Foreign exchange effect on 2024 revenue using 2023 rates Revenue excluding foreign exchange effect GAAP revenue year-over-year change % Revenue excluding foreign exchange effect year-over-year change % GAAP advertising revenue Foreign exchange effect on 2024 advertising revenue using 2023 rates Advertising revenue excluding foreign exchange effect 2024 $ 39,071 371 $ 39,442 22 % 23 % $ 38,329 367 $ 38,696 22 % 2023 $ 31,999 $ 31,498 2024 $ 75,527 265 $ 75,792 25 % 25 % $ 73,965 261 $ 74,226 24 % 2023 GAAP advertising revenue year-over-year change % Advertising revenue excluding foreign exchange effect year-over-year change % 23 % 25 % Net cash provided by operating activities Purchases of property and equipment, net Principal payments on finance leases $ 19,370 (8,173) (299) $ 10,898 $ 17,309 (6,134) (220) $ 10,955 $ 38,616 (14,573) (614) $ 23,429


## 3. Embed Extracted Elements with OpenAI Embedding Model


In [None]:
from unstructured.embed.openai import OpenAIEmbeddingConfig, OpenAIEmbeddingEncoder

embedding_encoder = OpenAIEmbeddingEncoder(
    config=OpenAIEmbeddingConfig(
      api_key=os.getenv("OPENAI_API_KEY"),
      model_name="text-embedding-3-small",
    )
)
elements = embedding_encoder.embed_documents(
    elements=elements
)

### Store original elements in a dataframe

In [None]:
import pandas as pd
data = []

for c in elements:
  row = {}
  row['id'] = c.id
  row['text'] = c.text
  row['metadata'] = c.metadata.to_dict()
  row['embedding'] = c.embeddings
  data.append(row)

df_non_contextualized = pd.DataFrame(data)
df_non_contextualized.head()

| | id | text | metadata | embedding |
|---|---|---|---|---|
| 0 | 7673dd5dd3348ca922edfeb765c4f8ec | `FACEBOOK\n\nNEWS RELEASE\n\nMeta Reports Secon...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.05783626437187195, 0.015577166341245174, 0....` |
| 1 | 60c0d519ed08e8be7410846bab6d76d4 | `Three Months Ended June 30, In millions, excep...` | `{'last_modified': '2024-07-31T20:06:06', 'file...` | `[0.007137244567275047, 0.017794227227568626, 0...` |
| 2 | 1006ba147b4696dcfa364d82a7cc3ff9 | `Second Quarter 2024 Operational and Other Fina...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.07876335829496384, 0.010785269550979137, 0....` |
| 3 | 3ccaeebfca3cd0b37f89d9865ed86620 | `CFO Outlook Commentary\n\nWe expect third quar...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.06496476382017136, 0.023071356117725372, -0...` |
| 4 | f83ab884e1b7f6cd22bb6fec166375de | `About Meta\n\nMeta builds technologies that he...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.04435691237449646, -0.008723972365260124, -...` |
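Before rows like these go into a vector index, it can be worth checking that every element actually received an embedding of the expected length (1536 is the default output size of `text-embedding-3-small`). The following is a minimal sketch, assuming rows shaped like the dictionaries built in the loop above; `check_rows` is a hypothetical helper, and the sample rows are synthetic 3-dimensional vectors used only to show the behaviour.

```python
def check_rows(rows, dims=1536):
    """Return the ids of rows whose embedding is missing or has the wrong length."""
    bad = []
    for row in rows:
        emb = row.get("embedding")
        if emb is None or len(emb) != dims:
            bad.append(row["id"])
    return bad


# Tiny synthetic rows (3-dimensional vectors) just to show the behaviour
rows = [
    {"id": "a", "embedding": [0.1, 0.2, 0.3]},
    {"id": "b", "embedding": None},
    {"id": "c", "embedding": [0.1, 0.2]},
]
print(check_rows(rows, dims=3))
```

In this notebook, `check_rows(data)` would be expected to return an empty list if every element was embedded successfully.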


### Create contextualized descriptions and markdown-formatted tables; these new chunks replace the original table text

In [None]:
import os
import openai
from openai import OpenAI

# Initialize the OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def get_table_description(table_content, document_context):
    prompt = f"""
    Given the following table and its context from the original document,
    provide a detailed description of the table. Then, include the table in markdown format.

    Original Document Context:
    {document_context}

    Table Content:
    {table_content}

    Please provide:
    1. A comprehensive description of the table.
    2. The table in markdown format.
    """

    response = client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that describes tables and formats them in markdown."},
            {"role": "user", "content": prompt}
        ]
    )

    return response.choices[0].message.content

def extract_text_from_pdf(pdf_path):
    text = ""
    with fitz.open(pdf_path) as doc:
        for page in doc:
            text += page.get_text()
    return text

pdf_path = './doc1.pdf'
document_content = extract_text_from_pdf(pdf_path)

# Replace the text of each extracted Table element with its description and markdown rendering
for element in elements:
  if element.to_dict()['type'] == 'Table':
    table_content = element.to_dict()['text']

    # Get the description and markdown table from the model (gpt-4o),
    # passing the full document text as context
    result = get_table_description(table_content, document_content)
    element.text = result

print("Processing complete.")


Processing complete.
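The model returns the description and the markdown table as a single string, and the table is sometimes wrapped in a fenced `markdown` block. If you later want the table on its own (for display or validation, say), one option is to split the response on that fence. The sketch below assumes the fenced format, which the model is not guaranteed to produce, so it falls back to returning the whole text. `split_description_and_table` is a hypothetical helper, and the sample string is a shortened illustration.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can itself sit inside a fence

# Matches a fenced block with an optional language tag such as "markdown"
_BLOCK = re.compile(re.escape(FENCE) + r"[a-zA-Z]*\n(.*?)" + re.escape(FENCE), re.DOTALL)


def split_description_and_table(response_text):
    """Split an LLM response into (description, table); table is None if no fence is found."""
    match = _BLOCK.search(response_text)
    if match is None:
        return response_text.strip(), None
    description = response_text[: match.start()].strip()
    table = match.group(1).strip()
    return description, table


sample = (
    "The table compares revenue by segment.\n\n"
    + FENCE + "markdown\n"
    + "| Segment | Revenue |\n|---|---|\n| Family of Apps | $38,718 |\n"
    + FENCE
)
description, table = split_description_and_table(sample)
print(description)
print(table)
```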


## Embed Extracted Text Elements and Updated Table Elements with OpenAI Embedding Model

In [None]:
from unstructured.embed.openai import OpenAIEmbeddingConfig, OpenAIEmbeddingEncoder

embedding_encoder = OpenAIEmbeddingEncoder(
    config=OpenAIEmbeddingConfig(
      api_key=os.getenv("OPENAI_API_KEY"),
      model_name="text-embedding-3-small",
    )
)
elements = embedding_encoder.embed_documents(
    elements=elements
)

### Take a look through the new contextualized table elements:

In [None]:
for element in elements:
  if element.to_dict()['type'] == 'Table':
    print(element.text)

### Comprehensive Description of the Table

The table presents a comparative financial summary for Meta Platforms, Inc. for the three months ended on June 30, 2024, and June 30, 2023. It includes key financial metrics such as revenue, costs and expenses, income from operations, operating margin, provision for income taxes, effective tax rate, net income, and diluted earnings per share (EPS). Additionally, the table shows the percentage change for each metric from 2023 to 2024. 

Key points from the table:
- **Revenue** increased by 22% from $31,999 million in Q2 2023 to $39,071 million in Q2 2024.
- **Costs and Expenses** saw a 7% increase from $22,607 million in Q2 2023 to $24,224 million in Q2 2024.
- **Income from Operations** surged by 58%, rising from $9,392 million in Q2 2023 to $14,847 million in Q2 2024.
- **Operating Margin** improved from 29% in Q2 2023 to 38% in Q2 2024.
- **Provision for Income Taxes** experienced a 9% increase, from $1,505 million in Q2 2023 to $1,641 mill

This markdown table provides a concise presentation of the financial data, making it easy to read and comprehend in a digital format.
### Detailed Description of the Table

The table presents segment information from Meta Platforms, Inc. for both revenue and income (loss) from operations. The data is organized into two main sections:
1. **Revenue**: This section is subdivided into two categories: "Advertising" and "Other revenue". The total revenue generated from these subcategories is then summed up for two segments: "Family of Apps" and "Reality Labs". The table provides the revenue figures for three months and six months ended June 30, for the years 2024 and 2023.
2. **Income (loss) from operations**: This section shows the income or loss from operations for the "Family of Apps" and "Reality Labs" segments, again for the same time periods.

The table allows for a comparison between the two segments of Meta's business over time, illustrating the performance of each segment in terms of revenue and operational income or loss.

### The Table in Markdown Format

```markdown
### Segment Information (In millions, Unaudited)

|                            | Three Months Ended June 30, 2024 | Three Months Ended June 30, 2023 | Six Months Ended June 30, 2024 | Six Months Ended June 30, 2023 |
|----------------------------|----------------------------------|----------------------------------|------------------------------- |-------------------------------|
| **Revenue:**               |                                  |                                  |                               |                               |
| Advertising                | $38,329                          | $31,498                          | $73,965                       | $59,599                       |
| Other revenue              | $389                             | $225                             | $769                          | $430                          |
| **Family of Apps**         | $38,718                          | $31,723                          | $74,734                       | $60,029                       |
| Reality Labs               | $353                             | $276                             | $793                          | $616                          |
| **Total revenue**          | $39,071                          | $31,999                          | $75,527                       | $60,645                       |
|                            |                                  |                                  |                               |                               |
| **Income (loss) from operations:** |                                  |                                  |                               |                               |
| Family of Apps             | $19,335                          | $13,131                          | $36,999                       | $24,351                       |
| Reality Labs               | $(4,488)                         | $(3,739)                         | $(8,334)                      | $(7,732)                      |
| **Total income from operations** | $14,847                          | $9,392                           | $28,665                       | $16,619                       |
```


### Create a Pandas DataFrame to store the text and updated table elements

In [None]:
import pandas as pd
data = []

for c in elements:
  row = {}
  row['id'] = c.id
  row['text'] = c.text
  row['metadata'] = c.metadata.to_dict()
  row['embedding'] = c.embeddings
  data.append(row)

df_contextualized = pd.DataFrame(data)
df_contextualized.head()

| | id | text | metadata | embedding |
|---|---|---|---|---|
| 0 | 7673dd5dd3348ca922edfeb765c4f8ec | `FACEBOOK\n\nNEWS RELEASE\n\nMeta Reports Secon...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.05783402919769287, 0.015589269809424877, 0....` |
| 1 | 60c0d519ed08e8be7410846bab6d76d4 | `### Comprehensive Description of the Table\n\n...` | `{'last_modified': '2024-07-31T20:06:06', 'file...` | `[-0.01504612248390913, 0.0025319906417280436, ...` |
| 2 | 1006ba147b4696dcfa364d82a7cc3ff9 | `Second Quarter 2024 Operational and Other Fina...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.07872810959815979, 0.010767923668026924, 0....` |
| 3 | 3ccaeebfca3cd0b37f89d9865ed86620 | `CFO Outlook Commentary\n\nWe expect third quar...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.06507104635238647, 0.023085100576281548, -0...` |
| 4 | f83ab884e1b7f6cd22bb6fec166375de | `About Meta\n\nMeta builds technologies that he...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.044313088059425354, -0.008719196543097496, ...` |


## 4. Define KDB.AI Session
KDB.AI comes in two offerings:

- KDB.AI Cloud - For experimenting with smaller generative AI projects with a vector database in our cloud.
- KDB.AI Server - For evaluating large scale generative AI applications on-premises or on your own cloud provider.

Depending on which you use, there will be different setup steps and connection details required.

### Option 1. KDB.AI Cloud
To use KDB.AI Cloud, you will need two session details: a URL endpoint and an API key. To get these, you can [sign up for free](https://kdb.ai/get-started).

You can connect to a KDB.AI Cloud session using `kdbai.Session`, passing the session URL endpoint and API key details from your KDB.AI Cloud portal.

If the environment variables `KDBAI_ENDPOINT` and `KDBAI_API_KEY` exist on your system and contain your KDB.AI Cloud portal details, they will be used to connect automatically. If they do not exist, you will be prompted to enter your session URL endpoint and API key.

Find your KDB.AI API key and endpoint here: [KDB.AI](https://kdb.ai/)

In [None]:
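# Use KDBAI_ENDPOINT from the environment if it is set; otherwise prompt for it
if "KDBAI_ENDPOINT" in os.environ:
    KDBAI_ENDPOINT = os.environ["KDBAI_ENDPOINT"]
else:
    KDBAI_ENDPOINT = input("KDB.AI ENDPOINT: ")
    # Save the endpoint as an environment variable for the current session
    os.environ["KDBAI_ENDPOINT"] = KDBAI_ENDPOINT

# Do the same for the API key, which the connection cell below also needs
if "KDBAI_API_KEY" in os.environ:
    KDBAI_API_KEY = os.environ["KDBAI_API_KEY"]
else:
    KDBAI_API_KEY = getpass("KDB.AI API KEY: ")
    # Save the API key as an environment variable for the current session
    os.environ["KDBAI_API_KEY"] = KDBAI_API_KEY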

In [None]:
#connect to KDB.AI
session = kdbai.Session(api_key=KDBAI_API_KEY, endpoint=KDBAI_ENDPOINT)

### Option 2. KDB.AI Server
To use KDB.AI Server, you will need to download and run your own container. To do this, you will first need to sign up for free.

You will receive an email with the required license file and bearer token needed to download your instance. Follow instructions in the signup email to get your session up and running.

Once the setup steps are complete you can then connect to your KDB.AI Server session using kdbai.Session and passing your local endpoint.

In [None]:
### start session with KDB.AI Server
#session = kdbai.Session()

## 5. Create Schema and KDB.AI Table

In [None]:
schema = [
    {'name': 'id', 'type': 'str'},
    {'name': 'text', 'type': 'bytes'},
    {'name': 'metadata', 'type': 'general'},
    {'name': 'embedding', 'type': 'float32s'}
]

indexes = [{'name': 'flat_index', 'column': 'embedding', 'type': 'flat', 'params': {'dims': 1536, 'metric': 'L2'}}]
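A `flat` index performs an exhaustive search: every stored vector is compared with the query, which gives exact results and is practical for a small collection like this one. Under the `L2` metric, a smaller Euclidean distance means a closer match. The pure-Python sketch below illustrates that idea only. It is not how KDB.AI implements the index, `flat_l2_search` is a hypothetical function, and it uses 2-dimensional toy vectors rather than the 1536 dimensions configured above.

```python
import math


def flat_l2_search(vectors, query, k=2):
    """Exhaustively rank stored vectors by Euclidean (L2) distance to the query.

    `vectors` maps an id to its vector; returns the k closest (id, distance) pairs.
    """
    scored = [(vec_id, math.dist(vec, query)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


store = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [3.0, 4.0]}
print(flat_l2_search(store, [0.9, 0.0], k=2))
```

The cost grows linearly with the number of stored vectors, which is the usual trade-off against approximate index types.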

### Create two tables: one with the original table elements, the other with the newly contextualized and formatted table elements

In [None]:
Contextualized_KDBAI_TABLE_NAME = "Contextualized_Table"
non_Contextualized_KDBAI_TABLE_NAME = "Non_Contextualized_Table"
database = session.database('default')

# First ensure the tables do not already exist
for table in database.tables:
    if table.name in [Contextualized_KDBAI_TABLE_NAME, non_Contextualized_KDBAI_TABLE_NAME]:
        table.drop()

#Create the tables
table_contextualized = database.create_table(Contextualized_KDBAI_TABLE_NAME, schema=schema, indexes=indexes)
table_non_contextualized = database.create_table(non_Contextualized_KDBAI_TABLE_NAME, schema=schema, indexes=indexes)

In [None]:
# Insert Elements into the KDB.AI Tables
table_contextualized.insert(df_contextualized)
table_non_contextualized.insert(df_non_contextualized)

'Insert successful'
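This document produces only a few dozen rows, so a single insert per table is enough. For much larger collections, a common approach is to insert in batches so that each request stays small. The sketch below shows only the batching step; `iter_batches` is a hypothetical helper, and a suitable batch size is something you would need to choose for your own data.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive slices of `rows`, each with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# Works on anything that supports len() and slicing; a plain list is used here
for batch in iter_batches(list(range(7)), batch_size=3):
    print(batch)
```

For a DataFrame, slicing with `df.iloc[start:start + batch_size]` gives the same batches, each of which could then be passed to `table.insert(...)`.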

In [None]:
# Check to see that the elements were inserted
table_contextualized.query()

| | id | text | metadata | embedding |
|---|---|---|---|---|
| 0 | 7673dd5dd3348ca922edfeb765c4f8ec | `FACEBOOK\n\nNEWS RELEASE\n\nMeta Reports Secon...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.05783403, 0.01558927, 0.012400267, 0.016885...` |
| 1 | 60c0d519ed08e8be7410846bab6d76d4 | `### Comprehensive Description of the Table\n\n...` | `{'last_modified': '2024-07-31T20:06:06', 'file...` | `[-0.0150461225, 0.0025319906, 0.03210723, -0.0...` |
| 2 | 1006ba147b4696dcfa364d82a7cc3ff9 | `Second Quarter 2024 Operational and Other Fina...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.07872811, 0.010767924, 0.032106556, 0.07415...` |
| 3 | 3ccaeebfca3cd0b37f89d9865ed86620 | `CFO Outlook Commentary\n\nWe expect third quar...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.06507105, 0.0230851, -0.002538579, 4.765945...` |
| 4 | f83ab884e1b7f6cd22bb6fec166375de | `About Meta\n\nMeta builds technologies that he...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.044313088, -0.008719197, -0.0012540965, 0.0...` |
| 5 | 6523c2a592eaef3c9c62b13ff33414e6 | `Ryan Moore\n\npress@meta.com / about.fb.com/ne...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.06850029, 0.012514475, 0.004099801, 0.05521...` |
| 6 | b288eb094301be949cb9c4d070135af5 | `For a discussion of limitations in the measure...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.06132042, 0.026113747, 0.0795631, 0.0361159...` |
| 7 | 69e526bdf29fb6ab25756fefa40b5439 | `Non-GAAP Financial Measures\n\nTo supplement o...` | `{'filetype': 'application/pdf', 'languages': [...` | `[-0.00021458174, 0.037536662, 0.055788893, 0.0...` |
| 8 | 46d3da1e2b140a713bd71105f2a371dc | `For more information on our non-GAAP nancial m...` | `{'filetype': 'application/pdf', 'languages': [...` | `[0.020250207, 0.012190874, 0.022683855, 0.0013...` |
| 9 | 8bb87ad50ef536d5db255c8044036a10 | `### Comprehensive Description of the Table:\n\...` | `{'last_modified': '2024-07-31T20:06:06', 'file...` | `[-0.015300479, 0.00063950155, 0.05090195, 0.00...` |


## 6. Use LangChain and KDB.AI to Perform RAG!

In [None]:
# Define OpenAI embedding model for LangChain to embed the query
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# use KDBAI as vector store
vecdb_kdbai_contextualized = KDBAI(table_contextualized, embeddings)
vecdb_kdbai_non_contextualized = KDBAI(table_non_contextualized, embeddings)

In [None]:
# Define a Question/Answer LangChain chain
qabot_contextualized = RetrievalQA.from_chain_type(
    chain_type="stuff",
    llm=ChatOpenAI(model="gpt-4o"),
    retriever=vecdb_kdbai_contextualized.as_retriever(search_kwargs=dict(k=5)),
    return_source_documents=True,
)

qabot_non_contextualized = RetrievalQA.from_chain_type(
    chain_type="stuff",
    llm=ChatOpenAI(model="gpt-4o"),
    retriever=vecdb_kdbai_non_contextualized.as_retriever(search_kwargs=dict(k=5)),
    return_source_documents=True,
)
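With `chain_type="stuff"`, the retrieved documents (here the top `k=5`) are concatenated, or "stuffed", into a single prompt together with the question. The sketch below shows that idea conceptually. The template wording is an assumption for illustration, not LangChain's actual prompt, and `build_stuff_prompt` is a hypothetical function.

```python
def build_stuff_prompt(question, documents):
    """Concatenate all retrieved documents into one context block, then append the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


docs = ["Revenue was $39,071 million.", "Costs and expenses were $24,224 million."]
print(build_stuff_prompt("What was revenue?", docs))
```

Because everything lands in one prompt, `k` and the chunk size together determine how much text the LLM receives.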

In [None]:
# Helper function to perform RAG
def RAG(query):
  print(query)
  print("-----")
  print("Contextualized")
  print("-----")
  print(qabot_contextualized.invoke(dict(query=query))["result"])
  print("-----")
  print("Non Contextualized")
  print("-----")
  print(qabot_non_contextualized.invoke(dict(query=query))["result"])


In [None]:
# Query the RAG chain!
RAG("What is the research and development costs for six months ended in June 2024")

What is the research and development costs for six months ended in June 2024
-----
Contextualized
-----
The research and development costs for Meta Platforms, Inc. for the six months ended June 30, 2024, were $20,515 million.
-----
Non Contextualized
-----
The research and development costs for the six months ended in June 2024 are $10,537 million.


In [None]:
# Query the RAG chain!
RAG("What is the research and development costs for six months ended in June 2023")

What is the research and development costs for six months ended in June 2023
-----
Contextualized
-----
The research and development costs for Meta Platforms, Inc. for the six months ended June 30, 2023, were $18,725 million.
-----
Non Contextualized
-----
The research and development costs for the six months ended June 2023 were $9.344 billion.


In [None]:
# Query the RAG chain!
RAG("what is the 2024 GAAP advertising Revenue in the three months ended June 30th? What about net cash by operating activies")

what is the 2024 GAAP advertising Revenue in the three months ended June 30th? What about net cash by operating activies
-----
Contextualized
-----
For the three months ended June 30, 2024, the GAAP advertising revenue was $38,329 million. The net cash provided by operating activities for the same period was $19,370 million.
-----
Non Contextualized
-----
The 2024 GAAP advertising revenue for the three months ended June 30th is $38,329 million. The net cash provided by operating activities for the same period is $19,370 million.


In [None]:
# Query the RAG chain!
RAG("What segment made the most money in the six months ended June 30th?")

What segment made the most money in the six months ended June 30th?
-----
Contextualized
-----
The "Family of Apps" segment made the most money in the six months ended June 30th, with a total revenue of $74,734 million.
-----
Non Contextualized
-----
Based on the provided context, the segment information for revenue is not explicitly broken down by segment for the six months ended June 30th. Therefore, it is not possible to determine which segment made the most money during that period from the given information.


In [None]:
# Query the RAG chain!
RAG("what is the three month costs and expensis for 2023?")

what is the three month costs and expensis for 2023?
-----
Contextualized
-----
The three-month costs and expenses for Meta Platforms, Inc. for the period ended June 30, 2023, were $22,607 million.
-----
Non Contextualized
-----
The three-month costs and expenses for 2023 are $22,607 million.


In [None]:
# Query the RAG chain!
RAG("At the end of 2023, what was the value of Meta's Goodwill assets?")

At the end of 2023, what was the value of Meta's Goodwill assets?
-----
Contextualized
-----
At the end of 2023, the value of Meta's Goodwill assets was $20,654 million.
-----
Non Contextualized
-----
At the end of 2023, the value of Meta's Goodwill assets was $20,654 million.


In [None]:
# Query the RAG chain!
RAG("Given a sentiment score between 1 and 10 for the outlook? Explain your reasoning")

Given a sentiment score between 1 and 10 for the outlook? Explain your reasoning
-----
Contextualized
-----
Based on the provided financial metrics and their changes over the reported periods, I would assign a sentiment score of **8** for the outlook of Meta Platforms, Inc. Here’s the reasoning:

### Positive Indicators:
1. **Revenue Growth**: Revenue increased by 22% from Q2 2023 to Q2 2024, indicating strong top-line growth.
2. **Income from Operations**: This surged by 58%, showing improved operational efficiency and higher profitability.
3. **Operating Margin**: Improved significantly from 29% to 38%, suggesting better cost management and higher profitability per dollar of revenue.
4. **Net Income**: Increased by 73%, reflecting robust bottom-line growth.
5. **Diluted EPS**: Similarly, this metric rose by 73%, indicating higher earnings per share for shareholders.
6. **Total Stockholders' Equity**: Increased from $153,168 million at the end of 2023 to $156,763 million by mid-2024, 

### Conclusion
There are several cases where the non-contextualized response is incorrect and the contextualized response is correct, and others where both are correct. In general, the more complex your tables and the more tables you have, the more advantageous this method becomes.
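The two chains sometimes express figures in different units (for example `$9.344 billion` versus `$18,725 million` in the outputs above), which makes comparison by eye error-prone. One way to compare answers programmatically is to normalise the first dollar amount in each answer to millions. This is a minimal sketch: `amount_in_millions` is a hypothetical helper, and its regular expression covers only the formats that appear in the outputs above.

```python
import re

_AMOUNT = re.compile(r"\$\s*([\d,]+(?:\.\d+)?)\s*(million|billion)", re.IGNORECASE)
_SCALE = {"million": 1.0, "billion": 1000.0}


def amount_in_millions(answer):
    """Return the first dollar amount in `answer`, expressed in millions, or None."""
    match = _AMOUNT.search(answer)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    # Round away floating-point noise introduced by the unit conversion
    return round(number * _SCALE[match.group(2).lower()], 6)


contextualized = (
    "The research and development costs for Meta Platforms, Inc. for the "
    "six months ended June 30, 2023, were $18,725 million."
)
non_contextualized = (
    "The research and development costs for the six months ended June 2023 "
    "were $9.344 billion."
)
print(amount_in_millions(contextualized), amount_in_millions(non_contextualized))
```

Comparing the two normalised values makes a disagreement between the chains easy to flag for manual review against the source document.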

### Delete the KDB.AI Tables
Once finished with the tables, it is best practice to drop them.

In [None]:
table_contextualized.drop()
table_non_contextualized.drop()
