# Building a Natively Multimodal RAG Pipeline (over a Slide Deck)

<a href="https://colab.research.google.com/github/run-llama/llama_parse/blob/main/examples/multimodal/multimodal_rag_slide_deck.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this cookbook we show you how to build a multimodal RAG pipeline over a slide deck, with text, tables, images, diagrams, and complex layouts.

A limitation of text-based RAG pipelines is that they rely on purely text-based representations of complex documents. For instance, if a page contains many images and diagrams, a text parser has to fall back on raw OCR to extract the text. You can also use a multimodal model (e.g. gpt-4o and up) to do text extraction, but this is inherently a lossy conversion.

Instead, a **native multimodal pipeline** stores both a text and an image representation of each document chunk. The chunks are indexed via embeddings (text or image), and during synthesis both the text and the image are fed directly to the multimodal model.

This can have the following advantages:
- **Robustness**: This solution is more robust than a pure text or even a pure image-based approach. In a pure text RAG approach, the parsing piece can be lossy. In a pure image-based approach, multimodal OCR is not perfect and may lose out against text parsing for text-heavy documents.
- **Cost Optimization**: You can choose dynamically whether to include text only or text + image, depending on the content of the page.

![mm_rag_diagram](./multimodal_rag_slide_deck_img.png)
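
As a rough illustration of the cost-optimization idea, the sketch below keeps a text and an image representation per page and attaches the image only when a simple heuristic suggests the text alone may be lossy. The `PageChunk` class, the `select_inputs` helper, the example paths, and the 200-character threshold are all hypothetical; none of them are part of LlamaIndex or LlamaParse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageChunk:
    """Hypothetical container pairing a page's parsed text with its screenshot."""

    page_num: int
    text: str
    image_path: Optional[str] = None


def select_inputs(chunk: PageChunk, min_text_chars: int = 200) -> dict:
    """Decide which representations of a page to send to the model.

    Assumption: pages with little extracted text are likely dominated by
    charts or diagrams, so the screenshot is worth its extra cost there.
    """
    text_is_sparse = len(chunk.text.strip()) < min_text_chars
    include_image = chunk.image_path is not None and text_is_sparse
    return {
        "text": chunk.text,
        "image_path": chunk.image_path if include_image else None,
    }


chart_page = PageChunk(11, "Reinvestment Rate 100% 75% 50%", "data_images/page-10.jpg")
text_page = PageChunk(2, "Cautionary statement. " * 50, "data_images/page-1.jpg")

print(select_inputs(chart_page)["image_path"])  # image kept for the sparse page
print(select_inputs(text_page)["image_path"])   # None: text only
```

In practice the heuristic could be anything that is cheap to compute per page, such as the presence of tables in the markdown output.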

## Setup

In [1]:
import nest_asyncio

nest_asyncio.apply()

### Setup Observability

We set up an integration with LlamaTrace (an integration with Arize Phoenix).

If you haven't already done so, make sure to create an account here: https://llamatrace.com/login. Then create an API key and expose it as the `PHOENIX_API_KEY` environment variable, which the cell below reads.

In [2]:
# !pip install -U llama-index-callbacks-arize-phoenix

In [3]:
# setup Arize Phoenix for logging/observability
import llama_index.core
import os

PHOENIX_API_KEY = os.getenv('PHOENIX_API_KEY')
LLAMA_API_KEY = os.getenv('LLAMA_API_KEY')
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"api_key={PHOENIX_API_KEY}"
llama_index.core.set_global_handler(
    "arize_phoenix", endpoint="https://llamatrace.com/v1/traces"
)
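
One pitfall in the cell above: if `PHOENIX_API_KEY` is not set, `os.getenv` returns `None` and the header silently becomes `api_key=None`. A minimal guard, assuming you would rather fail early; the `build_otlp_headers` helper is hypothetical and not part of LlamaIndex or Phoenix:

```python
import os
from typing import Mapping


def build_otlp_headers(env: Mapping[str, str] = os.environ) -> str:
    """Return the OTLP header string, failing early if the key is missing."""
    api_key = env.get("PHOENIX_API_KEY")
    if not api_key:
        raise RuntimeError("Set the PHOENIX_API_KEY environment variable first.")
    return f"api_key={api_key}"


# Example with an explicit mapping instead of the real environment
print(build_otlp_headers({"PHOENIX_API_KEY": "my-key"}))
```

The returned string can then be assigned to `os.environ["OTEL_EXPORTER_OTLP_HEADERS"]` exactly as in the cell above.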

### Load Data

Here we load the [ConocoPhillips 2023 investor meeting slide deck](https://static.conocophillips.com/files/2023-conocophillips-aim-presentation.pdf).

In [4]:
!mkdir data
!mkdir data_images
# !wget "https://static.conocophillips.com/files/2023-conocophillips-aim-presentation.pdf" -O data/conocophillips.pdf

A subdirectory or file data already exists.


### Model Setup

Set up the models that will be used for downstream orchestration.

In [5]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-large")
llm = OpenAI(model="gpt-4o")

Settings.embed_model = embed_model
Settings.llm = llm

## Use LlamaParse to Parse Text and Images

In this example, we use LlamaParse to parse both the text and images from the document.

We parse out the text in two ways: 
- in regular `text` mode using our default text layout algorithm
- in `markdown` mode using GPT-4o (`gpt4o_mode=True`). This also allows us to capture page screenshots

In [6]:
from llama_parse import LlamaParse


parser_text = LlamaParse(result_type="text", api_key=LLAMA_API_KEY)
parser_gpt4o = LlamaParse(result_type="markdown", gpt4o_mode=True, api_key=LLAMA_API_KEY)

In [7]:
print("Parsing text...")
docs_text = parser_text.load_data("data/conocophillips.pdf")
print("Parsing PDF file...")
# get_json_result returns a list of results; we take the pages of the first one
md_json_objs = parser_gpt4o.get_json_result("data/conocophillips.pdf")
md_json_list = md_json_objs[0]["pages"]

Parsing text...


INFO:httpx:HTTP Request: POST https://api.cloud.llamaindex.ai/api/parsing/upload "HTTP/1.1 200 OK"


Started parsing the file under job_id dd0db25d-bd48-4225-ad45-f3392a2c59f7


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/dd0db25d-bd48-4225-ad45-f3392a2c59f7 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job

Parsing PDF file...


INFO:httpx:HTTP Request: POST https://api.cloud.llamaindex.ai/api/parsing/upload "HTTP/1.1 200 OK"


Started parsing the file under job_id 56ab6860-efb8-4007-9f62-2485d261a049


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049 "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job

In [8]:
print(docs_text[10].get_content())

Commitment to Disciplined Reinvestment Rate
                         Industry                    ConocoPhillips
                                                     Strategy Reset                   Disciplined Reinvestment Rate is the Foundation for Superior
                      Growth Focus                                                    Returns on and of Capital, while Driving Durable CFO Growth
                             100%                           <60%                                        50%                 6%         at $60/BBL WTI
                       Reinvestment Rate               Reinvestment Rate                          Reinvestment Rate10-YearCFO CAGR          Planning PriceMid-Cycle
                                                                                                                         2024-2032
    2   100%
    1    75%
    1    50%
    1                                                                                                          

In [9]:
print(md_json_list[10]["md"])

# Commitment to Disciplined Reinvestment Rate

| Year Range   | ConocoPhillips Average Annual Reinvestment Rate (%) | Reinvestment Rate at $60/BBL WTI | Reinvestment Rate at $80/BBL WTI | WTI Average Price |
|--------------|-----------------------------------------------------|---------------------------------|---------------------------------|-------------------|
| 2012-2016    | >100%                                               |                                 |                                 | ~$75/BBL          |
| 2017-2022    | <60%                                                |                                 |                                 | ~$63/BBL          |
| 2023E        |                                                     |                                 |                                 | at $80/BBL        |
| 2024-2028    |                                                     | ~50%                            |                                 | at $60/BBL       

In [10]:
print(md_json_list[1].keys())

dict_keys(['page', 'md', 'images', 'items'])


In [11]:
md_json_list[0]

{'page': 1,
 'md': '# ConocoPhillips\n\n## 2023 Analyst & Investor Meeting',
 'images': [{'name': 'page-0.jpg',
   'height': 0,
   'width': 0,
   'x': 0,
   'y': 0,
   'type': 'full_page_screenshot'}],
 'items': [{'type': 'heading',
   'lvl': 1,
   'value': 'ConocoPhillips',
   'md': '# ConocoPhillips'},
  {'type': 'heading',
   'lvl': 2,
   'value': '2023 Analyst & Investor Meeting',
   'md': '## 2023 Analyst & Investor Meeting'}]}
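
The per-page dict shown above can be walked directly. The sketch below pulls the headings and the screenshot file names out of it, using a sample copied from that output. The `get_headings` and `get_screenshot_names` helpers are hypothetical, and the sketch assumes only the `heading` item shape visible here; other item types may carry different keys.

```python
# Sample page dict copied from the output above
page = {
    "page": 1,
    "md": "# ConocoPhillips\n\n## 2023 Analyst & Investor Meeting",
    "images": [
        {"name": "page-0.jpg", "height": 0, "width": 0, "x": 0, "y": 0,
         "type": "full_page_screenshot"}
    ],
    "items": [
        {"type": "heading", "lvl": 1, "value": "ConocoPhillips",
         "md": "# ConocoPhillips"},
        {"type": "heading", "lvl": 2, "value": "2023 Analyst & Investor Meeting",
         "md": "## 2023 Analyst & Investor Meeting"},
    ],
}


def get_headings(page_dict):
    """Return (level, text) pairs for every heading item on a page."""
    return [
        (item["lvl"], item["value"])
        for item in page_dict["items"]
        if item["type"] == "heading"
    ]


def get_screenshot_names(page_dict):
    """Return the file names of the full-page screenshots for a page."""
    return [
        img["name"]
        for img in page_dict["images"]
        if img["type"] == "full_page_screenshot"
    ]


print(get_headings(page))
print(get_screenshot_names(page))
```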

In [12]:
image_dicts = parser_gpt4o.get_images(md_json_objs, download_path="data_images")

> Image for page 1: [{'name': 'page-0.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-0.jpg "HTTP/1.1 200 OK"


> Image for page 2: [{'name': 'page-1.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-1.jpg "HTTP/1.1 200 OK"


> Image for page 3: [{'name': 'page-2.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-2.jpg "HTTP/1.1 200 OK"


> Image for page 4: [{'name': 'page-3.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-3.jpg "HTTP/1.1 200 OK"


> Image for page 5: [{'name': 'page-4.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-4.jpg "HTTP/1.1 200 OK"


> Image for page 6: [{'name': 'page-5.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-5.jpg "HTTP/1.1 200 OK"


> Image for page 7: [{'name': 'page-6.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-6.jpg "HTTP/1.1 200 OK"


> Image for page 8: [{'name': 'page-7.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-7.jpg "HTTP/1.1 200 OK"


> Image for page 9: [{'name': 'page-8.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-8.jpg "HTTP/1.1 200 OK"


> Image for page 10: [{'name': 'page-9.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-9.jpg "HTTP/1.1 200 OK"


> Image for page 11: [{'name': 'page-10.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-10.jpg "HTTP/1.1 200 OK"


> Image for page 12: [{'name': 'page-11.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-11.jpg "HTTP/1.1 200 OK"


> Image for page 13: [{'name': 'page-12.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-12.jpg "HTTP/1.1 200 OK"


> Image for page 14: [{'name': 'page-13.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-13.jpg "HTTP/1.1 200 OK"


> Image for page 15: [{'name': 'page-14.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-14.jpg "HTTP/1.1 200 OK"


> Image for page 16: [{'name': 'page-15.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-15.jpg "HTTP/1.1 200 OK"


> Image for page 17: [{'name': 'page-16.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-16.jpg "HTTP/1.1 200 OK"


> Image for page 18: [{'name': 'page-17.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-17.jpg "HTTP/1.1 200 OK"


> Image for page 19: [{'name': 'page-18.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-18.jpg "HTTP/1.1 200 OK"


> Image for page 20: [{'name': 'page-19.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-19.jpg "HTTP/1.1 200 OK"


> Image for page 21: [{'name': 'page-20.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-20.jpg "HTTP/1.1 200 OK"


> Image for page 22: [{'name': 'page-21.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-21.jpg "HTTP/1.1 200 OK"


> Image for page 23: [{'name': 'page-22.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-22.jpg "HTTP/1.1 200 OK"


> Image for page 24: [{'name': 'page-23.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-23.jpg "HTTP/1.1 200 OK"


> Image for page 25: [{'name': 'page-24.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-24.jpg "HTTP/1.1 200 OK"


> Image for page 26: [{'name': 'page-25.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-25.jpg "HTTP/1.1 200 OK"


> Image for page 27: [{'name': 'page-26.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-26.jpg "HTTP/1.1 200 OK"


> Image for page 28: [{'name': 'page-27.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-27.jpg "HTTP/1.1 200 OK"


> Image for page 29: [{'name': 'page-28.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-28.jpg "HTTP/1.1 200 OK"


> Image for page 30: [{'name': 'page-29.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-29.jpg "HTTP/1.1 200 OK"


> Image for page 31: [{'name': 'page-30.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-30.jpg "HTTP/1.1 200 OK"


> Image for page 32: [{'name': 'page-31.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-31.jpg "HTTP/1.1 200 OK"


> Image for page 33: [{'name': 'page-32.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-32.jpg "HTTP/1.1 200 OK"


> Image for page 34: [{'name': 'page-33.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-33.jpg "HTTP/1.1 200 OK"


> Image for page 35: [{'name': 'page-34.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-34.jpg "HTTP/1.1 200 OK"


> Image for page 36: [{'name': 'page-35.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-35.jpg "HTTP/1.1 200 OK"


> Image for page 37: [{'name': 'page-36.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-36.jpg "HTTP/1.1 200 OK"


> Image for page 38: [{'name': 'page-37.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-37.jpg "HTTP/1.1 200 OK"


> Image for page 39: [{'name': 'page-38.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-38.jpg "HTTP/1.1 200 OK"


> Image for page 40: [{'name': 'page-39.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-39.jpg "HTTP/1.1 200 OK"


> Image for page 41: [{'name': 'page-40.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-40.jpg "HTTP/1.1 200 OK"


> Image for page 42: [{'name': 'page-41.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-41.jpg "HTTP/1.1 200 OK"


> Image for page 43: [{'name': 'page-42.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-42.jpg "HTTP/1.1 200 OK"


> Image for page 44: [{'name': 'page-43.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-43.jpg "HTTP/1.1 200 OK"


> Image for page 45: [{'name': 'page-44.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-44.jpg "HTTP/1.1 200 OK"


> Image for page 46: [{'name': 'page-45.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-45.jpg "HTTP/1.1 200 OK"


> Image for page 47: [{'name': 'page-46.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-46.jpg "HTTP/1.1 200 OK"


> Image for page 48: [{'name': 'page-47.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-47.jpg "HTTP/1.1 200 OK"


> Image for page 49: [{'name': 'page-48.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-48.jpg "HTTP/1.1 200 OK"


> Image for page 50: [{'name': 'page-49.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-49.jpg "HTTP/1.1 200 OK"


> Image for page 51: [{'name': 'page-50.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-50.jpg "HTTP/1.1 200 OK"


> Image for page 52: [{'name': 'page-51.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-51.jpg "HTTP/1.1 200 OK"


> Image for page 53: [{'name': 'page-52.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-52.jpg "HTTP/1.1 200 OK"


> Image for page 54: [{'name': 'page-53.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-53.jpg "HTTP/1.1 200 OK"


> Image for page 55: [{'name': 'page-54.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-54.jpg "HTTP/1.1 200 OK"


> Image for page 56: [{'name': 'page-55.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-55.jpg "HTTP/1.1 200 OK"


> Image for page 57: [{'name': 'page-56.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-56.jpg "HTTP/1.1 200 OK"


> Image for page 58: [{'name': 'page-57.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-57.jpg "HTTP/1.1 200 OK"


> Image for page 59: [{'name': 'page-58.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-58.jpg "HTTP/1.1 200 OK"


> Image for page 60: [{'name': 'page-59.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-59.jpg "HTTP/1.1 200 OK"


> Image for page 61: [{'name': 'page-60.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-60.jpg "HTTP/1.1 200 OK"


> Image for page 62: [{'name': 'page-61.jpg', 'height': 0, 'width': 0, 'x': 0, 'y': 0, 'type': 'full_page_screenshot'}]


INFO:httpx:HTTP Request: GET https://api.cloud.llamaindex.ai/api/parsing/job/56ab6860-efb8-4007-9f62-2485d261a049/result/image/page-61.jpg "HTTP/1.1 200 OK"


## Build Multimodal Index

In this section we build the multimodal index over the parsed deck. 

We do this by creating **text** nodes from the document that contain metadata referencing the original image path.

In this example we index the text nodes for retrieval. Each text node references both the parsed text and the page screenshot.

#### Get Text Nodes

In [13]:
from llama_index.core.schema import TextNode
from typing import Optional

In [14]:
# get pages loaded through llamaparse
import re
from pathlib import Path

def get_page_number(file_name):
    """Extract the page index from a name ending in '-page-<n>.jpg' (0 if absent)."""
    match = re.search(r"-page-(\d+)\.jpg$", str(file_name))
    if match:
        return int(match.group(1))
    return 0


def _get_sorted_image_files(image_dir):
    """Get image files sorted by page."""
    raw_files = [f for f in Path(image_dir).iterdir() if f.is_file()]
    # sort numerically so that page-10 comes after page-2
    sorted_files = sorted(raw_files, key=get_page_number)
    return sorted_files
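
To see why the files are sorted by the extracted page number rather than by name, here is a small standalone comparison. The file names and the `job` prefix are made up for illustration; the regular expression is the same one used in `get_page_number`.

```python
import re


def page_number(name):
    """Same extraction logic as get_page_number, applied to a plain string."""
    match = re.search(r"-page-(\d+)\.jpg$", name)
    return int(match.group(1)) if match else 0


names = ["job-page-10.jpg", "job-page-2.jpg", "job-page-1.jpg"]

by_name = sorted(names)                     # lexicographic order
by_page = sorted(names, key=page_number)    # numeric page order

print(by_name)  # ['job-page-1.jpg', 'job-page-10.jpg', 'job-page-2.jpg']
print(by_page)  # ['job-page-1.jpg', 'job-page-2.jpg', 'job-page-10.jpg']
```

With a plain string sort, page 10 would land before page 2, and every later node would be paired with the wrong screenshot.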

In [15]:
from copy import deepcopy

# attach image metadata to the text nodes
def get_text_nodes(docs, image_dir=None, json_dicts=None):
    """Create one node per parsed page, with text, markdown, and image path as metadata."""
    nodes = []

    image_files = _get_sorted_image_files(image_dir) if image_dir is not None else None
    md_texts = [d["md"] for d in json_dicts] if json_dicts is not None else None

    # each parsed document corresponds to one page of the deck
    for idx, doc in enumerate(docs):
        chunk_metadata = {"page_num": idx + 1}
        if image_files is not None:
            image_file = image_files[idx]
            chunk_metadata["image_path"] = str(image_file)
        if md_texts is not None:
            chunk_metadata["parsed_text_markdown"] = md_texts[idx]
        # store the page's raw text as a single string
        chunk_metadata["parsed_text"] = doc.text
        # the node text stays empty; all content lives in the metadata
        node = TextNode(
            text="",
            metadata=chunk_metadata,
        )
        nodes.append(node)

    return nodes

In [16]:
docs_text[0].text

'ConocoPhillips\n                2023 Analyst & Investor Meeting'

In [17]:
len(md_json_list)

62

In [18]:
# this will split into pages
text_nodes = get_text_nodes(docs_text, image_dir="data_images", json_dicts=md_json_list)

In [19]:
len(text_nodes)

62

In [20]:
text_nodes[10]

TextNode(id_='6b514efe-9cf8-4e06-970c-e32f7357cd0c', embedding=None, metadata={'page_num': 11, 'image_path': 'data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-10.jpg', 'parsed_text_markdown': '# Commitment to Disciplined Reinvestment Rate\n\n| Year Range   | ConocoPhillips Average Annual Reinvestment Rate (%) | Reinvestment Rate at $60/BBL WTI | Reinvestment Rate at $80/BBL WTI | WTI Average Price |\n|--------------|-----------------------------------------------------|---------------------------------|---------------------------------|-------------------|\n| 2012-2016    | >100%                                               |                                 |                                 | ~$75/BBL          |\n| 2017-2022    | <60%                                                |                                 |                                 | ~$63/BBL          |\n| 2023E        |                                                     |                                 |      

In [21]:
print(text_nodes[10].get_content(metadata_mode="all"))

page_num: 11
image_path: data_images\56ab6860-efb8-4007-9f62-2485d261a049-page-10.jpg
parsed_text_markdown: # Commitment to Disciplined Reinvestment Rate

| Year Range   | ConocoPhillips Average Annual Reinvestment Rate (%) | Reinvestment Rate at $60/BBL WTI | Reinvestment Rate at $80/BBL WTI | WTI Average Price |
|--------------|-----------------------------------------------------|---------------------------------|---------------------------------|-------------------|
| 2012-2016    | >100%                                               |                                 |                                 | ~$75/BBL          |
| 2017-2022    | <60%                                                |                                 |                                 | ~$63/BBL          |
| 2023E        |                                                     |                                 |                                 | at $80/BBL        |
| 2024-2028    |                                
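
Because the nodes pair text and screenshots purely by position, a quick consistency check can be useful. In the output above, `page_num` 11 is paired with the `page-10` screenshot, so the sketch below assumes the image index is always `page_num - 1`. The `find_misaligned_pages` helper and the sample paths are hypothetical.

```python
import re


def find_misaligned_pages(metadata_list):
    """Return page numbers whose image file does not match `page_num - 1`."""
    misaligned = []
    for meta in metadata_list:
        match = re.search(r"-page-(\d+)\.jpg$", meta["image_path"])
        if match is None or int(match.group(1)) + 1 != meta["page_num"]:
            misaligned.append(meta["page_num"])
    return misaligned


sample = [
    {"page_num": 11, "image_path": "data_images/job-page-10.jpg"},
    {"page_num": 12, "image_path": "data_images/job-page-12.jpg"},  # off by one
]
print(find_misaligned_pages(sample))  # [12]
```

On the real nodes this could be called as `find_misaligned_pages([n.metadata for n in text_nodes])`, where an empty list means every page lines up.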

#### Build Index

Once the text nodes are ready, we feed them into our vector store index abstraction, which indexes these nodes in a simple in-memory vector store (of course, you should definitely check out our 40+ vector store integrations!)
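
Conceptually, an in-memory vector store keeps one embedding per node and ranks nodes by similarity to the query embedding. The toy sketch below shows that idea with hand-written 3-dimensional vectors and cosine similarity. It is an illustration only, not LlamaIndex's actual implementation, and the `top_k` helper, node ids, and vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Rank stored (node_id, vector) pairs by similarity to the query."""
    scored = [(node_id, cosine_similarity(query_vec, vec)) for node_id, vec in stored]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


store = [
    ("page-1", [1.0, 0.0, 0.0]),
    ("page-11", [0.0, 1.0, 0.2]),
    ("page-12", [0.1, 0.9, 0.0]),
]
results = top_k([0.0, 1.0, 0.0], store)
print([node_id for node_id, _ in results])  # ['page-12', 'page-11']
```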

In [22]:
import os
from llama_index.core import (
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

if not os.path.exists("storage_nodes"):
    index = VectorStoreIndex(text_nodes, embed_model=embed_model)
    # save index to disk
    index.set_index_id("vector_index")
    index.storage_context.persist("./storage_nodes")
else:
    # rebuild storage context
    storage_context = StorageContext.from_defaults(persist_dir="storage_nodes")
    # load index
    index = load_index_from_storage(storage_context, index_id="vector_index")

retriever = index.as_retriever()

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


## Build Multimodal Query Engine

We now use LlamaIndex abstractions to build a **custom query engine**. Unlike a standard RAG query engine, which retrieves the text nodes and puts only those into the prompt (response synthesis module), this custom query engine also loads the image document and passes both the text and the image to the response synthesis module.
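
Stripped of framework details, the synthesis step boils down to two things: formatting the retrieved text into one prompt and collecting the matching image paths to send alongside it. The plain-Python sketch below shows that flow under simplifying assumptions. The `build_multimodal_request` helper, the short template, and the sample path are hypothetical; only the metadata keys mirror those set in `get_text_nodes`.

```python
def build_multimodal_request(retrieved, query, prompt_template):
    """Combine retrieved page metadata into one prompt plus a list of image paths."""
    context_str = "\n\n".join(
        f"page_num: {meta['page_num']}\n"
        f"parsed_text_markdown: {meta['parsed_text_markdown']}\n"
        f"parsed_text: {meta['parsed_text']}"
        for meta in retrieved
    )
    prompt = prompt_template.format(context_str=context_str, query_str=query)
    # only pages that actually have a screenshot contribute an image
    image_paths = [meta["image_path"] for meta in retrieved if meta.get("image_path")]
    return prompt, image_paths


template = "Context:\n{context_str}\n\nQuery: {query_str}\nAnswer: "
retrieved = [
    {
        "page_num": 11,
        "image_path": "data_images/job-page-10.jpg",
        "parsed_text_markdown": "# Commitment to Disciplined Reinvestment Rate",
        "parsed_text": "Commitment to Disciplined Reinvestment Rate",
    }
]
prompt, image_paths = build_multimodal_request(
    retrieved, "What is the reinvestment rate?", template
)
print(prompt)
print(image_paths)
```

The prompt and the images would then be passed together to a multimodal model, which is the role the custom query engine plays in this notebook.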

In [99]:
from pydantic import BaseModel, Field
from typing import Annotated, List
from llama_index.core.schema import ImageNode, NodeWithScore, MetadataMode

class RetrieverResponse(BaseModel):
    response: Annotated[str, Field(description="Response from LLM")]
    source_nodes: Annotated[List[NodeWithScore], Field(description="List of source nodes")]
    metadata: Annotated[dict, Field(description="Metadata dict with text_nodes and image_nodes as keys")]



In [114]:
from llama_index.core.query_engine import CustomQueryEngine
from llama_index.core.retrievers import BaseRetriever
from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index.core.schema import ImageNode, NodeWithScore, MetadataMode
from llama_index.core.prompts import PromptTemplate
from llama_index.core.base.response.schema import Response
from typing import Optional


gpt_4o = OpenAIMultiModal(model="gpt-4o", max_new_tokens=4096)

QA_PROMPT_TMPL = """\
Below we give parsed text from slides in two different formats, as well as the image.

We parse the text in both 'markdown' mode as well as 'raw text' mode. Markdown mode attempts \
to convert relevant diagrams into tables, whereas raw text tries to maintain the rough spatial \
layout of the text.

Use the image information first and foremost. ONLY use the text/markdown information 
if you can't understand the image.

---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query. Explain whether you got the answer
from the parsed markdown or raw text or image, and if there's discrepancies, add your reasoning for the final answer.

Query: {query_str}
Answer: """

QA_PROMPT = PromptTemplate(QA_PROMPT_TMPL)


class MultimodalQueryEngine(CustomQueryEngine):
    """Custom multimodal Query Engine.

    Takes in a retriever to retrieve a set of document nodes.
    Also takes in a prompt template and multimodal model.

    """

    qa_prompt: PromptTemplate
    retriever: BaseRetriever
    multi_modal_llm: OpenAIMultiModal
    results: List[Response] = []

    def __init__(self, qa_prompt: Optional[PromptTemplate] = None, **kwargs) -> None:
        """Initialize."""
        super().__init__(qa_prompt=qa_prompt or QA_PROMPT, **kwargs)
        # self.results = []

    def custom_query(self, query_str: str) -> Response:
        # retrieve text nodes
        nodes = self.retriever.retrieve(query_str)
        print(f"Retrieved Nodes: {nodes}")  # Debug print
        
        # Check if any nodes are retrieved
        if not nodes:
            print("No nodes retrieved.")
            return Response(response="No relevant data found.", source_nodes=[], metadata=None)
        
        # create ImageNode items from text nodes
        image_nodes = [
            NodeWithScore(node=ImageNode(image_path=n.metadata.get("image_path", "")))
            for n in nodes if "image_path" in n.metadata
        ]
        print(f"Image Nodes: {image_nodes}")  # Debug print
    
        # create context string from text nodes, dump into the prompt
        context_str = "\n\n".join(
            [r.get_content(metadata_mode=MetadataMode.LLM) for r in nodes]
        )
        fmt_prompt = self.qa_prompt.format(context_str=context_str, query_str=query_str)
    
        # synthesize an answer from formatted text and images
        llm_response = self.multi_modal_llm.complete(
            prompt=fmt_prompt,
            image_documents=[image_node.node for image_node in image_nodes],
        )
        
        res = Response(
            response=str(llm_response),
            source_nodes=nodes,
            metadata={"text_nodes": nodes, "image_nodes": image_nodes},
        )
        print(f"Response Text: {res.response}")
        print(f"Source Nodes: {res.source_nodes}")
        print(f"Metadata: {res.metadata}")
        self.results.append(res)

        return res


In [115]:
query_engine = MultimodalQueryEngine(
    retriever=index.as_retriever(similarity_top_k=9), multi_modal_llm=gpt_4o
)
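
The data flow inside `custom_query` can be exercised without any API calls by swapping the retriever output and the model for stubs. The sketch below is an illustration under that assumption, not the engine itself: `FakeNode`, `fake_llm`, and `multimodal_answer` are hypothetical names, and the prompt is a shortened placeholder rather than `QA_PROMPT_TMPL`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeNode:
    """Stand-in for a retrieved text node (hypothetical, not a LlamaIndex class)."""

    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def multimodal_answer(
    query: str,
    nodes: List[FakeNode],
    llm: Callable[[str, List[str]], str],
) -> Dict[str, object]:
    """Mirror the steps of custom_query: collect images, build the prompt, call the model."""
    if not nodes:
        return {"response": "No relevant data found.", "image_paths": []}
    # only nodes that carry an image_path contribute a page image
    image_paths = [n.metadata["image_path"] for n in nodes if "image_path" in n.metadata]
    context_str = "\n\n".join(n.text for n in nodes)
    prompt = f"{context_str}\n---\nQuery: {query}\nAnswer: "
    return {"response": llm(prompt, image_paths), "image_paths": image_paths}


def fake_llm(prompt: str, image_paths: List[str]) -> str:
    """Stub model that just reports what it was given."""
    return f"saw {len(image_paths)} image(s) and {len(prompt)} prompt chars"


nodes = [
    FakeNode("slide text A", {"image_path": "data_images/page-1.jpg"}),
    FakeNode("slide text B"),  # no image available for this node
]
result = multimodal_answer("What is on the slide?", nodes, fake_llm)
print(result["response"])
```

Note that a node without an `image_path` still contributes its text to the prompt; it simply adds no image to the model call.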

### Define Baseline

In addition, we define a "baseline" that relies only on text-based indexing. Here we build an index using only the nodes that LlamaParse parsed in text mode.

**NOTE**: We don't currently include the markdown-parsed text because that was parsed with GPT-4o, so already uses a multimodal model during the text extraction phase.

It is of course a valid experiment to compare RAG where multimodal extraction only happens during indexing, vs. the current multimodal RAG implementation where images are fed during synthesis to the LLM. 

In [116]:
def get_nodes(docs):
    """Split docs into nodes, by separator."""
    nodes = []
    for doc in docs:
        doc_chunks = doc.text.split("\n---\n")
        for doc_chunk in doc_chunks:
            node = TextNode(
                text=doc_chunk,
                metadata=deepcopy(doc.metadata),
            )
            nodes.append(node)

    return nodes
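
As a quick illustration of the splitting rule in `get_nodes`, the snippet below applies the same `"\n---\n"` separator to a made-up string. The sample text is hypothetical; only the separator comes from the function above.

```python
# made-up parser output: three pages separated by the "\n---\n" page break
sample_text = "Page one text\n---\nPage two text\n---\nPage three text"

chunks = sample_text.split("\n---\n")
for i, chunk in enumerate(chunks, start=1):
    print(i, repr(chunk))

# a "---" that is not on its own line does not split the text
inline = "a --- b"
print(inline.split("\n---\n"))
```

Because the separator must sit on its own line, dashes that appear inside a sentence or a table row stay within their chunk.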

In [117]:
base_nodes = get_nodes(docs_text)

In [118]:
len(base_nodes)

62

In [119]:
print(base_nodes[13].get_content(metadata_mode="all"))

Our Differentiated Portfolio: Deep; Durable and Diverse
                              20 BBOE of Resource                                           Diverse Production Base
                            Under $40/BBL Cost of Supply                              10-Year Plan Cumulative Production (BBOE)
      S50                   S32/BBL                                                Lower 48                           Alaska
                    Average Cost of Supply
  3 $40                                                                                                                       GKA        GWA
                                                                                                                      GPA     WNS
      $30                                                                                                             EMENA
  3                                                                                                                              Norway
 

In [120]:
base_index = VectorStoreIndex(base_nodes, embed_model=embed_model)
base_query_engine = base_index.as_query_engine(llm=llm, similarity_top_k=9)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"

## Build a Multimodal Agent

Build an agent around the multimodal query engine. This gives you agent capabilities like query planning/decomposition and memory around a central QA interface.

In [121]:
from llama_index.core.tools import QueryEngineTool
from llama_index.core.agent import FunctionCallingAgentWorker


vector_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="vector_tool",
    description=(
        "Useful for retrieving specific context from the data. Do NOT select if question asks for a summary of the data."
    ),
)
agent = FunctionCallingAgentWorker.from_tools(
    [vector_tool], llm=llm, verbose=True
).as_agent()

In [122]:
# define a similar agent for the baseline
base_vector_tool = QueryEngineTool.from_defaults(
    query_engine=base_query_engine,
    name="vector_tool",
    description=(
        "Useful for retrieving specific context from the data. Do NOT select if question asks for a summary of the data."
    ),
)
base_agent = FunctionCallingAgentWorker.from_tools(
    [base_vector_tool], llm=llm, verbose=True
).as_agent()
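
To show the general shape of what the agent adds around a query engine, here is a deliberately simplified sketch: a chat memory plus a registry of named tools. `ToyAgent` is a hypothetical class, not how `FunctionCallingAgentWorker` is implemented. In particular, a real agent lets the LLM pick the tool and rewrite the arguments, whereas here the caller picks the tool so the example stays deterministic.

```python
from typing import Callable, Dict, List


class ToyAgent:
    """Hypothetical stand-in for an agent: chat memory plus named tools."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]) -> None:
        self.tools = tools
        self.memory: List[str] = []

    def query(self, message: str, tool_name: str = "vector_tool") -> str:
        # record the user message, call the chosen tool, record its output
        self.memory.append(f"user: {message}")
        answer = self.tools[tool_name](message)
        self.memory.append(f"tool[{tool_name}]: {answer}")
        return answer


toy_agent = ToyAgent({"vector_tool": lambda q: f"context for: {q}"})
print(toy_agent.query("capex/EUR in the Delaware Basin"))
print(toy_agent.memory)
```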

## Try out Queries

Let's run the same query through the multimodal agent and the baseline agent, and compare the two answers.

In [123]:
# response = agent.query("Tell me about the different regions and subregions where Conoco Phillips has a production base.")
response = agent.query(
    "How does the Conoco Phillips capex/EUR in the delaware basin compare against other competitors?"
)
print(str(response))

Added user message to memory: How does the Conoco Phillips capex/EUR in the delaware basin compare against other competitors?


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


=== Calling Function ===
Calling function: vector_tool with args: {"input": "Conoco Phillips capex/EUR in the delaware basin"}


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


Retrieved Nodes: [NodeWithScore(node=TextNode(id_='5e2f3432-30a1-42df-9b2c-0d8ee6d29239', embedding=None, metadata={'page_num': 38, 'image_path': 'data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-37.jpg', 'parsed_text_markdown': '# Delaware: Vast Inventory with Proven Track Record of Performance\n\n## Prolific Acreage Spanning Over ~659,000 Net Acres¹\n\n![Map of Texas and New Mexico highlighting Delaware Basin and Midland Basin]\n\n### Total 10-Year Operated Permian Inventory\n\n| Basin          | Percentage |\n|----------------|-------------|\n| Delaware Basin | 65%         |\n| Midland Basin  |             |\n\n### 12-Month Cumulative Production³ (BOE/FT)\n\n| Months | 2019 | 2020 | 2021 | 2022 |\n|--------|------|------|------|------|\n| 1      | 0    | 0    | 0    | 0    |\n| 2      | 5    | 6    | 7    | 8    |\n| 3      | 10   | 12   | 14   | 16   |\n| 4      | 15   | 18   | 21   | 24   |\n| 5      | 20   | 24   | 28   | 32   |\n| 6      | 25   | 30   | 35   | 40   |\n| 7 

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Response Text: ConocoPhillips' capex/EUR in the Delaware Basin is $10/BOE.

This information was obtained from the image on page 38, which shows a bar chart comparing the capex/EUR of ConocoPhillips with other competitors in the Delaware Basin. The parsed markdown text also confirms this value.
Source Nodes: [NodeWithScore(node=TextNode(id_='5e2f3432-30a1-42df-9b2c-0d8ee6d29239', embedding=None, metadata={'page_num': 38, 'image_path': 'data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-37.jpg', 'parsed_text_markdown': '# Delaware: Vast Inventory with Proven Track Record of Performance\n\n## Prolific Acreage Spanning Over ~659,000 Net Acres¹\n\n![Map of Texas and New Mexico highlighting Delaware Basin and Midland Basin]\n\n### Total 10-Year Operated Permian Inventory\n\n| Basin          | Percentage |\n|----------------|-------------|\n| Delaware Basin | 65%         |\n| Midland Basin  |             |\n\n### 12-Month Cumulative Production³ (BOE/FT)\n\n| Months | 2019 | 2020 | 2021 |

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


Retrieved Nodes: [NodeWithScore(node=TextNode(id_='5e2f3432-30a1-42df-9b2c-0d8ee6d29239', embedding=None, metadata={'page_num': 38, 'image_path': 'data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-37.jpg', 'parsed_text_markdown': '# Delaware: Vast Inventory with Proven Track Record of Performance\n\n## Prolific Acreage Spanning Over ~659,000 Net Acres¹\n\n![Map of Texas and New Mexico highlighting Delaware Basin and Midland Basin]\n\n### Total 10-Year Operated Permian Inventory\n\n| Basin          | Percentage |\n|----------------|-------------|\n| Delaware Basin | 65%         |\n| Midland Basin  |             |\n\n### 12-Month Cumulative Production³ (BOE/FT)\n\n| Months | 2019 | 2020 | 2021 | 2022 |\n|--------|------|------|------|------|\n| 1      | 0    | 0    | 0    | 0    |\n| 2      | 5    | 6    | 7    | 8    |\n| 3      | 10   | 12   | 14   | 16   |\n| 4      | 15   | 18   | 21   | 24   |\n| 5      | 20   | 24   | 28   | 32   |\n| 6      | 25   | 30   | 35   | 40   |\n| 7 

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Response Text: The capex/EUR of competitors in the Delaware Basin is provided in the image on page 38. The relevant information is displayed in a bar chart comparing ConocoPhillips to its competitors. Here is the data extracted from the image:

### Delaware Basin Well Capex/EUR ($/BOE)
- **ConocoPhillips**: $10
- **Competitor 1**: $12
- **Competitor 2**: $14
- **Competitor 3**: $16
- **Competitor 4**: $18
- **Competitor 5**: $20
- **Competitor 6**: $22
- **Competitor 7**: $24
- **Competitor 8**: $26

This information was confirmed from the image provided, which clearly shows the capex/EUR values for ConocoPhillips and its competitors in the Delaware Basin.
Source Nodes: [NodeWithScore(node=TextNode(id_='5e2f3432-30a1-42df-9b2c-0d8ee6d29239', embedding=None, metadata={'page_num': 38, 'image_path': 'data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-37.jpg', 'parsed_text_markdown': '# Delaware: Vast Inventory with Proven Track Record of Performance\n\n## Prolific Acreage Spanning Ove

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


=== LLM Response ===
ConocoPhillips' capex per EUR (Expected Ultimate Recovery) in the Delaware Basin is $10 per BOE (Barrel of Oil Equivalent). When compared to its competitors, ConocoPhillips has a lower capex/EUR, indicating more efficient capital expenditure relative to the amount of oil expected to be recovered. Here is the comparison:

### Delaware Basin Well Capex/EUR ($/BOE)
- **ConocoPhillips**: $10
- **Competitor 1**: $12
- **Competitor 2**: $14
- **Competitor 3**: $16
- **Competitor 4**: $18
- **Competitor 5**: $20
- **Competitor 6**: $22
- **Competitor 7**: $24
- **Competitor 8**: $26

This data shows that ConocoPhillips has the lowest capex/EUR among the listed competitors in the Delaware Basin.
ConocoPhillips' capex per EUR (Expected Ultimate Recovery) in the Delaware Basin is $10 per BOE (Barrel of Oil Equivalent). When compared to its competitors, ConocoPhillips has a lower capex/EUR, indicating more efficient capital expenditure relative to the amount of oil expected t

In [77]:
response

Response(response="ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) in the Delaware Basin is $10 per barrel of oil equivalent (BOE). In comparison, its competitors have a capex/EUR ranging from $12 to $26 per BOE. This indicates that ConocoPhillips has a more cost-efficient operation in the Delaware Basin relative to its competitors.", source_nodes=[], metadata=None)

In [124]:
query_engine.results

[Response(response="ConocoPhillips' capex/EUR in the Delaware Basin is $10/BOE.\n\nThis information was obtained from the image on page 38, which shows a bar chart comparing the capex/EUR of ConocoPhillips with other competitors in the Delaware Basin. The parsed markdown text also confirms this value.", source_nodes=[NodeWithScore(node=TextNode(id_='5e2f3432-30a1-42df-9b2c-0d8ee6d29239', embedding=None, metadata={'page_num': 38, 'image_path': 'data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-37.jpg', 'parsed_text_markdown': '# Delaware: Vast Inventory with Proven Track Record of Performance\n\n## Prolific Acreage Spanning Over ~659,000 Net Acres¹\n\n![Map of Texas and New Mexico highlighting Delaware Basin and Midland Basin]\n\n### Total 10-Year Operated Permian Inventory\n\n| Basin          | Percentage |\n|----------------|-------------|\n| Delaware Basin | 65%         |\n| Midland Basin  |             |\n\n### 12-Month Cumulative Production³ (BOE/FT)\n\n| Months | 2019 | 2020 

In [127]:
query_engine.results[0].source_nodes[0].get_content(metadata_mode="all")

'page_num: 38\nimage_path: data_images\\56ab6860-efb8-4007-9f62-2485d261a049-page-37.jpg\nparsed_text_markdown: # Delaware: Vast Inventory with Proven Track Record of Performance\n\n## Prolific Acreage Spanning Over ~659,000 Net Acres¹\n\n![Map of Texas and New Mexico highlighting Delaware Basin and Midland Basin]\n\n### Total 10-Year Operated Permian Inventory\n\n| Basin          | Percentage |\n|----------------|-------------|\n| Delaware Basin | 65%         |\n| Midland Basin  |             |\n\n### 12-Month Cumulative Production³ (BOE/FT)\n\n| Months | 2019 | 2020 | 2021 | 2022 |\n|--------|------|------|------|------|\n| 1      | 0    | 0    | 0    | 0    |\n| 2      | 5    | 6    | 7    | 8    |\n| 3      | 10   | 12   | 14   | 16   |\n| 4      | 15   | 18   | 21   | 24   |\n| 5      | 20   | 24   | 28   | 32   |\n| 6      | 25   | 30   | 35   | 40   |\n| 7      | 30   | 36   | 42   | 48   |\n| 8      | 35   | 42   | 49   | 56   |\n| 9      | 40   | 48   | 56   | 64   |\n| 10    

In [None]:
# print(response.source_nodes[0].get_content(metadata_mode="all"))

In [129]:
# base_response = base_agent.query("Tell me about the different regions and subregions where Conoco Phillips has a production base.")
base_response = base_agent.query(
    "How does the Conoco Phillips capex/EUR in the delaware basin compare against other competitors?"
)
print(str(base_response))

Added user message to memory: How does the Conoco Phillips capex/EUR in the delaware basin compare against other competitors?


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


=== Calling Function ===
Calling function: vector_tool with args: {"input": "Conoco Phillips capex/EUR in the Delaware Basin compared to competitors"}


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


=== Function Output ===
ConocoPhillips' capex/EUR in the Delaware Basin is lower compared to its competitors.


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


=== LLM Response ===
ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) in the Delaware Basin is lower compared to its competitors.
ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) in the Delaware Basin is lower compared to its competitors.


In [131]:
base_response

Response(response="ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) in the Delaware Basin is lower compared to its competitors.", source_nodes=[], metadata=None)

In [None]:
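# base_query_engine is a standard query engine with no `results` attribute, so retrieve from the baseline index directly
base_index.as_retriever(similarity_top_k=9).retrieve("Conoco Phillips capex/EUR in the delaware basin")[0].get_content(metadata_mode="all")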

In [134]:
print(base_response)

ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) in the Delaware Basin is lower compared to its competitors.
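
One rough way to compare the two answers is to count the concrete dollar figures each one contains. The sketch below runs a regular expression over the two response strings shown above. `dollar_figures` is a hypothetical helper, and counting figures is only a crude proxy for specificity: it does not verify the figures against the slide itself.

```python
import re
from typing import List

# response text copied from the multimodal agent output above
multimodal_text = (
    "ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) "
    "in the Delaware Basin is $10 per barrel of oil equivalent (BOE). In comparison, "
    "its competitors have a capex/EUR ranging from $12 to $26 per BOE."
)
# response text copied from the baseline agent output above
baseline_text = (
    "ConocoPhillips' capital expenditure per estimated ultimate recovery (capex/EUR) "
    "in the Delaware Basin is lower compared to its competitors."
)


def dollar_figures(text: str) -> List[str]:
    """Return every dollar amount such as '$10' or '$12.5' found in the text."""
    return re.findall(r"\$\d+(?:\.\d+)?", text)


print("multimodal:", dollar_figures(multimodal_text))
print("baseline:  ", dollar_figures(baseline_text))
```

On this query the multimodal answer cites specific values while the text-only baseline gives only a qualitative comparison.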
