# LlamaParse JSON Mode + Multimodal RAG

<a href="https://colab.research.google.com/github/run-llama/llama_parse/blob/main/examples/demo_json.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook shows you how to use LlamaParse JSON mode with LlamaIndex to build a simple multimodal RAG pipeline.

JSON mode returns a list of JSON dictionaries that contain both text and images. You can then download these images, use a multimodal model to extract information from them, and index the results.

## Setup

Define imports, environment variables, and the global LLM and embedding models.

In [None]:
!pip install llama-index
!pip install llama-index-core
!pip install llama-index-llms-anthropic llama-index-multi-modal-llms-anthropic
!pip install llama-index-embeddings-huggingface
!pip install llama-parse

In [None]:
# llama-parse is async-first; running async code in a notebook requires nest_asyncio
import nest_asyncio

nest_asyncio.apply()

import os

# API access to llama-cloud
os.environ["LLAMA_CLOUD_API_KEY"] = "llx-"

# Using the Anthropic API for the LLM and multimodal model (embeddings run locally, see Settings below)
os.environ["ANTHROPIC_API_KEY"] = "sk-"
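
The comment about `nest_asyncio` can be illustrated with a small standard-library sketch. It assumes the notebook kernel already runs an event loop, which is the situation `nest_asyncio.apply()` is meant to work around. `fake_parse` and `call_run_inside_loop` are hypothetical names used only for illustration and are not part of llama-parse.

```python
import asyncio


async def fake_parse() -> str:
    """Hypothetical stand-in for an async parsing call."""
    await asyncio.sleep(0)
    return "parsed"


async def call_run_inside_loop() -> str:
    """Try to start a second event loop from inside a running one."""
    coro = fake_parse()
    try:
        # Assumed to mirror a notebook cell: a loop is already running here.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        return f"RuntimeError: {exc}"
    return "no error"


# From a plain script there is no running loop, so asyncio.run works here.
print(asyncio.run(fake_parse()))
# Inside the coroutine a loop is running, so the nested asyncio.run is rejected.
print(asyncio.run(call_run_inside_loop()))
```

Without patching, the nested call is rejected; `nest_asyncio.apply()` patches the loop so that such nested calls are allowed in the notebook.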

In [None]:
from llama_index.llms.anthropic import Anthropic

llm = Anthropic(model="claude-3-opus-20240229", temperature=0.0)

In [None]:
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = "local:BAAI/bge-small-en-v1.5"

## Load Data

Let's load in the Uber 10Q report.

In [None]:
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_march_2022.pdf' -O './uber_10q_march_2022.pdf'

## Using LlamaParse in JSON Mode for PDF Reading

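We run LlamaParse in JSON mode on the PDF. `get_json_result` returns a list of result dictionaries; the parsed pages are under the `pages` key of the first result.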

In [None]:
from llama_parse import LlamaParse

parser = LlamaParse(verbose=True)
json_objs = parser.get_json_result("./uber_10q_march_2022.pdf")
json_list = json_objs[0]["pages"]

Started parsing the file under job_id cf5a4f51-1af8-47f7-9b3d-80a905d06b89
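
Before building nodes, it can help to check what each page dictionary holds. The sketch below is a minimal, hedged example: the `page` and `text` keys are the ones used in this notebook, while the `images` key and the sample data are assumptions made for illustration. `summarize_pages` is a hypothetical helper, not a LlamaParse function.

```python
from typing import Dict, List


def summarize_pages(pages: List[dict]) -> List[Dict[str, int]]:
    """Return page number, text length, and image count for each page dict."""
    summary = []
    for page in pages:
        summary.append(
            {
                "page": page["page"],
                "num_chars": len(page.get("text", "")),
                # "images" is an assumed key; pages without it count as zero images.
                "num_images": len(page.get("images", [])),
            }
        )
    return summary


# Hypothetical stand-in for json_objs[0]["pages"]; real output will differ.
sample_pages = [
    {"page": 1, "text": "UNITED STATES SECURITIES AND EXCHANGE COMMISSION", "images": []},
    {"page": 2, "text": "Monthly Active Platform Consumers", "images": [{"name": "img_p1_1.png"}]},
]

for row in summarize_pages(sample_pages):
    print(row)
```

In the notebook, the same helper could be called as `summarize_pages(json_list)`, assuming the parsed pages follow the shape sketched here.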


In [None]:
from llama_index.core.schema import TextNode
from typing import List


def get_text_nodes(json_list: List[dict]) -> List[TextNode]:
    """Create one TextNode per parsed page, keeping the page number as metadata."""
    text_nodes = []
    for page in json_list:
        text_node = TextNode(text=page["text"], metadata={"page": page["page"]})
        text_nodes.append(text_node)
    return text_nodes

In [None]:
text_nodes = get_text_nodes(json_list)

## Extract/Index images from image dicts

Here we download the images referenced in the JSON result, use a multimodal model to describe each one as text, and wrap each description in a `TextNode` for indexing.

In [None]:
# call get_images on parser, convert to ImageDocuments
!mkdir llama2_images

from llama_index.core.schema import ImageDocument
from llama_index.multi_modal_llms.anthropic import AnthropicMultiModal


def get_image_text_nodes(json_objs: List[dict]) -> List[TextNode]:
    """Describe each parsed image with a multimodal model and return the descriptions as TextNodes."""
    anthropic_mm_llm = AnthropicMultiModal(max_tokens=300)
    # Download every image referenced in the JSON result to a local folder
    image_dicts = parser.get_images(json_objs, download_path="llama2_images")
    img_text_nodes = []
    for image_dict in image_dicts:
        image_doc = ImageDocument(image_path=image_dict["path"])
        # Ask the model for an alt-text style description of this image
        response = anthropic_mm_llm.complete(
            prompt="Describe the images as alt text",
            image_documents=[image_doc],
        )
        text_node = TextNode(text=str(response), metadata={"path": image_dict["path"]})
        img_text_nodes.append(text_node)
    return img_text_nodes

mkdir: llama2_images: File exists


In [None]:
image_text_nodes = get_image_text_nodes(json_objs)

In [None]:
image_text_nodes[0].get_content()

'The image shows a bar graph titled "Monthly Active Platform Consumers (in millions)". The graph displays data from Q2 2020 to Q1 2022 over 8 quarters. The number of monthly active platform consumers starts at 55 million in Q2 2020 and steadily increases each quarter, reaching 115 million by Q1 2022. The graph illustrates consistent quarter-over-quarter growth in this metric over the nearly 2 year time period shown.'

## Build Index across image and text nodes

Here we build a single vector index over both the page text nodes and the text nodes generated from image descriptions.

In [None]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex(text_nodes + image_text_nodes)
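
Because image descriptions are stored as ordinary text, one index can answer questions about both pages and charts. The toy sketch below illustrates that idea with a bag-of-words similarity in place of the real embedding model. It is not how `VectorStoreIndex` is implemented; `embed`, `cosine`, `retrieve`, and the sample nodes are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline uses a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query: str, nodes: List[Dict[str, str]], top_k: int = 1
) -> List[Tuple[float, Dict[str, str]]]:
    """Score every node against the query and return the top_k matches."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(node["text"])), node) for node in nodes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


# Hypothetical stand-ins for text_nodes and image_text_nodes.
page_nodes = [
    {"kind": "text", "text": "risk factors include regulatory and economic conditions"}
]
image_nodes = [
    {"kind": "image", "text": "bar graph of monthly active platform consumers"}
]
all_nodes = page_nodes + image_nodes  # mirrors text_nodes + image_text_nodes

top = retrieve("monthly active platform consumers bar graph", all_nodes)
print(top[0][1]["kind"])
```

A chart question lands on the image-derived node and a risk question on the page node, even though both kinds of node live in the same list.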

In [None]:
query_engine = index.as_query_engine()

In [None]:
# ask question over image!
response = query_engine.query(
    "What does the bar graph titled 'Monthly Active Platform Consumers' show?"
)
print(str(response))

The bar graph titled "Monthly Active Platform Consumers (in millions)" shows the number of monthly active consumers on Uber's platform over a period of 8 quarters from Q2 2020 to Q1 2022. 

The graph indicates steady quarter-over-quarter growth in this metric, starting at 55 million monthly active platform consumers in Q2 2020 and increasing each quarter to reach 115 million by Q1 2022. This represents consistent growth in Uber's user base on their platform over the nearly 2 year period shown in the graph.


In [None]:
# ask question over text!
response = query_engine.query("What are the main risk factors for Uber?")
print(str(response))

Based on the context provided, some of the main risk factors for Uber include:

- A significant percentage of Uber's bookings come from large metropolitan areas, which could be negatively impacted by various economic, social, weather, regulatory and other conditions, including COVID-19.

- Uber may fail to successfully offer autonomous vehicle technologies on its platform or these technologies may not perform as expected. 

- Retaining and attracting high-quality personnel is important for Uber's business and continued attrition could adversely impact the company.

- Security breaches, data privacy issues, cyberattacks and unauthorized access to Uber's proprietary data and systems pose risks.

- Uber is subject to climate change risks, both physical and transitional, that could adversely impact its business if not managed properly. 

- Uber relies on third parties for open marketplaces to distribute its platform and software, and interference from these third parties could harm its bus