#  **P.A.R.S.E.R**

### Source : **https://docs.cloud.llamaindex.ai/llamaparse/features/multimodal**

## **What is Multimodal Parsing?**

Multimodal parsing refers to the ability of a model to process and understand multiple forms of data simultaneously. In the context of LlamaParse, this means the model can handle not just text but also images, tables, and other document elements. This approach is particularly useful for documents like PDFs where the information is presented in various formats (text, images, charts, etc.).

## **How does it work?**

When using this mode, LlamaParse's regular parsing is bypassed and instead the following process is used:

- A screenshot of every page of your document is taken
- Each page screenshot is sent to the multimodal model with an instruction to extract its content as markdown
- The resulting markdown of each page is consolidated into the final result.

=> Because every page goes through a multimodal model, this mode is more expensive than LlamaParse's regular parsing.
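
The three steps above can be sketched in plain Python. This is only an illustration of the flow, not LlamaParse's actual implementation: `parse_pages_multimodal`, `fake_model`, and the blank-line separator are hypothetical names and choices made for this example, and the page "screenshots" are stand-in byte strings.

```python
from typing import Callable, Iterable, List


def parse_pages_multimodal(
    page_images: Iterable[bytes],
    extract_markdown: Callable[[bytes], str],
    separator: str = "\n\n",
) -> str:
    """Send each page image to a model and join the per-page markdown.

    `extract_markdown` stands in for the multimodal model call
    (hypothetical; the real service does this server-side).
    """
    page_markdown: List[str] = []
    for image in page_images:
        # One model call per page: this is why the mode costs more.
        page_markdown.append(extract_markdown(image).strip())
    # Consolidate the per-page results into a single document.
    return separator.join(page_markdown)


def fake_model(image: bytes) -> str:
    """Stand-in for a multimodal model, used only for this demo."""
    return f"# Page with {len(image)} bytes\n"


result = parse_pages_multimodal([b"abc", b"defgh"], fake_model)
print(result)
```

The number of model calls grows linearly with the page count, which is the source of the extra cost compared with regular parsing.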

## **Code examples**

### A - Example of multimodal parsing with GPT-4o

```python
import os
import nest_asyncio
from dotenv import load_dotenv
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

# Allow nested event loops (needed when running inside a notebook)
nest_asyncio.apply()

# Load environment variables (LLAMA_CLOUD_API_KEY) from a .env file
load_dotenv()

# Initialize the parser with multimodal settings
parser = LlamaParse(
    api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
    use_vendor_multimodal_model=True,
    vendor_multimodal_model_name="openai-gpt4o"
)

# Use SimpleDirectoryReader to parse the file
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(input_files=['/data/doc_1pdf.pdf'], file_extractor=file_extractor).load_data()

# Save the parsed result to a markdown file (UTF-8 so non-ASCII text is preserved)
with open('parsed_result_gpt.md', 'w', encoding='utf-8') as result_file:
    for doc in documents:
        result_file.write(doc.text)
```


Started parsing the file under job_id cac11eca-46ae-40e4-9614-edc8189c7b32


### B - Example of multimodal parsing with Claude

```python
import os
import nest_asyncio
from dotenv import load_dotenv
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

# Allow nested event loops (needed when running inside a notebook)
nest_asyncio.apply()

# Load environment variables (LLAMA_CLOUD_API_KEY) from a .env file
load_dotenv()

# Initialize the parser with multimodal settings
parser = LlamaParse(
    api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
    use_vendor_multimodal_model=True,
    vendor_multimodal_model_name="anthropic-sonnet-3.5"
)

# Use SimpleDirectoryReader to parse the file
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(input_files=['doc_2pdf.pdf'], file_extractor=file_extractor).load_data()

# Save the parsed result to a markdown file (UTF-8 so non-ASCII text is preserved)
with open('parsed_result_claude.md', 'w', encoding='utf-8') as result_file:
    for doc in documents:
        result_file.write(doc.text)
```

Started parsing the file under job_id 8e450f4c-1ba4-4b64-8d37-1666b5b15d75
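
Examples A and B differ only in the model name and the file paths, so the shared part could be factored into a small helper. The sketch below is one possible way to do that, not something taken from the LlamaParse documentation: `multimodal_kwargs` and `parse_to_markdown` are hypothetical names, and the only `LlamaParse` arguments used are the three already shown in the examples above.

```python
import os


def multimodal_kwargs(model_name: str) -> dict:
    """Build the LlamaParse keyword arguments used in examples A and B."""
    return {
        "api_key": os.getenv("LLAMA_CLOUD_API_KEY"),
        "use_vendor_multimodal_model": True,
        "vendor_multimodal_model_name": model_name,
    }


def parse_to_markdown(pdf_path: str, model_name: str, output_path: str) -> None:
    """Parse one PDF with the given multimodal model and save the markdown."""
    # Imported here so the helper above can be used without these packages.
    from llama_parse import LlamaParse
    from llama_index.core import SimpleDirectoryReader

    parser = LlamaParse(**multimodal_kwargs(model_name))
    documents = SimpleDirectoryReader(
        input_files=[pdf_path], file_extractor={".pdf": parser}
    ).load_data()
    with open(output_path, "w", encoding="utf-8") as result_file:
        for doc in documents:
            result_file.write(doc.text)
```

With this in place, example B would reduce to `parse_to_markdown('doc_2pdf.pdf', 'anthropic-sonnet-3.5', 'parsed_result_claude.md')`. Inside a notebook, `nest_asyncio.apply()` and `load_dotenv()` still need to be called first, as in the examples above.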
