# 🧠 Agentic Document Extraction with LandingAI

This notebook demonstrates how to use the `agentic_doc` Python package to extract structured information from documents using LandingAI's Agentic Document Extraction (ADE) service.

We'll walk through:
- Parsing documents with ADE
- Defining a custom schema using `pydantic`
- Viewing structured field extractions
- Saving results to CSV

> 📎 Supported formats: `.pdf`, `.png`, `.jpg`, `.jpeg`

## 📦 Setup & Imports

Import necessary packages and utility functions. Ensure you have installed `agentic_doc`, Pillow, and other dependencies:

```bash
pip install agentic-doc pillow pandas
```

Obtain your API key from the Visual Playground at https://va.landing.ai/settings/api-key

Read about options for setting your API key at https://docs.landing.ai/ade/agentic-api-key

The video that accompanies this notebook uses a `.env` file in the same directory as the notebook.
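
If you follow the `.env` approach, the file holds simple `KEY=value` lines. The sketch below is a minimal, standard-library-only way to load such a file into the environment before importing `agentic_doc`. The variable name `VISION_AGENT_API_KEY` is an assumption inferred from the `vision_agent_api_key` setting that the library logs on import, and `load_env_file` is a hypothetical helper, not part of `agentic_doc`. The library may already read `.env` on its own, in which case this step is unnecessary.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict:
    """Parse simple KEY=value lines and copy them into os.environ.

    Hypothetical helper for illustration; blank lines and '#' comments are skipped.
    """
    loaded = {}
    if not path.exists():
        return loaded
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Keep any value that is already set in the real environment
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


# Assumed variable name, inferred from the "vision_agent_api_key" setting
load_env_file(Path(".env"))
if not os.environ.get("VISION_AGENT_API_KEY"):
    print("VISION_AGENT_API_KEY is not set; parse() will likely fail to authenticate.")
```

Checking for the key up front gives a clearer message than waiting for the first API call to fail.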


In [1]:
# Standard libraries
import os
import json
from datetime import date
from pathlib import Path

# Agentic Document Extraction from LandingAI
from agentic_doc.parse import parse

2025-07-24 00:54:06 [info] Settings loaded: {
  "endpoint_host": "https://api.va.landing.ai",
  "vision_agent_api_key": "cmI5a[REDACTED]",
  "batch_size": 4,
  "max_workers": 5,
  "max_retries": 100,
  "max_retry_wait_time": 60,
  "retry_logging_style": "log_msg",
  "pdf_to_image_dpi": 96,
  "split_size": 10
} [agentic_doc.config] (config.py:84)


## 📁 Define Input and Output Directories

Specify where your documents are located and where results will be saved.


In [2]:
# Define input and output directory paths
base_dir = Path(os.getcwd())
input_folder = base_dir / "input_folder"
results_folder = base_dir / "results_folder"
groundings_folder = base_dir / "groundings_folder"

# Create output folders if they don't exist
input_folder.mkdir(parents=True, exist_ok=True)
results_folder.mkdir(parents=True, exist_ok=True)
groundings_folder.mkdir(parents=True, exist_ok=True)

## 🗂️ Collect Document File Paths

This block filters input files for supported formats.


In [3]:
file_names = [
    p.name
    for p in input_folder.iterdir()
    if p.suffix.lower() in [".pdf", ".png", ".jpg", ".jpeg"]
]

file_names

['CME_Mendez_ex5.png',
 'CME_Mendez_ex4.png',
 'CME_Mendez_ex1.png',
 'CME_Mendez_ex3.png',
 'CME_Mendez_ex2.png']

In [4]:
# Collect all document file paths in input folder with supported extensions
# Convert each Path object to a string to ensure compatibility with parse()

file_paths = [
    str(p)
    for p in input_folder.iterdir()
    if p.suffix.lower() in [".pdf", ".png", ".jpg", ".jpeg"]
]

file_paths

['/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex5.png',
 '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex4.png',
 '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex1.png',
 '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex3.png',
 '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex2.png']
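
Cells 3 and 4 each call `iterdir()` with the same filter, and the table built at the end of the notebook pairs `file_names` with the parse results by position. One possible way to keep the two lists aligned is to build both from a single sorted listing. `list_supported_files` is a hypothetical helper name, and sorting is a choice made here for reproducibility, not something `parse()` is known to require.

```python
import tempfile
from pathlib import Path

SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg"}


def list_supported_files(folder: Path) -> list:
    """Return the supported documents in `folder`, sorted by path."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


# Demo on a throwaway folder so the sketch runs without real documents
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ["b.PNG", "a.pdf", "notes.txt"]:
        (folder / name).touch()
    paths = list_supported_files(folder)
    print([p.name for p in paths])  # ['a.pdf', 'b.PNG']
```

In the notebook this would be called as `paths = list_supported_files(input_folder)`, followed by `file_names = [p.name for p in paths]` and `file_paths = [str(p) for p in paths]`.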

## 🚀 Run Agentic Document Extraction

Call the `parse()` function from `agentic_doc` to extract structured data and save results to the output folders.

See https://docs.landing.ai/ade/ade-parse-docs for details.

In [5]:
# Parse documents using LandingAI ADE

result = parse(
    documents=file_paths,
    result_save_dir=str(results_folder),
    grounding_save_dir=str(groundings_folder),
    include_marginalia=True,
    include_metadata_in_markdown=True,
)

2025-07-24 00:54:32 [info] API key is valid. [agentic_doc.utils] (utils.py:44)
2025-07-24 00:54:32 [info] Parsing 5 documents [agentic_doc.parse] (parse.py:322)


Parsing documents:   0%|          | 0/5 [00:00<?, ?it/s]

HTTP Request: POST https://api.va.landing.ai/v1/tools/agentic-document-analysis "HTTP/1.1 200 OK" (_client.py:1025)
2025-07-24 00:54:37 [info] Time taken to successfully parse a document chunk: 4.91 seconds [agentic_doc.parse] (parse.py:683)
2025-07-24 00:54:37 [info] Saving 11 chunks as images to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/groundings_folder/CME_Mendez_ex1_20250723_175437' [agentic_doc.utils] file_path=PosixPath('/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex1.png') file_type=image (utils.py:84)
2025-07-24 00:54:37 [info] Saved the parsed result to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/results_folder/CME_Mendez_ex1_20250723_175437.json' [agentic_doc.parse] (parse.py:407)
HTTP Request: POST https://api.va.landing.a

Parsing documents:  20%|██        | 1/5 [00:04<00:19,  5.00s/it]

HTTP Request: POST https://api.va.landing.ai/v1/tools/agentic-document-analysis "HTTP/1.1 200 OK" (_client.py:1025)
2025-07-24 00:54:37 [info] Time taken to successfully parse a document chunk: 5.04 seconds [agentic_doc.parse] (parse.py:683)
2025-07-24 00:54:37 [info] Saving 4 chunks as images to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/groundings_folder/CME_Mendez_ex3_20250723_175437' [agentic_doc.utils] file_path=PosixPath('/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex3.png') file_type=image (utils.py:84)
2025-07-24 00:54:37 [info] Saved the parsed result to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/results_folder/CME_Mendez_ex3_20250723_175437.json' [agentic_doc.parse] (parse.py:407)
HTTP Request: POST https://api.va.landing.ai

Parsing documents:  40%|████      | 2/5 [00:05<00:07,  2.35s/it]

HTTP Request: POST https://api.va.landing.ai/v1/tools/agentic-document-analysis "HTTP/1.1 200 OK" (_client.py:1025)
2025-07-24 00:54:42 [info] Time taken to successfully parse a document chunk: 5.00 seconds [agentic_doc.parse] (parse.py:683)
2025-07-24 00:54:42 [info] Saving 9 chunks as images to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/groundings_folder/CME_Mendez_ex2_20250723_175442' [agentic_doc.utils] file_path=PosixPath('/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex2.png') file_type=image (utils.py:84)
2025-07-24 00:54:42 [info] Saved the parsed result to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/results_folder/CME_Mendez_ex2_20250723_175442.json' [agentic_doc.parse] (parse.py:407)


Parsing documents: 100%|██████████| 5/5 [00:09<00:00,  1.99s/it]
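
Because `result_save_dir` was set, each parsed document is also written to `results_folder` as a JSON file (the log lines above show the file names). The sketch below lists the top-level keys of every JSON file in a folder using only the standard library. It deliberately assumes nothing about the layout of those files, and `summarize_json_files` is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_json_files(folder: Path) -> dict:
    """Map each *.json file name in `folder` to its sorted top-level keys."""
    summary = {}
    for path in sorted(folder.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Only dict-shaped files have keys; anything else is reported by type name
        summary[path.name] = sorted(data) if isinstance(data, dict) else type(data).__name__
    return summary


# Example usage against the notebook's results folder, if it exists
results_folder = Path.cwd() / "results_folder"
if results_folder.exists():
    for name, keys in summarize_json_files(results_folder).items():
        print(name, "->", keys)
```

Looking at the keys first is a low-risk way to learn the saved structure before writing code that depends on it.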


## 📑 Define Custom Schema for Field Extraction

Using `pydantic`, we define a schema that names the fields to extract from each certificate (e.g., recipient name, issuing organization, and credits awarded).

See https://docs.landing.ai/ade/ade-extract-library

In [7]:
# Import pydantic for schema definition
from pydantic import BaseModel, Field

# Define schema for structured extraction
class CME(BaseModel):
    recipient_name: str = Field(description="Full name of the individual who received the certificate. Only the name. Remove any prefixes such as Mr. Mrs. or Dr. Also remove any credentials that may appear after the name such as BS, MD, DDS, RN")
    issuing_org: str = Field(description="Full name of the organization issuing the certificate.")
    activity_title: str = Field(description="Title of the CME activity or material completed by the recipient.")
    date_awarded: date = Field(description="Date when the certificate or credit was awarded.")
    credit_awarded: str = Field(description="Amount and type of CME credit awarded to the recipient.")
    credit_numeric: float = Field(description="Amount of CME credit awarded.")
    ama_pra_cat1: bool = Field(description="True if the CME credits awarded qualify for AMA PRA Category 1.")
    ama_pra_cat2: bool = Field(description="True if the CME credits awarded qualify for AMA PRA Category 2.")


## 🚀 Run Agentic Document Extraction with Schema

Call `parse()` again, this time with the schema, so that each result also carries the extracted fields. This call omits `result_save_dir`, so no JSON result files are written for this run.

Pass the schema class as the `extraction_model` argument to `parse()`.

To learn more about parsing visit [https://docs.landing.ai/ade/ade-parse-docs](https://docs.landing.ai/ade/ade-parse-docs).

In [8]:
# Run ADE using the custom CME schema for structured field extraction
result_fe = parse(
    documents=file_paths,
    grounding_save_dir=str(groundings_folder),
    extraction_model=CME,  # This line is new
)

2025-07-24 00:55:30 [info] API key is valid. [agentic_doc.utils] (utils.py:44)
2025-07-24 00:55:30 [info] Parsing 5 documents [agentic_doc.parse] (parse.py:261)


Parsing documents:   0%|          | 0/5 [00:00<?, ?it/s]

HTTP Request: POST https://api.va.landing.ai/v1/tools/agentic-document-analysis "HTTP/1.1 200 OK" (_client.py:1025)
2025-07-24 00:55:37 [info] Time taken to successfully parse a document chunk: 6.80 seconds [agentic_doc.parse] (parse.py:683)
2025-07-24 00:55:37 [info] Saving 11 chunks as images to '/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/groundings_folder/CME_Mendez_ex1_20250723_175537' [agentic_doc.utils] file_path=PosixPath('/Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/input_folder/CME_Mendez_ex1.png') file_type=image (utils.py:84)
HTTP Request: POST https://api.va.landing.ai/v1/tools/agentic-document-analysis "HTTP/1.1 200 OK" (_client.py:1025)
2025-07-24 00:55:37 [info] Time taken to successfully parse a document chunk: 6.96 seconds [agentic_doc.parse] (parse.py:683)
202

Parsing documents: 100%|██████████| 5/5 [00:20<00:00,  4.05s/it]


In [9]:
# View results
result_fe

[ParsedDocument(markdown='CONTINUING MEDICAL EDUCATION CERTIFICATE <!-- text, from page 0 (l=0.181,t=0.070,r=0.816,b=0.130), with ID 9c114bcf-f890-4fb6-875a-e6d200481936 -->\n\nMedscape <!-- text, from page 0 (l=0.391,t=0.159,r=0.602,b=0.248), with ID 9f011437-38e5-485e-a4a8-369a86b0432b -->\n\ncertifies that <!-- text, from page 0 (l=0.438,t=0.259,r=0.560,b=0.300), with ID 7842b8fe-d069-4fc4-a99a-96110b13894b -->\n\nManoel Cortes Mendez <!-- text, from page 0 (l=0.397,t=0.313,r=0.601,b=0.350), with ID badc98f3-ecdc-4b3d-9f73-0c012d638bc9 -->\n\nhas participated in the enduring material titled <!-- text, from page 0 (l=0.298,t=0.364,r=0.699,b=0.407), with ID 38de8ffa-caa1-4acc-b920-9e7ed45b95fd -->\n\nPatient Case: To Screen or Not to Screen for Chronic Kidney Disease <!-- text, from page 0 (l=0.117,t=0.419,r=0.880,b=0.468), with ID d413ebf1-e760-4e1a-9235-7c30c26325af -->\n\nOctober 18, 2022 <!-- text, from page 0 (l=0.415,t=0.482,r=0.583,b=0.521), with ID d36835bf-1f2e-42a1-80ba-0f4e

## 🔍 Explore Field Extraction Outputs

Dive into the result to understand the contents and structure.

In [10]:
# Access one document from the results
doc = result_fe[0] # Choose index based on available docs

In [11]:
# Extract various outputs
markdown_output = doc.markdown
chunk_output = doc.chunks
doc_type = doc.doc_type
result_path = str(doc.result_path)

# Print metadata
print("Document Type:", doc_type)
print("Result Path:", result_path)
print("Markdown Summary (first 100 chars):")
print(markdown_output[:100])

# Access and iterate through chunks
print(f"Total Chunks: {len(doc.chunks)}")

for i, chunk in enumerate(doc.chunks):
    print(f"\n--- Chunk {i+1} ---")
    print("Chunk ID:", chunk.chunk_id)
    print("Chunk Type:", chunk.chunk_type.value)  # e.g., 'text', 'figure', etc.
    print("Text (shortened):", chunk.text[:100].replace("\n", " "), "...")

    # Access grounding (box and image path)
    for grounding in chunk.grounding:
        box = grounding.box
        print("  Page:", grounding.page)
        print(f"  Box (l, t, r, b): ({box.l:.3f}, {box.t:.3f}, {box.r:.3f}, {box.b:.3f})")
        print("  Image Path:", str(grounding.image_path))


Document Type: image
Result Path: None
Markdown Summary (first 100 chars):
CONTINUING MEDICAL EDUCATION CERTIFICATE <!-- text, from page 0 (l=0.181,t=0.070,r=0.816,b=0.130), w
Total Chunks: 10

--- Chunk 1 ---
Chunk ID: 9c114bcf-f890-4fb6-875a-e6d200481936
Chunk Type: text
Text (shortened): CONTINUING MEDICAL EDUCATION CERTIFICATE ...
  Page: 0
  Box (l, t, r, b): (0.181, 0.070, 0.816, 0.130)
  Image Path: /Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/groundings_folder/CME_Mendez_ex5_20250723_175550/page_0/text_9c114bcf-f890-4fb6-875a-e6d200481936_0.png

--- Chunk 2 ---
Chunk ID: 9f011437-38e5-485e-a4a8-369a86b0432b
Chunk Type: text
Text (shortened): Medscape ...
  Page: 0
  Box (l, t, r, b): (0.391, 0.159, 0.602, 0.248)
  Image Path: /Users/andreakropp/Documents/Demos/ADE Demos/Notebooks/CME_Demo/groundings_folder/CME_Mendez_ex5_20250723_175550/page_0/text_9f011437-38e5-485e-a4a8-369a86b0432b_0.png

--- Chunk 3 ---
Chunk ID: 7842b8fe-d069-4fc4-a99a-96110b13894b
Chunk
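
The box values printed above all fall between 0 and 1, which suggests they are fractions of the page width and height rather than pixel positions. Under that assumption, the sketch below converts a box to pixel coordinates for a page image of known size. `box_to_pixels` is a hypothetical helper, and the 1000 x 800 page size is made up for the example.

```python
def box_to_pixels(l: float, t: float, r: float, b: float,
                  width: int, height: int) -> tuple:
    """Scale a relative (l, t, r, b) box to integer pixel coordinates."""
    return (
        round(l * width),
        round(t * height),
        round(r * width),
        round(b * height),
    )


# Box of the first chunk from the output above, on an assumed 1000 x 800 px page
print(box_to_pixels(0.181, 0.070, 0.816, 0.130, width=1000, height=800))
# (181, 56, 816, 104)
```

With Pillow installed, a tuple in this (left, upper, right, lower) order can be passed to `Image.crop()` to cut the chunk out of the original page image.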

In [12]:
# print the field extractions
doc.extraction

CME(recipient_name='Manoel Cortes Mendez', issuing_org='Medscape', activity_title='Patient Case: To Screen or Not to Screen for Chronic Kidney Disease', date_awarded=datetime.date(2022, 10, 18), credit_awarded='0.25 AMA PRA Category 1 Credit(s)™', credit_numeric=0.25, ama_pra_cat1=True, ama_pra_cat2=False)

In [13]:
# print the metadata for the field extractions
doc.extraction_metadata

CMEMetadata(recipient_name={'chunk_references': ['badc98f3-ecdc-4b3d-9f73-0c012d638bc9']}, issuing_org={'chunk_references': ['9f011437-38e5-485e-a4a8-369a86b0432b']}, activity_title={'chunk_references': ['d413ebf1-e760-4e1a-9235-7c30c26325af']}, date_awarded={'chunk_references': ['d36835bf-1f2e-42a1-80ba-0f4e9edd1dac']}, credit_awarded={'chunk_references': ['370723e8-c2fa-494f-98c3-b43c4a88e711']}, credit_numeric={'chunk_references': ['370723e8-c2fa-494f-98c3-b43c4a88e711']}, ama_pra_cat1={'chunk_references': ['370723e8-c2fa-494f-98c3-b43c4a88e711']}, ama_pra_cat2={'chunk_references': []})

In [14]:
# print the extracted recipient name
doc.extraction.recipient_name

'Manoel Cortes Mendez'

In [15]:
# print the chunk from which the recipient name is extracted
# note that there can be more than one, so this is returned as a list
doc.extraction_metadata.recipient_name['chunk_references']

['badc98f3-ecdc-4b3d-9f73-0c012d638bc9']

In [16]:
# print the page number and bounding box location for the chunk
target_id = 'badc98f3-ecdc-4b3d-9f73-0c012d638bc9'  # Update this value based on the response above

# Search through chunks to find the one with the matching ID
for chunk in doc.chunks:
    if chunk.chunk_id == target_id:
        print("Chunk type:", chunk.chunk_type)
        print("Chunk text:", chunk.text)
        for grounding in chunk.grounding:
            box = grounding.box
            print("Page:", grounding.page)
            print(f"Box Coordinates:")
            print(f"  Left (l):   {box.l}")
            print(f"  Top (t):    {box.t}")
            print(f"  Right (r):  {box.r}")
            print(f"  Bottom (b): {box.b}")
        break
else:
    print("Chunk ID not found.")

Chunk type: ChunkType.text
Chunk text: Manoel Cortes Mendez
Page: 0
Box Coordinates:
  Left (l):   0.3972756862640381
  Top (t):    0.3128066062927246
  Right (r):  0.6005247235298157
  Bottom (b): 0.35027849674224854
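
When several fields need to be traced back to their chunks, scanning `doc.chunks` once per ID gets repetitive. A possible alternative is to index the chunks by ID a single time. The sketch uses `SimpleNamespace` stand-ins so that it runs without an API call; it assumes only the `chunk_id` and `text` attributes used in the cells above, and `index_chunks` is a hypothetical helper.

```python
from types import SimpleNamespace


def index_chunks(chunks) -> dict:
    """Build a {chunk_id: chunk} lookup from an iterable of chunk objects."""
    return {chunk.chunk_id: chunk for chunk in chunks}


# Stand-in chunks that mimic the attributes printed above (not real ADE objects)
demo_chunks = [
    SimpleNamespace(chunk_id="9f011437-38e5-485e-a4a8-369a86b0432b", text="Medscape"),
    SimpleNamespace(chunk_id="badc98f3-ecdc-4b3d-9f73-0c012d638bc9", text="Manoel Cortes Mendez"),
]

chunk_by_id = index_chunks(demo_chunks)
for ref in ["badc98f3-ecdc-4b3d-9f73-0c012d638bc9", "missing-id"]:
    chunk = chunk_by_id.get(ref)
    print(ref, "->", chunk.text if chunk else "Chunk ID not found.")
```

In the notebook, `index_chunks(doc.chunks)` would take the place of the search loop, and each entry in a field's `chunk_references` list becomes a dictionary lookup.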


In [17]:
# print specific fields and the associated metadata
print(f"The recipient name is: {doc.extraction.recipient_name}. This is extracted from chunk {doc.extraction_metadata.recipient_name['chunk_references']}")
print(f"The CME activity is described as: {doc.extraction.activity_title}. This is extracted from chunk {doc.extraction_metadata.activity_title['chunk_references']}")
print(f"The activity is eligible for AMA PRA Category 1: {doc.extraction.ama_pra_cat1}. This is extracted from chunk {doc.extraction_metadata.ama_pra_cat1['chunk_references']}")
print(f"The certificate awards credits in the amount of: {doc.extraction.credit_numeric}. This is extracted from chunk {doc.extraction_metadata.credit_numeric['chunk_references']}")

The recipient name is: Manoel Cortes Mendez. This is extracted from chunk ['badc98f3-ecdc-4b3d-9f73-0c012d638bc9']
The CME activity is described as: Patient Case: To Screen or Not to Screen for Chronic Kidney Disease. This is extracted from chunk ['d413ebf1-e760-4e1a-9235-7c30c26325af']
The activity is eligible for AMA PRA Category 1: True. This is extracted from chunk ['370723e8-c2fa-494f-98c3-b43c4a88e711']
The certificate awards credits in the amount of: 0.25. This is extracted from chunk ['370723e8-c2fa-494f-98c3-b43c4a88e711']
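
`credit_awarded` and `credit_numeric` are extracted as separate fields from the same chunk, so one can serve as a sanity check on the other. The sketch below pulls the first number out of the text field with a regular expression and compares it to the numeric field. `first_number` and `credits_agree` are hypothetical helpers, and the tolerance is an arbitrary choice.

```python
import math
import re
from typing import Optional


def first_number(text: str) -> Optional[float]:
    """Return the first integer or decimal number found in `text`, if any."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def credits_agree(credit_awarded: str, credit_numeric: float, tol: float = 1e-9) -> bool:
    """True when the first number in the text matches the numeric field."""
    parsed = first_number(credit_awarded)
    return parsed is not None and math.isclose(parsed, credit_numeric, abs_tol=tol)


# First value pair taken from the extraction output shown earlier
print(credits_agree("0.25 AMA PRA Category 1 Credit(s)™", 0.25))  # True
# Without a leading amount, the "1" in "Category 1" is picked up instead
print(credits_agree("AMA PRA Category 1 Credit(s)™", 0.25))       # False
```

A mismatch does not say which field is wrong; it only flags the document for manual review.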


## 💾 Convert to Table and Save

Convert the field extractions to a pandas DataFrame and save it to the results folder created earlier.

In [19]:
import pandas as pd

# result_fe is the list of ParsedDocument objects returned by parse()

# Build one row per document: the extracted CME fields plus,
# for every field, the IDs of the chunks it was extracted from.
# file_names is paired with result_fe by position.
records = []
for file_name, doc in zip(file_names, result_fe):
    body = doc.extraction
    meta = doc.extraction_metadata
    cme_dict = {
        "document_name": file_name,
        "recipient_name": body.recipient_name,
        "issuing_org": body.issuing_org,
        "activity_title": body.activity_title,
        "date_awarded": body.date_awarded,
        "credit_awarded": body.credit_awarded,
        "credit_numeric": body.credit_numeric,
        "ama_pra_cat1": body.ama_pra_cat1,
        "ama_pra_cat2": body.ama_pra_cat2,
        "recipient_name_ref": meta.recipient_name['chunk_references'],
        "issuing_org_ref": meta.issuing_org['chunk_references'],
        "activity_title_ref": meta.activity_title['chunk_references'],
        "date_awarded_ref": meta.date_awarded['chunk_references'],
        "credit_awarded_ref": meta.credit_awarded['chunk_references'],
        "credit_numeric_ref": meta.credit_numeric['chunk_references'],
        "ama_pra_cat1_ref": meta.ama_pra_cat1['chunk_references'],
        "ama_pra_cat2_ref": meta.ama_pra_cat2['chunk_references'],
    }
    records.append(cme_dict)

# Create DataFrame
df = pd.DataFrame(records)
df


Unnamed: 0,document_name,recipient_name,issuing_org,activity_title,date_awarded,credit_awarded,credit_numeric,ama_pra_cat1,ama_pra_cat2,recipient_name_ref,issuing_org_ref,activity_title_ref,date_awarded_ref,credit_awarded_ref,credit_numeric_ref,ama_pra_cat1_ref,ama_pra_cat2_ref
0,CME_Mendez_ex5.png,Manoel Cortes Mendez,Medscape,Patient Case: To Screen or Not to Screen for C...,2022-10-18,0.25 AMA PRA Category 1 Credit(s)™,0.25,True,False,[badc98f3-ecdc-4b3d-9f73-0c012d638bc9],[9f011437-38e5-485e-a4a8-369a86b0432b],[d413ebf1-e760-4e1a-9235-7c30c26325af],[d36835bf-1f2e-42a1-80ba-0f4e9edd1dac],[370723e8-c2fa-494f-98c3-b43c4a88e711],[370723e8-c2fa-494f-98c3-b43c4a88e711],[370723e8-c2fa-494f-98c3-b43c4a88e711],[]
1,CME_Mendez_ex4.png,Manoel Cortes Mendez,The University of Texas MD Anderson Cancer Center,Cancer Survivorship Series: Module 1 - Overvie...,2022-10-18,0.75 AMA PRA Category 1 Credit(s)™,0.75,True,False,[83cd274f-d42f-4e6f-864e-fbf8c0135842],[c7c64e3a-a18f-4e27-804c-0328c0dbed33],[cabeb27a-8585-45ec-964e-8ce6e15b2fc1],[cabeb27a-8585-45ec-964e-8ce6e15b2fc1],[ed113d16-79b0-455d-936d-12cf9f2d655a],[ed113d16-79b0-455d-936d-12cf9f2d655a],[ed113d16-79b0-455d-936d-12cf9f2d655a],[]
2,CME_Mendez_ex1.png,Manoel Cortes Mendez,Warren Alpert Medical School,"Fears, Bias and Discrimination - Substance Use...",2022-08-19,1.00 AMA PRA Category 1 Credits™,1.0,True,False,[b775d30d-cdb3-4860-8fc0-a844ccedeed7],[1a4a95e3-2abf-4cf9-b123-bcdd474d5911],[fc6cdb59-ad5f-4c26-ac05-43b488422edf],[4f2be294-1bcc-44ca-a746-80e84dca97f7],[e460b48a-3c8e-4384-911a-7c5f17028bb8],[e460b48a-3c8e-4384-911a-7c5f17028bb8],[e460b48a-3c8e-4384-911a-7c5f17028bb8],[]
3,CME_Mendez_ex3.png,Manoel Cortes Mendez,Johns Hopkins University School of Medicine,Pain Medicine Management - Pain Management of ...,2022-08-18,1.00 AMA PRA Category 1 Credit(s)™,1.0,True,False,[14311a7e-eabe-4973-a5bf-5c194b8d1417],[b7a51ca8-5c88-4811-9a1d-e6f0ef111a03],[14311a7e-eabe-4973-a5bf-5c194b8d1417],[14311a7e-eabe-4973-a5bf-5c194b8d1417],[14311a7e-eabe-4973-a5bf-5c194b8d1417],[14311a7e-eabe-4973-a5bf-5c194b8d1417],[14311a7e-eabe-4973-a5bf-5c194b8d1417],[]
4,CME_Mendez_ex2.png,Manoel Cortes Mendez,Stanford University School of Medicine,Introduction to Food and Health,2022-06-27,2.50 AMA PRA Category 1 Credit(s),2.5,True,False,[6e7f43ec-eaf7-491d-96c7-b2c582a1b204],[6e7f43ec-eaf7-491d-96c7-b2c582a1b204],[46d11c92-c3e7-4ee4-bf24-98cafb9f5062],[d7e5c2bb-f056-461b-b81b-59f396883c9c],[0a3d4736-ef6f-4822-893f-445c742d495a],[0a3d4736-ef6f-4822-893f-445c742d495a],[0a3d4736-ef6f-4822-893f-445c742d495a],[]


In [20]:
# Save the DataFrame to a CSV file inside the results_folder
csv_path = results_folder / "cme_output.csv"
df.to_csv(csv_path, index=False)
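
The `*_ref` columns hold Python lists, so in the CSV file each of those cells becomes text. The sketch below turns such a cell back into a list after the file is read. It assumes the cell contains the plain `str()` form of a list of strings, for example `"['id-1', 'id-2']"`, and `parse_refs` is a hypothetical helper.

```python
import ast


def parse_refs(cell: str) -> list:
    """Convert a CSV cell like "['id-1', 'id-2']" back into a list of strings."""
    value = ast.literal_eval(cell)
    if not isinstance(value, list):
        raise ValueError(f"Expected a list literal, got: {cell!r}")
    return [str(item) for item in value]


# A chunk ID from the table above, written the way a list prints as text
print(parse_refs("['370723e8-c2fa-494f-98c3-b43c4a88e711']"))
print(parse_refs("[]"))
```

With pandas, a function like this can be supplied per column through the `converters` argument of `pd.read_csv`, e.g. `converters={"recipient_name_ref": parse_refs}`.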

## ✅ Wrap-Up

You’ve now used LandingAI’s ADE to:
- Parse and extract data from images or PDFs
- Define custom fields using `pydantic`
- Export structured results to a table

To learn more, visit the [LandingAI Documentation](https://docs.landing.ai/ade/ade-overview).