# Using the Raw API

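This notebook shows how to parse a document by calling the raw parsing REST API directly with `requests`: upload a file, poll for the job result, and inspect the returned pages.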

Status:
| Last Executed | Version | State      |
|---------------|---------|------------|
| Aug-18-2025   | N/A     | Maintained |

In [None]:
!wget "https://arxiv.org/pdf/1706.03762.pdf" -O "./attention.pdf"

In [None]:
api_key = "llx-..."
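A key pasted into a notebook cell is easy to commit by accident. Below is a minimal sketch of reading it from the environment instead. The helper name `load_api_key` and the variable name `LLAMA_CLOUD_API_KEY` are assumptions for illustration; use whatever name you exported in your shell.

```python
import os


def load_api_key(env_var: str = "LLAMA_CLOUD_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is an assumption, not something this
    notebook defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With this in place, `api_key = load_api_key()` can stand in for the literal assignment above.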

In [None]:
import mimetypes
import requests
import time

headers = {"Authorization": f"Bearer {api_key}"}
file_path = "./attention.pdf"
base_url = "https://api.cloud.llamaindex.ai/api/parsing"

with open(file_path, "rb") as f:
    mime_type = mimetypes.guess_type(file_path)[0]
    files = {"file": (f.name, f, mime_type)}
    body = {
        "parse_mode": "parse_page_with_agent",
        "model": "openai-gpt-4-1-mini",
        "high_res_ocr": True,
        "adaptive_long_table": True,
        "outlined_table_extraction": True,
        "output_tables_as_HTML": True,
    }

    # send the request, upload the file
    url = f"{base_url}/upload"
    response = requests.post(url, headers=headers, files=files, data=body)

response.raise_for_status()
# get the job id for the result_url
job_id = response.json()["id"]
result_type = "json"  # or "markdown"
result_url = f"{base_url}/job/{job_id}/result/{result_type}"

# check for the result until it's ready
while True:
    response = requests.get(result_url, headers=headers)
    if response.status_code == 200:
        break

    time.sleep(2)

# parse the JSON body of the final response
result = response.json()
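The polling loop above retries forever if the job never reaches a 200 response. One possible variant, sketched here under stated assumptions, bounds the wait with a deadline. The helper `poll_until_ready` and its parameters are hypothetical and not part of the API; it takes any callable that returns the result or `None`, so the HTTP details stay in the caller.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    fetch: Callable[[], Optional[dict]],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
) -> dict:
    """Call fetch() until it returns a non-None result or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result after {timeout_s} seconds")
        time.sleep(interval_s)


# Example wiring with the notebook's variables (not executed here):
#
# def fetch_result():
#     response = requests.get(result_url, headers=headers)
#     return response.json() if response.status_code == 200 else None
#
# result = poll_until_ready(fetch_result)
```

The default timeout and interval are arbitrary choices; tune them to the size of the documents you parse.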

In [None]:
print(result.keys())

dict_keys(['pages', 'job_metadata'])


In [None]:
print(result["pages"][0].keys())

dict_keys(['page', 'text', 'md', 'images', 'charts', 'items', 'status', 'originalOrientationAngle', 'links', 'width', 'height', 'triggeredAutoMode', 'parsingMode', 'structuredData', 'noStructuredContent', 'noTextContent', 'pageHeaderMarkdown', 'pageFooterMarkdown', 'confidence'])
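Each page carries its own `md` field, so rebuilding the whole document means joining those fields. A minimal sketch follows, assuming the pages are already listed in reading order. The helper name `pages_to_markdown` and the separator are illustrative choices.

```python
def pages_to_markdown(result: dict, separator: str = "\n\n---\n\n") -> str:
    """Concatenate the "md" field of every page, keeping the list order."""
    pages = result.get("pages", [])
    # Pages without an "md" key contribute an empty string.
    return separator.join(page.get("md", "") for page in pages)
```

Calling `pages_to_markdown(result)` on the response above would return one markdown string covering every parsed page.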


In [None]:
print(result["pages"][0]["md"][:2000])


Provided proper attribution is provided, Google hereby grants permission to reproduce the tables and figures in this paper solely for use in journalistic or scholarly works.

# Attention Is All You Need

**Ashish Vaswani***  
Google Brain  
avaswani@google.com  

**Noam Shazeer***  
Google Brain  
noam@google.com  

**Niki Parmar***  
Google Research  
nikip@google.com  

**Jakob Uszkoreit***  
Google Research  
usz@google.com  

**Llion Jones***  
Google Research  
llion@google.com  

**Aidan N. Gomez* †**  
University of Toronto  
aidan@cs.toronto.edu  

**Łukasz Kaiser***  
Google Brain  
lukaszkaiser@google.com  

**Illia Polosukhin* ‡**  
illia.polosukhin@gmail.com  

## Abstract

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, 