# Using the Contextual AI Document Parser

This notebook demonstrates how to use `/parse` directly through the [Contextual API](https://docs.contextual.ai/api-reference/parse/parse-file) and through our [Python SDK](https://github.com/ContextualAI/contextual-client-python/tree/main). We'll use the same document, [Attention is All You Need](https://arxiv.org/pdf/1706.03762), for both. See our [blog post](https://contextual.ai/blog/...) for more details on how the parser compares to others.

This notebook has 7 major sections:

0. Fetch doc and API key
1. REST API implementation
2. Contextual SDK
3. Parse UI
4. Output Types
5. Hierarchy Metadata
6. Table Extraction

You can run this notebook entirely in Colab:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ContextualAI/examples/blob/main/03-standalone-api/04-parse/parse.ipynb)

## 2. Contextual SDK

In [1]:
api_key="YOUR_API_KEY"

In [2]:
try:
  from contextual import ContextualAI
except ImportError:
  # Install the SDK only if it is missing, then retry the import
  %pip install --upgrade --quiet contextual-client
  from contextual import ContextualAI

# Setup Contextual Python SDK
client = ContextualAI(api_key=api_key)

### 2.4 Get `/parse` Job Results

Here we fetch `/parse` results for jobs submitted through the UI playground on this tenant:

https://app.contextual.ai/akash-contextual-ai/components/parse

In [132]:
job_id = "55bd4791-560a-46f7-b66f-fbe11bfa8b36"    # DeepSeek scaling report
# job_id = "2e9c1615-c293-4475-b9d8-f9f6536bdf86"  # Qwen 3 tech report

In [133]:
parsed_document = client.parse.job_results(job_id, output_types=['markdown-per-page', 'blocks-per-page'])
# parsed_document

### 2.5 Display `/parse` Results

The `parsed_document` is a Pydantic model with `.pages` and `.document_metadata` top-level fields.

In [125]:
from IPython.display import display, Markdown

In [None]:
display(Markdown(parsed_document.pages[0].markdown))

In [134]:
# Human-readable markdown version of .document_metadata.hierarchy.blocks
display(Markdown(parsed_document.document_metadata.hierarchy.table_of_contents))

# Document Hierarchy

- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures [(Page 0)](#insights-into-deepseek-v3-scaling-challenges-and-reflections-on-hardware-for-ai-architectures)
  - Abstract [(Page 0)](#abstract)
  - CCS Concepts [(Page 0)](#ccs-concepts)
  - Keywords [(Page 0)](#keywords)
  - ACM Reference Format: [(Page 0)](#acm-reference-format)
  - 1 Introduction [(Page 0)](#1-introduction)
    - 1.1 Background [(Page 0)](#11-background)
    - 1.2 Objectives [(Page 1)](#12-objectives)
    - 1.3 Structure of this Paper [(Page 1)](#13-structure-of-this-paper)
  - 2 Design Principles for DeepSeek Models [(Page 1)](#2-design-principles-for-deepseek-models)
    - 2.1 Memory Efficiency [(Page 1)](#21-memory-efficiency)
      - 2.1.1 Low-Precision Models [(Page 1)](#211-low-precision-models)
      - 2.1.2 Reducing KV Cache with MLA [(Page 1)](#212-reducing-kv-cache-with-mla)
    - 2.2 Cost-Effectiveness of MoE Models [(Page 2)](#22-cost-effectiveness-of-moe-models)
    - 2.3 Increasing Inference Speed [(Page 3)](#23-increasing-inference-speed)
    - 2.4 Technique Validation Methodology [(Page 4)](#24-technique-validation-methodology)
  - 3 Low-Precision Driven Design [(Page 4)](#3-low-precision-driven-design)
    - 3.1 FP8 Mix-Precision Training [(Page 4)](#31-fp8-mix-precision-training)
    - 3.2 LogFMT: Communication Compression [(Page 5)](#32-logfmt-communication-compression)
  - 4 Interconnection Driven Design [(Page 5)](#4-interconnection-driven-design)
    - 4.1 Current Hardware Architecture [(Page 5)](#41-current-hardware-architecture)
    - 4.2 Hardware-Aware Parallelism [(Page 5)](#42-hardware-aware-parallelism)
    - 4.3 Model Co-Design: Node-Limited Routing [(Page 6)](#43-model-co-design-node-limited-routing)
    - 4.4 Scale-Up and Scale-Out Convergence [(Page 6)](#44-scale-up-and-scale-out-convergence)
      - 4.4.1 Limitations of Current Implementations [(Page 6)](#441-limitations-of-current-implementations)
      - 4.4.2 Suggestions [(Page 6)](#442-suggestions)
    - 4.5 Bandwidth Contention and Latency [(Page 7)](#45-bandwidth-contention-and-latency)
  - 5 Large Scale Network Driven Design [(Page 7)](#5-large-scale-network-driven-design)
    - 5.1 Network Co-Design: Multi-Plane Fat-Tree [(Page 7)](#51-network-co-design-multi-plane-fat-tree)
    - 5.2 Low Latency Networks [(Page 8)](#52-low-latency-networks)
  - 6 Discussion and Insights for Future Hardware Architecture Design [(Page 10)](#6-discussion-and-insights-for-future-hardware-architecture-design)
    - 6.1 Robustness Challenges [(Page 10)](#61-robustness-challenges)
    - 6.2 CPU Bottlenecks and Interconnects [(Page 10)](#62-cpu-bottlenecks-and-interconnects)
    - 6.3 Toward Intelligent Networks for AI [(Page 10)](#63-toward-intelligent-networks-for-ai)
    - 6.4 Discussion on Memory-Semantic Communication and Ordering Issue [(Page 11)](#64-discussion-on-memory-semantic-communication-and-ordering-issue)
    - 6.5 In-Network Computation and Compression [(Page 11)](#65-in-network-computation-and-compression)
    - 6.6 Memory-Centric Innovations [(Page 11)](#66-memory-centric-innovations)
      - 6.6.1 Limitations of Memory Bandwidth [(Page 11)](#661-limitations-of-memory-bandwidth)
      - 6.6.2 Suggestions: [(Page 11)](#662-suggestions)
  - 7 Conclusion [(Page 11)](#7-conclusion)
  - References [(Page 12)](#references)
  - References [(Page 13)](#references)

Next, we inspect a single heading node in the hierarchy, "### 1.2 Objectives". Note its structure: `.id`, `.parent_ids`, `.markdown`, `.page_index`, and more.

In [136]:
# parsed_document.document_metadata.hierarchy.blocks
parsed_document.document_metadata.hierarchy.blocks[7]

DocumentMetadataHierarchyBlock(id='9f6de97d-ce68-1ecf-a589-df984eb75d6a', bounding_box=DocumentMetadataHierarchyBlockBoundingBox(x0=0.08790523553985397, x1=0.21134562274209814, y0=0.10427671490293561, y1=0.12276158188328598), markdown='### 1.2 Objectives', type='heading', confidence_level=None, hierarchy_level=2, page_index=1, parent_ids=['a30ad33b-eb79-1e24-f714-b65521d53876', '56e22499-de21-20c0-ef93-4ed41f58e85a'])
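
The `parent_ids` list above has two entries for a heading at `hierarchy_level=2`, which suggests it lists ancestors from the root heading down to the immediate parent. Under that assumption, a heading's full path can be rebuilt by looking up each ancestor. The sketch below uses a hypothetical `HeadingBlock` stand-in and made-up ids rather than real `/parse` output, and `breadcrumb` is an illustrative helper, not part of the SDK.

```python
from dataclasses import dataclass, field


@dataclass
class HeadingBlock:
    """Hypothetical stand-in for a hierarchy block (only the fields used here)."""
    id: str
    markdown: str
    parent_ids: list[str] = field(default_factory=list)


def breadcrumb(block: HeadingBlock, heading_by_id: dict[str, HeadingBlock]) -> str:
    """Join ancestor titles and the block's own title, assuming root-first parent_ids."""
    titles = [heading_by_id[pid].markdown.lstrip("# ") for pid in block.parent_ids]
    titles.append(block.markdown.lstrip("# "))
    return " > ".join(titles)


# Made-up ids standing in for the UUIDs returned by /parse
headings = [
    HeadingBlock("t", "# Paper Title"),
    HeadingBlock("i", "## 1 Introduction", ["t"]),
    HeadingBlock("o", "### 1.2 Objectives", ["t", "i"]),
]
heading_by_id = {h.id: h for h in headings}
path = breadcrumb(heading_by_id["o"], heading_by_id)
print(path)
```

With real results, the lookup dict would be built from `parsed_document.document_metadata.hierarchy.blocks` instead of the stand-in list.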

## Agent-navigable document

In [149]:
class ParsedDocumentForAgent:
    """
    Wraps `/parse` output and exposes tool functions that let an LLM agent
    navigate and interact with the parsed document.

    0. read_document() -> str
    1. read_hierarchy() -> str, list[dict[id, level, markdown, page_index]]
    2. read_pages(page_indexes) -> str
    3. read_hierarchy_section(heading_block_id) -> str
    """
    def __init__(self, parsed_document):
        self.parsed_document = parsed_document
        self.block_map = {block.id: block for page in self.parsed_document.pages for block in page.blocks}
        self.heading_block_map = {block.id: block for block in self.parsed_document.document_metadata.hierarchy.blocks}
    
    def read_document(self) -> str:
        """
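        Read contents of the entire document as markdown (may be large).
        Note: `markdown_document` is likely only populated when `'markdown-document'`
        is included in the requested `output_types`.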
        """
        return self.parsed_document.markdown_document
    
    def read_hierarchy(self) -> tuple[str, list[dict]]:
        """
        Read the outline structure of the document as:
            (i) human/LLM readable markdown nested list
            (ii) LLM referenceable list of structured dicts

        Either (ii) alone or both can be given to an LLM as context so it can navigate the document and reference specific sections
        """
        hierarchy_markdown = self.parsed_document.document_metadata.hierarchy.table_of_contents

        hierarchy_list = []
        for block in self.parsed_document.document_metadata.hierarchy.blocks:
            hierarchy_list.append({
                "block_id": block.id, # might need to translate the UUID to an LLM-friendly integer index instead
                "hierarchy_level": block.hierarchy_level,
                "markdown": block.markdown,
                "page_index": block.page_index
            })
        return hierarchy_markdown, hierarchy_list
    
    def read_pages(self, page_indexes: list[int]) -> str:
        """
        Read the contents of the document for the provided page indexes
        """
        page_separator = "\n\n---\nPage index: {page_index}\n\n"
        content = ""
        for page_index in page_indexes:
            content += page_separator.format(page_index=page_index) + self.parsed_document.pages[page_index].markdown
        return content
        
    def read_hierarchy_section(self, heading_block_id: str) -> str:
        """
        Read the contents of all blocks nested under the heading block referenced by `heading_block_id`
        (includes nested subsections; the heading block itself is not included)
        """
        heading_block = self.heading_block_map[heading_block_id]
        parent_path_prefix = heading_block.parent_ids + [heading_block_id]

        section_blocks = []
        for page in self.parsed_document.pages:
            for block in page.blocks:
                # filter for blocks that share the same parent path
                if block.parent_ids[:len(parent_path_prefix)] == parent_path_prefix:
                    section_blocks.append(block)
        
        section_content = "\n".join([block.markdown for block in section_blocks])
        return section_content
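
The prefix match in `read_hierarchy_section` can be seen in isolation on a toy document. This is a minimal sketch with a hypothetical `Block` stand-in and made-up ids, assuming (as the class does) that every block carries its ancestor heading ids in root-first order. It shows that a section picks up its nested subsections but not sibling sections or the heading itself.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a page block (only the fields used here)."""
    id: str
    markdown: str
    parent_ids: list[str]


def section_blocks(heading: Block, blocks: list[Block]) -> list[Block]:
    """Return blocks whose ancestor path starts with the heading's own path."""
    prefix = heading.parent_ids + [heading.id]
    return [b for b in blocks if b.parent_ids[:len(prefix)] == prefix]


blocks = [
    Block("h1", "# Title", []),
    Block("h2", "## 2 Design", ["h1"]),
    Block("p1", "Design intro.", ["h1", "h2"]),
    Block("h3", "### 2.1 Memory", ["h1", "h2"]),
    Block("p2", "Memory text.", ["h1", "h2", "h3"]),
    Block("h4", "## 3 Other", ["h1"]),
    Block("p3", "Other text.", ["h1", "h4"]),
]

under_design = [b.id for b in section_blocks(blocks[1], blocks)]
under_memory = [b.id for b in section_blocks(blocks[3], blocks)]
print(under_design)  # ['p1', 'h3', 'p2']
print(under_memory)  # ['p2']
```

Because the heading's own `parent_ids` never contains its own id, the heading line is excluded from its section; prepend `heading.markdown` if the agent should see it.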

In [150]:
navigable_document = ParsedDocumentForAgent(parsed_document)

In [151]:
# block_id = "9f6de97d-ce68-1ecf-a589-df984eb75d6a"
heading_block = parsed_document.document_metadata.hierarchy.blocks[7]
heading_block_id = heading_block.id

print(heading_block.markdown)
heading_block

### 1.2 Objectives


DocumentMetadataHierarchyBlock(id='9f6de97d-ce68-1ecf-a589-df984eb75d6a', bounding_box=DocumentMetadataHierarchyBlockBoundingBox(x0=0.08790523553985397, x1=0.21134562274209814, y0=0.10427671490293561, y1=0.12276158188328598), markdown='### 1.2 Objectives', type='heading', confidence_level=None, hierarchy_level=2, page_index=1, parent_ids=['a30ad33b-eb79-1e24-f714-b65521d53876', '56e22499-de21-20c0-ef93-4ed41f58e85a'])

In [152]:
navigated_markdown = navigable_document.read_hierarchy_section(heading_block_id)

Markdown(navigated_markdown)

This paper does not aim to reiterate the detailed architectural and algorithmic specifics of DeepSeek-V3, which are extensively documented in its technical report [26]. Instead, it adopts a dual perspective—spanning hardware architecture and model design—to explore the intricate interplay between them in achieving cost-efficient large-scale training and inference. By examining this synergy, we aim to provide actionable insights for scaling LLMs efficiently without sacrificing performance or accessibility.
Specifically, the paper focuses on:
- Hardware-Driven Model Design: Analyze how hardware features, such as FP8 low-precision computation and scale-up/scale-out network properties, informed the architectural choices in DeepSeek-V3.
- Mutual Dependencies Between Hardware and Models: Investigate how hardware capabilities shape model innovation and how the evolving demands of LLMs drive the need for next-generation hardware.
- Future Directions for Hardware Development: Derive actionable insights from DeepSeek-V3 to guide the co-design of future hardware and model architectures, paving the way for scalable, cost-efficient AI systems.

In [None]:
navigated_markdown = navigable_document.read_pages([0, 1])

Markdown(navigated_markdown)
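
The comment in `read_hierarchy` notes that UUIDs may need translating into LLM-friendly integer indexes. One possible approach, sketched here with a hypothetical helper `build_index_maps` and made-up ids, is to hand the model an outline keyed by small integers and keep a private lookup back to the real block ids.

```python
def build_index_maps(hierarchy_list: list[dict]) -> tuple[list[dict], dict[int, str]]:
    """Replace block ids with positional indexes and return (outline, index -> block_id)."""
    index_to_id = {i: entry["block_id"] for i, entry in enumerate(hierarchy_list)}
    outline = [
        {
            "index": i,
            "hierarchy_level": entry["hierarchy_level"],
            "markdown": entry["markdown"],
            "page_index": entry["page_index"],
        }
        for i, entry in enumerate(hierarchy_list)
    ]
    return outline, index_to_id


# Made-up entries in the same shape as the second return value of read_hierarchy()
sample = [
    {"block_id": "uuid-a", "hierarchy_level": 0, "markdown": "# Title", "page_index": 0},
    {"block_id": "uuid-b", "hierarchy_level": 1, "markdown": "## 1 Introduction", "page_index": 0},
    {"block_id": "uuid-c", "hierarchy_level": 2, "markdown": "### 1.2 Objectives", "page_index": 1},
]

outline, index_to_id = build_index_maps(sample)
chosen_index = 2  # e.g. the index an agent picked from the outline
print(index_to_id[chosen_index])  # uuid-c
```

The resolved id can then be passed to `read_hierarchy_section`, so the model never has to reproduce a full UUID.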