# 006: Operation Graphs - Academic Claim Validation

Building on the ReAct patterns from tutorial 005, this tutorial demonstrates sequential coordination: an Operation Graph orchestrates ReaderTool workflows to validate the claims made in an academic paper.

## What You'll Learn

1. **Sequential Coordination**: Building context step-by-step through dependent operations
2. **ReaderTool Integration**: Document chunking and progressive reading strategies  
3. **Structured Extraction**: Using Pydantic models for reliable claim extraction
4. **Operation Dependencies**: How operations build on previous results

## Use Case: Validating a Theoretical Framework Paper

We'll validate claims in an academic paper about capability-based security by:
- Reading document chunks progressively with ReaderTool
- Building understanding through sequential analysis
- Extracting verifiable claims with structured output formats
- Demonstrating how each operation builds on the previous one's results
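
Progressive reading can be pictured as walking fixed-size character windows over the document. The sketch below is a plain-Python illustration under that assumption and is not ReaderTool's API; `iter_chunks` and the 2000-character window are hypothetical choices made for the example.

```python
from typing import Iterator


def iter_chunks(text: str, size: int = 2000) -> Iterator[tuple[int, int, str]]:
    """Yield (start, end, chunk) windows that cover `text` in reading order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(text), size):
        end = min(start + size, len(text))
        yield start, end, text[start:end]


# Hypothetical stand-in for a document; any string works.
sample = "x" * 4500
windows = [(start, end) for start, end, _ in iter_chunks(sample, size=2000)]
print(windows)  # [(0, 2000), (2000, 4000), (4000, 4500)]
```

In the workflow itself the model, not a loop, decides which offsets to request next, so it can stop early or skip ahead.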

In [1]:
# Setup and imports
from typing import Literal
from pathlib import Path
from pydantic import BaseModel, Field

from lionagi import Branch, Session, Builder, types, iModel
from lionagi.tools.types import ReaderTool

# Target document - complex theoretical framework
here = Path.cwd()
document_path = here / "data" / "006_lion_proof_ch2.md"

print("✅ Environment setup complete")
print(f"📄 Target: {document_path.name}")
print("🎯 Goal: Validate academic claims using coordinated ReAct workflows")

✅ Environment setup complete
📄 Target: 006_lion_proof_ch2.md
🎯 Goal: Validate academic claims using coordinated ReAct workflows


In [2]:
# Data models for structured responses
class Claim(BaseModel):
    claim: str
    type: Literal["citation", "performance", "technical", "other"]
    location: str = Field(..., description="Section/paragraph reference")
    verifiability: Literal["high", "medium", "low"]
    search_strategy: str = Field(..., description="How to verify this claim")


class ClaimExtraction(BaseModel):
    claims: list[Claim]


print("✅ Data models defined")

✅ Data models defined
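
To see what the structured format buys, the sketch below validates one well-formed payload and one with a label outside the allowed set. It restates the `Claim` model so the block runs on its own and assumes Pydantic v2 (`model_validate`); the payload values are made up for illustration and are not taken from the paper.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Claim(BaseModel):
    claim: str
    type: Literal["citation", "performance", "technical", "other"]
    location: str = Field(..., description="Section/paragraph reference")
    verifiability: Literal["high", "medium", "low"]
    search_strategy: str = Field(..., description="How to verify this claim")


# Made-up payload for illustration only.
good = {
    "claim": "Example assertion",
    "type": "technical",
    "location": "Section 1",
    "verifiability": "high",
    "search_strategy": "Check the cited proof",
}
claim = Claim.model_validate(good)
print(claim.type)  # technical

# A label outside the Literal set is rejected rather than silently accepted.
bad = {**good, "verifiability": "certain"}
try:
    Claim.model_validate(bad)
    rejected = False
except ValidationError as exc:
    rejected = True
    print(f"rejected with {len(exc.errors())} error(s)")
```

Constraining `type` and `verifiability` to fixed vocabularies is what makes later filtering and grouping of claims dependable.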


In [3]:
from enum import Enum

from pydantic import Field, HttpUrl

from lionagi.models import HashableModel


class Source(HashableModel):
    """
    Represents a citation or external source, such as:
     - a website,
     - documentation link,
     - research paper,
     - or any external resource.
    """

    title: str = Field(
        ...,
        description="Short label or title for the reference (e.g. 'Pydantic Docs', 'RFC 3986').",
    )

    url: str | HttpUrl | None = Field(
        None,
        description="Full URL or local path pointing to the resource. URLs should follow standard URL format.",
    )

    note: str | None = Field(
        default=None,
        description=(
            "Optional additional note explaining why this reference is relevant or what it contains."
        ),
    )


class SnippetType(str, Enum):
    TEXT = "text"
    CODE = "code"


class TextSnippet(HashableModel):
    """
    Specialized snippet for textual/prose content.
    """

    type: SnippetType = Field(
        SnippetType.TEXT,
        description=(
            "Must be 'text' for textual snippets. Ensures explicit type distinction."
        ),
    )
    content: str = Field(
        ...,
        description=(
            "The actual text. Can be a paragraph, bullet list, or any narrative content."
        ),
    )


class CodeSnippet(HashableModel):
    """
    Specialized snippet for source code or command-line examples.
    """

    type: SnippetType = Field(
        SnippetType.CODE,
        description=(
            "Must be 'code' for code snippets. Allows separate handling or formatting."
        ),
    )
    content: str = Field(
        ...,
        description=(
            "The actual code or command sequence. Should be well-formatted so it can be rendered properly."
        ),
    )


class Section(HashableModel):
    """
    A single section of a document or article. Each section has:
     - A title
     - A sequential list of content snippets (text or code),
       which appear in the intended reading order.
     - Optional sources specifically cited in this section.
    """

    title: str = Field(
        ...,
        description=(
            "The section heading or label, e.g., 'Introduction', 'Implementation Steps'."
        ),
    )
    snippets: list[TextSnippet | CodeSnippet] = Field(
        default_factory=list,
        description=(
            "Ordered list of content snippets. Could be multiple text blocks, code examples, etc."
        ),
    )

    sources: list[Source] = Field(
        default_factory=list,
        description=(
            "References specifically cited in this section. "
            "If sources are stored at the doc-level, this can be omitted."
        ),
    )


class OutlineItem(HashableModel):
    """
    Represents a single outline item, which could become a full section later.
    """

    heading: str = Field(
        ...,
        description="Short name or label for this item, e.g., 'Chapter 1: Basics'.",
    )
    summary: str | None = Field(
        default=None,
        description=(
            "A brief description of what this section will cover, if known."
        ),
    )


class Outline(HashableModel):
    """
    A top-level outline for a document or article.
    """

    topic: str = Field(
        ..., description="Working title or overarching topic of the document."
    )
    items: list[OutlineItem] = Field(
        default_factory=list,
        description="List of major outline points or sections planned.",
    )
    notes: str | None = Field(
        default=None,
        description="Any additional remarks, questions, or brainstorming notes for the outline.",
    )

## Pattern 1: Sequential Document Analysis

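Build understanding step by step: open → analyze → extract claims. Steps 2 and 3 each name the previous operation in `depends_on` and set `inherit_context=True`, so each one starts from its predecessor's context.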
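
The dependency chain in this pattern can be viewed as a small directed graph. The sketch below uses Python's standard `graphlib` purely to illustrate how dependencies fix the execution order. It is not how lionagi schedules operations, and the `dependencies` mapping is a hand-written assumption that mirrors the `node_id` values used in this pattern.

```python
from graphlib import TopologicalSorter

# Map each operation to the operations it depends on.
dependencies = {
    "open_document": set(),
    "analyze_content": {"open_document"},
    "extract_claims": {"analyze_content"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['open_document', 'analyze_content', 'extract_claims']
```

Because the chain is linear, only one valid order exists, which is why the run shown here has each operation wait for the one before it.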

In [4]:
from lionagi.fields import Instruct


async def sequential_analysis():
    """Sequential workflow: open → analyze structure → extract claims."""

    # Create branch with ReaderTool
    branch = Branch(
        tools=[ReaderTool], chat_model=iModel(model="openai/gpt-4.1-mini")
    )
    session = Session(default_branch=branch)
    builder = Builder()

    # Step 1: Open and understand document
    doc_reader = builder.add_operation(
        "ReAct",
        node_id="open_document",
        instruct=Instruct(
            instruction="Use ReaderTool to open and analyze the theoretical framework document. Understand its structure and identify sections containing verifiable claims.",
            context={"document_path": str(document_path)},
        ),
        tools=["reader_tool"],
        max_extensions=2,
        verbose=True,
        verbose_length=1000,
    )

    # Step 2: Progressive content analysis
    content_analyzer = builder.add_operation(
        "ReAct",
        node_id="analyze_content",
        depends_on=[doc_reader],
        instruct=Instruct(
            instruction="Read through key sections to identify citations, technical claims, and performance metrics that can be verified."
        ),
        response_format=Outline,
        tools=["reader_tool"],
        max_extensions=3,
        verbose=True,
        verbose_length=1000,
        inherit_context=True,
    )

    # Step 3: Extract specific claims
    claim_extractor = builder.add_operation(
        "ReAct",
        node_id="extract_claims",
        depends_on=[content_analyzer],
        instruct=Instruct(
            instruction="Extract 5-7 specific, verifiable claims. Prioritize citations, performance metrics, and technical assertions."
        ),
        response_format=ClaimExtraction,
        tools=["reader_tool"],
        max_extensions=3,
        verbose=True,
        verbose_length=1000,
        inherit_context=True,
    )

    # Execute workflow
    graph = builder.get_graph()
    print("🔗 Executing sequential analysis...")

    result = await session.flow(graph, parallel=False, verbose=True)

    return result


# Execute sequential analysis
result = await sequential_analysis()

🔗 Executing sequential analysis...
Pre-allocated 2 branches
Executing operation: b42e456a


2025-10-15 13:31:53,665 - INFO - detected formats: [<InputFormat.MD: 'md'>]
2025-10-15 13:31:53,667 - INFO - Going to convert document batch...
2025-10-15 13:31:53,668 - INFO - Initializing pipeline for SimplePipeline with options hash 995a146ad601044538e6a923bea22f4e
2025-10-15 13:31:53,679 - INFO - Loading plugin 'docling_defaults'
2025-10-15 13:31:53,681 - INFO - Registered picture descriptions: ['vlm', 'api']
2025-10-15 13:31:53,681 - INFO - Processing document 006_lion_proof_ch2.md
2025-10-15 13:31:54,843 - INFO - Finished converting document 006_lion_proof_ch2.md in 1.18 sec.


### ReAct Round No.1 Analysis:

```yaml
analysis: To understand the theoretical framework document /Users/lion/projects/lionagi/cookbooks/data/006_lion_proof_ch2.md, the crucial first step is to open the document and examine its structure. This will allow identification of major sections, headings, and where verifiable claims—such as hypothesis statements, referenced theories, or empirical results—are presented. This initial opening is required before any partial reading or deeper analysis can happen.
planned_actions:
  - action_type: reader_tool
    description: Open the theoretical framework document located at /Users/lion/projects/lionagi/cookbooks/data/006_lion_proof_ch2.md to analyze its structure and identify sections containing verifiable claims.
extension_needed: True
milestone: Document opened and initial overview obtained.
action_strategy: sequential
action_required: True
reason:
  title: Need to open document before detailed reading
  content: Without opening the document, no structural or content analysis...

[Truncated output]
```

### ReAct Round No.2 Analysis:

```yaml
analysis: The document is large, with 53,001 characters and 12,245 tokens. To understand its structure and identify sections containing verifiable claims, the next step is to read the table of contents or the initial portion (e.g., the first 2000 characters) to extract headings, subheadings and any explicit summary sections. This will provide an overview of the framework's organization and highlight areas likely to contain verifiable claims such as hypotheses, evidence, or citations. Reading the initial segment is necessary before deeper content analysis and claim verification can proceed.
planned_actions:
  - action_type: reader_tool
    description: Read the first 2000 characters of the document DOC_9143726367203047801 to capture the structure including table of contents and section headings.
extension_needed: True
milestone: Obtained document structure and identified candidate sections for verifiable claims.
action_strategy: sequential
action_required: True
reason:
  title: ...

[Truncated output]
```

### ReAct Round No.3 Analysis:

```yaml
analysis: |
  The initial 2000 characters reveal a clear and organized structure for the theoretical framework document. It starts with the introduction (2.1 Introduction) including motivation and contribution overview subsections. The contribution overview lists four main theorems (2.1.2 Contribution Overview), each with a formal proof and related lemmas. These theorems represent key verifiable claims in the text. This introduction also references supporting mechanized models (TLA+ and Lean).
  
  To systematically identify all sections with verifiable claims, the next logical step is to read the table of contents or the next portion of the document that likely contains the section headings for the entire chapter 2. This will allow me to pinpoint all theorem statements and related proofs scattered throughout the document.
  
  Therefore, I will read the next 2000 characters to capture the document outline or table of contents and early parts of the chapter sections beyond intr...

[Truncated output]
```

### ReAct Final Answer:

```yaml
answer: |
  The theoretical framework document is structured into well-defined sections and subsections, enabling a clear understanding of its content and locating verifiable claims. The major structure includes:
  
  1. **2.1 Introduction**
     - 2.1.1 Motivation: Discusses fundamental security challenges in distributed systems.
     - 2.1.2 Contribution Overview: Lists four main theorems representing key verifiable claims:
       - Theorem 2.1 (Cross-Component Capability Flow)
       - Theorem 2.2 (Security Composition)
       - Theorem 2.3 (Confused Deputy Prevention)
       - Theorem 2.4 (Automatic POLA Enforcement)
  
  2. **2.2 System Model and Formal Definitions**
     - 2.2.1 Lion Ecosystem Architecture: Describes core components of the Lion ecosystem.
     - 2.2.2 Formal System Definition: Contains formal definitions such as Definition 2.1 (Lion Capability System as a 7-tuple) and Definition 2.2 (Cross-Component Capability as a 5-tuple).
  
  The sections beginning at...

[Truncated output]
```

Completed operation: b42e456a
Operation 5b2324ec waiting for b42e456a
Operation 5b2324ec inherited context from b42e456a
Executing operation: 5b2324ec


### ReAct Round No.1 Analysis:

```yaml
analysis: |
  To identify citations, technical claims, and performance metrics that can be verified in the theoretical framework document, I will focus on key sections that contain formal theorems, definitions, proofs, and any results or evaluation discussions typically found in a chapter on theory and implementation.
  
  From prior analysis, the document features:
  - Section 2.1 with introduction and theorem overview
  - Section 2.2 with the system model and formal definitions
  
  Since the contribution overview listed four main theorems, I will read the sections containing these theorem statements and their supporting lemmas and proofs. Additionally, I will check for any empirical or performance evaluation sections if present, as they may contain verifiable performance metrics.
  
  Given the size of the document, I propose reading the next key parts where Theorem 2.1 and Theorem 2.2 should be discussed and proven. These sections are likely immediately after the formal def...

[Truncated output]
```

### ReAct Round No.2 Analysis:

```yaml
analysis: |
  The examined section (offsets 4000-8000) reveals detailed formal definitions, theorem statements, and proofs that contain technical claims suitable for verification. Notable content includes:
  
  - Definitions 2.3 to 2.5, which formalize capability authority, component composition compatibility, and security properties respectively. These set the ground for rigorous reasoning about system security.
  
  - The full statement of Theorem 2.1 (Cross-Component Capability Flow Preservation), which asserts that capability authority and unforgeability are preserved across component communications.
  
  - The beginning of Theorem 2.1's proof, outlining lemmas such as Lemma 2.1.1 on WebAssembly isolation preserving capability references. This includes technical claims about host memory separation and cryptographically secure handles.
  
  This content contains concrete formal claims and some references to verified security properties (e.g., unforgeability via cryptographic...

[Truncated output]
```

### ReAct Round No.3 Analysis:

```yaml
analysis: |
  The content from offsets 8000 to 12000 continues detailing the proof of Theorem 2.1 through lemmas that make explicit technical claims and include a concrete Rust code snippet for capability serialization and deserialization. The claims here are highly verifiable, given the code and cryptographic guarantees discussed. Lemma 2.1.2 demonstrates the authority preservation via cryptographic HMAC signatures on capability handles during transfer. Lemma 2.1.3 outlines policy compliance during capability transfers, reinforcing security constraints. This section contains technical claims grounded in formal methods and implementation details but lacks explicit citations or performance metrics.
  
  The next logical step is to proceed reading further segments (offsets 12000-16000) to locate Theorem 2.2 security composition proofs, look for any citations supporting the theoretical claims, and search for performance evaluation or metrics that demonstrate practical impacts.
  
...

[Truncated output]
```

### ReAct Round No.4 Analysis:

```yaml
analysis: |
  The content from offsets 12000 to 16000 covers the conclusion of Theorem 2.1 and introduces Theorem 2.2 on Security Composition. It details the theorem statement and an outline of the proof with associated lemmas, describing exactly how security properties such as unforgeable references, authority confinement, least privilege, and policy compliance are preserved when components are composed. The section includes formal statements and reasoning but does not explicitly include citations or performance metrics.
  
  Mechanization notes for Theorem 2.1 mention the use of TLA+ for model checking, providing confidence in the formal verification process. However, explicit bibliographic citations or performance evaluations have not appeared so far.
  
  To continue identifying verifiable claims, citations, and performance metrics, the next logical step is to read on from character offset 16000 onward. This may include further lemma proofs, any explicit references to relat...

[Truncated output]
```

### ReAct Final Answer:

```yaml
topic: Analysis of Verifiable Citations, Technical Claims, and Performance Metrics in the Lion Ecosystem Theoretical Framework
items:
  - heading: Theorem 2.1: Cross-Component Capability Flow
    summary: Contains technical claims about preservation of capability authority and unforgeability across component boundaries, supported by formal lemmas. Includes proof sketches utilizing WebAssembly memory model and cryptographic HMAC to ensure authority integrity. Code snippets in Rust demonstrate capability serialization and deserialization with proof of integrity. No explicit bibliographic citations or performance metrics present.
  - heading: Theorem 2.2: Security Composition
    summary: Establishes that composing secure Lion components via compatible capability interfaces preserves overall system security properties (unforgeability, authority confinement, least privilege, and policy compliance). The proof uses lemmas about compositional security and interface compatibility. Mech...

[Truncated output]
```

Completed operation: 5b2324ec
Operation 80cba960 waiting for 5b2324ec
Operation 80cba960 inherited context from 5b2324ec
Executing operation: 80cba960


### ReAct Round No.1 Analysis:

```yaml
analysis: |
  Based on the prior comprehensive reading and analysis of the document sections covering the introduction, formal system definitions, theorem statements and proofs, I have identified several specific, verifiable claims that include technical assertions and mechanized proof notes. While explicit bibliographic citations were scarce in the analyzed text, the document references mechanized models (TLA+ and Lean) which serve as verifiable formal methods evidence. No explicit performance metrics have been encountered so far; these may appear later in evaluation chapters but are not part of the theoretical framework document analyzed.
  
  The specific verifiable claims extracted are:
  
  1. **Theorem 2.1: Cross-Component Capability Flow Preservation**
     - The capability authority and unforgeability are preserved precisely across component boundaries during communication.
     - Mechanized verification via TLA+ confirms state-based authority consistency.
  
  2. **Rus...

[Truncated output]
```

### ReAct Final Answer:

```yaml
claims:
  - claim: Theorem 2.1 states that capability authority and unforgeability are preserved precisely across component boundaries during communication in the Lion ecosystem.
    type: technical
    location: Section 2.3.1 Theorem 2.1: Cross-Component Capability Flow
    verifiability: high
    search_strategy: Review formal statement of Theorem 2.1 and mechanized TLA+ model checking described in Appendix A.2 to confirm preservation of authority and unforgeability.
  - claim: Rust code implementing capability serialization and deserialization guarantees integrity by using cryptographic HMAC signatures to protect capability handles against forgery or tampering.
    type: technical
    location: Section 2.3.2 Lemma 2.1.2: Capability Transfer Protocol Preserves Authority
    verifiability: high
    search_strategy: Audit the Rust functions serialize_capability and deserialize_capability and verify the use of HMAC signatures for integrity checks as described in the proof.
  - c...

[Truncated output]
```

Completed operation: 80cba960


In [5]:
from IPython.display import display, Markdown

# Display results
for node_id, data in result["operation_results"].items():
    if isinstance(data, Outline):
        md_content = f"""
## 📄 Document Structure ({node_id})

**Topic:** {data.topic}

### Key Sections:
"""
        for item in data.items[:3]:  # Show first 3
            md_content += f"- **{item.heading}**: {item.summary}\n"

        display(Markdown(md_content))

    elif isinstance(data, ClaimExtraction):
        md_content = f"""
## 📑 Extracted Claims ({node_id})

Found **{len(data.claims)}** verifiable claims:

"""
        for i, claim in enumerate(data.claims, 1):
            md_content += f"""
### {i}. [{claim.type.upper()}] {claim.claim}

- **Location:** {claim.location}  
- **Verifiability:** {claim.verifiability}
- **Search Strategy:** {claim.search_strategy}

"""
        display(Markdown(md_content))

display(Markdown("## ✅ Sequential analysis completed"))


## 📄 Document Structure (5b2324ec-2cf2-438c-baf4-756b664c3d71)

**Topic:** Analysis of Verifiable Citations, Technical Claims, and Performance Metrics in the Lion Ecosystem Theoretical Framework

### Key Sections:
- **Theorem 2.1: Cross-Component Capability Flow**: Contains technical claims about preservation of capability authority and unforgeability across component boundaries, supported by formal lemmas. Includes proof sketches utilizing WebAssembly memory model and cryptographic HMAC to ensure authority integrity. Code snippets in Rust demonstrate capability serialization and deserialization with proof of integrity. No explicit bibliographic citations or performance metrics present.
- **Theorem 2.2: Security Composition**: Establishes that composing secure Lion components via compatible capability interfaces preserves overall system security properties (unforgeability, authority confinement, least privilege, and policy compliance). The proof uses lemmas about compositional security and interface compatibility. Mechanized proofs are noted in Lean for formal verification. No explicit external citations or performance data yet.
- **Theorem 2.3: Confused Deputy Prevention**: Formalizes prevention of confused deputy attacks by requiring explicit capability transfers for any privileged action. The theorem states that no component can perform unauthorized actions on behalf of others without proper capabilities. This section begins formal statements and proof outlines but details are truncated at the read limit. No citations or performance metrics identified yet.



## 📑 Extracted Claims (80cba960-cd4d-407d-b06d-0d692826328b)

Found **7** verifiable claims:


### 1. [TECHNICAL] Theorem 2.1 states that capability authority and unforgeability are preserved precisely across component boundaries during communication in the Lion ecosystem.

- **Location:** Section 2.3.1 Theorem 2.1: Cross-Component Capability Flow  
- **Verifiability:** high
- **Search Strategy:** Review formal statement of Theorem 2.1 and mechanized TLA+ model checking described in Appendix A.2 to confirm preservation of authority and unforgeability.


### 2. [TECHNICAL] Rust code implementing capability serialization and deserialization guarantees integrity by using cryptographic HMAC signatures to protect capability handles against forgery or tampering.

- **Location:** Section 2.3.2 Lemma 2.1.2: Capability Transfer Protocol Preserves Authority  
- **Verifiability:** high
- **Search Strategy:** Audit the Rust functions serialize_capability and deserialize_capability and verify the use of HMAC signatures for integrity checks as described in the proof.


### 3. [TECHNICAL] Theorem 2.2 demonstrates that composing individually secure Lion components via compatible capability interfaces preserves overall system security properties such as unforgeability, authority confinement, least privilege, and policy compliance.

- **Location:** Section 2.4 Theorem 2.2: Security Composition  
- **Verifiability:** high
- **Search Strategy:** Examine the formal theorem statement, supporting lemmas 2.2.1 and 2.2.2, and the mechanized proof in Lean that checks compositional security invariants.


### 4. [TECHNICAL] WebAssembly isolation enforces memory separation, ensuring that capability references stored in host memory are isolated from WebAssembly module memory, preventing unauthorized tampering; handles are injective, unguessable, and unforgeable due to cryptographic properties.

- **Location:** Section 2.3.2 Lemma 2.1.1: WebAssembly Isolation Preserves Capability References  
- **Verifiability:** high
- **Search Strategy:** Validate the formal definitions of memory separation, properties of handles, and supporting argument that WebAssembly cannot forge capability references.


### 5. [TECHNICAL] Theorem 2.3 formally claims that no component can perform actions on behalf of another without explicit capability transfer, thus preventing classic confused deputy attacks.

- **Location:** Section 2.5 Theorem 2.3: Confused Deputy Prevention  
- **Verifiability:** medium
- **Search Strategy:** Check the formal statement and ensuing proof that any privileged action requires a corresponding capability explicitly held by the performing component.


### 6. [TECHNICAL] Capability transfers are mediated by a policy engine which enforces system policies, permitting transfers only when policy_allows(source, target, capability) is true, ensuring all capability transfers comply with security policies.

- **Location:** Section 2.3.2 Lemma 2.1.3: Policy Compliance During Transfer  
- **Verifiability:** medium
- **Search Strategy:** Review the formal modeling of send events and policy evaluation as described; verify system implementation for policy mediation during capability transfer.


### 7. [OTHER] Mechanized proofs of key theorems are encoded using formal tools TLA+ (for Theorem 2.1) and Lean (for Theorem 2.2), which provide algorithmic verification beyond textual argumentation.

- **Location:** Sections 2.3.1 Mechanization note; 2.4 Mechanization note  
- **Verifiability:** high
- **Search Strategy:** Access and review the mechanized models in TLA+ and Lean referenced, examining proof scripts and model checking results for confirmation.



## ✅ Sequential analysis completed
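
A natural next step is to triage the extracted claims before verifying them. As a minimal sketch, the snippet below tallies the `verifiability` labels of the seven claims listed above, copied by hand rather than read from `result`, and picks out the claims to check first.

```python
from collections import Counter

# Verifiability labels of the seven claims listed above, in order.
labels = ["high", "high", "high", "high", "medium", "medium", "high"]

tally = Counter(labels)
print(tally["high"], tally["medium"], tally["low"])  # 5 2 0

# Claim numbers (1-based) rated "high": candidates for a first verification pass.
first_pass = [i for i, label in enumerate(labels, 1) if label == "high"]
print(first_pass)  # [1, 2, 3, 4, 7]
```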