# SOTA Architecture Comparison

This notebook analyzes and compares the architectural patterns of State-of-the-Art (SOTA) Deep Research frameworks: **FlowSearch**, **RhinoInsight**, and **LangChain Open Deep Research**.

We will explore their different mental models for:
1.  **State Management**: How they track progress.
2.  **Control Flow**: How they decide what to do next.
3.  **Failure Recovery**: How they handle dead ends.


## 1. State Management Comparison

Each framework models its "Unit of Work", and therefore its notion of progress, differently.

In [None]:
from typing import TypedDict, List, Optional, Any
from dataclasses import dataclass

# --- LangChain / Standard Model ---
# State is a linear list of messages and some accumulated artifacts.
class StandardState(TypedDict):
    messages: List[str]
    documents: List[str]
    
# --- RhinoInsight Model ---
# State is a verified checklist. Progress is boolean (Checked/Unchecked).
@dataclass
class ChecklistItem:
    goal: str
    criteria: str
    verified: bool = False
    evidence: Optional[str] = None

@dataclass
class RhinoState:
    checklist: List[ChecklistItem]
    
# --- FlowSearch Model ---
# State is a Graph. Progress is topological.
@dataclass
class FlowNode:
    id: str
    description: str
    dependencies: List[str]
    status: str = "pending" # pending, active, done, failed
    result: Any = None

@dataclass
class FlowState:
    nodes: dict[str, FlowNode]
    frontier: List[str]


### Analysis

*   **Standard**: Simple, but prone to losing track of the goal in long contexts. Nothing in the state records how much of the task is complete.
*   **RhinoInsight**: Has a clear definition of done (every checklist item verified), which makes progress transparent. Rigid if the initial checklist is wrong.
*   **FlowSearch**: The most flexible. It can express dependencies ("I need to find X before I can search for Y"), at the cost of more complex state management.
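
To make the dependency point concrete, here is a minimal sketch of how a frontier could be derived from a graph state. The helper `compute_frontier` is a hypothetical name introduced for illustration, not part of FlowSearch itself, and the `FlowNode` definition is repeated (with a default for `dependencies`) only so the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FlowNode:
    # Repeats the notebook's FlowNode so this snippet is self-contained.
    id: str
    description: str
    dependencies: List[str] = field(default_factory=list)
    status: str = "pending"  # pending, active, done, failed
    result: Any = None


def compute_frontier(nodes: Dict[str, FlowNode]) -> List[str]:
    """Return ids of pending nodes whose dependencies are all done.

    Hypothetical helper: an assumption about how a frontier might be computed.
    """
    return [
        node.id
        for node in nodes.values()
        if node.status == "pending"
        and all(nodes[dep].status == "done" for dep in node.dependencies)
    ]


nodes = {
    "x": FlowNode(id="x", description="Find X"),
    "y": FlowNode(id="y", description="Search for Y using X", dependencies=["x"]),
}
frontier_before = compute_frontier(nodes)  # "y" is blocked on "x"
nodes["x"].status = "done"
frontier_after = compute_frontier(nodes)  # "x" is done, so "y" is unblocked
print(frontier_before, frontier_after)
```

Progress here is topological in the sense that a node only becomes eligible once everything it depends on has finished, which a flat message list or checklist cannot express directly.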

## 2. Control Flow Simulation

Let's simulate how each architecture handles a **Dead End**.

**Scenario**: The agent tries to find the "release date of GPT-6", but no release date has been announced.

In [None]:
class MockSearchTool:
    def search(self, query):
        if "GPT-6" in query:
            return "No official release date for GPT-6 has been announced."
        return "Some other info."

tool = MockSearchTool()

# --- Standard Approach ---
def standard_loop(query):
    print("--- Standard ---")
    result = tool.search(query)
    # Standard agent might hallucinate or just say "I found this".
    # It doesn't inherently know it FAILED unless the LLM catches it.
    print(f"Result: {result}")
    print("Action: Summarize finding (Risk: Hallucination)")

# --- RhinoInsight Approach ---
def rhino_loop(query):
    print("\n--- RhinoInsight ---")
    item = ChecklistItem(goal=query, criteria="Official date from OpenAI")
    result = tool.search(query)
    
    # Explicit Verification Step
    if "No official release date" in result:
        print("Verification: FAILED (Criteria not met)")
        print("Action: Mark as Unverified. User knows info is missing.")
    else:
        item.verified = True
        item.evidence = result  # keep the supporting evidence with the item
        print("Verification: PASSED")

# --- FlowSearch Approach ---
def flow_loop(query):
    print("\n--- FlowSearch ---")
    node = FlowNode(id="1", description=query, dependencies=[])
    result = tool.search(query)
    
    if "No official release date" in result:
        node.status = "failed"
        print("Status: Node Failed.")
        print("Action: Trigger Backtracking. Generate ALTERNATIVE node.")
        print("New Plan: Search for 'GPT-5 release schedule' to extrapolate.")
    else:
        node.status = "done"
        node.result = result  # store the finding on the node
        print("Status: Node Done.")

standard_loop("release date of GPT-6")
rhino_loop("release date of GPT-6")
flow_loop("release date of GPT-6")

### Key Insight

*   **RhinoInsight** is safe: it fails gracefully and reports the failure.
*   **FlowSearch** is resilient: it fails, then *adapts* (Backtracking) to try a different angle.
*   **Standard** is risky: it relies entirely on the LLM noticing the failure on its own, which becomes less reliable over long contexts.
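
The simulation above only prints the backtracking step. As a rough sketch of what "fail, then adapt" could look like in code, the snippet below retries with alternative queries after a dead end. The names `search_with_backtracking`, `is_dead_end`, and `mock_search` are hypothetical, the alternatives are hard-coded here whereas a real agent would presumably generate them, and the string-matching failure check is a toy stand-in for a real verifier.

```python
from typing import Callable, List, Optional, Tuple


def mock_search(query: str) -> str:
    # Stand-in for the notebook's MockSearchTool.search.
    if "GPT-6" in query:
        return "No official release date for GPT-6 has been announced."
    return "Some other info."


def is_dead_end(result: str) -> bool:
    # Toy failure check via string matching (assumption for this sketch).
    return "No official release date" in result


def search_with_backtracking(
    query: str,
    alternatives: List[str],
    search: Callable[[str], str] = mock_search,
) -> Tuple[Optional[str], List[str]]:
    """Try the query, then each alternative; return (result, failed_queries)."""
    failed: List[str] = []
    for candidate in [query, *alternatives]:
        result = search(candidate)
        if not is_dead_end(result):
            return result, failed
        failed.append(candidate)  # record the dead end, then try the next angle
    return None, failed  # every candidate failed


result, failed = search_with_backtracking(
    "release date of GPT-6",
    alternatives=["GPT-5 release schedule"],
)
print(result, failed)
```

Returning the list of failed queries alongside the result keeps the dead ends visible, so a report can still state what could not be found.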

## 3. Conclusion & Recommendation

For the `deep-research` repository, we recommend a hybrid approach:

1.  **Use FlowSearch's Graph State** as the underlying data structure (`OverallState`).
2.  **Use RhinoInsight's Verification Logic** as the transition condition between nodes (only mark a node `done` if verified).
3.  **Use LangGraph** to orchestrate the execution.

This gives us the resilience of FlowSearch with the safety of RhinoInsight.
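
As a minimal sketch of the first two recommendations, the snippet below gates a graph node's transition on a verification check. `VerifiedNode`, `transition`, and `toy_verify` are hypothetical names introduced for illustration and are not taken from any of the three frameworks. Orchestration with LangGraph is deliberately left out, and the verifier is a placeholder for whatever check the repository ends up using.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VerifiedNode:
    # Hypothetical hybrid: graph position (FlowNode) plus criteria (ChecklistItem).
    id: str
    description: str
    criteria: str
    dependencies: List[str] = field(default_factory=list)
    status: str = "pending"  # pending, done, failed
    evidence: Optional[str] = None


def transition(
    node: VerifiedNode,
    result: str,
    verify: Callable[[str, str], bool],
) -> str:
    """Mark the node done only if the result passes verification."""
    if verify(result, node.criteria):
        node.status = "done"
        node.evidence = result
    else:
        node.status = "failed"  # a planner could backtrack from this node
    return node.status


def toy_verify(result: str, criteria: str) -> bool:
    # Placeholder check (assumption); ignores criteria and matches on text only.
    return "No official release date" not in result


node = VerifiedNode(
    id="1",
    description="release date of GPT-6",
    criteria="Official date from OpenAI",
)
status = transition(
    node,
    "No official release date for GPT-6 has been announced.",
    toy_verify,
)
print(status)
```

Because `transition` is the only place where `status` changes, an unverified result can never be recorded as `done`, which is the property the hybrid approach is after.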