# Parallel Processing for Street View Analysis

## Overview
This notebook implements parallel processing for analyzing thousands of Street View images using LangGraph with multiple concurrent nodes.

## Problem Statement
- **Scale**: Thousands of locations × multiple orientations (horizon/ground/sky) = 10k+ images
- **Bottleneck**: Sequential AI model calls would take hours
- **Solution**: Parallel processing with bounded concurrency

## Architecture
- **Input**: CSV with Street View URLs (from `google_apis.ipynb`)
- **Processing**: LangGraph parallel nodes for vision analysis
- **Output**: Graded/analyzed results per image

## Concurrency Strategy
- **Measure**: Single call latency (L) for vision model
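- **Calculate**: k ≈ T × L (Little's law), where k = number of concurrent requests, T = target throughput in requests per second, and L = latency in seconds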
- **Start**: k=16 nodes, scale up based on provider limits
- **Monitor**: 429 errors, latency percentiles, success rates
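
As a rough illustration of the sizing rule above, the sketch below turns a measured latency and a target throughput into a worker count and a wall-clock estimate. The function names and the numbers (2 s per call, 8 images/s) are assumptions chosen for the example, not measurements from this project.

```python
import math


def required_concurrency(target_throughput: float, latency_s: float) -> int:
    """Little's law: in-flight requests k = T * L, rounded up to a whole worker."""
    return math.ceil(target_throughput * latency_s)


def estimated_runtime_s(n_images: int, latency_s: float, concurrency: int) -> float:
    """Rough wall-clock estimate, ignoring retries and rate-limit stalls."""
    return n_images * latency_s / concurrency


# Illustrative numbers only: 2 s per call and a target of 8 images/s.
k = required_concurrency(target_throughput=8, latency_s=2.0)
print(k)  # 16
print(estimated_runtime_s(10_000, 2.0, 1))  # 20000.0 s sequentially
print(estimated_runtime_s(10_000, 2.0, k))  # 1250.0 s with k workers
```

The estimate is an upper bound on what concurrency alone can deliver; provider rate limits may cap the achievable throughput well below it.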

## Key Considerations
- **Provider Limits**: API rate limits and concurrent request caps
- **Rate Limiting**: Client-side throttling to avoid 429s
- **Retry Logic**: Exponential backoff with jitter
- **Checkpointing**: Idempotent tasks to avoid duplicate billing
- **Budget Guards**: Max requests per minute/hour
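
The retry item above could look roughly like the following sketch, which uses the "full jitter" variant of exponential backoff. `RateLimitError` and `call_with_backoff` are hypothetical names introduced here; a real client would catch whatever exception the provider SDK raises on a 429.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


async def call_with_backoff(make_call, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Await make_call(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay cap doubles each attempt, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a uniform random time in [0, cap] so that
            # many workers hitting a limit together do not retry in lockstep.
            await asyncio.sleep(random.uniform(0, cap))
```

`make_call` is a zero-argument coroutine function, so each retry issues a fresh request rather than re-awaiting a finished coroutine.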

## Implementation Plan
1. Load CSV and create work queue
2. Implement bounded concurrency with LangGraph
3. Add retry logic and rate limiting
4. Monitor metrics and scale accordingly
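
Steps 1 and 2 can be prototyped without LangGraph to check the concurrency logic in isolation. The sketch below uses only the standard library; `analyze_image` is a hypothetical placeholder for the vision-model call, and the `url` column name is an assumption about the CSV layout.

```python
import asyncio
import csv
import io


async def analyze_image(url: str) -> str:
    """Hypothetical placeholder for the vision-model call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"analyzed:{url}"


def load_urls(csv_text: str, column: str = "url"):
    """Read one column of URLs from CSV text (the column name is assumed)."""
    return [row[column] for row in csv.DictReader(io.StringIO(csv_text))]


async def process_all(urls, max_concurrency=16, done=None):
    """Process URLs with at most max_concurrency calls in flight.

    `done` maps already-processed URLs to results, so a rerun skips them
    instead of calling (and paying for) the model again.
    """
    done = {} if done is None else done
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:  # waits while max_concurrency calls are running
            done[url] = await analyze_image(url)

    # Drop duplicates and anything already completed in a previous run.
    pending = [u for u in dict.fromkeys(urls) if u not in done]
    await asyncio.gather(*(worker(u) for u in pending))
    return done
```

Persisting `done` to disk between runs would give a simple form of checkpointing; that part is left out of the sketch.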

In [1]:
import sys
from pathlib import Path

# Add the parent directory to sys.path so absolute imports such as
# `from src.graph import ...` work from inside the notebook.
sys.path.append(str(Path().absolute().parent))

In [None]:
from src.state import MultiState
from src.prompts import multimodal_prompt
from src.models import get_multimodal_model
from langgraph.graph import StateGraph, START, END
from langchain.agents import create_agent
from typing import Literal
from langgraph.types import Command
from src.utils import prepare_multimodal_message

multimodal_model = get_multimodal_model()
multimodal_agent = create_agent(
    model=multimodal_model,
    tools=[],
    system_prompt=multimodal_prompt
)

def get_data_node(state: MultiState):
    """
    Placeholder: intended to split the CSV into chunks and pass one chunk to each parallel agent node
    """
    # TODO: decide whether this node is needed. Idea: pass only a portion of the CSV rows to each parallel agent node.
    return {}  # no state update yet

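# Stops after the multimodal step for now (could change later). Note that
# build_parallel_graph also adds a static edge from each agent node to "join".
async def multimodal_node(state: MultiState) -> Command[Literal["__end__"]]: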
    """
    Handles multimodal inputs with multimodal model
    """

    # construct multimodal input message
    multimodal_msg = prepare_multimodal_message(state)  # returns HumanMessage

    # drop the last message from the history and replace it with the new multimodal one
    history = state.get("messages", [])[:-1]  # slicing an empty list is safe
    updated_history = history + [multimodal_msg]  # LangGraph expects a list of messages

    result = await multimodal_agent.ainvoke({"messages": updated_history})
    last_msg = result["messages"][-1]

    return Command(
        update={
            "messages": [last_msg],  # must be a list
            "images": [],  # clear images after invocation to keep memory lightweight
        },
        goto=END,
    )

def join_node(state: MultiState):
    """
    Placeholder: intended to join the results from the multimodal agents
    """
    # TODO: decide whether this node is needed
    return {}  # no state update yet

def build_parallel_graph(checkpointer, n_nodes=2, save_display=False):
    """
    Build and compile the parallel graph with n_nodes multimodal agent nodes; returns the compiled graph
    """
    builder = StateGraph(MultiState)
    # nodes
    builder.add_node("get_data", get_data_node)
    builder.add_node("join", join_node)
    for i in range(n_nodes):
        builder.add_node(f"multimodal_agent_{i}", multimodal_node)
    # edges
    # TODO: investigate parallel edges further: is a join node needed? Should `Send` be used?
    builder.add_edge(START, "get_data")
    for i in range(n_nodes):
        builder.add_edge("get_data", f"multimodal_agent_{i}")
        builder.add_edge(f"multimodal_agent_{i}", "join")
    builder.add_edge("join", END)

    graph = builder.compile(checkpointer=checkpointer)

    if save_display:
        # render the graph as a PNG (returns bytes) and write it to disk
        img = graph.get_graph().draw_mermaid_png()
        with open("./graph.png", "wb") as f:
            f.write(img)
        print("Graph display saved to ./graph.png")

    return graph

Using OPENAI model
