# Multi-Layer LangGraph Log Analysis

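This notebook walks through a multi-layer log analysis pipeline built with LangGraph. It loads per-tier log files, routes each one to a dedicated RAG tool, aggregates the tool results, and asks an LLM for a final incident summary.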

## Architecture

- **Layer 1: Router** - Routes each log file directly to its corresponding tool by file name, with no keyword search. Control returns to the router after each tool node finishes.
- **Layer 2: Tool Nodes** - `web_tool`, `app_tool`, and `db_tool`, one separate node per tier
- **Layer 3: Aggregator** - Collects the results from all tool nodes
- **Layer 4: Summarizer** - Uses the LLM to write the final incident summary

## Direct File Routing

- `web.log` → `web_tool`
- `app.log` → `app_tool`
- `db.log` → `db_tool`
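
The routing table above can be sketched in plain Python. This is a minimal illustration, not the code behind `create_router_node`, which this notebook does not show. `ROUTES` and `pick_next` are hypothetical names, and the rule "visit each tier that has a log but no result yet, then hand off to the aggregator" is an assumption inferred from the state keys and the streamed output later in the notebook.

```python
# Hypothetical sketch of direct file routing; all names are illustrative.
ROUTES = {
    "web_log": ("web_tool", "web_result"),
    "app_log": ("app_tool", "app_result"),
    "db_log": ("db_tool", "db_result"),
}


def pick_next(state: dict) -> str:
    """Return the next node: the first tool with a log but no result yet."""
    for log_key, (tool_name, result_key) in ROUTES.items():
        if state.get(log_key) and not state.get(result_key):
            return tool_name
    # Assumption: once every tier is handled, control moves to the aggregator.
    return "aggregator"


# Walk a toy state through the router; db_log is empty, so db_tool is skipped.
state = {
    "web_log": "GET / 502", "app_log": "INFO ok", "db_log": "",
    "web_result": "", "app_result": "", "db_result": "",
}
first = pick_next(state)
state["web_result"] = "analysis"
second = pick_next(state)
state["app_result"] = "analysis"
third = pick_next(state)
print(first, second, third)
```

Keeping the mapping in one dictionary means a new tier would need only one extra entry, which is one possible reason to prefer file-based routing over keyword matching.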


## Setup and Imports


In [None]:
import os
import sys
from pathlib import Path

# Set OpenAI API key if not already set
if "OPENAI_API_KEY" not in os.environ:
    import getpass
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")

# Read path from config
from config import BACKEND_DIR

# Add backend parent to Python path
backend_parent = BACKEND_DIR.parent
if str(backend_parent) not in sys.path:
    sys.path.insert(0, str(backend_parent))

print(f"Backend directory: {BACKEND_DIR}")

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

from backend.analysis.tools import create_web_rag_tool, create_app_rag_tool, create_db_rag_tool
from backend.analysis.graph import (
    create_router_node,
    create_web_tool_node,
    create_app_tool_node,
    create_db_tool_node,
    create_aggregator_node,
    create_summarizer_node,
    build_multi_layer_graph,
    MultiLayerState
)

print("Imports successful!")

Backend directory: c:\Users\Chandu\Documents\AIOps_Solution\backend
Imports successful!


## Load Log Files Function


In [None]:
def load_logs(scenario="scenario1_web_issue"):
    """
    Load log files separately from logs directory.
    Returns a dictionary with separate log contents - NOT joined/combined.
    Each log file stays separate: web.log, app.log, db.log
    
    Args:
        scenario: Scenario directory name (default: "scenario1_web_issue")
    
    Returns:
        dict: Dictionary with keys 'web', 'app', 'db' containing log file contents
    """
    # Read path from config
    from config import LOGS_DIR
    
    # Logs are in backend/data/logs/scenario/
    logs_dir = LOGS_DIR / scenario
    
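    if not logs_dir.exists():
        # List scenario names (not full paths) so the error is actionable
        if LOGS_DIR.exists():
            available = sorted(p.name for p in LOGS_DIR.iterdir() if p.is_dir())
        else:
            available = "No logs directory found"
        raise FileNotFoundError(
            f"Logs directory not found: {logs_dir}\n"
            f"Available scenarios: {available}"
        )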
    
    # Load each log file separately - keep them separate, don't combine
    logs = {}
    for tier in ["web", "app", "db"]:
        path = logs_dir / f"{tier}.log"
        if path.exists():
            with open(path, encoding="utf-8") as f:
                logs[tier] = f.read()
        else:
            logs[tier] = ""
            print(f"Warning: {path} not found")
    
    return logs
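
Before calling `load_logs`, it can help to see which scenarios exist and which tier logs each one contains. The helper below is a small sketch, assuming each scenario is a subdirectory that holds some of `web.log`, `app.log`, and `db.log`. `list_scenarios` is a hypothetical name, and the demo runs against a temporary directory rather than the real `LOGS_DIR`.

```python
import tempfile
from pathlib import Path

TIERS = ("web", "app", "db")


def list_scenarios(logs_root: Path) -> dict:
    """Map each scenario directory name to the tier logs it actually contains."""
    scenarios = {}
    for entry in sorted(logs_root.iterdir()):
        if entry.is_dir():
            scenarios[entry.name] = [
                tier for tier in TIERS if (entry / f"{tier}.log").is_file()
            ]
    return scenarios


# Demo against a throwaway directory tree (not the real logs directory).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "scenario1_web_issue").mkdir()
    for tier in TIERS:
        log_path = root / "scenario1_web_issue" / f"{tier}.log"
        log_path.write_text("sample\n", encoding="utf-8")
    (root / "scenario2_app_issue").mkdir()
    (root / "scenario2_app_issue" / "app.log").write_text("sample\n", encoding="utf-8")
    found = list_scenarios(root)

print(found)
```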


## Initialize System

### Step 1: System Overview


In [None]:
print("Initializing Multi-Layer LangGraph System...")
print("=" * 80)
print("Layer 1: Router - Decides which tools to use")
print("Layer 2: Tool Nodes - web_tool, app_tool, db_tool (separate nodes)")
print("Layer 3: Aggregator - Collects all tool results")
print("Layer 4: Summarizer - Creates final summary")
print("=" * 80)


Initializing Multi-Layer LangGraph System...
Layer 1: Router - Decides which tools to use
Layer 2: Tool Nodes - web_tool, app_tool, db_tool (separate nodes)
Layer 3: Aggregator - Collects all tool results
Layer 4: Summarizer - Creates final summary


### Step 2: Initialize LLM and Create RAG Tools


In [None]:
# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini")

# Create RAG tools
print("\nCreating RAG tools...")
web_rag_tool = create_web_rag_tool()
app_rag_tool = create_app_rag_tool()
db_rag_tool = create_db_rag_tool()
print("All RAG tools created")



Creating RAG tools...
All RAG tools created


### Step 3: Create Graph Nodes


In [None]:
# Create nodes for each layer
print("\nCreating graph nodes...")

# Layer 1: Router (no LLM needed - simple routing logic)
router_node = create_router_node()
print("  Router node created")

# Layer 2: Tool nodes
web_tool_node = create_web_tool_node(web_rag_tool)
app_tool_node = create_app_tool_node(app_rag_tool)
db_tool_node = create_db_tool_node(db_rag_tool)
print("  Tool nodes created (web, app, db)")

# Layer 3: Aggregator
aggregator_node = create_aggregator_node()
print("  Aggregator node created")

# Layer 4: Summarizer
summarizer_node = create_summarizer_node(llm)
print("  Summarizer node created")



Creating graph nodes...
  Router node created
  Tool nodes created (web, app, db)
  Aggregator node created
  Summarizer node created


### Step 4: Build the Graph


In [None]:
# Build graph
print("\nBuilding multi-layer graph...")
compiled_graph = build_multi_layer_graph(
    router_node,
    web_tool_node,
    app_tool_node,
    db_tool_node,
    aggregator_node,
    summarizer_node
)
print("Graph built successfully")



Building multi-layer graph...
Graph built successfully


## Load and Prepare Log Files


In [None]:
# Load logs separately - no need to combine
print("\nLoading logs...")
logs = load_logs()  # Change scenario name if needed: load_logs("scenario2_app_issue")

# Keep logs separate - web.log to web_tool, app.log to app_tool, db.log to db_tool
web_log = logs.get("web", "")
app_log = logs.get("app", "")
db_log = logs.get("db", "")

total_chars = len(web_log) + len(app_log) + len(db_log)
if total_chars == 0:
    print("Warning: No logs found.")
else:
    print(f"Loaded logs: Web={len(web_log)} chars, App={len(app_log)} chars, DB={len(db_log)} chars")



Loading logs...
Loaded logs: Web=2598 chars, App=1898 chars, DB=1651 chars


## Create Initial State


In [None]:
# Create initial state with separate log files
initial_state: MultiLayerState = {
    "messages": [HumanMessage(content="Analyzing system logs from web, app, and db tiers")],
    "web_log": web_log,      # web.log goes directly to web_tool
    "app_log": app_log,      # app.log goes directly to app_tool
    "db_log": db_log,        # db.log goes directly to db_tool
    "web_result": "",
    "app_result": "",
    "db_result": "",
    "next": "",
    "tool_results": {}
}

print("Initial state created with separate log files")


Initial state created with separate log files
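
The keys used above suggest the shape of `MultiLayerState`. The sketch below is an assumption reconstructed from `initial_state`, not the definition in `backend.analysis.graph`. `SketchState` and `make_initial_state` are hypothetical names, and the message is a plain string here to keep the example free of third-party imports.

```python
from typing import Any, Dict, List, TypedDict


class SketchState(TypedDict):
    """Assumed state shape, mirroring the keys of initial_state."""
    messages: List[Any]
    web_log: str
    app_log: str
    db_log: str
    web_result: str
    app_result: str
    db_result: str
    next: str
    tool_results: Dict[str, str]


def make_initial_state(logs: Dict[str, str]) -> SketchState:
    """Build a fresh state from a dict shaped like the return value of load_logs()."""
    return {
        "messages": ["Analyzing system logs from web, app, and db tiers"],
        "web_log": logs.get("web", ""),
        "app_log": logs.get("app", ""),
        "db_log": logs.get("db", ""),
        "web_result": "",
        "app_result": "",
        "db_result": "",
        "next": "",
        "tool_results": {},
    }


sketch_state = make_initial_state({"web": "GET / 502", "app": "INFO ok"})
print(sorted(sketch_state))
```

Building the state through a function gives each run its own empty result fields, so results from one scenario cannot leak into the next.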


## Run Analysis

### Stream Execution (See Each Layer)


In [None]:
# Check if compiled_graph exists
if 'compiled_graph' not in globals():
    raise NameError(
        "compiled_graph is not defined. "
        "Please run the 'Build the Graph' cell (Step 4) first to create compiled_graph."
    )

# Run the graph
print("\nRunning multi-layer analysis...")
print("=" * 80)

try:
    # Stream results to see each layer
    for step in compiled_graph.stream(initial_state, {"recursion_limit": 20}):
        for node_name, node_output in step.items():
            if node_name != "__end__":
                print(f"\n[Layer: {node_name}]")
                if "messages" in node_output and node_output["messages"]:
                    last_msg = node_output["messages"][-1]
                    if hasattr(last_msg, "content"):
                        content = str(last_msg.content)
                        # Show at most the first 300 characters
                        preview = (content[:300] + "...") if len(content) > 300 else content
                        print(preview)
                if "next" in node_output:
                    print(f"  Next: {node_output['next']}")
                print("-" * 80)
except Exception as e:
    print(f"Error during streaming: {e}")
    import traceback
    traceback.print_exc()



Running multi-layer analysis...

[Layer: router]
  Next: web_tool
--------------------------------------------------------------------------------

[Layer: web_tool]
Web Tier Analysis:
The logs analyzed contain multiple ERROR entries indicating issues with the web tier. Here’s a summary of the findings:

### 1. **Error Detection:**
- The logs include multiple entries with [ERROR] markers, alongside server responses indicating **502 Bad Gateway**, **504 Gateway T...
  Next: router
--------------------------------------------------------------------------------

[Layer: router]
  Next: app_tool
--------------------------------------------------------------------------------

[Layer: app_tool]
App Tier Analysis:
The provided logs primarily consist of [INFO] level messages that indicate successful operations of various services. There are no [ERROR] or [WARN] messages present.

### Analysis of Logs:
- All entries in the logs are at the INFO level.
- Each log entry indicates a successful r
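
Each item yielded by `compiled_graph.stream(...)` in the loop above is a dict that maps a node name to that node's output. A short sketch of recording the visit order from such steps follows. `trace_nodes` is a hypothetical helper, and `fake_steps` is hand-written sample data shaped like the output above, not a real run.

```python
from typing import Dict, Iterable, List


def trace_nodes(steps: Iterable[Dict[str, dict]]) -> List[str]:
    """Return node names in the order they appear in the streamed steps."""
    order = []
    for step in steps:
        for node_name in step:
            if node_name != "__end__":
                order.append(node_name)
    return order


# Hand-written steps that mimic the router/tool alternation seen above.
fake_steps = [
    {"router": {"next": "web_tool"}},
    {"web_tool": {"next": "router"}},
    {"router": {"next": "app_tool"}},
    {"app_tool": {"next": "router"}},
]
visit_order = trace_nodes(fake_steps)
print(" -> ".join(visit_order))
```

A trace like this makes it easy to confirm that the router is re-entered after every tool node and that no tier was skipped.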

### Get Final Result


In [None]:
try:
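    # Get final result. Note: invoke() runs the whole graph again, so every
    # node executed in the streaming cell above runs a second time here.
    final_result = compiled_graph.invoke(initial_state, {"recursion_limit": 20})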
    
    print("\nAnalysis Complete!")
    print("=" * 80)
    
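    # Print the final summary: the first message emitted by the summarizer node
    if "messages" in final_result:
        print("\nFINAL SUMMARY")
        print("=" * 80)
        for msg in final_result["messages"]:
            if getattr(msg, "name", None) == "summarizer":
                print(getattr(msg, "content", ""))
                break
        else:
            # Runs only if the loop never hit `break`
            print("No message named 'summarizer' found in the final state.")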
except Exception as e:
    print(f"Error during analysis: {e}")
    import traceback
    traceback.print_exc()



Analysis Complete!

FINAL SUMMARY
# FINAL INCIDENT SUMMARY

### Final Incident Summary

#### 1. Executive Summary
On January 15, 2024, a series of issues were identified within the web tier, leading to significant HTTP errors including `502 Bad Gateway`, `503 Service Unavailable`, and `504 Gateway Timeout`. The application tier and database tier, however, remained healthy throughout the incident, demonstrating stability and efficiency in their operations. This incident underscores the importance of ensuring robust communication and performance monitoring between the web and upstream services.

#### 2. Tier Analysis
- **Web Tier**:
  - **Status**: Unhealthy
  - **Errors**: Frequent `502`, `503`, and `504` errors indicating upstream service communication failures and server overload.
  - **Severity**: High
  - **Key Findings**: The web tier is compromised due to upstream server responsiveness issues and potential overload. Immediate investigation into backend service health is required.

## View Individual Tool Results

You can also inspect individual tool results from the final state:


In [None]:
# Uncomment to view individual results:
#print("\nWeb Tier Results:")
#print(final_result.get("web_result", "No results"))
#print("\nApp Tier Results:")
#print(final_result.get("app_result", "No results"))
#print("\nDB Tier Results:")
#print(final_result.get("db_result", "No results"))


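## Visualize Graph Structure

This cell needs `compiled_graph` from Step 4. Running it before the graph is built raises the `NameError` shown in the output below.


In [None]: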
# Visualize Graph Structure (Mermaid Diagram)
print("Graph Structure (Mermaid Diagram):")
print("=" * 80)
mermaid_diagram = compiled_graph.get_graph().draw_mermaid()
print(mermaid_diagram)

# Save to file
with open("graph_structure.mmd", "w", encoding="utf-8") as f:
    f.write(mermaid_diagram)
print("\nGraph saved to graph_structure.mmd")
print("You can view it at: https://mermaid.live/")

Graph Structure (Mermaid Diagram):


NameError: name 'compiled_graph' is not defined