# Notebook 3: Results Analysis & Visualization

**Objective:** To evaluate the performance of our trained Synapse agent and compare it against baseline routing strategies.

This notebook will:
1. Load the trained PPO model.
2. Define evaluation logic for different agents (PPO, OSPF, Random).
3. Run a series of evaluation episodes for each agent to collect performance metrics.
4. Generate plots comparing the agents on key metrics like latency, path length, and success rate. These plots are intended for the research paper.

### 1. Imports and Setup

In [None]:
import sys
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from tqdm import tqdm
from stable_baselines3 import PPO

# Add the project root to the Python path
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), os.pardir)))
from synapse.network_env import NetworkRoutingEnv
from synapse.baselines import ospf_routing

### 2. Load Model and Environment

In [None]:
TOPOLOGY_FILE = '../data/topologies/nsfnet.gml'
MODEL_SAVE_PATH = '../models/synapse_ppo_nsfnet.zip'
NUM_EVAL_EPISODES = 200 # Use a significant number of episodes for stable results

# Create the evaluation environment
eval_env = NetworkRoutingEnv(graph_file=TOPOLOGY_FILE)

# Load the trained agent
try:
    agent = PPO.load(MODEL_SAVE_PATH)
    print("Trained PPO agent loaded successfully.")
except FileNotFoundError:
    print(f"ERROR: Model not found at {MODEL_SAVE_PATH}")
    print("Please run Notebook 2 to train and save the model first.")
    agent = None

### 3. Define Evaluation Logic

We will create a standardized evaluation loop that can run any given policy (our RL agent, OSPF, or random) and collect metrics.

In [None]:
def evaluate_policy(agent_type, model, env, num_episodes):
    """Run `num_episodes` episodes with the given policy and return per-episode metrics."""
    results = []

    for _ in tqdm(range(num_episodes), desc=f"Evaluating {agent_type}"):
        obs, info = env.reset()
        terminated = truncated = False

        # For OSPF, calculate the full path once at the beginning
        if agent_type == 'OSPF':
            source = obs['current_node']
            dest = obs['destination_node']
            ospf_path = ospf_routing(env.graph, source, dest, weight='weight')

        # Stop on either flag; checking only `terminated` would loop forever
        # on an episode that the environment truncates (e.g. a step limit).
        while not (terminated or truncated):
            if agent_type == 'PPO':
                action, _ = model.predict(obs, deterministic=True)
            elif agent_type == 'Random':
                action = env.action_space.sample()
            elif agent_type == 'OSPF':
                current_node = obs['current_node']
                # Fallback if the path is missing or we are off it (or already at its last node)
                if not ospf_path or current_node not in ospf_path[:-1]:
                    action = env.action_space.sample()
                else:
                    current_idx = ospf_path.index(current_node)
                    next_hop = ospf_path[current_idx + 1]
                    neighbors = list(env.graph.neighbors(current_node))
                    action = neighbors.index(next_hop)
            else:
                raise ValueError(f"Unknown agent_type: {agent_type}")

            obs, reward, terminated, truncated, info = env.step(action)

        # Record every episode, including truncated ones, so that failures
        # count against the success rate instead of being silently dropped.
        success = bool(terminated and obs['current_node'] == obs['destination_node'])
        results.append({
            'latency': info['total_latency'],
            'path_length': len(info['path']) - 1,
            'success': success,
        })

    return pd.DataFrame(results)
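
Gymnasium-style environments end an episode through two separate flags, `terminated` and `truncated`, and an evaluation loop should stop on either one. The sketch below shows that pattern on a toy environment so it can be run without the trained model. `ToyLineEnv` and `run_episode` are hypothetical names introduced only for illustration; they are not part of the `synapse` package, and the real `NetworkRoutingEnv` may behave differently.

```python
class ToyLineEnv:
    """Hypothetical stand-in for a routing env: nodes 0..n-1 on a line."""

    def __init__(self, n_nodes=5, max_steps=20):
        self.n_nodes = n_nodes
        self.max_steps = max_steps

    def reset(self, seed=None):
        self.node = 0
        self.steps = 0
        self.latency = 0.0
        return self._obs(), {}

    def _obs(self):
        return {"current_node": self.node, "destination_node": self.n_nodes - 1}

    def step(self, action):
        # Action 1 moves toward the destination, anything else moves away (clamped at 0).
        self.node = max(0, self.node + (1 if action == 1 else -1))
        self.steps += 1
        self.latency += 1.0  # every hop costs one latency unit in this toy model
        terminated = self.node == self.n_nodes - 1
        truncated = (not terminated) and self.steps >= self.max_steps
        info = {"total_latency": self.latency, "hops": self.steps}
        return self._obs(), 0.0, terminated, truncated, info


def run_episode(env, policy, seed=None):
    """Roll out one episode, stopping on termination OR truncation."""
    obs, info = env.reset(seed=seed)
    terminated = truncated = False
    while not (terminated or truncated):
        obs, _, terminated, truncated, info = env.step(policy(obs))
    return {
        "success": terminated,
        "latency": info["total_latency"],
        "path_length": info["hops"],
    }


greedy = run_episode(ToyLineEnv(), lambda obs: 1)  # always steps toward the destination
stuck = run_episode(ToyLineEnv(), lambda obs: 0)   # never makes progress
print(greedy)
print(stuck)
```

The `stuck` policy never reaches the destination, so its episode ends only because of the step limit and is recorded as a failure rather than hanging the loop.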


### 4. Run Evaluations

Now we run the evaluation for all three strategies. This may take a few minutes.

In [None]:
results_frames = []

# Evaluate PPO Agent (skipped if the model could not be loaded)
if agent is not None:
    ppo_results = evaluate_policy('PPO', agent, eval_env, NUM_EVAL_EPISODES)
    ppo_results['Agent'] = 'PPO (Synapse)'
    results_frames.append(ppo_results)

# Evaluate Random Agent
random_results = evaluate_policy('Random', None, eval_env, NUM_EVAL_EPISODES)
random_results['Agent'] = 'Random'
results_frames.append(random_results)

# Evaluate OSPF Agent
ospf_results = evaluate_policy('OSPF', None, eval_env, NUM_EVAL_EPISODES)
ospf_results['Agent'] = 'OSPF (Shortest Path)'
results_frames.append(ospf_results)

# Concatenate once, with a fresh index, instead of growing an empty DataFrame
all_results = pd.concat(results_frames, ignore_index=True)

print("\n--- Evaluation Complete ---")
all_results.head()
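
Each call to `evaluate_policy` draws its own episodes, so the three agents are not necessarily compared on the same source/destination pairs. One way to make the comparison paired is to reuse a fixed list of seeds across agents. The sketch below illustrates the idea with a hypothetical `sample_episode` function standing in for `env.reset(seed=seed)`. Whether `NetworkRoutingEnv.reset` honors a `seed` argument is an assumption that should be checked, and the 14-node count is only an illustrative value.

```python
import random


def sample_episode(seed, n_nodes=14):
    """Hypothetical stand-in for env.reset(seed=seed): draw a (source, dest) pair."""
    rng = random.Random(seed)
    source, dest = rng.sample(range(n_nodes), 2)  # two distinct nodes
    return source, dest


# One shared seed list, reused for every agent under evaluation.
EVAL_SEEDS = list(range(200))

pairs_for_ppo = [sample_episode(s) for s in EVAL_SEEDS]
pairs_for_ospf = [sample_episode(s) for s in EVAL_SEEDS]

# Both agents would face exactly the same sequence of routing tasks.
print(pairs_for_ppo == pairs_for_ospf)
```

With paired episodes, differences in the summary statistics reflect the routing policy rather than luck in which node pairs were sampled.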

### 5. Analyze and Plot Results

This is where we generate the final figures for the paper. We'll compare the agents on three key metrics:
1.  **Average Latency:** Lower is better. This is our primary optimization goal.
2.  **Average Path Length:** The number of hops. We want to see if our agent finds longer but faster paths.
3.  **Success Rate:** The percentage of packets that successfully reach their destination.

In [None]:
# Calculate summary statistics
summary = all_results.groupby('Agent').agg(
    avg_latency=('latency', 'mean'),
    avg_path_length=('path_length', 'mean'),
    success_rate=('success', 'mean')
).reset_index()

summary['success_rate'] = summary['success_rate'] * 100 # Convert to percentage

print("--- Performance Summary ---")
print(summary)

# --- Generate Plots ---
sns.set_theme(style="whitegrid")
fig, axes = plt.subplots(1, 3, figsize=(20, 6))
fig.suptitle('Routing Performance Comparison', fontsize=16)

# Plot 1: Average Latency
sns.barplot(x='Agent', y='avg_latency', data=summary, ax=axes[0], palette='viridis')
axes[0].set_title('Average Packet Latency')
axes[0].set_ylabel('Latency (simulation units)')

# Plot 2: Average Path Length
sns.barplot(x='Agent', y='avg_path_length', data=summary, ax=axes[1], palette='plasma')
axes[1].set_title('Average Path Length (Hops)')
axes[1].set_ylabel('Number of Hops')

# Plot 3: Success Rate
sns.barplot(x='Agent', y='success_rate', data=summary, ax=axes[2], palette='magma')
axes[2].set_title('Packet Delivery Success Rate')
axes[2].set_ylabel('Success Rate (%)')
axes[2].set_ylim(0, 101)

plt.tight_layout(rect=[0, 0, 1, 0.96])
plt.show()
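
The bar charts report plain means, which mixes two effects: how often an agent delivers a packet, and how fast it is when it does. A possible complement, sketched below with the standard library only, is to report latency over successful episodes together with its standard error. The `summarize` helper and the four toy episodes are made up for illustration and are not results from this notebook.

```python
import math
import statistics


def summarize(episodes):
    """Summarize a list of dicts with 'latency' (float) and 'success' (bool)."""
    delivered = [e["latency"] for e in episodes if e["success"]]
    n_ok = len(delivered)
    mean_latency = statistics.mean(delivered) if n_ok else float("nan")
    # Standard error of the mean needs at least two samples.
    sem = statistics.stdev(delivered) / math.sqrt(n_ok) if n_ok > 1 else float("nan")
    return {
        "success_rate": n_ok / len(episodes),
        "mean_latency_delivered": mean_latency,
        "sem_latency_delivered": sem,
    }


# Toy data: three delivered packets and one failure with a large latency.
toy_episodes = [
    {"latency": 10.0, "success": True},
    {"latency": 12.0, "success": True},
    {"latency": 14.0, "success": True},
    {"latency": 50.0, "success": False},
]

toy_summary = summarize(toy_episodes)
print(toy_summary)
```

In this toy data the single failed episode would pull the overall mean latency from 12.0 up to 21.5, which is why it can help to report the delivered-only figure next to the success rate.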

### 6. Discussion of Results

**Expected Outcome:**
*   **Random:** Will perform the worst across all metrics, with very high latency, long paths (due to loops), and a low success rate.
*   **OSPF:** Follows the minimum-cost path under the static link weights, so its paths should be the shortest by that measure and usually among the shortest in hops. However, because it is unaware of traffic, it can send packets onto congested links, leading to higher latency.
*   **PPO (Synapse):** The trained agent should achieve the **lowest average latency**. It will likely accomplish this by sometimes choosing slightly longer paths (higher hop count than OSPF) to deliberately avoid congested links. Its success rate should be near 100%.

If the results match these expectations, the trade-off of giving up the shortest path for a faster overall journey is the core demonstration of an intelligent, traffic-aware routing system and will be a key point in our paper's analysis.
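
To make the expected trade-off concrete, the following sketch runs Dijkstra's algorithm twice on a five-node toy graph: once on static link weights only (the traffic-unaware view) and once on weight plus queueing delay (the traffic-aware view). The graph, the delay values, and the helper names are all hypothetical and chosen for illustration. This is not the `ospf_routing` implementation and not what the PPO agent computes internally.

```python
import heapq

# Undirected edges: (static weight, current queueing delay). Hypothetical values.
EDGES = {
    ("A", "B"): (1.0, 8.0),  # short but congested
    ("B", "D"): (1.0, 0.0),
    ("A", "C"): (1.0, 0.0),
    ("C", "E"): (1.0, 0.0),
    ("E", "D"): (1.0, 0.0),
}


def build_adjacency(cost_fn):
    """Build an adjacency list where each edge cost is cost_fn(weight, queue)."""
    adj = {}
    for (u, v), (weight, queue) in EDGES.items():
        cost = cost_fn(weight, queue)
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    return adj


def dijkstra(adj, source, dest):
    """Return (path, cost) of a minimum-cost path from source to dest."""
    heap = [(0.0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dest:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for nxt, edge_cost in adj.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + edge_cost, nxt, path + [nxt]))
    return None, float("inf")


def true_latency(path):
    """Latency actually experienced along a path: weight plus queueing delay per hop."""
    total = 0.0
    for u, v in zip(path, path[1:]):
        weight, queue = EDGES.get((u, v)) or EDGES[(v, u)]
        total += weight + queue
    return total


static_path, _ = dijkstra(build_adjacency(lambda w, q: w), "A", "D")
aware_path, _ = dijkstra(build_adjacency(lambda w, q: w + q), "A", "D")

print(static_path, len(static_path) - 1, true_latency(static_path))
print(aware_path, len(aware_path) - 1, true_latency(aware_path))
```

In this toy case the static route takes 2 hops but crosses the congested link, while the traffic-aware route takes 3 hops and avoids it, which is the pattern we hope to see when comparing OSPF and PPO in the plots.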