# Student Simulation: Exploring p_kaia and p_pentagon

**Authors:** Brage Bromset Bestvold and Gabriel Røer

This notebook explores how the probabilities of entering Pentagon (`p_pentagon`) and Kaia (`p_kaia`) dormitories affect the simulation outcomes for student Alex walking home from AudMax.

In [None]:
import io
import os
import sys
from contextlib import redirect_stdout  # used to silence the simulation's print output

# Add src directory to path for imports
sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), '..', 'src')))

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from student_sim.sim_factory import create_simulation

sns.set(style="whitegrid")

print("Libraries loaded successfully!")

## Simulation Setup

We will run simulations with different combinations of `p_kaia` and `p_pentagon` values to observe:
1. How many students end up at each destination
2. How many steps and how much time it takes to reach a destination
3. How these outcomes depend on the entry probabilities

In [None]:
def run_simulation_silent(n_students, p_pentagon, p_kaia, n_runs=10):
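    """
    Run `n_runs` independent simulations with the given entry probabilities.

    Returns a DataFrame with one row per run (destination counts, average
    steps, and average time); averaging across runs is left to the caller.
    """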
    results = []
    
    for run in range(n_runs):
        # Suppress print output
        f = io.StringIO()
        with redirect_stdout(f):
            world, sim = create_simulation(
                n_students=n_students,
                endpoints=[
                    {"name": "Pentagon", "position": 10, "entry_prob": p_pentagon, "max_capacity": None},
                    {"name": "Kaia", "position": 90, "entry_prob": p_kaia, "max_capacity": None}
                ]
            )
            sim.run()
        
        stats = sim.get_stats()
        results.append({
            'run': run,
            'p_pentagon': p_pentagon,
            'p_kaia': p_kaia,
            'pentagon_count': stats['destinations'].get('Pentagon', 0),
            'kaia_count': stats['destinations'].get('Kaia', 0),
            'avg_steps': stats['avg_steps'],
            'avg_time': stats['avg_time']
        })
    
    return pd.DataFrame(results)

## Experiment 1: Varying p_pentagon with fixed p_kaia

First, let's see how the number of students ending up at Pentagon changes when we vary `p_pentagon` while keeping `p_kaia` constant at 0.5.

In [None]:
# Define probability values to test
p_values = [0.2, 0.4, 0.6, 0.8, 1.0]
n_students = 20
n_runs = 5

# Run simulations with varying p_pentagon
print("Running simulations with varying p_pentagon (p_kaia fixed at 0.5)...")
results_pentagon = []
for p_pent in p_values:
    print(f"  Testing p_pentagon = {p_pent}")
    df = run_simulation_silent(n_students, p_pent, 0.5, n_runs)
    results_pentagon.append(df)

df_pentagon = pd.concat(results_pentagon, ignore_index=True)
print("Done!")

In [None]:
# Plot: Destination distribution vs p_pentagon
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Average destination counts
avg_by_p_pent = df_pentagon.groupby('p_pentagon')[['pentagon_count', 'kaia_count']].mean()
avg_by_p_pent.plot(kind='bar', ax=axes[0], color=['steelblue', 'coral'])
axes[0].set_title('Average Students per Destination vs p_pentagon\n(p_kaia = 0.5)')
axes[0].set_xlabel('p_pentagon')
axes[0].set_ylabel('Number of Students')
axes[0].legend(['Pentagon', 'Kaia'])
axes[0].tick_params(axis='x', rotation=0)

# Average time and steps
avg_time_steps = df_pentagon.groupby('p_pentagon')[['avg_steps', 'avg_time']].mean()
avg_time_steps['avg_time_normalized'] = avg_time_steps['avg_time'] / 100  # Scale for visibility
avg_time_steps[['avg_steps', 'avg_time_normalized']].plot(kind='bar', ax=axes[1], color=['green', 'purple'])
axes[1].set_title('Average Steps and Time vs p_pentagon\n(p_kaia = 0.5)')
axes[1].set_xlabel('p_pentagon')
axes[1].set_ylabel('Steps / Time (scaled by 1/100)')
axes[1].legend(['Avg Steps', 'Avg Time / 100'])
axes[1].tick_params(axis='x', rotation=0)

plt.tight_layout()
plt.show()

## Experiment 2: Varying p_kaia with fixed p_pentagon

Now let's vary `p_kaia` while keeping `p_pentagon` constant at 0.5.

In [None]:
# Run simulations with varying p_kaia
print("Running simulations with varying p_kaia (p_pentagon fixed at 0.5)...")
results_kaia = []
for p_k in p_values:
    print(f"  Testing p_kaia = {p_k}")
    df = run_simulation_silent(n_students, 0.5, p_k, n_runs)
    results_kaia.append(df)

df_kaia = pd.concat(results_kaia, ignore_index=True)
print("Done!")

In [None]:
# Plot: Destination distribution vs p_kaia
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Average destination counts
avg_by_p_kaia = df_kaia.groupby('p_kaia')[['pentagon_count', 'kaia_count']].mean()
avg_by_p_kaia.plot(kind='bar', ax=axes[0], color=['steelblue', 'coral'])
axes[0].set_title('Average Students per Destination vs p_kaia\n(p_pentagon = 0.5)')
axes[0].set_xlabel('p_kaia')
axes[0].set_ylabel('Number of Students')
axes[0].legend(['Pentagon', 'Kaia'])
axes[0].tick_params(axis='x', rotation=0)

# Average time and steps
avg_time_steps_k = df_kaia.groupby('p_kaia')[['avg_steps', 'avg_time']].mean()
avg_time_steps_k['avg_time_normalized'] = avg_time_steps_k['avg_time'] / 100
avg_time_steps_k[['avg_steps', 'avg_time_normalized']].plot(kind='bar', ax=axes[1], color=['green', 'purple'])
axes[1].set_title('Average Steps and Time vs p_kaia\n(p_pentagon = 0.5)')
axes[1].set_xlabel('p_kaia')
axes[1].set_ylabel('Steps / Time (scaled by 1/100)')
axes[1].legend(['Avg Steps', 'Avg Time / 100'])
axes[1].tick_params(axis='x', rotation=0)

plt.tight_layout()
plt.show()

## Experiment 3: Heatmap of Different Combinations

Let's explore a grid of different `p_pentagon` and `p_kaia` combinations to see how they interact.

In [None]:
# Create a grid of p_pentagon and p_kaia values
p_grid = [0.2, 0.5, 0.8]

print("Running simulations for all combinations...")
results_grid = []
for p_pent in p_grid:
    for p_k in p_grid:
        print(f"  Testing p_pentagon = {p_pent}, p_kaia = {p_k}")
        df = run_simulation_silent(n_students, p_pent, p_k, n_runs)
        results_grid.append(df)

df_grid = pd.concat(results_grid, ignore_index=True)
print("Done!")

In [None]:
# Create heatmaps
fig, axes = plt.subplots(2, 2, figsize=(14, 12))

# Pivot tables for heatmaps
avg_grid = df_grid.groupby(['p_pentagon', 'p_kaia']).mean().reset_index()

# Heatmap: Pentagon count
pivot_pentagon = avg_grid.pivot(index='p_pentagon', columns='p_kaia', values='pentagon_count')
sns.heatmap(pivot_pentagon, annot=True, fmt='.1f', cmap='Blues', ax=axes[0, 0])
axes[0, 0].set_title('Average Students at Pentagon')
axes[0, 0].set_xlabel('p_kaia')
axes[0, 0].set_ylabel('p_pentagon')

# Heatmap: Kaia count
pivot_kaia = avg_grid.pivot(index='p_pentagon', columns='p_kaia', values='kaia_count')
sns.heatmap(pivot_kaia, annot=True, fmt='.1f', cmap='Oranges', ax=axes[0, 1])
axes[0, 1].set_title('Average Students at Kaia')
axes[0, 1].set_xlabel('p_kaia')
axes[0, 1].set_ylabel('p_pentagon')

# Heatmap: Average steps
pivot_steps = avg_grid.pivot(index='p_pentagon', columns='p_kaia', values='avg_steps')
sns.heatmap(pivot_steps, annot=True, fmt='.0f', cmap='Greens', ax=axes[1, 0])
axes[1, 0].set_title('Average Steps Taken')
axes[1, 0].set_xlabel('p_kaia')
axes[1, 0].set_ylabel('p_pentagon')

# Heatmap: Average time
pivot_time = avg_grid.pivot(index='p_pentagon', columns='p_kaia', values='avg_time')
sns.heatmap(pivot_time, annot=True, fmt='.0f', cmap='Purples', ax=axes[1, 1])
axes[1, 1].set_title('Average Time Elapsed (seconds)')
axes[1, 1].set_xlabel('p_kaia')
axes[1, 1].set_ylabel('p_pentagon')

plt.tight_layout()
plt.show()

## Experiment 4: Pentagon/Kaia Ratio Analysis

Let's compare the share of students ending up at Pentagon and at Kaia for each probability combination.

In [None]:
# Calculate each destination's share of students as a percentage
avg_grid['pentagon_pct'] = avg_grid['pentagon_count'] / n_students * 100
avg_grid['kaia_pct'] = avg_grid['kaia_count'] / n_students * 100

# Create a grouped bar chart
fig, ax = plt.subplots(figsize=(12, 6))

x = range(len(avg_grid))
width = 0.35

bars1 = ax.bar([i - width/2 for i in x], avg_grid['pentagon_pct'], width, label='Pentagon', color='steelblue')
bars2 = ax.bar([i + width/2 for i in x], avg_grid['kaia_pct'], width, label='Kaia', color='coral')

# Labels
ax.set_xlabel('(p_pentagon, p_kaia)')
ax.set_ylabel('Percentage of Students')
ax.set_title('Destination Distribution for Different Probability Combinations')
ax.set_xticks(x)
ax.set_xticklabels([f'({row["p_pentagon"]}, {row["p_kaia"]})' for _, row in avg_grid.iterrows()], rotation=45)
ax.legend()

plt.tight_layout()
plt.show()
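
The Pentagon/Kaia ratio itself can be derived from the averaged counts. The following is a minimal sketch, not part of `student_sim`: `pentagon_kaia_ratio` is a hypothetical helper, and the counts passed to it are placeholder values rather than simulation output. The zero check matters because Kaia can end up empty when `p_kaia` is low.

```python
import math


def pentagon_kaia_ratio(pentagon_count, kaia_count):
    """Hypothetical helper: Pentagon-to-Kaia ratio of average student counts."""
    if kaia_count == 0:
        # No students at Kaia: the ratio is undefined when Pentagon is also
        # empty, and unbounded otherwise.
        return math.nan if pentagon_count == 0 else math.inf
    return pentagon_count / kaia_count


# Placeholder counts for illustration only (NOT simulation output).
print(pentagon_kaia_ratio(12.0, 8.0))  # prints 1.5
```

Applied to the notebook's data, this could be used row by row, for example `avg_grid.apply(lambda r: pentagon_kaia_ratio(r['pentagon_count'], r['kaia_count']), axis=1)`.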

## Summary Statistics

In [None]:
# Display summary table
summary_cols = ['p_pentagon', 'p_kaia', 'pentagon_count', 'kaia_count', 'avg_steps', 'avg_time']
summary_df = avg_grid[summary_cols].copy()
summary_df.columns = ['p_pentagon', 'p_kaia', 'Pentagon Students', 'Kaia Students', 'Avg Steps', 'Avg Time (s)']
print("\nSummary of Simulation Results:")
print("="*80)
print(summary_df.to_string(index=False))

## Observations and Conclusions

Based on our simulations, we can make the following observations:

### 1. Entry Probability Effects
- **Higher `p_pentagon`** increases the likelihood of students ending up at Pentagon.
- **Higher `p_kaia`** increases the likelihood of students ending up at Kaia.
- When both probabilities are equal, the distribution between destinations is roughly equal due to the symmetric random walk.

### 2. Time and Steps
- **Lower entry probabilities** result in more steps and more time, as students may pass by a dorm multiple times before entering.
- **Higher entry probabilities** lead to faster completion of the simulation.
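
As a rough worked example of why lower entry probabilities lengthen the walk, assume that each arrival at a dorm is an independent entry attempt that succeeds with probability $p$, and ignore the other dorm for the moment. Whether `student_sim` handles attempts exactly this way is an assumption. Under it, the number of arrivals $N$ needed before the student gets in follows a geometric distribution:

$$
P(N = k) = (1 - p)^{k-1}\, p, \qquad E[N] = \frac{1}{p}
$$

With $p = 0.2$ a student needs $1 / 0.2 = 5$ arrivals on average, while $p = 0.8$ needs only $1 / 0.8 = 1.25$. Every failed attempt adds the steps spent walking away from the dorm and back again, so average steps and time grow as $p$ shrinks.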

### 3. Symmetric Walk Effect
- Alex starts at position 50 (AudMax), which is equidistant (40 positions) from Pentagon (position 10) and Kaia (position 90), and moves east or west with equal probability (50% each), so the walk itself doesn't favor either destination.
- The entry probabilities become the deciding factor in determining where students end up.
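
The symmetric walk described above can be sketched with a small, self-contained toy model. This is not the `student_sim` implementation: `simulate_walk`, the street bounds of 0 to 100, the one-position step size, and the step cap are all assumptions made for illustration. Only the start position (50) and the dorm positions (10 and 90) come from the notebook.

```python
import random


def simulate_walk(p_pentagon, p_kaia, rng, start=50, max_steps=1_000_000):
    """Toy stand-in for one student's walk; not the student_sim implementation."""
    dorms = {10: ("Pentagon", p_pentagon), 90: ("Kaia", p_kaia)}
    position, steps = start, 0
    while steps < max_steps:
        # Symmetric step: one position west or east, clamped to the
        # assumed street range [0, 100].
        position = min(100, max(0, position + rng.choice((-1, 1))))
        steps += 1
        if position in dorms:
            name, entry_prob = dorms[position]
            # Each arrival at a dorm counts as one independent entry attempt.
            if rng.random() < entry_prob:
                return name, steps
    return None, steps  # step cap reached without entering either dorm


rng = random.Random(0)  # fixed seed so the sketch is repeatable
outcomes = [simulate_walk(0.5, 0.5, rng) for _ in range(200)]
pentagon_share = sum(name == "Pentagon" for name, _ in outcomes) / len(outcomes)
print(f"Share ending at Pentagon: {pentagon_share:.2f}")
```

Passing unequal values, for example `simulate_walk(0.8, 0.2, rng)`, is a quick way to see how the entry probabilities tilt an otherwise symmetric walk.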

### 4. Random Nature
- Due to the stochastic nature of the simulation, results vary between runs, which is why we averaged over multiple runs.
- The variation is larger for extreme probability combinations (very low or very high values).
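
Run-to-run variation can be checked by computing a standard deviation per probability combination alongside the mean. The sketch below uses only the standard library, and the per-run counts in it are hypothetical placeholder values, not results from the simulation.

```python
from statistics import mean, stdev

# Hypothetical per-run Pentagon counts (placeholder values, NOT simulation
# output), keyed by (p_pentagon, p_kaia).
pentagon_counts = {
    (0.2, 0.2): [9, 12, 10, 8, 11],
    (0.5, 0.5): [10, 11, 9, 10, 10],
}

for combo, counts in pentagon_counts.items():
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    print(combo, f"mean={mean(counts):.1f}", f"std={stdev(counts):.2f}")
```

On the notebook's own data the equivalent would be `df_grid.groupby(['p_pentagon', 'p_kaia'])['pentagon_count'].std()`, which also uses the sample standard deviation by default. With only 5 runs per combination, such estimates are themselves noisy.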

### Key Insight
The entry probabilities `p_kaia` and `p_pentagon` directly influence where students end up. When one probability is significantly higher than the other, that destination attracts more students. The time to reach a destination is also affected, with lower probabilities causing longer walks as students may need multiple attempts to enter a dormitory.