# Bridge Criticality Analysis

This notebook documents the process of analyzing bridge criticality based on two key metrics:
1. **Structural Vulnerability** (measured by functionality percentage)
2. **Network Importance** (measured by global efficiency drop)

We'll analyze bridges across multiple buffer distances (5 km to 60 km) to identify the most critical structures: those that combine low functionality with a high impact on network efficiency.
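
The notebook loads precomputed efficiency results, so the metric itself is not computed here. As a minimal sketch of what a "global efficiency drop" means, the example below applies the standard definition (the mean of inverse shortest-path lengths over all ordered node pairs) to a hypothetical four-node ring, assuming an unweighted, undirected network. The names `global_efficiency`, `remove_edge`, and `toy_network` are illustrative and do not come from the datasets; the actual results may have been produced with weighted distances or a different tool.

```python
from collections import deque

def global_efficiency(adjacency):
    """Mean inverse shortest-path length over all ordered node pairs (unweighted)."""
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        # Breadth-first search gives hop-count distances from `source`
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Unreachable nodes are absent from `dist` and contribute 0
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))

def remove_edge(adjacency, u, v):
    """Return a copy of the graph without the undirected edge (u, v)."""
    return {node: {nbr for nbr in nbrs if {node, nbr} != {u, v}}
            for node, nbrs in adjacency.items()}

# Hypothetical toy network: a 4-node ring A-B-C-D-A
toy_network = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "A"}}

original = global_efficiency(toy_network)
damaged = global_efficiency(remove_edge(toy_network, "A", "B"))
change = damaged - original  # a negative value means efficiency dropped
```

In this sketch, removing the edge A-B plays the role of a bridge failure, and `change` is the analogue of the `Change in Efficiency` column used later in the notebook.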

## 1. Import Libraries and Load Data

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.colors import LinearSegmentedColormap
import os

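# Set plot style through seaborn: the 'seaborn-whitegrid' matplotlib style name
# was deprecated in matplotlib 3.6 and removed in later releases
sns.set_style("whitegrid")
sns.set_context("notebook", font_scale=1.2)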

# Create output directory
os.makedirs('output', exist_ok=True)

In [None]:
# Load the datasets
print("Loading datasets...")
functionality_df = pd.read_csv('Functionality_475years.csv')
efficiency_df = pd.read_csv('efficiency_results_all_bridges.csv')

# Display the first few rows of each dataset
print("\nFunctionality dataset:")
display(functionality_df.head())

print("\nEfficiency dataset:")
display(efficiency_df.head())

## 2. Data Preprocessing

In [None]:
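# Clean up the functionality column (remove % sign and convert to float).
# Casting to str first keeps the cell safe to re-run after the column is already numeric.
functionality_df['Functionality'] = functionality_df['Functionality'].astype(str).str.rstrip('%').astype(float)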

# Rename IOP column to match with efficiency_df
functionality_df = functionality_df.rename(columns={'IOP': 'Bridge IOP'})

# Get unique buffer distances
buffer_distances = sorted(efficiency_df['Buffer Distance (km)'].unique())
print(f"Found {len(buffer_distances)} unique buffer distances: {buffer_distances}")

# Get number of unique bridges
unique_bridges_count = efficiency_df['Bridge IOP'].nunique()
print(f"Found {unique_bridges_count} unique bridges in efficiency dataset")
print(f"Found {len(functionality_df)} bridges in functionality dataset")

## 3. Exploratory Data Analysis

In [None]:
# Analyze functionality distribution
plt.figure(figsize=(10, 6))
sns.histplot(functionality_df['Functionality'], bins=20, kde=True)
plt.xlabel('Functionality (%)')
plt.ylabel('Number of Bridges')
plt.title('Distribution of Bridge Functionality')
plt.grid(True, linestyle='--', alpha=0.7)
plt.savefig('output/functionality_distribution.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
# Analyze efficiency change distribution for a specific buffer (e.g., 5km)
buffer_5km = efficiency_df[efficiency_df['Buffer Distance (km)'] == 5]

plt.figure(figsize=(10, 6))
sns.histplot(buffer_5km['Change in Efficiency'].abs(), bins=20, kde=True)
plt.xlabel('Absolute Change in Efficiency (5km Buffer)')
plt.ylabel('Number of Bridges')
plt.title('Distribution of Efficiency Change Magnitude (5km Buffer)')
plt.grid(True, linestyle='--', alpha=0.7)
plt.savefig('output/efficiency_change_distribution_5km.png', dpi=300, bbox_inches='tight')
plt.show()

## 4. Calculate Bridge Criticality Scores

We'll calculate criticality scores for each bridge at each buffer distance using the following methodology:

1. **Normalize Functionality**: Lower functionality = higher criticality
2. **Normalize Efficiency Change**: Higher efficiency drop = higher criticality
3. **Combine Scores**: Equal weighting of both metrics

This approach gives us a comprehensive criticality score that considers both structural vulnerability and network importance.
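
The three steps above can be sketched on a tiny table before applying them to the real data. The bridge values below are hypothetical, and the helper `min_max` is an illustrative name; its zero-range guard is an added assumption that the notebook's own cell does not include.

```python
import pandas as pd

def min_max(series):
    """Scale a Series to [0, 1]; return all zeros when every value is identical."""
    span = series.max() - series.min()
    if span == 0:
        # Avoid dividing by zero when the metric does not vary
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span

# Hypothetical values for three bridges at one buffer distance
toy = pd.DataFrame({
    "Bridge IOP": ["B1", "B2", "B3"],
    "Functionality": [40.0, 70.0, 100.0],
    "Change in Efficiency": [-0.004, -0.001, -0.002],
})

norm_func = 1 - min_max(toy["Functionality"])          # low functionality -> high score
norm_eff = min_max(toy["Change in Efficiency"].abs())  # large change -> high score
toy["Criticality Score"] = (norm_func + norm_eff) / 2  # equal weights
```

Here B1 has both the lowest functionality and the largest efficiency change, so it receives the highest score. Because the scaling is relative to the minimum and maximum within one buffer distance, scores are rankings within that buffer rather than absolute measures.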

In [None]:
# Create a dictionary to store results for each buffer distance
results = {}
all_criticality_scores = []

# Process each buffer distance
for buffer in buffer_distances:
    print(f"\nProcessing buffer distance: {buffer} km")
    
    # Filter efficiency data for current buffer
    buffer_df = efficiency_df[efficiency_df['Buffer Distance (km)'] == buffer]
    
    # Merge with functionality data
    merged_df = pd.merge(buffer_df, functionality_df[['Bridge IOP', 'Functionality']], 
                         on='Bridge IOP', how='inner')
    
    print(f"Found {len(merged_df)} bridges with both functionality and efficiency data for buffer {buffer} km")
    
    # Take the magnitude of the efficiency change. Despite the column name, this is
    # only the absolute value; the 0-1 scaling is applied in 'Norm_Eff' below.
    merged_df['Normalized Efficiency Change'] = merged_df['Change in Efficiency'].abs()
    
    # Calculate criticality score (higher is more critical)
    # Normalize both metrics to 0-1 range for fair combination
    min_func = merged_df['Functionality'].min()
    max_func = merged_df['Functionality'].max()
    min_eff = merged_df['Normalized Efficiency Change'].min()
    max_eff = merged_df['Normalized Efficiency Change'].max()
    
    # Normalize and invert functionality (lower functionality = higher criticality)
    merged_df['Norm_Func'] = 1 - ((merged_df['Functionality'] - min_func) / (max_func - min_func))
    
    # Normalize efficiency change (higher change = higher criticality)
    merged_df['Norm_Eff'] = (merged_df['Normalized Efficiency Change'] - min_eff) / (max_eff - min_eff)
    
    # Combined criticality score (0-1 range, higher is more critical)
    merged_df['Criticality Score'] = (merged_df['Norm_Func'] + merged_df['Norm_Eff']) / 2
    
    # Add relative efficiency change (Change in Efficiency / Original Global Efficiency)
    merged_df['Relative Efficiency Change'] = merged_df['Change in Efficiency'] / merged_df['Original Global Efficiency']
    merged_df['Absolute Relative Efficiency Change'] = merged_df['Relative Efficiency Change'].abs()
    
    # Store results
    results[buffer] = merged_df.copy()
    
    # Append to the comprehensive list; 'Buffer Distance (km)' is already present
    # in merged_df because it comes from the filtered efficiency data
    all_criticality_scores.append(merged_df)
    
    # Display the top 5 most critical bridges for this buffer
    print(f"Top 5 most critical bridges for buffer {buffer} km:")
    display(merged_df.sort_values('Criticality Score', ascending=False)[['Bridge IOP', 'Highway Type', 'Functionality', 
                                                                         'Normalized Efficiency Change', 'Criticality Score']].head(5))

In [None]:
# Combine all buffer distances into a single dataframe
all_scores_df = pd.concat(all_criticality_scores, ignore_index=True)

# Save comprehensive file
all_scores_df.to_csv('output/all_buffer_criticality_scores.csv', index=False)
print("Saved comprehensive criticality scores to output/all_buffer_criticality_scores.csv")

## 5. Identify Most Critical Bridges Across Buffer Distances

In [None]:
# Create a summary of top 10 most critical bridges for each buffer
summary_rows = []
for buffer in buffer_distances:
    buffer_scores = all_scores_df[all_scores_df['Buffer Distance (km)'] == buffer]
    top_bridges = buffer_scores.sort_values('Criticality Score', ascending=False).head(10)
    
    for _, bridge in top_bridges.iterrows():
        summary_rows.append({
            'Buffer Distance (km)': buffer,
            'Bridge IOP': bridge['Bridge IOP'],
            'OSM ID': bridge['OSM ID'],
            'Highway Type': bridge['Highway Type'],
            'Functionality (%)': bridge['Functionality'],
            'Criticality Score': bridge['Criticality Score'],
            'Normalized Efficiency Change': bridge['Normalized Efficiency Change'],
            'Absolute Relative Efficiency Change': bridge['Absolute Relative Efficiency Change']
        })

summary_df = pd.DataFrame(summary_rows)
summary_file = 'output/top_critical_bridges_summary.csv'
summary_df.to_csv(summary_file, index=False)
print(f"Saved summary of top critical bridges to {summary_file}")

# Display the first few rows of the summary
display(summary_df.head(10))

## 6. Analyze Bridge Criticality Patterns

Let's analyze which bridges appear most frequently in the top 10 most critical bridges across different buffer distances.
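
As a small illustration of this counting step, the sketch below uses a hypothetical summary table (`toy_summary`, with made-up bridge IDs) in the same shape as the per-buffer top-10 summary: one row per buffer distance and bridge. The share of buffer distances is an extra convenience measure and is not computed elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical top-N summary rows: one row per (buffer distance, bridge) pair
toy_summary = pd.DataFrame({
    "Buffer Distance (km)": [5, 5, 10, 10, 15, 15],
    "Bridge IOP": ["B1", "B2", "B1", "B3", "B1", "B2"],
})

# Number of buffer distances at which each bridge made the top-N list
appearances = toy_summary["Bridge IOP"].value_counts()

# Keep only bridges that recur at more than one buffer distance
recurring = appearances[appearances > 1]

# Share of all buffer distances at which each recurring bridge appears
n_buffers = toy_summary["Buffer Distance (km)"].nunique()
share = recurring / n_buffers
```

A bridge with a share close to 1 ranks as critical at nearly every spatial scale considered, while a low share points to importance at only a few buffer distances.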

In [None]:
# Count frequency of bridges in the top 10 most critical across buffer distances
bridge_frequency = summary_df['Bridge IOP'].value_counts()

# Display bridges that appear in multiple buffer distances
frequent_bridges = bridge_frequency[bridge_frequency > 1]
print("Bridges appearing in top 10 most critical across multiple buffer distances:")
display(frequent_bridges)

# Get details of the most frequent bridges
if len(frequent_bridges) > 0:
    most_frequent_bridge = frequent_bridges.index[0]
    bridge_details = summary_df[summary_df['Bridge IOP'] == most_frequent_bridge]
    print(f"\nDetails of most frequent critical bridge ({most_frequent_bridge}):")
    display(bridge_details)

## 7. Visualize Bridge Criticality Across Buffer Distances

Now we'll create visualizations to better understand the relationship between functionality, efficiency change, and criticality scores across different buffer distances.

In [None]:
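# Create a figure with one subplot per buffer distance. The 4x3 grid has 12 cells,
# so the last cell stays free for the colorbar when there are at most 11 distances.
from matplotlib import gridspec
fig = plt.figure(figsize=(20, 15))
gs = gridspec.GridSpec(4, 3, figure=fig, wspace=0.3, hspace=0.4)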

# Create a custom colormap from green to red
cmap = LinearSegmentedColormap.from_list('GreenToRed', ['green', 'yellow', 'red'])

# Process each buffer distance
for i, buffer in enumerate(buffer_distances):
    # Calculate subplot position
    row = i // 3
    col = i % 3
    
    # Create subplot
    ax = fig.add_subplot(gs[row, col])
    
    # Get data for current buffer
    buffer_data = results[buffer]
    
    # Plot all bridges as small gray dots
    ax.scatter(buffer_data['Functionality'], buffer_data['Normalized Efficiency Change'], 
               color='lightgray', alpha=0.5, s=20)
    
    # Get top 10 critical bridges for this buffer
    top_bridges = buffer_data.sort_values('Criticality Score', ascending=False).head(10)
    
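    # Plot critical bridges with color based on criticality. The color scale is fixed
    # to the 0-1 score range so that colors are comparable across panels and the
    # shared colorbar (built from the last panel's scatter) applies to all of them.
    scatter = ax.scatter(top_bridges['Functionality'], 
                        top_bridges['Normalized Efficiency Change'],
                        c=top_bridges['Criticality Score'], 
                        cmap=cmap, 
                        vmin=0, 
                        vmax=1, 
                        s=100, 
                        edgecolor='black', 
                        zorder=5)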
    
    # Add labels for top 3 critical bridges
    for _, bridge in top_bridges.head(3).iterrows():
        ax.annotate(bridge['Bridge IOP'], 
                   (bridge['Functionality'], bridge['Normalized Efficiency Change']),
                   xytext=(5, 5), 
                   textcoords='offset points', 
                   fontsize=8)
    
    # Set title and labels
    ax.set_title(f'Buffer: {buffer} km', fontsize=12)
    ax.set_xlabel('Functionality (%)', fontsize=10)
    ax.set_ylabel('Normalized Efficiency Change', fontsize=10)
    ax.grid(True, linestyle='--', alpha=0.7)
    
    # Set y-axis to start from 0
    ax.set_ylim(bottom=0)

# Add a colorbar for the criticality score
if len(buffer_distances) < 12:  # Make sure we have space for the colorbar
    cbar_ax = fig.add_subplot(gs[3, 2])
    cbar = plt.colorbar(scatter, cax=cbar_ax, orientation='horizontal')
    cbar.set_label('Criticality Score (Higher = More Critical)', fontsize=12)

# Add an overall title
fig.suptitle('Bridge Functionality vs Normalized Efficiency Change Across Buffer Distances\nCritical Bridges Highlighted (Green to Red = Increasing Criticality)', 
             fontsize=16, y=0.98)

# Save the figure
plt.savefig('output/buffer_comparison_matrix.png', dpi=300, bbox_inches='tight')
plt.show()

## 8. Analyze Efficiency Impact Across Buffer Distances

In [None]:
# Create a visualization showing how efficiency impact changes with buffer distance
plt.figure(figsize=(12, 8))

# Group by buffer distance and calculate the mean magnitude of the efficiency change.
# Absolute values are averaged per bridge so the reference line is on the same footing
# as the per-bridge 'Normalized Efficiency Change' values plotted below.
buffer_means = efficiency_df['Change in Efficiency'].abs().groupby(efficiency_df['Buffer Distance (km)']).mean()
buffer_distances_list = buffer_means.index.tolist()
buffer_means_list = buffer_means.values.tolist()

# Plot mean values as a reference line
plt.plot(buffer_distances_list, buffer_means_list, 'b--', label='Mean Absolute Efficiency Change')

# Get the most frequent critical bridges (appearing in at least 3 buffer distances)
frequent_bridges = bridge_frequency[bridge_frequency >= 3].index.tolist()

# Plot these bridges across buffer distances
for bridge_iop in frequent_bridges[:5]:  # Limit to top 5 most frequent bridges
    bridge_data = summary_df[summary_df['Bridge IOP'] == bridge_iop]
    plt.plot(bridge_data['Buffer Distance (km)'], bridge_data['Normalized Efficiency Change'], 'o-', label=bridge_iop)

plt.xlabel('Buffer Distance (km)')
plt.ylabel('Normalized Efficiency Change')
plt.title('Efficiency Impact of Critical Bridges Across Buffer Distances')
plt.grid(True, linestyle='--', alpha=0.7)
plt.legend()
plt.savefig('output/efficiency_impact_across_distances.png', dpi=300, bbox_inches='tight')
plt.show()

## 9. Analyze Relationship Between Absolute and Relative Efficiency Change

In [None]:
# For a specific buffer (e.g., 5km), compare absolute vs. relative efficiency change
buffer_5km = results[5]

plt.figure(figsize=(10, 6))
plt.scatter(buffer_5km['Normalized Efficiency Change'], 
            buffer_5km['Absolute Relative Efficiency Change'], 
            alpha=0.6)

# Highlight top 10 critical bridges
top_bridges = buffer_5km.sort_values('Criticality Score', ascending=False).head(10)
plt.scatter(top_bridges['Normalized Efficiency Change'], 
            top_bridges['Absolute Relative Efficiency Change'],
            color='red', s=100, edgecolor='black', zorder=5)

# Add labels for top 3 critical bridges
for _, bridge in top_bridges.head(3).iterrows():
    plt.annotate(bridge['Bridge IOP'], 
                (bridge['Normalized Efficiency Change'], bridge['Absolute Relative Efficiency Change']),
                xytext=(5, 5), 
                textcoords='offset points')

plt.xlabel('Absolute Efficiency Change')
plt.ylabel('Relative Efficiency Change (|Change/Original|)')
plt.title('Absolute vs. Relative Efficiency Change (5km Buffer)')
plt.grid(True, linestyle='--', alpha=0.7)
plt.savefig('output/absolute_vs_relative_efficiency.png', dpi=300, bbox_inches='tight')
plt.show()

## 10. Summary and Conclusions

Based on our analysis, we can draw the following conclusions:

1. **Critical Bridge Identification**: We've successfully identified bridges with both low functionality and high efficiency drop across different buffer distances.

2. **Buffer Distance Impact**: The efficiency drop is largest at smaller buffer distances (5-10 km) and decreases as the buffer distance increases.

3. **Consistent Critical Bridges**: Some bridges appear consistently critical across multiple buffer distances, indicating their importance at both local and regional scales.

4. **Criticality Score Methodology**: Our combined criticality score effectively identifies bridges that are both structurally vulnerable and important for network connectivity.

5. **Prioritization for Seismic Risk Management**: The identified critical bridges should be prioritized for seismic retrofitting and protection measures.

This analysis demonstrates the value of combining structural vulnerability metrics (functionality) with network importance metrics (efficiency impact) to identify the most critical infrastructure elements requiring protection and priority repair after seismic events.