# Common Modeling Pitfalls: Mistakes That Make Network Models Unrealistic

This notebook demonstrates **common mistakes** in network models that make them unrealistic and unusable.

Understanding modeling pitfalls is essential because:
- **Pitfalls create gaps between model and reality** - models don't accurately represent the real network
- **Recommendations based on flawed models cannot be implemented** - they don't work in reality
- **Identifying pitfalls prevents failures** - catch problems before implementing recommendations
- **Recognizing pitfalls helps improve models** - fix problems to get better recommendations


## Key Concepts

**Common Modeling Pitfalls**:
- **Missing nodes or links**: Model doesn't include all relevant network elements
- **Unrealistic capacity assumptions**: Model assumes capacities that don't match reality
- **Incorrect cost assumptions**: Model uses wrong costs, leading to suboptimal recommendations
- **Time-dependent constraints ignored**: Model assumes constant constraints when they vary over time
- **Route availability ignored**: Model assumes all routes are always available

**Why Pitfalls Are Dangerous**:
- They create gaps between the model and reality
- Recommendations based on flawed models cannot be implemented
- Implementing impossible recommendations wastes time and resources
- They damage trust in analytics and network models

**Critical insight**: Understanding common pitfalls helps you identify problems in models before implementing recommendations. This prevents failures and helps improve model quality.
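
To make the cost pitfall listed above concrete, here is a minimal sketch. All costs are made-up illustrative numbers, and `cheapest_route` is a hypothetical helper, not part of any library. It shows how one stale cost figure can flip which route looks cheapest:

```python
# Hypothetical per-unit costs for two routes from W1 to S2 (illustrative only).
model_costs = {'W1 → DC1 → S2': 12, 'W1 → DC2 → S2': 15}   # what the model assumes
actual_costs = {'W1 → DC1 → S2': 18, 'W1 → DC2 → S2': 15}  # what is actually paid


def cheapest_route(route_costs):
    """Return the name of the route with the lowest cost."""
    return min(route_costs, key=route_costs.get)


model_choice = cheapest_route(model_costs)
actual_choice = cheapest_route(actual_costs)

# Price the model's recommendation at real costs and compare with the true optimum.
extra_cost = actual_costs[model_choice] - actual_costs[actual_choice]

print(f"Model recommends:  {model_choice}")
print(f"Actually cheapest: {actual_choice}")
print(f"Extra cost per unit from following the model: {extra_cost}")
```

The recommendation is still implementable here, unlike the capacity and availability pitfalls, but it is more expensive than the alternative the model rejected.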


## Scenario: Transportation Network Model

A company uses a network model to optimize delivery routes. The model recommends specific routes, but some recommendations cannot be implemented.

**The Problem**:
- Model recommends routes that don't exist
- Model assumes capacities that are too high
- Model ignores that some routes are closed during certain times
- Model is missing a recently opened distribution center

**The Challenge**:
- How do you identify these pitfalls?
- What questions should you ask about the model?
- How do you fix the problems?

**The Question**: What pitfalls exist in this model? How do they affect recommendations?


## Step 1: Install Required Packages (Colab)


In [None]:
%pip install networkx matplotlib pandas numpy -q


## Step 2: Import Libraries


In [None]:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from IPython.display import display

plt.style.use('default')
plt.rcParams['figure.figsize'] = (16, 10)
plt.rcParams['font.size'] = 10


## Step 3: Pitfall 1: Missing Nodes

The model is missing a recently opened distribution center:


In [None]:
# Model network (missing DC3)
G_model = nx.DiGraph()
G_model.add_nodes_from(['W1', 'W2', 'DC1', 'DC2', 'S1', 'S2', 'S3'])
G_model.add_edges_from([
    ('W1', 'DC1'), ('W1', 'DC2'),
    ('W2', 'DC1'), ('W2', 'DC2'),
    ('DC1', 'S1'), ('DC1', 'S2'),
    ('DC2', 'S2'), ('DC2', 'S3')
])

# Reality: DC3 was recently opened
G_reality = G_model.copy()
G_reality.add_node('DC3')
G_reality.add_edges_from([
    ('W1', 'DC3'), ('W2', 'DC3'),
    ('DC3', 'S1'), ('DC3', 'S3')
])

print("PITFALL 1: MISSING NODE")
print("=" * 60)
print("Model network: Missing DC3 (recently opened)")
print(f"  Model has {G_model.number_of_nodes()} nodes")
print(f"  Reality has {G_reality.number_of_nodes()} nodes")
print("\n⚠️  PROBLEM: Model doesn't know about DC3")
print("   Model cannot recommend using DC3")
print("   Model might recommend longer routes when shorter ones exist")

# Visualize
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

pos = {
    'W1': (0, 1), 'W2': (0, 0),
    'DC1': (1.5, 1.5), 'DC2': (1.5, 0.5),
    'DC3': (1.5, -0.5),  # Missing in model
    'S1': (3, 2), 'S2': (3, 1), 'S3': (3, 0)
}

# Model (missing DC3)
nx.draw_networkx_nodes(G_model, pos, nodelist=['W1', 'W2'], 
                      node_color='lightblue', node_size=2000, node_shape='s', ax=ax1)
nx.draw_networkx_nodes(G_model, pos, nodelist=['DC1', 'DC2'], 
                      node_color='lightgreen', node_size=2000, node_shape='s', ax=ax1)
nx.draw_networkx_nodes(G_model, pos, nodelist=['S1', 'S2', 'S3'], 
                      node_color='lightcoral', node_size=2000, node_shape='o', ax=ax1)
nx.draw_networkx_edges(G_model, pos, edge_color='gray', arrows=True, arrowsize=15, width=2, ax=ax1)
nx.draw_networkx_labels(G_model, pos, font_size=10, font_weight='bold', ax=ax1)
ax1.set_title('Model Network\nMissing DC3!', fontweight='bold', fontsize=12)
ax1.axis('off')

# Reality (has DC3)
nx.draw_networkx_nodes(G_reality, pos, nodelist=['W1', 'W2'], 
                      node_color='lightblue', node_size=2000, node_shape='s', ax=ax2)
nx.draw_networkx_nodes(G_reality, pos, nodelist=['DC1', 'DC2'], 
                      node_color='lightgreen', node_size=2000, node_shape='s', ax=ax2)
nx.draw_networkx_nodes(G_reality, pos, nodelist=['DC3'], 
                      node_color='yellow', node_size=2500, node_shape='s', 
                      edgecolors='red', linewidths=3, ax=ax2)  # Highlight missing node
nx.draw_networkx_nodes(G_reality, pos, nodelist=['S1', 'S2', 'S3'], 
                      node_color='lightcoral', node_size=2000, node_shape='o', ax=ax2)
nx.draw_networkx_edges(G_reality, pos, edge_color='gray', arrows=True, arrowsize=15, width=2, ax=ax2)
nx.draw_networkx_labels(G_reality, pos, font_size=10, font_weight='bold', ax=ax2)
ax2.set_title('Reality\nDC3 exists (highlighted in yellow)', fontweight='bold', fontsize=12)
ax2.axis('off')

plt.tight_layout()
plt.show()

print("\nKey Insight:")
print("  • Missing nodes mean model cannot use them")
print("  • Model recommendations may be suboptimal")
print("  • Always verify model includes all relevant nodes")
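
The comparison above can also be automated instead of read off a plot. This is a minimal sketch, assuming you have an independent, trusted list of real-world facilities and lanes (hard-coded here). `find_missing_elements` is a hypothetical helper, not a networkx function:

```python
def find_missing_elements(model_nodes, model_edges, real_nodes, real_edges):
    """Return (nodes, edges) that exist in reality but are absent from the model."""
    missing_nodes = sorted(set(real_nodes) - set(model_nodes))
    missing_edges = sorted(set(real_edges) - set(model_edges))
    return missing_nodes, missing_edges


# Same network as the cell above, written as plain lists.
model_nodes = ['W1', 'W2', 'DC1', 'DC2', 'S1', 'S2', 'S3']
model_edges = [
    ('W1', 'DC1'), ('W1', 'DC2'),
    ('W2', 'DC1'), ('W2', 'DC2'),
    ('DC1', 'S1'), ('DC1', 'S2'),
    ('DC2', 'S2'), ('DC2', 'S3'),
]

# Reality adds DC3 and its four lanes.
real_nodes = model_nodes + ['DC3']
real_edges = model_edges + [
    ('W1', 'DC3'), ('W2', 'DC3'),
    ('DC3', 'S1'), ('DC3', 'S3'),
]

missing_nodes, missing_edges = find_missing_elements(
    model_nodes, model_edges, real_nodes, real_edges
)
print(f"Nodes missing from model: {missing_nodes}")
print(f"Edges missing from model: {missing_edges}")
```

Because the helper only needs iterables, it should also accept `G_model.nodes`, `G_model.edges`, `G_reality.nodes` and `G_reality.edges` from the cell above.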


## Step 4: Pitfall 2: Unrealistic Capacity Assumptions

The model assumes capacities that are too high:


In [None]:
print("PITFALL 2: UNREALISTIC CAPACITY ASSUMPTIONS")
print("=" * 60)

# Model assumptions vs reality
capacity_data = {
    'Route': ['W1 → DC1', 'W1 → DC2', 'W2 → DC1', 'DC1 → S1'],
    'Model_Capacity': [100, 100, 100, 150],  # Model assumes
    'Actual_Capacity': [80, 75, 90, 120]    # Reality
}

capacity_df = pd.DataFrame(capacity_data)
capacity_df['Difference'] = capacity_df['Model_Capacity'] - capacity_df['Actual_Capacity']
capacity_df['Overestimate_%'] = (capacity_df['Difference'] / capacity_df['Actual_Capacity'] * 100).round(1)

display(capacity_df)

print("\n⚠️  PROBLEM: Model overestimates capacities")
print("   Model might recommend flows that exceed actual capacity")
print("   Recommendations cannot be implemented")

# Visualize
fig, ax = plt.subplots(figsize=(10, 6))

routes = capacity_df['Route'].values
x = np.arange(len(routes))
width = 0.35

bars1 = ax.bar(x - width/2, capacity_df['Model_Capacity'], width, 
              label='Model Assumption', color='lightblue', alpha=0.7)
bars2 = ax.bar(x + width/2, capacity_df['Actual_Capacity'], width, 
              label='Actual Capacity', color='lightcoral', alpha=0.7)

# Highlight overestimates
for i, (model_cap, actual_cap) in enumerate(zip(capacity_df['Model_Capacity'], capacity_df['Actual_Capacity'])):
    if model_cap > actual_cap:
        ax.plot([i - width/2, i + width/2], [model_cap, actual_cap], 
               'r--', linewidth=2, alpha=0.5)

ax.set_xlabel('Route', fontsize=11)
ax.set_ylabel('Capacity', fontsize=11)
ax.set_title('Unrealistic Capacity Assumptions\nModel Overestimates Actual Capacity',
            fontweight='bold', fontsize=12)
ax.set_xticks(x)
ax.set_xticklabels(routes, rotation=45, ha='right')
ax.legend()
ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print("\nKey Insight:")
print("  • Unrealistic capacities lead to impossible recommendations")
print("  • Always verify capacity assumptions match reality")
print("  • Model recommendations may exceed actual capacity")
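
One way to catch this before implementation is to check each recommended flow against measured capacity. This is a minimal sketch: the actual capacities are the ones from the table above, while the recommended flows are hypothetical values chosen for illustration, and `find_capacity_violations` is a hypothetical helper:

```python
def find_capacity_violations(recommended_flow, actual_capacity):
    """Return {route: excess} for routes whose recommended flow exceeds actual capacity.

    Raises KeyError if a route has no measured capacity, which is worth
    surfacing rather than silently skipping.
    """
    violations = {}
    for route, flow in recommended_flow.items():
        excess = flow - actual_capacity[route]
        if excess > 0:
            violations[route] = excess
    return violations


# Actual capacities from the table above.
actual_capacity = {'W1 → DC1': 80, 'W1 → DC2': 75, 'W2 → DC1': 90, 'DC1 → S1': 120}

# Hypothetical model output (illustrative): the model is free to use up to
# its own assumed capacities of 100, 100, 100 and 150.
recommended_flow = {'W1 → DC1': 100, 'W1 → DC2': 70, 'W2 → DC1': 90, 'DC1 → S1': 150}

violations = find_capacity_violations(recommended_flow, actual_capacity)
for route, excess in violations.items():
    print(f"{route}: recommended flow exceeds actual capacity by {excess}")
```

A flow exactly equal to capacity (W2 → DC1 here) is treated as feasible. Whether you want a safety margin below capacity is a design choice this sketch does not make.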


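## Step 5: Pitfall 3: Time-Dependent Constraints Ignored

The model assumes constant capacity, but actual capacity varies by time of day:


In [None]: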
print("PITFALL 3: TIME-DEPENDENT CONSTRAINTS IGNORED")
print("=" * 60)

# Model assumes constant capacity
model_capacity = 100  # trucks per hour (constant)

# Reality: Capacity varies by time
time_periods = ['6-9 AM', '9 AM-12 PM', '12-3 PM', '3-6 PM', '6-9 PM']
actual_capacities = [60, 100, 100, 80, 50]  # Rush hour reduces capacity

time_df = pd.DataFrame({
    'Time_Period': time_periods,
    'Model_Assumption': [model_capacity] * len(time_periods),
    'Actual_Capacity': actual_capacities,
    'Difference': [model_capacity - ac for ac in actual_capacities]
})

display(time_df)

print("\n⚠️  PROBLEM: Model assumes constant capacity")
print("   Model doesn't account for rush hour reductions")
print("   Recommendations work at some times but not others")

# Visualize
fig, ax = plt.subplots(figsize=(12, 6))

x = np.arange(len(time_periods))
ax.plot(x, time_df['Model_Assumption'], 'b--', linewidth=2, marker='o', 
       markersize=8, label='Model Assumption (Constant)', alpha=0.7)
ax.plot(x, time_df['Actual_Capacity'], 'r-', linewidth=3, marker='s', 
       markersize=8, label='Actual Capacity (Varies)', alpha=0.7)

# Highlight problem periods
for i, (model_cap, actual_cap) in enumerate(zip(time_df['Model_Assumption'], time_df['Actual_Capacity'])):
    if model_cap > actual_cap:
        ax.fill_between([i-0.2, i+0.2], [model_cap, model_cap], [actual_cap, actual_cap], 
                       color='red', alpha=0.3)

ax.set_xlabel('Time Period', fontsize=11)
ax.set_ylabel('Capacity (trucks/hour)', fontsize=11)
ax.set_title('Time-Dependent Constraints Ignored\nModel Assumes Constant, Reality Varies',
            fontweight='bold', fontsize=12)
ax.set_xticks(x)
ax.set_xticklabels(time_periods, rotation=45, ha='right')
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\nKey Insight:")
print("  • Models that ignore time-dependent constraints are inaccurate")
print("  • Recommendations may work at some times but fail at others")
print("  • Always check if constraints vary over time")
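
The same idea can be expressed as a check on a planned flow. This is a minimal sketch using the capacities from the cell above. The planned flow of 90 trucks per hour is a hypothetical value, and `infeasible_periods` is a hypothetical helper:

```python
time_periods = ['6-9 AM', '9 AM-12 PM', '12-3 PM', '3-6 PM', '6-9 PM']
actual_capacities = [60, 100, 100, 80, 50]  # trucks per hour, from the cell above


def infeasible_periods(planned_flow, periods, capacities):
    """Return the periods in which a constant planned flow exceeds capacity."""
    return [period for period, cap in zip(periods, capacities) if planned_flow > cap]


planned_flow = 90  # hypothetical plan that looks fine against the model's constant 100
problem_periods = infeasible_periods(planned_flow, time_periods, actual_capacities)

# The largest flow that is feasible in every period is the minimum capacity.
always_feasible_flow = min(actual_capacities)

print(f"Planned flow of {planned_flow} fails during: {problem_periods}")
print(f"Largest flow feasible in every period: {always_feasible_flow}")
```

Planning against the minimum is the conservative option. A time-indexed model that schedules more flow in the high-capacity periods would use the network better, at the cost of a more complex model.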


## Step 6: Pitfall 4: Route Availability Ignored

The model assumes all routes are always available, but some routes are closed:


In [None]:
print("PITFALL 4: ROUTE AVAILABILITY IGNORED")
print("=" * 60)

# Model assumes all routes available
G_model_routes = nx.DiGraph()
G_model_routes.add_nodes_from(['W1', 'W2', 'DC1', 'S1', 'S2'])
G_model_routes.add_edges_from([
    ('W1', 'DC1'),                 # Route 1
    ('W2', 'DC1'),                 # Route 2
    ('DC1', 'S1'), ('DC1', 'S2')   # Routes 3, 4
])

# Reality: Route W1 → DC1 is closed (construction)
G_reality_routes = G_model_routes.copy()
G_reality_routes.remove_edge('W1', 'DC1')

print("Model assumes:")
print(f"  Available routes: {list(G_model_routes.edges())}")
print("\nReality:")
print(f"  Available routes: {list(G_reality_routes.edges())}")
print("\n⚠️  PROBLEM: Model recommends using W1 → DC1")
print("   But this route is closed (construction)")
print("   Recommendation cannot be implemented")

# Visualize
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

pos_routes = {
    'W1': (0, 1), 'W2': (0, 0),
    'DC1': (2, 0.5),
    'S1': (4, 1), 'S2': (4, 0)
}

# Model (all routes available)
nx.draw_networkx_nodes(G_model_routes, pos_routes, nodelist=['W1', 'W2', 'DC1'],
                       node_color='lightblue', node_size=2000, node_shape='s', ax=ax1)
nx.draw_networkx_nodes(G_model_routes, pos_routes, nodelist=['S1', 'S2'], 
                       node_color='lightcoral', node_size=2000, node_shape='o', ax=ax1)
nx.draw_networkx_edges(G_model_routes, pos_routes, edge_color='green', 
                      arrows=True, arrowsize=20, width=3, ax=ax1)
nx.draw_networkx_labels(G_model_routes, pos_routes, font_size=10, font_weight='bold', ax=ax1)
ax1.set_title('Model: All Routes Available', fontweight='bold', fontsize=12)
ax1.axis('off')

# Reality (route closed)
nx.draw_networkx_nodes(G_reality_routes, pos_routes, nodelist=['W1', 'W2', 'DC1'],
                       node_color='lightblue', node_size=2000, node_shape='s', ax=ax2)
nx.draw_networkx_nodes(G_reality_routes, pos_routes, nodelist=['S1', 'S2'], 
                       node_color='lightcoral', node_size=2000, node_shape='o', ax=ax2)
nx.draw_networkx_edges(G_reality_routes, pos_routes, edge_color='green', 
                      arrows=True, arrowsize=20, width=3, ax=ax2)
# Draw closed route in red
ax2.plot([pos_routes['W1'][0], pos_routes['DC1'][0]], 
        [pos_routes['W1'][1], pos_routes['DC1'][1]], 
        'r--', linewidth=3, alpha=0.7, label='Closed Route')
ax2.text((pos_routes['W1'][0] + pos_routes['DC1'][0])/2, 
        (pos_routes['W1'][1] + pos_routes['DC1'][1])/2 + 0.2,
        'CLOSED', fontsize=12, fontweight='bold', color='red', 
        bbox=dict(boxstyle='round', facecolor='white', edgecolor='red', linewidth=2))
nx.draw_networkx_labels(G_reality_routes, pos_routes, font_size=10, font_weight='bold', ax=ax2)
ax2.set_title('Reality: Route W1 → DC1 Closed', fontweight='bold', fontsize=12)
ax2.axis('off')
ax2.legend()

plt.tight_layout()
plt.show()

print("\nKey Insight:")
print("  • Models that ignore route availability recommend impossible routes")
print("  • Always check if routes are actually available")
print("  • Weather, construction, and incidents can close routes")
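
A closure can do more than lengthen a route: it can cut a facility off entirely. This is a minimal sketch of a reachability check after removing closed routes, using the same small network as the cell above. `reachable` is a hypothetical helper implementing a plain breadth-first search:

```python
from collections import deque


def reachable(edges, source, target):
    """Return True if target can be reached from source along directed edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)

    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


model_edges = [('W1', 'DC1'), ('W2', 'DC1'), ('DC1', 'S1'), ('DC1', 'S2')]
closed_routes = {('W1', 'DC1')}  # closed for construction, as in the cell above

# Drop closed routes before asking the network any routing question.
open_edges = [edge for edge in model_edges if edge not in closed_routes]

for warehouse in ['W1', 'W2']:
    in_model = reachable(model_edges, warehouse, 'S1')
    in_reality = reachable(open_edges, warehouse, 'S1')
    print(f"{warehouse} → S1  model: {in_model}  reality: {in_reality}")
```

With the notebook's graphs, `nx.has_path(G_reality_routes, 'W1', 'S1')` performs the same check. In this network W1 has only one outbound route, so the closure leaves it unable to serve any store.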


## Step 7: Key Takeaways

**Common modeling pitfalls**:
- Missing nodes or links leave the model incomplete, so it cannot recommend routes through them
- Unrealistic capacity assumptions lead to recommended flows that exceed what routes can carry
- Ignoring time-dependent constraints produces plans that work at some times of day and fail at others
- Ignoring route availability produces recommendations that use closed routes

**Why pitfalls are dangerous**:
- They create gaps between model and reality
- Recommendations based on flawed models cannot be implemented
- Implementing impossible recommendations wastes resources

**Identifying pitfalls prevents failures**:
- Check if model includes all relevant nodes and links
- Verify capacity assumptions match reality
- Check if constraints vary over time
- Verify route availability

**Understanding pitfalls helps**:
- Evaluate model quality before trusting recommendations
- Identify problems before implementation
- Improve models by fixing pitfalls
- Build better network models over time
