# French Rugby Voronoi Analysis: Team Territories

This notebook creates Voronoi diagrams showing the geographic territories of professional French rugby teams in the Top 14 and Pro D2 leagues.

## What is a Voronoi Diagram?

A Voronoi diagram partitions a plane into regions based on proximity to a set of points. In this case, each region represents the area of France closest to a particular rugby team's home stadium.
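
As a minimal sketch of that definition, the snippet below assigns a query location to its nearest generator point using plain Euclidean distance. The three seed coordinates and the `nearest_seed` helper are hypothetical and only illustrate the rule; they are not part of the notebook's pipeline.

```python
import math

# Hypothetical generator points (x, y) in arbitrary planar units
seeds = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (5.0, 8.0)}

def nearest_seed(point, seeds):
    """Return the name of the seed closest to `point` (Euclidean distance)."""
    return min(seeds, key=lambda name: math.dist(point, seeds[name]))

# Every location belongs to the Voronoi cell of its nearest seed
for query in [(1.0, 1.0), (9.0, 2.0), (5.0, 5.0)]:
    print(query, "->", nearest_seed(query, seeds))
```

The Voronoi cell of a seed is simply the set of all query points for which this function returns that seed.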

## Navigation

- **Previous**: [Project Overview](index.md)

## Objectives

1. Load team location data (coordinates of stadiums)
2. Create Voronoi tessellation based on team locations
3. Visualize territories on a map of France
4. Analyze geographic patterns and territorial dominance
5. Generate insights about rugby's geographic distribution in France

## Introduction and Data Setup

We'll begin by importing the necessary libraries for geospatial analysis, visualization, and Voronoi tessellation. This project requires specialized geospatial libraries to handle coordinate systems, geometric operations, and map visualization.

In [None]:
# Geospatial analysis
import geopandas as gpd
from shapely.geometry import Point, Polygon
from shapely.ops import voronoi_diagram
import folium
from folium import plugins

# Data manipulation
import pandas as pd
import numpy as np

# Visualization
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.colors import ListedColormap
import seaborn as sns

# Spatial algorithms
from scipy.spatial import Voronoi, voronoi_plot_2d

# Utilities
import warnings
warnings.filterwarnings('ignore')

# Set visualization style
sns.set_context("notebook", font_scale=1.1)
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['figure.dpi'] = 100

# Set random seed for reproducibility
np.random.seed(42)

print("Libraries imported successfully!")

## Team Locations and Coordinates

To create accurate Voronoi diagrams, we need the geographic coordinates (latitude and longitude) of each professional rugby team's home stadium. These coordinates will serve as the "seeds" or "generators" for the Voronoi tessellation.

The data includes teams from:
- **Top 14**: France's premier professional rugby union league
- **Pro D2**: The second-tier professional league

Each team's location represents where fans travel to in order to attend home matches, making this a natural application of proximity analysis.
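
Because the later steps build points in (longitude, latitude) order, a quick range check can catch swapped or mistyped coordinates before they distort the tessellation. The sketch below shows one possible check; `FRANCE_BOUNDS` mirrors the approximate bounding box used later in this notebook, and `out_of_bounds` is a hypothetical helper.

```python
# Approximate bounding box of metropolitan France in degrees (an assumption)
FRANCE_BOUNDS = {"lat": (42.0, 51.0), "lon": (-5.0, 8.0)}

def out_of_bounds(records, bounds=FRANCE_BOUNDS):
    """Return the names whose latitude or longitude falls outside `bounds`."""
    bad = []
    for name, lat, lon in records:
        lat_ok = bounds["lat"][0] <= lat <= bounds["lat"][1]
        lon_ok = bounds["lon"][0] <= lon <= bounds["lon"][1]
        if not (lat_ok and lon_ok):
            bad.append(name)
    return bad

sample = [
    ("Toulouse", 43.621, 1.415),
    ("Vannes", 47.662, -2.766),
    ("Swapped example", 1.415, 43.621),  # latitude and longitude reversed
]
print(out_of_bounds(sample))
```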

In [None]:
# Team data with actual coordinates from French rugby leagues
# Coordinates are in (Latitude, Longitude) format

# Top 14 teams
top14_teams = {
    'Team': ['Toulouse', 'Toulon', 'La Rochelle', 'Bordeaux', 'Racing 92', 
             'Stade Français', 'Clermont', 'Lyon', 'Castres', 'Montpellier',
             'Bayonne', 'Pau', 'Perpignan', 'Montauban'],
    'City': ['Toulouse', 'Toulon', 'La Rochelle', 'Bordeaux', 'Nanterre',
             'Paris', 'Clermont-Ferrand', 'Lyon', 'Castres', 'Montpellier',
             'Bayonne', 'Pau', 'Perpignan', 'Montauban'],
    'Latitude': [43.621, 43.125, 46.158, 44.829, 48.895,
                 48.844, 45.791, 45.724, 43.601, 43.590,
                 43.484, 43.303, 42.711, 44.017],
    'Longitude': [1.415, 5.934, -1.171, -0.598, 2.229,
                  2.252, 3.106, 4.832, 2.249, 3.861,
                  -1.486, -0.322, 2.894, 1.355],
    'League': ['Top 14'] * 14
}

# Pro D2 teams
prod2_teams = {
    'Team': ['Vannes', 'Brive', 'Biarritz', 'Agen', 'Colomiers', 
             'Mont-de-Marsan', 'Dax', 'Béziers', 'Carcassonne', 'Oyonnax',
             'Grenoble', 'Aurillac', 'Valence Romans', 'Provence', 
             'Nevers', 'Soyaux-Angoulême'],
    'City': ['Vannes', 'Brive-la-Gaillarde', 'Biarritz', 'Agen', 'Colomiers',
             'Mont-de-Marsan', 'Dax', 'Béziers', 'Carcassonne', 'Oyonnax',
             'Grenoble', 'Aurillac', 'Valence', 'Aix-en-Provence',
             'Nevers', 'Angoulême'],
    'Latitude': [47.662, 45.147, 43.476, 44.197, 43.611,
                 43.892, 43.712, 43.344, 43.210, 46.262,
                 45.178, 44.921, 44.931, 43.541,
                 46.993, 45.657],
    'Longitude': [-2.766, 1.526, -1.551, 0.622, 1.341,
                  -0.491, -1.051, 3.237, 2.350, 5.656,
                  5.742, 2.441, 4.901, 5.421,
                  3.146, 0.171],
    'League': ['Pro D2'] * 16
}

# Combine both leagues
teams_data = {
    'Team': top14_teams['Team'] + prod2_teams['Team'],
    'City': top14_teams['City'] + prod2_teams['City'],
    'Latitude': top14_teams['Latitude'] + prod2_teams['Latitude'],
    'Longitude': top14_teams['Longitude'] + prod2_teams['Longitude'],
    'League': top14_teams['League'] + prod2_teams['League']
}

teams_df = pd.DataFrame(teams_data)

# Create GeoDataFrame with Point geometries
geometry = [Point(xy) for xy in zip(teams_df['Longitude'], teams_df['Latitude'])]
teams_gdf = gpd.GeoDataFrame(teams_df, geometry=geometry, crs='EPSG:4326')

print(f"Loaded {len(teams_df)} teams")
print(f"\nLeague distribution:")
print(teams_df['League'].value_counts())
print(f"\nFirst few teams:")
display(teams_df.head())

### Visualizing Team Locations

Let's first plot the team locations on a map to see their geographic distribution before creating the Voronoi diagram.

In [None]:
# Create a base map centered on France
france_map = folium.Map(
    location=[46.5, 2.5],  # Center of France
    zoom_start=6,
    tiles='OpenStreetMap'
)

# Add team markers
for idx, row in teams_df.iterrows():
    color = 'blue' if row['League'] == 'Top 14' else 'green'
    folium.CircleMarker(
        location=[row['Latitude'], row['Longitude']],
        radius=8,
        popup=f"{row['Team']} ({row['League']})",
        color=color,
        fill=True,
        fillColor=color,
        fillOpacity=0.7
    ).add_to(france_map)

# Add legend
legend_html = '''
<div style="position: fixed; bottom: 50px; left: 50px; width: 150px; 
            background-color: white; z-index:9999; border:2px solid grey; 
            border-radius:5px; padding: 10px">
<p><b>Legend</b></p>
<p><span style="color:blue">●</span> Top 14</p>
<p><span style="color:green">●</span> Pro D2</p>
</div>
'''
france_map.get_root().html.add_child(folium.Element(legend_html))

# Display map
france_map

**Observation**: The map reveals a clear clustering of teams in southwestern France, particularly in the Occitanie and Nouvelle-Aquitaine regions. This aligns with rugby's historical stronghold in France, where the sport has deep cultural roots. Northern France, despite having large population centers like Paris, has relatively few professional teams.

## Voronoi Tessellation

Now we'll create the Voronoi diagram. The Voronoi tessellation will partition France into regions where each region contains all points closer to one team's stadium than to any other. This effectively shows each team's "territorial catchment area."
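
Distances measured directly in degrees are distorted because a degree of longitude shrinks toward the poles, which is why a projected coordinate system is preferable for the tessellation. The sketch below illustrates the size of that effect across France's latitude range, assuming a spherical Earth of radius 6371 km; `km_per_degree_lon` is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def km_per_degree_lon(lat_deg):
    """Approximate east-west length of one degree of longitude at a latitude."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))

# One degree of latitude is roughly constant on a sphere
km_per_degree_lat = math.radians(1.0) * EARTH_RADIUS_KM

for lat in (42.0, 46.5, 51.0):
    print(f"lat {lat:4.1f}: 1 deg lon ~ {km_per_degree_lon(lat):.1f} km, "
          f"1 deg lat ~ {km_per_degree_lat:.1f} km")
```

Since a degree of longitude is noticeably shorter than a degree of latitude everywhere in France, a Voronoi diagram built on raw degrees would place cell boundaries in the wrong positions.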

In [None]:
# Convert to a projected coordinate system for accurate distance calculations
# Use EPSG:2154 (RGF93 / Lambert-93) which is suitable for France
teams_gdf_projected = teams_gdf.to_crs('EPSG:2154')

# Extract coordinates for Voronoi calculation
points = np.array([[geom.x, geom.y] for geom in teams_gdf_projected.geometry])

# Create Voronoi diagram
vor = Voronoi(points)

print(f"Voronoi diagram created with {len(vor.points)} generator points")
print(f"Number of Voronoi regions: {len(vor.regions)}")
print(f"Number of Voronoi vertices: {len(vor.vertices)}")

### Creating Bounded Voronoi Regions

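The raw Voronoi diagram extends infinitely, so we need to clip it to France's boundaries to create meaningful territorial regions. For this demonstration we clip to an approximate bounding box of France; in practice, you would load France's administrative boundaries and intersect the Voronoi regions with the country's shape. Note that the function in this section skips unbounded regions, so teams on the outer edge of the point set do not receive a polygon.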
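
One way to keep edge teams, rather than dropping their unbounded cells as the scipy-based function in this section does, is shapely's `voronoi_diagram` with an `envelope`. The sketch below is an alternative under stated assumptions, not the method this notebook uses: the four sites and the square boundary are hypothetical stand-ins for projected stadium points and a clipping geometry.

```python
from shapely.geometry import MultiPoint, Point, box
from shapely.ops import voronoi_diagram

# Hypothetical generator points in planar coordinates (not real stadiums)
sites = {"A": Point(2, 2), "B": Point(8, 2), "C": Point(5, 8), "D": Point(5, 4)}
boundary = box(0, 0, 10, 10)  # stand-in for the clipping boundary

# `envelope` makes shapely close off cells that would otherwise be unbounded
cells = voronoi_diagram(MultiPoint(list(sites.values())), envelope=boundary)

clipped = {}
for cell in cells.geoms:
    # Cell order is not guaranteed to match input order, so match by containment
    owner = next(name for name, pt in sites.items() if cell.contains(pt))
    # Cells may extend past the envelope, so clip explicitly
    clipped[owner] = cell.intersection(boundary)

for name, poly in sorted(clipped.items()):
    print(name, round(poly.area, 2))
```

With this approach every site receives a cell, and the clipped cells together tile the boundary.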

In [None]:
# Function to create bounded Voronoi polygons
def create_bounded_voronoi(vor, boundary, teams_gdf):
    """
    Create Voronoi polygons clipped to a boundary.
    
    Parameters:
    -----------
    vor : scipy.spatial.Voronoi
        Voronoi diagram object
    boundary : shapely.geometry.Polygon
        Boundary to clip Voronoi regions to (same CRS as teams_gdf)
    teams_gdf : geopandas.GeoDataFrame
        Teams with their geometries, in the same order as vor.points
    
    Returns:
    --------
    geopandas.GeoDataFrame with Voronoi regions. Teams whose region is
    unbounded (those on the outer edge of the point set) are omitted.
    """
    voronoi_polygons = []
    team_names = []
    
    # vor.point_region[i] is the index into vor.regions for input point i
    for point_idx, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        
        # Skip unbounded regions (-1 marks a vertex at infinity)
        if -1 in region:
            continue
        
        # Create polygon from Voronoi region vertices
        vertices = vor.vertices[region]
        if len(vertices) < 3:
            continue
        
        try:
            poly = Polygon(vertices)
            # Clip to boundary
            clipped = poly.intersection(boundary)
        except Exception:
            # Skip regions whose geometry cannot be built or clipped
            continue
        
        # Keep only areal results; clipping can yield a MultiPolygon
        if clipped.is_empty or clipped.geom_type not in ('Polygon', 'MultiPolygon'):
            continue
        
        voronoi_polygons.append(clipped)
        team_names.append(teams_gdf.iloc[point_idx]['Team'])
    
    # Create GeoDataFrame in the same CRS as the input teams
    voronoi_gdf = gpd.GeoDataFrame({
        'Team': team_names,
        'geometry': voronoi_polygons
    }, crs=teams_gdf.crs)
    
    # Merge with team information taken from the function argument
    voronoi_gdf = voronoi_gdf.merge(
        teams_gdf[['Team', 'League', 'City']], 
        on='Team', 
        how='left'
    )
    
    return voronoi_gdf

# For demonstration, create a simple France boundary
# In practice, load actual France administrative boundaries
# Using approximate bounding box of France
france_bbox = Polygon([
    (-5.0, 42.0),  # Southwest
    (8.0, 42.0),   # Southeast
    (8.0, 51.0),   # Northeast
    (-5.0, 51.0),  # Northwest
    (-5.0, 42.0)   # Close polygon
])

# Convert to projected CRS
france_bbox_gdf = gpd.GeoDataFrame([1], geometry=[france_bbox], crs='EPSG:4326')
france_bbox_projected = france_bbox_gdf.to_crs('EPSG:2154').geometry.iloc[0]

# Create bounded Voronoi regions
voronoi_gdf = create_bounded_voronoi(vor, france_bbox_projected, teams_gdf_projected)

print(f"Created {len(voronoi_gdf)} bounded Voronoi regions")
display(voronoi_gdf.head())

## Visualization

Now we'll create comprehensive visualizations showing the Voronoi territories. We'll use both static matplotlib plots and interactive Folium maps to provide different perspectives on the geographic distribution.

In [None]:
# Convert back to WGS84 for visualization
voronoi_gdf_wgs84 = voronoi_gdf.to_crs('EPSG:4326')
teams_gdf_wgs84 = teams_gdf_projected.to_crs('EPSG:4326')

# Create color map based on league
league_colors = {'Top 14': '#1f77b4', 'Pro D2': '#2ca02c'}

# Create static plot
fig, ax = plt.subplots(figsize=(14, 10))

# Plot Voronoi regions
for idx, row in voronoi_gdf_wgs84.iterrows():
    color = league_colors.get(row['League'], 'gray')
    gpd.GeoSeries([row.geometry]).plot(
        ax=ax, 
        color=color, 
        alpha=0.4, 
        edgecolor='black', 
        linewidth=1.5
    )

# Plot team locations
for idx, row in teams_gdf_wgs84.iterrows():
    color = league_colors.get(row['League'], 'gray')
    ax.plot(
        row.geometry.x, 
        row.geometry.y, 
        marker='o', 
        markersize=10, 
        color=color, 
        markeredgecolor='black',
        markeredgewidth=1.5
    )

# Add team labels
for idx, row in teams_gdf_wgs84.iterrows():
    ax.annotate(
        row['Team'], 
        (row.geometry.x, row.geometry.y),
        fontsize=8,
        ha='center',
        va='bottom',
        bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.7, edgecolor='none')
    )

ax.set_xlabel('Longitude', fontsize=12)
ax.set_ylabel('Latitude', fontsize=12)
ax.set_title('French Rugby Team Territories: Voronoi Diagram', fontsize=16, fontweight='bold', pad=20)

# Create custom legend
top14_patch = mpatches.Patch(color='#1f77b4', alpha=0.4, label='Top 14 Territory')
prod2_patch = mpatches.Patch(color='#2ca02c', alpha=0.4, label='Pro D2 Territory')
top14_marker = plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#1f77b4', 
                          markersize=10, markeredgecolor='black', markeredgewidth=1.5, label='Top 14 Team')
prod2_marker = plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#2ca02c', 
                          markersize=10, markeredgecolor='black', markeredgewidth=1.5, label='Pro D2 Team')

ax.legend(handles=[top14_patch, prod2_patch, top14_marker, prod2_marker], 
          loc='upper right', fontsize=10)

ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

**Key Observations from the Voronoi Diagram:**

1. **Southwestern Concentration**: The southwest (Toulouse, Castres, Montauban, etc.) is divided among many closely spaced clubs, reflecting rugby's traditional heartland.

2. **Sparse Northern Coverage**: Apart from the two Paris-area teams (Racing 92, Stade Français), the north has few clubs, so territories there are much larger relative to the number of teams.

3. **Coastal Clustering**: Several teams sit on or near the Atlantic coast (La Rochelle, Bordeaux, Bayonne), so their territories are shaped mainly by neighbors on the landward side.

4. **Territorial Size Variation**: Territory size varies widely. Clubs with distant neighbors, such as Clermont, tend to get large cells, while clubs with very close neighbors get small ones: Toulouse and Colomiers, for example, are only about 6 km apart.

### Interactive Map

Let's create an interactive map that allows users to explore the territories in detail.

In [None]:
# Create interactive map
interactive_map = folium.Map(
    location=[46.5, 2.5],
    zoom_start=6,
    tiles='OpenStreetMap'
)

# Add Voronoi regions
for idx, row in voronoi_gdf_wgs84.iterrows():
    color = league_colors.get(row['League'], 'gray')
    
    # Convert geometry to GeoJSON
    folium.GeoJson(
        row.geometry.__geo_interface__,
        style_function=lambda feature, color=color: {
            'fillColor': color,
            'color': 'black',
            'weight': 2,
            'fillOpacity': 0.4,
        },
        tooltip=f"{row['Team']} ({row['League']})<br>City: {row['City']}",
        popup=folium.Popup(
            f"<b>{row['Team']}</b><br>"
            f"League: {row['League']}<br>"
            f"City: {row['City']}",
            max_width=200
        )
    ).add_to(interactive_map)

# Add team markers
for idx, row in teams_gdf_wgs84.iterrows():
    color = league_colors.get(row['League'], 'gray')
    folium.CircleMarker(
        location=[row.geometry.y, row.geometry.x],
        radius=8,
        popup=f"<b>{row['Team']}</b><br>{row['League']}<br>{row['City']}",
        color='black',
        fillColor=color,
        fill=True,
        fillOpacity=0.8,
        weight=2
    ).add_to(interactive_map)

# Add legend
legend_html = '''
<div style="position: fixed; bottom: 50px; left: 50px; width: 180px; 
            background-color: white; z-index:9999; border:2px solid grey; 
            border-radius:5px; padding: 10px; font-size:12px">
<p style="margin:0; font-weight:bold; margin-bottom:5px;">Legend</p>
<p style="margin:2px;"><span style="color:#1f77b4; font-size:16px;">■</span> Top 14 Territory</p>
<p style="margin:2px;"><span style="color:#2ca02c; font-size:16px;">■</span> Pro D2 Territory</p>
<p style="margin:2px;"><span style="color:#1f77b4; font-size:12px;">●</span> Top 14 Team</p>
<p style="margin:2px;"><span style="color:#2ca02c; font-size:12px;">●</span> Pro D2 Team</p>
</div>
'''
interactive_map.get_root().html.add_child(folium.Element(legend_html))

# Display map
interactive_map

## Analysis and Insights

Let's quantify the territorial distribution by calculating the area of each team's Voronoi region. This will help us identify which teams have the largest and smallest territories.
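
The area calculation relies on polygon coordinates being in meters, which is what the Lambert-93 projection provides. As an illustration of what the `.area / 1e6` conversion amounts to, the sketch below computes a polygon's area with the shoelace formula; `polygon_area_km2` and the rectangle are hypothetical and not part of the notebook's code.

```python
def polygon_area_km2(vertices_m):
    """Area of a simple polygon from (x, y) vertices in meters (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices_m)
    for i in range(n):
        x1, y1 = vertices_m[i]
        x2, y2 = vertices_m[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0 / 1e6  # m² -> km²

# Hypothetical 100 km x 50 km rectangular territory in projected meters
rectangle = [(0, 0), (100_000, 0), (100_000, 50_000), (0, 50_000)]
print(polygon_area_km2(rectangle))
```

Applying the same formula to coordinates in degrees would give a number with no physical meaning, which is why the areas are computed before converting back to WGS84.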

In [None]:
# Calculate area of each Voronoi region (in square kilometers)
voronoi_gdf['Area_km2'] = voronoi_gdf.geometry.area / 1e6  # Convert from m² to km²

# Sort by area
territory_analysis = voronoi_gdf[['Team', 'League', 'City', 'Area_km2']].sort_values(
    'Area_km2', 
    ascending=False
)

print("Territory Size Analysis")
print("="*60)
print(f"\nTotal territory covered: {voronoi_gdf['Area_km2'].sum():.0f} km²")
print(f"Average territory size: {voronoi_gdf['Area_km2'].mean():.0f} km²")
print(f"Median territory size: {voronoi_gdf['Area_km2'].median():.0f} km²")

print("\n" + "="*60)
print("Largest Territories:")
print("="*60)
display(territory_analysis.head(5))

print("\n" + "="*60)
print("Smallest Territories:")
print("="*60)
display(territory_analysis.tail(5))

# Visualize territory sizes
fig, ax = plt.subplots(figsize=(12, 6))
colors = [league_colors.get(league, 'gray') for league in territory_analysis['League']]
bars = ax.barh(range(len(territory_analysis)), territory_analysis['Area_km2'], color=colors, alpha=0.7)
ax.set_yticks(range(len(territory_analysis)))
ax.set_yticklabels(territory_analysis['Team'], fontsize=9)
ax.set_xlabel('Territory Area (km²)', fontsize=12)
ax.set_title('Territory Size by Team', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, axis='x')

# Add value labels
for i, (idx, row) in enumerate(territory_analysis.iterrows()):
    ax.text(row['Area_km2'] + 1000, i, f"{row['Area_km2']:.0f}", 
            va='center', fontsize=8)

plt.tight_layout()
plt.show()

**Territorial Analysis Insights:**

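1. **Largest Territories**: Teams with few nearby clubs tend to have the largest territories, because there are fewer neighboring teams to compete for space. Keep in mind that teams whose Voronoi regions are unbounded were skipped when the polygons were built, so clubs on the outer edge of the point set are missing from this ranking.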

2. **Smallest Territories**: Teams in regions with multiple nearby clubs (like the southwest) have smaller, more compact territories. This reflects the competitive rugby landscape in traditional rugby regions.

3. **League Distribution**: Coloring the bars by league makes it possible to compare whether Top 14 or Pro D2 teams tend to hold the larger territories.

### Geographic Patterns

Let's examine the spatial distribution more closely by looking at team density in different regions of France.

In [None]:
# Analyze regional clustering
# Map each home city to its administrative region (metropolitan France)
REGION_CITIES = {
    'Occitanie': ['Toulouse', 'Castres', 'Montpellier', 'Perpignan', 'Montauban',
                  'Colomiers', 'Béziers', 'Carcassonne'],
    'Nouvelle-Aquitaine': ['La Rochelle', 'Bordeaux', 'Bayonne', 'Pau', 'Brive-la-Gaillarde',
                           'Biarritz', 'Agen', 'Mont-de-Marsan', 'Dax', 'Angoulême'],
    'Auvergne-Rhône-Alpes': ['Clermont-Ferrand', 'Lyon', 'Oyonnax', 'Grenoble',
                             'Aurillac', 'Valence'],
    'Île-de-France': ['Paris', 'Nanterre'],
    "Provence-Alpes-Côte d'Azur": ['Toulon', 'Aix-en-Provence'],
    'Bretagne': ['Vannes'],
    'Bourgogne-Franche-Comté': ['Nevers'],
}
CITY_TO_REGION = {city: region for region, cities in REGION_CITIES.items() for city in cities}

def assign_region(city):
    """Assign a city to its French region; unknown cities map to 'Other'."""
    return CITY_TO_REGION.get(city, 'Other')

teams_df['Region'] = teams_df['City'].apply(assign_region)

# Count teams by region
region_counts = teams_df.groupby('Region').agg({
    'Team': 'count',
    'League': lambda x: x.value_counts().to_dict()
}).reset_index()
region_counts.columns = ['Region', 'Team_Count', 'League_Distribution']

print("Team Distribution by Region:")
print("="*60)
for _, row in region_counts.iterrows():
    print(f"\n{row['Region']}: {row['Team_Count']} teams")
    if isinstance(row['League_Distribution'], dict):
        for league, count in row['League_Distribution'].items():
            print(f"  - {league}: {count}")

# Visualize regional distribution
fig, ax = plt.subplots(figsize=(10, 6))
region_counts_sorted = region_counts.sort_values('Team_Count', ascending=True)
bars = ax.barh(region_counts_sorted['Region'], region_counts_sorted['Team_Count'], 
               color='#1f77b4', alpha=0.7)
ax.set_xlabel('Number of Teams', fontsize=12)
ax.set_title('Professional Rugby Teams by Region', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, axis='x')

# Add value labels
for i, (idx, row) in enumerate(region_counts_sorted.iterrows()):
    ax.text(row['Team_Count'] + 0.2, i, f"{int(row['Team_Count'])}", 
            va='center', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()

**Regional Analysis Insights:**

1. **Occitanie and Nouvelle-Aquitaine Domination**: These two regions in southwestern France are home to 18 of the 30 teams in the dataset (8 and 10 respectively), confirming rugby's traditional stronghold.

2. **Paris Region Underrepresentation**: Despite being France's most populous region, Île-de-France has relatively few teams, leading to large territorial coverage for those teams.

3. **Geographic Clustering**: The analysis reveals clear geographic clustering, with teams concentrated in specific regions rather than evenly distributed across the country.

## Conclusions

This Voronoi analysis of French professional rugby teams reveals several important geographic patterns:

### Key Findings

1. **Territorial Concentration**: Rugby teams are heavily concentrated in southwestern France, particularly in Occitanie and Nouvelle-Aquitaine regions. This reflects the sport's historical and cultural roots in these areas.

2. **Sparse Northern Coverage**: Northern France, despite having large population centers, has relatively few professional rugby teams. This creates large territorial regions for teams like Racing 92 and Stade Français.

3. **Territorial Size Variation**: There is significant variation in territory sizes, with some teams controlling vast regions while others have compact territories in regions where clubs are closely spaced.

4. **Geographic Clustering**: The Voronoi diagram clearly shows clustering of teams in traditional rugby heartlands, with clear boundaries between team territories.

### Implications

- **Fan Base Distribution**: Teams in the southwest may have more concentrated, local fan bases, while northern teams may need to draw from larger geographic areas.

- **Competition Intensity**: Regions with many teams (like the southwest) have closely spaced territorial boundaries, potentially reflecting stronger local rugby cultures.

- **Growth Opportunities**: The large territories in northern France suggest potential for expansion or new team locations in underserved regions.

### Technical Achievements

This project demonstrates:
- **Geospatial Analysis**: Application of computational geometry (Voronoi tessellation) to real-world geographic data
- **Data Visualization**: Creation of both static and interactive maps for different use cases
- **Sports Analytics**: Use of spatial analysis to understand sports team distribution and territorial dynamics

### Future Enhancements

Potential extensions of this analysis could include:
- Overlaying population density to understand territory-to-population ratios
- Historical analysis of how team geography has changed over time
- Comparison with other sports (football, basketball) to see if rugby's distribution is unique
- Statistical analysis of nearest neighbor distances and territorial compactness
- Integration with match attendance data to validate territorial assumptions
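
As a starting point for the nearest-neighbor idea in the list above, the sketch below computes great-circle distances with the haversine formula, assuming a spherical Earth of radius 6371 km. The coordinates are copied from the team table earlier in this notebook; `haversine_km` and `nearest_neighbor` are hypothetical helpers.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# A few stadium coordinates (latitude, longitude) from the team table
stadiums = {
    "Toulouse": (43.621, 1.415),
    "Colomiers": (43.611, 1.341),
    "Castres": (43.601, 2.249),
    "Vannes": (47.662, -2.766),
}

def nearest_neighbor(name, stadiums):
    """Return (other_team, distance_km) for the closest other stadium."""
    lat, lon = stadiums[name]
    return min(
        ((other, haversine_km(lat, lon, *coords))
         for other, coords in stadiums.items() if other != name),
        key=lambda pair: pair[1],
    )

for team in stadiums:
    other, dist = nearest_neighbor(team, stadiums)
    print(f"{team}: nearest is {other} ({dist:.0f} km)")
```

Running the same calculation over all 30 teams would give a distribution of nearest-neighbor distances that could be compared between leagues or regions.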

---

**This analysis showcases the intersection of computational geometry, geospatial analysis, and sports analytics, demonstrating how mathematical concepts can provide insights into cultural and geographic patterns.**