# S2 Geospatial Indexing for Tree Risk Pro

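This notebook demonstrates the S2 geospatial indexing system implemented for the Tree Risk Pro Dashboard. S2 is Google's spherical geometry library; its indexing scheme divides the Earth's surface into a hierarchy of cells, from six top-level face cells at level 0 down to leaf cells of roughly one square centimeter at level 30. Each cell has a 64-bit ID, and a cell at one level contains exactly four cells at the next finer level.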
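
To make the hierarchy concrete: an S2 cell ID stores 3 face bits, then two bits per level of subdivision, then a single trailing 1 bit whose position encodes the level. The sketch below uses plain integer arithmetic rather than the project's `S2IndexManager`, and the sample leaf ID is made up for illustration (it is not the cell for any location used in this notebook).

```python
MAX_LEVEL = 30  # S2 leaf cells are at level 30


def cell_level(cell_id: int) -> int:
    """Return the level encoded by the position of the lowest set bit."""
    lowest_bit = cell_id & -cell_id
    trailing_zeros = lowest_bit.bit_length() - 1
    return MAX_LEVEL - trailing_zeros // 2


def cell_parent(cell_id: int, level: int) -> int:
    """Return the ancestor of cell_id at the given (coarser or equal) level."""
    new_lowest_bit = 1 << (2 * (MAX_LEVEL - level))
    # Clear all bits below the new marker bit, then set the marker bit.
    return (cell_id & -new_lowest_bit) | new_lowest_bit


# Hypothetical leaf cell: face 4, an arbitrary 60-bit position, trailing 1 bit.
leaf = (4 << 61) | (0x123456789ABCDEF << 1) | 1

for lvl in (30, 20, 13):
    ancestor = cell_parent(leaf, lvl)
    print(f"level {cell_level(ancestor):2d}: {ancestor:#018x}")
```

Because a parent ID is obtained by masking the child ID, two points share a cell at some level exactly when their leaf IDs share the same ancestor at that level, which is what makes grouping by cell ID cheap.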

In [None]:
import sys
import os
import json
import asyncio
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display

# Add the backend directory to the import path.
# Notebooks do not define __file__, so resolve the path from the current working directory.
sys.path.append(os.path.join(os.getcwd(), 'backend'))

# Import the S2 index manager
from services.detection_service import S2IndexManager

## Initialize the S2 Index Manager

First, we'll initialize the S2IndexManager, which handles all S2 operations for the Tree Risk Pro system.

In [None]:
# Create instance of S2IndexManager
s2_manager = S2IndexManager()

# Display the available cell levels
print("S2 Cell Levels:")
for level_name, level in s2_manager.cell_levels.items():
    print(f"  {level_name.capitalize()}: Level {level}")
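
The level numbers printed above correspond to physical cell sizes. The sphere is split into 6 face cells and every level divides a cell into 4, so the average cell area at a level is the Earth's surface area divided by 6·4^level. The sketch below assumes a spherical Earth with a mean radius of 6371 km, and the levels it lists are illustrative; they may not match the values configured in `s2_manager.cell_levels`.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def average_cell_area_km2(s2_level: int) -> float:
    """Average S2 cell area at a level: sphere area / (6 faces * 4^level)."""
    earth_area = 4 * math.pi * EARTH_RADIUS_KM ** 2
    return earth_area / (6 * 4 ** s2_level)


for lvl in (10, 13, 15, 18):
    area_m2 = average_cell_area_km2(lvl) * 1e6
    print(f"Level {lvl:2d}: ~{area_m2:,.0f} m^2 on average")
```

Individual cells deviate from this average depending on where they fall on a cube face, so treat the figures as a guide for choosing levels rather than exact sizes.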

## Generate S2 Cell IDs for a Location

Now, let's generate S2 cell IDs for a specific location at different levels.

In [None]:
# Sample location (Downtown Dallas, TX)
lat, lng = 32.7767, -96.7970

# Generate cell IDs at different levels
print(f"Cell IDs for location: [{lat}, {lng}]")
for level_name, level in s2_manager.cell_levels.items():
    cell_id = s2_manager.get_cell_id(lat, lng, level_name)
    print(f"  {level_name.capitalize()}: {cell_id}")

# Get all cell IDs for this location at once
all_cells = s2_manager.get_cell_ids_for_tree(lat, lng)
print("\nAll cell IDs for the location:")
print(json.dumps(all_cells, indent=2))

## Find Neighboring Cells

Let's find neighboring cells around our location.

In [None]:
level = 'block'  # Use block level for this example
neighbors = s2_manager.get_neighbors(lat, lng, level, 8)

print(f"Found {len(neighbors)} neighboring cells at {level} level:")
for i, neighbor_id in enumerate(neighbors):
    print(f"  Neighbor {i+1}: {neighbor_id}")

## Get Cells for a Geographic Bounding Box

Now let's find all the S2 cells that cover a specific geographic area.

In [None]:
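# Define a bounding box around downtown Dallas (0.01 degrees in each direction).
# Corners are [lng, lat] pairs, the reverse of the (lat, lng) argument order used above.
bounds = [
    [lng - 0.01, lat - 0.01],  # SW corner
    [lng + 0.01, lat + 0.01]   # NE corner
]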

print(f"Bounds: SW {bounds[0]}, NE {bounds[1]}")

# Get cells at different levels
for level_name in s2_manager.cell_levels.keys():
    cells = s2_manager.get_cells_for_bounds(bounds, level_name)
    print(f"\n{level_name.capitalize()} level: {len(cells)} cells")
    if len(cells) <= 10:
        print(f"  Cell IDs: {cells}")
    else:
        print(f"  First 5 cell IDs: {cells[:5]}")
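
A quick way to sanity-check the cell counts printed above is to compare the area of the box with the average cell area at each level. The sketch below is an order-of-magnitude estimate only: it assumes a spherical Earth, does not handle boxes that cross the antimeridian, and makes no assumption about how `get_cells_for_bounds` builds its covering (a covering usually also includes cells that only partly overlap the box, so actual counts tend to be higher). The levels shown are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def bbox_area_km2(box):
    """Area of a lat/lng rectangle given as [[west, south], [east, north]] in degrees."""
    (west, south), (east, north) = box
    d_lng = math.radians(east - west)
    band = math.sin(math.radians(north)) - math.sin(math.radians(south))
    return EARTH_RADIUS_KM ** 2 * d_lng * band


def estimated_cell_count(box, s2_level):
    """Box area divided by the average cell area at the given level."""
    avg_cell_area = 4 * math.pi * EARTH_RADIUS_KM ** 2 / (6 * 4 ** s2_level)
    return bbox_area_km2(box) / avg_cell_area


# Same box as in the cell above, rebuilt here so the sketch is self-contained.
demo_lat, demo_lng = 32.7767, -96.7970
demo_box = [[demo_lng - 0.01, demo_lat - 0.01], [demo_lng + 0.01, demo_lat + 0.01]]

print(f"Box area: {bbox_area_km2(demo_box):.2f} km^2")
for lvl in (13, 15, 17):
    print(f"Level {lvl}: ~{estimated_cell_count(demo_box, lvl):.1f} cells")
```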

## Demonstration with Simulated Tree Data

Let's create some simulated tree data and show how it would be indexed with S2 cells.

In [None]:
# Create simulated tree data
np.random.seed(42)  # For reproducibility
num_trees = 50

# Generate random tree locations within our bounds
tree_lats = np.random.uniform(bounds[0][1], bounds[1][1], num_trees)
tree_lngs = np.random.uniform(bounds[0][0], bounds[1][0], num_trees)

# Create tree objects with S2 cells
risk_choices = ['low', 'medium', 'high']  # defined once, outside the loop
tree_data = []
for i in range(num_trees):
    # Get S2 cell IDs for this tree
    s2_cells = s2_manager.get_cell_ids_for_tree(tree_lats[i], tree_lngs[i])
    
    # Assign a random risk level
    risk_level = risk_choices[np.random.randint(0, len(risk_choices))]
    
    # Create tree object
    tree = {
        'id': f'tree_{i}',
        'location': [tree_lngs[i], tree_lats[i]],
        'risk_level': risk_level,
        's2_cells': s2_cells
    }
    tree_data.append(tree)

print(f"Created {len(tree_data)} simulated trees with S2 cell indexing")
print("Example tree data:")
print(json.dumps(tree_data[0], indent=2))

## Group Trees by S2 Cell

Now let's group the trees by S2 cell to demonstrate how this is useful for visualization and analysis.

In [None]:
# Group trees by block level S2 cell
grouped_trees = {}
level = 'block'

for tree in tree_data:
    # Get the cell ID for this tree at the block level
    cell_id = tree['s2_cells'][level]
    
    # Add to the appropriate group
    if cell_id not in grouped_trees:
        grouped_trees[cell_id] = []
        
    grouped_trees[cell_id].append(tree)

print(f"Grouped {len(tree_data)} trees into {len(grouped_trees)} S2 cells at {level} level")

# Display the number of trees in each cell
print("\nTrees per cell:")
for cell_id, trees in grouped_trees.items():
    print(f"  Cell {cell_id}: {len(trees)} trees")

## Calculate Statistics for Each Cell

Now let's calculate statistics for each cell, which would be useful for visualizations and risk analysis.

In [None]:
# Calculate statistics for each cell
cell_stats = {}

# Risk level mapping
risk_levels = {
    'low': 1,
    'medium': 2,
    'high': 3
}

for cell_id, trees in grouped_trees.items():
    # Calculate basic statistics
    tree_count = len(trees)
    
    # Calculate average position
    avg_lat = sum(tree['location'][1] for tree in trees) / tree_count
    avg_lng = sum(tree['location'][0] for tree in trees) / tree_count
    
    # Calculate average risk
    risk_values = [risk_levels[tree['risk_level']] for tree in trees]
    avg_risk = sum(risk_values) / tree_count
    
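    # Label the cell by bucketing its average risk (thresholds at 1.5 and 2.5).
    # Note: this reflects the mean, not necessarily the most common risk level.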
    if avg_risk < 1.5:
        dominant_risk = 'low'
    elif avg_risk < 2.5:
        dominant_risk = 'medium'
    else:
        dominant_risk = 'high'
    
    # Store the statistics
    cell_stats[cell_id] = {
        'tree_count': tree_count,
        'center': [avg_lng, avg_lat],
        'avg_risk_value': avg_risk,
        'dominant_risk': dominant_risk,
        'trees': [tree['id'] for tree in trees]
    }

print(f"Calculated statistics for {len(cell_stats)} S2 cells")
print("\nExample cell statistics:")
first_cell_id = next(iter(cell_stats))
print(json.dumps(cell_stats[first_cell_id], indent=2))
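
The `dominant_risk` label above is derived from the average risk value, which can differ from the most frequent risk level in a cell. The sketch below contrasts the two on a small made-up sample. The helper names are hypothetical, and breaking ties toward the higher risk is an assumption made here, not something the notebook specifies.

```python
from collections import Counter

RISK_VALUES = {'low': 1, 'medium': 2, 'high': 3}


def mean_risk_label(levels):
    """Bucket the average risk value, mirroring the thresholds used above."""
    avg = sum(RISK_VALUES[r] for r in levels) / len(levels)
    if avg < 1.5:
        return 'low'
    if avg < 2.5:
        return 'medium'
    return 'high'


def most_common_risk(levels):
    """Return the most frequent level; ties go to the higher risk (assumption)."""
    counts = Counter(levels)
    return max(counts, key=lambda r: (counts[r], RISK_VALUES[r]))


sample = ['low', 'low', 'high', 'high', 'high']
print("mean-based label:", mean_risk_label(sample))
print("most common level:", most_common_risk(sample))
```

In this sample no tree is rated medium, yet the mean-based label is medium. Which definition is appropriate depends on whether the dashboard should highlight typical risk or the presence of high-risk trees.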

## Visualize Trees and S2 Cells

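Let's create a simple visualization of our trees and S2 cells. Circles are individual trees, colored by risk level. Squares mark the mean position of the trees in each cell (not the geometric center of the S2 cell), scaled by tree count and colored by the cell's average risk.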

In [None]:
plt.figure(figsize=(10, 10))

# Plot all trees
# (tree_lng/tree_lat are used so the sample `lat`, `lng` defined earlier are not overwritten)
for tree in tree_data:
    tree_lng, tree_lat = tree['location']
    risk = tree['risk_level']
    
    # Choose color based on risk
    if risk == 'low':
        color = 'green'
    elif risk == 'medium':
        color = 'orange'
    else:  # high
        color = 'red'
        
    plt.plot(tree_lng, tree_lat, 'o', color=color, markersize=6, alpha=0.7)

# Plot cell centers with size based on tree count
for cell_id, stats in cell_stats.items():
    center_lng, center_lat = stats['center']
    tree_count = stats['tree_count']
    
    # Choose color based on dominant risk
    risk = stats['dominant_risk']
    if risk == 'low':
        color = 'green'
    elif risk == 'medium':
        color = 'orange'
    else:  # high
        color = 'red'
    
    # Plot cell center with size based on tree count
    plt.plot(center_lng, center_lat, 's', color=color, 
             markersize=10 + tree_count, alpha=0.5)

# Add labels and title
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title(f'Trees and S2 Cells ({level.capitalize()} Level)')

# Add legend
from matplotlib.lines import Line2D
legend_elements = [
    Line2D([0], [0], marker='o', color='w', markerfacecolor='green', markersize=10, label='Low Risk Tree'),
    Line2D([0], [0], marker='o', color='w', markerfacecolor='orange', markersize=10, label='Medium Risk Tree'),
    Line2D([0], [0], marker='o', color='w', markerfacecolor='red', markersize=10, label='High Risk Tree'),
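    # Cell markers are drawn in the risk colors above, so use a neutral swatch here
    Line2D([0], [0], marker='s', color='w', markerfacecolor='gray', markersize=10, alpha=0.5, label='S2 Cell Center (colored by average risk)')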
]
plt.legend(handles=legend_elements, loc='upper right')

plt.grid(True)
plt.tight_layout()
plt.show()

## Conclusion

In this notebook, we've demonstrated the S2 geospatial indexing system implemented for the Tree Risk Pro Dashboard. This system enables efficient spatial organization and querying of tree data, which is essential for both visualization and analysis.

Key benefits of S2 indexing for the Tree Risk Pro system:

1. Efficient spatial queries (find trees in a specific area)
2. Hierarchical grouping at different zoom levels
3. Fast neighbor lookups (find trees near a specific location)
4. Statistical analysis by geographic area
5. Clustering for visualization
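
Benefits 1 and 3 both reduce to dictionary lookups once trees are keyed by cell ID. The sketch below builds such an index and queries it with a list of cell IDs, such as the output of a bounding-box covering or a neighbor lookup. The function names and the sample records are hypothetical (real cell IDs come from `S2IndexManager`), and this is not necessarily how the dashboard backend stores its index.

```python
from collections import defaultdict


def build_cell_index(trees, level):
    """Map each cell ID at `level` to the IDs of the trees inside it."""
    index = defaultdict(list)
    for tree in trees:
        index[tree['s2_cells'][level]].append(tree['id'])
    return index


def trees_in_cells(index, cell_ids):
    """Collect tree IDs for a set of cells; unknown cells contribute nothing."""
    return [tree_id for cell_id in cell_ids for tree_id in index.get(cell_id, [])]


# Made-up records with placeholder cell IDs.
sample_trees = [
    {'id': 'tree_0', 's2_cells': {'block': 'cell_a'}},
    {'id': 'tree_1', 's2_cells': {'block': 'cell_a'}},
    {'id': 'tree_2', 's2_cells': {'block': 'cell_b'}},
    {'id': 'tree_3', 's2_cells': {'block': 'cell_c'}},
]

block_index = build_cell_index(sample_trees, 'block')
print(trees_in_cells(block_index, ['cell_a', 'cell_c', 'cell_z']))
```

Because covering cells can extend past the edge of the query area, a lookup like this usually returns candidates that still need an exact point-in-area check.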

The S2 indexing system integrates with YOLO/DeepForest detection and SAM segmentation to provide a comprehensive geospatial solution for tree risk assessment.