# 🔗 Spatial Joins & Analysis - Connecting Data Through Location

**GIST 604B - Python GeoPandas Analysis**  
**Notebook 3: Mastering Spatial Relationships and Analysis**

---

## 🎯 Learning Objectives

By the end of this notebook, you will be able to:
- Perform spatial joins to find relationships between datasets
- Apply different spatial predicates (intersects, within, contains) when intersecting datasets
- Aggregate spatial data by geographic groups and boundaries
- Apply complex multi-criteria filters combining spatial and attribute conditions
- Build complete spatial analysis workflows using multiple operations
- Implement the three spatial analysis functions: `execute_spatial_intersection()`, `aggregate_spatial_data()`, and `filter_by_spatial_criteria()`

## 🌍 Why Spatial Relationships Matter

The power of GIS comes from understanding **where things are in relation to other things**:
- **Which cities are in flood zones?** - Point-in-polygon analysis
- **How many people live near hospitals?** - Proximity and aggregation
- **What roads cross protected areas?** - Line-polygon intersection
- **Which counties have high pollution AND low income?** - Multi-criteria analysis

**Spatial relationships reveal patterns that attributes alone cannot show!**

In [None]:
# Import necessary libraries for spatial analysis
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point, Polygon, LineString
from shapely.ops import unary_union
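import warnings
# Silence all warnings to keep notebook output tidy; note this also hides CRS and deprecation warnings
warnings.filterwarnings('ignore')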

print("🔗 Spatial analysis toolkit loaded!")
print(f"🐼 GeoPandas version: {gpd.__version__}")
print("✅ Ready for spatial relationships analysis!")

## 🎯 Function 4: Spatial Intersections and Joins

Finding features that have spatial relationships with other features...

In [None]:
# TODO: Create sample datasets for spatial join demonstrations
# - Cities (points) and watersheds (polygons)
# - Roads (lines) and administrative boundaries (polygons)
# - Monitoring stations and environmental zones
# - Ensure different CRS scenarios for testing
pass
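
One possible starting point for the sample data is sketched below. The layer names, attribute values, and coordinates are all made up for illustration; the only goal is a small point layer, a small polygon layer, and a reprojected copy that gives a deliberate CRS mismatch.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical point layer: three "cities" in geographic coordinates (EPSG:4326)
cities = gpd.GeoDataFrame(
    {"city": ["Alpha", "Beta", "Gamma"], "population": [50_000, 120_000, 8_000]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(3.0, 3.0)],
    crs="EPSG:4326",
)

# Hypothetical polygon layer: two adjacent square "watersheds"
watersheds = gpd.GeoDataFrame(
    {"watershed": ["West", "East"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# A reprojected copy gives a deliberate CRS mismatch to test against
watersheds_mercator = watersheds.to_crs("EPSG:3857")

print("Same CRS:", cities.crs == watersheds.crs)
print("Mismatched CRS:", cities.crs != watersheds_mercator.crs)
```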

### Understanding Spatial Predicates

The different ways features can be spatially related...

In [None]:
# TODO: Explain and visualize spatial predicates
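# - Intersects: Geometries share at least one point (overlap or touch)
# - Within: Lies entirely inside another feature (a point on the boundary does not count)
# - Contains: The inverse of within; entirely encloses another feature
# - Crosses: Passes through another feature, e.g. a line crossing a polygon boundary
# - Touches: Boundaries meet but interiors do not overlap
# - Overlaps: Partial overlap between geometries of the same dimension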
pass
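
The predicates listed above can be checked directly on Shapely geometries. The shapes below are invented for illustration; this is a minimal sketch of how each predicate behaves, including the boundary case where a point touches a polygon but is not within it.

```python
from shapely.geometry import LineString, Point, box

a = box(0, 0, 2, 2)                   # reference polygon
inner = Point(1, 1)                   # strictly inside a
edge = Point(2, 1)                    # exactly on a's boundary
neighbor = box(2, 0, 4, 2)            # shares an edge with a
partial = box(1, 1, 3, 3)             # covers part of a
road = LineString([(-1, 1), (3, 1)])  # passes straight through a

results = {
    "inner within a": inner.within(a),
    "a contains inner": a.contains(inner),
    "edge within a": edge.within(a),
    "edge intersects a": edge.intersects(a),
    "edge touches a": edge.touches(a),
    "neighbor touches a": neighbor.touches(a),
    "neighbor overlaps a": neighbor.overlaps(a),
    "partial overlaps a": partial.overlaps(a),
    "road crosses a": road.crosses(a),
}

for label, value in results.items():
    print(f"{label}: {value}")
```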

### Point-in-Polygon Analysis

One of the most common spatial analyses: finding which polygon contains each point...

In [None]:
# TODO: Demonstrate point-in-polygon analysis
# - Cities within states/counties
# - Monitoring stations within watersheds
# - GPS points within administrative boundaries
# - Handle edge cases (points on boundaries)
pass
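
A point-in-polygon join could look like the sketch below, which uses `gpd.sjoin` on invented stations and watersheds. Station `S3` is placed on the shared boundary on purpose: with `predicate="within"` it matches neither polygon, while `predicate="intersects"` matches both and duplicates the row.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical layers; S3 sits exactly on the shared boundary, S4 is outside both
stations = gpd.GeoDataFrame(
    {"station": ["S1", "S2", "S3", "S4"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(1.0, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)
watersheds = gpd.GeoDataFrame(
    {"watershed": ["West", "East"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# how="left" keeps every station; unmatched stations get NaN for watershed
strict = gpd.sjoin(stations, watersheds, how="left", predicate="within")
loose = gpd.sjoin(stations, watersheds, how="left", predicate="intersects")

print(strict[["station", "watershed"]])
print(loose[["station", "watershed"]])
```

Which predicate is appropriate depends on how boundary points should be counted, so it is worth checking the row count after every join.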

### Polygon Intersection Analysis

Finding overlapping areas between polygon datasets...

In [None]:
# TODO: Demonstrate polygon intersection analysis
# - Land use categories overlapping with flood zones
# - Protected areas intersecting with development zones
# - Calculate intersection areas and percentages
# - Visualize overlapping regions
pass
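
Overlap areas and percentages can be sketched with `gpd.overlay`. The layers below are hypothetical squares, and EPSG:32612 is only an assumed projected CRS so that `.area` returns square metres.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical land-use parcels and one flood zone in a projected CRS (metres)
land_use = gpd.GeoDataFrame(
    {"land_use": ["residential", "commercial"]},
    geometry=[box(0, 0, 100, 100), box(100, 0, 200, 100)],
    crs="EPSG:32612",
)
flood_zones = gpd.GeoDataFrame(
    {"zone": ["floodplain"]},
    geometry=[box(50, 0, 150, 50)],
    crs="EPSG:32612",
)

# Store the original area before the overlay cuts the polygons
land_use["parcel_area"] = land_use.geometry.area

# Each output row is the piece of one parcel that lies inside one flood zone
overlap = gpd.overlay(land_use, flood_zones, how="intersection")
overlap["overlap_area"] = overlap.geometry.area
overlap["pct_flooded"] = 100 * overlap["overlap_area"] / overlap["parcel_area"]

print(overlap[["land_use", "zone", "overlap_area", "pct_flooded"]])
```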

### Line-Polygon Relationships

Analyzing how linear features interact with areal features...

In [None]:
# TODO: Demonstrate line-polygon relationships
# - Roads crossing administrative boundaries
# - Rivers flowing through different land use areas
# - Utility lines intersecting environmental zones
# - Calculate crossing lengths and intersection points
pass
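
For crossing lengths and intersection points, one approach is to intersect the line with each polygon and with the polygon boundary. The road and districts below are invented, and the CRS is again an assumed projected one so lengths come out in metres.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical road running west to east through two districts
main_st = LineString([(-50, 50), (250, 50)])
districts = gpd.GeoDataFrame(
    {"district": ["A", "B"]},
    geometry=[box(0, 0, 100, 100), box(100, 0, 200, 100)],
    crs="EPSG:32612",
)

# Length of the road inside each district: clip the line with each polygon
districts["road_length_m"] = districts.geometry.intersection(main_st).length

# Points where the road enters and leaves district A: line vs. polygon boundary
district_a = districts.geometry.iloc[0]
crossings = main_st.intersection(district_a.boundary)

print(districts[["district", "road_length_m"]])
print(crossings.geom_type, len(crossings.geoms))
```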

### Handling CRS Compatibility in Spatial Joins

Ensuring accurate spatial relationships across different coordinate systems...

In [None]:
# TODO: Demonstrate CRS handling in spatial joins
# - Detecting CRS mismatches
# - Automatic reprojection strategies
# - Performance implications of reprojection
# - Validation of join results
pass
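
A minimal sketch of detecting and resolving a CRS mismatch before a join follows. `align_crs` is a hypothetical helper name, not part of GeoPandas, and the coordinates are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, box

points = gpd.GeoDataFrame(
    {"id": [1, 2]},
    geometry=[Point(-110.9, 32.2), Point(-112.0, 33.4)],
    crs="EPSG:4326",
)
# The zone layer is stored in Web Mercator, so it does not match the points
zones = gpd.GeoDataFrame(
    {"zone": ["south"]},
    geometry=[box(-111.5, 31.5, -110.5, 32.5)],
    crs="EPSG:4326",
).to_crs("EPSG:3857")


def align_crs(left, right):
    """Return `right` reprojected to the CRS of `left` (hypothetical helper)."""
    if left.crs is None or right.crs is None:
        raise ValueError("Both layers need a defined CRS before joining")
    if left.crs != right.crs:
        right = right.to_crs(left.crs)
    return right


zones_aligned = align_crs(points, zones)
joined = gpd.sjoin(points, zones_aligned, how="inner", predicate="within")
print(joined[["id", "zone"]])
```

Reprojecting the smaller layer is usually the cheaper choice, although for distance or area work both layers may need to move to a projected CRS instead.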

### Implementing execute_spatial_intersection()

Building a robust spatial intersection function...

In [None]:
# TODO: Step-by-step implementation guide for execute_spatial_intersection()
# Handle all spatial predicates and edge cases
pass
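
The sketch below shows one way `execute_spatial_intersection()` might be structured. The signature, parameter names, and the set of accepted predicates are assumptions for illustration; the assignment's own specification and tests take precedence.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Assumed set of supported predicates, taken from the list discussed above
VALID_PREDICATES = {"intersects", "within", "contains", "crosses", "touches", "overlaps"}


def execute_spatial_intersection(left, right, predicate="intersects", how="inner"):
    """Spatially join two layers after validating inputs (illustrative signature)."""
    if predicate not in VALID_PREDICATES:
        raise ValueError(f"Unsupported predicate: {predicate!r}")
    if left.crs is None or right.crs is None:
        raise ValueError("Both layers must have a CRS")
    if left.crs != right.crs:
        # Reproject the right layer so both share the left layer's CRS
        right = right.to_crs(left.crs)
    return gpd.sjoin(left, right, how=how, predicate=predicate)


# Hypothetical inputs with mismatched CRS
cities = gpd.GeoDataFrame(
    {"city": ["Inside", "Outside"]},
    geometry=[Point(0.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)
region = gpd.GeoDataFrame(
    {"region": ["R1"]}, geometry=[box(0, 0, 1, 1)], crs="EPSG:4326"
).to_crs("EPSG:3857")

result = execute_spatial_intersection(cities, region, predicate="within")
print(result[["city", "region"]])
```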

## 📊 Function 5: Spatial Data Aggregation

Summarizing data by geographic groups and boundaries...

### Attribute-Based Aggregation

Grouping features by shared attribute values...

In [None]:
# TODO: Demonstrate attribute-based spatial aggregation
# - Group cities by state, calculate total population
# - Group land parcels by zoning type, calculate total area
# - Handle different aggregation functions (sum, mean, count, etc.)
# - Aggregate geometries (union, dissolve)
pass
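
Attribute-based aggregation with merged geometries can be sketched with `dissolve`. The parcels and the `units` column below are invented for illustration.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical parcels in a projected CRS (metres)
parcels = gpd.GeoDataFrame(
    {"zoning": ["residential", "residential", "commercial"], "units": [10, 20, 5]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10), box(30, 0, 40, 10)],
    crs="EPSG:32612",
)

# dissolve unions the geometries per group and aggregates the listed columns
by_zone = parcels.dissolve(by="zoning", aggfunc={"units": "sum"})
by_zone["total_area"] = by_zone.geometry.area

# A plain pandas groupby is enough when merged geometries are not needed
counts = parcels.groupby("zoning").size()

print(by_zone[["units", "total_area"]])
print(counts)
```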

### Spatial Aggregation by Geographic Boundaries

Summarizing point data within polygon boundaries...

In [None]:
# TODO: Demonstrate spatial aggregation
# - Aggregate monitoring station data by watersheds
# - Summarize crime incidents by police districts
# - Calculate statistics for points within polygons
# - Handle empty aggregation groups
pass
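
One way to summarize points by polygon is a spatial join followed by a `groupby`, then a left merge back onto the polygons so that districts with no points are kept. The incident and district data below are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Point, box

incidents = gpd.GeoDataFrame(
    {"severity": [1, 3, 2]},
    geometry=[Point(0.2, 0.2), Point(0.8, 0.8), Point(1.5, 0.5)],
    crs="EPSG:4326",
)
districts = gpd.GeoDataFrame(
    {"district": ["North", "South", "Empty"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)],
    crs="EPSG:4326",
)

# Step 1: tag each incident with the district that contains it
joined = gpd.sjoin(incidents, districts, how="inner", predicate="within")

# Step 2: summarize per district
stats = joined.groupby("district").agg(
    incident_count=("severity", "size"),
    mean_severity=("severity", "mean"),
)

# Step 3: merge back so districts without incidents are not dropped
summary = districts.merge(stats, left_on="district", right_index=True, how="left")
summary["incident_count"] = summary["incident_count"].fillna(0).astype(int)

print(summary[["district", "incident_count", "mean_severity"]])
```

The mean for an empty district stays `NaN` here, which is arguably more honest than filling it with zero.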

### Advanced Aggregation Techniques

Sophisticated aggregation methods for complex analysis...

In [None]:
# TODO: Show advanced aggregation techniques
# - Weighted aggregations (area-weighted, population-weighted)
# - Multi-level aggregations (nested geographic hierarchies)
# - Time-based spatial aggregations
# - Handling missing data in aggregations
pass
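
Area-weighted aggregation can be sketched as follows. It assumes population is spread evenly within each source polygon, which is a strong simplification, and all names and numbers are invented.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical census tracts with population counts (projected CRS, metres)
tracts = gpd.GeoDataFrame(
    {"tract": ["T1", "T2"], "population": [1000, 400]},
    geometry=[box(0, 0, 100, 100), box(100, 0, 200, 100)],
    crs="EPSG:32612",
)
# Target zone that covers half of each tract
zones = gpd.GeoDataFrame(
    {"zone": ["service_area"]},
    geometry=[box(50, 0, 150, 100)],
    crs="EPSG:32612",
)

tracts["tract_area"] = tracts.geometry.area

# Cut tracts by zones, then weight each piece by its share of the tract's area
pieces = gpd.overlay(tracts, zones, how="intersection")
pieces["weight"] = pieces.geometry.area / pieces["tract_area"]
pieces["pop_estimate"] = pieces["population"] * pieces["weight"]

zone_population = pieces.groupby("zone")["pop_estimate"].sum()
print(zone_population)
```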

### Geometry Aggregation Strategies

How to handle geometries when grouping features...

In [None]:
# TODO: Demonstrate geometry aggregation strategies
# - Union/dissolve for polygon groups
# - Multipoint for point groups
# - Convex hull for point clusters
# - Representative points for complex aggregations
pass
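
For point groups, one option is to collect each group into a `MultiPoint` and derive a convex hull from it. The clusters below are made up; cluster `b` has a single point to show that the hull of one point degenerates to a `Point`.

```python
import geopandas as gpd
from shapely.geometry import MultiPoint, Point

points = gpd.GeoDataFrame(
    {"cluster": ["a", "a", "a", "b"]},
    geometry=[Point(0, 0), Point(4, 0), Point(0, 3), Point(10, 10)],
    crs="EPSG:32612",
)

rows = []
for name, group in points.groupby("cluster"):
    multipoint = MultiPoint(list(group.geometry))  # all points of the group as one geometry
    rows.append(
        {
            "cluster": name,
            "n_points": len(group),
            "geometry": multipoint.convex_hull,  # smallest convex shape around the group
        }
    )

hulls = gpd.GeoDataFrame(rows, geometry="geometry", crs=points.crs)
print(hulls.assign(geom_type=hulls.geometry.geom_type, area=hulls.geometry.area))
```

For polygon groups, `dissolve` handles the union directly, so a manual loop like this is mostly useful for points.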

### Implementing aggregate_spatial_data()

Creating a flexible aggregation function...

In [None]:
# TODO: Step-by-step implementation guide for aggregate_spatial_data()
# Handle both attribute and spatial grouping methods
pass
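
A possible shape for `aggregate_spatial_data()` is sketched below. The signature is an assumption: `boundaries=None` selects attribute grouping, while passing a polygon layer selects spatial grouping. The spatial branch also assumes the input layer does not already have a column named `group_col`.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def aggregate_spatial_data(gdf, group_col, aggregations, boundaries=None):
    """Aggregate by attribute, or by containing polygon (illustrative signature)."""
    if boundaries is None:
        # Attribute mode: group rows that share a value and union their geometries
        if group_col not in gdf.columns:
            raise KeyError(f"Column {group_col!r} not found")
        return gdf.dissolve(by=group_col, aggfunc=aggregations).reset_index()

    # Spatial mode: group features by the boundary polygon that contains them
    if boundaries.crs != gdf.crs:
        boundaries = boundaries.to_crs(gdf.crs)
    keys = boundaries[[group_col, boundaries.geometry.name]]
    joined = gpd.sjoin(gdf, keys, how="inner", predicate="within")
    stats = joined.groupby(group_col).agg(aggregations)
    return boundaries.merge(stats, left_on=group_col, right_index=True, how="left")


# Hypothetical inputs
cities = gpd.GeoDataFrame(
    {"state": ["AZ", "AZ", "NM"], "population": [100, 200, 50]},
    geometry=[Point(0.5, 0.5), Point(0.6, 0.6), Point(1.5, 0.5)],
    crs="EPSG:4326",
)
regions = gpd.GeoDataFrame(
    {"region": ["West", "East"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

by_state = aggregate_spatial_data(cities, "state", {"population": "sum"})
by_region = aggregate_spatial_data(cities, "region", {"population": "sum"}, boundaries=regions)
print(by_state[["state", "population"]])
print(by_region[["region", "population"]])
```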

## 🔍 Function 6: Multi-Criteria Spatial Filtering

Selecting features that meet complex spatial and attribute conditions...

### Spatial Filter Types

Different ways to filter features based on spatial properties...

In [None]:
# TODO: Demonstrate different spatial filter types
# - Area-based filters (min/max area)
# - Bounding box filters (within geographic extent)
# - Proximity filters (within distance of features)
# - Intersection filters (overlapping with specific areas)
pass
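
The four filter types listed above can each be expressed as a boolean mask or an indexer. The parcels, road, and flood polygon below are hypothetical, in an assumed projected CRS so distances and areas are in metres.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

parcels = gpd.GeoDataFrame(
    {"parcel": ["p1", "p2", "p3"]},
    geometry=[box(0, 0, 100, 100), box(200, 0, 220, 20), box(500, 500, 700, 700)],
    crs="EPSG:32612",
)
road = LineString([(0, 150), (300, 150)])
flood = box(50, -10, 210, 60)

# Area filter: keep parcels of at least 5,000 square metres
large = parcels[parcels.geometry.area >= 5_000]

# Bounding-box filter: .cx selects features that intersect the x/y window
in_extent = parcels.cx[0:300, 0:300]

# Proximity filter: keep parcels within 60 m of the road
near_road = parcels[parcels.geometry.distance(road) <= 60]

# Intersection filter: keep parcels that overlap or touch the flood polygon
in_flood = parcels[parcels.geometry.intersects(flood)]

for label, subset in [("large", large), ("in_extent", in_extent),
                      ("near_road", near_road), ("in_flood", in_flood)]:
    print(label, list(subset["parcel"]))
```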

### Attribute Filter Strategies

Filtering based on non-spatial characteristics...

In [None]:
# TODO: Demonstrate attribute filtering
# - Numeric range filters (population between X and Y)
# - Categorical filters (city type in ['urban', 'suburban'])
# - Date range filters (events after specific date)
# - Null/missing value handling
pass
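
Attribute filters on a GeoDataFrame are ordinary pandas operations. The sketch below uses invented city records, including one missing population value to show how comparisons treat `NaN`.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

cities = gpd.GeoDataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "population": [5_000, 50_000, None, 250_000],
        "type": ["rural", "urban", "suburban", "urban"],
        "founded": pd.to_datetime(["1900-01-01", "1960-05-01", "1985-03-15", "1940-07-04"]),
    },
    geometry=[Point(0, 0), Point(1, 1), Point(2, 2), Point(3, 3)],
    crs="EPSG:4326",
)

# Numeric range: between() is inclusive, and NaN never satisfies it
in_range = cities[cities["population"].between(10_000, 100_000)]

# Categorical membership
urbanish = cities[cities["type"].isin(["urban", "suburban"])]

# Date range
recent = cities[cities["founded"] > pd.Timestamp("1950-01-01")]

# Rows with missing values, to inspect rather than silently drop
missing_pop = cities[cities["population"].isna()]

print(list(in_range["name"]), list(urbanish["name"]),
      list(recent["name"]), list(missing_pop["name"]))
```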

### Combining Multiple Criteria

Building complex selection logic with multiple conditions...

In [None]:
# TODO: Demonstrate multi-criteria filtering
# - AND logic: features meeting ALL conditions
# - OR logic: features meeting ANY condition
# - Complex combinations of spatial and attribute filters
# - Progressive filtering with condition tracking
pass
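
One way to combine conditions is to keep each one as a named boolean mask, then reduce them with AND or OR. The sketch below also records how many features survive each step. The data and criterion names are hypothetical.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

parcels = gpd.GeoDataFrame(
    {
        "parcel": ["p1", "p2", "p3", "p4"],
        "zoning": ["residential", "commercial", "residential", "residential"],
    },
    geometry=[box(0, 0, 100, 100), box(100, 0, 200, 100),
              box(300, 0, 310, 10), box(400, 0, 500, 100)],
    crs="EPSG:32612",
)
flood = box(-10, -10, 50, 50)

# One named mask per criterion, mixing spatial and attribute conditions
masks = pd.DataFrame({
    "large": parcels.geometry.area >= 5_000,
    "outside_flood": ~parcels.geometry.intersects(flood),
    "residential": parcels["zoning"] == "residential",
})

meets_all = parcels[masks.all(axis=1)]  # AND logic
meets_any = parcels[masks.any(axis=1)]  # OR logic

# Progressive filtering: count survivors after each criterion is applied
remaining = pd.Series(True, index=parcels.index)
log = []
for name in masks.columns:
    remaining = remaining & masks[name]
    log.append((name, int(remaining.sum())))

print(list(meets_all["parcel"]), list(meets_any["parcel"]))
print(log)
```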

### Real-World Multi-Criteria Examples

Practical applications of complex spatial filtering...

In [None]:
# TODO: Show realistic multi-criteria examples
# - Site selection: large parcels, near roads, outside flood zones
# - Market analysis: high-income areas, population > 1000, within 10km of city
# - Environmental: protected species habitat AND low human disturbance
# - Emergency planning: hospitals with capacity AND accessible by major roads
pass

### Performance Optimization for Complex Filters

Making multi-criteria analysis efficient with large datasets...

In [None]:
# TODO: Demonstrate filter optimization techniques
# - Apply most selective filters first
# - Use spatial indexes for geometric filters
# - Batch processing for large datasets
# - Memory-efficient filtering strategies
pass
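
The spatial index can be queried directly to avoid testing every geometry. The sketch below compares an indexed query with a full scan on randomly generated points; the dataset size and query window are arbitrary choices for illustration.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import box

# Synthetic points scattered over a 1 km square
rng = np.random.default_rng(0)
xy = rng.uniform(0, 1000, size=(5000, 2))
points = gpd.GeoDataFrame(
    {"id": range(5000)},
    geometry=gpd.points_from_xy(xy[:, 0], xy[:, 1]),
    crs="EPSG:32612",
)
window = box(100, 100, 200, 200)

# Indexed query: returns integer positions of geometries matching the predicate
positions = points.sindex.query(window, predicate="intersects")
fast = points.iloc[positions]

# Full scan: evaluates the predicate against every geometry
slow = points[points.geometry.intersects(window)]

print(len(fast), len(slow))
```

Both approaches should select the same features; the indexed one tends to pay off as the layer grows.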

### Implementing filter_by_spatial_criteria()

Building a comprehensive multi-criteria filtering function...

In [None]:
# TODO: Step-by-step implementation guide for filter_by_spatial_criteria()
# Handle complex criteria combinations and edge cases
pass
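
A possible skeleton for `filter_by_spatial_criteria()` follows. The keyword arguments are assumptions chosen for illustration, and all supplied criteria are combined with AND logic; the assignment's specification may define a different interface.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, box


def filter_by_spatial_criteria(gdf, min_area=None, bbox=None, near=None,
                               max_distance=None, attributes=None):
    """Keep features that satisfy every supplied criterion (illustrative signature)."""
    mask = pd.Series(True, index=gdf.index)
    if min_area is not None:
        mask &= gdf.geometry.area >= min_area
    if bbox is not None:
        # bbox is (minx, miny, maxx, maxy) in the layer's CRS
        mask &= gdf.geometry.intersects(box(*bbox))
    if near is not None and max_distance is not None:
        mask &= gdf.geometry.distance(near) <= max_distance
    for column, allowed in (attributes or {}).items():
        mask &= gdf[column].isin(allowed)
    return gdf[mask]


# Hypothetical inputs
parcels = gpd.GeoDataFrame(
    {"parcel": ["p1", "p2", "p3"], "zoning": ["residential", "commercial", "residential"]},
    geometry=[box(0, 0, 100, 100), box(200, 0, 220, 20), box(500, 500, 700, 700)],
    crs="EPSG:32612",
)
road = LineString([(0, 150), (300, 150)])

selected = filter_by_spatial_criteria(
    parcels, min_area=5_000, near=road, max_distance=60,
    attributes={"zoning": ["residential"]},
)
unfiltered = filter_by_spatial_criteria(parcels)
print(list(selected["parcel"]))
```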

## 🔄 Complete Analysis Workflows

Combining all three functions into powerful analysis pipelines...

In [None]:
# TODO: Demonstrate complete analysis workflows
# Example 1: Environmental Impact Assessment
# - Filter industrial sites by size and proximity to water
# - Find communities within impact buffer zones
# - Aggregate population statistics by impact level
#
# Example 2: Public Health Analysis
# - Intersect disease cases with demographic areas
# - Aggregate case counts by income level
# - Filter high-risk areas by multiple criteria
#
# Example 3: Urban Planning Study
# - Find developable land (large parcels, outside flood zones)
# - Calculate access to services (within buffers of schools, hospitals)
# - Aggregate development potential by planning district
pass
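
Example 1 could be sketched end to end as below: filter sites, buffer them, join communities, then aggregate. The thresholds (1,000 m², 200 m, 500 m) and every feature are invented for illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, box

crs = "EPSG:32612"  # assumed projected CRS, units in metres
sites = gpd.GeoDataFrame(
    {"site": ["s1", "s2", "s3"]},
    geometry=[box(0, 0, 50, 50), box(0, 1000, 50, 1050), box(200, 0, 210, 10)],
    crs=crs,
)
river = LineString([(-1000, -100), (1000, -100)])
communities = gpd.GeoDataFrame(
    {"community": ["c1", "c2", "c3"], "population": [1200, 800, 300]},
    geometry=[Point(25, 300), Point(25, 700), Point(400, 25)],
    crs=crs,
)

# Step 1 (filter): large sites close to water
is_large = sites.geometry.area >= 1_000
is_near_water = sites.geometry.distance(river) <= 200
big_near_water = sites[is_large & is_near_water].copy()

# Step 2 (intersect): communities inside a 500 m impact buffer
buffers = big_near_water.copy()
buffers["geometry"] = big_near_water.geometry.buffer(500)
affected = gpd.sjoin(communities, buffers[["site", "geometry"]],
                     how="inner", predicate="within")

# Step 3 (aggregate): affected population per site
pop_by_site = affected.groupby("site")["population"].sum()
print(pop_by_site)
```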

## 🧪 Testing Your Spatial Analysis Functions

Validating that your spatial analysis functions work correctly...

In [None]:
# TODO: Comprehensive testing examples
# - Test spatial joins with different geometry combinations
# - Test aggregation with various grouping methods
# - Test filtering with edge cases and boundary conditions
# - Validate results make geographic sense
# - Performance testing with realistic dataset sizes
pass
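
Tests for spatial joins work well with tiny hand-built layers whose correct answer is obvious. The sketch below tests `gpd.sjoin` directly, because the signatures of the course functions are not shown here; the same pattern could be pointed at your own functions. The test names are hypothetical, and the functions run under pytest or when called directly.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def make_layers():
    """Small fixture: two interior points, one boundary point, two zones."""
    stations = gpd.GeoDataFrame(
        {"station": ["S1", "S2", "S3"]},
        geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(1.0, 0.5)],
        crs="EPSG:4326",
    )
    zones = gpd.GeoDataFrame(
        {"zone": ["West", "East"]},
        geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
        crs="EPSG:4326",
    )
    return stations, zones


def test_within_excludes_boundary_point():
    stations, zones = make_layers()
    joined = gpd.sjoin(stations, zones, how="inner", predicate="within")
    assert sorted(joined["station"]) == ["S1", "S2"]


def test_intersects_matches_boundary_point_twice():
    stations, zones = make_layers()
    joined = gpd.sjoin(stations, zones, how="inner", predicate="intersects")
    on_edge = joined[joined["station"] == "S3"]
    assert set(on_edge["zone"]) == {"West", "East"}


def test_result_keeps_left_crs():
    stations, zones = make_layers()
    joined = gpd.sjoin(stations, zones, how="inner", predicate="within")
    assert joined.crs == stations.crs


for test in (test_within_excludes_boundary_point,
             test_intersects_matches_boundary_point_twice,
             test_result_keeps_left_crs):
    test()
    print(f"passed: {test.__name__}")
```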

## 📈 Performance and Scalability

Making your spatial analysis efficient with large datasets...

In [None]:
# TODO: Demonstrate performance optimization techniques
# - Spatial indexing with .sindex
# - Chunked processing for large datasets
# - Memory-efficient operations
# - When to use different join strategies
# - Profiling and benchmarking spatial operations
pass
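
Chunked processing can be sketched as a loop that joins one slice of the left layer at a time and concatenates the pieces. `sjoin_in_chunks` is a hypothetical helper, and the chunk size is arbitrary; whether chunking actually helps depends on memory limits and data size, so it is worth benchmarking.

```python
import geopandas as gpd
import numpy as np
import pandas as pd
from shapely.geometry import box


def sjoin_in_chunks(left, right, chunk_size, **sjoin_kwargs):
    """Join `left` to `right` one slice at a time (hypothetical helper)."""
    parts = [
        gpd.sjoin(left.iloc[start:start + chunk_size], right, **sjoin_kwargs)
        for start in range(0, len(left), chunk_size)
    ]
    return pd.concat(parts)


# Synthetic data: 1,000 random points and two zones
rng = np.random.default_rng(42)
xy = rng.uniform(0, 200, size=(1000, 2))
points = gpd.GeoDataFrame(
    {"id": range(1000)},
    geometry=gpd.points_from_xy(xy[:, 0], xy[:, 1]),
    crs="EPSG:32612",
)
zones = gpd.GeoDataFrame(
    {"zone": ["A", "B"]},
    geometry=[box(0, 0, 100, 100), box(100, 100, 200, 200)],
    crs="EPSG:32612",
)

full = gpd.sjoin(points, zones, how="inner", predicate="within")
chunked = sjoin_in_chunks(points, zones, chunk_size=250, how="inner", predicate="within")
print(len(full), len(chunked))
```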

## 🌟 Professional Applications

How spatial analysis techniques are used across different industries...

In [None]:
# TODO: Show professional application examples
# - Environmental consulting: habitat impact assessments
# - Urban planning: zoning compliance and development analysis
# - Public health: disease surveillance and service planning
# - Business intelligence: market analysis and site selection
# - Emergency management: resource allocation and risk assessment
# - Transportation: route planning and accessibility analysis
pass

## 🎯 Key Takeaways

After completing this notebook, you should understand:

✅ **Spatial relationships** - How features relate to each other through location  
✅ **Spatial joins** - Combining datasets based on geographic relationships  
✅ **Data aggregation** - Summarizing information by spatial groups  
✅ **Multi-criteria analysis** - Applying complex selection logic  
✅ **Analysis workflows** - Combining operations for comprehensive analysis  
✅ **Performance optimization** - Handling large datasets efficiently  

## 📚 Next Steps

1. **Implement** your three spatial analysis functions in `src/spatial_analysis.py`
2. **Test** your implementations with `uv run pytest tests/ -k "spatial" -v`
3. **Move on** to `04_mapping_visualization.ipynb` to learn about professional mapping

---

*Spatial analysis is where GIS really shines - connecting datasets through location to reveal patterns and relationships invisible in traditional data analysis. Master these techniques and you'll be ready to tackle complex real-world spatial problems!* 🌟