# 📂 Loading Spatial Data - Your Data Loading Toolkit

**GIST 604B - Python GeoPandas Introduction**  
**Notebook 2: Mastering Spatial Data Loading**

---

## 🎯 Learning Objectives

By the end of this notebook, you will be able to:
- Load spatial data from different file formats (Shapefile, GeoJSON, GeoPackage)
- Handle common loading issues (encoding, missing files, corrupted data)
- Load spatial data from URLs and compressed files
- Understand the structure of different spatial data formats
- Implement the `load_spatial_dataset()` function

## 📁 Common Spatial Data Formats Deep Dive

Let's look at the three formats you will meet most often (Shapefile, GeoJSON, and GeoPackage) and see how each one is structured and loaded.

In [None]:
# Import necessary libraries
import geopandas as gpd
import pandas as pd
import numpy as np
from pathlib import Path
import os
import warnings
# Hide library warnings to keep the output tidy; comment this out when debugging a load problem
warnings.filterwarnings('ignore')

print("📦 Libraries loaded successfully!")
print(f"🐼 GeoPandas version: {gpd.__version__}")

## 🗂️ Format 1: Shapefile (.shp) - The GIS Classic

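The shapefile is one of the most widely used vector formats in GIS. A single "shapefile" is actually a set of files that share the same base name:
- `.shp` - Main geometry file (required)
- `.shx` - Index file (required)
- `.dbf` - Attribute data (required)
- `.prj` - Projection information (optional, but without it the data loads with no CRS)
- Others: `.cpg` (text encoding of the `.dbf`), plus index files such as `.sbn`, `.sbx`, `.fbn`, `.fbx`, `.ain`, `.aih`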

**Key characteristics:**
- Binary format (not human-readable)
- Supported by virtually all GIS software
- Each component file (`.shp`, `.dbf`) is limited to 2 GB
- Column names limited to 10 characters
- Only one geometry type per file
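
Because a shapefile is really a group of files, it can help to check which components are present before loading. The following is a minimal sketch, not the course's reference solution: the helper names `missing_shapefile_parts()` and `load_shapefile()` are hypothetical, and the check assumes lowercase file extensions.

```python
from pathlib import Path

# Components that must sit next to the .shp, and ones that are optional but useful.
REQUIRED_PARTS = (".shp", ".shx", ".dbf")
OPTIONAL_PARTS = (".prj", ".cpg")


def missing_shapefile_parts(shp_path):
    """Return the component extensions that are absent on disk (hypothetical helper)."""
    base = Path(shp_path)
    return [
        ext
        for ext in REQUIRED_PARTS + OPTIONAL_PARTS
        if not base.with_suffix(ext).exists()
    ]


def load_shapefile(shp_path):
    """Check the component files, then read the shapefile with GeoPandas."""
    missing = missing_shapefile_parts(shp_path)
    required_missing = [ext for ext in missing if ext in REQUIRED_PARTS]
    if required_missing:
        raise FileNotFoundError(
            f"{shp_path}: missing required component(s) {required_missing}"
        )
    if ".prj" in missing:
        print("No .prj file found: the loaded data will have no CRS")

    import geopandas as gpd  # imported here so the checks above run without it

    gdf = gpd.read_file(shp_path)
    print(f"Rows: {len(gdf)}, CRS: {gdf.crs}")
    print(f"Geometry types: {sorted(gdf.geom_type.dropna().unique())}")
    print(f"Columns: {list(gdf.columns)}")
    return gdf
```

Note how the printed column names reflect the 10-character limit: longer names were truncated when the file was written.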

In [None]:
# TODO: Load a shapefile and explore its structure
# Example code will go here for loading shapefiles
pass

## 🌐 Format 2: GeoJSON (.geojson, .json) - The Web Standard

GeoJSON is a text-based format that's perfect for web applications:

**Key characteristics:**
- Human-readable text format (JSON-based)
- Single file contains everything
- Perfect for web mapping
- Supports multiple geometry types in one file
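- Files grow large for big or complex datasets, because every coordinate is stored as text
- The current specification (RFC 7946) requires WGS84 longitude/latitude (EPSG:4326); files written under the older 2008 specification may declare a different CRS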
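
Since GeoJSON is plain JSON, you can inspect a file with the standard library before handing it to GeoPandas. The sketch below counts features per geometry type, which shows the "multiple geometry types in one file" property directly. The helper names are hypothetical, and the code assumes the file is a `FeatureCollection`.

```python
import json
from collections import Counter
from pathlib import Path


def geojson_geometry_counts(path):
    """Count features per geometry type by reading the file as plain JSON."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    counts = Counter()
    for feature in data.get("features", []):
        geometry = feature.get("geometry")  # valid GeoJSON allows a null geometry
        counts[geometry["type"] if geometry else "null"] += 1
    return dict(counts)


def load_geojson(path):
    """Read the same file into a GeoDataFrame."""
    import geopandas as gpd  # imported here so the JSON helper runs without it

    gdf = gpd.read_file(path)
    print(f"Rows: {len(gdf)}, CRS: {gdf.crs}")
    return gdf
```

Comparing `geojson_geometry_counts(path)` with `load_geojson(path).geom_type.value_counts()` is a quick way to confirm that nothing was dropped on load.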

In [None]:
# TODO: Load a GeoJSON file and compare with shapefile
# Example code will go here for loading GeoJSON
pass

## 📦 Format 3: GeoPackage (.gpkg) - The Modern Choice

GeoPackage is a modern, SQLite-based spatial format:

**Key characteristics:**
- Based on SQLite database
- Single file with no practical size limit (bounded only by SQLite's own limits, far beyond the shapefile's 2 GB)
- Can store multiple layers
- Supports raster and vector data
- Open standard (OGC approved)
- Cross-platform compatibility
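
Because a GeoPackage is an SQLite database, its layer list can be read from the standard `gpkg_contents` table, and a single layer can then be loaded with the `layer` argument of `gpd.read_file()`. This is a sketch with hypothetical helper names; recent GeoPandas releases may also offer their own layer-listing function, so check the documentation for your version.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_gpkg_layers(gpkg_path):
    """Return (layer_name, data_type) pairs from the gpkg_contents table.

    data_type is 'features' for vector layers and 'tiles' for raster tiles.
    """
    # Open read-only so a mistyped path never creates an empty database file.
    uri = Path(gpkg_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    return rows


def load_gpkg_layer(gpkg_path, layer):
    """Load one named layer from a GeoPackage."""
    import geopandas as gpd  # imported here so the listing helper runs without it

    return gpd.read_file(gpkg_path, layer=layer)
```

If you call `gpd.read_file()` on a multi-layer GeoPackage without `layer`, only one layer is returned, so listing the layers first avoids silently loading the wrong one.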

In [None]:
# TODO: Load a GeoPackage and explore multiple layers
# Example code will go here for loading GeoPackage
pass

## 🛠️ Handling Common Loading Issues

Real-world spatial data often arrives with problems: wrong paths, unexpected text encodings, or damaged files. Let's learn to handle each one gracefully...

### Issue 1: File Not Found Errors
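
One way to approach missing files is to resolve the path with `pathlib` and fail early with a message that says what *is* in the folder. The function name `resolve_data_path()` and its `base_dir` parameter are hypothetical, chosen only for this sketch.

```python
from pathlib import Path


def resolve_data_path(path, base_dir="."):
    """Return an absolute Path to an existing file, or raise a helpful error."""
    candidate = Path(path).expanduser()
    if not candidate.is_absolute():
        candidate = Path(base_dir) / candidate
    candidate = candidate.resolve()

    if candidate.is_file():
        return candidate

    # Build a hint so a typo in the file name is easy to spot.
    parent = candidate.parent
    if parent.is_dir():
        siblings = sorted(p.name for p in parent.iterdir() if p.is_file())
        hint = f"Files in {parent}: {siblings}"
    else:
        hint = f"Directory does not exist: {parent}"
    raise FileNotFoundError(f"Spatial file not found: {candidate}. {hint}")
```

A typical use is `gdf = gpd.read_file(resolve_data_path("cities.geojson", base_dir="data"))`, where the file and folder names are placeholders for your own data.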

In [None]:
# TODO: Demonstrate proper file path handling and error checking
pass

### Issue 2: Encoding Problems
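
Encoding problems usually show up as garbled attribute text rather than as an error. The sketch below shows what a wrong guess looks like, reads the optional `.cpg` hint that may accompany a shapefile, and passes an explicit `encoding` to `gpd.read_file()`. The helper names are hypothetical, and the `.cpg` value is returned as-is because it is not always a valid Python codec name.

```python
from pathlib import Path


def declared_encoding(shp_path):
    """Return the text of the .cpg file next to a shapefile, or None if absent."""
    cpg = Path(shp_path).with_suffix(".cpg")
    if not cpg.is_file():
        return None
    return cpg.read_text(encoding="ascii", errors="replace").strip() or None


def load_with_encoding(path, encoding):
    """Read a file while telling the reader how attribute text is encoded."""
    import geopandas as gpd  # imported here so the helper above runs without it

    return gpd.read_file(path, encoding=encoding)


# What a wrong guess looks like: UTF-8 bytes interpreted as Latin-1.
garbled = "São Paulo".encode("utf-8").decode("latin-1")
print(garbled)  # SÃ£o Paulo
```

If place names in a loaded GeoDataFrame look like the garbled string above, reloading with `encoding="utf-8"` (or, in the opposite case, `"latin-1"`) is a reasonable first thing to try.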

In [None]:
# TODO: Show how to handle different text encodings
pass

### Issue 3: Corrupted or Invalid Files
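
A damaged file can fail in two ways: the reader raises an error, or the file loads but contains missing or invalid geometries. The sketch below covers both. The names `read_or_explain()` and `geometry_report()` are hypothetical, and the `reader` parameter exists only so the error handling can be tried with any callable; by default it uses `gpd.read_file`.

```python
from pathlib import Path


def read_or_explain(path, reader=None):
    """Read a spatial file, turning low-level failures into one clear ValueError."""
    path = Path(path)
    if path.is_file() and path.stat().st_size == 0:
        raise ValueError(f"{path.name} is empty (0 bytes)")

    if reader is None:
        import geopandas as gpd

        reader = gpd.read_file

    try:
        return reader(path)
    except Exception as exc:  # the exact exception type depends on the I/O engine
        raise ValueError(f"{path.name} could not be read: {exc}") from exc


def geometry_report(gdf):
    """Summarise problems in a GeoDataFrame that loaded without raising."""
    geometry = gdf.geometry
    missing = geometry.isna()
    return {
        "rows": len(gdf),
        "missing_geometry": int(missing.sum()),
        "invalid_geometry": int((~geometry.is_valid & ~missing).sum()),
        "has_crs": gdf.crs is not None,
    }
```

Catching the broad `Exception` is a deliberate simplification here; in your own code you may prefer to let `FileNotFoundError` pass through unchanged.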

In [None]:
# TODO: Demonstrate graceful handling of corrupted files
pass

## 🌐 Loading Data from URLs

Sometimes you need to load spatial data directly from the web. `gpd.read_file()` accepts an HTTP(S) URL in place of a local path...
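
A small guard can make it explicit whether a source is remote before reading it. This is a sketch: `is_remote_source()` and `load_from_url()` are hypothetical names, and only `http` and `https` are treated as remote here.

```python
from urllib.parse import urlparse


def is_remote_source(source):
    """Return True when the source looks like an HTTP(S) URL."""
    return urlparse(str(source)).scheme in ("http", "https")


def load_from_url(url):
    """Read a spatial dataset straight from a URL."""
    if not is_remote_source(url):
        raise ValueError(f"Not an http(s) URL: {url}")

    import geopandas as gpd  # imported here so the check above runs without it

    # Network errors (timeouts, 404s) surface as exceptions from this call.
    return gpd.read_file(url)
```

Usage would look like `load_from_url("https://example.com/data/cities.geojson")`, where the URL is a placeholder, not a real dataset. For data you use repeatedly, downloading once and loading the local copy avoids depending on the network every time the notebook runs.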

In [None]:
# TODO: Show how to load spatial data from URLs
pass

## 📦 Working with Compressed Files

Spatial data is often distributed as ZIP archives, both to save space and bandwidth and to keep a shapefile's component files together...
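
GeoPandas can read a ZIP archive without unpacking it. When the archive holds more than one dataset, a `zip://` path with `!` followed by the member name selects one of them. The helper names below are hypothetical, and the path form assumes a POSIX-style absolute path.

```python
import zipfile
from pathlib import Path

SPATIAL_SUFFIXES = (".shp", ".geojson", ".gpkg")


def spatial_members(zip_path):
    """List archive members that look like loadable spatial files."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith(SPATIAL_SUFFIXES)
        )


def zip_member_uri(zip_path, member):
    """Build a 'zip://archive!member' path for one file inside the archive."""
    return f"zip://{Path(zip_path).resolve().as_posix()}!{member}"


def load_from_zip(zip_path, member=None):
    """Read the archive directly, or one named member inside it."""
    import geopandas as gpd  # imported here so the helpers above run without it

    if member is None:
        return gpd.read_file(zip_path)
    return gpd.read_file(zip_member_uri(zip_path, member))
```

Calling `spatial_members()` first tells you whether you need to pass `member` at all.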

In [None]:
# TODO: Demonstrate loading from ZIP files and other compressed formats
pass

## 🛠️ Building Your load_spatial_dataset() Function

Now let's put it all together and build a robust function for loading spatial data...
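
The pieces above can be combined roughly as follows. This is only a sketch under assumptions: the real requirements for `load_spatial_dataset()` are defined by the assignment and its tests, so the name `load_spatial_dataset_sketch()`, its signature, the supported extensions, and the choice of exceptions are all illustrative.

```python
from pathlib import Path

# Assumed mapping from file extension to a human-readable format name.
FORMATS = {
    ".shp": "Shapefile",
    ".geojson": "GeoJSON",
    ".json": "GeoJSON",
    ".gpkg": "GeoPackage",
}


def detect_format(path):
    """Identify the format from the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    return FORMATS[suffix]


def load_spatial_dataset_sketch(path, **read_kwargs):
    """Detect the format, check the file exists, load it, and validate the result."""
    path = Path(path)
    file_format = detect_format(path)
    if not path.is_file():
        raise FileNotFoundError(f"{file_format} file not found: {path}")

    import geopandas as gpd  # imported here so the checks above run without it

    try:
        gdf = gpd.read_file(path, **read_kwargs)
    except Exception as exc:  # the exact exception type depends on the I/O engine
        raise ValueError(f"Could not read {file_format} file {path.name}: {exc}") from exc

    if gdf.empty:
        raise ValueError(f"{path.name} loaded but contains no features")
    if gdf.crs is None:
        print(f"Warning: {path.name} has no CRS information")
    return gdf
```

Extra keyword arguments such as `layer=` or `encoding=` are passed through to `gpd.read_file()`, which keeps the wrapper small while still covering the GeoPackage and encoding cases discussed earlier.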

In [None]:
# TODO: Step-by-step implementation guide for load_spatial_dataset()
# This will walk through each requirement and provide working examples
pass

## 🧪 Testing Your Implementation

Let's test our function with different scenarios...
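
One possible way to exercise a loader is to write a tiny GeoJSON fixture to a temporary folder and run a few checks against it. In this sketch the checks take the loader as a parameter, so you can point them at your own function. The names are hypothetical, and the expectation that a missing file raises `FileNotFoundError` is an assumption; the assignment's own tests are the authority on expected behaviour.

```python
import json
import tempfile
from pathlib import Path

# A two-point FeatureCollection used as a known-good fixture.
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "A"},
            "geometry": {"type": "Point", "coordinates": [-110.97, 32.22]},
        },
        {
            "type": "Feature",
            "properties": {"name": "B"},
            "geometry": {"type": "Point", "coordinates": [-112.07, 33.45]},
        },
    ],
}


def write_sample_geojson(directory):
    """Write the fixture to disk and return its path."""
    path = Path(directory) / "sample.geojson"
    path.write_text(json.dumps(SAMPLE), encoding="utf-8")
    return path


def check_loader(loader):
    """Run basic checks against a loader callable; return the names of passed checks."""
    passed = []
    with tempfile.TemporaryDirectory() as tmp:
        result = loader(write_sample_geojson(tmp))
        assert len(result) == 2, "expected two features"
        passed.append("loads valid file")

        try:
            loader(Path(tmp) / "does_not_exist.geojson")
        except FileNotFoundError:
            passed.append("missing file raises FileNotFoundError")
        else:
            raise AssertionError("missing file should raise FileNotFoundError")
    return passed
```

Running `check_loader(your_function)` in a notebook cell gives quick feedback before you run the full pytest suite.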

In [None]:
# TODO: Test cases for the function
pass

## 🎯 Key Takeaways

After completing this notebook, you should understand:

✅ **Different spatial formats** - When to use Shapefile vs GeoJSON vs GeoPackage  
✅ **Error handling** - How to gracefully handle loading problems  
✅ **File path management** - Using Path objects for better path handling  
✅ **Format detection** - How to identify file types automatically  
✅ **Validation** - Checking that loaded data is valid  

## 📚 Next Steps

1. **Implement** your `load_spatial_dataset()` function in `src/spatial_basics.py`
2. **Test** your implementation with `uv run pytest tests/ -k "load_spatial_dataset" -v`
3. **Move on** to `03_explore_properties.ipynb` to learn about spatial data exploration

---

*Remember: Good data loading is the foundation of all spatial analysis. Take time to build robust loading functions that handle real-world data problems gracefully!* 🌟