# 🔍 Function 1: Load and Explore Raster Data

## Building the `load_and_explore_raster` Function

**Learning Objectives:**
- Understand raster data structure and metadata
- Learn to open and inspect raster files with rasterio
- Extract comprehensive raster properties (CRS, bounds, transforms)
- Work with different raster formats (GeoTIFF, NetCDF, etc.)
- Handle raster metadata for downstream analysis

**Professional Context:**
Before any raster analysis, professionals always explore the data to understand:
- **Spatial extent and resolution** - Where is the data and how detailed is it?
- **Coordinate reference system** - What projection is used?
- **Data structure** - How many bands, what data type?
- **Data quality indicators** - NoData values, valid ranges

## Part 1: Understanding Raster Data Structure

### 1.1 What Makes Data "Spatial"?

Raster data is **georeferenced** - each pixel has a specific location on Earth. This is achieved through:

1. **Coordinate Reference System (CRS)** - Defines how coordinates relate to Earth's surface
2. **Affine Transform** - Mathematical relationship between pixel indices and coordinates
3. **Spatial Bounds** - The geographic extent covered by the raster
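
For a north-up raster (no rotation), these pieces are tied together: given the upper-left corner, the pixel size, and the grid shape, the bounds follow directly. The sketch below is pure Python and only illustrative; the function name `bounds_from_origin` and the numbers are assumptions made for this example, not part of rasterio.

```python
def bounds_from_origin(origin_x, origin_y, pixel_width, pixel_height, width, height):
    """Return (left, bottom, right, top) for a north-up raster.

    origin_x, origin_y: coordinates of the upper-left corner of the grid.
    pixel_width, pixel_height: positive pixel sizes in CRS units.
    width, height: number of columns and rows.
    """
    left = origin_x
    top = origin_y
    right = origin_x + pixel_width * width
    bottom = origin_y - pixel_height * height  # rows run downward from the top
    return left, bottom, right, top


# Illustrative values: a 100 x 100 grid of 0.01-degree pixels
print(bounds_from_origin(-120.5, 36.0, 0.01, 0.01, 100, 100))
```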

In [None]:
import rasterio
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
import tempfile
import os

# Let's create a sample raster to work with
def create_sample_raster():
    """Create a sample elevation raster for demonstration."""
    # Create sample elevation data (simulated mountainous terrain)
    width, height = 100, 100
    x = np.linspace(-2, 2, width)
    y = np.linspace(-2, 2, height)
    X, Y = np.meshgrid(x, y)
    
    # Create elevation data with multiple peaks
    elevation = (1000 + 500 * np.exp(-(X**2 + Y**2)) + 
                 300 * np.exp(-((X-1)**2 + (Y-0.5)**2)) +
                 200 * np.random.normal(0, 0.1, (height, width)))
    
    # Define transform (pixel to coordinate mapping)
    transform = rasterio.transform.from_bounds(
        west=-120.5, south=35.0, east=-119.5, north=36.0,
        width=width, height=height
    )
    
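    # Create a temporary file path (mkstemp replaces the deprecated, race-prone mktemp)
    fd, temp_path = tempfile.mkstemp(suffix='.tif')
    os.close(fd)  # close the handle; rasterio opens the path itself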
    
    # Write the raster
    with rasterio.open(
        temp_path, 'w',
        driver='GTiff',
        height=height, width=width,
        count=1,  # number of bands
        dtype=elevation.dtype,
        crs='EPSG:4326',  # WGS84 geographic coordinates
        transform=transform,
        nodata=-9999
    ) as dst:
        dst.write(elevation, 1)  # Write to band 1
    
    return temp_path

# Create our sample data
sample_raster_path = create_sample_raster()
print(f"Created sample raster: {sample_raster_path}")

### 1.2 Opening and Inspecting a Raster

The first step in any raster analysis is opening the file and understanding its properties:

In [None]:
# Open the raster file
with rasterio.open(sample_raster_path) as src:
    print("=== BASIC PROPERTIES ===")
    print(f"Width (columns): {src.width} pixels")
    print(f"Height (rows): {src.height} pixels")
    print(f"Number of bands: {src.count}")
    print(f"Data type: {src.dtypes[0]}")  # dtypes holds one entry per band
    print(f"NoData value: {src.nodata}")
    print(f"Driver (file format): {src.driver}")
    
    print("\n=== SPATIAL PROPERTIES ===")
    print(f"Coordinate Reference System: {src.crs}")
    print(f"Bounds (left, bottom, right, top): {src.bounds}")
    
    print("\n=== TRANSFORM MATRIX ===")
    print(f"Transform: {src.transform}")
    print(f"Pixel size X: {src.transform[0]}")
    print(f"Pixel size Y: {abs(src.transform[4])}")

### 1.3 Understanding the Affine Transform

The affine transform is a 3×3 matrix with six free parameters that converts pixel positions (column, row) to coordinates in the raster's CRS:

```
| a  b  c |
| d  e  f |
| 0  0  1 |
```

Where:
- **a** = pixel width (x-direction)
- **b** = row rotation (usually 0)
- **c** = x-coordinate of upper-left corner
- **d** = column rotation (usually 0)
- **e** = pixel height (y-direction, usually negative)
- **f** = y-coordinate of upper-left corner
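
Written out, the matrix gives `x = a*col + b*row + c` and `y = d*col + e*row + f`. The following pure-Python sketch applies those two formulas by hand so the arithmetic is visible; `pixel_to_coords` and the parameter values are hypothetical and chosen only for illustration.

```python
def pixel_to_coords(params, col, row):
    """Apply the affine parameters (a, b, c, d, e, f) to a pixel position.

    Integer (col, row) gives the upper-left corner of that pixel;
    add 0.5 to each to get the pixel center.
    """
    a, b, c, d, e, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Illustrative north-up transform: 0.01-degree pixels,
# upper-left corner of the raster at (-120.5, 36.0)
params = (0.01, 0.0, -120.5, 0.0, -0.01, 36.0)

print(pixel_to_coords(params, 0, 0))      # upper-left corner of the raster
print(pixel_to_coords(params, 0.5, 0.5))  # center of the first pixel
```

Because `e` is negative here, increasing the row index moves south, which matches the top-down row order of image data.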

In [None]:
# Demonstrate coordinate transformation
with rasterio.open(sample_raster_path) as src:
    transform = src.transform
    
    # Convert pixel coordinates to geographic coordinates
    # (xy() returns the coordinates of the pixel center by default)
    row, col = 0, 0  # Upper-left pixel
    x, y = rasterio.transform.xy(transform, row, col)
    print(f"Pixel (0, 0) -> Geographic ({x:.6f}, {y:.6f})")
    
    row, col = 50, 50  # Center pixel
    x, y = rasterio.transform.xy(transform, row, col)
    print(f"Pixel (50, 50) -> Geographic ({x:.6f}, {y:.6f})")
    
    # Convert geographic coordinates back to pixel coordinates
    row, col = rasterio.transform.rowcol(transform, x, y)
    print(f"Geographic ({x:.6f}, {y:.6f}) -> Pixel ({row}, {col})")
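
Going from coordinates back to pixel indices inverts the affine relationship. For an unrotated raster (b = d = 0) the inverse reduces to a subtraction, a division, and a floor. This is a pure-Python sketch of that special case; `coords_to_pixel` is a hypothetical helper, and rasterio's own `rowcol` is the better choice in real code.

```python
import math


def coords_to_pixel(params, x, y):
    """Return (row, col) of the pixel containing (x, y) for an unrotated transform."""
    a, b, c, d, e, f = params
    if b != 0 or d != 0:
        raise ValueError("this sketch only handles unrotated (north-up) transforms")
    col = math.floor((x - c) / a)
    row = math.floor((y - f) / e)
    return row, col


# Illustrative values: 0.01-degree pixels, upper-left corner at (-120.5, 36.0)
params = (0.01, 0.0, -120.5, 0.0, -0.01, 36.0)
print(coords_to_pixel(params, -119.995, 35.495))
```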

## Part 2: Implementing the Function

Now let's build the `load_and_explore_raster` function step by step. This function should extract all important raster properties into a comprehensive dictionary.

### 2.1 Function Requirements

Your function must return a dictionary with these keys:
- `'width'`: Width in pixels (int)
- `'height'`: Height in pixels (int)
- `'count'`: Number of bands (int)
- `'crs'`: Coordinate reference system as string
- `'driver'`: File format driver (str)
- `'dtype'`: Data type of the raster (str)
- `'nodata'`: NoData value (float or None)
- `'bounds'`: Geographic bounds as dict with 'left', 'bottom', 'right', 'top'
- `'transform'`: Affine transformation parameters (list of 6 values)
- `'pixel_size'`: Pixel resolution as tuple (x_res, y_res)
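
One way to check a result against this list before running the real tests is a small validator. The sketch below is an assumption about what such a check could look like; `REQUIRED_KEYS` and `check_raster_info` are hypothetical names, and the course's actual tests may check different things (for instance, this sketch also accepts an integer `nodata`).

```python
# Expected type(s) for each required key (hypothetical, derived from the list above)
REQUIRED_KEYS = {
    "width": int,
    "height": int,
    "count": int,
    "crs": (str, type(None)),
    "driver": str,
    "dtype": str,
    "nodata": (int, float, type(None)),
    "bounds": dict,
    "transform": list,
    "pixel_size": tuple,
}


def check_raster_info(info):
    """Return a list of problems found in a raster-info dictionary (empty if none)."""
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in info:
            problems.append(f"missing key: {key}")
        elif not isinstance(info[key], expected):
            problems.append(f"{key} has unexpected type {type(info[key]).__name__}")
    if isinstance(info.get("bounds"), dict):
        for side in ("left", "bottom", "right", "top"):
            if side not in info["bounds"]:
                problems.append(f"bounds is missing '{side}'")
    if isinstance(info.get("transform"), list) and len(info["transform"]) != 6:
        problems.append("transform should have 6 values")
    return problems


# Illustrative dictionary in the required shape
example = {
    "width": 100, "height": 100, "count": 1,
    "crs": "EPSG:4326", "driver": "GTiff", "dtype": "float64",
    "nodata": -9999.0,
    "bounds": {"left": -120.5, "bottom": 35.0, "right": -119.5, "top": 36.0},
    "transform": [0.01, 0.0, -120.5, 0.0, -0.01, 36.0],
    "pixel_size": (0.01, 0.01),
}
print(check_raster_info(example))  # prints []
```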

### 2.2 Step-by-Step Implementation

Let's implement this function together:

In [None]:
def load_and_explore_raster_example(raster_path: str):
    """Example implementation of load_and_explore_raster function."""
    
    # Step 1: Open the raster file
    with rasterio.open(raster_path) as src:
        
        # Step 2: Extract basic properties
        basic_info = {
            'width': src.width,
            'height': src.height,
            'count': src.count,
            'crs': str(src.crs) if src.crs else None,
            'driver': src.driver,
            'dtype': str(src.dtypes[0]),  # dtypes has one entry per band
            'nodata': src.nodata
        }
        
        # Step 3: Get geographic bounds
        bounds = src.bounds
        bounds_dict = {
            'left': bounds.left,
            'bottom': bounds.bottom,
            'right': bounds.right,
            'top': bounds.top
        }
        
        # Step 4: Get transformation matrix
        transform = src.transform
        transform_list = list(transform)[:6]  # Convert to list, take first 6 values
        
        # Step 5: Calculate pixel size
        pixel_size = (abs(transform[0]), abs(transform[4]))
        
        # Step 6: Combine all information
        raster_info = {
            **basic_info,  # Unpack basic info
            'bounds': bounds_dict,
            'transform': transform_list,
            'pixel_size': pixel_size
        }
        
        return raster_info

# Test the example function
info = load_and_explore_raster_example(sample_raster_path)

print("=== FUNCTION OUTPUT ===")
for key, value in info.items():
    print(f"{key}: {value}")

### 2.3 Understanding the Output

Let's explore what each piece of information tells us:

In [None]:
# Analyze the raster properties
print("=== RASTER ANALYSIS ===")
print(f"Spatial Coverage: {info['bounds']['right'] - info['bounds']['left']:.3f}° × {info['bounds']['top'] - info['bounds']['bottom']:.3f}°")
print(f"Resolution: {info['pixel_size'][0]:.6f}° per pixel")
print(f"Total pixels: {info['width'] * info['height']:,}")
print(f"Coordinate system: {info['crs']}")

# Calculate area covered (approximate for geographic coordinates)
if 'EPSG:4326' in str(info['crs']):
    # For geographic coordinates, convert to approximate kilometers
    lat_center = (info['bounds']['top'] + info['bounds']['bottom']) / 2
    degree_to_km = 111.32  # km per degree at equator
    width_km = (info['bounds']['right'] - info['bounds']['left']) * degree_to_km * np.cos(np.radians(lat_center))
    height_km = (info['bounds']['top'] - info['bounds']['bottom']) * degree_to_km
    print(f"Approximate area: {width_km:.1f} × {height_km:.1f} km")

### 2.4 Handling Different Raster Types

Your function should work with different raster types. Let's test with multi-band data:

In [None]:
# Create a multi-band raster (simulating RGB satellite image)
def create_multiband_raster():
    """Create a sample multi-band raster."""
    width, height = 50, 50
    
    # Simulate RGB bands (randint's upper bound is exclusive, so 256 covers 0-255)
    red_band = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    green_band = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    blue_band = np.random.randint(0, 256, (height, width), dtype=np.uint8)
    
    transform = rasterio.transform.from_bounds(
        west=-121.0, south=35.5, east=-120.0, north=36.5,
        width=width, height=height
    )
    
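    # mkstemp replaces the deprecated, race-prone mktemp
    fd, temp_path = tempfile.mkstemp(suffix='_rgb.tif')
    os.close(fd)  # close the handle; rasterio opens the path itself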
    
    with rasterio.open(
        temp_path, 'w',
        driver='GTiff',
        height=height, width=width,
        count=3,  # RGB = 3 bands
        dtype='uint8',
        crs='EPSG:4326',
        transform=transform,
        nodata=0
    ) as dst:
        dst.write(red_band, 1)
        dst.write(green_band, 2)
        dst.write(blue_band, 3)
    
    return temp_path

# Test with multi-band data
multiband_path = create_multiband_raster()
multiband_info = load_and_explore_raster_example(multiband_path)

print("=== MULTI-BAND RASTER ===")
print(f"Bands: {multiband_info['count']}")
print(f"Data type: {multiband_info['dtype']}")
print(f"NoData value: {multiband_info['nodata']}")

## Part 3: Your Implementation Task

### 3.1 Implementation Guidelines

Now it's time to implement this function in the `src/rasterio_basics.py` file. Here are the key steps:

```python
from typing import Any, Dict


def load_and_explore_raster(raster_path: str) -> Dict[str, Any]:
    # TODO: Implement this function
    #
    # STEP 1: Open the raster file using rasterio.open()
    # HINT: Use a 'with' statement to ensure proper file handling
    #
    # STEP 2: Extract basic properties (width, height, count, crs, driver, dtype, nodata)
    # HINT: Most are attributes of the rasterio dataset object; read dtype from dataset.dtypes[0]
    #
    # STEP 3: Get the geographic bounds
    # HINT: Use dataset.bounds, which has left, bottom, right, and top attributes
    #
    # STEP 4: Get the transformation matrix
    # HINT: Use dataset.transform and keep the first 6 values as a list
    #
    # STEP 5: Calculate pixel size from the transformation
    # HINT: abs(transform[0]) is x pixel size, abs(transform[4]) is y pixel size
    #
    # STEP 6: Return all information as a dictionary
    raise NotImplementedError("Replace this line with your implementation")
```

### 3.2 Testing Your Implementation

Once you've implemented the function, test it with:

```bash
uv run pytest tests/test_rasterio_basics.py::test_load_and_explore_raster -v
```

The test will verify that:
1. Your function returns the correct data structure
2. All required keys are present in the output dictionary
3. Data types are correct (strings, numbers, lists, etc.)
4. Values are reasonable and accurate

### 3.3 Common Issues and Solutions

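**Issue 1: CRS is an object, not a string (and may be missing)**
```python
# Wrong: returns a rasterio CRS object, not the string the function must return
'crs': src.crs

# Right: convert to a string, keeping None when no CRS is defined
'crs': str(src.crs) if src.crs else None
```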

**Issue 2: Transform object vs list**
```python
# Wrong:
'transform': src.transform

# Right:
'transform': list(src.transform)[:6]
```

**Issue 3: Pixel size calculation**
```python
# Remember: Y pixel size is usually negative, take absolute value
'pixel_size': (abs(transform[0]), abs(transform[4]))
```
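
The `abs()` shortcut assumes a north-up raster with no rotation terms. If `b` or `d` is non-zero, the pixel edge lengths are the lengths of the steps taken when moving one column or one row. A small pure-Python sketch of that general case follows; `pixel_size_from_params` is a hypothetical helper, not a rasterio function.

```python
import math


def pixel_size_from_params(params):
    """Return (x_res, y_res) from affine parameters (a, b, c, d, e, f).

    For unrotated rasters this equals (abs(a), abs(e)).
    """
    a, b, c, d, e, f = params
    x_res = math.hypot(a, d)  # distance covered by moving one column over
    y_res = math.hypot(b, e)  # distance covered by moving one row down
    return x_res, y_res


# Illustrative north-up transform with 0.01-degree pixels
print(pixel_size_from_params((0.01, 0.0, -120.5, 0.0, -0.01, 36.0)))
```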

## Part 4: Professional Applications

### 4.1 Why This Function Matters

In professional GIS work, you **always** start by exploring your data:

- **Data Quality Assessment**: Check resolution, extent, coordinate system
- **Processing Planning**: Understand data structure before analysis
- **Integration Preparation**: Ensure compatibility with other datasets
- **Documentation**: Record data properties for reproducible workflows
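
As an example of integration preparation, two dictionaries returned by this function can be compared before overlaying the rasters. The sketch below is deliberately simple and not exhaustive; `grid_mismatches` is a hypothetical helper, and comparing CRS strings is an assumption that can report a difference for equivalent systems written in different forms.

```python
import math


def grid_mismatches(info_a, info_b, tol=1e-9):
    """List differences that would prevent a pixel-for-pixel overlay of two rasters."""
    issues = []
    # Naive string comparison; equivalent CRSs may still be spelled differently
    if info_a["crs"] != info_b["crs"]:
        issues.append(f"CRS differs: {info_a['crs']} vs {info_b['crs']}")
    for axis, res_a, res_b in zip("xy", info_a["pixel_size"], info_b["pixel_size"]):
        if not math.isclose(res_a, res_b, abs_tol=tol):
            issues.append(f"{axis} resolution differs: {res_a} vs {res_b}")
    if (info_a["width"], info_a["height"]) != (info_b["width"], info_b["height"]):
        issues.append("grid dimensions differ")
    return issues


# Illustrative inputs using only the keys this sketch reads
dem = {"crs": "EPSG:4326", "pixel_size": (0.01, 0.01), "width": 100, "height": 100}
rgb = {"crs": "EPSG:4326", "pixel_size": (0.02, 0.02), "width": 50, "height": 50}
for issue in grid_mismatches(dem, rgb):
    print(issue)
```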

### 4.2 Real-World Scenarios

This function is used when:
- Receiving new satellite imagery from data providers
- Preparing elevation data for hydrological modeling
- Integrating raster layers with different projections
- Quality checking processed raster outputs
- Creating metadata for data catalogs
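
For the catalog case, the returned dictionary can be written out as JSON. This is a minimal sketch using only the standard library; `raster_info_to_json` is a hypothetical helper, and it assumes every value in the dictionary is a plain Python type.

```python
import json


def raster_info_to_json(info):
    """Serialize a raster-info dictionary to a JSON string for a catalog entry."""
    # Tuples (such as pixel_size) become JSON arrays and None becomes null.
    # A NaN nodata value would be written as NaN, which strict JSON parsers reject.
    return json.dumps(info, indent=2, sort_keys=True)


# Illustrative dictionary with a subset of the required keys
record = {
    "crs": "EPSG:4326",
    "nodata": None,
    "pixel_size": (0.01, 0.01),
    "bounds": {"left": -120.5, "bottom": 35.0, "right": -119.5, "top": 36.0},
}
print(raster_info_to_json(record))
```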

## 🎯 Summary and Next Steps

### What You've Learned
- How to open and inspect raster files with rasterio
- Understanding raster metadata and spatial properties
- Working with coordinate reference systems and transforms
- Extracting comprehensive raster information programmatically

### Your Implementation Checklist
- [ ] Open raster file with proper context management
- [ ] Extract all required properties
- [ ] Format bounds as a dictionary with named keys
- [ ] Convert transform to list format
- [ ] Calculate pixel size correctly
- [ ] Return complete dictionary with all required keys

### Next Function
Once you've implemented and tested this function, move on to:
**[`02_function_calculate_raster_statistics.ipynb`](02_function_calculate_raster_statistics.ipynb)**

Where you'll learn to compute comprehensive statistics for raster data!

---

**Remember**: Understanding your data is the foundation of all successful GIS analysis! 🛰️

In [None]:
# Cleanup temporary files
import os

for temp_file in [sample_raster_path, multiband_path]:
    if os.path.exists(temp_file):
        os.remove(temp_file)
        
print("Cleaned up temporary files. Ready to implement!")