# Reclamation Assessment using Robust Z-Score Transformed NDVI

## Overview
This notebook analyzes crop growth within lease boundaries compared to the background field using robust z-score transformation of NDVI rasters. The robust z-score method uses median and MAD (Median Absolute Deviation) statistics, making it more resistant to outliers than standard z-scores.
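
As a minimal sketch of why the median/MAD pair resists outliers, the snippet below compares a classical and a robust z-score on a small made-up sample with one extreme value. The sample values and the helper names `standard_z` and `robust_z` are illustrative assumptions and are not part of the notebook's pipeline; the 1.4826 factor is the one used later in `calculate_robust_stats`.

```python
import numpy as np

def standard_z(values: np.ndarray) -> np.ndarray:
    """Classical z-score: (x - mean) / std."""
    return (values - values.mean()) / values.std()

def robust_z(values: np.ndarray) -> np.ndarray:
    """Robust z-score: (x - median) / (1.4826 * MAD)."""
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return (values - median) / (1.4826 * mad)

# Hypothetical NDVI-like sample; the last value is an extreme outlier
sample = np.array([0.60, 0.62, 0.61, 0.63, 0.59, 0.05])

z_std = standard_z(sample)
z_rob = robust_z(sample)
print("standard:", np.round(z_std, 2))
print("robust:  ", np.round(z_rob, 2))
```

The outlier pulls the mean down and inflates the standard deviation, so its classical z-score looks only moderately unusual, while the robust z-score flags it far more strongly and leaves the typical values near zero.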

## Workflow
1. Upload multiple NDVI rasters (GeoTIFFs)
2. Upload two polygon boundaries:
   - Field boundary (entire field)
   - Lease boundary (area of interest within field)
3. For each NDVI raster:
   - Create background mask (field minus lease)
   - Calculate robust statistics on background pixels
   - Transform entire raster using background statistics
   - Generate z-score raster showing robust standard deviations from background median
4. Download transformed rasters for further analysis
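
Step 3 of the workflow can be sketched on a synthetic grid, without any GeoTIFF or polygon files. The array values and the rectangular masks below are assumptions made for illustration; in the notebook the masks come from the uploaded polygons.

```python
import numpy as np

# Synthetic 6x6 "NDVI" grid; values are made up for illustration
rng = np.random.default_rng(0)
ndvi = rng.normal(loc=0.6, scale=0.05, size=(6, 6))

# Hypothetical masks: the field covers the inner 4x4 block,
# the lease is a 2x2 block inside the field
field_mask = np.zeros((6, 6), dtype=bool)
field_mask[1:5, 1:5] = True
lease_mask = np.zeros((6, 6), dtype=bool)
lease_mask[2:4, 2:4] = True

# Background = field minus lease
background_mask = field_mask & ~lease_mask

# Robust statistics from background pixels only
bg = ndvi[background_mask]
median = np.median(bg)
robust_std = 1.4826 * np.median(np.abs(bg - median))

# Transform every field pixel; everything outside the field stays NaN
z = np.full(ndvi.shape, np.nan)
z[field_mask] = (ndvi[field_mask] - median) / robust_std
```

Because the statistics come from the background only, the background z-scores are centred on zero by construction, while the lease z-scores are free to drift above or below it.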

## Important Notes
- **NoData Handling**: All pixels outside the field boundary are automatically set to NoData
- **Multiple Uploads**: You can run the upload cell multiple times to add forgotten files
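
A short sketch of the NoData round trip, assuming the exported rasters use -9999 as the sentinel (the value the export step writes) and that a downstream script reads them as plain arrays. The tiny array is made up for illustration.

```python
import numpy as np

NODATA = -9999.0  # sentinel written to the exported GeoTIFFs

# In memory, NaN marks pixels outside the field
z = np.array([[np.nan, 0.4], [-1.2, np.nan]], dtype=np.float32)

# On disk: NaN is replaced by the sentinel
on_disk = np.where(np.isnan(z), NODATA, z).astype(np.float32)

# Reading back: turn the sentinel into NaN again before any statistics,
# otherwise -9999 would be treated as a real z-score
restored = np.where(on_disk == NODATA, np.nan, on_disk)
valid_mean = np.nanmean(restored)
```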

## Interpretation
- **Z-score = 0**: Pixel value equals background median
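- **Z-score > 0**: Above background median (higher NDVI than the typical background pixel)
- **Z-score < 0**: Below background median (lower NDVI than the typical background pixel)
- **|Z-score| > 2**: More than two robust standard deviations from the background median; treated here as significantly different (outlier)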
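
The thresholds above translate directly into the percentage indicators the notebook reports for the lease area. The following is a small sketch on hypothetical z-scores; the values are invented for illustration.

```python
import numpy as np

# Hypothetical lease z-scores; NaN marks NoData pixels
lease_z = np.array([-2.5, -0.8, 0.0, 0.3, 1.1, 2.4, np.nan])
valid = lease_z[~np.isnan(lease_z)]

pct_above_zero = np.mean(valid > 0) * 100     # share above the background median
pct_below_minus2 = np.mean(valid < -2) * 100  # share well below background
pct_above_2 = np.mean(valid > 2) * 100        # share well above background
```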

## 1. Setup and Imports

In [None]:
# Install required packages
%pip install -q geopandas rasterio fiona shapely numpy pandas matplotlib

# Import libraries
import os
import warnings
import zipfile
from datetime import datetime
from typing import List, Tuple, Optional, Dict, Any

import numpy as np
import pandas as pd
import geopandas as gpd
import rasterio
from rasterio.io import MemoryFile
from rasterio.mask import mask
from rasterio.warp import calculate_default_transform, reproject, Resampling
from shapely.geometry import shape, mapping
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from matplotlib.patches import Patch

# Google Colab specific imports
from google.colab import files
from IPython.display import display, HTML

warnings.filterwarnings('ignore')

print("✅ Setup complete. All libraries imported successfully.")
print("📍 Running in Google Colab environment")

## 2. File Upload Interface

### Upload Files (Run Multiple Times as Needed)
You can run this cell multiple times to add more files. Previous uploads are preserved.
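
The accumulate-and-recategorize behavior can be sketched without Colab. The helper name `add_uploads` and the filenames are hypothetical; the extension lists mirror the ones used in the upload cell.

```python
from typing import Dict, List, Tuple

def add_uploads(store: Dict[str, bytes],
                new: Dict[str, bytes]) -> Tuple[List[str], List[str]]:
    """Merge new uploads into the store and re-categorize by extension."""
    store.update(new)  # same filename: newer content replaces older
    rasters = sorted(f for f in store
                     if f.lower().endswith(('.tif', '.tiff')))
    polygons = sorted(f for f in store
                      if f.lower().endswith(('.kml', '.geojson', '.shp', '.json')))
    return rasters, polygons

store: Dict[str, bytes] = {}
# First "run": one raster and the field polygon (hypothetical filenames)
add_uploads(store, {"ndvi_2023.tif": b"...", "field.geojson": b"{}"})
# Second "run": the forgotten lease polygon
rasters, polygons = add_uploads(store, {"lease.geojson": b"{}"})
```

Keeping one dictionary keyed by filename means a re-upload of the same file overwrites the earlier copy rather than creating a duplicate entry.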

In [None]:
# Initialize file storage if not exists
if 'all_uploaded_files' not in locals():
    all_uploaded_files = {}
if 'ndvi_files' not in locals():
    ndvi_files = []
if 'polygon_files' not in locals():
    polygon_files = []

print("📂 Upload your files (you can run this cell multiple times to add more):")
print("="*50)
print("Required files:")
print("1️⃣ NDVI Rasters (.tif or .tiff files)")
print("2️⃣ Field Boundary polygon (.kml, .geojson, or .shp)")
print("3️⃣ Lease Boundary polygon (.kml, .geojson, or .shp)")
print("="*50)
print("\n💡 TIP: Forgot a file? Just run this cell again!\n")

# Upload files
uploaded = files.upload()

# Add to master collection and save to disk
for filename, content in uploaded.items():
    all_uploaded_files[filename] = content
    # Save to disk for processing
    with open(filename, 'wb') as f:
        f.write(content)
    print(f"✅ Added: {filename} ({len(content)/1024:.1f} KB)")

# Re-categorize all files
ndvi_files = []
polygon_files = []

for filename in all_uploaded_files.keys():
    if filename.lower().endswith(('.tif', '.tiff')):
        ndvi_files.append(filename)
    elif filename.lower().endswith(('.kml', '.geojson', '.shp', '.json')):
        polygon_files.append(filename)

# Sort files for consistent processing
ndvi_files.sort()
polygon_files.sort()

print(f"\n📊 Total Files Summary:")
print(f"\n📈 NDVI Rasters ({len(ndvi_files)} files):")
if ndvi_files:
    for f in ndvi_files:
        print(f"   • {f}")
else:
    print("   ⚠️ No NDVI files uploaded yet")

print(f"\n🗺️ Polygon Files ({len(polygon_files)} files):")
if polygon_files:
    for f in polygon_files:
        print(f"   • {f}")
else:
    print("   ⚠️ No polygon files uploaded yet")

# Status check
print("\n" + "="*50)
if len(ndvi_files) > 0 and len(polygon_files) >= 2:
    print("✅ All required file types present! Ready to proceed.")
else:
    missing = []
    if len(ndvi_files) == 0:
        missing.append("NDVI rasters")
    if len(polygon_files) < 2:
        missing.append(f"polygon files (need 2, have {len(polygon_files)})")
    print(f"⚠️ Still need: {', '.join(missing)}")
    print("   Run this cell again to add more files.")

## 3. Identify Field and Lease Boundaries

In [None]:
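# Helper function to load a polygon file
def load_polygon(filename: str) -> Optional[gpd.GeoDataFrame]:
    """Load a polygon layer from a supported format; return None on failure."""
    try:
        # Read directly from file
        gdf = gpd.read_file(filename)
        if gdf.empty:
            # Guard against files with no features (geometry[0] would fail later)
            print(f"Error loading {filename}: file contains no features")
            return None
        return gdf
    except Exception as e:
        print(f"Error loading {filename}: {e}")
        return None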

# Load and identify polygons
if len(polygon_files) >= 2:
    print("🔍 Identifying field and lease boundaries...\n")
    
    # Try to auto-identify based on filename
    field_boundary_file = None
    lease_boundary_file = None
    
    for filename in polygon_files:
        fname_lower = filename.lower()
        if 'field' in fname_lower and not field_boundary_file:
            field_boundary_file = filename
        elif 'lease' in fname_lower and not lease_boundary_file:
            lease_boundary_file = filename
    
    # If not automatically identified, use manual selection
    if not field_boundary_file or not lease_boundary_file:
        print("Could not auto-identify boundaries from filenames.")
        print("\nPlease identify which file is which:")
        print("Available polygon files:")
        for i, filename in enumerate(polygon_files):
            print(f"  {i+1}. {filename}")
        
        print("\n📝 Using default assignment:")
        print("   (Modify the code below if different assignment is needed)")
        field_boundary_file = polygon_files[0]
        lease_boundary_file = polygon_files[1] if len(polygon_files) > 1 else polygon_files[0]
    
    print(f"📍 Field Boundary: {field_boundary_file}")
    print(f"📍 Lease Boundary: {lease_boundary_file}")
    
    # Load the polygons
    field_gdf = load_polygon(field_boundary_file)
    lease_gdf = load_polygon(lease_boundary_file)
    
    if field_gdf is not None and lease_gdf is not None:
        print("\n✅ Both boundaries loaded successfully")
        print(f"   Field CRS: {field_gdf.crs}")
        print(f"   Lease CRS: {lease_gdf.crs}")
        print(f"   Field area: {field_gdf.geometry[0].area:.6f} sq units")
        print(f"   Lease area: {lease_gdf.geometry[0].area:.6f} sq units")
else:
    print("❌ Need at least 2 polygon files to proceed")
    print("   Please run the upload cell again to add missing files.")
    field_gdf = None
    lease_gdf = None
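
When filenames give no hint, the default assignment above simply takes the first polygon file as the field. One possible sanity check, sketched here under the assumption that the field fully contains the lease and that both rings share a projected CRS, is to treat the larger polygon as the field. The `ring_area` helper and the coordinates are hypothetical and do not appear in the notebook.

```python
from typing import List, Tuple

Ring = List[Tuple[float, float]]

def ring_area(ring: Ring) -> float:
    """Planar polygon area via the shoelace formula (ring need not be closed)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical exterior rings in the same projected CRS
ring_a: Ring = [(0, 0), (100, 0), (100, 100), (0, 100)]  # 100 x 100 square
ring_b: Ring = [(40, 40), (60, 40), (60, 60), (40, 60)]  # 20 x 20 square

# Assumption: the field contains the lease, so the larger polygon is the field
field_ring, lease_ring = sorted([ring_a, ring_b], key=ring_area, reverse=True)
```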

## 4. Process NDVI Rasters with Robust Z-Score Transformation

In [None]:
def calculate_robust_stats(data: np.ndarray) -> Dict[str, float]:
    """Calculate robust statistics (median and MAD)"""
    # Remove NaN and NoData values
    valid_data = data[~np.isnan(data)]
    
    # Additional filtering for common NoData values
    nodata_values = [-9999, -10000, -3.4028235e+38, 3.4028235e+38]
    for ndv in nodata_values:
        valid_data = valid_data[np.abs(valid_data - ndv) > 1e-6]
    
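    if len(valid_data) == 0:
        # Return the same keys as the normal path so callers can index safely
        return {
            'median': np.nan,
            'mad': np.nan,
            'robust_std': np.nan,
            'n_valid': 0,
            'mean': np.nan,
            'std': np.nan
        }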
    
    median = np.median(valid_data)
    mad = np.median(np.abs(valid_data - median))
    robust_std = 1.4826 * mad  # Scale factor for consistency with standard deviation
    
    return {
        'median': median,
        'mad': mad,
        'robust_std': robust_std,
        'n_valid': len(valid_data),
        'mean': np.mean(valid_data),  # For comparison
        'std': np.std(valid_data)     # For comparison
    }

def process_ndvi_raster(raster_file: str, field_geom, lease_geom) -> Dict[str, Any]:
    """Process a single NDVI raster with robust z-score transformation"""
    
    results = {'filename': raster_file}
    
    try:
        with rasterio.open(raster_file) as src:
            # Read metadata
            raster_crs = src.crs
            nodata_value = src.nodata if src.nodata is not None else -9999
            
            # Reproject each polygon to the raster CRS if needed; the field
            # and lease files may not share a CRS, so check them separately
            if field_gdf.crs != raster_crs:
                field_geom_proj = field_gdf.to_crs(raster_crs).geometry.iloc[0]
            else:
                field_geom_proj = field_geom
            if lease_gdf.crs != raster_crs:
                lease_geom_proj = lease_gdf.to_crs(raster_crs).geometry.iloc[0]
            else:
                lease_geom_proj = lease_geom
            
            # Crop to field extent with explicit NoData handling
            # This ensures pixels outside field are set to NaN
            field_data, out_transform = mask(src, [field_geom_proj], 
                                            crop=True, 
                                            nodata=np.nan,
                                            filled=True)
            field_data = field_data[0].astype(np.float32)  # Get first band
            
            # Create binary field mask (True where field exists)
            from rasterio.features import geometry_mask
            field_mask = geometry_mask(
                [field_geom_proj],
                out_shape=field_data.shape,
                transform=out_transform,
                invert=True  # True inside the polygon
            )
            
            # Create lease mask within field extent
            lease_mask = geometry_mask(
                [lease_geom_proj],
                out_shape=field_data.shape,
                transform=out_transform,
                invert=True  # True inside the polygon
            )
            
            # Ensure lease is within field
            lease_mask = lease_mask & field_mask
            
            # Create background mask (field minus lease)
            background_mask = field_mask & ~lease_mask
            
            # Extract background pixels for statistics
            background_pixels = np.full_like(field_data, np.nan)
            background_pixels[background_mask] = field_data[background_mask]
            
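            # Handle NoData values in background. Skip NaN sentinels: a
            # comparison against NaN is always False, which would otherwise
            # reject every pixel in the valid-pixel test further down.
            candidates = [nodata_value, -9999, -10000, -3.4028235e+38]
            nodata_values = [ndv for ndv in candidates if not np.isnan(ndv)]
            for ndv in nodata_values:
                background_pixels[np.abs(background_pixels - ndv) < 1e-6] = np.nan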
            
            # Calculate robust statistics on background
            stats = calculate_robust_stats(background_pixels)
            
            # Initialize z-score raster with NaN (everything outside field is NaN)
            z_score_raster = np.full(field_data.shape, np.nan, dtype=np.float32)
            
            # Only calculate z-scores for pixels within field boundary
            if stats['robust_std'] > 0:  # Avoid division by zero
                # Find valid field pixels (not NaN and not NoData)
                valid_field_pixels = field_mask.copy()
                
                # Exclude NaN values
                valid_field_pixels = valid_field_pixels & ~np.isnan(field_data)
                
                # Exclude NoData values
                for ndv in nodata_values:
                    valid_field_pixels = valid_field_pixels & (np.abs(field_data - ndv) > 1e-6)
                
                # Calculate z-scores only for valid pixels
                z_score_raster[valid_field_pixels] = (
                    field_data[valid_field_pixels] - stats['median']
                ) / stats['robust_std']
            
            # Double-check: ensure everything outside field boundary is NaN
            z_score_raster[~field_mask] = np.nan
            
            # Store results
            results['success'] = True
            results['z_score_raster'] = z_score_raster
            results['transform'] = out_transform
            results['crs'] = raster_crs
            results['stats'] = stats
            results['lease_mask'] = lease_mask
            results['field_mask'] = field_mask
            results['background_mask'] = background_mask
            results['shape'] = z_score_raster.shape
            
            # Calculate summary statistics for lease area
            lease_pixels = z_score_raster[lease_mask & ~np.isnan(z_score_raster)]
            
            if len(lease_pixels) > 0:
                results['lease_stats'] = {
                    'mean_z': np.mean(lease_pixels),
                    'median_z': np.median(lease_pixels),
                    'std_z': np.std(lease_pixels),
                    'min_z': np.min(lease_pixels),
                    'max_z': np.max(lease_pixels),
                    'n_pixels': len(lease_pixels),
                    'pct_above_zero': np.sum(lease_pixels > 0) / len(lease_pixels) * 100,
                    'pct_below_minus2': np.sum(lease_pixels < -2) / len(lease_pixels) * 100,
                    'pct_above_2': np.sum(lease_pixels > 2) / len(lease_pixels) * 100
                }
            else:
                results['lease_stats'] = None
            
            # Calculate background area statistics for comparison
            background_z = z_score_raster[background_mask & ~np.isnan(z_score_raster)]
            if len(background_z) > 0:
                results['background_stats'] = {
                    'mean_z': np.mean(background_z),
                    'median_z': np.median(background_z),
                    'std_z': np.std(background_z),
                    'n_pixels': len(background_z)
                }
            else:
                # Keep the key present so downstream lookups do not fail
                results['background_stats'] = None
            
    except Exception as e:
        results['success'] = False
        results['error'] = str(e)
    
    return results

print("✅ Processing functions defined")
print("\nFunction capabilities:")
print("  • Robust statistics using Median and MAD")
print("  • Automatic NoData masking outside field boundary")
print("  • Background area = Field minus Lease")
print("  • Z-score calculation relative to background median")
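
The valid-pixel tests in `process_ndvi_raster` compare pixel values against candidate NoData sentinels. Float rasters sometimes declare NaN itself as the NoData value (an assumption about the input, not something the notebook states), and NaN needs special care because any comparison involving NaN is False. The sketch below illustrates the pitfall and one way around it; the `valid_mask` helper is hypothetical.

```python
import numpy as np

data = np.array([0.2, 0.5, np.nan, -9999.0], dtype=np.float32)

# Pitfall: comparing against a NaN sentinel yields False everywhere,
# so this "keep" test would reject every pixel
keep_wrong = np.abs(data - np.nan) > 1e-6

def valid_mask(values: np.ndarray, nodata) -> np.ndarray:
    """True where a pixel is neither NaN nor equal to the NoData sentinel."""
    keep = ~np.isnan(values)
    if nodata is not None and not np.isnan(nodata):
        keep &= np.abs(values - nodata) > 1e-6
    return keep

keep_nan_sentinel = valid_mask(data, np.nan)
keep_num_sentinel = valid_mask(data, -9999.0)
```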

In [None]:
# Process all NDVI rasters
processed_rasters = []

if ndvi_files and field_gdf is not None and lease_gdf is not None:
    print("🔄 Processing NDVI rasters...")
    print("="*60)
    
    field_geom = field_gdf.geometry[0]
    lease_geom = lease_gdf.geometry[0]
    
    for i, raster_file in enumerate(ndvi_files):
        print(f"\n[{i+1}/{len(ndvi_files)}] Processing: {raster_file}")
        print("-" * 40)
        
        result = process_ndvi_raster(raster_file, field_geom, lease_geom)
        
        if result['success']:
            processed_rasters.append(result)
            stats = result['stats']
            
            print(f"✅ Successfully processed")
            print(f"\n📊 Background Statistics:")
            print(f"   • Median: {stats['median']:.4f}")
            print(f"   • MAD: {stats['mad']:.4f}")
            print(f"   • Robust Std: {stats['robust_std']:.4f}")
            print(f"   • Valid pixels: {stats['n_valid']:,}")
            print(f"   • Mean (comparison): {stats['mean']:.4f}")
            print(f"   • Std (comparison): {stats['std']:.4f}")
            
            if result['lease_stats']:
                lease_stats = result['lease_stats']
                print(f"\n🎯 Lease Area Z-Score Statistics:")
                print(f"   • Mean Z: {lease_stats['mean_z']:.3f}")
                print(f"   • Median Z: {lease_stats['median_z']:.3f}")
                print(f"   • Range: [{lease_stats['min_z']:.3f}, {lease_stats['max_z']:.3f}]")
                print(f"   • Pixels: {lease_stats['n_pixels']:,}")
                print(f"\n📈 Performance Indicators:")
                print(f"   • Above background (Z>0): {lease_stats['pct_above_zero']:.1f}%")
                print(f"   • Significantly below (Z<-2): {lease_stats['pct_below_minus2']:.1f}%")
                print(f"   • Significantly above (Z>2): {lease_stats['pct_above_2']:.1f}%")
        else:
            print(f"❌ Error: {result.get('error', 'Unknown error')}")
    
    print("\n" + "="*60)
    print(f"✅ Processed {len(processed_rasters)}/{len(ndvi_files)} rasters successfully")
else:
    print("❌ Cannot process: Missing required files")
    if not ndvi_files:
        print("   • No NDVI rasters uploaded")
    if field_gdf is None:
        print("   • Field boundary not loaded")
    if lease_gdf is None:
        print("   • Lease boundary not loaded")

## 5. Visualize Z-Score Transformed Rasters

In [None]:
def plot_z_score_raster(result: Dict, figsize=(14, 8)):
    """Create visualization of z-score transformed raster"""
    
    fig, axes = plt.subplots(1, 3, figsize=figsize)
    
    z_raster = result['z_score_raster']
    lease_mask = result['lease_mask']
    field_mask = result['field_mask']
    background_mask = result['background_mask']
    
    # Create custom colormap (red-white-green)
    colors = ['darkred', 'red', 'white', 'lightgreen', 'darkgreen']
    n_bins = 100
    cmap = mcolors.LinearSegmentedColormap.from_list('z_score', colors, N=n_bins)
    
    # Set color limits for better visualization
    vmin, vmax = -3, 3  # Standard range for z-scores
    
    # Plot 1: Full field z-score map
    im1 = axes[0].imshow(z_raster, cmap=cmap, vmin=vmin, vmax=vmax)
    axes[0].set_title('Z-Score Transformed NDVI\n(Full Field)', fontsize=11, fontweight='bold')
    axes[0].axis('off')
    
    # Add lease boundary overlay
    lease_overlay = np.ma.masked_where(~lease_mask, np.ones_like(z_raster))
    axes[0].imshow(lease_overlay, alpha=0.2, cmap='Blues')
    
    # Plot 2: Mask visualization
    mask_display = np.zeros_like(z_raster)
    mask_display[background_mask] = 1  # Background in gray
    mask_display[lease_mask] = 2       # Lease in blue
    mask_display[~field_mask] = np.nan # Outside field is transparent
    
    cmap_masks = mcolors.ListedColormap(['lightgray', 'lightblue'])
    axes[1].imshow(mask_display, cmap=cmap_masks, alpha=0.8)
    axes[1].set_title('Area Masks', fontsize=11, fontweight='bold')
    axes[1].axis('off')
    
    # Add legend (Patch is already imported at the top of the notebook)
    legend_elements = [
        Patch(facecolor='lightgray', label='Background'),
        Patch(facecolor='lightblue', label='Lease Area')
    ]
    axes[1].legend(handles=legend_elements, loc='upper right', fontsize=9)
    
    # Plot 3: Histogram of z-scores
    valid_z = z_raster[~np.isnan(z_raster)]
    lease_z = z_raster[lease_mask & ~np.isnan(z_raster)]
    background_z = z_raster[background_mask & ~np.isnan(z_raster)]
    
    axes[2].hist(background_z, bins=50, alpha=0.5, label='Background', color='gray', density=True)
    axes[2].hist(lease_z, bins=50, alpha=0.7, label='Lease Area', color='blue', density=True)
    axes[2].axvline(0, color='black', linestyle='--', label='Background Median', linewidth=2)
    axes[2].axvline(-2, color='red', linestyle=':', alpha=0.5)
    axes[2].axvline(2, color='red', linestyle=':', alpha=0.5, label='±2 Robust Std')
    
    axes[2].set_xlabel('Z-Score', fontsize=10)
    axes[2].set_ylabel('Density', fontsize=10)
    axes[2].set_title('Distribution of Z-Scores', fontsize=11, fontweight='bold')
    axes[2].legend(loc='upper right', fontsize=9)
    axes[2].grid(True, alpha=0.3)
    axes[2].set_xlim(-4, 4)
    
    # Add colorbar
    cbar = plt.colorbar(im1, ax=axes, orientation='horizontal', pad=0.1, aspect=40)
    cbar.set_label('Z-Score (Robust Standard Deviations from Background Median)', fontsize=10)
    
    # Add title with filename
    fig.suptitle(f"File: {result['filename']}", fontsize=13, fontweight='bold', y=1.02)
    
    plt.tight_layout()
    return fig

# Visualize processed rasters
if processed_rasters:
    print("\n📊 Generating visualizations...")
    print("="*50)
    
    # Show first few rasters (to avoid overwhelming output)
    max_plots = min(3, len(processed_rasters))
    
    for i in range(max_plots):
        print(f"\nVisualization {i+1}/{max_plots}: {processed_rasters[i]['filename']}")
        fig = plot_z_score_raster(processed_rasters[i])
        plt.show()
    
    if len(processed_rasters) > max_plots:
        print(f"\n📌 Note: Showing first {max_plots} of {len(processed_rasters)} visualizations")
        print("   (All rasters will be included in the download)")
else:
    print("\n⚠️ No processed rasters to visualize")

## 6. Export Z-Score Transformed Rasters

In [None]:
def save_z_score_geotiff(result: Dict, output_dir: str) -> str:
    """Save z-score raster as GeoTIFF with proper NoData handling"""
    
    # Create output filename
    base_name = os.path.splitext(result['filename'])[0]
    output_file = os.path.join(output_dir, f"{base_name}_zscore.tif")
    
    # Prepare data for saving
    z_data = result['z_score_raster'].copy()
    
    # Write GeoTIFF
    with rasterio.open(
        output_file,
        'w',
        driver='GTiff',
        height=result['shape'][0],
        width=result['shape'][1],
        count=1,
        dtype='float32',
        crs=result['crs'],
        transform=result['transform'],
        compress='lzw',
        nodata=-9999  # Set explicit NoData value
    ) as dst:
        # Replace NaN with NoData value for saving
        z_data[np.isnan(z_data)] = -9999
        dst.write(z_data.astype(np.float32), 1)
        
        # Add metadata tags
        dst.update_tags(
            description="Robust Z-Score Transformed NDVI",
            background_median=str(result['stats']['median']),
            background_mad=str(result['stats']['mad']),
            background_robust_std=str(result['stats']['robust_std']),
            processing_date=datetime.now().isoformat(),
            interpretation="Values represent robust standard deviations from background median",
            nodata_note="Pixels outside field boundary are set to NoData (-9999)"
        )
    
    return output_file

# Create output directory and save all processed rasters
if processed_rasters:
    output_dir = 'zscore_outputs'
    os.makedirs(output_dir, exist_ok=True)
    
    print("💾 Saving z-score transformed rasters...")
    print("="*50)
    
    saved_files = []
    
    for i, result in enumerate(processed_rasters):
        try:
            output_file = save_z_score_geotiff(result, output_dir)
            saved_files.append(output_file)
            file_size = os.path.getsize(output_file) / 1024  # KB
            print(f"   ✅ [{i+1}/{len(processed_rasters)}] Saved: {os.path.basename(output_file)} ({file_size:.1f} KB)")
        except Exception as e:
            print(f"   ❌ [{i+1}/{len(processed_rasters)}] Error saving {result['filename']}: {e}")
    
    print(f"\n✅ Saved {len(saved_files)} z-score rasters to '{output_dir}/'")
else:
    print("⚠️ No processed rasters to save")

## 7. Generate Summary Statistics

In [None]:
# Create summary statistics CSV
if processed_rasters:
    print("📊 Generating summary statistics...\n")
    
    summary_data = []
    
    for result in processed_rasters:
        row = {
            'Filename': result['filename'],
            'Background_Median': result['stats']['median'],
            'Background_MAD': result['stats']['mad'],
            'Background_Robust_Std': result['stats']['robust_std'],
            'Background_Mean': result['stats']['mean'],
            'Background_Std': result['stats']['std'],
            'Background_Pixels': result['stats']['n_valid']
        }
        
        if result['lease_stats']:
            row.update({
                'Lease_Mean_Z': result['lease_stats']['mean_z'],
                'Lease_Median_Z': result['lease_stats']['median_z'],
                'Lease_Std_Z': result['lease_stats']['std_z'],
                'Lease_Min_Z': result['lease_stats']['min_z'],
                'Lease_Max_Z': result['lease_stats']['max_z'],
                'Lease_Pixels': result['lease_stats']['n_pixels'],
                'Lease_Pct_Above_Background': result['lease_stats']['pct_above_zero'],
                'Lease_Pct_Significantly_Below': result['lease_stats']['pct_below_minus2'],
                'Lease_Pct_Significantly_Above': result['lease_stats']['pct_above_2']
            })
        
        summary_data.append(row)
    
    # Create DataFrame and save to CSV
    df_summary = pd.DataFrame(summary_data)
    summary_file = os.path.join(output_dir, 'zscore_summary_statistics.csv')
    df_summary.to_csv(summary_file, index=False)
    
    # Display summary
    print("📋 Summary Statistics Table:")
    print("=" * 60)
    
    # Create simplified display version
    display_cols = ['Filename', 'Lease_Mean_Z', 'Lease_Median_Z', 
                   'Lease_Pct_Above_Background']
    display_df = df_summary[display_cols].copy()
    display_df.columns = ['File', 'Mean Z', 'Median Z', '% Above Background']
    
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', None)
    print(display_df.to_string(index=False))
    
    print(f"\n✅ Full summary saved to: {summary_file}")
    
    # Overall statistics
    print("\n📈 Overall Performance Summary:")
    print("=" * 40)
    overall_mean_z = df_summary['Lease_Mean_Z'].mean()
    overall_pct_above = df_summary['Lease_Pct_Above_Background'].mean()
    
    print(f"Average Lease Z-Score: {overall_mean_z:.3f}")
    print(f"Average % Above Background: {overall_pct_above:.1f}%")
    
    # Note: the 0.5 cut-offs are coarse summary bands, not the ±2 significance threshold
    if overall_mean_z > 0.5:
        print("✅ Overall: Lease performing WELL ABOVE background")
    elif overall_mean_z > 0:
        print("✅ Overall: Lease performing SLIGHTLY ABOVE background")
    elif overall_mean_z > -0.5:
        print("⚠️ Overall: Lease performing SLIGHTLY BELOW background")
    else:
        print("❌ Overall: Lease performing WELL BELOW background")
else:
    print("⚠️ No processed data for summary")

## 8. Create Download Archive

In [None]:
# Create ZIP archive for download
if processed_rasters and 'saved_files' in locals():
    print("📦 Creating download archive...\n")
    
    zip_filename = 'zscore_transformed_ndvi.zip'
    
    with zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
        # Add all saved GeoTIFFs
        for file in saved_files:
            if os.path.exists(file):
                zipf.write(file, os.path.basename(file))
        
        # Add summary CSV
        if 'summary_file' in locals() and os.path.exists(summary_file):
            zipf.write(summary_file, os.path.basename(summary_file))
        
        # Add detailed README
        readme_content = f"""Z-Score Transformed NDVI Rasters - Reclamation Assessment
=========================================================

Processing Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
Number of Rasters: {len(processed_rasters)}

METHODOLOGY
-----------
Transformation Method: Robust Z-Score using Median Absolute Deviation (MAD)
Background Definition: Field boundary minus lease boundary
Statistics Calculation: Based on background pixels only
NoData Handling: All pixels outside field boundary set to NoData (-9999)

Formula: Z = (NDVI - Background_Median) / (1.4826 * Background_MAD)

INTERPRETATION GUIDE
-------------------
Z-Score Ranges:
  • Z < -2: Significantly below background (potential problem area)
  • -2 ≤ Z < -1: Moderately below background
  • -1 ≤ Z < 0: Slightly below background
  • Z ≈ 0: Similar to background (baseline)
  • 0 < Z ≤ 1: Slightly above background
  • 1 < Z ≤ 2: Moderately above background
  • Z > 2: Significantly above background (excellent performance)

Reclamation Success Indicators:
  • Successful: Lease area Z-scores close to 0 or positive
  • Needs Attention: Lease area Z-scores consistently negative
  • Excellent Recovery: Lease area Z-scores consistently positive

FILES INCLUDED
--------------
1. *_zscore.tif: Z-score transformed NDVI rasters (GeoTIFF format)
   - NoData value: -9999 (pixels outside field boundary)
   - Values: Robust z-scores relative to background median

2. zscore_summary_statistics.csv: Detailed statistics for all rasters
   - Background statistics (median, MAD, pixels)
   - Lease area z-score statistics
   - Performance percentages

USAGE IN GIS SOFTWARE
---------------------
The GeoTIFF files can be loaded in any GIS software (QGIS, ArcGIS, etc.).
Recommended symbology:
  • Color ramp: Red-White-Green (diverging)
  • Range: -3 to +3
  • NoData value: -9999

ADVANTAGES OF ROBUST Z-SCORE
----------------------------
1. Outlier Resistant: Uses median/MAD instead of mean/std
2. Standardized Scale: Easy comparison across dates
3. Statistical Significance: ±2 represents significant deviation
4. Relative Performance: Accounts for field-wide conditions
5. Robust to Non-Normal Distributions: Works well with skewed data

For questions or additional analysis needs, consult the summary CSV file.
"""
        
        zipf.writestr('README.txt', readme_content)
    
    file_size_mb = os.path.getsize(zip_filename) / 1024 / 1024
    print(f"✅ Archive created: {zip_filename}")
    print(f"   Size: {file_size_mb:.2f} MB")
    print(f"   Contents: {len(saved_files)} GeoTIFFs + 1 CSV + README")
    print("\n⬇️ Starting download...")
    
    # Trigger download
    files.download(zip_filename)
    
    print("\n" + "="*60)
    print("🎉 Processing complete! Your z-score transformed rasters are ready.")
    print("\n💡 Next Steps:")
    print("   1. Load the GeoTIFFs in your GIS software")
    print("   2. Apply Red-White-Green color ramp with range -3 to +3")
    print("   3. Review the summary CSV for detailed statistics")
    print("   4. Compare z-scores across dates to track reclamation progress")
else:
    print("⚠️ No files to download. Please process rasters first.")

## 9. Interpretation Guide

### Understanding Z-Scores in Reclamation Context

The robust z-score transformation provides a standardized way to compare lease area performance against the background field:

#### Z-Score Ranges:
- **Z < -2**: Significantly below background (potential problem area)
- **-2 ≤ Z < -1**: Moderately below background
- **-1 ≤ Z < 0**: Slightly below background
- **Z ≈ 0**: Similar to background
- **0 < Z ≤ 1**: Slightly above background
- **1 < Z ≤ 2**: Moderately above background
- **Z > 2**: Significantly above background (excellent performance)
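
The ranges above can be turned into a small lookup, sketched here as a hypothetical `classify_z` helper. The `tol` parameter is an assumption: the guide does not define how close to 0 counts as "similar to background", so a narrow band is used for illustration.

```python
import math

def classify_z(z: float, tol: float = 0.05) -> str:
    """Map a robust z-score to the qualitative ranges listed above.

    `tol` is an assumed half-width for the "similar to background" band.
    """
    if math.isnan(z):
        return "NoData"
    if abs(z) <= tol:
        return "Similar to background"
    if z < -2:
        return "Significantly below background"
    if z < -1:
        return "Moderately below background"
    if z < 0:
        return "Slightly below background"
    if z <= 1:
        return "Slightly above background"
    if z <= 2:
        return "Moderately above background"
    return "Significantly above background"

labels = [classify_z(z) for z in (-2.4, -1.5, -0.3, 0.0, 0.7, 1.8, 3.1)]
```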

#### Reclamation Assessment:
- **Successful Reclamation**: Lease area Z-scores close to 0 or positive
- **Needs Attention**: Lease area Z-scores consistently negative
- **Excellent Recovery**: Lease area Z-scores consistently positive
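
One way to make "consistently" concrete is to look at the lease median z-score for each image date. The sketch below is only an illustration: the `assess_lease` helper, the `tol` band of 0.25, the "Mixed" fallback label, and the dates are all assumptions, not thresholds defined by this notebook.

```python
from typing import Dict

def assess_lease(median_z_by_date: Dict[str, float], tol: float = 0.25) -> str:
    """Label a lease from its per-date median z-scores (assumed thresholds)."""
    values = list(median_z_by_date.values())
    if not values:
        raise ValueError("need at least one date")
    if all(v > tol for v in values):
        return "Excellent Recovery"      # consistently positive
    if all(v < -tol for v in values):
        return "Needs Attention"         # consistently negative
    if all(v >= -tol for v in values):
        return "Successful Reclamation"  # close to 0 or positive
    return "Mixed"                       # signs disagree across dates

# Hypothetical per-date medians
result_good = assess_lease({"2023-06-01": 0.4, "2023-07-01": 0.6})
result_poor = assess_lease({"2023-06-01": -0.9, "2023-07-01": -1.3})
result_ok = assess_lease({"2023-06-01": -0.1, "2023-07-01": 0.3})
result_mixed = assess_lease({"2023-06-01": -0.8, "2023-07-01": 0.5})
```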

#### Important Notes:
- **NoData Handling**: All pixels outside the field boundary are automatically set to NoData
- **Background Definition**: Field area excluding the lease area
- **Robust Statistics**: Median and MAD are less sensitive to outliers than mean and standard deviation
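
The 1.4826 factor that converts MAD into a robust standard deviation comes from the normal distribution: for normally distributed data, MAD equals the standard deviation times the 75th-percentile z-value, so dividing by that value recovers the standard deviation. A quick check using only the standard library:

```python
from statistics import NormalDist

# For a normal distribution: MAD = sigma * Phi^-1(0.75),
# so sigma = MAD / Phi^-1(0.75)
scale = 1.0 / NormalDist().inv_cdf(0.75)
print(round(scale, 4))  # 1.4826
```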

#### Advantages of This Approach:
1. **Outlier Resistant**: Uses median/MAD instead of mean/std
2. **Standardized Scale**: Easy comparison across dates
3. **Statistical Significance**: ±2 represents significant deviation
4. **Relative Performance**: Accounts for field-wide conditions
5. **Proper NoData Handling**: Ensures analysis only within field boundary
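
The ±2 threshold is easiest to interpret under the assumption that background NDVI is roughly normally distributed, which real fields may not satisfy. Under that assumption, the expected share of background pixels beyond ±2 can be computed directly:

```python
from statistics import NormalDist

std_normal = NormalDist()
# Two-sided probability of |Z| > 2 under a standard normal distribution
p_beyond_2 = 2 * (1 - std_normal.cdf(2.0))
print(f"{p_beyond_2:.4f}")  # 0.0455
```

A lease whose share of pixels beyond ±2 is well above this figure of roughly 4.6% is therefore behaving differently from the background, although skewed or multimodal backgrounds will shift the baseline.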