# ☁️ Function 4: Create Cloud Optimized GeoTIFF

## Building the `create_cloud_optimized_geotiff` Function

**Learning Objectives:**
- Create COG-compliant files with proper tiling and overviews
- Implement compression and optimization strategies
- Understand modern geospatial data standards and best practices
- Optimize raster data for web mapping and cloud storage
- Generate efficient data structures for large-scale applications
- Validate COG compliance and performance characteristics

**Professional Context:**
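A Cloud Optimized GeoTIFF (COG) is a regular GeoTIFF whose internal layout (tiles plus overviews) lets clients read only the part of the file they need, typically through HTTP range requests. It has become a standard format for web-served raster data. Professionals use COGs for: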
- Web GIS applications and tile servers
- Cloud-native geospatial workflows
- Efficient data sharing and distribution
- Scalable geospatial data platforms
- Remote sensing data archives

## 🎯 Function Overview

**Function Signature:**
```python
def create_cloud_optimized_geotiff(input_path, output_path, 
                                  compression='lzw', tiling=True, 
                                  overviews=True, blocksize=512):
    """
    Convert a raster to Cloud Optimized GeoTIFF format.
    
    Parameters:
    -----------
    input_path : str
        Path to the input raster file
    output_path : str
        Path for the output COG file
    compression : str, default 'lzw'
        Compression method: 'lzw', 'deflate', 'jpeg', or 'none'.
        'jpeg' is lossy and normally limited to 8-bit imagery.
    tiling : bool, default True
        Whether to create internal tiling (a COG needs this to be True)
    overviews : bool, default True
        Whether to build overview pyramids
    blocksize : int, default 512
        Internal tile size in pixels (must be a multiple of 16)
    
    Returns:
    --------
    dict
        Dictionary containing COG creation metadata and statistics
    """
```

## 📚 Cloud Optimized GeoTIFF Fundamentals

### COG Requirements

| Feature | Description | Benefit |
|---------|-------------|---------|
| **Internal Tiling** | Data organized in tiles (e.g., 512x512) | Efficient partial reads |
| **Overviews** | Multi-resolution pyramid | Fast zoom levels |
| **Compression** | Optional but recommended (e.g., LZW, DEFLATE) | Smaller files, faster transfers |
| **Valid GeoTIFF** | Proper metadata structure | Standards compliance |
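
To make the "efficient partial reads" benefit concrete, the sketch below counts how many internal tiles a reader would need for a small window. It is plain arithmetic rather than a rasterio call, and the helper name `tiles_for_window` and the example raster size are assumptions chosen for illustration.

```python
import math


def tiles_for_window(col_off, row_off, width, height, blocksize=512):
    """Count the internal tiles that a pixel window overlaps (hypothetical helper)."""
    first_col = col_off // blocksize
    last_col = (col_off + width - 1) // blocksize
    first_row = row_off // blocksize
    last_row = (row_off + height - 1) // blocksize
    return (last_col - first_col + 1) * (last_row - first_row + 1)


# Assumed example: a 10,000 x 10,000 pixel raster with 512 x 512 tiles
raster_size = 10_000
total_tiles = math.ceil(raster_size / 512) ** 2

# A 1,000 x 1,000 pixel window starting at column 3000, row 3000
needed = tiles_for_window(3000, 3000, 1000, 1000)
print(f"Tiles touched: {needed} of {total_tiles}")
```

With a striped (untiled) layout the same window would instead force the reader to pull full-width rows, which is why tiling matters for remote access.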

### Implementation Strategy
```python
import rasterio
from rasterio.enums import Resampling

# COG creation workflow:
# 1. Load source raster
# 2. Configure COG profile
# 3. Write with tiling
# 4. Build overviews
# 5. Validate result
```
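
Step 4 needs a list of decimation factors. One common convention, sketched here as an assumption rather than a COG requirement, is to keep halving the resolution until the smallest overview fits inside a single tile. The helper name `overview_factors` is hypothetical.

```python
import math


def overview_factors(width, height, blocksize=512):
    """Return power-of-two factors until the coarsest overview fits in one block.

    Hypothetical helper: the stopping rule is a convention, not part of the COG spec.
    """
    factors = []
    factor = 2
    size = max(width, height)  # longest side of the current (coarsest) level
    while size > blocksize:
        factors.append(factor)
        size = math.ceil(max(width, height) / factor)
        factor *= 2
    return factors


print(overview_factors(4096, 4096))
print(overview_factors(10_000, 8_000))
```

The resulting list can be passed straight to `build_overviews`. A raster that already fits in one tile yields an empty list, meaning no overviews are strictly needed.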

## 💻 Hands-On Examples

In [None]:
import rasterio
from rasterio.enums import Resampling
from rasterio.transform import from_bounds
import numpy as np
import tempfile
import os

# Example 1: Create sample raster data
def create_sample_raster():
    """Create a sample raster for COG conversion"""
    
    np.random.seed(42)
    
    # Create sample raster (500x500)
    rows, cols = 500, 500
    
    # Generate elevation data
    x = np.linspace(0, 10, cols)
    y = np.linspace(0, 10, rows)
    X, Y = np.meshgrid(x, y)
    
    elevation = (
        1000 +
        300 * np.sin(X) * np.cos(Y) +
        100 * np.random.normal(0, 1, (rows, cols))
    )
    
    # Create temporary file
    temp_dir = tempfile.mkdtemp()
    raster_path = os.path.join(temp_dir, 'sample.tif')
    
    # Define spatial reference
    bounds = (-120, 35, -110, 45)
    transform = from_bounds(*bounds, cols, rows)
    
    # Write regular GeoTIFF
    with rasterio.open(
        raster_path, 'w',
        driver='GTiff',
        height=rows, width=cols,
        count=1, dtype='float32',  # match the float32 array written below
        crs='EPSG:4326',
        transform=transform
    ) as dst:
        dst.write(elevation.astype(np.float32), 1)
    
    print(f"Created sample raster: {raster_path}")
    print(f"Size: {rows}x{cols} pixels")
    print(f"File size: {os.path.getsize(raster_path) / 1024 / 1024:.1f} MB")
    
    return raster_path

# Create sample data
sample_raster = create_sample_raster()

In [None]:
# Example 2: Create COG with different configurations
def example_cog_creation():
    """Demonstrate COG creation with different settings"""
    
    print("\n=== COG CREATION EXAMPLES ===")
    
    temp_dir = os.path.dirname(sample_raster)
    
    # Test different COG configurations
    configs = [
        {
            'name': 'Basic COG',
            'compression': 'lzw',
            'blocksize': 512
        },
        {
            'name': 'Compressed COG',
            'compression': 'deflate',
            'blocksize': 512
        }
    ]
    
    results = []
    
    for config in configs:
        output_path = os.path.join(temp_dir, f"cog_{config['name'].lower().replace(' ', '_')}.tif")
        
        # Create COG
        with rasterio.open(sample_raster) as src:
            
            # Configure COG profile
            profile = src.profile.copy()
            profile.update({
                'driver': 'GTiff',
                'tiled': True,
                'blockxsize': config['blocksize'],
                'blockysize': config['blocksize'],
                'compress': config['compression']
            })
            
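            # Write the tiled, compressed GeoTIFF
            with rasterio.open(output_path, 'w', **profile) as dst:
                dst.write(src.read())
        
        # Reopen in update mode to build overviews (the pattern used in the rasterio docs).
        # Overviews added this way are stored after the full-resolution data,
        # so strict COG validators may still flag the file layout.
        with rasterio.open(output_path, 'r+') as dst:
            overview_factors = [2, 4, 8, 16]
            dst.build_overviews(overview_factors, Resampling.average)
            dst.update_tags(ns='rio_overview', resampling='average')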
        
        # Collect statistics
        file_size = os.path.getsize(output_path)
        
        result = {
            'name': config['name'],
            'path': output_path,
            'size_mb': file_size / 1024 / 1024,
            'compression': config['compression'],
            'blocksize': config['blocksize']
        }
        
        results.append(result)
        
        print(f"\nCreated {config['name']}:")
        print(f"  File size: {result['size_mb']:.1f} MB")
        print(f"  Compression: {result['compression']}")
        print(f"  Block size: {result['blocksize']}")
    
    return results

# Run COG creation examples
cog_examples = example_cog_creation()

In [None]:
# Example 3: Validate COG structure
def example_cog_validation():
    """Demonstrate COG validation"""
    
    print("\n=== COG VALIDATION ===")
    
    def validate_cog(raster_path):
        """Check COG compliance"""
        
        with rasterio.open(raster_path) as src:
            info = {
                'tiled': src.profile.get('tiled', False),
                'blocksize': (src.profile.get('blockxsize', 0), 
                             src.profile.get('blockysize', 0)),
                'overviews': len(src.overviews(1)),
                'compression': src.profile.get('compress', 'none')
            }
            
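            # Heuristic check only: tiling, overviews, and a reasonable block size.
            # A full COG validator also inspects how IFDs and tiles are ordered in the file.
            is_compliant = (
                info['tiled'] and
                info['overviews'] > 0 and
                min(info['blocksize']) >= 256
            )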
            
            info['cog_compliant'] = is_compliant
            return info
    
    # Validate original
    print("Original GeoTIFF:")
    original_info = validate_cog(sample_raster)
    for key, value in original_info.items():
        print(f"  {key}: {value}")
    
    # Validate COG
    cog_info = None
    if cog_examples:
        print(f"\n{cog_examples[0]['name']}:")
        cog_info = validate_cog(cog_examples[0]['path'])
        for key, value in cog_info.items():
            print(f"  {key}: {value}")
    
    return original_info, cog_info

# Run validation
validation_results = example_cog_validation()
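
A single `cog_compliant` flag does not say which check failed. The sketch below turns the same heuristics into a list of reasons. The helper name `explain_cog_checks` is hypothetical, and it assumes an `info` dictionary with the keys produced by `validate_cog` above. The example values are made up to resemble a striped GeoTIFF.

```python
def explain_cog_checks(info, min_blocksize=256):
    """Return human-readable reasons a raster fails the heuristic checks.

    `info` is assumed to carry the keys 'tiled', 'blocksize', and 'overviews'.
    """
    problems = []
    if not info.get('tiled'):
        problems.append('raster is not internally tiled')
    if info.get('overviews', 0) == 0:
        problems.append('no overviews found')
    block_x, block_y = info.get('blocksize', (0, 0))
    if min(block_x, block_y) < min_blocksize:
        problems.append(f'block size {block_x}x{block_y} is below {min_blocksize}')
    return problems


# Assumed example values, typical of a striped (untiled) GeoTIFF
striped = {'tiled': False, 'blocksize': (500, 4), 'overviews': 0, 'compression': 'none'}
for reason in explain_cog_checks(striped):
    print(f"  - {reason}")
```

An empty list means the heuristics passed. It still does not prove the byte layout is COG-compliant.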

## 🎯 Your Implementation Task

Now implement the `create_cloud_optimized_geotiff` function in `src/advanced_rasterio_analysis.py`.

### Requirements Checklist:
- [ ] Load source raster and validate inputs
- [ ] Configure COG-compliant profile with tiling
- [ ] Apply compression settings
- [ ] Write raster with COG structure
- [ ] Build overview pyramids
- [ ] Validate COG compliance
- [ ] Return creation metadata and statistics
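
For the first checklist item, it can help to reject bad options before any file is opened. The following is a minimal sketch, not the expected solution: `check_cog_parameters` and `VALID_COMPRESSION` are hypothetical names, and the accepted compression values are taken from the function docstring. The multiple-of-16 rule reflects the GeoTIFF driver's constraint on tile sizes.

```python
VALID_COMPRESSION = {'lzw', 'deflate', 'jpeg', 'none'}


def check_cog_parameters(compression='lzw', blocksize=512):
    """Validate user-facing options before touching any files (hypothetical helper)."""
    if not isinstance(compression, str):
        raise TypeError('compression must be a string')
    compression = compression.lower()
    if compression not in VALID_COMPRESSION:
        raise ValueError(
            f"compression must be one of {sorted(VALID_COMPRESSION)}, got '{compression}'"
        )
    # Tiled GeoTIFFs need tile dimensions that are multiples of 16
    if not isinstance(blocksize, int) or blocksize <= 0 or blocksize % 16 != 0:
        raise ValueError('blocksize must be a positive multiple of 16')
    return compression, blocksize


print(check_cog_parameters('LZW', 512))
```

Returning the normalized values lets the rest of the function work with one canonical spelling of each option.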

In [None]:
# Test your implementation
import sys
sys.path.append('../src')

try:
    from advanced_rasterio_analysis import create_cloud_optimized_geotiff
    
    # Test with sample data
    temp_dir = tempfile.mkdtemp()
    test_output = os.path.join(temp_dir, 'test_cog.tif')
    
    print("Testing create_cloud_optimized_geotiff function...")
    result = create_cloud_optimized_geotiff(
        input_path=sample_raster,
        output_path=test_output,
        compression='lzw',
        blocksize=512
    )
    
    if isinstance(result, dict) and os.path.exists(test_output):
        print("✓ Test passed! Function works correctly.")
        print(f"  Created COG: {test_output}")
        if 'file_size_mb' in result:
            print(f"  File size: {result['file_size_mb']:.1f} MB")
    else:
        print("✗ Test failed! Function did not create expected output.")
        
except ImportError:
    print("Function not implemented yet. Complete implementation in src/advanced_rasterio_analysis.py")
except Exception as e:
    print(f"✗ Test failed with error: {e}")

## 🧪 Testing Your Function

Test your implementation:

```bash
cd /workspaces/your-repo
python -m pytest tests/test_advanced_rasterio_analysis.py::test_create_cloud_optimized_geotiff -v
```

## 🚀 Next Steps

After completing this function:
1. Move to Function 5: `05_stac_integration.ipynb`
2. Build on COG knowledge for cloud-native workflows

**Goal:** Master COG creation - essential for modern geospatial data workflows!