# Paddyfield Phenology Analysis with sits + FuseTS

This notebook demonstrates a complete workflow for extracting paddyfield phenology metrics using:
- **R sits package** for satellite data acquisition from Microsoft Planetary Computer
- **FuseTS** for time series smoothing and phenological metrics extraction

## Workflow Overview

1. **Data Extraction (R)**: Use sits to build Sentinel-2 vegetation index time series (NDVI, EVI, LSWI)
2. **Export to Python**: Convert sits data to FuseTS-compatible format
3. **Smoothing (Python)**: Apply Whittaker smoothing to remove noise
4. **Phenology Extraction (Python)**: Calculate Start of Season (SOS), End of Season (EOS), and Length of Season (LOS)
5. **Paddyfield Filtering**: Focus analysis on rice paddyfield areas
6. **Visualization**: Create maps and time series plots

## Prerequisites

### R packages:
```r
install.packages(c("sits", "sf", "terra", "dplyr"))
```

### Python packages:
```bash
pip install fusets rioxarray matplotlib pandas geopandas
```

---
# PART 1: DATA EXTRACTION WITH SITS (R)

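**Note**: The cells in this part use the `%%R` cell magic, which requires the `rpy2` Python package and running `%load_ext rpy2.ipython` once beforehand. Alternatively, run the code directly in R (or R Markdown) without the `%%R` lines. Either way, the outputs are written to disk and loaded again in Part 2.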

---

## Step 1.1: Setup and Study Area Definition (R)

In [None]:
%%R
library(sits)
library(sf)
library(terra)
library(dplyr)

# Source FuseTS helper functions
source("/home/unika_sianturi/work/FuseTS/scripts/sits_to_fusets.R")

# Define study area (example: Demak region, Indonesia)
# Option A: Load from file
study_area <- st_read("/home/unika_sianturi/work/FuseTS/data/klambu-glapan.shp.shp")

# Option B: Create from coordinates
# study_area <- st_bbox(c(xmin = 110.5, ymin = -6.95, 
#                         xmax = 110.8, ymax = -6.75), 
#                       crs = 4326) %>% st_as_sfc() %>% st_as_sf()

print(study_area)

## Step 1.2: Get Sentinel-2 Data Cube from Microsoft Planetary Computer (R)

In [None]:
%%R
# Define temporal period (example: 2024 agricultural year)
start_date <- "2024-01-01"
end_date <- "2024-12-31"

# Get Sentinel-2 cube from Microsoft Planetary Computer
s2_cube <- sits_cube(
  source = "MPC",
  collection = "SENTINEL-2-L2A",
  roi = study_area,
  start_date = start_date,
  end_date = end_date,
  bands = c("B02", "B04", "B08", "B11")  # Blue, Red, NIR, SWIR
)

print(s2_cube)

## Step 1.3: Calculate Vegetation Indices (R)

In [None]:
%%R
# Calculate NDVI and EVI for paddyfield monitoring
vi_cube <- sits_apply(s2_cube,
  NDVI = (B08 - B04) / (B08 + B04),
  EVI = 2.5 * (B08 - B04) / (B08 + 6 * B04 - 7.5 * B02 + 1),
  LSWI = (B08 - B11) / (B08 + B11)  # Land Surface Water Index (good for rice)
)

print(vi_cube)
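
For readers who want to sanity-check the index formulas outside of sits, the sketch below evaluates the same three expressions with NumPy. It assumes the band arrays hold surface reflectance scaled to 0-1 (the EVI constants are only meaningful on that scale). The function name `vegetation_indices` and the sample values are illustrative and not part of sits or FuseTS.

```python
import numpy as np

def vegetation_indices(blue, red, nir, swir):
    """Return NDVI, EVI and LSWI for reflectance arrays scaled to 0-1."""
    blue, red, nir, swir = (np.asarray(b, dtype=float) for b in (blue, red, nir, swir))
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    lswi = (nir - swir) / (nir + swir)
    return ndvi, evi, lswi

# Hypothetical reflectances for a vegetated pixel (B02, B04, B08, B11)
ndvi, evi, lswi = vegetation_indices(blue=0.05, red=0.10, nir=0.50, swir=0.20)
print(f"NDVI={float(ndvi):.3f}  EVI={float(evi):.3f}  LSWI={float(lswi):.3f}")
```

The same function works element-wise on whole arrays, so it can be applied to a stack of bands as well as to single pixels.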

## Step 1.4: Export for FuseTS Processing (R)

Choose one of two options:
- **Option A**: Point-based (faster, good for validation)
- **Option B**: Raster-based (full spatial coverage)

### Option A: Point-Based Extraction

In [None]:
%%R
# Option A1: Random sampling across study area
sample_points <- st_sample(study_area, size = 200)

# Option A2: Load existing paddyfield points
# paddy_points <- st_read("/path/to/paddyfield_training_points.shp")
# sample_points <- paddy_points

# Option A3: Sample from paddyfield classification mask
# paddy_mask <- rast("/path/to/paddy_classification_2024.tif")
# paddy_pixels <- as.points(paddy_mask, values = TRUE)
# paddy_pixels <- paddy_pixels[paddy_pixels$classification == 1, ]
# sample_points <- st_as_sf(paddy_pixels)[sample(nrow(st_as_sf(paddy_pixels)), 200), ]

# Extract time series at sample points
point_samples <- sits_get_data(vi_cube, samples = sample_points)

# Export to CSV for FuseTS
sits_to_fusets_csv(point_samples, 
                   "/home/unika_sianturi/work/FuseTS/data/paddy_points_timeseries.csv",
                   bands = c("NDVI", "EVI", "LSWI"))

print(paste("Exported", nrow(point_samples), "point time series to CSV"))

### Option B: Raster-Based Extraction (Full Spatial Coverage)

In [None]:
%%R
# Export entire cube as GeoTIFF stacks
sits_cube_to_fusets_geotiff(
  vi_cube,
  output_dir = "/home/unika_sianturi/work/FuseTS/data/sits_output",
  bands = c("NDVI", "EVI", "LSWI")
)

print("Exported GeoTIFF stacks to data/sits_output/")
print("Files created:")
list.files("/home/unika_sianturi/work/FuseTS/data/sits_output", pattern = "\\.tif$")

---
# PART 2: PHENOLOGY EXTRACTION WITH FUSETS (Python)
---

## Step 2.1: Setup Python Environment

In [None]:
import sys
sys.path.insert(0, '/home/unika_sianturi/work/FuseTS/src')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xarray as xr
import warnings
warnings.filterwarnings('ignore')

from fusets.io.sits_bridge import load_sits_csv, load_sits_geotiff
from fusets import whittaker
from fusets.analytics import phenology

print("FuseTS imported successfully!")

## Step 2.2: Load Data from sits

Choose the option that matches your R export:

### Option A: Load Point Time Series

In [None]:
# Load point-based time series
data = load_sits_csv(
    "/home/unika_sianturi/work/FuseTS/data/paddy_points_timeseries.csv",
    time_col='Index',
    band_cols=['NDVI', 'EVI', 'LSWI']
)

print(f"Loaded data dimensions: {data.dims}")
print(f"Number of time points: {len(data.t)}")
# Locations lie along the second dimension of each variable (time is the first)
print(f"Number of locations: {data['NDVI'].shape[1]}")
print(f"Variables: {list(data.data_vars)}")
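
The `(time, location)` layout that the later cells index into can be pictured with plain pandas. The sketch below pivots a long-format table into one column per location. The column names `Index`, `id` and `NDVI` and the sample rows are assumptions made for illustration; the real CSV written by `sits_to_fusets_csv` may be organised differently.

```python
import io
import pandas as pd

# Hypothetical long-format export: one row per (date, location)
csv_text = """Index,id,NDVI
2024-01-05,0,0.21
2024-01-05,1,0.25
2024-01-15,0,0.34
2024-01-15,1,0.31
2024-01-25,0,0.52
"""

long_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Index"])

# Rows become time steps, columns become locations; missing observations stay NaN
wide = long_df.pivot(index="Index", columns="id", values="NDVI")

print(wide)
print("time steps:", wide.shape[0], "| locations:", wide.shape[1])
```

Location 1 has no observation on the last date, so that cell is NaN, which is the kind of gap the smoothing step is meant to fill.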

### Option B: Load Raster Time Series Stack

In [None]:
# Load raster stack (if using Option B from R)
import rioxarray as rxr

ndvi_stack = load_sits_geotiff("/home/unika_sianturi/work/FuseTS/data/sits_output/NDVI_stack.tif")
evi_stack = load_sits_geotiff("/home/unika_sianturi/work/FuseTS/data/sits_output/EVI_stack.tif")
lswi_stack = load_sits_geotiff("/home/unika_sianturi/work/FuseTS/data/sits_output/LSWI_stack.tif")

print(f"NDVI stack shape: {ndvi_stack.shape}")
print(f"Dimensions: {ndvi_stack.dims}")
print(f"CRS: {ndvi_stack.rio.crs}")

## Step 2.3: Visualize Raw Time Series (Quality Check)

In [None]:
# For point data - visualize sample locations
if 'data' in locals():
    fig, axes = plt.subplots(3, 1, figsize=(14, 10))
    
    # Select first location for visualization
    sample_loc = 0
    
    # Plot NDVI
    axes[0].plot(data.t, data['NDVI'].isel({list(data['NDVI'].dims)[1]: sample_loc}), 
                 'o-', alpha=0.6, label='NDVI')
    axes[0].set_ylabel('NDVI')
    axes[0].set_title('Raw Vegetation Indices Time Series (Sample Location)')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # Plot EVI
    axes[1].plot(data.t, data['EVI'].isel({list(data['EVI'].dims)[1]: sample_loc}), 
                 'o-', alpha=0.6, color='green', label='EVI')
    axes[1].set_ylabel('EVI')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    # Plot LSWI (water index - important for rice)
    axes[2].plot(data.t, data['LSWI'].isel({list(data['LSWI'].dims)[1]: sample_loc}), 
                 'o-', alpha=0.6, color='blue', label='LSWI')
    axes[2].set_ylabel('LSWI')
    axes[2].set_xlabel('Date')
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('raw_timeseries_paddy.png', dpi=300, bbox_inches='tight')
    plt.show()

# For raster data - visualize spatial mean
if 'ndvi_stack' in locals():
    fig, ax = plt.subplots(figsize=(14, 5))
    ndvi_mean = ndvi_stack.mean(dim=['x', 'y'])
    ax.plot(range(len(ndvi_mean)), ndvi_mean, 'o-', alpha=0.6)
    ax.set_xlabel('Time Step')
    ax.set_ylabel('Mean NDVI')
    ax.set_title('Spatial Mean NDVI Time Series')
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.savefig('raw_ndvi_spatial_mean.png', dpi=300, bbox_inches='tight')
    plt.show()

## Step 2.4: Apply Whittaker Smoothing (Gap-filling & Noise Removal)

In [None]:
# Choose lambda parameter (smoothing strength)
# Higher lambda = more smoothing
# Typical values: 5000-20000 for 10-day composites
lambda_param = 10000

# For point data
if 'data' in locals():
    print("Smoothing point time series...")
    ndvi_smoothed = whittaker(data['NDVI'], smoothing_lambda=lambda_param)
    evi_smoothed = whittaker(data['EVI'], smoothing_lambda=lambda_param)
    lswi_smoothed = whittaker(data['LSWI'], smoothing_lambda=lambda_param)
    print("✓ Smoothing complete")

# For raster data
if 'ndvi_stack' in locals():
    print("Smoothing raster time series (this may take a few minutes)...")
    ndvi_smoothed = whittaker(ndvi_stack, smoothing_lambda=lambda_param)
    evi_smoothed = whittaker(evi_stack, smoothing_lambda=lambda_param)
    lswi_smoothed = whittaker(lswi_stack, smoothing_lambda=lambda_param)
    print("✓ Smoothing complete")
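
To build intuition for what the lambda parameter does, here is a minimal dense-matrix Whittaker smoother in NumPy. It is an illustrative sketch and not the FuseTS implementation: it assumes evenly spaced observations, treats NaN as a zero-weight gap, and uses a second-order difference penalty. The name `whittaker_sketch` and the synthetic series are made up for the example.

```python
import numpy as np

def whittaker_sketch(y, lmbd=100.0, order=2):
    """Solve (W + lmbd * D'D) z = W y, where D is a difference matrix."""
    y = np.asarray(y, dtype=float)
    w = np.isfinite(y).astype(float)           # weight 0 for gaps, 1 otherwise
    y0 = np.where(w > 0, y, 0.0)               # placeholder value under zero weight
    D = np.diff(np.eye(y.size), n=order, axis=0)
    A = np.diag(w) + lmbd * (D.T @ D)
    return np.linalg.solve(A, w * y0)

rng = np.random.default_rng(0)
t = np.arange(36)
clean = 0.3 + 0.4 * np.exp(-((t - 18) / 6.0) ** 2)
noisy = clean + rng.normal(0.0, 0.05, t.size)
noisy[[5, 6, 20]] = np.nan                     # simulated cloud gaps

for lmbd in (1.0, 100.0, 10000.0):
    z = whittaker_sketch(noisy, lmbd)
    roughness = float(np.sum(np.diff(z, n=2) ** 2))
    print(f"lambda={lmbd:>8.0f}  roughness={roughness:.6f}")
```

The printed roughness (sum of squared second differences) shrinks as lambda grows, and the gaps are filled because the penalty ties each missing value to its neighbours.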

## Step 2.5: Visualize Smoothed vs Raw Data

In [None]:
# For point data
if 'data' in locals():
    fig, ax = plt.subplots(figsize=(14, 6))
    
    sample_loc = 0
    raw_ndvi = data['NDVI'].isel({list(data['NDVI'].dims)[1]: sample_loc})
    smooth_ndvi = ndvi_smoothed.isel({list(ndvi_smoothed.dims)[1]: sample_loc})
    
    ax.plot(data.t, raw_ndvi, 'o', alpha=0.4, color='gray', label='Raw NDVI (with clouds/gaps)')
    ax.plot(data.t, smooth_ndvi, '-', linewidth=2, color='green', label='Smoothed NDVI (Whittaker)')
    
    ax.set_xlabel('Date', fontsize=12)
    ax.set_ylabel('NDVI', fontsize=12)
    ax.set_title('Whittaker Smoothing Effect on Paddyfield NDVI', fontsize=14)
    ax.legend(fontsize=11)
    ax.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('whittaker_smoothing_effect.png', dpi=300, bbox_inches='tight')
    plt.show()

## Step 2.6: Extract Phenological Metrics

Available detection methods:
- `'seasonal_amplitude'`: Best for clear seasonal patterns (recommended for rice)
- `'first_of_slope'`: Detects rapid greening/senescence
- `'median_of_slope'`: More robust to noise
- `'absolute_value'`: Uses fixed NDVI thresholds
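
As a rough picture of how an amplitude-based detector works, the sketch below places SOS and EOS where a smoothed curve crosses a fixed fraction of the amplitude on either side of the peak. It assumes a single season per series. The function name `amplitude_sos_eos`, the `factor` argument and the synthetic curve are illustrative choices, not the FuseTS implementation.

```python
import numpy as np

def amplitude_sos_eos(doy, values, factor=0.2):
    """Return (sos, eos, los) from threshold crossings around the peak."""
    doy = np.asarray(doy)
    values = np.asarray(values, dtype=float)
    peak_idx = int(np.argmax(values))
    left, right = values[:peak_idx + 1], values[peak_idx:]

    # Threshold = base level + factor * (peak - base), computed per limb
    thr_left = left.min() + factor * (values[peak_idx] - left.min())
    thr_right = right.min() + factor * (values[peak_idx] - right.min())

    # SOS: first sample on the rising limb at or above its threshold
    sos_idx = int(np.argmax(left >= thr_left))
    # EOS: last sample on the falling limb at or above its threshold
    eos_idx = peak_idx + int(np.nonzero(right >= thr_right)[0][-1])

    sos, eos = int(doy[sos_idx]), int(doy[eos_idx])
    return sos, eos, eos - sos

# Synthetic single-season curve sampled every 5 days
doy = np.arange(1, 366, 5)
ndvi = 0.2 + 0.6 * np.exp(-((doy - 180) / 40.0) ** 2)
sos, eos, los = amplitude_sos_eos(doy, ndvi, factor=0.2)
print(f"SOS={sos}  EOS={eos}  LOS={los} days")
```

Raising `factor` moves both dates towards the peak and shortens the season, which is why the threshold should be chosen with the crop and region in mind.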

In [None]:
# Extract phenology using NDVI (most common for rice)
print("Extracting phenological metrics...")

pheno_results = phenology(
    ndvi_smoothed,
    detection_method='seasonal_amplitude',  # Good for rice paddyfield
    amplitude_threshold=0.2,  # Minimum amplitude to detect a season
    # absolute_threshold=0.3,  # Optional: for 'absolute_value' method
    # slope_threshold=0.01      # Optional: for slope-based methods
)

print("✓ Phenology extraction complete")
print("\nAvailable metrics:")
print("- da_sos_times: Start of Season (day of year)")
print("- da_eos_times: End of Season (day of year)")
print("- da_los: Length of Season (days)")
print("- da_sos_values: NDVI value at SOS")
print("- da_eos_values: NDVI value at EOS")
print("- da_peak_time: Day of year of peak NDVI")
print("- da_peak_value: Peak NDVI value")
print("- da_season_amplitude: Amplitude (peak - base)")
print("- da_seasonal_integral: Time-integrated NDVI (proxy for productivity)")
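
The metrics listed above are simple to state for a single smoothed curve. The sketch below computes a peak, an amplitude and a trapezoidal integral by hand. Taking the base level as the curve minimum is one common convention and is an assumption here, as are the function name `season_summary` and the toy series; FuseTS may define these quantities differently.

```python
import numpy as np

def season_summary(doy, values):
    """Peak, amplitude and time-integrated value for one curve."""
    doy = np.asarray(doy, dtype=float)
    values = np.asarray(values, dtype=float)
    peak_idx = int(np.argmax(values))
    base = float(values.min())
    # Trapezoidal rule written out explicitly: mean of neighbours times step
    integral = float(np.sum((values[1:] + values[:-1]) / 2.0 * np.diff(doy)))
    return {
        "peak_doy": float(doy[peak_idx]),
        "peak_value": float(values[peak_idx]),
        "amplitude": float(values[peak_idx]) - base,
        "integral": integral,
    }

doy = [100, 110, 120, 130, 140]
ndvi = [0.2, 0.5, 0.8, 0.5, 0.2]
summary = season_summary(doy, ndvi)
print(summary)
```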

## Step 2.7: Analyze Phenology Results

In [None]:
# Extract phenology arrays
sos_doy = pheno_results.da_sos_times.values
eos_doy = pheno_results.da_eos_times.values
los_days = pheno_results.da_los.values
peak_ndvi = pheno_results.da_peak_value.values
amplitude = pheno_results.da_season_amplitude.values
integral = pheno_results.da_seasonal_integral.values

# For point data - create summary DataFrame
# (metric arrays may be 1-D (location,) or 2-D, so accept both)
if 'data' in locals() and sos_doy.ndim in (1, 2):
    # The number of locations is the size of the last axis
    n_locations = sos_doy.shape[-1]
    
    results_df = pd.DataFrame({
        'location_id': range(n_locations),
        'SOS_doy': sos_doy.flatten()[:n_locations],
        'EOS_doy': eos_doy.flatten()[:n_locations],
        'LOS_days': los_days.flatten()[:n_locations],
        'peak_NDVI': peak_ndvi.flatten()[:n_locations],
        'amplitude': amplitude.flatten()[:n_locations],
        'seasonal_integral': integral.flatten()[:n_locations]
    })
    
    # Drop locations where any metric is NaN (no valid season detected)
    results_df_clean = results_df.dropna()
    
    print(f"\nPhenology Summary Statistics (n={len(results_df_clean)} valid locations):")
    print("="*70)
    print(results_df_clean.describe())
    
    # Export to CSV
    results_df_clean.to_csv('paddyfield_phenology_results.csv', index=False)
    print("\n✓ Results exported to: paddyfield_phenology_results.csv")

## Step 2.8: Filter for Typical Paddyfield Phenology Patterns

In [None]:
# Filter for typical rice paddyfield characteristics
if 'results_df_clean' in locals():
    
    # Typical paddyfield criteria (adjust based on your region):
    # - Growing season length: 90-130 days
    # - Peak NDVI: > 0.6 (healthy vegetation)
    # - Amplitude: > 0.3 (clear seasonal signal)
    # - Planting typically Nov-Dec or Apr-May in Indonesia (not used in the filter below; see Step 2.13)
    
    paddyfield_mask = (
        (results_df_clean['LOS_days'] >= 90) & 
        (results_df_clean['LOS_days'] <= 130) &
        (results_df_clean['peak_NDVI'] >= 0.6) &
        (results_df_clean['amplitude'] >= 0.3)
    )
    
    # .copy() avoids pandas' SettingWithCopyWarning when columns are added later
    paddyfield_phenology = results_df_clean[paddyfield_mask].copy()
    
    print(f"\nPaddyfield Filtering Results:")
    print("="*70)
    print(f"Total locations: {len(results_df_clean)}")
    print(f"Paddyfield-like locations: {len(paddyfield_phenology)} ({len(paddyfield_phenology)/len(results_df_clean)*100:.1f}%)")
    print(f"Non-paddyfield locations: {len(results_df_clean) - len(paddyfield_phenology)}")
    
    print(f"\nPaddyfield Phenology Statistics:")
    print("="*70)
    print(paddyfield_phenology.describe())
    
    # Export paddyfield-only results
    paddyfield_phenology.to_csv('paddyfield_only_phenology.csv', index=False)
    print("\n✓ Paddyfield-only results exported to: paddyfield_only_phenology.csv")
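
Because the thresholds will differ between regions, it can help to wrap the filter in a small function whose limits are arguments. The sketch below does that on a three-row toy table. The function name `filter_paddy_like` and the sample values are hypothetical, while the column names and default thresholds mirror the cell above.

```python
import pandas as pd

def filter_paddy_like(df, los_range=(90, 130), min_peak=0.6, min_amplitude=0.3):
    """Keep rows whose season length, peak and amplitude look rice-like."""
    mask = (
        df["LOS_days"].between(*los_range)      # inclusive on both ends
        & (df["peak_NDVI"] >= min_peak)
        & (df["amplitude"] >= min_amplitude)
    )
    return df.loc[mask].copy()

toy = pd.DataFrame({
    "location_id": [0, 1, 2],
    "LOS_days": [110, 200, 95],
    "peak_NDVI": [0.75, 0.80, 0.45],
    "amplitude": [0.45, 0.50, 0.20],
})
selected = filter_paddy_like(toy)
print(selected)
```

Location 1 is rejected for its long season and location 2 for its low peak and amplitude; widening `los_range` would bring location 1 back in.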

## Step 2.9: Visualize Phenology with Time Series

In [None]:
# Visualize phenological markers on time series
if 'data' in locals() and 'paddyfield_phenology' in locals() and len(paddyfield_phenology) > 0:
    
    # Select a representative paddyfield location (first row of the filtered table)
    paddy_row = paddyfield_phenology.iloc[0]
    paddy_loc_id = int(paddy_row['location_id'])
    
    fig, axes = plt.subplots(2, 1, figsize=(15, 10))
    
    # Get time series for this location
    raw_ndvi = data['NDVI'].isel({list(data['NDVI'].dims)[1]: paddy_loc_id})
    smooth_ndvi = ndvi_smoothed.isel({list(ndvi_smoothed.dims)[1]: paddy_loc_id})
    
    # Get phenology metrics for this location
    sos = paddy_row['SOS_doy']
    eos = paddy_row['EOS_doy']
    los = paddy_row['LOS_days']
    peak = paddy_row['peak_NDVI']
    
    # Plot 1: NDVI with phenological markers
    axes[0].plot(data.t, raw_ndvi, 'o', alpha=0.3, color='gray', label='Raw NDVI')
    axes[0].plot(data.t, smooth_ndvi, '-', linewidth=2, color='darkgreen', label='Smoothed NDVI')
    
    # Mark peak NDVI with a horizontal line and report SOS/EOS/LOS in a text box
    # Note: Drawing SOS/EOS as vertical lines on the date axis would require converting DOY to dates, which needs the year
    axes[0].axhline(y=peak, color='orange', linestyle='--', alpha=0.7, label=f'Peak NDVI = {peak:.3f}')
    axes[0].text(0.02, 0.95, f'SOS (DOY): {sos:.0f}\nEOS (DOY): {eos:.0f}\nLOS: {los:.0f} days', 
                transform=axes[0].transAxes, fontsize=11, verticalalignment='top',
                bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
    
    axes[0].set_ylabel('NDVI', fontsize=12)
    axes[0].set_title(f'Paddyfield Phenology - Location {int(paddy_loc_id)}', fontsize=14, fontweight='bold')
    axes[0].legend(loc='upper right', fontsize=11)
    axes[0].grid(True, alpha=0.3)
    
    # Plot 2: Multi-index comparison (NDVI, EVI, LSWI)
    smooth_evi = evi_smoothed.isel({list(evi_smoothed.dims)[1]: int(paddy_loc_id)})
    smooth_lswi = lswi_smoothed.isel({list(lswi_smoothed.dims)[1]: int(paddy_loc_id)})
    
    axes[1].plot(data.t, smooth_ndvi, '-', linewidth=2, label='NDVI', color='green')
    axes[1].plot(data.t, smooth_evi, '-', linewidth=2, label='EVI', color='darkgreen')
    axes[1].plot(data.t, smooth_lswi, '-', linewidth=2, label='LSWI (water)', color='blue')
    
    axes[1].set_xlabel('Date', fontsize=12)
    axes[1].set_ylabel('Index Value', fontsize=12)
    axes[1].set_title('Multi-Index Comparison (Smoothed)', fontsize=13)
    axes[1].legend(fontsize=11)
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('paddyfield_phenology_timeseries.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print("\n✓ Phenology time series plot saved: paddyfield_phenology_timeseries.png")
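
If SOS and EOS are expressed as day of year, as the annotation above assumes, they can be turned into calendar dates once the year is known, which is what drawing vertical markers on a date axis would require. A minimal standard-library sketch follows; the helper name `doy_to_date` is hypothetical.

```python
from datetime import date, timedelta

def doy_to_date(year, doy):
    """Convert a 1-based day of year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=int(round(doy)) - 1)

# 2024 is a leap year, so DOY 60 falls on 29 February
print(doy_to_date(2024, 60))
print(doy_to_date(2024, 366))
```

Because the study period here is 2024, a leap year, DOY values run up to 366 rather than 365.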

## Step 2.10: Create Phenology Distribution Plots

In [None]:
# Histogram distributions of phenology metrics
if 'paddyfield_phenology' in locals():
    
    fig, axes = plt.subplots(2, 3, figsize=(16, 10))
    
    # SOS distribution
    axes[0, 0].hist(paddyfield_phenology['SOS_doy'], bins=30, color='green', alpha=0.7, edgecolor='black')
    axes[0, 0].set_xlabel('Day of Year')
    axes[0, 0].set_ylabel('Frequency')
    axes[0, 0].set_title('Start of Season (Planting Date) Distribution')
    axes[0, 0].axvline(paddyfield_phenology['SOS_doy'].median(), color='red', linestyle='--', 
                       label=f"Median: {paddyfield_phenology['SOS_doy'].median():.0f}")
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # EOS distribution
    axes[0, 1].hist(paddyfield_phenology['EOS_doy'], bins=30, color='orange', alpha=0.7, edgecolor='black')
    axes[0, 1].set_xlabel('Day of Year')
    axes[0, 1].set_ylabel('Frequency')
    axes[0, 1].set_title('End of Season (Harvest Date) Distribution')
    axes[0, 1].axvline(paddyfield_phenology['EOS_doy'].median(), color='red', linestyle='--',
                       label=f"Median: {paddyfield_phenology['EOS_doy'].median():.0f}")
    axes[0, 1].legend()
    axes[0, 1].grid(True, alpha=0.3)
    
    # LOS distribution
    axes[0, 2].hist(paddyfield_phenology['LOS_days'], bins=30, color='blue', alpha=0.7, edgecolor='black')
    axes[0, 2].set_xlabel('Days')
    axes[0, 2].set_ylabel('Frequency')
    axes[0, 2].set_title('Length of Season Distribution')
    axes[0, 2].axvline(paddyfield_phenology['LOS_days'].median(), color='red', linestyle='--',
                       label=f"Median: {paddyfield_phenology['LOS_days'].median():.0f} days")
    axes[0, 2].legend()
    axes[0, 2].grid(True, alpha=0.3)
    
    # Peak NDVI distribution
    axes[1, 0].hist(paddyfield_phenology['peak_NDVI'], bins=30, color='darkgreen', alpha=0.7, edgecolor='black')
    axes[1, 0].set_xlabel('NDVI')
    axes[1, 0].set_ylabel('Frequency')
    axes[1, 0].set_title('Peak NDVI Distribution')
    axes[1, 0].axvline(paddyfield_phenology['peak_NDVI'].median(), color='red', linestyle='--',
                       label=f"Median: {paddyfield_phenology['peak_NDVI'].median():.3f}")
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    # Amplitude distribution
    axes[1, 1].hist(paddyfield_phenology['amplitude'], bins=30, color='purple', alpha=0.7, edgecolor='black')
    axes[1, 1].set_xlabel('NDVI Amplitude')
    axes[1, 1].set_ylabel('Frequency')
    axes[1, 1].set_title('Season Amplitude Distribution')
    axes[1, 1].axvline(paddyfield_phenology['amplitude'].median(), color='red', linestyle='--',
                       label=f"Median: {paddyfield_phenology['amplitude'].median():.3f}")
    axes[1, 1].legend()
    axes[1, 1].grid(True, alpha=0.3)
    
    # Seasonal integral (productivity proxy)
    axes[1, 2].hist(paddyfield_phenology['seasonal_integral'], bins=30, color='brown', alpha=0.7, edgecolor='black')
    axes[1, 2].set_xlabel('Integrated NDVI')
    axes[1, 2].set_ylabel('Frequency')
    axes[1, 2].set_title('Seasonal Integral (Productivity Proxy)')
    axes[1, 2].axvline(paddyfield_phenology['seasonal_integral'].median(), color='red', linestyle='--',
                       label=f"Median: {paddyfield_phenology['seasonal_integral'].median():.1f}")
    axes[1, 2].legend()
    axes[1, 2].grid(True, alpha=0.3)
    
    plt.suptitle('Paddyfield Phenology Metrics Distributions', fontsize=16, fontweight='bold', y=1.00)
    plt.tight_layout()
    plt.savefig('paddyfield_phenology_distributions.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print("\n✓ Phenology distributions plot saved: paddyfield_phenology_distributions.png")

## Step 2.11: Export Phenology Maps (For Raster Data)

In [None]:
# Export phenology maps as GeoTIFF (if using raster workflow)
if 'ndvi_stack' in locals():
    
    print("Exporting phenology maps...")
    
    # Export individual metrics
    pheno_results.da_sos_times.rio.to_raster('paddy_SOS_map.tif', compress='lzw')
    pheno_results.da_eos_times.rio.to_raster('paddy_EOS_map.tif', compress='lzw')
    pheno_results.da_los.rio.to_raster('paddy_LOS_map.tif', compress='lzw')
    pheno_results.da_peak_value.rio.to_raster('paddy_peak_NDVI_map.tif', compress='lzw')
    pheno_results.da_season_amplitude.rio.to_raster('paddy_amplitude_map.tif', compress='lzw')
    pheno_results.da_seasonal_integral.rio.to_raster('paddy_productivity_proxy_map.tif', compress='lzw')
    
    print("\n✓ Phenology maps exported:")
    print("  - paddy_SOS_map.tif (Planting date)")
    print("  - paddy_EOS_map.tif (Harvest date)")
    print("  - paddy_LOS_map.tif (Growing season length)")
    print("  - paddy_peak_NDVI_map.tif (Peak vegetation)")
    print("  - paddy_amplitude_map.tif (Seasonal amplitude)")
    print("  - paddy_productivity_proxy_map.tif (Productivity proxy)")
    
    # Optional: Apply paddyfield mask if you have one
    # paddy_mask = rxr.open_rasterio("/path/to/paddy_classification.tif")
    # sos_paddy_only = pheno_results.da_sos_times.where(paddy_mask == 1)
    # sos_paddy_only.rio.to_raster('paddy_SOS_masked.tif')

## Step 2.12: Visualize Phenology Maps (For Raster Data)

In [None]:
# Create map visualizations (if using raster workflow)
if 'ndvi_stack' in locals():
    
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    # SOS map
    im1 = axes[0, 0].imshow(pheno_results.da_sos_times, cmap='RdYlGn', vmin=1, vmax=365)
    axes[0, 0].set_title('Start of Season (DOY)', fontsize=13, fontweight='bold')
    axes[0, 0].axis('off')
    plt.colorbar(im1, ax=axes[0, 0], fraction=0.046, pad=0.04)
    
    # EOS map
    im2 = axes[0, 1].imshow(pheno_results.da_eos_times, cmap='RdYlGn_r', vmin=1, vmax=365)
    axes[0, 1].set_title('End of Season (DOY)', fontsize=13, fontweight='bold')
    axes[0, 1].axis('off')
    plt.colorbar(im2, ax=axes[0, 1], fraction=0.046, pad=0.04)
    
    # LOS map
    im3 = axes[0, 2].imshow(pheno_results.da_los, cmap='viridis', vmin=0, vmax=200)
    axes[0, 2].set_title('Length of Season (Days)', fontsize=13, fontweight='bold')
    axes[0, 2].axis('off')
    plt.colorbar(im3, ax=axes[0, 2], fraction=0.046, pad=0.04)
    
    # Peak NDVI map
    im4 = axes[1, 0].imshow(pheno_results.da_peak_value, cmap='YlGn', vmin=0, vmax=1)
    axes[1, 0].set_title('Peak NDVI Value', fontsize=13, fontweight='bold')
    axes[1, 0].axis('off')
    plt.colorbar(im4, ax=axes[1, 0], fraction=0.046, pad=0.04)
    
    # Amplitude map
    im5 = axes[1, 1].imshow(pheno_results.da_season_amplitude, cmap='plasma', vmin=0, vmax=1)
    axes[1, 1].set_title('Season Amplitude', fontsize=13, fontweight='bold')
    axes[1, 1].axis('off')
    plt.colorbar(im5, ax=axes[1, 1], fraction=0.046, pad=0.04)
    
    # Productivity proxy map
    im6 = axes[1, 2].imshow(pheno_results.da_seasonal_integral, cmap='copper_r')
    axes[1, 2].set_title('Seasonal Integral (Productivity)', fontsize=13, fontweight='bold')
    axes[1, 2].axis('off')
    plt.colorbar(im6, ax=axes[1, 2], fraction=0.046, pad=0.04)
    
    plt.suptitle('Paddyfield Phenology Maps', fontsize=16, fontweight='bold', y=0.98)
    plt.tight_layout()
    plt.savefig('paddyfield_phenology_maps.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print("\n✓ Phenology maps visualization saved: paddyfield_phenology_maps.png")

## Step 2.13: Identify Planting Seasons (Indonesian Context)

In [None]:
# Classify into Indonesian planting seasons
if 'paddyfield_phenology' in locals():
    
    def classify_planting_season(sos_doy):
        """
        Classify planting season based on SOS day of year.
        Indonesian rice calendar (labels are approximate month ranges):
        - Season 1 (Musim Tanam 1): Nov-Dec, extending into January (DOY 305-366 or 1-31)
        - Season 2 (Musim Tanam 2): Apr-May (DOY 90-150)
        - Season 3 (Musim Tanam 3): Jul-Aug (DOY 180-240)
        """
        if (sos_doy >= 305) or (sos_doy <= 31):
            return 'Season 1 (Nov-Dec)'
        elif (sos_doy >= 90) and (sos_doy <= 150):
            return 'Season 2 (Apr-May)'
        elif (sos_doy >= 180) and (sos_doy <= 240):
            return 'Season 3 (Jul-Aug)'
        else:
            return 'Off-season'
    
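    # Use assign() so the new column is added to a fresh DataFrame rather than a slice
    paddyfield_phenology = paddyfield_phenology.assign(
        planting_season=paddyfield_phenology['SOS_doy'].apply(classify_planting_season)
    )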
    
    # Count by season
    season_counts = paddyfield_phenology['planting_season'].value_counts()
    
    print("\nPlanting Season Distribution:")
    print("="*70)
    print(season_counts)
    print()
    
    # Visualize
    fig, ax = plt.subplots(figsize=(10, 6))
    season_counts.plot(kind='bar', ax=ax, color=['#2ecc71', '#3498db', '#e74c3c', '#95a5a6'], 
                       edgecolor='black', linewidth=1.5)
    ax.set_xlabel('Planting Season', fontsize=12)
    ax.set_ylabel('Number of Paddyfields', fontsize=12)
    ax.set_title('Paddyfield Distribution by Planting Season', fontsize=14, fontweight='bold')
    ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha='right')
    ax.grid(True, alpha=0.3, axis='y')
    
    # Add value labels on bars
    for i, v in enumerate(season_counts.values):
        ax.text(i, v + 0.5, str(v), ha='center', va='bottom', fontweight='bold')
    
    plt.tight_layout()
    plt.savefig('planting_season_distribution.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    # Export with season classification
    paddyfield_phenology.to_csv('paddyfield_phenology_with_seasons.csv', index=False)
    print("\n✓ Results with season classification exported to: paddyfield_phenology_with_seasons.csv")
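
When adapting the season windows to another region, a table of DOY ranges is easier to edit than a chain of `if` statements. The sketch below reproduces the same three windows from a dictionary and handles the window that wraps across the new year. The names `SEASON_WINDOWS` and `classify_by_window` are hypothetical.

```python
# (start_doy, end_doy) per season; start > end means the window wraps over 31 December
SEASON_WINDOWS = {
    "Season 1 (Nov-Dec)": (305, 31),
    "Season 2 (Apr-May)": (90, 150),
    "Season 3 (Jul-Aug)": (180, 240),
}

def classify_by_window(sos_doy, windows=SEASON_WINDOWS):
    """Return the label of the first window containing sos_doy."""
    for label, (start, end) in windows.items():
        if start <= end:
            inside = start <= sos_doy <= end
        else:
            inside = sos_doy >= start or sos_doy <= end
        if inside:
            return label
    return "Off-season"

for doy in (15, 120, 200, 270, 340):
    print(doy, "->", classify_by_window(doy))
```

Changing the local calendar then only requires editing the dictionary, and the function can be passed to `Series.apply` in the same way as `classify_planting_season`.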

---
## Summary and Next Steps

### What We Accomplished:

1. ✅ **Data Extraction**: Used sits to download Sentinel-2 data from Microsoft Planetary Computer
2. ✅ **Vegetation Indices**: Calculated NDVI, EVI, and LSWI for paddyfield monitoring
3. ✅ **Time Series Smoothing**: Applied Whittaker smoothing to remove noise and fill gaps
4. ✅ **Phenology Extraction**: Calculated Start of Season (planting), End of Season (harvest), and other metrics
5. ✅ **Paddyfield Filtering**: Identified locations with typical rice phenology patterns
6. ✅ **Season Classification**: Categorized by Indonesian planting seasons
7. ✅ **Visualization**: Created comprehensive plots and maps

### Outputs Generated:

- `paddyfield_phenology_results.csv`: Full phenology metrics for all locations
- `paddyfield_only_phenology.csv`: Filtered for rice-like phenology patterns
- `paddyfield_phenology_with_seasons.csv`: Includes planting season classification
- `paddyfield_phenology_timeseries.png`: Time series with phenological markers
- `paddyfield_phenology_distributions.png`: Distribution plots for all metrics
- `planting_season_distribution.png`: Seasonal analysis
- `paddy_*_map.tif`: Phenology maps (if using raster workflow)

### Next Steps:

1. **Validation**: Compare SOS/EOS dates with ground truth planting/harvest records
2. **Multi-temporal Analysis**: Run this for multiple years to detect trends
3. **Yield Modeling**: Use phenology metrics as features for yield prediction models
4. **Integration with S1**: Combine with Sentinel-1 SAR data using MOGPR fusion
5. **Operational Monitoring**: Set up automated processing for near-real-time monitoring

### Customization Options:

- **Lambda parameter** (Step 2.4): Adjust smoothing strength (try 5000-20000)
- **Detection method** (Step 2.6): Try `'first_of_slope'` or `'median_of_slope'` for different patterns
- **Filtering criteria** (Step 2.8): Adjust LOS, peak NDVI, amplitude thresholds for your region
- **Season definitions** (Step 2.13): Modify DOY ranges to match local planting calendar

---

**For questions or issues, refer to:**
- FuseTS documentation: `/home/unika_sianturi/work/FuseTS/README.md`
- sits + FuseTS workflow guide: `/home/unika_sianturi/work/FuseTS/SITS_FUSETS_WORKFLOW.md`
- sits documentation: https://e-sensing.github.io/sitsbook/
