# Milgadara Case Study 2.0

### Goal:
Demonstrate the PaddockTS -> DAESIM workflow using annotated, paddock-level crop type and yield data from Milgadara.

### Inputs:
#### Time series satellite data from Sentinel-2 
- annotated paddock boundaries (.gpkg)
    - the field 'name' must be unique
- annotated paddock-year table (.csv)
    - the field 'name' must link to the paddock boundaries
    - other fields include: crop type, yield, treatments, dates (sow, harvest, observed flower, etc.)
- auto-generated paddock boundaries, produced using SAMGeo (*.csv)
- xarray object with 10 m resolution reflectance data from 2018-2024, including derived indices and vegetation fraction proportions (large file, roughly 70 GB; size not yet confirmed)
- xarray object with paddock-level summaries of time series.
    - two versions, produced using either the annotated or the auto-generated paddock boundaries.
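
The two linking rules on the annotated inputs (unique 'name' in the boundaries, and annotation rows that link back to a boundary) can be checked before any analysis. The following is a minimal sketch under stated assumptions: the small tables are made-up stand-ins for the .gpkg and .csv files, and the `year` and `crop` column names are hypothetical. Only the `name` field comes from the description above.

```python
import pandas as pd

# Hypothetical stand-ins for the two annotated inputs. In the notebook these
# would come from gpd.read_file(...) and pd.read_csv(...) instead.
paddocks = pd.DataFrame({"name": ["A1", "A2", "B1"]})
annotations = pd.DataFrame({
    "name": ["A1", "A1", "B1", "C9"],
    "year": [2019, 2020, 2019, 2019],              # hypothetical column
    "crop": ["wheat", "canola", "barley", "wheat"],  # hypothetical column
})

# 1. Paddock names must be unique in the boundary layer.
duplicated_names = paddocks.loc[paddocks["name"].duplicated(), "name"].tolist()

# 2. Every annotation row should link to a known paddock name.
unlinked = sorted(set(annotations["name"]) - set(paddocks["name"]))

# 3. Paddocks without annotations are allowed, but worth listing.
unannotated = sorted(set(paddocks["name"]) - set(annotations["name"]))

print("duplicate names:", duplicated_names)
print("annotation names with no boundary:", unlinked)
print("paddocks with no annotations:", unannotated)
```

Reporting the problem names, rather than raising on the first failure, makes it easier to fix the annotation file in one pass.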

### Outputs:
1. Map of property with manual paddocks overlaid.
2. Show a summary table of the paddock annotation data.
3. RGB and vegetation fraction time lapse with paddocks overlaid.
    - Open question: impute or not?
    - This can be hard to show in a saved ipynb because the video is a large file.
4. False-colour image showing the Fourier transform of the NDWI time series.
5. Map of SAMGeo paddocks next to manual paddocks, showing labels for each.
6. Function to create "calendar plots" given a list of paddock names for either the manual or auto-generated paddocks.
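
One way the Fourier false-colour output could be built is sketched below. This is an illustration only, not the PaddockTS implementation: the NDWI cube is synthetic, and mapping the three lowest non-zero frequencies to the R, G and B channels is an assumption made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic NDWI cube with shape (time, y, x), standing in for the real data.
n_time, ny, nx = 64, 4, 5
t = np.arange(n_time)

# Each pixel gets a cycle of period 32 steps with its own amplitude, plus noise.
amplitude = rng.uniform(0.1, 0.5, size=(ny, nx))
ndwi = (
    amplitude * np.sin(2 * np.pi * t[:, None, None] / 32)
    + 0.02 * rng.standard_normal((n_time, ny, nx))
)

# Real FFT along the time axis; drop the zero-frequency (mean) term.
spectrum = np.abs(np.fft.rfft(ndwi, axis=0))[1:]

# Assumption: the three lowest non-zero frequencies become R, G and B.
rgb = np.moveaxis(spectrum[:3], 0, -1)

# Scale each channel to [0, 1] so the array can be passed to plt.imshow(rgb).
lo = rgb.min(axis=(0, 1), keepdims=True)
hi = rgb.max(axis=(0, 1), keepdims=True)
rgb = (rgb - lo) / np.where(hi > lo, hi - lo, 1.0)

print(rgb.shape)
```

In this synthetic case the period-32 cycle falls in the second non-zero frequency bin, so the green channel tracks the per-pixel cycle amplitude. With real data, the choice of frequencies would need to match the seasonal periods of interest.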


#### Below are run settings for testrun.sh on Dec 10 2025.
- This is an incomplete version of the automated pipeline, which gets most of the way towards the automated outputs.
- By running this first, we prepared some of the (automated) processed inputs for this case study that can simply be read in for this notebook.
- A key step that was not in the pipeline at the time is building the paddock-level time series. This is currently run in final_steps_paddockTS.ipynb but will be included in the pipeline.
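
The paddock-level time series step can be pictured as collapsing every pixel inside a paddock to one value per time step. The sketch below is not the code from final_steps_paddockTS.ipynb: it assumes a rasterised paddock label grid and uses the median as the summary statistic, both of which are assumptions, and it runs on synthetic arrays.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic index cube (time, y, x) and a paddock label raster (y, x).
# Label 0 means "not in any paddock". Both are hypothetical stand-ins.
cube = rng.uniform(0, 1, size=(6, 4, 4))
labels = np.array([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 3, 3, 0],
    [0, 3, 3, 0],
])

paddock_ids = np.unique(labels[labels > 0])

# For each paddock, select its pixels (time, n_pixels) and take the median
# over pixels, ignoring NaNs such as cloud-masked observations.
paddock_ts = np.stack([
    np.nanmedian(cube[:, labels == pid], axis=1) for pid in paddock_ids
])

print(paddock_ts.shape)  # one row per paddock, one column per time step
```

The same pattern applies to either set of boundaries (manual or SAMGeo) once each is rasterised onto the data grid.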

### Setup

In [1]:
import geopandas as gpd
import pandas as pd
import numpy as np
import seaborn as sns
import rasterio
import xarray as xr
import matplotlib.pyplot as plt 
import matplotlib
import rioxarray
from shapely.geometry import mapping

import pickle
import os
import shutil



In [4]:
#stub = "MILGADARA_b03_2018-2024"
stub = "MILG_b01_2018-2024"
outdir = "/g/data/xe2/jb5097/PaddockTS_Results/"
paddocks_manual = "/g/data/xe2/John/Data/PadSeg/milg_manualpaddocks2.gpkg" # hand-drawn paddock polygons with name column that MAY match with annotation data (not all rows will have annotations)
paddock_annotations = "/g/data/xe2/John/Data/PadSeg/paddock-year-yield.csv" # paddock management annotation data (Format is to be changed to Agriweb .json in future version)

# Read in the polygons from SAMGeo (these will not necessarily match the user-provided paddocks)
pol = gpd.read_file(outdir+stub+'_filt.gpkg')
# # have to set a paddock id. Preferably do this in earlier step in future... 
# pol['paddock'] = range(1,len(pol)+1)
# pol['paddock'] = pol.paddock.astype('category')


In [10]:
# Read in the polygons from SAMGeo (these will not necessarily match the user-provided paddocks)
pol = gpd.read_file(outdir+stub+'_filt.gpkg')
# Assign a sequential paddock id, stored as a categorical. Preferably this would be done in an earlier step in future.
pol['paddock'] = range(1,len(pol)+1)
pol['paddock'] = pol.paddock.astype('category')
pol.head()

|   | area_ha | log_area_ha | perim-area | geometry | paddock |
|---|---|---|---|---|---|
| 0 | 59.5 | 1.774517 | 22.857143 | POLYGON ((14324740.000 -4133970.000, 14325190.... | 1 |
| 1 | 43.9 | 1.642465 | 26.879271 | POLYGON ((14324350.000 -4133970.000, 14324350.... | 2 |
| 2 | 112.5 | 2.051153 | 15.466667 | POLYGON ((14324090.000 -4133970.000, 14324310.... | 3 |
| 3 | 17.5 | 1.243038 | 33.142857 | POLYGON ((14325770.000 -4134180.000, 14325810.... | 4 |
| 4 | 22.8 | 1.357935 | 31.578947 | POLYGON ((14324990.000 -4134250.000, 14325010.... | 5 |
