# Monthly SST Processing

**GOAL:**  
This notebook processes raw sea surface temperature (SST) data from both **observations and the NEMO model** into monthly-mean datasets with consistent spatial resolution and time coverage (2011-2022). The processed data serve as input for quantitative analyses such as **PCA**, used to evaluate model performance and identify climate patterns.

---

# Data Processing

| Property               | Observations and model                   |
|------------------------|------------------------------------------|
| **Content**            | `sst`                                    |
| **Resolution**         | `1.0°`                                   |
| **Mask**               | `Yes (NaN)`                              | 
| **Time**               | `2011-01 to 2022-12 (monthly)`           |
| **Dimensions**         | `time`, `lat`, `lon`                     |
| **Dimension lon/lat**  | `1D`                                     |
| **Name time**          | `time`                                   |
| **Longitude range**    | `0.50° to 359.50°` (west to east)        |
| **Latitude range**     | `89.50° to -89.50°` (north to south)     |
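
The target layout in the table can be written down as explicit coordinate arrays, which may help when sanity-checking the processed files. The following is a minimal sketch using only NumPy; the helper name `target_axes` is hypothetical and not part of this notebook.

```python
import numpy as np

def target_axes():
    """Hypothetical helper: coordinate axes implied by the table above."""
    lon = np.arange(0.5, 360.0, 1.0)     # 0.5 ... 359.5, west to east
    lat = np.arange(89.5, -90.0, -1.0)   # 89.5 ... -89.5, north to south
    # Month-start timestamps for 2011-01 through 2022-12
    time = np.arange('2011-01', '2023-01', dtype='datetime64[M]')
    return time, lat, lon

time, lat, lon = target_axes()
print(time.size, lat.size, lon.size)
```

The three sizes printed are the expected lengths of the `time`, `lat` and `lon` dimensions of both output files.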

## Imports

In [1]:
import xarray as xr
import numpy as np
import scipy.ndimage
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import regionmask
import contextlib, os, sys
import cartopy.crs as ccrs
from utils.data.general import show_coverage_mask_model, fill_coastal_points_in_time

## Processing

In [2]:
observations = xr.open_dataset('../obs/sst.mnmean.nc')
model = xr.open_dataset('../model/nemo00_1m_201001_202212_grid_T.nc')
mesh = xr.open_dataset('../model/orca05l75_domain_cfg_nemov5_10m.nc')

### Extracting SST

In [3]:
model = model['tos']
observations = observations['sst']

### Renaming

In [4]:
model = model.rename({
    'nav_lon': 'lon',
    'nav_lat': 'lat',
    'time_counter': 'time'
})

times = model.time.values.astype('datetime64[M]')  # truncate to month start
model['time'] = times
model = model.assign_coords(lon=mesh['glamt'], lat=mesh['gphit'])
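
The cast to `datetime64[M]` above is what aligns the model time stamps with month starts. A small NumPy-only illustration follows; the two example stamps are assumptions chosen for the demo (monthly means dated mid-month), not values read from the model file.

```python
import numpy as np

# Assumed example stamps: monthly means dated at mid-month
stamps = np.array(['2011-01-16T12:00:00', '2011-02-15T00:00:00'],
                  dtype='datetime64[ns]')

# Casting to month resolution drops the day and the time of day
months = stamps.astype('datetime64[M]')
print(months)
```

After this step, model and observation time axes can be compared month by month, provided the observations are also stamped at month start.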

### Selecting 2011 to 2022

In [5]:
observations = observations.sel(time=slice("2011-01", "2022-12"))
model = model.sel(time=slice("2011-01", "2022-12"))

### Filling coastal points

In [6]:
land_mask = mesh['bathy_metry'] == 0
model = model.where(~land_mask)
filled_model = fill_coastal_points_in_time(model, 20)
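
The helper `fill_coastal_points_in_time` comes from the project's `utils` package and its implementation is not shown in this notebook. One common way to do this kind of fill is to grow valid ocean values into neighbouring NaN cells, one ring of cells per pass. The sketch below is a hypothetical stand-in for a single 2D time step, not the helper itself. It assumes that the second argument (`20` above) is a number of passes, which is a guess, and it ignores longitude wrap-around.

```python
import numpy as np

def fill_nan_from_neighbours(field, n_iter):
    """Hypothetical sketch: grow valid values into NaN cells, one ring per pass."""
    filled = np.array(field, dtype=float)  # work on a copy
    for _ in range(n_iter):
        missing = np.isnan(filled)
        if not missing.any():
            break
        # Pad with NaN so edge cells simply have fewer neighbours
        padded = np.pad(filled, 1, constant_values=np.nan)
        # Stack the four direct neighbours (up, down, left, right)
        neighbours = np.stack([
            padded[:-2, 1:-1],
            padded[2:, 1:-1],
            padded[1:-1, :-2],
            padded[1:-1, 2:],
        ])
        valid = ~np.isnan(neighbours)
        count = valid.sum(axis=0)
        total = np.where(valid, neighbours, 0.0).sum(axis=0)
        # Fill only cells that are missing and have at least one valid neighbour
        can_fill = missing & (count > 0)
        filled[can_fill] = total[can_fill] / count[can_fill]
    return filled

demo = np.array([[10.0, np.nan, np.nan]])
one_pass = fill_nan_from_neighbours(demo, 1)
two_passes = fill_nan_from_neighbours(demo, 2)
```

Each pass extends the valid region by one cell, so the number of passes bounds how far values spread onto land.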

### Regridding (interpolation): Model -> Observations

In [7]:
import xesmf as xe

# Regular 1° target grid, defined by its cell centres
target_grid = xr.Dataset({
    'lon': (['lon'], np.arange(-179.5, 179.5 + 1)),  # -179.5 ... 179.5
    'lat': (['lat'], np.arange(-89.5, 89.5 + 1))     # -89.5 ... 89.5
})

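weights_path = '../weights/weights_bilinear_monthly_sst.nc'

# Silence console output while the regridder is built
with open(os.devnull, 'w') as f, contextlib.redirect_stdout(f):
    regridder = xe.Regridder(
        filled_model, target_grid,
        method='bilinear',
        filename=weights_path,
        reuse_weights=os.path.exists(weights_path),  # reuse only if the weight file already exists
        ignore_degenerate=True,
        periodic=True
    )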

model_regridded = regridder(filled_model)
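
One plausible reason for filling coastal points before regridding is that linear interpolation propagates missing values: a target point whose stencil touches a NaN source cell becomes NaN itself. xESMF's bilinear method is two-dimensional and weight-based, so the 1D `np.interp` example below is only an analogue of that behaviour, with made-up values.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])
sst = np.array([np.nan, 15.0, 16.0])   # first point is masked (land)

# Interpolating between a NaN and a valid value yields NaN
near_coast = np.interp(0.5, x, sst)
# Interpolating between two valid values works as usual
open_ocean = np.interp(1.5, x, sst)

print(near_coast, open_ocean)
```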

### Longitude to 0-360, latitude from north to south, and dimension ordering

In [8]:
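# Shift longitudes from [-180, 180) to [0, 360), then restore ascending order
model_regridded['lon'] = (model_regridded['lon'] + 360) % 360
model_regridded = model_regridded.sortby('lon')
# Order latitudes from north to south
model_regridded = model_regridded.sortby('lat', ascending=False)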
model_regridded = model_regridded.transpose('time', 'lat', 'lon')
observations = observations.transpose('time', 'lat', 'lon')
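
The coordinate changes in this cell can be reproduced on plain arrays, which makes it easier to see that the data must be reordered together with the coordinates (`sortby` handles this in xarray). This is a minimal NumPy sketch on a synthetic field; all variable names are illustrative.

```python
import numpy as np

lon = np.arange(-179.5, 179.5 + 1)        # -179.5 ... 179.5
lat = np.arange(-89.5, 89.5 + 1)          # -89.5 ... 89.5
field = np.arange(lat.size * lon.size, dtype=float).reshape(lat.size, lon.size)

lon360 = (lon + 360) % 360                # 180.5 ... 359.5, then 0.5 ... 179.5
order = np.argsort(lon360)                # indices that restore ascending order
lon_sorted = lon360[order]
field_sorted = field[:, order]            # reorder the data along lon as well

lat_desc = lat[::-1]                      # 89.5 ... -89.5
field_final = field_sorted[::-1, :]       # flip the data along lat to match

print(lon_sorted[0], lon_sorted[-1], lat_desc[0], lat_desc[-1])
```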

### Masking the continents

In [9]:
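# The regionmask land mask holds a region number over land and NaN elsewhere,
# so its NaN cells are the ocean points to keep
land_mask = regionmask.defined_regions.natural_earth_v5_0_0.land_110.mask(observations)
ocean_mask = land_mask.isnull()
observations = observations.where(ocean_mask)
model_regridded = model_regridded.where(ocean_mask)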
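
The masking logic can be checked on a toy grid. The sketch below mimics the regionmask convention with a hand-written array (0 over land, NaN elsewhere) and uses `np.where` in place of xarray's `.where`; the 2x2 grid and its values are made up for illustration.

```python
import numpy as np

# Assumed toy land mask in regionmask style: 0 over land, NaN elsewhere
land_mask = np.array([[0.0, np.nan],
                      [np.nan, np.nan]])
ocean_mask = np.isnan(land_mask)

# Two time steps on the same 2x2 grid
sst = np.array([[[20.0, 21.0], [22.0, 23.0]],
                [[24.0, 25.0], [26.0, 27.0]]])

# The 2D mask broadcasts over the leading time axis
sst_ocean = np.where(ocean_mask, sst, np.nan)
print(sst_ocean)
```

Because the same `ocean_mask` is applied to both datasets, observations and model end up with identical NaN patterns over the continents.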

### Naming coordinates and attributes

In [10]:
model_regridded.coords['lon'].attrs.update({
    'long_name': 'Longitude',
    'units': 'degrees_east',
    'standard_name': 'longitude'
})

model_regridded.coords['lat'].attrs.update({
    'long_name': 'Latitude',
    'units': 'degrees_north',
    'standard_name': 'latitude'
})

model_regridded.attrs.update({
    'long_name': 'Monthly Mean of Sea Surface Temperature',
    'units': 'degC',
    'standard_name': 'sea_surface_temperature'
})

### Saving to file

In [13]:
model_regridded.to_netcdf('../processed/nemo00_sst_monthly_2011_2022.nc', mode='w')
observations.to_netcdf('../processed/observations_sst_monthly_2011_2022.nc', mode='w')