# Spatial gapfilling

Statistics is the answer to everything

Use this notebook to gapfill a saved NetCDF file.

## Potential techniques

"Several techniques have been used to fill the gaps in either the UWLS or OI derived total vector maps.

These are implemented using covariance derived from normal mode analysis (Lipphardt et al. 2000), open-boundary modal analysis (OMA) (Kaplan and Lekien 2007), and empirical orthogonal function (EOF) analysis (Beckers and Rixen 2003; Alvera-Azcárate et al. 2005); and using idealized or smoothed observed covariance (Davis 1985)."

- normal mode analysis
- open-boundary modal analysis (OMA)
- empirical orthogonal function analysis (DINEOF)
- use idealized/smoothed observed covariance
- self-organizing maps (SOM)
- penalized least squares (DCT-PLS)

## What's implemented right now?

- low resolution oversampling
- DINEOF
- DCT-PLS
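
To make the EOF-based idea concrete, here is a minimal NumPy sketch of iterative truncated-SVD gapfilling on a `(time, space)` matrix. It is an illustration only, not the `pyplume` implementation: the function name `eof_gapfill` and its parameters are hypothetical. It also leaves out parts of a full DINEOF procedure, such as removing the mean and choosing the number of modes by cross-validation.

```python
import numpy as np


def eof_gapfill(data, n_modes=1, n_iter=200, tol=1e-8):
    """Fill NaNs in a (time, space) matrix by iterating a truncated SVD.

    Hypothetical helper for illustration; not part of pyplume.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    if not missing.any():
        return data.copy()
    # First guess: put the mean of the observed values into every gap.
    filled = np.where(missing, np.nanmean(data), data)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Reconstruct the field from the leading modes only.
        recon = (u[:, :n_modes] * s[:n_modes]) @ vt[:n_modes]
        change = np.max(np.abs(recon[missing] - filled[missing]))
        # Overwrite the gaps only; observed values are never modified.
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled


# Synthetic rank-1 field with two artificial gaps.
t = np.linspace(0, 2 * np.pi, 20)
x = np.linspace(1, 2, 8)
truth = np.outer(np.sin(t) + 2, x)
gappy = truth.copy()
gappy[3, 2] = np.nan
gappy[10, 5] = np.nan

filled = eof_gapfill(gappy, n_modes=1)
print("max abs error:", np.abs(filled - truth).max())
```

The synthetic field is exactly rank 1, so a single mode is enough to recover the gaps here. Real current fields need more modes, and picking too many tends to fit noise.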

In [None]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

In [None]:
from pathlib import Path
import numpy as np

import pyplume.utils as utils
from pyplume.dataloaders import dataset_to_fieldset, DataLoader
from pyplume import plotting, gapfilling

## Change these variables

`target_path` is the path to the data you want to gapfill.

In [None]:
target_path = "data/field_netcdfs/tj_plume_1km_2022-09.nc"
target = DataLoader(target_path).dataset

## Choose gapfilling methods to execute

The steps run on the target sequentially, in the order they are added, so each step only sees the gaps left by the steps before it.

In [None]:
gapfiller = gapfilling.Gapfiller()
# ADD GAPFILLING STEPS HERE
gapfiller.add_steps(
    # gapfilling.LowResOversample([
    #     "data/field_netcdfs/tj_plume_2km_2022-09.nc",
    #     "data/field_netcdfs/tj_plume_6km_2022-09.nc",
    # ]),
    gapfilling.DCTPLS(exclude_oob=False),
    # gapfilling.DINEOF(exclude_oob=False)
)
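
Order matters because each step works on the output of the previous one. The sketch below shows this with plain NumPy arrays. The step functions and `run_steps` are hypothetical stand-ins, not the `gapfilling.Gapfiller` API.

```python
import numpy as np


def fill_from_coarse(coarse):
    """Hypothetical step: copy values from a coarser field into the gaps."""
    def step(field):
        return np.where(np.isnan(field), coarse, field)
    return step


def fill_with_mean(field):
    """Hypothetical step: fill remaining gaps with the mean of valid values."""
    return np.where(np.isnan(field), np.nanmean(field), field)


def run_steps(field, steps):
    """Apply each step to the output of the previous one."""
    for step in steps:
        field = step(field)
    return field


target = np.array([1.0, np.nan, 3.0, np.nan])
coarse = np.array([1.5, 2.5, np.nan, np.nan])

# Coarse data first: the mean fill only handles what the coarse field lacks.
result = run_steps(target, [fill_from_coarse(coarse), fill_with_mean])
# Mean fill first: no gaps are left, so the coarse data is never used.
reversed_result = run_steps(target, [fill_with_mean, fill_from_coarse(coarse)])

print(result)
print(reversed_result)
```

Putting the most trusted source first means the later, more heavily interpolating steps touch as few points as possible.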

## Run the gapfilling steps

In [None]:
target_interped_ds = gapfiller.execute(target)

## Display interpolated field

In [None]:
timestep = 10
plotting.plot_vectorfield(target, show_time=timestep)
plotting.plot_vectorfield(target_interped_ds, show_time=timestep)
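
Alongside the plots, a quick numeric check of how many gaps remain can be useful. The helper below is a hypothetical sketch on plain arrays. With an xarray dataset you would pass in the values of a velocity variable, whose name depends on the file.

```python
import numpy as np


def nan_fraction(values):
    """Return the fraction of entries that are NaN (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    return float(np.isnan(values).mean())


# Toy stand-ins for one timestep before and after gapfilling.
before = np.array([[0.1, np.nan], [np.nan, 0.4]])
after = np.array([[0.1, 0.2], [0.3, 0.4]])

print(f"gaps before: {nan_fraction(before):.0%}, after: {nan_fraction(after):.0%}")
```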

In [None]:
fs = dataset_to_fieldset(target)
fs_interp = dataset_to_fieldset(target_interped_ds)
fs.U.show()  # original
fs_interp.U.show()  # gapfilled

## Save gapfilled data to file

In [None]:
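# Build the output path next to the input file, e.g. "<name>_interped.nc".
target_file = Path(target_path)
save_path = target_file.with_name(target_file.stem + "_interped.nc")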
target_interped_ds.to_netcdf(save_path)
print(f"saved to {save_path}")