# Moving window operations

Objective: apply a moving-window algorithm along the lat and lon axes, evaluating a kernel function on each window at every step of the time axis.

In [21]:
versioninfo()
using Pkg; Pkg.activate("."); Pkg.instantiate()
using Dates, Zarr, EarthDataLab, YAXArrays, Statistics

Julia Version 1.9.1
Commit 147bdf428cd (2023-06-07 08:27 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M1 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 20 on 8 virtual cores
Environment:
  JULIA_NUM_THREADS = 20


[32m[1m  Activating[22m[39m project at `~/BGI/DeepCube/DaCuMoWin`


In [22]:
# Create a dummy array
a = Array{Union{Float64,Missing}}(rand(40,20,10))
lon = RangeAxis("Lon",1:40)
lat = RangeAxis("Lat",1:20)
tim = RangeAxis("Time",1:10)
c = YAXArray([lon,lat,tim],a)

YAXArray with the following dimensions
Lon                 Axis with 40 Elements from 1 to 40
Lat                 Axis with 20 Elements from 1 to 20
Time                Axis with 10 Elements from 1 to 10
Total size: 62.5 KB


In [23]:
# access remote data cube

rc = Cube(open_dataset(zopen("https://storage.de.cloud.ovh.net/v1/AUTH_84d6da8e37fe4bb5aea18902da8c1170/uc3/UC3SubCube_ts.zarr", fill_as_missing=false)))


YAXArray with the following dimensions
x                   Axis with 1253 Elements from 18.70211622302644 to 28.899269063907617
y                   Axis with 983 Elements from 42.299194879353976 to 34.30110854569159
time                Axis with 4560 Elements from 2009-03-06T10:00:00 to 2021-08-29T10:00:00
Variable            Axis with 5 elements: avg_rh ignition_points burned_areas lst_day evi 
units: unitless
Total size: 104.62 GB


In [24]:
# subset cube: 1 time step
sc1d = subsetcube(rc, x = (20.9,21), y = (36.9,37), time = Date(2021,1,1), variable = "evi")

YAXArray with the following dimensions
x                   Axis with 13 Elements from 20.901182730245225 to 20.99891901945495
y                   Axis with 12 Elements from 36.99700118972647 to 36.90740959128422
units: unitless
Total size: 624.0 bytes


In [32]:
# subset cube: 3 time steps
sc3d = subsetcube(rc, x = (20.9,21), y = (36.9,37), time = (Date(2021,1,1), Date(2021,1,4)), variable = "evi")

YAXArray with the following dimensions
x                   Axis with 13 Elements from 20.901182730245225 to 20.99891901945495
y                   Axis with 12 Elements from 36.99700118972647 to 36.90740959128422
time                Axis with 3 Elements from 2021-01-01T10:00:00 to 2021-01-03T10:00:00
units: unitless
Total size: 1.83 KB


In [39]:
# subset cube: 104 time steps, 1 degree
sc10d1deg = subsetcube(rc, x = (20,21), y = (36,37), time = (Date(2021,1,1), Date(2021,4,15)), variable = "evi")

YAXArray with the following dimensions
x                   Axis with 123 Elements from 20.005266745822755 to 20.99891901945495
y                   Axis with 123 Elements from 36.99700118972647 to 36.003348916094275
time                Axis with 104 Elements from 2021-01-01T10:00:00 to 2021-04-14T10:00:00
units: unitless
Total size: 6.0 MB


## Naive approach

In [27]:
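function my_mapwindow(f, img, window)
    # Output has the same element type and axes as the input
    out = zeros(eltype(img), axes(img))
    R = CartesianIndices(img)
    I_first, I_last = first(R), last(R)
    # Half-width of the window along each dimension
    Δ = CartesianIndex(ntuple(d->window[d] ÷ 2, ndims(img)))
    @inbounds @simd for I in R
        # Clip the window to the array bounds at the edges
        patch = max(I_first, I-Δ):min(I_last, I+Δ)
        out[I] = f(view(img, patch))
    end
    return out
end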

my_mapwindow (generic function with 1 method)
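
To make the edge behaviour of the naive approach concrete, here is a small worked example on a 3x3 in-memory array. The function is repeated (without `@inbounds @simd`) only so that the snippet runs on its own; the test array `img` is an assumption made for illustration.

```julia
using Statistics

# Same logic as my_mapwindow above, repeated so the snippet is self-contained.
function my_mapwindow(f, img, window)
    out = zeros(eltype(img), axes(img))
    R = CartesianIndices(img)
    I_first, I_last = first(R), last(R)
    Δ = CartesianIndex(ntuple(d -> window[d] ÷ 2, ndims(img)))
    for I in R
        patch = max(I_first, I - Δ):min(I_last, I + Δ)
        out[I] = f(view(img, patch))
    end
    return out
end

# Illustrative input: [1 4 7; 2 5 8; 3 6 9]
img = collect(reshape(1.0:9.0, 3, 3))
res = my_mapwindow(mean, img, (3, 3))

res[2, 2]   # centre cell: mean over the full 3x3 window
res[1, 1]   # corner cell: mean over the clipped 2x2 patch [1 4; 2 5]
```

At the borders the window is simply truncated, so corner cells average 4 values and edge cells 6, instead of 9.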

In [28]:
@time my_mapwindow(mean, c.data, (3,3,3));

  0.039044 seconds (213.24 k allocations: 13.328 MiB, 88.98% compilation time)


In [30]:
@time my_mapwindow(mean, sc1d.data, (3,3,1));

 28.356848 seconds (265.61 k allocations: 299.142 MiB, 0.15% gc time)


In [33]:
@time my_mapwindow(mean, sc3d.data, (3,3,3));

 90.293991 seconds (1.81 M allocations: 954.776 MiB, 0.21% gc time, 0.32% compilation time)


## With YAXArrays functions mapCube and MovingWindow

`MovingWindow(desc, pre, after)`

Constructs a `MovingWindow` object to be passed to an `InDims` constructor to define that the axis in `desc` shall participate in the inner function (i.e. shall be looped over), but inside the inner function `pre` values before and `after` values after the center value will be passed as well.

For example passing `MovingWindow("Time", 2, 0)` will loop over the time axis and always pass the current time step plus the 2 previous steps. So in the inner function the array will have an additional dimension of size 3. 
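
The `pre`/`after` semantics can be sketched on a plain vector. The helper `window_at` below is hypothetical and not part of YAXArrays; it only mimics what the inner function would receive for one position, and filling out-of-range positions with `missing` is an assumption made for the illustration.

```julia
# Hypothetical helper (not a YAXArrays function): returns the values around
# index i of v, with `pre` values before and `after` values after the centre.
# Positions outside the vector are filled with `oob`.
function window_at(v::AbstractVector, i::Integer, pre::Integer, after::Integer; oob=missing)
    return [checkbounds(Bool, v, j) ? v[j] : oob for j in (i - pre):(i + after)]
end

series = [10.0, 20.0, 30.0, 40.0]

w3 = window_at(series, 3, 2, 0)   # current step plus the 2 previous steps
w1 = window_at(series, 1, 2, 0)   # first step: the 2 previous steps do not exist
```

With `pre = 2` and `after = 0` every window has length 3, which corresponds to the additional dimension of size 3 mentioned above.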

In [34]:
# Define the input dimensions: lon and lat each with one preceding and one following slice.
# Time is not listed in InDims, so mapCube loops over it.
indims = InDims(MovingWindow("Lon",1,1), MovingWindow("Lat",1,1))

@time r1 = mapCube(c, indims=indims, outdims=OutDims()) do xout,xin
    # Inside this function, xin has size 3x3 (lon x lat)
    # xout holds a single value; write into it rather than rebinding the name
    xout[] = mean(xin)
end

  0.203130 seconds (347.68 k allocations: 17.332 MiB, 221.86% compilation time)


YAXArray with the following dimensions
Lon                 Axis with 40 Elements from 1 to 40
Lat                 Axis with 20 Elements from 1 to 20
Time                Axis with 10 Elements from 1 to 10
Total size: 62.5 KB


When applying a function on moving windows with `mapCube`, take care of possible **missing values** and of what happens at the **edges of the cube**.

Missing values need to be dealt with in the function called by `mapCube`. For example, wrapping the input in `skipmissing` ignores them.

Edges are handled with the keyword argument `window_oob_value` of `InDims`: if one of the input dimensions is a `MovingWindow`, this value is used to fill out-of-bounds areas. Note that `skipmissing` does not remove this fill value, so a numeric sentinel such as -9999.0 still enters the statistic at the edges unless it is filtered out as well.
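
The interaction between missing values and a numeric out-of-bounds fill can be sketched without a cube. The 3x3 window below is made up for illustration and assumes a corner cell whose out-of-bounds neighbours were filled with -9999.0; it is not output from `mapCube`.

```julia
using Statistics

# Assumed contents of a 3x3 window at a corner of the cube:
# out-of-bounds cells carry the sentinel, one valid cell is missing.
oob = -9999.0
win = Union{Float64,Missing}[oob oob     oob;
                             oob 0.2     0.4;
                             oob missing 0.6]

# skipmissing drops `missing` but keeps the sentinel, which dominates the mean.
biased = mean(skipmissing(win))

# Filtering the sentinel as well leaves only the valid cells 0.2, 0.4 and 0.6.
clean = mean(Iterators.filter(x -> !ismissing(x) && x != oob, win))
```

Whether to filter the sentinel inside the inner function or to pick a different fill value depends on the statistic being computed; the point here is only that the two kinds of invalid cells need separate handling.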

In [35]:
# Define the input dimensions: x and y each with one preceding and one following slice.
# Set out-of-bounds values to -9999.0
indims = InDims(MovingWindow("x",1,1), MovingWindow("y",1,1), window_oob_value = -9999.0)

@time r1 = mapCube(sc1d, indims=indims) do xout,xin
    # Inside this function, xin has size 3x3 (x by y); sc1d has no time axis
    # xout holds a single value
    xout[:] = mapslices(x->mean(skipmissing(x)), xin; dims = (1,2))
end

Progress: 100%|█████████████████████████████████████████| Time: 0:00:01


  2.069208 seconds (662.83 k allocations: 42.706 MiB, 0.90% gc time, 119.14% compilation time)


YAXArray with the following dimensions
x                   Axis with 13 Elements from 20.901182730245225 to 20.99891901945495
y                   Axis with 12 Elements from 36.99700118972647 to 36.90740959128422
Total size: 624.0 bytes


Trying on larger subsets:

- with 3 time steps

In [36]:
indims = InDims(MovingWindow("time",1,1),  MovingWindow("x",1,1), MovingWindow("y",1,1), window_oob_value = -9999.0)

@time r10 = mapCube(sc3d, indims=indims, outdims=OutDims()) do xout,xin
    # Inside this function, xin has size 3x3x3 (time x x x y window)
    # xout holds a single value
    xout[] = mean(skipmissing(xin))
end

Progress: 100%|█████████████████████████████████████████| Time: 0:00:00


  1.369130 seconds (2.32 M allocations: 146.202 MiB, 1.94% gc time, 266.34% compilation time)


YAXArray with the following dimensions
x                   Axis with 13 Elements from 20.901182730245225 to 20.99891901945495
y                   Axis with 12 Elements from 36.99700118972647 to 36.90740959128422
time                Axis with 3 Elements from 2021-01-01T10:00:00 to 2021-01-03T10:00:00
Total size: 1.83 KB


- with 104 time steps and roughly 100 times more spatial grid cells

In [40]:
@time r101 = mapCube(sc10d1deg, indims=indims, outdims=OutDims()) do xout,xin
    # Inside this function, xin has size 3x3x3 (time x x x y window)
    # xout holds a single value
    xout[] = mean(skipmissing(xin))
end

Progress: 100%|█████████████████████████████████████████| Time: 0:00:02


  2.248003 seconds (25.52 M allocations: 1.353 GiB, 1.47% gc time, 23.19% compilation time)


YAXArray with the following dimensions
x                   Axis with 123 Elements from 20.005266745822755 to 20.99891901945495
y                   Axis with 123 Elements from 36.99700118972647 to 36.003348916094275
time                Axis with 104 Elements from 2021-01-01T10:00:00 to 2021-04-14T10:00:00
Total size: 6.0 MB
