# Dev Notebook: Running Process Functions

**Author:** Xavier Nogueira

**Problem:** As of October 11th, 2023, we use `xarray.apply_ufunc` to map equations onto our `xarray.Dataset` when iterating over timesteps. This works for math-only equations, where operators combine whole arrays. It does not work for if/else decision-tree logic, because a Python `if` cannot branch on an array.

**Solution:** Explore ways to support both kinds of equation in this notebook, while optimizing for model timestep execution performance.
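To make the problem concrete, here is a minimal sketch using plain NumPy arrays. The function names `scalar_rule` and `array_rule` and the cutoff of `1` are made up for illustration and are not part of the model code. A Python `if` needs a single boolean, so it raises on a multi-element array, while `np.where` evaluates the condition element-wise.

```python
import numpy as np


def scalar_rule(x):
    # Plain if/else: only valid when `x` is a single value
    if x > 1:
        return 1
    return 0


def array_rule(x):
    # Element-wise equivalent of the branch above
    return np.where(x > 1, 1, 0)


values = np.array([0.5, 1.5, 2.5])

try:
    scalar_rule(values)
    raised = False
except ValueError:
    # NumPy refuses to reduce a multi-element boolean array to one bool
    raised = True

result = array_rule(values)
print(raised, result)
```

The rest of the notebook compares different ways of writing the `array_rule` style of function.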

In [1]:
import xarray as xr
import numpy as np
import numba

# Create mock data

## Add variables to the air temp xarray

In [2]:
air_ds = xr.tutorial.open_dataset('air_temperature')
air_ds['air2'] = air_ds.air * 2
air_ds['air3'] = air_ds.air * 3
air_ds = air_ds.isel(time=-1)
air_ds

In [3]:
# under the hood this is what xr.where does
# it contains added logic to handle attrs
# it also uses numpy OR some other array library depending on whether
# the dataset has an "__array_namespace__" class attribute
xr.apply_ufunc(
    np.where,
    air_ds > 1,
    1,
    0,
)

In [4]:
air_ds.air

# Test functions out 

**Findings:**
* Using the simple equation of "mock 1", pre-vectorizing with `np.vectorize` and JIT-compiling with numba produce similar speeds. The difference in the runs below (3.59 ms ± 562 µs for numba vs. 3.21 ms ± 109 µs for `np.vectorize`) is within noise. JIT will likely pull ahead as iterations increase, so **numba is still the better choice**.
* For if/else logic, one can either use `xr.where` or loop through all indices and write to a fresh numpy array.
* With `xr.where`, DO NOT wrap the function in `np.vectorize` (it was roughly 8x slower in the runs below: 29.3 ms vs. 3.45 ms), and JIT compilation does not work. Without either, `xr.where` is still reasonably fast.
* The explicit loop can be JIT-compiled, and it **seems slightly faster** (3.37 ms vs. 3.45 ms, which is within noise). That said, the logic is more complex.
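The `np.vectorize` slowdown noted above can be reproduced outside of xarray. The sketch below is not the notebook's `mock_where_vec`; it is a simplified stand-in that isolates the per-element call overhead, since `np.vectorize` is essentially a Python-level loop provided for convenience rather than performance. The names `branch_where` and `branch_vectorized`, the `THRESHOLD` value, and the random input grid are all assumptions made for this example.

```python
import timeit

import numpy as np

THRESHOLD = 10.0  # hypothetical cutoff, chosen only for this example


def branch_where(a, b, c):
    # Whole-array arithmetic followed by an element-wise branch
    return np.where(a * 0.01 * b * c > THRESHOLD, 1, 0)


@np.vectorize
def branch_vectorized(a, b, c):
    # Called once per element from a Python-level loop
    return 1 if a * 0.01 * b * c > THRESHOLD else 0


# Arbitrary small 2D grid of random values
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 20.0, size=(25, 53))
b, c = a * 2, a * 3

t_where = timeit.timeit(lambda: branch_where(a, b, c), number=100)
t_vec = timeit.timeit(lambda: branch_vectorized(a, b, c), number=100)
print(f"np.where: {t_where:.4f} s, np.vectorize: {t_vec:.4f} s (100 calls each)")

# Both versions should agree on every cell
same = np.array_equal(branch_where(a, b, c), branch_vectorized(a, b, c))
```

Absolute timings depend on the machine and array size, so only the relative gap between the two is meaningful.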

In [7]:
input_list: list[xr.DataArray] = [air_ds.air, air_ds.air2, air_ds.air3]

In [5]:
@numba.njit
def mock_1(air, air2, air3):
    return air * 0.01 * air2 * air3

@np.vectorize
def mock_1_vec(air, air2, air3):
    return air * 0.01 * air2 * air3

## Compare basic arithmetic

In [8]:
%%timeit
air_ds['mock_1_numba'] = xr.apply_ufunc(
    mock_1,
    *input_list
)
air_ds['mock_1_numba']

3.59 ms ± 562 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [9]:
%%timeit
air_ds['mock_1_np'] = xr.apply_ufunc(
    mock_1_vec,
    *input_list
)
air_ds['mock_1_np']

3.21 ms ± 109 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [10]:
air_ds['mock_1_np'].quantile(q=0.5)

## Compare if/else

`xr.where`:
* Pro: Xarray native, requires no data copying.
* Pro: Quite fast.
* Con: Creates ugly nested where statements.

`np.select`:
* Pro: Cleaner logic implementation. Data is mapped to a list of conditions. No nesting.
* Con: We have to copy the data out and write the result back to Xarray.

`numba` with iteration:
* Pro: The fastest option in these tests (3.37 ms vs. 3.45 ms for `xr.where`).
* Con: Complicated logic; boilerplate index iteration will be repeated everywhere.
* Con: The speedup is too small to justify the added complexity.
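The nesting trade-off between the `where` and `select` styles only shows up once there is more than one branch. As a minimal sketch with made-up temperature values and made-up cutoffs (neither comes from the model), the same three-way classification can be written both ways:

```python
import numpy as np

temps = np.array([250.0, 273.15, 285.0, 300.0])  # made-up values in kelvin

# Nested form: each extra branch adds another level of nesting
nested = np.where(temps < 273.15, 0, np.where(temps < 290.0, 1, 2))

# Flat form: conditions are checked in order and the first match wins
selected = np.select(
    condlist=[temps < 273.15, temps < 290.0],
    choicelist=[0, 1],
    default=2,
)

print(nested, selected)
```

Because `np.select` takes the first matching condition, the order of `condlist` plays the role that nesting depth plays in the `where` version.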

**Notes:**
* Dask also has a `select` function: https://docs.dask.org/en/stable/generated/dask.array.select.html. Can we abstract away the array type so the same code works with either dask or numpy under the hood?
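One possible way to abstract away the array type is to pick the `select` implementation from the module that defines the array. This is only a sketch under stated assumptions: the helpers `select_for` and `threshold_flag` are hypothetical names, checking the top-level module name is assumed to be a good enough test, and the dask branch is not exercised here.

```python
import numpy as np


def select_for(array):
    """Return the `select` function of the library backing `array`."""
    # Assumption: the defining module's top-level name identifies the backend
    top_level = type(array).__module__.split(".")[0]
    if top_level == "dask":
        # Imported lazily so numpy-only environments do not need dask
        import dask.array as da

        return da.select
    return np.select


def threshold_flag(combo, threshold):
    """Flag cells above `threshold` using whichever backend holds `combo`."""
    select = select_for(combo)
    return select([combo > threshold], [1], default=0)


flags = threshold_flag(np.array([1.0, 5.0, 9.0]), threshold=4.0)
print(flags)
```

With this shape, a process function like `mock_where_select` would call `select_for(combo)` instead of naming `np.select` directly.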

## If/else logic functions

In [24]:
def mock_where(air, air2, air3):
    combo = air * 0.01 * air2 * air3
    return xr.where(combo > 1243597, 1, 0)

@np.vectorize
def mock_where_vec(air, air2, air3):
    combo = air * 0.01 * air2 * air3
    return xr.where(combo > 1243597, 1, 0)

@numba.njit
def mock_where_loop(air, air2, air3):
    result = np.zeros_like(air)  # Create an array of zeros with the same shape as 'air'
    for i in range(air.shape[0]):
        for j in range(air.shape[1]):
            combo = mock_1(air[i, j], air2[i, j], air3[i, j])
            if combo > 1243597:
                result[i, j] = 1
    return result

def mock_where_select(air, air2, air3):
    combo = air * 0.01 * air2 * air3
    return np.select(
        condlist=[combo > 1243597],
        choicelist=[1],
    )

In [14]:
%%timeit
air_ds['mock_where'] = xr.apply_ufunc(
    mock_where,
    *input_list
)
air_ds['mock_where']

3.45 ms ± 31.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [16]:
%%timeit
air_ds['mock_where_vec'] = xr.apply_ufunc(
    mock_where_vec,
    *input_list
)
air_ds['mock_where_vec']

29.3 ms ± 2.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [17]:
%%timeit
air_ds['mock_where_loop'] = xr.apply_ufunc(
    mock_where_loop,
    *input_list
)
air_ds['mock_where_loop']

3.37 ms ± 387 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [25]:
%%time
air_ds['mock_where_select'] = xr.apply_ufunc(
    mock_where_select,
    *input_list
)
air_ds['mock_where_select']

CPU times: total: 15.6 ms
Wall time: 4 ms


In [31]:
# compare every cell; checking only the max difference would miss negative differences
assert np.array_equal(air_ds.mock_where_select.values, air_ds.mock_where.values)