# Xarray development notebook

**Author:** Xavier R Nogueira

**Things to explore:**
* Can we dynamically feed slices of an xarray into numba JIT functions using `xarray.apply_ufunc`?

**Performance Notes:**
* Calling a Numba-jitted function from an uncompiled Python loop is slower than calling the plain Python version.
* Parallelizing via Numba is slower at 100k iterations (for a simple equation), but slightly faster at 1M.
* JIT-compiling a wrapper AROUND a JIT-compiled function, so that the loop itself is compiled, is by far the fastest.
* Could we loop and blast through the computation, push results onto a stack of sorts, and write them into xarray asynchronously? Avoiding the write bottleneck could be fast.
* Another idea is to loop over a dictionary of function names and params: pop from the dict and add to another dict. Basically, an ordered dict would control the flow. **I like this idea, but how can we make it work for xarray?**
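
A minimal sketch of the ordered-dict idea above, using plain floats and only the standard library. The names `pipeline`, `state`, `run_pipeline`, and the two toy processes are hypothetical and not part of `clearwater_modules`. With xarray, `state` could plausibly become an `xr.Dataset` and each call could go through `xr.apply_ufunc`, but that is an untested assumption.

```python
from collections import OrderedDict


def temp_difference(air_temp_k: float, water_temp_k: float) -> float:
    """Toy process: air-water temperature difference (K)."""
    return air_temp_k - water_temp_k


def scaled_flux(coefficient: float, temp_difference: float) -> float:
    """Toy process: flux proportional to the temperature difference."""
    return coefficient * temp_difference


# Output name -> (function, names of the inputs it reads from `state`).
# Insertion order is the evaluation order.
pipeline = OrderedDict()
pipeline["temp_difference"] = (temp_difference, ("air_temp_k", "water_temp_k"))
pipeline["scaled_flux"] = (scaled_flux, ("coefficient", "temp_difference"))


def run_pipeline(pending: OrderedDict, state: dict) -> OrderedDict:
    """Pop each step in order, run it, and move it to a 'done' dict."""
    done = OrderedDict()
    while pending:
        # last=False pops the oldest entry first (FIFO).
        name, (func, arg_names) = pending.popitem(last=False)
        state[name] = func(*(state[arg] for arg in arg_names))
        done[name] = (func, arg_names)
    return done


state = {"air_temp_k": 293.0, "water_temp_k": 288.0, "coefficient": 2.0}
done = run_pipeline(pipeline, state)
print(state["scaled_flux"])
```

Each step writes its result back into `state` under its own name, so later steps can consume earlier outputs by name without any hard-coded call order.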

**Construction Notes:**
* One can use `xarray.apply_ufunc` programmatically via `*args` syntax. This means we can rapidly evaluate a time step.
* It might be faster to project all constants into xarray first. Passing a constant in via the `*args` syntax slows things down; I'm assuming that's because it gets broadcast on every call.
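
A small sketch of both construction notes, assuming a synthetic 3x4 grid and placeholder constant values (none of these numbers come from `clearwater_modules`). The local `q_sensible` repeats the arithmetic of the plain-Python version defined later in this notebook. Option A passes scalars through `*args`; Option B projects each constant onto the grid once with `xr.full_like`. Which one is faster is not measured here.

```python
import numpy as np
import xarray as xr


def q_sensible(wind_kh_kw, ri_function, cp_air, density_water,
               wind_function, air_temp_k, water_temp_k):
    """Same arithmetic as the notebook's plain-Python q_sensible."""
    return (wind_kh_kw * ri_function * cp_air * density_water
            * wind_function * (air_temp_k - water_temp_k))


# Synthetic 3x4 grid standing in for one time step of model state.
air_temp_k = xr.DataArray(
    np.linspace(280.0, 291.0, 12).reshape(3, 4), dims=("lat", "lon")
)
water_temp_k = air_temp_k - 2.0

# Placeholder constants, in the order q_sensible expects them.
constants = [1.0, 1.0, 1005.0, 1000.0, 0.5]

# Option A: pass scalars straight through the *args call.
args_scalar = [*constants, air_temp_k, water_temp_k]
result_scalar = xr.apply_ufunc(q_sensible, *args_scalar)

# Option B: project each constant onto the grid once, then reuse the arrays.
constant_grids = [xr.full_like(air_temp_k, value) for value in constants]
args_grid = [*constant_grids, air_temp_k, water_temp_k]
result_grid = xr.apply_ufunc(q_sensible, *args_grid)
```

Both calls return a `DataArray` on the same `("lat", "lon")` grid. Wrapping each in `%%time` on real data would show whether pre-projecting the constants actually helps.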

In [1]:
import numba
import xarray as xr
import clearwater_modules

In [2]:
dir(clearwater_modules)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'shared_processes',
 'tsm']

In [3]:
dir(clearwater_modules.tsm)

['EnergyBudget',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'constants',
 'processes',
 'model']

In [4]:
dir(clearwater_modules.tsm.processes)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'dTdt_sediment_c',
 'dTdt_water_c',
 'density_air',
 'emissivity_air',
 'mixing_ratio_air',
 'numba',
 'q_latent',
 'q_net',
 'q_sediment',
 'q_sensible',
 'wind_function']

In [5]:
clearwater_modules.tsm.processes.q_sensible.__annotations__

{'wind_kh_kw': float,
 'ri_function': float,
 'cp_air': float,
 'density_water': float,
 'wind_function': float,
 'air_temp_k': float,
 'water_temp_k': float,
 'return': float}

# Numba comparison

In [None]:
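from collections import OrderedDict

# Function names to evaluate, in order.
func_names = [
    'func1'
]
# Ordered mapping that will control the evaluation flow (empty for now).
o: OrderedDict = OrderedDict()
o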

In [54]:
TEST_ITERS = 1000000

In [55]:
def q_sensible(
    wind_kh_kw: float,
    ri_function: float,
    cp_air: float,
    density_water: float,
    wind_function: float,
    air_temp_k: float,
    water_temp_k: float,
) -> float:
    # TODO: check if the return units are correct
    """Sensible heat flux (W/m2).

    Args:
        wind_kh_kw: Diffusivity ratio (unitless)
        ri_function: Richardson number (unitless)
        cp_air: Specific heat of air (J/kg/K)
        density_water: Water density (kg/m^3)
        wind_function: Wind function (unitless)
        air_temp_k: Air temperature (K)
        water_temp_k: Water temperature (K)
    """
    return (
        wind_kh_kw *
        ri_function *
        cp_air * density_water * wind_function *
        (air_temp_k - water_temp_k)
    )

In [56]:
%%time
for i in range(TEST_ITERS):
    clearwater_modules.tsm.processes.q_sensible(
        float(i),
        float(i*2),
        float(i*3),
        float(i*4),
        float(i*5),
        float(i*6),
        float(i*7),
    )

CPU times: total: 1.83 s
Wall time: 1.92 s


In [57]:
%%time
for i in range(TEST_ITERS):
    q_sensible(
        float(i),
        float(i*2),
        float(i*3),
        float(i*4),
        float(i*5),
        float(i*6),
        float(i*7),
    )

CPU times: total: 1.48 s
Wall time: 1.5 s


In [33]:
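%%time
# Note: this timing includes JIT compilation, which happens on the first call.
# The return values of func() are discarded, so the compiler may optimize some work away.
@numba.njit
def iter_numba(func: callable):
    """Run the loop inside compiled code so each call avoids Python overhead."""
    for i in range(TEST_ITERS):
        func(
            float(i),
            float(i*2),
            float(i*3),
            float(i*4),
            float(i*5),
            float(i*6),
            float(i*7),
        )
iter_numba(clearwater_modules.tsm.processes.q_sensible)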

CPU times: total: 109 ms
Wall time: 138 ms


In [58]:
ds = xr.tutorial.open_dataset('air_temperature')
ds

In [59]:
ds.air

In [46]:
test_list = [
    ds.air,
    ds.air*2,
    ds.air*3,
    ds.air*4,
    ds.air*5,
    ds.air*6,
    ds.air*7,
]

In [51]:
clearwater_modules.tsm.processes.q_sensible(*[float(i) for i in range(1, 8)])

-120.0

In [53]:
%%time
for i in range(TEST_ITERS):
    xr.apply_ufunc(
        clearwater_modules.tsm.processes.q_sensible,
        *[i for i in range(1, 8)],
    )

CPU times: total: 29.6 s
Wall time: 30.6 s
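
The loop above calls `xr.apply_ufunc` once per iteration, so much of the 30 s is likely per-call overhead rather than arithmetic. The sketch below batches the iterations into a single call instead. It assumes the `i*k` argument pattern from the earlier loops and a hypothetical `iteration` dimension, with a plain-Python `q_sensible` standing in for the jitted one.

```python
import numpy as np
import xarray as xr


def q_sensible(wind_kh_kw, ri_function, cp_air, density_water,
               wind_function, air_temp_k, water_temp_k):
    """Same arithmetic as the notebook's plain-Python q_sensible."""
    return (wind_kh_kw * ri_function * cp_air * density_water
            * wind_function * (air_temp_k - water_temp_k))


n_iters = 1_000

# One array per argument; element j holds the value used at iteration j.
i = xr.DataArray(np.arange(n_iters, dtype=float), dims="iteration")
args = [i * k for k in range(1, 8)]

# A single apply_ufunc call evaluates every iteration at once.
batched = xr.apply_ufunc(q_sensible, *args)

# Reference: the same values computed one scalar call at a time.
looped = [
    q_sensible(*(float(j * k) for k in range(1, 8)))
    for j in range(n_iters)
]
```

The batched result carries the `iteration` dimension, so it can be indexed or written out like any other `DataArray`.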


In [38]:
clearwater_modules.tsm.processes.q_sensible.__annotations__

{'wind_kh_kw': float,
 'ri_function': float,
 'cp_air': float,
 'density_water': float,
 'wind_function': float,
 'air_temp_k': float,
 'water_temp_k': float,
 'return': float}

In [41]:
%%time
xr.apply_ufunc(
    clearwater_modules.tsm.processes.q_sensible,
    *test_list,
    dask='parallelized',
)

CPU times: total: 125 ms
Wall time: 107 ms


In [15]:
out