# User Defined Functions

Python-Blosc2 offers a powerful way to operate on NDArray objects (and other flavors). In this section, we will see how to do computations on NDArray arrays using functions that we define ourselves (aka user-defined functions, or UDFs).


In [1]:
import numba as nb
import numpy as np

import blosc2

## A simple example
First, let's create an NDArray that we will use as the input for computing and filling another one.

In [2]:
shape = (5_000, 10_000)
a = blosc2.linspace(0, 1, np.prod(shape), dtype=np.float32, shape=shape)

Now, let's define our function. It is executed for each chunk and always receives three parameters. The first is the tuple of inputs, which may contain any operand such as an NDArray, a NumPy array or a Python scalar. The second is the output buffer to be filled, and the third is the offset of the chunk being filled, that is, where the chunk starts inside the array.

In [3]:
def add_one(inputs_tuple, output, offset):
    x = inputs_tuple[0]
    output[:] = x**3 + np.sin(x) + 1

As you can see, and despite its name, this function does more than add one: it takes the first input `x`, computes `x**3 + np.sin(x) + 1` and saves the result in `output`.
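To make the calling convention more concrete, here is a minimal NumPy-only sketch of how a function with this signature could be driven chunk by chunk. The `apply_chunked` driver and the `fill_row_index` UDF are hypothetical helpers written for illustration; they are not part of Python-Blosc2, which does its own (multidimensional, multithreaded) chunking. The sketch also assumes that `offset` is a tuple holding the start index of the chunk in each dimension.

```python
import numpy as np


def add_one(inputs_tuple, output, offset):
    x = inputs_tuple[0]
    output[:] = x**3 + np.sin(x) + 1


def fill_row_index(inputs_tuple, output, offset):
    # Hypothetical UDF: uses the offset to write the global row number.
    rows = np.arange(output.shape[0]) + offset[0]
    output[:] = rows[:, None]


def apply_chunked(udf, inputs, shape, dtype, chunk_rows):
    """Hypothetical driver: call `udf` once per block of `chunk_rows` rows."""
    result = np.empty(shape, dtype=dtype)
    for start in range(0, shape[0], chunk_rows):
        stop = min(start + chunk_rows, shape[0])
        # Slice array operands to the current chunk; pass scalars unchanged.
        chunk_inputs = tuple(
            inp[start:stop] if isinstance(inp, np.ndarray) else inp
            for inp in inputs
        )
        # Assumed offset layout: start index of the chunk in each dimension.
        offset = (start,) + (0,) * (len(shape) - 1)
        udf(chunk_inputs, result[start:stop], offset)
    return result


small_shape = (50, 100)
a_np = np.linspace(0, 1, int(np.prod(small_shape)), dtype=np.float32)
a_np = a_np.reshape(small_shape)

out = apply_chunked(add_one, (a_np,), small_shape, a_np.dtype, chunk_rows=10)
row_index = apply_chunked(fill_row_index, (), small_shape, np.int64, chunk_rows=10)
```

The point of the sketch is that the UDF never allocates or returns anything: it only writes into the `output` view it is handed, and `offset` is the only way for it to know where that view sits in the full array.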

Now, to actually create a `LazyUDF` object (which also follows the [LazyArray interface](https://www.blosc.org/python-blosc2/reference/lazyarray.html)) we will use its constructor `lazyudf`.

In [4]:
larray = blosc2.lazyudf(add_one, (a,), a.dtype)
print(f"Class: {type(larray)}")

Class: <class 'blosc2.lazyexpr.LazyUDF'>


Next, to execute your function and get the result, you can choose between the `__getitem__` and `compute` methods.
The main difference is that the first one returns the computed result as a NumPy array, whereas the second one returns an NDArray. Let's see `__getitem__` first.

In [5]:
%%time
npc = larray[:]
print(f"Type: {type(npc)}")

Type: <class 'numpy.ndarray'>
CPU times: user 351 ms, sys: 92.3 ms, total: 443 ms
Wall time: 396 ms



Now, let's use `compute` for the same purpose. The advantage of this method is that you can pass construction parameters for the resulting NDArray, such as `urlpath` to store the resulting array on disk.

In [6]:
%%time
c = larray.compute(urlpath="larray.b2nd", mode="w")
print(f"Type: {type(c)}")
print(c.info)

Type: <class 'blosc2.ndarray.NDArray'>
type    : NDArray
shape   : (5000, 10000)
chunks  : (100, 10000)
blocks  : (1, 10000)
dtype   : float32
cratio  : 21.12
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=4,
        : nthreads=7, blocksize=40000, splitmode=<SplitMode.AUTO_SPLIT: 3>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=7)

CPU times: user 519 ms, sys: 90.7 ms, total: 610 ms
Wall time: 422 ms


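Note that, in the version used here, `LazyUDF.save` does not accept the path as a positional argument, so the following call raises a `TypeError`:

In [10]: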
larray.save("test.b2nd")

TypeError: LazyUDF.save() takes 1 positional argument but 2 were given

## Using Numba
Let's see how Python-Blosc2 can use a Numba-compiled function as a UDF. For this, let's decorate the same function with Numba.

In [7]:
@nb.jit(nopython=True, parallel=True)
def add_one_numba(inputs_tuple, output, offset):
    x = inputs_tuple[0]
    output[:] = x**3 + np.sin(x) + 1

In [8]:
larray2 = blosc2.lazyudf(add_one_numba, (a,), a.dtype)

Cool! We have created our first Numba UDF. Now, let's evaluate it.

In [9]:
%%time
npc2 = larray2[:]

CPU times: user 569 ms, sys: 110 ms, total: 679 ms
Wall time: 519 ms


Incidentally, the pure Python version was faster than the Numba one in this run (396 ms vs. 519 ms wall time). This is because Numba has large initialization overheads (the function is compiled the first time it is called) and the function is quite simple. For more complex functions, or larger arrays, the difference should be less noticeable, or turn in Numba's favor.
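The one-off cost can be observed outside of Python-Blosc2 by timing the first and second calls of a jitted function separately. The sketch below is an illustration under a few assumptions: it calls the function directly on a NumPy array instead of going through `lazyudf`, it drops `parallel=True`, and it falls back to the plain Python function if Numba is not installed so that it still runs. The name `poly_sin_numba` is hypothetical.

```python
import time

import numpy as np

try:
    import numba as nb

    jit = nb.jit(nopython=True)
except ImportError:
    # Fallback so the sketch still runs without Numba (no compilation then).
    def jit(func):
        return func


@jit
def poly_sin_numba(inputs_tuple, output, offset):
    x = inputs_tuple[0]
    output[:] = x**3 + np.sin(x) + 1


x = np.linspace(0, 1, 200_000).reshape(200, 1000)
out = np.empty_like(x)

# The first call includes the compilation when Numba is available.
t0 = time.perf_counter()
poly_sin_numba((x,), out, (0, 0))
first_call = time.perf_counter() - t0

# The second call reuses the already compiled code.
t0 = time.perf_counter()
poly_sin_numba((x,), out, (0, 0))
second_call = time.perf_counter() - t0

print(f"first call:  {first_call * 1e3:.2f} ms")
print(f"second call: {second_call * 1e3:.2f} ms")
```

If the first call dominates, calling the jitted function once on a small input before timing the real evaluation gives a fairer comparison with the pure Python version.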

## Summary

In this section, we have seen how to execute a user-defined function and get the result as either a NumPy array or an NDArray. We have also seen how to use a Numba-compiled function as a UDF.