## Generalized ufuncs

We've just seen how to make our own ufuncs using `vectorize`, which hands the function one scalar element at a time. But what if we need something that can operate on an arbitrary number of elements of an input array at once?

Enter `guvectorize`.  

There are several important differences between `vectorize` and `guvectorize` that bear close examination.  Let's take a look at a few simple examples.

In [None]:
import numpy
from numba import guvectorize

In [None]:
@guvectorize('int64[:], int64, int64[:]', '(n),()->(n)')
def g(x, y, res):
    for i in range(x.shape[0]):
        res[i] = x[i] + y

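* Declaration of input/output layouts: `'(n),()->(n)'` says `x` is a 1-D array of length `n`, `y` is a scalar, and the result is a 1-D array of the same length `n`
* No return statements: the result is written into the last argument, `res`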
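
To get a feel for what a layout string like `'(n),()->(n)'` means without compiling anything, here is a minimal sketch using NumPy's own `numpy.vectorize`, whose `signature` argument accepts the same syntax. This is a stand-in, not Numba: the plain Python function returns its result instead of filling an output argument, and the names `add_scalar` and `g_np` are made up for this illustration.

```python
import numpy

def add_scalar(x, y):
    # x arrives as a 1-D array of length n, y as a scalar
    return x + y

# Same layout string as above: (n) array, () scalar -> (n) array
g_np = numpy.vectorize(add_scalar, signature='(n),()->(n)')

x2d = numpy.arange(6).reshape(2, 3)

# The core dimension n is the last axis; the leading axis is looped over
out = g_np(x2d, 5)
print(out)

# A different scalar per row: row 0 gets +10, row 1 gets +20
out2 = g_np(x2d, numpy.array([10, 20]))
print(out2)
```

The layout only describes the "core" dimensions each call sees. Any extra leading dimensions are broadcast and looped over, which is the part that distinguishes a generalized ufunc from an ordinary function on arrays.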

In [None]:
x = numpy.arange(10)

But wait!  We still call `g` as if it were defined as `def g(x, y)`

In [None]:
res = g(x, 5)
print(res)

However, if you _need_ to pass in a pre-allocated results array, you can (more on why you might want to in a bit).

In [None]:
res = numpy.zeros_like(x)
res = g(x, 5, res)
print(res)
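
Built-in NumPy ufuncs follow the same calling convention, which may make the pattern look more familiar. The sketch below uses `numpy.add` rather than `g`, on the assumption that `g` behaves like any other ufunc here: the output array is filled in place and also handed back as the return value.

```python
import numpy

x = numpy.arange(10)
res = numpy.zeros_like(x)

# Third positional argument is the output array
returned = numpy.add(x, 5, res)

print(returned is res)  # the very same array object comes back
print(res)
```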

In [None]:
@guvectorize('float64[:,:], float64[:,:], float64[:,:]',
             '(m,n),(n,p)->(m,p)')
def matmul(A, B, C):
    m, n = A.shape
    n, p = B.shape
    for i in range(m):
        for j in range(p):
            C[i, j] = 0
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
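
Matrix multiplication maps shapes `(m,n),(n,p)` to `(m,p)`, which is easy to miss when testing with square inputs only. As a hedged illustration, the sketch below uses NumPy's built-in `numpy.matmul` (itself a generalized ufunc) rather than the function defined above, with non-square inputs so that each dimension is distinguishable.

```python
import numpy

A = numpy.random.random((3, 4))   # (m, n)
B = numpy.random.random((4, 2))   # (n, p)

C = numpy.matmul(A, B)
print(C.shape)                    # (3, 2), i.e. (m, p)

# An extra leading dimension is looped over: five (3, 4) matrices times B
batch = numpy.random.random((5, 3, 4))
D = numpy.matmul(batch, B)
print(D.shape)                    # (5, 3, 2)

# The layout string NumPy itself declares for matmul
print(numpy.matmul.signature)
```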

In [None]:
a = numpy.random.random((500, 500))

In [None]:
out = matmul(a, a)

In [None]:
%timeit matmul(a, a, numpy.empty_like(a))

In [None]:
%timeit a @ a

`guvectorize` also supports the `target` keyword argument.

In [None]:
def g(x, y, res):
    for i in range(x.shape[0]):
        res[i] = x[i] + numpy.exp(y)
        
g_serial = guvectorize('float64[:], float64, float64[:]', '(n),()->(n)')(g)
g_par = guvectorize('float64[:], float64, float64[:]', '(n),()->(n)', target='parallel')(g)

In [None]:
%timeit res = g_serial(numpy.arange(1000000).reshape(1000, 1000), 3)
%timeit res = g_par(numpy.arange(1000000).reshape(1000, 1000), 3)

## [Exercise: Writing signatures](./exercises/08.GUVectorize.Exercises.ipynb#Exercise:-2D-Heat-Transfer-signature)

What's up with these boundary conditions?

```python
for i in range(I):
    Tn[i, 0] = T[i, 0]
    Tn[i, J - 1] = Tn[i, J - 2]

for j in range(J):
    Tn[0, j] = T[0, j]
    Tn[I - 1, j] = Tn[I - 2, j]
```

We don't pass in `Tn` explicitly, which means Numba allocates it for us (thanks!). But it's allocated using `numpy.empty_like`, so if the function doesn't write every value in `Tn`, those uninitialized values will stick around and cause trouble.

Solutions?  Set the boundary values explicitly, as in the snippet above, or pass `Tn` in explicitly after doing something like `Tn = Ti.copy()`.
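
Here is a small pure-NumPy sketch of the problem and of the copy-based fix. The function `interior_update` is hypothetical: it is a simple neighbor average that writes interior points only, not the exercise's actual scheme. `NaN` is used as a visible stand-in for whatever happens to sit in uninitialized memory.

```python
import numpy

def interior_update(T, Tn):
    # Hypothetical stencil: writes interior points only, never the edges
    I, J = T.shape
    for i in range(1, I - 1):
        for j in range(1, J - 1):
            Tn[i, j] = 0.25 * (T[i - 1, j] + T[i + 1, j] +
                               T[i, j - 1] + T[i, j + 1])

T = numpy.ones((4, 4))

# NaN marks cells that the update never touches
Tn_unfilled = numpy.full_like(T, numpy.nan)
interior_update(T, Tn_unfilled)
print(numpy.isnan(Tn_unfilled).sum())   # 12 edge cells left untouched

# Starting from a copy of T gives the edges sensible values
Tn_copied = T.copy()
interior_update(T, Tn_copied)
print(numpy.isnan(Tn_copied).any())     # False
```

Copying costs one extra pass over the array, while setting the boundaries inside the function keeps everything in one place. Which one fits better depends on whether the boundary values change between steps.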

## [Exercise: Remove the vanilla loops](./exercises/08.GUVectorize.Exercises.ipynb#Exercise:-2D-Heat-Transfer-Time-loop)

The example above loops in time outside of the `guvectorize`d function.  That means the time loop runs in vanilla Python, which is not the fastest thing in the world.

What to do?  Move the time loop inside the function?  `jit` the `run_ftcs` function?  I suggest you try option #1.

## Demo: Why not `jit` the `run_ftcs` function?

Because, at the moment, it won't work.  (bummer).

In [None]:
@guvectorize('float64[:,:], float64[:,:]', '(n,n)->(n,n)')
def gucopy(a, b):
    I, J = a.shape
    for i in range(I):
        for j in range(J):
            b[i, j] = a[i, j]

In [None]:
from numba import jit

In [None]:
@jit
def make_a_copy():
    a = numpy.random.random((25,25))
    b = gucopy(a)
    
    return a, b

In [None]:
a, b = make_a_copy()
assert numpy.allclose(a, b)

In [None]:
make_a_copy.inspect_types()