## Grad on Vectors and Matrices

In real-life machine learning projects, you're going to work with vectors and matrices far more often than with single numbers. In fact, modern GPUs have dedicated hardware that makes matrix and vector computations lightning fast, hence the appeal of expressing everything as matrix computations.

So let's get started.

In [1]:
import jax as J
import jax.numpy as jnp

Let's start with a vector-valued function and find its derivative.

$h(x) = \begin{bmatrix}
9x^{2}\ln(x)
\\ 
10x
\end{bmatrix}$

In [2]:
def h(x):
    return jnp.array(
        [
            [9 * x**2 * jnp.log(x)],
            [10 * x]
        ]
    )

In [3]:
h(1.)

DeviceArray([[ 0.],
             [10.]], dtype=float32)

In [4]:
h(1.).shape

(2, 1)

In [5]:
d_h = J.grad(h)
d_h(1.)

TypeError: Gradient only defined for scalar-output functions. Output had shape: (2, 1).

Why isn't this working? It's because grad is only defined for functions that return a scalar, that is, a single plain number rather than a vector, list or matrix. Our h returns an array of shape (2, 1), so we can't run grad on it directly.
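To see the scalar restriction in action, here is a small sketch that runs grad on one entry of h at a time, so that each function handed to grad returns a single number. The helper names `h_first` and `h_second` are made up for this illustration and are not part of JAX.

```python
import jax as J
import jax.numpy as jnp

def h(x):
    return jnp.array([[9 * x**2 * jnp.log(x)], [10 * x]])

# Hypothetical helpers: each one picks a single entry of h,
# so its output is a scalar and grad accepts it.
def h_first(x):
    return h(x)[0, 0]

def h_second(x):
    return h(x)[1, 0]

d_first = J.grad(h_first)(1.)    # derivative of 9x^2 ln(x) at x = 1
d_second = J.grad(h_second)(1.)  # derivative of 10x at x = 1
print(d_first, d_second)
```

This works, but it needs one grad call per entry, which gets tedious quickly for bigger outputs.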

## Jacobian

An easy workaround is to get the Jacobian of h.

The Jacobian of a vector-valued function is a matrix containing all of its first-order partial derivatives. Sounds complicated? You can find more about the Jacobian matrix [here](https://youtu.be/bohL918kXQk).

Note: The matrix of second-order derivatives of a scalar-valued function is known as the Hessian.

In [6]:
jc = J.jacobian(h)
jc(1.)

DeviceArray([[ 9.],
             [10.]], dtype=float32, weak_type=True)

Now let's verify this thing row by row. For the first row, 

$\frac{d}{dx}(9x^{2}\ln(x)) = 9x + 18x\ln(x)$

Since $\ln(1) = 0$, this becomes $9$ for $x = 1$.

For the second row, 

$\frac{d}{dx}(10x) = 10$

Which will stay $10$ no matter what $x$ is.
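The same check can be done in code for more than one value of x. The sketch below compares the output of jacobian with the hand-derived rows above; `h_jacobian_by_hand` is a hypothetical helper written just for this comparison, and the test points and tolerance are arbitrary choices.

```python
import jax as J
import jax.numpy as jnp

def h(x):
    return jnp.array([[9 * x**2 * jnp.log(x)], [10 * x]])

# Hypothetical helper: the two derivatives worked out by hand above
def h_jacobian_by_hand(x):
    return jnp.array([[9 * x + 18 * x * jnp.log(x)], [10.0]])

jc = J.jacobian(h)

# Compare automatic and hand-derived results at a few positive points
# (x must be positive because of the logarithm)
checks = [
    bool(jnp.allclose(jc(x), h_jacobian_by_hand(x), rtol=1e-4))
    for x in (0.5, 1.0, 2.0, 3.0)
]
print(checks)
```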

## Vector to Vector

Now let's make this interesting. Let's take the derivative of a vector with respect to another vector.


$$
x = [x_1, x_2]
$$

$$
g(x) = \begin{bmatrix}
e^{-8x_{1} + 6} \\ 
\frac{1}{3}e^{10x_{2}} \\ 
3x_{1} \cdot 9x_{2}
\end{bmatrix}
$$

In [7]:
def g(x):
    return jnp.array(
        [
            # exp is e
            # https://numpy.org/doc/stable/reference/generated/numpy.exp.html
            [jnp.exp(-8 * x[0] + 6)],
            [(1. / 3.) * jnp.exp(10 * x[1])],
            [3 * x[0] * 9 * x[1]]
        ]
    )

In [8]:
d_g = J.jacobian(g)

x = jnp.array([1., 2.])
d_g(x)

DeviceArray([[[-1.0826823e+00,  0.0000000e+00]],

             [[ 0.0000000e+00,  1.6172174e+09]],

             [[ 5.4000000e+01,  2.7000000e+01]]], dtype=float32)

![doge](../images/doge.jpg)

## Explaim Hooman!

Okay, let's do an "explaim"

For the first row, again, we take the partial derivative with respect to each element of x:

$\frac{\partial}{\partial x_{1}}(e^{-8x_{1} + 6}) = -8e^{-8x_{1} + 6}$

$\frac{\partial}{\partial x_{2}}(e^{-8x_{1} + 6}) = 0$

Second row, 

$\frac{\partial}{\partial x_{1}}(\frac{1}{3}e^{10x_{2}}) = 0$

$\frac{\partial}{\partial x_{2}}(\frac{1}{3}e^{10x_{2}}) = \frac{10}{3}e^{10x_{2}}$

Third row,

$\frac{\partial}{\partial x_{1}}(3x_{1} \cdot 9x_{2}) = 27x_{2}$

$\frac{\partial}{\partial x_{2}}(3x_{1} \cdot 9x_{2}) = 27x_{1}$


Since the input is a vector with 2 elements, x1 and x2, and g has 3 rows, the final result is a 3 by 2 matrix that looks like this:

$$
\begin{bmatrix}
-8e^{-8x_{1} + 6} & 0 \\ 
0 & \frac{10}{3}e^{10x_{2}} \\ 
27x_{2} &  27x_{1}
\end{bmatrix}
$$

The first column contains the derivatives with respect to x1 and the second one those with respect to x2. Plugging in x1 = 1 and x2 = 2 gives the numbers JAX printed above.
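Here is a rough sketch of that verification in code. One thing to keep in mind: because g returns an array of shape (3, 1), the result of jacobian has shape (3, 1, 2), so it is reshaped to (3, 2) before comparing. `g_jacobian_by_hand` is a hypothetical helper that simply encodes the matrix derived by hand, and the tolerance is an arbitrary choice.

```python
import jax as J
import jax.numpy as jnp

def g(x):
    return jnp.array([
        [jnp.exp(-8 * x[0] + 6)],
        [(1. / 3.) * jnp.exp(10 * x[1])],
        [3 * x[0] * 9 * x[1]],
    ])

# Hypothetical helper: the hand-derived 3 x 2 Jacobian
def g_jacobian_by_hand(x):
    return jnp.array([
        [-8 * jnp.exp(-8 * x[0] + 6), 0.],
        [0., (10. / 3.) * jnp.exp(10 * x[1])],
        [27 * x[1], 27 * x[0]],
    ])

x = jnp.array([1., 2.])
auto = J.jacobian(g)(x)        # output shape (3, 1) + input shape (2,)
auto_2d = auto.reshape(3, 2)   # drop the extra axis of length 1

matches = bool(jnp.allclose(auto_2d, g_jacobian_by_hand(x), rtol=1e-4))
print(auto.shape, matches)
```

Writing g so that it returns a flat vector of shape (3,) instead would give a (3, 2) Jacobian directly, with no reshape needed.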

## Matrix?

You can use jacobian() on matrices as well. The procedure is the same. (Verifying the derivatives of large matrices by hand is not so simple, though!)
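As a small illustration, the sketch below takes the Jacobian of a matrix-vector product with respect to the matrix. The function `f`, the fixed vector `v` and the matrix `W` are all made up for this example. Since the output has shape (2,) and the input has shape (2, 3), the Jacobian has shape (2, 2, 3), and for this simple case the expected entries can still be written down: the derivative of output i with respect to W[j, k] is v[k] when i equals j, and 0 otherwise.

```python
import jax as J
import jax.numpy as jnp

# Made-up example: a fixed vector and a function of a matrix W
v = jnp.array([1., 2., 3.])

def f(W):
    return W @ v                      # shape (2,) for a (2, 3) input

W = jnp.arange(6.).reshape(2, 3)      # any float (2, 3) matrix will do
jac = J.jacobian(f)(W)                # output shape (2,) + input shape (2, 3)

# Expected entries: jac[i, j, k] = v[k] if i == j else 0
expected = jnp.einsum('ij,k->ijk', jnp.eye(2), v)
ok = bool(jnp.allclose(jac, expected))
print(jac.shape, ok)
```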