# Approximating the derivative of a function of many variables

This section applies numerical analysis to check matrix derivatives. Specifically, we apply the central difference formula, which is second-order accurate, to each entry of the matrix in turn:

$$f'(x_0) = \frac{f(x_0 + h) - f(x_0 - h)}{2h} + \mathcal{O}(h^2), \ h\rightarrow 0$$
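
To see what the $\mathcal{O}(h^2)$ term means in practice, here is a minimal sketch that applies the formula to $f(x) = \sin x$ at $x_0 = 1$, where the exact derivative $\cos(1)$ is known. The helper name `central_diff` and the choice of test function are assumptions made for illustration. If the method is second-order accurate, halving $h$ should cut the error by a factor of about 4.

```python
import numpy as np

def central_diff(f, x0, h):
    """Central difference estimate of f'(x0) with step h (illustrative helper)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

x0 = 1.0
exact = np.cos(x0)  # exact derivative of sin at x0

# Halve h repeatedly and record the absolute error of each estimate.
steps = [1e-1, 5e-2, 2.5e-2, 1.25e-2]
errors = [abs(central_diff(np.sin, x0, h) - exact) for h in steps]

# For a second-order method, each halving of h should divide the error by about 4.
ratios = [errors[k] / errors[k + 1] for k in range(len(errors) - 1)]

for h, err in zip(steps, errors):
    print(f"h = {h:<7g} error = {err:.3e}")
print("error ratios:", ratios)
```

The step sizes here are deliberately large so that truncation error dominates; for much smaller $h$, floating-point round-off starts to mask the $h^2$ behaviour.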

In [3]:
import numpy as np

def check_grad(fn, gr, X):
    X_flat = X.reshape(-1) # flatten X to a 1-D array so a single for loop covers every entry
    shape_X = X.shape # original shape of X
    grad_flat = np.zeros_like(X_flat) # flattened numerical gradient
    eps = 1e-6 # step size; values from 1e-10 to 1e-6 are usually good
    numElems = X_flat.shape[0] # number of elements in X

    # calculate numerical gradient
    for i in range(numElems): # iterate over all elements of X
        Xp_flat = X_flat.copy()
        Xn_flat = X_flat.copy()
        Xp_flat[i] += eps # perturb entry i forwards
        Xn_flat[i] -= eps # perturb entry i backwards
        Xp = Xp_flat.reshape(shape_X)
        Xn = Xn_flat.reshape(shape_X)
        # fn may return a 1x1 array (e.g. x.T.dot(A).dot(x)), so extract a
        # plain scalar before storing it in a single array element
        grad_flat[i] = np.asarray(fn(Xp) - fn(Xn)).item()/(2*eps)

    num_grad = grad_flat.reshape(shape_X)

    # Frobenius norm of the gap between the numerical and analytic gradients
    diff = np.linalg.norm(num_grad - gr(X))
    print('Difference between two methods should be small:', diff)
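
The step size matters in both directions: a large `eps` leaves a visible $\mathcal{O}(h^2)$ truncation error, while a very small `eps` amplifies floating-point round-off in the subtraction. The following sketch sweeps `eps` for a function whose gradient is known exactly. The helper `numerical_grad`, the test function $\sum_{ij} e^{X_{ij}}$, and the list of step sizes are all assumptions chosen for illustration, not part of the original notebook.

```python
import numpy as np

def numerical_grad(fn, X, eps):
    """Central-difference gradient of a scalar-valued fn at X (illustrative helper)."""
    grad = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(X.shape):   # visit every entry of X
        step = np.zeros_like(X, dtype=float)
        step[idx] = eps               # perturb one entry at a time
        grad[idx] = (fn(X + step) - fn(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.random((3, 2))

def fn(X):
    return np.sum(np.exp(X))          # analytic gradient is exp(X)

# Record the gap between numerical and analytic gradients for each step size.
errors = {}
for eps in [1e-1, 1e-3, 1e-6, 1e-9, 1e-12]:
    errors[eps] = np.linalg.norm(numerical_grad(fn, X, eps) - np.exp(X))
    print(f"eps = {eps:.0e}  error = {errors[eps]:.3e}")
```

One would expect the error to fall as `eps` shrinks from `1e-1`, then rise again once round-off dominates, which is why a moderate value such as `1e-6` is a common default.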

In [4]:
m, n = 10, 20

# Check if grad(x^T * A * x) == (A + A^T) * x
A = np.random.rand(m, m)
x = np.random.rand(m, 1)

def fn1(x):
    return x.T.dot(A).dot(x)
def gr1(x):
    return (A + A.T).dot(x)
check_grad(fn1, gr1, x)

# Check if grad(trace(AX)) == A^T
A = np.random.rand(m, n)
X = np.random.rand(n, m)

def fn2(X):
    return np.trace(A.dot(X))
def gr2(X):
    return A.T
check_grad(fn2, gr2, X)

Difference between two methods should be small: 4.276982422081088e-09
Difference between two methods should be small: 2.3497874453351018e-08
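
A small difference only means something if the check would also flag a wrong formula. As a rough sketch, the snippet below compares the numerical gradient of $x^T A x$ against the correct gradient $(A + A^T)x$ and against the tempting but incorrect $2Ax$, which only holds when $A$ is symmetric. The helpers `numerical_grad` and `relative_error` are hypothetical names introduced here, and using a relative rather than absolute error is a design choice of this sketch.

```python
import numpy as np

def numerical_grad(fn, X, eps=1e-6):
    """Central-difference gradient of a scalar-valued fn at X (illustrative helper)."""
    grad = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(X.shape):
        step = np.zeros_like(X, dtype=float)
        step[idx] = eps
        grad[idx] = (fn(X + step) - fn(X - step)) / (2 * eps)
    return grad

def relative_error(num, ana):
    """Norm of the gap, scaled by the larger of the two gradient norms."""
    return np.linalg.norm(num - ana) / max(np.linalg.norm(num), np.linalg.norm(ana))

rng = np.random.default_rng(0)
m = 10
A = rng.random((m, m))   # generic, non-symmetric matrix
x = rng.random((m, 1))

def fn(x):
    return (x.T @ A @ x).item()   # scalar value of x^T A x

num = numerical_grad(fn, x)
err_correct = relative_error(num, (A + A.T) @ x)   # correct gradient
err_wrong = relative_error(num, 2 * A @ x)         # valid only for symmetric A

print("relative error, (A + A^T) x:", err_correct)
print("relative error, 2 A x      :", err_wrong)
```

The correct formula should give a relative error many orders of magnitude below the incorrect one, so the check separates the two clearly.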




___
## **References:**

$[1].$ Vũ Hữu Tiệp. (2018). _Machine Learning cơ bản_. Nhà xuất bản Khoa học và Kỹ thuật.