# Calculus Refresher

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np

**Calculating Gradient and Hessian Fields**: we will use the following code to visualize how gradients flow and how Hessians signal local extrema and saddle points. Note that these calculations are done on a discrete grid rather than on a continuous, differentiable function, so care is needed to get sensible results.
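
As a reminder of how a Hessian signals the type of a critical point, here is a minimal sketch of the standard second-derivative test for a function of two variables. The helper name `classify_critical_point` and the tolerance `tol` are hypothetical and not part of the notebook's code. On a discrete grid such as ours, the result should be read as a heuristic rather than a guarantee.

```python
def classify_critical_point(fxx, fxy, fyy, tol=1e-12):
    """Second-derivative test for the symmetric Hessian [[fxx, fxy], [fxy, fyy]].

    Assumes the gradient is (approximately) zero at the point being tested.
    """
    det = fxx * fyy - fxy ** 2
    if det > tol:
        # Both eigenvalues share a sign; the sign of fxx tells us which one
        return "minimum" if fxx > 0 else "maximum"
    if det < -tol:
        # Eigenvalues have opposite signs
        return "saddle"
    # Zero determinant: the test cannot decide
    return "inconclusive"


print(classify_critical_point(2.0, 0.0, 2.0))    # bowl
print(classify_critical_point(-2.0, 0.0, -2.0))  # dome
print(classify_critical_point(2.0, 0.0, -2.0))   # saddle
```

The triple `(fxx, fxy, fyy)` corresponds to the three second derivatives computed for each grid cell further down.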

The field over which we will calculate derivatives is:

|       |       |       |       |       |       |       |       |       |
|:--:	|:--:	|:--:	|:--:	|:--:	|:--:	|:--:	|:--:	|:-:	|
| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	|
| 2 	| 2 	| 2 	| 3 	| 4 	| 3 	| 3 	| 2 	| 1 	|
| 1 	| 1 	| 1 	| 3 	| 3 	| 3 	| 1 	| 1 	| 1 	|
| 1 	| 0 	| 1 	| 1 	| 2 	| 1 	| 1 	| 0 	| 1 	|
| 1 	| 1 	| 1 	| 3 	| 3 	| 3 	| 1 	| 1 	| 1 	|
| 1 	| 2 	| 3 	| 3 	| 4 	| 3 	| 2 	| 2 	| 2 	|
| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	| 3 	|

**Padding**: since derivatives are calculated by computing differences/fluctuations locally, we need to worry about what happens at the edges of our field. It is common to _pad_ such finite fields (something very similar is done to images in computer vision) so that we get interpretable gradient values at the edges as well. Padding simply involves adding dummy rows and columns surrounding the actual field.
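
For comparison, NumPy's `np.pad` can generate padding automatically. The sketch below uses `mode="edge"` (repeat the border values), which is an assumption made for illustration; this notebook hand-picks its padding values instead. The small array `F` is a made-up example, not the field shown above.

```python
import numpy as np

# A tiny made-up field: 2 rows, 3 columns
F = np.array([[1.0, 2.0, 4.0],
              [1.0, 2.0, 4.0]])

# Add one layer of padding on every side by repeating the border values
P = np.pad(F, 1, mode="edge")

# Central difference in x for every cell of the original field:
# (right neighbour - left neighbour) / 2, taken from the padded array
dFdx = (P[1:-1, 2:] - P[1:-1, :-2]) / 2

print(P.shape)     # padded shape: two extra rows and two extra columns
print(dFdx.shape)  # same shape as F
print(dFdx)
```

With padding in place, every cell of the original field has a left and a right neighbour, so the central difference is defined at the edges too. How meaningful those edge values are depends on the padding chosen.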

In [3]:
# The first and last rows are dummy rows for padding
# The first and last columns are also dummy columns for padding
A = np.array([
    [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1],
    [1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1],
    [1, 2, 2, 2, 3, 4, 3, 3, 2, 1, 1],
    [1, 1, 1, 1, 3, 3, 3, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 2, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 3, 3, 3, 1, 1, 1, 1],
    [1, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1],
    [1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1],
    [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1],
])

# The first-order partial derivatives (one entry per cell of the unpadded field)
dfdx = np.zeros( (7, 9) )
dfdy = np.zeros( (7, 9) )

# The second-order pure and mixed derivatives (interior cells of the field only)
d2fdx2 = np.zeros( (5, 7) )
d2fdxdy = np.zeros( (5, 7) )
d2fdy2 = np.zeros( (5, 7) )

# First order derivatives are easiest to calculate
for i in range( 1, 8 ):
    for j in range( 1, 10 ):
        dfdx[i-1, j-1] = (A[i, j+1] - A[i, j-1])/2
        # Need a negative sign before dfdy since array indices increase when going down
        # whereas y coordinates should decrease when going down
        dfdy[i-1, j-1] = -(A[i+1, j] - A[i-1, j])/2

# Second pure derivatives are still easy enough to calculate
for i in range( 2, 7 ):
    for j in range( 2, 9 ):
        d2fdx2[i-2, j-2] = A[i, j+1] + A[i, j-1] - 2 * A[i,j]
        d2fdy2[i-2, j-2] = A[i+1, j] + A[i-1, j] - 2 * A[i,j]

# Second mixed derivatives need more care
for i in range( 1, 6 ):
    for j in range( 1, 8 ):
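        # d/dy of df/dx: negated because row indices increase when going down
        tmpxy = -(dfdx[i+1, j] - dfdx[i-1, j])/2
        # d/dx of df/dy: both terms must come from dfdy
        tmpyx = (dfdy[i, j+1] - dfdy[i, j-1])/2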
        # Since our field is not continuous, Clairaut's theorem does not apply here
        # So we approximate by taking the averages to get a symmetric Hessian matrix
        d2fdxdy[i-1, j-1] = (tmpxy + tmpyx)/2
        
np.set_printoptions(formatter={'float': lambda x: "{0:0.1f}".format(x)})
print( A.astype(float) )
print()
for i in range( dfdx.shape[0] ):
    for j in range( dfdx.shape[1] ):
        print("(" + str(dfdx[i, j]) + "," + str(dfdy[i, j]) + ")  ", end = "")
    print()
print()
for i in range( d2fdx2.shape[0] ):
    for j in range( d2fdx2.shape[1] ):
        print("(" + str(d2fdx2[i, j]) + "," + str(d2fdxdy[i, j]) + "," + str(d2fdy2[i, j]) + ")  ", end = "")
    print()

[[1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 1.0]
 [1.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 1.0]
 [1.0 2.0 2.0 2.0 3.0 4.0 3.0 3.0 2.0 1.0 1.0]
 [1.0 1.0 1.0 1.0 3.0 3.0 3.0 1.0 1.0 1.0 1.0]
 [1.0 1.0 0.0 1.0 1.0 2.0 1.0 1.0 0.0 1.0 1.0]
 [1.0 1.0 1.0 1.0 3.0 3.0 3.0 1.0 1.0 1.0 1.0]
 [1.0 1.0 2.0 3.0 3.0 4.0 3.0 2.0 2.0 2.0 1.0]
 [1.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 1.0]
 [1.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 1.0]]

(1.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,-0.5)  (0.0,-1.0)  (0.0,-0.5)  (0.0,-0.5)  (0.0,0.0)  (-1.0,0.5)  
(0.5,1.0)  (0.0,1.0)  (0.5,1.0)  (1.0,0.0)  (0.0,0.0)  (-0.5,0.0)  (-0.5,1.0)  (-1.0,1.0)  (-0.5,1.0)  
(0.0,0.5)  (0.0,1.0)  (1.0,0.5)  (1.0,1.0)  (0.0,1.0)  (-1.0,1.0)  (-1.0,1.0)  (0.0,1.0)  (0.0,0.0)  
(-0.5,0.0)  (0.0,0.0)  (0.5,0.0)  (0.5,0.0)  (0.0,0.0)  (-0.5,0.0)  (-0.5,0.0)  (0.0,0.0)  (0.5,0.0)  
(0.0,0.0)  (0.0,-1.0)  (1.0,-1.0)  (1.0,-1.0)  (0.0,-1.0)  (-1.0,-1.0)  (-1.0,-0.5)  (0.0,-1.0)  (0.0,-0.5)  
(0.5,-1.0)  (1.0,-1.0)  (0.5,-1.0)  (0.5,0.0)  (0.0