# Week 3: Calculus (cont'd)

In [1]:
# Load libraries
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt
import seaborn as sns

## Day 2: Differentiation of Multivariable Functions
* So far we have worked with functions of a single variable
* In practice, it is much more common to work with functions that have more than one input
* We call these functions multivariable functions

### Example 1: a function of two variables
* Let $x$ and $y$ be two variables that serve as inputs of the function $f$. We call $x$ and $y$ **independent variables**, and we write $z = f(x, y)$. We call $z$ the **dependent variable**
* Let us define the function $f(x, y) = x^2 + y^2$

In [2]:
# Define the symbols and build f as a symbolic expression
x, y = sp.symbols('x y', real=True)

f = x**2 + y**2

f

x**2 + y**2

### Example 2
* Define the function:
\begin{equation} g(t_1, t_2, t_3) = \frac{1}{t_1} + \ln t_2 - e^{-t_3^2} \end{equation}

In [3]:
# Define the symbols and build g as a symbolic expression
t1, t2, t3 = sp.symbols('t1 t2 t3', real=True)

g = 1/t1 + sp.ln(t2) - sp.exp(-t3**2)

g

log(t2) - exp(-t3**2) + 1/t1

### Differentiation of multivariable functions
* Unlike the single-variable case, a multivariable function does not have one single derivative
* Instead, there is a **partial derivative** for each of the independent variables
* For example, if the function is $f(x, y)$, which has two independent variables, then we have two partial derivatives
    * Partial derivative with respect to $x$, labeled as $f_x$ or $\frac{\partial f}{\partial x}$
    * Partial derivative with respect to $y$, labeled as $f_y$ or $\frac{\partial f}{\partial y}$
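
A partial derivative is an ordinary derivative taken while the other variables are held fixed. The sketch below checks this idea with the limit definition for $f(x, y) = x^2 + y^2$. The step symbol `h` and the names `f_x_limit` and `f_y_limit` are introduced only for this illustration.

```python
import sympy as sp

x, y, h = sp.symbols('x y h', real=True)
f = x**2 + y**2

# Difference quotient in x: y is held fixed while x is nudged by h
f_x_limit = sp.limit(sp.expand(f.subs(x, x + h) - f) / h, h, 0)

# Difference quotient in y: x is held fixed while y is nudged by h
f_y_limit = sp.limit(sp.expand(f.subs(y, y + h) - f) / h, h, 0)

print(f_x_limit, f_y_limit)
```

Both limits should agree with what `f.diff(x)` and `f.diff(y)` return, which is the shortcut used in the examples that follow.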

### Example 1a
* Calculate the partial derivatives of $f(x, y)$ defined in **Example 1**

In [4]:
# Differentiate with sympy.diff()
f_x = f.diff(x)
f_y = f.diff(y)

print(f_x)

print(f_y)

2*x
2*y


### Example 2a
* Calculate the partial derivatives of $g(t_1, t_2, t_3)$ defined in **Example 2**

In [5]:
# Differentiate with sympy.diff()
g_t1 = g.diff(t1)
g_t2 = g.diff(t2)
g_t3 = g.diff(t3)

print(g_t1)

print(g_t2)

print(g_t3)

-1/t1**2
1/t2
2*t3*exp(-t3**2)


### Example 3
* Calculate the partial derivatives $h_x$ and $h_y$ of
\begin{equation}
h(x, y) = \sqrt{x^2 + y^2}
\end{equation}
* Then evaluate $h_x (-1, 1)$ and $h_y (0, -1)$

In [6]:
# Define the function
h = sp.sqrt(x**2 + y**2)

h_x = h.diff(x)
print(h_x)

h_y = h.diff(y)
print(h_y)

# Evaluate h_x(-1, 1)
print('h_x(-1, 1) = ', h_x.subs({x:-1, y:1}))

# Evaluate h_y(0, -1)
print('h_y(0, -1) = ', h_y.subs({x:0, y:-1}))

x/sqrt(x**2 + y**2)
y/sqrt(x**2 + y**2)
h_x(-1, 1) =  -sqrt(2)/2
h_y(0, -1) =  -1


### The Gradient of a function
* Let $f$ be a multivariable function. The **gradient** of $f$, labeled $\nabla f$, is a vector whose components are the partial derivatives of the function. The gradient has as many components as the function has variables
* For example: if $f(x, y)$ is a function, then its gradient is given by:
\begin{equation}
\nabla f(x, y) = \left( f_x, f_y \right)
\end{equation}
If $g(x, y, z, w)$ is a function, then its gradient is given by:
\begin{equation}
\nabla g(x, y, z, w) = \left( g_x, g_y, g_z, g_w \right)
\end{equation}
* For us the gradient will play a crucial role in extending the *Gradient Descent Method* to multivariable functions
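
Since the gradient is just the list of partial derivatives, it can be built for any number of variables with one small helper. The function name `gradient` below is a hypothetical helper written for this illustration, not a SymPy function.

```python
import sympy as sp

def gradient(expr, variables):
    """Return the gradient of expr as a list of partial derivatives."""
    return [sp.diff(expr, v) for v in variables]

x, y = sp.symbols('x y', real=True)
f = x**2 + y**2

grad_f = gradient(f, (x, y))
print(grad_f)        # [2*x, 2*y]

# Evaluate the gradient at the point (1, -2)
point = {x: 1, y: -2}
grad_f_at_point = [component.subs(point) for component in grad_f]
print(grad_f_at_point)   # [2, -4]
```

The same helper works unchanged for three or more variables, because it only loops over whatever symbols are passed in.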

### Example 4
* Find the gradient $\nabla f(x, y)$ of the function $f(x, y) = e^x \cos y$. Then evaluate $\nabla f(2, 0)$

In [7]:
# Define the function
f = sp.exp(x)*sp.cos(y)

# Get derivatives, form gradient
f_x = f.diff(x)
f_y = f.diff(y)

grad_f = [f_x, f_y]
print(grad_f)


# Evaluate the gradient
dict_val = {x:2, y:0}
grad_f_eval = [f_x.subs(dict_val).evalf(), f_y.subs(dict_val).evalf()]
print(grad_f_eval)

[exp(x)*cos(y), -exp(x)*sin(y)]
[7.38905609893065, 0]


### Example 5
* Find the gradient $\nabla g(x, y, z)$ of the function $g(x, y, z) = e^{x^2 + 2y - z^3}$. Then evaluate $\nabla g(-1, 1, 0)$

In [8]:
# Define function
z = sp.symbols('z', real=True)
g = sp.exp(x**2 + 2*y - z**3)

# Get derivatives, form gradient
g_x = g.diff(x)
g_y = g.diff(y)
g_z = g.diff(z)

grad_g = [g_x, g_y, g_z]
print(grad_g)

# Evaluate the gradient
dict_val = {x:-1, y:1, z:0}
grad_g_eval = [g_x.subs(dict_val).evalf(), g_y.subs(dict_val).evalf(), g_z.subs(dict_val).evalf()]
print(grad_g_eval)

[2*x*exp(x**2 + 2*y - z**3), 2*exp(x**2 + 2*y - z**3), -3*z**2*exp(x**2 + 2*y - z**3)]
[-40.1710738463753, 40.1710738463753, 0]


### Higher order derivatives
* As in the case of single-variable functions, we can perform multiple consecutive (partial) differentiations
* What is different in this case is that there are multiple possibilities for performing the differentiation process: every partial derivative can be differentiated with respect to all independent variables
* For example, let $f(x, y)$ be a two-variable function. The **first partial derivatives** are $f_x (x, y)$ and $f_y (x, y)$. Differentiating these, we get the **second partial derivatives**. From $f_x$, we get $f_{xx}(x, y)$ and $f_{xy}(x, y)$, while from $f_y$ we get $f_{yx}(x, y)$ and $f_{yy}(x, y)$.
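
The four second partial derivatives can be arranged in a 2×2 matrix. When the second partial derivatives are continuous, the two mixed ones, $f_{xy}$ and $f_{yx}$, are equal. The sketch below builds that matrix by differentiating in both orders explicitly. The function `f` and the name `second_partials` are chosen only for this illustration.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 * y + sp.sin(x * y)   # illustrative function, not one of the examples

variables = (x, y)
# Entry (i, j) is the derivative taken first in variables[i], then in variables[j]
second_partials = sp.Matrix(
    [[f.diff(a).diff(b) for b in variables] for a in variables]
)

f_xy = second_partials[0, 1]
f_yx = second_partials[1, 0]
print(sp.simplify(f_xy - f_yx) == 0)   # True
```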

### Example 6
* Calculate all second partial derivatives of the function $f(x, y) = x^3 + 2x^2 y + y^2$

In [9]:
# Define the function
f = x**3 + 2*x**2*y + y**2


# First partial derivatives
f_x = f.diff(x)
f_y = f.diff(y)


# Second partial derivatives
f_xx = f_x.diff(x)
f_xy = f_x.diff(y)

f_yx = f_y.diff(x)
f_yy = f_y.diff(y)


# Printing the derivatives
print('f_xx = ', f_xx)
print('f_xy = ', f_xy) # Notice:
print('f_yx = ', f_yx) # f_xy = f_yx
print('f_yy = ', f_yy)

f_xx =  6*x + 4*y
f_xy =  4*x
f_yx =  4*x
f_yy =  2


### Example 7
* Calculate all second partial derivatives of the function $g(x, y) = -4x^3 - 3x^2y^3 + 2y^2$

In [10]:
# Define the function
g = -4*x**3 - 3*x**2*y**3 + 2*y**2


# First partial derivatives
g_x = g.diff(x)
g_y = g.diff(y)


# Second partial derivatives
g_xx = g.diff(x, 2)
g_xy = g.diff(x, y)
g_yx = g.diff(y, x)
g_yy = g.diff(y, 2)


# Printing the derivatives
print('g_xx = ', g_xx)
print('g_xy = ', g_xy)
print('g_yx = ', g_yx)
print('g_yy = ', g_yy)

g_xx =  -6*(4*x + y**3)
g_xy =  -18*x*y**2
g_yx =  -18*x*y**2
g_yy =  2*(-9*x**2*y + 2)


### Application: finding min and max of multivariable functions
* Finding minima and maxima mostly follows the process we outlined for the single variable case
* First, calculate all the first partial derivatives of the function. Next, set them equal to zero and solve the resulting system, which has one equation for every partial derivative. For example:
\begin{equation}\left\{
\begin{array}{rcl}
f_x &=& 0\\
f_y &=& 0
\end{array}\right.
\end{equation}
* The solutions of this system are the critical points
* Finally, find the second partial derivatives and test whether each critical point is a **local minimum**, a **local maximum**, or a **saddle point**; in some cases the test is inconclusive
* We will illustrate this only for functions of two variables

### Example 8 (part 1)
* Step 1: find the critical points of the function $f(x, y) = 9xy -x^3 - y^3 - 6$

In [11]:
# Define the function
f = 9*x*y - x**3 - y**3 - 6

# First partial derivatives
f_x = f.diff(x)
f_y = f.diff(y)

# Set the equations
eq_1 = sp.Eq(f_x, 0)
eq_2 = sp.Eq(f_y, 0)

# Solve the system
crit_pts = sp.solve((eq_1, eq_2), (x, y))

# State the critical points
print(f'Critical points are {crit_pts[0]} and {crit_pts[1]}')

Critical points are (0, 0) and (3, 3)


* Once we have identified a critical point $(x^*, y^*)$, we need to establish whether it is an extremal point and, if so, whether it is a minimum or a maximum. To do this we need the second partial derivatives
* We calculate the quantity $d$, given by:
\begin{equation}
d = f_{xx}(x^*, y^*) \cdot f_{yy}(x^*, y^*) - \big[ f_{xy}(x^*, y^*) \big]^2
\end{equation}
* Based on the values of $d$ and $f_{xx}(x^*, y^*)$, we can conclude:
    * If $d > 0$ and $f_{xx}(x^*, y^*) > 0$, then the point $(x^*, y^*)$ is a **local minimum**
    * If $d > 0$ and $f_{xx}(x^*, y^*) < 0$, then the point $(x^*, y^*)$ is a **local maximum**
    * If $d < 0$, then the point $(x^*, y^*)$ is a **saddle point** (think: Pringles)
    * If $d = 0$, the test is **inconclusive**
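
The four cases above translate directly into a short function. This is a minimal sketch for two-variable functions. The name `classify_critical_point` and its argument order are assumptions made for this illustration, and the function expects a point that is already known to be critical.

```python
import sympy as sp

def classify_critical_point(expr, x, y, point):
    """Apply the second derivative test to expr at point = (x0, y0)."""
    at_point = {x: point[0], y: point[1]}
    f_xx = expr.diff(x, 2).subs(at_point)
    f_yy = expr.diff(y, 2).subs(at_point)
    f_xy = expr.diff(x, y).subs(at_point)
    d = f_xx * f_yy - f_xy**2      # note the square on the mixed partial
    if d > 0 and f_xx > 0:
        return 'local minimum'
    if d > 0 and f_xx < 0:
        return 'local maximum'
    if d < 0:
        return 'saddle point'
    return 'inconclusive'

x, y = sp.symbols('x y', real=True)
print(classify_critical_point(x**2 + y**2, x, y, (0, 0)))   # local minimum
print(classify_critical_point(x**2 - y**2, x, y, (0, 0)))   # saddle point
print(classify_critical_point(x**4 + y**4, x, y, (0, 0)))   # inconclusive
```

The last call shows why $d = 0$ is inconclusive: $x^4 + y^4$ clearly has a minimum at the origin, yet the test cannot detect it.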

### Example 8 (part 2)
* Determine the nature of the critical points of $f(x, y) = 9xy -x^3 - y^3 - 6$

In [12]:
# Second partial derivatives
f_xx = f.diff(x, 2)
f_yy = f.diff(y, 2)
f_xy = f.diff(x, y)



# Working with the first critical point
print(f'Case 1: critical point = {crit_pts[0]}')
dict_val = {x:crit_pts[0][0], y:crit_pts[0][1]}
d = f_xx.subs(dict_val) * f_yy.subs(dict_val) - f_xy.subs(dict_val)**2
print(f'd = {d} < 0 and f_xx{crit_pts[0]} = {f_xx.subs(dict_val)}. We conclude that {crit_pts[0]} is a saddle point')



# Working with the second critical point
print(f'\nCase 2: critical point = {crit_pts[1]}')
dict_val = {x:crit_pts[1][0], y:crit_pts[1][1]}
d = f_xx.subs(dict_val) * f_yy.subs(dict_val) - f_xy.subs(dict_val)**2
print(f'd = {d} > 0 and f_xx{crit_pts[1]} = {f_xx.subs(dict_val)} < 0. We conclude that {crit_pts[1]} is a local maximum')

Case 1: critical point = (0, 0)
d = -81 < 0 and f_xx(0, 0) = 0. We conclude that (0, 0) is a saddle point

Case 2: critical point = (3, 3)
d = 243 > 0 and f_xx(3, 3) = -18 < 0. We conclude that (3, 3) is a local maximum


### Practice example
* Find the extremal points of the function $g(x, y) = 2xy + 1 - \frac{1}{2}\left(x^4 + y^4\right)$

In [13]:
# Define the function
g = 2*x*y + 1 - (1/2)*(x**4 + y**4)


# Partial derivatives
g_x = g.diff(x)
g_y = g.diff(y)
g_xx = g.diff(x, 2)
g_yy = g.diff(y, 2)
g_xy = g.diff(x, y)


# Find critical points
eq_1 = sp.Eq(g_x, 0)
eq_2 = sp.Eq(g_y, 0)
crit_pts = sp.solve((eq_1, eq_2), (x, y))
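# The coefficient (1/2) is a Python float, so the solutions come back as floats;
# convert them to ints for cleaner printing
crit_pts = [(int(crit_pts[i][0]), int(crit_pts[i][1])) for i in range(3)]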
print(f'Critical points are: {crit_pts[0]}, {crit_pts[1]} and {crit_pts[2]}')


# Establishing the nature of the critical points
# Working with the first critical point
print(f'\nCase 1: critical point = {crit_pts[0]}')
dict_val = {x:crit_pts[0][0], y:crit_pts[0][1]}
d = int(g_xx.subs(dict_val) * g_yy.subs(dict_val) - g_xy.subs(dict_val)**2)
print(f'd = {d} > 0 and g_xx{crit_pts[0]} = {int(g_xx.subs(dict_val))} < 0. We conclude that {crit_pts[0]} is a local maximum')

# Working with the second critical point
print(f'\nCase 2: critical point = {crit_pts[1]}')
dict_val = {x:crit_pts[1][0], y:crit_pts[1][1]}
d = g_xx.subs(dict_val) * g_yy.subs(dict_val) - g_xy.subs(dict_val)**2
print(f'd = {d} < 0 and g_xx{crit_pts[1]} = {g_xx.subs(dict_val)}. We conclude that {crit_pts[1]} is a saddle point')

# Working with the third critical point
print(f'\nCase 3: critical point = {crit_pts[2]}')
dict_val = {x:crit_pts[2][0], y:crit_pts[2][1]}
d = int(g_xx.subs(dict_val) * g_yy.subs(dict_val) - g_xy.subs(dict_val)**2)
print(f'd = {d} > 0 and g_xx{crit_pts[2]} = {int(g_xx.subs(dict_val))} < 0. We conclude that {crit_pts[2]} is a local maximum')

Critical points are: (-1, -1), (0, 0) and (1, 1)

Case 1: critical point = (-1, -1)
d = 32 > 0 and g_xx(-1, -1) = -6 < 0. We conclude that (-1, -1) is a local maximum

Case 2: critical point = (0, 0)
d = -4 < 0 and g_xx(0, 0) = 0. We conclude that (0, 0) is a saddle point

Case 3: critical point = (1, 1)
d = 32 > 0 and g_xx(1, 1) = -6 < 0. We conclude that (1, 1) is a local maximum


In [14]:
g_xx, g_yy, g_xy

(-6.0*x**2, -6.0*y**2, 2)