STAT 453: Deep Learning (Spring 2020)  
Instructor: Sebastian Raschka (sraschka@wisc.edu)  

Course website: http://pages.stat.wisc.edu/~sraschka/teaching/stat453-ss2020/  
GitHub repository: https://github.com/rasbt/stat453-deep-learning-ss20

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch

Sebastian Raschka 

CPython 3.7.5
IPython 7.10.2

torch 1.4.0


# Second Order Partial Derivatives (Optional)

In a nutshell, the second order partial derivative of a function is the
partial derivative of its partial derivative. For instance, we write the
second order partial derivative of a function $f$ with respect to $x$ as

\begin{equation}
\label{eq:partial-def}
\frac{\partial}{\partial x} \bigg(\frac{\partial f}{\partial x} \bigg) =
\frac{\partial^2 f}{\partial x^2}.
\end{equation}

For example, we compute the second partial derivative of a function
$f(x, y) = x^2y+y$ as follows:

\begin{equation}
\label{eq:partial-example}
\frac{\partial^2 f}{\partial x^2}  = \frac{\partial}{\partial x} \bigg( 
\frac{\partial}{\partial x} \big[ x^2y+y \big] \bigg) = \frac{\partial}{\partial x}  2xy  = 2y.
\end{equation}

Note that in both the definition (first equation) and the example (second
equation), the first and second order partial derivatives were computed with
respect to the same input argument, $x$. However, depending on what we want to
measure, the second differentiation can be taken with respect to a different
input argument. For instance, given a function with two input arguments, we
can compute four second order partial derivatives:

\begin{equation}
\frac{\partial^2 f}{\partial x^2}, \; \frac{\partial^2 f}{\partial y^2}, \;
\frac{\partial^2 f}{\partial x \partial y}, \text{ and } \frac{\partial^2
f}{\partial y \partial x},
\end{equation}

where, for example, $\frac{\partial^2 f}{\partial y \partial x}$ is defined as

\begin{equation}
\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}
\bigg(\frac{\partial f}{\partial x} \bigg).
\end{equation}
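
As a worked illustration (an addition for clarity, reusing the example function $f(x, y) = x^2y+y$ from above), the first order partial derivatives are $\frac{\partial f}{\partial x} = 2xy$ and $\frac{\partial f}{\partial y} = x^2 + 1$, so the four second order partial derivatives come out as

\begin{equation}
\frac{\partial^2 f}{\partial x^2} = 2y, \quad
\frac{\partial^2 f}{\partial y^2} = 0, \quad
\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y} \big[ 2xy \big] = 2x, \quad
\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x} \big[ x^2 + 1 \big] = 2x.
\end{equation}

Here the two mixed partial derivatives are equal, which holds in general for functions whose second order partial derivatives are continuous.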


### Example: First Order Derivative

In [2]:
import torch
from torch.autograd import grad

In [3]:
x = torch.tensor([3.], requires_grad=True)  # differentiate with respect to x
y = torch.tensor([4.])                      # treated as a constant

f = x**2 * y + y

grad(f, x)  # df/dx = 2xy = 2 * 3 * 4 = 24

(tensor([24.]),)

### Example: Second Order Derivative

In [4]:
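x = torch.tensor([3.], requires_grad=True)
y = torch.tensor([4.])

f = x**2 * y + y

# create_graph=True records the derivative computation so it can be differentiated again
df_dx = grad(f, x, create_graph=True)  # first order derivative: 2xy
grad(df_dx, x)  # second order derivative: 2y = 8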

(tensor([8.]),)
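
The mixed second order partial derivatives can be computed in the same way. The following is a minimal sketch rather than part of the original notebook. It assumes that `y` is also created with `requires_grad=True`, and the variable names (`d2f_dx2`, `d2f_dydx`, `d2f_dxdy`) are chosen here purely for illustration.

```python
import torch
from torch.autograd import grad

x = torch.tensor([3.], requires_grad=True)
y = torch.tensor([4.], requires_grad=True)  # assumption: y must track gradients too

f = x**2 * y + y

# First order partial derivatives: df/dx = 2xy and df/dy = x^2 + 1.
# create_graph=True keeps both results differentiable.
df_dx, df_dy = grad(f, (x, y), create_graph=True)

# Differentiate df/dx with respect to x and y.
# retain_graph=True keeps the graph alive for the next grad call.
d2f_dx2, d2f_dydx = grad(df_dx, (x, y), retain_graph=True)

# Differentiate df/dy with respect to x.
(d2f_dxdy,) = grad(df_dy, x)

print(d2f_dx2)   # 2y = 8
print(d2f_dydx)  # 2x = 6
print(d2f_dxdy)  # 2x = 6
```

Because $\frac{\partial f}{\partial y} = x^2 + 1$ does not depend on $y$, calling `grad(df_dy, y)` directly would raise an error. Passing `allow_unused=True` returns `None` for that input instead, which corresponds to $\frac{\partial^2 f}{\partial y^2} = 0$.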