# Derivatives in TensorFlow

```
----------------------------------------------------------------------
Filename : 01_intro_to_derivatives.ipynb
Date     : 15th March, 2017
Author   : Jaidev Deshpande
Purpose  : Intuition behind derivatives of functions and
           implementation in tensorflow.
Libraries: Tensorflow and its dependencies
----------------------------------------------------------------------
```

In [None]:
import numpy as np
import tensorflow as tf
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rcParams['figure.figsize'] = 8, 6
plt.style.use('ggplot')
%matplotlib inline

# Introduction to Derivatives

### What is a derivative of a function?
**A measure of how the function changes when changes are made to its _independent_ variable(s).**

**When this independent variable is time, the derivative is also called the _rate of change_ of the function.**

## Example: $$ f(x) = 2x $$

In [None]:
xx = np.linspace(0, 100, 101)  # 101 points, so xx[70] == 70 and xx[85] == 85
yy = 2 * xx
plt.plot(xx, yy)
plt.vlines(70, 0, yy[70], linestyles="dashed", colors="g")
plt.vlines(85, 0, yy[85], linestyles="dashed", colors="g")
plt.hlines(yy[70], 0, 70, linestyles="dashed", colors="g")
plt.hlines(yy[85], 0, 85, linestyles="dashed", colors="g")
plt.xticks([70, 85], [r"$a$", r"$a + \Delta a$"], fontsize=12, color="k")
plt.yticks([yy[70], yy[85]], [r"$f(a)$", r"$f(a + \Delta a)$"], fontsize=12, color="k")
plt.xlabel(r'$x$', fontsize=16, color="k")
plt.ylabel(r'$f(x)$', fontsize=16, color="k")

## Derivative of $f$
### also called _slope_ or _gradient_ of $f$

### $$ f'(x) = \frac{df}{dx} = \lim_{\Delta x \to 0}\frac{f(x + \Delta x) - f(x)}{\Delta x} $$
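
As a rough numerical illustration of this definition, the sketch below evaluates the difference quotient for $f(x) = 2x$ with progressively smaller steps. The helper name `difference_quotient`, the point `a` and the step sizes are assumptions chosen for illustration only.

```python
import numpy as np


def f(x):
    """The example function f(x) = 2x."""
    return 2 * x


def difference_quotient(func, x, dx):
    """Forward difference (func(x + dx) - func(x)) / dx."""
    return (func(x + dx) - func(x)) / dx


a = 70.0  # arbitrary point, chosen to match the plot of f(x) = 2x
steps = np.array([15.0, 1.0, 1e-3, 1e-6])  # shrinking values of delta x
slopes = np.array([difference_quotient(f, a, dx) for dx in steps])
print(slopes)  # each entry is close to 2, the slope of the line
```

For a straight line the quotient does not depend on the step size, which is why every entry agrees; for a curved function the values would only settle as the step shrinks.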

# Example: Derivative of the sigmoid function

In [None]:
def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

In [None]:
x = np.linspace(-6, 6, 100)
f = sigmoid(x)

plt.plot(x, f)

plt.xlabel(r'$x$', fontsize=20, color="k")
plt.ylabel(r'$\frac{1}{1 + e^{-x}}$', fontsize=20, color="k")

# Chain rule of differentiation:
# $$ \frac{d}{dx}[f(g(x))] = \frac{df}{dg}\frac{dg}{dx}$$

### Suppose $$g(x) = 1 + e^{-x}$$
### $$\therefore f(x) = \frac{1}{g}$$

### Thus by chain rule:
### $$f'(x) = \frac{df}{dg} g'(x)$$
### $$\therefore f'(x) = -\frac{df}{dg}e^{-x}$$
### $$\therefore f'(x) = -\frac{d}{dg}\bigg(\frac{1}{g}\bigg)e^{-x}$$
### $$\therefore f'(x) = \frac{1}{g^{2}}e^{-x}$$
### $$\therefore f'(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}}$$
### Adding and subtracting unity from the numerator:
### $$f'(x) = \frac{1 + e^{-x} - 1}{(1 + e^{-x})^{2}}$$
### Splitting the fraction
### $$f'(x) = \frac{1 + e^{-x}}{(1 + e^{-x})^{2}} - \frac{1}{(1 + e^{-x})^{2}}$$
### Simplifying...
### $$f'(x) = \frac{1}{1 + e^{-x}} - \frac{1}{(1 + e^{-x})^{2}}$$
### $$f'(x) = \frac{1}{1 + e^{-x}}\bigg(1 - \frac{1}{1 + e^{-x}}\bigg)$$
### Substituting the sigmoid function $f(x) = \frac{1}{1 + e^{-x}}$:
### $$f'(x) = f(x)\big(1 - f(x)\big)$$
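
The derivation can be sanity-checked numerically. The sketch below multiplies the two chain rule factors directly, then compares the product with the sigmoid times one minus the sigmoid, and with a central-difference estimate. The variable names and the step `h` are assumptions made for this illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))


x = np.linspace(-6, 6, 100)

# Chain rule pieces, with g(x) = 1 + exp(-x) and f = 1 / g
g = 1 + np.exp(-x)
df_dg = -1.0 / g ** 2   # derivative of 1/g with respect to g
dg_dx = -np.exp(-x)     # derivative of 1 + exp(-x) with respect to x
chain_rule = df_dg * dg_dx

# Simplified closed form: sigmoid times (1 - sigmoid)
closed_form = sigmoid(x) * (1 - sigmoid(x))

# Central difference as an independent numerical estimate
h = 1e-5
numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(np.allclose(chain_rule, closed_form))
print(np.allclose(numerical, closed_form))
```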

In [None]:
x = np.linspace(-6, 6, 100)
f = sigmoid(x)
df_dx = f * (1 - f)

plt.plot(x, f, label=r'$f(x)$')
plt.plot(x, df_dx, label=r'$\frac{df}{dx}$')

plt.xlabel(r'$x$', fontsize=20, color="k")
plt.legend()

# Enter TensorFlow

In [None]:
def sigmoid_tf(x):
    return 1 / (1 + tf.exp(-x))

xx = tf.constant(np.linspace(-6, 6, 100))
f = sigmoid_tf(xx)
df_dx = tf.gradients(f, xx)

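## What Went Wrong?
### `tf.gradients` only works while a graph is being built. TensorFlow 2.x executes eagerly by default, so the call above raises a `RuntimeError`. In eager mode, gradients are recorded with `tf.GradientTape` instead.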

In [None]:
with tf.GradientTape() as g:
    g.watch(xx)
    f = sigmoid_tf(xx)
df_dx = g.gradient(f, xx)
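
The call to `g.watch(xx)` is needed because `xx` is a `tf.constant`; a tape only tracks trainable `tf.Variable` objects automatically. Below is a minimal sketch of the same computation with a variable, assuming TensorFlow 2.x with eager execution. The names `xv`, `tape`, `fv` and `dfv_dx` are hypothetical and used only for this illustration.

```python
import numpy as np
import tensorflow as tf


def sigmoid_tf(x):
    return 1 / (1 + tf.exp(-x))


# A trainable tf.Variable is watched by the tape without calling watch()
xv = tf.Variable(np.linspace(-6, 6, 100))

with tf.GradientTape() as tape:
    fv = sigmoid_tf(xv)

# The target is a vector, so the tape differentiates the sum of its elements.
# Each output element depends only on the matching input element, so the
# result is the elementwise derivative.
dfv_dx = tape.gradient(fv, xv)

expected = fv.numpy() * (1 - fv.numpy())
print(np.allclose(dfv_dx.numpy(), expected))
```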

In [None]:
X = xx.numpy()
S = sigmoid_tf(xx).numpy()
dS = df_dx.numpy()
plt.plot(X, S, label=r'$f(x)$')
plt.plot(X, dS, label=r'$\frac{df}{dx}$')

plt.xlabel(r'$x$', fontsize=20, color="k")
plt.legend()

# Exercise: Plot the Hyperbolic Tangent function and its derivative
## $$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

In [None]:
# Enter code here

# Multivariate Functions

## Example: $f(x, y) = x^{2}y + y$

In [None]:
xx = np.linspace(0, 10, 100)
yy = np.linspace(0, 10, 100)
X, Y = np.meshgrid(xx, yy)
f = Y * (X ** 2 + 1)

from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')  # recent Matplotlib releases no longer accept fig.gca(projection=...)
ax.plot_surface(X, Y, f, cmap=plt.cm.coolwarm)
ax.set_zticklabels([])
ax.set_yticklabels([])
ax.set_xticklabels([])
ax.set_xlabel(r'$x$', fontsize=16, color="k")
ax.set_ylabel(r'$y$', fontsize=16, color="k")
ax.set_zlabel(r'$x^{2}y + y$', fontsize=20, color="k", labelpad=0)
ax.autoscale_view()
plt.tight_layout()

## Introducing Partial Derivatives
### A partial derivative of a multivariate function $f(x_{1}, x_{2}, ...)$ with respect to one of its independent variables, say $x_{1}$, is the derivative of $f$ with respect to $x_{1}$, treating every other variable $x_{k}$, $k \neq 1$, as constant.

### How many such derivatives?
### One for each independent variable. Collected together, the partial derivatives form a vector: the _gradient_ of $f$.

### Given $f(x, y) = x^{2}y + y$
### Partial derivative of $f$ w.r.t $x$ is $\frac{\partial{f}}{\partial{x}}$
### The gradient of $f$, which stacks all of its partial derivatives, is $\nabla{f}$
### $$\nabla{f} = \begin{bmatrix}
\frac{\partial{f}}{\partial{x}}\\
\frac{\partial{f}}{\partial{y}}
\end{bmatrix}$$
### By differentiation,
### $$\frac{\partial{f}}{\partial{x}} = 2xy$$
### $$\frac{\partial{f}}{\partial{y}} = x^{2} + 1$$
### Thus
### $$\nabla{f} = \begin{bmatrix}
2xy\\
x^{2} + 1
\end{bmatrix}$$
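
One way to check these partial derivatives is to compare them with central differences, varying one variable at a time while holding the other constant. The sketch below does this at a single point. The helper names `grad_f` and `numerical_grad`, the test point and the step `h` are assumptions made for illustration.

```python
import numpy as np


def f(x, y):
    return x ** 2 * y + y


def grad_f(x, y):
    """Analytic gradient [2xy, x^2 + 1]."""
    return np.array([2 * x * y, x ** 2 + 1])


def numerical_grad(func, x, y, h=1e-5):
    """Central differences: vary one variable, hold the other constant."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return np.array([df_dx, df_dy])


x0, y0 = 3.0, 2.0  # arbitrary test point
analytic = grad_f(x0, y0)             # 2*3*2 = 12 and 3**2 + 1 = 10
numeric = numerical_grad(f, x0, y0)
print(analytic)
print(numeric)
```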


In [None]:
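# Analytic gradient at the points (xx[i], yy[i]); xx and yy are identical here, so these lie on the line x = y
del_f = np.c_[2 * xx * yy, (xx ** 2) + 1].T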

In [None]:
def func(x, y):
    return tf.pow(x, 2) * y + y

x = tf.constant(xx)
y = tf.constant(yy)
with tf.GradientTape() as g:
    g.watch(y)
    g.watch(x)
    f = func(x, y)
dx, dy = g.gradient(f, [x, y])

In [None]:
np.allclose(np.c_[dx.numpy(), dy.numpy()], del_f.T)

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot(xx, yy, dx.numpy(), label=r"$2xy$")
ax.plot(xx, yy, dy.numpy(), label=r"$x^{2} + 1$")
ax.legend()

# Exercise: $f(x, y) = \sin(\sqrt{x^{2} + y^{2}})$
# 1. Plot surface of $f$
# 2. Find and plot $\nabla{f}$

In [None]:
# Enter code here