In [1]:
%matplotlib inline
import numpy as np
from matplotlib_inline import backend_inline
from d2l import torch as d2l

# Calculus
## Derivatives and Differentiation

Put simply, a derivative tells us how quickly the value of a function would increase or decrease were we to change its argument by an infinitesimal amount.
Formally, for functions $f: \mathbb{R} \rightarrow \mathbb{R}$ (i.e. functions that map a real-valued scalar to another real-valued scalar), the derivative of $f$ at a point $x$ is defined as

$$f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h)-f(x)}{h}$$

When $f'(x)$ exists, the function $f$ is said to be differentiable at $x$. When $f'(x)$ exists for all $x$ on a set (e.g. the interval $[a,b]$), we say that $f$ is differentiable on this set.
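
As a minimal sketch of what it means for the limit not to exist, the snippet below compares the left and right difference quotients of $|x|$ at $x = 0$. The helper `one_sided_quotients` is a hypothetical name introduced only for this illustration. The two quotients never agree, however small $h$ becomes, so $|x|$ is not differentiable at $0$.

```python
# Hypothetical helper (not part of the original notebook): one-sided
# difference quotients of a function at a point.
def one_sided_quotients(func, x, h):
    """Return the (left, right) difference quotients of func at x for a step h > 0."""
    right = (func(x + h) - func(x)) / h
    left = (func(x - h) - func(x)) / (-h)
    return left, right

# For |x| at x = 0 the left quotient stays at -1 and the right quotient at +1.
for h in [1e-1, 1e-3, 1e-6]:
    left, right = one_sided_quotients(abs, 0.0, h)
    print(f"h = {h:g}: left quotient = {left:+.1f}, right quotient = {right:+.1f}")
```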

Unfortunately, not all functions are differentiable, including many we wish to optimise, such as the accuracy and the AUC of a classifier. When this is the case, we often optimise a differentiable _surrogate_ instead.

In [4]:
# Define u = f(x) = 3x^2 - 4x

def f(x):
    return 3*x**2 - 4 * x

In [14]:
x = 1
for h in 10.0 ** np.arange(1, -7, -1):
    # h is the step size; the last value printed is the difference quotient, which approaches f'(x) as h shrinks
    print(f"f(x + h): {f(x + h):3.5f}. h: {h:g}. Difference quotient: {(f(x + h) - f(x)) / h:2.5f}")

f(x + h): 319.00000. h: 10. Difference quotient: 32.00000
f(x + h): 4.00000. h: 1. Difference quotient: 5.00000
f(x + h): -0.77000. h: 0.1. Difference quotient: 2.30000
f(x + h): -0.97970. h: 0.01. Difference quotient: 2.03000
f(x + h): -0.99800. h: 0.001. Difference quotient: 2.00300
f(x + h): -0.99980. h: 0.0001. Difference quotient: 2.00030
f(x + h): -0.99998. h: 1e-05. Difference quotient: 2.00003
f(x + h): -1.00000. h: 1e-06. Difference quotient: 2.00000
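
The pattern in the last column of the output above can be worked out by hand. As a short derivation for $f(x) = 3x^2 - 4x$, expanding the difference quotient gives

$$\frac{f(x+h)-f(x)}{h} = \frac{3(x+h)^2 - 4(x+h) - (3x^2 - 4x)}{h} = \frac{6xh + 3h^2 - 4h}{h} = 6x - 4 + 3h$$

At $x = 1$ this equals $2 + 3h$, which matches the printed values (e.g. $2.3$ for $h = 0.1$ and $2.03$ for $h = 0.01$). Letting $h \rightarrow 0$ gives $f'(x) = 6x - 4$, so $f'(1) = 2$.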


There are several equivalent notations for derivatives. Given $y = f(x)$, the following expressions are equivalent:

$$f'(x) = y' = \frac{dy}{dx} = \frac{df}{dx} = \frac{d}{dx}f(x) = Df(x) = D_x f(x)$$

Here, the symbols $D$ and $\frac{d}{dx}$ are _differentiation operators_.

Some common derivatives are:

$$\frac{d}{dx} C = 0$$
$$\frac{d}{dx} x^n = nx^{n-1}$$
$$\frac{d}{dx} e^x = e^x$$
$$\frac{d}{dx} \ln x = 1/x$$
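
These rules can be sanity-checked numerically. The sketch below is only an illustration: `central_difference` is a hypothetical helper, and the evaluation point $x = 2$, the exponent $n = 3$ and the constant $5$ are arbitrary choices. It uses a symmetric difference quotient, which is a different (and usually more accurate) approximation than the one-sided quotient in the definition above.

```python
import math

# Hypothetical helper (not part of the original notebook): symmetric
# difference quotient approximating func'(x).
def central_difference(func, x, h=1e-5):
    return (func(x + h) - func(x - h)) / (2 * h)

x, n = 2.0, 3  # arbitrary evaluation point and exponent, chosen for illustration

# Each entry pairs a function with the derivative value its rule predicts at x.
checks = {
    "C": (lambda t: 5.0, 0.0),
    "x^n": (lambda t: t ** n, n * x ** (n - 1)),
    "e^x": (math.exp, math.exp(x)),
    "ln x": (math.log, 1 / x),
}

for name, (func, expected) in checks.items():
    estimate = central_difference(func, x)
    print(f"{name:>4}: numerical = {estimate:.6f}, rule = {expected:.6f}")
```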

Functions composed of other differentiable functions are often differentiable themselves. There are a few rules for working with these. In the following examples, $f$ and $g$ are differentiable functions, while $C$ is a constant.

Constant Multiple Rule:
$$ \frac{d}{dx}[Cf(x)] = C \frac{d}{dx}f(x)$$

Sum Rule:
$$ \frac{d}{dx}[f(x) + g(x)] = \frac{d}{dx}f(x) + \frac{d}{dx}g(x)$$ 

Product Rule:
$$ \frac{d}{dx}[f(x)g(x)] = g(x)\frac{d}{dx}f(x) + f(x)\frac{d}{dx}g(x)$$

Quotient Rule (where $g(x) \neq 0$):
$$ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g(x) \frac{d}{dx} f(x) - f(x) \frac{d}{dx} g(x)}{g^2(x)}$$
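
As a rough numerical check of the product and quotient rules, the sketch below compares a symmetric difference quotient with the value each rule predicts. The helpers `central_difference` and `check_rules`, the example functions $f(x) = x^2$ and $g(x) = e^x$, and the point $x = 1.5$ are all assumptions made for this illustration. Note that the quotient rule has a minus sign in its numerator.

```python
import math

# Hypothetical helper (not part of the original notebook): symmetric
# difference quotient approximating func'(x).
def central_difference(func, x, h=1e-5):
    return (func(x + h) - func(x - h)) / (2 * h)

def check_rules(x):
    """Compare numerical derivatives with the product and quotient rules at x."""
    # Example functions and their known derivatives, chosen for illustration.
    f, df = (lambda t: t ** 2), (lambda t: 2 * t)
    g, dg = math.exp, math.exp

    product_numeric = central_difference(lambda t: f(t) * g(t), x)
    product_rule = g(x) * df(x) + f(x) * dg(x)

    quotient_numeric = central_difference(lambda t: f(t) / g(t), x)
    quotient_rule = (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2

    return (product_numeric, product_rule), (quotient_numeric, quotient_rule)

(prod_num, prod_rule), (quot_num, quot_rule) = check_rules(1.5)
print(f"Product:  numerical = {prod_num:.6f}, rule = {prod_rule:.6f}")
print(f"Quotient: numerical = {quot_num:.6f}, rule = {quot_rule:.6f}")
```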

## Visualisation utilities