### Derivatives & Differentiation

#### Derivative

- The graph of a nonlinear function is a curve (any shape other than a straight line).
- To approximate the rate of change at any point on the curve, we can draw a tangent.
- A tangent is a straight line that just touches the curve at a given point and has the same direction as the curve there.
- The derivative is the slope of the tangent line at that point.
- So the derivative tells us the slope of a function at a given point.

**The derivative of function $f$ at a point** $x = a$, \
denoted by $f^\prime(a)$, is $f^\prime(a) = \lim\limits_{h \to 0} \frac{f(a+h)-f(a)}{h}$, \
provided the limit exists. If the limit exists, we say $f$ is differentiable at $x = a$.


For example: \
$f(x) = x^2 + 1$ and $a = 2$

\begin{aligned}
f^\prime(a) = \lim\limits_{h \to 0} \frac{f(a+h)-f(a)}{h} \\
f^\prime(2) = \lim\limits_{h \to 0} \frac{f(2+h)-f(2)}{h} \\
f^\prime(2) = \lim\limits_{h \to 0} \frac{((2+h)^2 + 1)-(2^2 + 1)}{h} \\
f^\prime(2) = \lim\limits_{h \to 0} \frac{4 + 4h + h^2 + 1-5}{h} \\
f^\prime(2) = \lim\limits_{h \to 0} \frac{4h + h^2}{h} = \lim\limits_{h \to 0} \frac{h(4+h)}{h} \\
f^\prime(2) = \lim\limits_{h \to 0} (4 + h) = 4 \\
\end{aligned}
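
As a quick numerical sanity check of this limit, the following minimal sketch evaluates the difference quotient for shrinking values of $h$. The helper name `difference_quotient` is an illustrative assumption, not part of any library.

```python
def f(x):
    return x ** 2 + 1

def difference_quotient(func, a, h):
    """Slope of the secant line through (a, func(a)) and (a + h, func(a + h))."""
    return (func(a + h) - func(a)) / h

# As h shrinks, the quotient should approach f'(2) = 4
for h in [1.0, 0.1, 0.01, 0.001]:
    print(h, difference_quotient(f, 2.0, h))
```

For this particular $f$ the quotient equals $4 + h$ (up to floating-point rounding), so the printed values move toward 4 as $h$ gets smaller.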

**Note**: *There are easier ways to calculate derivatives than evaluating this limit directly: the differentiation rules.*

##### Application of Derivatives in Machine Learning
- Gradient Descent: an iterative optimization technique that uses derivatives to adjust a set of parameters so that an objective function is minimized (or, in the ascent variant, maximized).
- Other applications include linear regression, logistic regression, and neural networks.
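
To make the idea of gradient descent concrete, here is a minimal one-variable sketch. It assumes the objective $f(x) = x^2 + 1$ with derivative $2x$; the function names, starting point, learning rate, and step count are illustrative choices, not values from any particular library or model.

```python
def objective(x):
    return x ** 2 + 1

def objective_derivative(x):
    # Derivative of x**2 + 1
    return 2 * x

def gradient_descent(derivative, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the slope to move toward a minimum."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * derivative(x)
    return x

x_min = gradient_descent(objective_derivative, start=5.0)
print(x_min, objective(x_min))
```

Each step moves $x$ in the direction opposite to the slope, so $x$ drifts toward $0$, where this objective takes its smallest value of $1$.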

The derivative of $y = f(x)$, i.e. $f^\prime(x)$, can also be written as $\frac{dy}{dx}$. It tells us the slope of the line tangent to $f(x)$ at every point.

Because $f^\prime(x)$ is a function of $x$, it also has a derivative called the **Second Derivative** expressed as $f^{\prime \prime}(x)$ or $\frac{d^2y}{dx^2}$. This tells us how $f^\prime(x)$ changes at each $x$.

Higher-order derivatives: the second derivative and every derivative of higher order (third, fourth, and so on) are referred to as higher-order derivatives.


#### Seven Basic Rules of Derivatives

##### 1. The Constant Rule: 
The derivative of a constant $c$ is zero: \
Let $c$ be a constant. If $f(x) = c$, then $f^\prime(x) = 0$. For example, if $f(x) = 7$, then $f^\prime(x) = 0$.

##### 2. The Power Rule:

The Power Rule gives the derivative of a power of $x$. Let $n$ be a positive integer. If $f(x) = x^n$, then $f^\prime(x) = nx^{n-1}$. For example:

- $f(x) = x^4$, therefore $f^\prime(x) = 4x^3$
- $f(x) = x^7$, therefore $f^\prime(x) = 7x^6$
- $f(x) = x^1$, therefore $f^\prime(x) = 1x^0 = 1$

The Power Rule can be broken down into these three steps:
- Take the power and put it in front of the term, next to the coefficient
- Reduce the power by 1
- Multiply the old power by the coefficient to get the new coefficient
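
The three steps can be sketched as a tiny helper that differentiates a single term $c\,x^n$. The function name `power_rule` and its tuple return value are assumptions made for this illustration.

```python
def power_rule(coefficient, exponent):
    """Differentiate coefficient * x**exponent.

    Returns (new_coefficient, new_exponent).
    """
    # Bring the power down in front and multiply it by the coefficient
    new_coefficient = coefficient * exponent
    # Reduce the power by 1
    new_exponent = exponent - 1
    return new_coefficient, new_exponent

print(power_rule(1, 4))   # term x^4
print(power_rule(1, 7))   # term x^7
```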

##### 3. The Constant Multiple Rule:

For any function $f$ and any constant $c$:

\begin{aligned}
\frac{d}{dx}\left[ cf(x) \right] &= c \frac{d}{dx}\left[ f(x) \right] \\
\end{aligned}

For example:
- $f(x) = 5x^2$, therefore $f^\prime(x) = 5 \times 2 \times x^{2-1} = 10x$
- $f(x) = 12x^1$, therefore $f^\prime(x) = 12 \times 1 \times x^{1-1} = 12 \times 1 \times 1 = 12$

Note that constants in more advanced problems can appear as letters such as $c$, $k$, or $a$, $b$, $c$; the same rule applies as if they were numbers.

##### 4. The Sum Rule:

For any functions $f$ and $g$:

\begin{aligned}
\frac{d}{dx}\left[ f(x)+g(x) \right] &= \frac{d}{dx}\left[ f(x) \right] + \frac{d}{dx}\left[ g(x) \right] \\
\end{aligned}

For example:
- $f(x) = 5x^2, g(x) = 4x$, therefore $\frac{d}{dx}\left[ f(x)+g(x) \right] = \frac{d}{dx}\left[ 5x^2 \right] + \frac{d}{dx}\left[ 4x \right] = 10x + 4$
- $f(x) = 2x^5 - 4x^4 + 2x^3 - 5x^2 + 3x$, therefore $f^\prime(x) = 10x^4 - 16x^3 + 6x^2 - 10x + 3$

Hence we can say that the derivative of a polynomial of degree $n \geq 1$ is another polynomial of degree $n - 1$.
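
One way to see the degree drop is to store a polynomial as a list of coefficients and apply the power, constant multiple, and sum rules term by term. The representation (lowest power first) and the name `poly_derivative` are assumptions for this sketch.

```python
def poly_derivative(coeffs):
    """coeffs[i] is the coefficient of x**i; returns the derivative's coefficients."""
    # Each term c * x**i becomes (i * c) * x**(i - 1); the constant term drops out
    return [i * c for i, c in enumerate(coeffs)][1:]

# f(x) = 2x^5 - 4x^4 + 2x^3 - 5x^2 + 3x, lowest power first
f_coeffs = [0, 3, -5, 2, -4, 2]
df_coeffs = poly_derivative(f_coeffs)
print(df_coeffs)
```

The result has one fewer coefficient than the input, which reflects the degree going from $n$ to $n - 1$.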

In [16]:
from sympy import *
import numdifftools as nd

In [17]:
x = Symbol('x')             # To define variables we must use symbols
Fx = 7 * x ** 3 + x**2 + 1
Derivative(Fx, x)

Derivative(7*x**3 + x**2 + 1, x)

In [18]:
# You have to call the doit() function to calculate the derivative using sympy
Derivative(Fx, x).doit()

21*x**2 + 2*x

In [20]:
# To calculate a derivative at a specific point
def f(x):
    return 7 * x ** 3 + x**2 + 1

# Use numdifftools to calculate the derivative when x = 1
derivative = nd.Derivative(f)
x_value = 1.0
result = derivative(x_value)
print(result)

23.0


##### 5. The Product Rule:

\begin{aligned}
\frac{d}{dx}\left( f(x)g(x) \right) &= \left( \frac{d}{dx} f(x) \right) g(x) + f(x)\frac{d}{dx}g(x) \\
\end{aligned}

The derivative of a product is the derivative of the first function multiplied by the second function, plus the first function multiplied by the derivative of the second function.

For example:
- $f(x) = x^2 - 9$ and $g(x) = x + 1$

\begin{aligned}
\frac{d}{dx}\left( f(x)g(x) \right) &= (2x) \cdot (x+1) + (x^2-9) \cdot 1 = 3x^2 + 2x - 9 \\
\end{aligned}

- $f(x) = x^4 - 2x^2 + 1 $ and $ g(x) = 5x + 5$

\begin{aligned}
\frac{d}{dx}f(x) &= 4x^3 - 4x  \\
\frac{d}{dx}g(x) &= 5 \\
\end{aligned}

\begin{aligned}
\frac{d}{dx}\left( f(x)g(x) \right) &= (4x^3 - 4x) \cdot (5x + 5) + (x^4 - 2x^2 + 1) \cdot 5 \\
\frac{d}{dx}\left( f(x)g(x) \right) &= 20x^4+20x^3-20x^2-20x+5x^4-10x^2+5 \\
\frac{d}{dx}\left( f(x)g(x) \right) &= 25x^4+20x^3-30x^2-20x+5 \\
\end{aligned}
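
A hand expansion like this is easy to get wrong by a sign, so it can be worth checking with SymPy. The sketch below applies the product rule explicitly and compares it with differentiating the product directly; the variable names are illustrative.

```python
from sympy import Symbol, diff, expand

x = Symbol('x')
f = x ** 4 - 2 * x ** 2 + 1
g = 5 * x + 5

# Product rule: f'(x) * g(x) + f(x) * g'(x)
by_rule = expand(diff(f, x) * g + f * diff(g, x))
# Differentiate the product directly for comparison
direct = expand(diff(f * g, x))

print(by_rule)
print(by_rule == direct)
```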


##### 6. The Quotient Rule:

\begin{aligned}
\frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) &= \frac{\left( \frac{d}{dx}f(x) \right) g(x) - f(x)\frac{d}{dx}g(x)}{\left( g(x) \right)^2}\\
\end{aligned}

For example:
- $y = \frac{3x + 1}{5x + 2}$ which means $f(x) = 3x + 1$ and $g(x) = 5x + 2$

\begin{aligned}
\frac{d}{dx}f(x) &= 3, \quad \frac{d}{dx}g(x) = 5\\
\frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) &= \frac{3 \cdot (5x+2) - (3x+1) \cdot 5}{(5x+2)^2} = \frac{1}{(5x+2)^2}\\
\end{aligned}

- $y = \frac{5x^3}{x^2+4}$ which means $f(x) = 5x^3$ and $g(x) = x^2+4$

\begin{aligned}
\frac{d}{dx}f(x) &= 15x^2, \frac{d}{dx}g(x) = 2x\\
\frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) &= \frac{15x^2(x^2+4)-5x^3(2x)}{(x^2+4)^2} = \frac{5x^4+60x^2}{(x^2+4)^2}\\
\end{aligned}
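
The quotient formula can be checked the same way with SymPy. This sketch uses the first example, $f(x) = 3x + 1$ and $g(x) = 5x + 2$, and compares the formula with differentiating the fraction directly; the variable names are illustrative.

```python
from sympy import Symbol, diff, simplify

x = Symbol('x')
f = 3 * x + 1
g = 5 * x + 2

# Quotient formula: (f'(x) * g(x) - f(x) * g'(x)) / g(x)**2
by_rule = (diff(f, x) * g - f * diff(g, x)) / g ** 2
# Differentiate the fraction directly for comparison
direct = diff(f / g, x)

print(simplify(by_rule))
print(simplify(by_rule - direct) == 0)
```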


##### 7. The Chain Rule:

Sometimes the output of one function is the input of another (a composite function). \
If $p = p(c)$ and $c = c(m)$, then 

\begin{aligned}
\frac{dp}{dm} = \frac{dp}{dc} \cdot \frac{dc}{dm} \\
\end{aligned}

We are making a chain of derivative relationships.

For example:

- $p(c) = -\frac{1}{7}c^2+c+\frac{1}{6}$ \
$c(m) = \frac{1}{3}m$

\begin{aligned}
p(c(m)) &= -\frac{1}{7}\left(\frac{1}{3}m\right)^2+\frac{1}{3}m+\frac{1}{6} = -\frac{1}{63}m^2+\frac{1}{3}m+\frac{1}{6} \\
\frac{dp}{dm} &= -\frac{2}{63}m+\frac{1}{3}  \\
\end{aligned}

Alternatively (an easier way), using the chain rule:

\begin{aligned}
\frac{dp}{dc} &= -\frac{2}{7}c+1 \\
\frac{dc}{dm} &= \frac{1}{3} \\
\frac{dp}{dm} &= \frac{dp}{dc} \cdot \frac{dc}{dm} = \left(-\frac{2}{7}c+1\right) \cdot \frac{1}{3} = -\frac{2}{21}c+\frac{1}{3} \\
\end{aligned}

If you want, you can substitute $c = \frac{1}{3}m$ into the equation to express the result in terms of $m$:

\begin{aligned}
\frac{dp}{dm} &= -\frac{2}{21} \cdot \frac{1}{3}m+\frac{1}{3} \\
\frac{dp}{dm} &= -\frac{2}{63}m+\frac{1}{3} \\
\end{aligned}
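
Both routes can be reproduced with SymPy as a check. The sketch below builds $p(c)$ and $c(m)$ symbolically, applies the chain rule, and compares it with composing first and differentiating afterwards; the names `c_of_m`, `by_chain`, and `direct` are illustrative.

```python
from sympy import Rational, symbols, diff, simplify

c, m = symbols('c m')
p = -Rational(1, 7) * c ** 2 + c + Rational(1, 6)   # p(c)
c_of_m = m / 3                                      # c(m)

# Chain rule: dp/dm = dp/dc * dc/dm, then substitute c = c(m)
by_chain = (diff(p, c) * diff(c_of_m, m)).subs(c, c_of_m)
# Direct route: compose first, then differentiate with respect to m
direct = diff(p.subs(c, c_of_m), m)

print(simplify(by_chain))
print(simplify(by_chain - direct) == 0)
```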


##### The Power Rule on a Function Chain:

\begin{aligned}
\frac{d}{dx}\left( f(x) \right)^n &= n\left( f(x) \right)^{n-1} \cdot \frac{d}{dx}f(x) \\
\end{aligned}

For example: \
$y = (4x-1)^3$ \
$f(x) = 4x - 1$ (the inner function) and $g(x) = x^3$ (the outer function), so $y = g(f(x))$ \
$y^\prime = 3 \cdot (4x-1)^2 \cdot 4$ \
$y^\prime = 12(4x-1)^2$
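
As a check on this example, the following SymPy sketch applies the rule $n\,(f(x))^{n-1} \cdot f^\prime(x)$ by hand and compares it with differentiating $y$ directly. The variable names are illustrative.

```python
from sympy import Symbol, diff, expand

x = Symbol('x')
inner = 4 * x - 1        # f(x), the inner function
y = inner ** 3           # y = f(x)**3

# Power rule on a function chain: n * f(x)**(n - 1) * f'(x)
by_rule = 3 * inner ** 2 * diff(inner, x)
# Let SymPy differentiate the composite expression directly
direct = diff(y, x)

print(by_rule)
print(expand(by_rule - direct) == 0)
```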