# Mathematics for Machine Learning

## Session 17: Differentiation (cont.)

### Gerhard Jäger


December 19, 2024

---
<br><br>

<small>Most material taken from Chapter 2 of Keisler, H. Jerome. *Elementary Calculus: An Infinitesimal Approach*. 2012.</small><br>
<small>Applets programmed with the help of ChatGPT</small>

```python
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact
```


## Increment Theorem

Let $y = f(x)$. Suppose $f'(x)$ exists at a certain point $x$, and $\Delta x$ is infinitesimal. Then $\Delta y$ is infinitesimal, and:

$$
\Delta y = f'(x)\Delta x + \epsilon \Delta x
$$

for some infinitesimal $\epsilon$, which depends on $x$ and $\Delta x$.



### Proof

#### Case 1: $\Delta x = 0$
In this case, $\Delta y = f'(x)\Delta x = 0$, and we put $\epsilon = 0$.

#### Case 2: $\Delta x \neq 0$
Then:

$$
\frac{\Delta y}{\Delta x} \approx f'(x),
$$

so for some infinitesimal $\epsilon$:

$$
\frac{\Delta y}{\Delta x} = f'(x) + \epsilon.
$$

Multiplying both sides by $\Delta x$:

$$
\Delta y = f'(x)\Delta x + \epsilon \Delta x.
$$
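
The theorem can be illustrated numerically. Floating-point numbers contain no infinitesimals, so small real values of $\Delta x$ stand in for them. The function $f(x) = x^3$ and the helper name `epsilon` are illustrative assumptions, not part of the source material. The sketch solves the increment equation for $\epsilon$ and shows it shrinking together with $\Delta x$.

```python
def f(x):
    return x**3              # illustrative choice of f (assumption)

def f_prime(x):
    return 3 * x**2          # known derivative of x**3

def epsilon(x, dx):
    """Solve  Delta y = f'(x) * dx + eps * dx  for eps, given a nonzero real dx."""
    dy = f(x + dx) - f(x)
    return dy / dx - f_prime(x)

x = 2.0
steps = [10.0**-k for k in range(1, 6)]          # 1e-1, 1e-2, ..., 1e-5
eps_values = [epsilon(x, dx) for dx in steps]

for dx, eps in zip(steps, eps_values):
    print(f"dx = {dx:.0e}:  epsilon = {eps:.8f}")
```

For this $f$, algebra gives $\epsilon = 3x\Delta x + \Delta x^2$, so the printed values should shrink roughly in proportion to $\Delta x$.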


## Differentials

**DEFINITION**

Suppose $y$ depends on $x$, $y = f(x)$.

(i) The **differential** of $x$ is the independent variable ${dx} = \Delta x$.  
(ii) The **differential** of $y$ is the dependent variable ${dy}$ given by  
$$
{dy} = f'(x) \, {dx}.
$$

When ${dx} \neq 0$, the equation above may be rewritten as  
$$
\frac{{dy}}{{dx}} = f'(x).
$$

Compare this equation with  
$$
\frac{\Delta y}{\Delta x} \approx f'(x).
$$

The quotient ${dy}/{dx}$ is a very convenient alternative symbol for the derivative $f'(x)$.  

The differential ${dy}$ depends on two independent variables $x$ and ${dx}$. In functional notation,  
$$
{dy} = {d}f(x, {dx}),
$$
where $df$ is the real function of two variables defined by  
$$
{d}f(x, {dx}) = f'(x) {dx}.
$$
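
To make the two-variable view concrete, here is a minimal sketch of $df$ as an ordinary function of $x$ and $dx$, compared with the actual increment $\Delta y$. The choice $f(x) = \sin x$ and the names `df` and `delta_f` are assumptions made for illustration, and the values of $dx$ are small reals rather than infinitesimals.

```python
import math

def f(x):
    return math.sin(x)        # illustrative choice of f (assumption)

def f_prime(x):
    return math.cos(x)        # its known derivative

def df(x, dx):
    """Differential of f as a function of two variables: df(x, dx) = f'(x) * dx."""
    return f_prime(x) * dx

def delta_f(x, dx):
    """Actual increment: f(x + dx) - f(x)."""
    return f(x + dx) - f(x)

x = 1.0
for dx in (0.1, 0.01, 0.001):
    print(f"dx = {dx}:  dy = {df(x, dx):.8f}   Delta y = {delta_f(x, dx):.8f}")
```

Note that `df` is linear in `dx`, whereas `delta_f` is not; the two agree ever more closely as `dx` shrinks.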


**THEOREM**

The derivative of a linear function is equal to the coefficient of $x$. That is,

$$
\frac{d(bx + c)}{dx} = b, \quad d(bx + c) = b \, dx.
$$

**PROOF** Let $y = bx + c$, and let $\Delta x \neq 0$ be infinitesimal. Then:

$$
y + \Delta y = b(x + \Delta x) + c,
$$

$$
\Delta y = (b(x + \Delta x) + c) - (bx + c) = b \Delta x,
$$

$$
\frac{\Delta y}{\Delta x} = \frac{b \Delta x}{\Delta x} = b.
$$

Therefore:

$$
\frac{dy}{dx} = \text{st}(b) = b.
$$


**THEOREM (Sum Rule)**

Suppose $u$ and $v$ depend on the independent variable $x$. Then for any value of $x$ where $\frac{du}{dx}$ and $\frac{dv}{dx}$ exist:

$$
\frac{d(u + v)}{dx} = \frac{du}{dx} + \frac{dv}{dx}, \quad d(u + v) = du + dv.
$$

In other words, the derivative of the sum is the sum of the derivatives.

---

**PROOF**  
Let $y = u + v$, and let $\Delta x \neq 0$ be infinitesimal. Then:

$$
y + \Delta y = (u + \Delta u) + (v + \Delta v),
$$

$$
\Delta y = [(u + \Delta u) + (v + \Delta v)] - [u + v] = \Delta u + \Delta v,
$$

$$
\frac{\Delta y}{\Delta x} = \frac{\Delta u + \Delta v}{\Delta x} = \frac{\Delta u}{\Delta x} + \frac{\Delta v}{\Delta x}.
$$

Taking standard parts:

$$
\text{st}\left(\frac{\Delta y}{\Delta x}\right) = \text{st}\left(\frac{\Delta u}{\Delta x} + \frac{\Delta v}{\Delta x}\right) = \text{st}\left(\frac{\Delta u}{\Delta x}\right) + \text{st}\left(\frac{\Delta v}{\Delta x}\right).
$$

Thus:

$$
\frac{dy}{dx} = \frac{du}{dx} + \frac{dv}{dx}.
$$


**THEOREM (Constant Rule)**

Suppose $u$ depends on $x$, and $c$ is a real number. Then for any value of $x$ where $\frac{du}{dx}$ exists:

$$
\frac{d(cu)}{dx} = c \frac{du}{dx}, \quad d(cu) = c \, du.
$$

---

**PROOF**  
Let $y = cu$, and let $\Delta x \neq 0$ be infinitesimal. Then:

$$
y + \Delta y = c(u + \Delta u),
$$

$$
\Delta y = c(u + \Delta u) - cu = c \, \Delta u,
$$

$$
\frac{\Delta y}{\Delta x} = \frac{c \, \Delta u}{\Delta x} = c \frac{\Delta u}{\Delta x}.
$$

Taking standard parts:

$$
\text{st}\left(\frac{\Delta y}{\Delta x}\right) = \text{st}\left(c \frac{\Delta u}{\Delta x}\right) = c \, \text{st}\left(\frac{\Delta u}{\Delta x}\right).
$$

Thus:

$$
\frac{dy}{dx} = c \frac{du}{dx}.
$$
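
The Sum Rule and the Constant Rule can be spot-checked numerically. The sketch below assumes $u = \sin$, $v = \exp$ and $c = 3$ purely for illustration, and uses a hypothetical helper `derivative` that approximates the derivative with a central difference over a small real step.

```python
import math

def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x); h is a small real step, not an infinitesimal."""
    return (g(x + h) - g(x - h)) / (2 * h)

u = math.sin      # illustrative choices (assumptions)
v = math.exp
c = 3.0
x = 0.7

# Sum Rule: d(u + v)/dx versus du/dx + dv/dx
sum_lhs = derivative(lambda t: u(t) + v(t), x)
sum_rhs = derivative(u, x) + derivative(v, x)

# Constant Rule: d(cu)/dx versus c * du/dx
const_lhs = derivative(lambda t: c * u(t), x)
const_rhs = c * derivative(u, x)

print(f"sum rule:      {sum_lhs:.8f} vs {sum_rhs:.8f}")
print(f"constant rule: {const_lhs:.8f} vs {const_rhs:.8f}")
```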


**THEOREM (Product Rule)**

Suppose $u$ and $v$ depend on $x$. Then for any value of $x$ where $\frac{du}{dx}$ and $\frac{dv}{dx}$ exist:

$$
\frac{d(uv)}{dx} = u \frac{dv}{dx} + v \frac{du}{dx}, \quad d(uv) = u \, dv + v \, du.
$$

---

**PROOF**  
Let $y = uv$, and let $\Delta x \neq 0$ be infinitesimal. Then:

$$
y + \Delta y = (u + \Delta u)(v + \Delta v),
$$

$$
\Delta y = (u + \Delta u)(v + \Delta v) - uv = u \Delta v + v \Delta u + \Delta u \Delta v,
$$

$$
\frac{\Delta y}{\Delta x} = \frac{u \Delta v + v \Delta u + \Delta u \Delta v}{\Delta x} = u \frac{\Delta v}{\Delta x} + v \frac{\Delta u}{\Delta x} + \frac{\Delta u \Delta v}{\Delta x}.
$$

$\Delta u$ is infinitesimal by the Increment Theorem, whence:

$$
\text{st}\left(\frac{\Delta y}{\Delta x}\right) = \text{st}\left(u \frac{\Delta v}{\Delta x} + v \frac{\Delta u}{\Delta x} + \frac{\Delta u \Delta v}{\Delta x}\right),
$$

$$
= u \cdot \text{st}\left(\frac{\Delta v}{\Delta x}\right) + v \cdot \text{st}\left(\frac{\Delta u}{\Delta x}\right) + 0 \cdot \text{st}\left(\frac{\Delta v}{\Delta x}\right).
$$

Thus:

$$
\frac{dy}{dx} = u \frac{dv}{dx} + v \frac{du}{dx}.
$$
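
The role of the cross term $\frac{\Delta u \Delta v}{\Delta x}$ in the proof can be observed numerically. This sketch assumes $u = x^2$ and $v = x^3$ as examples (so $uv = x^5$), and the function name `product_increment_terms` is hypothetical. It splits $\frac{\Delta y}{\Delta x}$ into the three terms from the proof for shrinking real $\Delta x$.

```python
def u(x):
    return x**2      # illustrative choice (assumption)

def v(x):
    return x**3      # illustrative choice (assumption)

def product_increment_terms(x, dx):
    """Return the three terms of Delta y / Delta x for y = u * v."""
    du = u(x + dx) - u(x)
    dv = v(x + dx) - v(x)
    return u(x) * dv / dx, v(x) * du / dx, du * dv / dx

x = 1.5
exact = 5 * x**4     # derivative of x**5

for dx in (1e-1, 1e-3, 1e-5):
    t1, t2, cross = product_increment_terms(x, dx)
    print(f"dx = {dx:.0e}:  u*dv/dx + v*du/dx = {t1 + t2:.6f}   cross term = {cross:.6f}")

print(f"exact derivative of x**5 at x = {x}: {exact:.6f}")
```

The first two terms approach $u \frac{dv}{dx} + v \frac{du}{dx}$, while the cross term shrinks in proportion to $\Delta x$.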


## Example: $y = x^4$

We can find the result by using $\frac{dx^2}{dx} = 2x$ and the product rule.

$$
\begin{align}
\frac{dx^4}{dx} &= \frac{d(x^2 \times x^2)}{dx}\\
&= x^2\frac{dx^2}{dx} + x^2\frac{dx^2}{dx}\\
&= 2(x^2\frac{dx^2}{dx})\\
&= 2(x^2 \times 2x)\\
&= 4x^3
\end{align}
$$

## Generalization: $y = x^n$ for positive integers $n$

From the examples we have seen $(x^2, x^3, x^4)$, it seems there is a general rule

$$
\frac{dx^n}{dx} = nx^{n-1}
$$

Let's prove this. We will do this via **induction**.

## PRINCIPLE OF INDUCTION

*Suppose a statement P(n) about an arbitrary positive integer n is true when n = 1. Suppose further that for any positive integer m such that P(m) is true, P(m + 1) is also true. Then the statement P(n) is true of every positive integer n.*


- Base case: $n=1$

$$
\begin{align}
\frac{dx^1}{dx} &= 1 \\
&= 1x^0
\end{align}
$$

- Inductive step: from $n$ to $n+1$

    Suppose it holds that $\frac{dx^n}{dx} = nx^{n-1}$. We have to show that the law also holds for $n+1$, i.e., that $\frac{dx^{n+1}}{dx} = (n+1)x^n$.
    
    $$
    \begin{align}
    \frac{dx^{n+1}}{dx} &= \frac{d(x\times x^{n})}{dx}\\
    &= x\times nx^{n-1} + x^n & \text{(product rule)}\\
    &= nx^n + x^n\\
    &= (n+1)x^n
    \end{align}
    $$
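
The induction argument translates directly into a recursive computation. The following sketch (the function name `power_derivative` is hypothetical) evaluates the derivative of $x^n$ using only $\frac{dx}{dx} = 1$ and the product rule, and compares the result with $nx^{n-1}$.

```python
def power_derivative(n, x):
    """Derivative of x**n at x, using only d(x)/dx = 1 and the product rule."""
    if n == 1:
        return 1.0                               # base case: d(x)/dx = 1
    # x**n = x * x**(n-1); the product rule gives
    # x * d(x**(n-1))/dx + x**(n-1) * d(x)/dx
    return x * power_derivative(n - 1, x) + x ** (n - 1)

x = 2.0
for n in range(1, 7):
    print(f"n = {n}:  recursion = {power_derivative(n, x)}   n*x**(n-1) = {n * x ** (n - 1)}")
```

Each recursive call corresponds to one application of the inductive step, and the recursion bottoms out at the base case $n = 1$.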

**THEOREM (Quotient Rule)**

Suppose $u, v$ depend on $x$. Then for any value of $x$ where $\frac{du}{dx}$, $\frac{dv}{dx}$ exist and $v \neq 0$:

$$
\frac{d\left(\frac{u}{v}\right)}{dx} = \frac{v \frac{du}{dx} - u \frac{dv}{dx}}{v^2}, \quad d\left(\frac{u}{v}\right) = \frac{v \, du - u \, dv}{v^2}.
$$




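**PROOF**  
Let $y = \frac{u}{v}$, and let $\Delta x \neq 0$ be infinitesimal. By the Increment Theorem, $\Delta v$ is infinitesimal, so $v + \Delta v \neq 0$ and $\text{st}(v^2 + v\Delta v) = v^2$. Then: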

$$
\begin{align}
y &= \frac{u}{v}\\
y + \Delta y &= \frac{u+\Delta u}{v+\Delta v}\\
\Delta y &= \frac{u+\Delta u}{v+\Delta v} - \frac{u}{v}\\
&= \frac{v(u+\Delta u) - u(v+\Delta v)}{v(v+\Delta v)}\\
&= \frac{uv+v\Delta u - uv - u \Delta v}{v^2+v\Delta v}\\
&= \frac{v\Delta u  - u \Delta v}{v^2+v\Delta v}\\
\frac{\Delta y}{\Delta x} &= \frac{v\frac{\Delta u}{\Delta x}  - u \frac{\Delta v}{\Delta x}}{v^2+v\Delta v}\\
\frac{dy}{dx} &= \text{st}\left(\frac{\Delta y}{\Delta x}\right)\\
&= \text{st}\left(\frac{v\frac{\Delta u}{\Delta x}  - u \frac{\Delta v}{\Delta x}}{v^2+v\Delta v}\right)\\
&= \frac{v\frac{d u}{d x}  - u \frac{d v}{d x}}{v^2}
\end{align}
$$
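
A quick numerical check of the Quotient Rule, under assumed example functions $u = \sin x$ and $v = x^2 + 1$ (chosen so that $v \neq 0$ everywhere). The helper `derivative` is a hypothetical central-difference approximation with a small real step.

```python
import math

def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x); h is a small real step, not an infinitesimal."""
    return (g(x + h) - g(x - h)) / (2 * h)

def u(x):
    return math.sin(x)       # illustrative choice (assumption)

def v(x):
    return x**2 + 1.0        # illustrative choice, never zero

x = 0.8
du_dx = math.cos(x)          # known derivative of u
dv_dx = 2 * x                # known derivative of v

numeric = derivative(lambda t: u(t) / v(t), x)
by_rule = (v(x) * du_dx - u(x) * dv_dx) / v(x) ** 2

print(f"numerical estimate: {numeric:.8f}")
print(f"quotient rule:      {by_rule:.8f}")
```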

**THEOREM (Chain Rule)**

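Suppose $u$ depends on $v$, and $v$ depends on $x$, so that $y = u(v(x))$ depends on $x$. Then for any value of $x$ where $\frac{du}{dv}$ and $\frac{dv}{dx}$ exist:

$$
\frac{dy}{dx} = \frac{du}{dv} \frac{dv}{dx}.
$$

**PROOF**

Let $\Delta x \neq 0$ be infinitesimal. Then:

$$
\begin{align}
y &= u(v(x))\\
y + \Delta y &= u(v(x+\Delta x))
\end{align}
$$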


According to the Increment Theorem:
$$
v(x+\Delta x) = v(x) + \frac{dv}{dx}\Delta x + \varepsilon \Delta x
$$
for some infinitesimal $\varepsilon$. Therefore
$$
\begin{align}
y + \Delta y &= u(v(x) + \frac{dv}{dx}\Delta x + \varepsilon\Delta x)
\end{align}
$$
The Increment Theorem also entails
$$
u(z + \Delta z) = u(z) + \frac{du}{dz}\Delta z + \delta \Delta z
$$
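for some infinitesimal $\delta$, provided $\Delta z$ is infinitesimal. Let $z = v(x)$, so that $\frac{du}{dz} = \frac{du}{dv}$, and let $\Delta z=\frac{dv}{dx}\Delta x + \varepsilon\Delta x$, which is infinitesimal. We get
$$
\begin{align}
y + \Delta y &= u(v(x)) + \frac{du}{dv}\left(\frac{dv}{dx}\Delta x + \varepsilon\Delta x\right) + \delta \left(\frac{dv}{dx}\Delta x + \varepsilon\Delta x\right)\\
\Delta y &= \frac{du}{dv}\left(\frac{dv}{dx}\Delta x + \varepsilon\Delta x\right) + \delta \left(\frac{dv}{dx}\Delta x + \varepsilon\Delta x\right)\\
\frac{\Delta y}{\Delta x} &= \frac{du}{dv}\left(\frac{dv}{dx} + \varepsilon\right) + \delta \left(\frac{dv}{dx} + \varepsilon\right)\\
\frac{dy}{dx} &= \text{st}\left(\frac{\Delta y}{\Delta x}\right)\\
&= \text{st}\left(\frac{du}{dv}\left(\frac{dv}{dx} + \varepsilon\right) + \delta \left(\frac{dv}{dx} + \varepsilon\right)\right)\\
&= \frac{du}{dv}\frac{dv}{dx}
\end{align}
$$
The last step holds because $\varepsilon$ and $\delta$ are infinitesimal, while $\frac{du}{dv}$ and $\frac{dv}{dx}$ are real.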
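
As a sanity check of the Chain Rule, the sketch below assumes the example $u(v) = \sin v$ and $v(x) = x^2$, so that $y = \sin(x^2)$. These functions and the names `du_dv` and `dv_dx` are illustrative assumptions. It compares the product $\frac{du}{dv}\frac{dv}{dx}$ with a central-difference estimate of $\frac{dy}{dx}$ over a small real step.

```python
import math

def v(x):
    return x**2              # inner function (illustrative assumption)

def u(z):
    return math.sin(z)       # outer function (illustrative assumption)

def du_dv(z):
    return math.cos(z)       # known derivative of the outer function

def dv_dx(x):
    return 2 * x             # known derivative of the inner function

x = 1.2
h = 1e-6                     # small real step, standing in for an infinitesimal

# du/dv is evaluated at z = v(x), as in the proof
chain = du_dv(v(x)) * dv_dx(x)
numeric = (u(v(x + h)) - u(v(x - h))) / (2 * h)

print(f"chain rule:         {chain:.8f}")
print(f"numerical estimate: {numeric:.8f}")
```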


