# Differential Calculus

## Limits

### Convergence
* $\epsilon$ (**epsilon**): An arbitrarily small positive number

* Consider a sequence of numbers $a_1, a_2, a_3, \dots$. We say the **limit of the sequence** is $L$ if the numbers $a_n - L$ **converge** to zero, and we write:

\begin{equation}
a_n \to L\quad \text{  or  }  \quad\lim a_n = L\quad  \text{  or  }\quad  \lim\limits_{n \to \infty} a_n = L
\end{equation}

* (Definition of **Convergence**) We say the numbers $a_n$ converge to $L$ if the following condition is satisfied:

\begin{equation*}
             \text{  for any  }\;\epsilon \gt 0 \;\text{  there is an  }\; N \;\text{  such that  }\; |a_n - L| \lt\epsilon \quad\text{ if }\; n \gt N.
\end{equation*}
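
To make the definition concrete, here is a minimal sketch for the hypothetical sequence $a_n = 1/n$ with $L = 0$ (this sequence and the helper names `find_N` and `tail_is_within` are assumptions chosen for illustration). For each $\epsilon$ it produces an $N$ and spot-checks a finite stretch of the tail, which is a sanity check rather than a proof.

```python
import math


def find_N(epsilon: float) -> int:
    """Return an N such that |a_n - L| < epsilon for all n > N, with a_n = 1/n and L = 0."""
    # 1/n < epsilon exactly when n > 1/epsilon, so N = floor(1/epsilon) is enough.
    return math.floor(1 / epsilon)


def tail_is_within(epsilon: float, N: int, how_many: int = 1000) -> bool:
    """Spot-check the first `how_many` terms after index N (a check, not a proof)."""
    return all(abs(1 / n - 0) < epsilon for n in range(N + 1, N + 1 + how_many))


# Powers of two are used so that 1/epsilon is exact in floating point.
for eps in (0.5, 0.125, 0.0078125):
    N = find_N(eps)
    print(f"epsilon = {eps}: N = {N}, tail within epsilon: {tail_is_within(eps, N)}")
```

A smaller $\epsilon$ forces a larger $N$, which is exactly what the condition "for any $\epsilon$ there is an $N$" allows.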

### The Limit of $f(x)$ as $x \to a$
#### epsilon-delta definition of limits
* We say the limit of a function $f(x)$ as $x \to a$ is $L$, i.e., $\lim \limits_{x\to a} f(x) = L$, if the following condition is satisfied:

\begin{equation*}
\forall \epsilon \gt 0, \exists \delta \gt 0 \;\text{ such that }\; \forall x \in D \;\text{ if }\;0<|x-a|<\delta \;\text{ then }\;|f(x) - L| < \epsilon
\end{equation*}

where $D$ is the domain of the function $f(x)$.
* Informally, a provisional definition is: "a function $f$ approaches the limit $L$ near $a$ if we can make $f(x)$ as close as we like to $L$ by requiring that $x$ be sufficiently close to, but unequal to, $a$".
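
As a worked instance of the epsilon-delta condition, the sketch below uses the hypothetical example $f(x) = x^2$, $a = 2$, $L = 4$ (not taken from the notes above). The choice $\delta = \min(1, \epsilon/5)$ works because $|x - 2| < 1$ gives $|x + 2| < 5$, so $|x^2 - 4| = |x-2||x+2| < 5\delta \le \epsilon$. The code only samples points, so it illustrates the condition rather than proving it.

```python
def f(x: float) -> float:
    return x * x


def delta_for(epsilon: float) -> float:
    """A delta that works for f(x) = x^2 at a = 2 (assumed example)."""
    return min(1.0, epsilon / 5.0)


def limit_condition_holds(epsilon: float, samples: int = 1000) -> bool:
    """Sample x with 0 < |x - a| < delta and check |f(x) - L| < epsilon."""
    a, L = 2.0, 4.0
    delta = delta_for(epsilon)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - L) < epsilon:
                return False
    return True


for eps in (1.0, 0.1, 1e-3):
    print(f"epsilon = {eps}: delta = {delta_for(eps)}, holds on samples: {limit_condition_holds(eps)}")
```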

### The Squeeze Theorem
Suppose $f(x) \le g(x) \le h(x)$ for $x$ near $a$. If $f(x) \to L$ and $h(x) \to L$ as $x \to a$, then the limit of $g(x)$ is also $L$.
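
A standard illustration, sketched here as an assumption rather than an example from the source, is $g(x) = x^2 \sin(1/x)$ squeezed between $-x^2$ and $x^2$ as $x \to 0$. Both bounds tend to $0$, so $g(x) \to 0$ even though $\sin(1/x)$ itself has no limit at $0$.

```python
import math


def lower(x: float) -> float:
    return -x * x


def middle(x: float) -> float:
    return x * x * math.sin(1 / x)


def upper(x: float) -> float:
    return x * x


# Walk x toward 0 and watch all three values shrink together.
for k in range(1, 8):
    x = 10.0 ** (-k)
    lo, mid, hi = lower(x), middle(x), upper(x)
    assert lo <= mid <= hi
    print(f"x = {x:.0e}: {lo:.3e} <= {mid:.3e} <= {hi:.3e}")
```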

## Continuous Functions
### Definition of Continuity
> The function $f(x)$ is continuous at a point $x_0$ of its domain if for every positive $\epsilon$ we can find a positive number $\delta$ such that $|f(x)-f(x_0)| \lt \epsilon$ for all values $x$ in the domain of $f$ for which $|x-x_0| \lt \delta$. 


### When $f(x)$ is **continuous** at $x=a$:
1. The number $f(a)$ exists ($f$ is defined at $a$)
1. The limit of $f(x)$ as $x \to a$ exists (call it $L$)
1. The limit $L$ equals $f(a)$ ($f(a)$ is the right value)

The last two conditions can be combined as one: $f(x)\to f(a)\;\text{as}\;x \to a$

### If $f$ is continuous at every point where it is defined, it is a continuous function.

* N.B. This definition makes $f(x) = \frac{1}{x}$ a continuous function, although it has no finite limit at $x=0$. Since $f(x)$ is not defined at $x=0$, its continuity cannot fail there.

## Derivative
#### A central question of differential calculus is to know how fast the limit is approached. The speed of approach is exactly the information in the derivative.

### Differentiable Function
> Definition: A function $y = f(x)$ is differentiable at a point $x$ if the limit $\frac{dy}{dx}=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}$ exists, where $h$ can take any value $\neq0$ for which $x+h$ belongs to the domain of $f$. If $f$ is defined in a whole interval containing the point $x$ in its interior, then the limit must exist *independently of the manner in which $h$ tends to zero*.

* Differentiability expresses in a precise way what intuitively would be called **smoothness** of the graph of the function. 

#### Theorem: If a function is differentiable it is automatically continuous.
* At a point where $f(x)$ has a derivative, the function must be continuous at that point. 
* But $f(x)$ can be continuous at a point and still have no derivative there.
  * $f(x)=|x|$ has no derivative at $x=0$ (the left and right slopes are $-1$ and $+1$).
  * $f(x) = x^{\frac{1}{3}}$ has an infinite derivative (a vertical tangent) at $x=0$.
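
The behaviour of these examples can be seen numerically from the difference quotient $\frac{f(x+h)-f(x)}{h}$. The sketch below is illustrative only; the helper names `difference_quotient` and `cube_root`, and the comparison function $x^2$ at $x = 3$, are assumptions added here.

```python
import math


def difference_quotient(f, x: float, h: float) -> float:
    """(f(x + h) - f(x)) / h for a nonzero step h."""
    return (f(x + h) - f(x)) / h


def cube_root(x: float) -> float:
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


for h in (1e-2, 1e-4, 1e-6):
    print(
        f"h = {h:.0e}: "
        f"x^2 at 3 -> {difference_quotient(lambda x: x * x, 3.0, h):.6f}, "
        f"|x| at 0 -> {difference_quotient(abs, 0.0, h):+.1f} (right), "
        f"{difference_quotient(abs, 0.0, -h):+.1f} (left), "
        f"x^(1/3) at 0 -> {difference_quotient(cube_root, 0.0, h):.1f}"
    )
```

For $x^2$ the quotient settles near $6$. For $|x|$ the right and left quotients stay at $+1$ and $-1$, so no single limit exists. For $x^{1/3}$ the quotient equals $h^{-2/3}$ and grows without bound as $h \to 0$.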

## The three basic definitions of Calculus
* **Limit**: $n \to \infty \;\text{or}\; x \to a$
* **Continuity** at $x=a$:

\begin{equation}
f(a+\Delta x) - f(a) \to 0 \;\text{as}\; \Delta x \to 0
\end{equation}

* **Derivative** at $x=a$:

\begin{equation}
\frac{f(a+\Delta x) - f(a)}{\Delta x} \to f'(a) \;\text{as}\; \Delta x \to 0
\end{equation}


## Directional Derivative
### Definition
The derivative of $f(x,y)$ in the direction $\vec{u}$ (a unit vector) at the point $P$ is $D_uf(P)$:

\begin{equation}
D_uf(P) = f'(P;\vec{u}) = \lim \limits_{\Delta s \to 0} \frac{\Delta f}{\Delta s}=\lim \limits_{\Delta s \to 0}\frac{f(P+\vec{u}\Delta s)-f(P)}{\Delta s}
\end{equation}

The step from $P=(x_0, y_0)$ has length $\Delta s$ and takes us to $(x_0 + u_1\Delta s, y_0 + u_2 \Delta s)$.

### Calculation
The directional derivative $D_uf$ in the direction $\vec{u}=(u_1, u_2)$ equals

\begin{equation}
D_uf=\frac{\partial f}{\partial x}u_1 + \frac{\partial f}{\partial y}u_2
\end{equation}

N.B. The directional derivative is a scalar, not a vector.
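
The limit definition and the partial-derivative formula can be compared numerically. The sketch below assumes a hypothetical function $f(x,y) = x^2 + 3xy$, the point $P = (1, 2)$ and the unit vector $\vec{u} = (0.6, 0.8)$; none of these come from the notes, and the function names are illustrative.

```python
import math


def f(x: float, y: float) -> float:
    return x * x + 3.0 * x * y


def grad_f(x: float, y: float) -> tuple:
    """Partial derivatives of the assumed f, computed by hand: (2x + 3y, 3x)."""
    return (2.0 * x + 3.0 * y, 3.0 * x)


def directional_derivative_limit(func, P, u, ds: float = 1e-6) -> float:
    """Approximate the limit definition with a small but finite step ds."""
    x0, y0 = P
    u1, u2 = u
    return (func(x0 + u1 * ds, y0 + u2 * ds) - func(x0, y0)) / ds


def directional_derivative_formula(grad, P, u) -> float:
    """D_u f = f_x * u_1 + f_y * u_2."""
    fx, fy = grad(*P)
    return fx * u[0] + fy * u[1]


P = (1.0, 2.0)
u = (0.6, 0.8)  # unit vector: 0.6^2 + 0.8^2 = 1
assert math.isclose(math.hypot(*u), 1.0)

print("finite-step quotient:", directional_derivative_limit(f, P, u))
print("partial-derivative formula:", directional_derivative_formula(grad_f, P, u))
```

Both computations return a single number, and the two values agree up to an error of the order of the step `ds`.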

### The Gradient Vector
* The gradient of $f(x,y)$ is the vector whose components are $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$:
\begin{equation}
\operatorname{grad} f = \nabla f = \frac{\partial f}{\partial x}\vec{i} + \frac{\partial f}{\partial y}\vec{j}
\end{equation}
* The directional derivative is $D_u f=(\nabla f)\cdot \vec{u} = \nabla f^T \cdot \vec{u}$.
* A level direction (a direction along which $f$ stays constant) is perpendicular to $\nabla f$, since $D_uf = (\nabla f)\cdot \vec{u} = 0$ in that direction.
* The first-order approximation of $f$ near $P$ is:

\begin{align}
f(P+\vec{u}\Delta s) \approx f(P) + \Delta s \cdot f'(P; \vec{u}) \\
f(P+\vec{u}\Delta s) \approx f(P) + \Delta s \cdot \nabla f^T \cdot \vec{u}
\end{align}

* The slope $D_uf$ is largest when $\vec{u}$ is parallel to $\operatorname{grad} f$. That maximum slope is the length $|\operatorname{grad} f| = \sqrt{f_x^2 + f_y^2}$:

\begin{equation}
\text{for} \;\; \vec {u} = \frac{\nabla f}{|\nabla f|}, \;\; \text{the slope is }\; (\nabla f)\cdot \vec{u} = \frac{|\nabla f|^2}{|\nabla f|} = |\nabla f|
\end{equation}

N.B. This can be proved using the [Cauchy-Schwarz Inequality](https://en.wikipedia.org/wiki/Cauchy%E2%80%93Schwarz_inequality).

* Important point: The maximum of $(\nabla f)\cdot \vec{u}$ is the length $|\nabla f|$.
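
The claim that no unit vector gives a slope larger than $|\nabla f|$ can be checked by brute force. This is a minimal sketch assuming a hypothetical gradient $\nabla f = (8, 3)$ at some point; it scans unit vectors $\vec{u} = (\cos\theta, \sin\theta)$ on a grid of angles, so the maximum it finds is only as precise as the grid.

```python
import math

grad = (8.0, 3.0)  # hypothetical gradient at some point P
grad_length = math.hypot(*grad)


def slope(theta: float) -> float:
    """Directional derivative (grad f) . u for the unit vector u = (cos theta, sin theta)."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)


# Scan directions in steps of 0.1 degree.
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(angles, key=slope)

print("largest sampled slope:", slope(best_angle))
print("length of the gradient:", grad_length)
print("best sampled angle:", best_angle, "angle of the gradient:", math.atan2(grad[1], grad[0]))
```

The largest sampled slope sits just below $|\nabla f|$, and the best angle lands next to the direction of the gradient itself.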

### Applications
* If the directional derivative of $f$ at a point $P$ is negative in some direction $\vec{u}$, i.e. $f'(P;\vec{u}) \lt 0$, then there always exists a small enough $\Delta s \gt 0$ such that $f(P+\Delta s \cdot \vec{u}) \lt f(P)$.
* The gradient vector gives the steepest descent direction $-\nabla f$, along which $f$ decreases fastest. Following this direction repeatedly can be used to search for a minimum of the function.
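
Both points can be sketched with a few steps of fixed-step steepest descent. The quadratic $f(x,y) = (x-1)^2 + 2(y+2)^2$, the step size `0.1`, the starting point and the function names are all assumptions for illustration; a fixed step is not guaranteed to work for other functions.

```python
def f(x: float, y: float) -> float:
    """Assumed test function with its minimum at (1, -2)."""
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2


def grad_f(x: float, y: float) -> tuple:
    return (2.0 * (x - 1.0), 4.0 * (y + 2.0))


def steepest_descent(start, step: float = 0.1, iterations: int = 50):
    """Repeatedly move a short distance along -grad f, recording f after each move."""
    x, y = start
    values = [f(x, y)]
    for _ in range(iterations):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
        values.append(f(x, y))
    return (x, y), values


point, values = steepest_descent((0.0, 0.0))
print("final point:", point)
print("f at start:", values[0], "f at end:", values[-1])
```

Each step moves in a direction with negative directional derivative, so for this function and step size the recorded values of $f$ decrease at every iteration while the point approaches $(1, -2)$.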

## The Chain Rule
* A composite function $z = f \circ g$ is defined by $z(x) = f(g(x))$.

### Chain Rule
* Suppose $g(x)$ has a derivative at $x$ and $f(y)$ has a derivative at $y=g(x)$. Then the derivative of $z=f(g(x))$ is:

\begin{equation}
\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = f'(g(x))g'(x).
\end{equation}

**The slope at $x$ is $\frac{df}{dy}$ (at $y$) times $\frac{dg}{dx}$ (at $x$).**

* **Very Important**: The chain rule says the outside function must be evaluated at $y = g(x)$, not at $x$.

* As $\Delta x$ approaches zero, in the limit,

\begin{equation}
\frac{\Delta z}{\Delta x} = \frac{\Delta z}{\Delta y} \frac{\Delta y}{\Delta x} \text{ becomes the chain rule } \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}
\end{equation}


### Examples
* If $z=(\sin x)^2$ then $dz/dx = (2\sin x)(\cos x)$. Here $y = \sin x$ is inside.
* Suppose we have $g(x)$ and define $f(x) = g(10x)$. Then $\nabla f(x) = \nabla g(10x) \cdot 10$. Note that $\nabla g$ has to be evaluated at $10x$, not at $x$.
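
Both examples can be checked against a numerical derivative. The sketch below uses a central difference as the reference and, for the second example, assumes the concrete choice $g = \sin$ in one variable (the notes leave $g$ unspecified). The point $x = 0.3$ and the helper names are likewise illustrative.

```python
import math


def central_difference(func, x: float, h: float = 1e-6) -> float:
    """Numerical derivative (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x = 0.3

# Example 1: z = (sin x)^2, with y = sin x inside.
numeric_z = central_difference(lambda t: math.sin(t) ** 2, x)
chain_z = 2.0 * math.sin(x) * math.cos(x)


# Example 2: f(x) = g(10x), with the assumed choice g = sin.
def g(t: float) -> float:
    return math.sin(t)


def g_prime(t: float) -> float:
    return math.cos(t)


def f(t: float) -> float:
    return g(10.0 * t)


numeric_f = central_difference(f, x)
chain_f = g_prime(10.0 * x) * 10.0  # outside derivative evaluated at 10x
wrong_f = g_prime(x) * 10.0         # common mistake: evaluated at x instead

print("z' numeric vs chain rule:", numeric_z, chain_z)
print("f' numeric vs chain rule:", numeric_f, chain_f)
print("f' with g' evaluated at x (incorrect):", wrong_f)
```

The chain-rule values match the numerical derivatives closely, while evaluating $g'$ at $x$ instead of $10x$ gives a clearly different number.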

# References
1. [MIT Directional Derivatives](files/references/MIT18_02SC_notes_20_Directional Derivatives.pdf)
1. [Calculus, Gilbert Strang, MIT](files/references/Calculus_Gilbert Strang.pdf)