# Unit 2: Geometry of Derivatives

## Vector: a mathematical object determined by its length and direction.
### Our goal in this unit is to develop the multivariable analogue of the derivative, which is not the partial derivative, but a vector (whose components are partial derivatives) called the ***gradient***.

## Definition 2.1 A vector is something that has both magnitude (length) and direction.

# $$ \vec{v} = \langle v_1, v_2 \rangle = v_1 \hat{i} + v_2 \hat{j} $$
### where $\hat{i} = \langle 1, 0 \rangle$ and $\hat{j} = \langle 0, 1 \rangle$ are the unit vectors along the $x$ and $y$ axes

Example:

### $$ \vec{v} = \langle -1, 1 \rangle  $$
### $$ \vec{w} = \langle -1, 1 \rangle  $$
then
### $$ \vec{v} = \vec{w}  $$


## Definition 4.1 The magnitude (or norm, or Euclidean norm, or length) of a vector $\vec{v}$ is equal to its length and is denoted by $\lvert{\vec{v}}\rvert$.

### the magnitude of the vector $\vec{v} = \langle v_1, v_2 \rangle$ is given by
# $$ \lvert \vec{v} \rvert = \sqrt{v_1^2 + v_2^2} $$
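### The formula above can be checked with a few lines of Python. This is a minimal sketch: the helper name `magnitude` and the choice to store a vector as a `(v1, v2)` tuple are assumptions made for illustration only.

```python
import math

def magnitude(v):
    """Return the Euclidean length of a 2D vector given as a (v1, v2) pair."""
    v1, v2 = v
    return math.sqrt(v1**2 + v2**2)

# The vector <3, 4> has length sqrt(9 + 16) = 5.
length = magnitude((3.0, 4.0))
```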

## Definition 5.1 A scalar is a (real or complex) number. 

### To multiply a vector $\vec{v} = \langle v_1, v_2 \rangle$ by a scalar $c$, we multiply each component by $c$ as follows:

# $$ c \vec{v} = c \langle v_1, v_2 \rangle = \langle c v_1, c v_2 \rangle $$

### If $\vec{v}$ is in the same direction as $\vec{w}$, then $\vec{v} = \lambda \vec{w}$ for some positive number $\lambda$.
### If $\vec{v}$ is in the opposite direction to $\vec{w}$, then $\vec{v} = \lambda \vec{w}$ for some negative number $\lambda$.
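### As a small illustration of scalar multiplication and of how the sign of the scalar affects direction, consider the sketch below. The helper name `scale` and the tuple representation of vectors are hypothetical choices, not part of the notes.

```python
def scale(c, v):
    """Multiply each component of the 2D vector v by the scalar c."""
    return (c * v[0], c * v[1])

v = (-1.0, 1.0)
same = scale(2.0, v)       # positive scalar: same direction, twice as long
opposite = scale(-3.0, v)  # negative scalar: opposite direction, three times as long
```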

## Definition 6.1 A vector $\vec{v}$ is a unit vector if $\lvert \vec{v} \rvert = 1$.

### Note on notation: If $\vec{v}$ is any nonzero vector, then $\hat{v}$ is the unit vector pointing in the same direction as $\vec{v}$.

### Definition 7.1 The vector $\vec{v}$ has magnitude $\lvert \vec{v} \rvert$ and direction $\hat{v}$.

![Tangent Plane](img/vector-with-angle.png)

## For vector $\vec{v} = \langle v_1, v_2 \rangle$:
# $$ \theta = \operatorname{arctan} \left( \frac{v_2}{v_1} \right) \quad \text{(for } v_1 > 0 \text{)} $$
# $$ \vec{v} = \lvert \vec{v} \rvert \hat{v} = \lvert \vec{v} \rvert \langle \operatorname{cos}(\theta), \operatorname{sin} (\theta) \rangle $$
# $$ \hat{v} = \frac{\vec{v}}{\lvert \vec{v} \rvert} $$
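### The magnitude and direction form can be sketched in Python as follows. The helper names `direction_angle` and `unit_vector` are assumptions for illustration. The sketch uses `math.atan2` rather than a plain arctangent of $v_2 / v_1$, because `atan2` picks the correct quadrant and also handles $v_1 = 0$.

```python
import math

def direction_angle(v):
    """Angle of v measured counterclockwise from the positive x-axis, in radians."""
    return math.atan2(v[1], v[0])

def unit_vector(v):
    """Return v divided by its magnitude; the zero vector has no direction."""
    norm = math.hypot(v[0], v[1])
    if norm == 0.0:
        raise ValueError("the zero vector has no direction")
    return (v[0] / norm, v[1] / norm)

v = (-1.0, 1.0)
theta = direction_angle(v)   # second quadrant, so theta is 3*pi/4
v_hat = unit_vector(v)

# Rebuild v from its magnitude and direction: |v| <cos(theta), sin(theta)>.
norm_v = math.hypot(v[0], v[1])
rebuilt = (norm_v * math.cos(theta), norm_v * math.sin(theta))
```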

### Geometrically, we can add two vectors $\vec{v}$ and $\vec{w}$ by making a copy of one vector and placing its tail at the tip of the other.
![Tangent Plane](img/vector-sum-1.png)
### The sum of the two vectors is the vector that starts at the tail of $\vec{v}$ and ends at the tip of the copy of $\vec{w}$.
### By considering the components of these vectors, we can see how this picture corresponds to a computation.

![Tangent Plane](img/vector-sum-2.png)

# $$ \vec{v} + \vec{w} = \langle v_1 + w_1, v_2 + w_2 \rangle $$

### Vector addition is commutative:
# $$ \vec{v} + \vec{w} = \vec{w} + \vec{v} $$

### Vector subtraction:
# $$ \vec{v} - \vec{w} = \langle v_1 - w_1, v_2 - w_2 \rangle = \vec{v} + (-1) \vec{w} $$
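### Componentwise addition and subtraction translate directly into code. The following is a minimal sketch with hypothetical helper names `add` and `subtract`, and with vectors stored as tuples.

```python
def add(v, w):
    """Componentwise sum of two 2D vectors."""
    return (v[0] + w[0], v[1] + w[1])

def subtract(v, w):
    """Componentwise difference v - w of two 2D vectors."""
    return (v[0] - w[0], v[1] - w[1])

v = (1.0, 2.0)
w = (3.0, -1.0)
total = add(v, w)      # <1 + 3, 2 + (-1)>
diff = subtract(v, w)  # <1 - 3, 2 - (-1)>
```

### Swapping the arguments of `add` gives the same result (commutativity), and `subtract(v, w)` agrees with adding $(-1) \vec{w}$ to $\vec{v}$.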

## Definition 11.1 The dot product of two vectors $\vec{v} = \langle v_1, v_2 \rangle$ and $\vec{w} = \langle w_1, w_2 \rangle$ is defined as

# $$ \vec{v} \cdot \vec{w} = \langle v_1, v_2 \rangle \cdot \langle w_1, w_2 \rangle = v_1 w_1 + v_2 w_2 $$



![Tangent Plane](img/dot-product-1.png)

# $$ \vec{v} \cdot \vec{w} = \lvert \vec{v} \rvert \lvert \vec{w} \rvert \operatorname{cos} (\theta) $$
# $$ \operatorname{cos} (\theta) = \frac{\vec{v} \cdot \vec{w}} {\lvert \vec{v} \rvert \lvert \vec{w} \rvert} $$

## Theorem: A vector $\vec{v}$ is perpendicular to a vector $\vec{w}$ if and only if $\vec{v} \cdot \vec{w} = 0$.
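### The two forms of the dot product and the perpendicularity test can be tried out numerically. This sketch assumes nonzero input vectors; the helper names `dot` and `angle_between` are illustrative only.

```python
import math

def dot(v, w):
    """Dot product of two 2D vectors."""
    return v[0] * w[0] + v[1] * w[1]

def angle_between(v, w):
    """Angle between two nonzero 2D vectors, in radians."""
    cos_theta = dot(v, w) / (math.hypot(v[0], v[1]) * math.hypot(w[0], w[1]))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

theta = angle_between((1.0, 0.0), (1.0, 1.0))  # 45 degrees, i.e. pi/4
perp = dot((2.0, 1.0), (-1.0, 2.0))            # zero, so the vectors are perpendicular
```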

### Notation: $ \vec{0} $ - vector of length zero - $ \langle 0, 0 \rangle$

### Suppose we have three points in the plane: $P$, $Q$, and $R$. We write $\vec{P Q}$ as the vector that connects $P$ to $Q$. Note that the vector that connects $Q$ to $P$ is the vector $\vec{Q P}$ so that $\vec{P Q} + \vec{Q P} = \vec{0}$ as vectors.
### For example, if $P$ is the point $(1, 1)$ and $Q$ is the point $(3, 2)$, then the vector $\vec{P Q}$ that connects $P$ to $Q$ is found by subtracting the coordinates of $P$ from those of $Q$: $ \langle 3 - 1, 2 - 1 \rangle = \langle 2, 1 \rangle $.
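### The same computation as a short sketch, with a hypothetical helper `displacement` and points stored as tuples:

```python
def displacement(p, q):
    """Vector from point p to point q: subtract p's coordinates from q's."""
    return (q[0] - p[0], q[1] - p[1])

P = (1, 1)
Q = (3, 2)
PQ = displacement(P, Q)  # vector from P to Q
QP = displacement(Q, P)  # vector from Q to P, the negative of PQ
zero = (PQ[0] + QP[0], PQ[1] + QP[1])
```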

## The vector $ \langle a, b \rangle $ is normal to the line $a x + b y + c = 0$.
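### One way to see this (a short sketch that relies only on the dot-product test for perpendicularity): take any two points $(x_1, y_1)$ and $(x_2, y_2)$ on the line, introduced here purely for illustration. Both satisfy the equation of the line, and subtracting the two equations eliminates $c$:
# $$ a (x_2 - x_1) + b (y_2 - y_1) = 0 $$
### The left side is a dot product:
# $$ \langle a, b \rangle \cdot \langle x_2 - x_1, y_2 - y_1 \rangle = 0 $$
### so $\langle a, b \rangle$ is perpendicular to every vector that lies along the line.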

## Decomposing vector $\vec{v}$ into vectors $\vec{a}$ and $\vec{b}$, parallel and perpendicular to a vector $\vec{u}$

![Tangent Plane](img/vector-decomposition.png)

# $$ \vec{v} = \vec{a} + \vec{b} $$
### where

### - $\vec{a}$ is the component of $\vec{v}$ in the $\vec{u}$ direction, and
### - $\vec{b}$ is the component of $\vec{v}$ perpendicular to the $\vec{u}$ direction

# $$ \vec{a} = \left( \frac{\vec{u} \cdot \vec{v}}{\vec{u} \cdot \vec{u}} \right) \vec{u} $$
# $$ \vec{b} = \vec{v} - \vec{a} $$

## For unit vector $\hat{u}$:
# $$ \vec{a} = \left( \hat{u} \cdot \vec{v} \right) \hat{u} $$
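### The decomposition formulas can be sketched as follows. The helper names `dot` and `decompose` are assumptions for illustration, and the sketch requires $\vec{u}$ to be nonzero.

```python
def dot(v, w):
    """Dot product of two 2D vectors."""
    return v[0] * w[0] + v[1] * w[1]

def decompose(v, u):
    """Split v into a (parallel to u) and b (perpendicular to u), so v = a + b."""
    uu = dot(u, u)
    if uu == 0:
        raise ValueError("cannot project onto the zero vector")
    k = dot(u, v) / uu          # scalar coefficient (u . v) / (u . u)
    a = (k * u[0], k * u[1])    # component of v along u
    b = (v[0] - a[0], v[1] - a[1])  # what is left over, perpendicular to u
    return a, b

v = (3.0, 4.0)
u = (2.0, 0.0)
a, b = decompose(v, u)  # a lies along the x-axis, b along the y-axis
```

### Here $\vec{a} \cdot \vec{b} = 0$, which confirms that the two pieces are perpendicular.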

## At any point $(x_0, y_0)$, the vector $\langle f_x(x_0, y_0), f_y(x_0, y_0) \rangle$ is perpendicular to the level curve of $f$ through $(x_0, y_0)$.

## Definition 4.1
## The vector $\langle f_x, f_y \rangle$ is called the gradient of $f$.
## The notation for the gradient of $f$ is $\nabla f$.
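### As an illustration, the sketch below computes the gradient of a sample function two ways: from hand-derived partial derivatives and from central differences. The function $f(x, y) = x^2 + 3xy$, the helper names, and the step size `h` are all assumptions chosen for this example.

```python
def f(x, y):
    """Sample function f(x, y) = x^2 + 3xy (chosen only for illustration)."""
    return x**2 + 3.0 * x * y

def grad_f(x, y):
    """Gradient from the partial derivatives: f_x = 2x + 3y, f_y = 3x."""
    return (2.0 * x + 3.0 * y, 3.0 * x)

def numerical_gradient(func, x, y, h=1e-6):
    """Approximate <f_x, f_y> with central differences of step h."""
    fx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return (fx, fy)

exact = grad_f(1.0, 2.0)                  # <2 + 6, 3>
approx = numerical_gradient(f, 1.0, 2.0)  # should be close to exact
```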

## Definition 6.1

## A vector field on the plane is a function that attaches a vector to each point $(x, y)$ in the plane.

### Equivalent definitions:

- A vector field is a function $F$ that maps each point in the plane to a vector:
# $$ F(x, y) = \langle F_1(x, y), F_2(x, y) \rangle $$
 
- A ***vector field*** is sometimes called a ***vector-valued function***.

## Example: 
### The gradient exists at all points where the multivariable function $f$ is defined and differentiable. Therefore we can think of the gradient as a function that attaches a vector to every such point $(x, y)$. In other words, it is a vector field.

![Tangent Plane](img/vector-field-gradient.png)
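### To make the idea of "a vector attached to each point" concrete, the sketch below samples a gradient field on a small grid. The function $f(x, y) = x^2 + y^2$, the grid, and the name `gradient_field` are assumptions for illustration and are not necessarily what the figure shows.

```python
def gradient_field(x, y):
    """Gradient of the illustrative function f(x, y) = x^2 + y^2."""
    return (2.0 * x, 2.0 * y)

# A 3 x 3 grid of sample points in the plane.
grid = [(x, y) for y in (-1.0, 0.0, 1.0) for x in (-1.0, 0.0, 1.0)]

# The vector field as a mapping: point -> vector attached to that point.
field = {point: gradient_field(*point) for point in grid}
```

### For this particular $f$, every sampled vector points directly away from the origin, and the vector at the origin is $\vec{0}$.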

#### Gradients are important because they tell us how quickly a function changes and in which direction it changes fastest. In data science, machine learning algorithms such as gradient descent refine a model by repeatedly stepping in the direction that decreases the model's error function fastest. That direction is $-\nabla f$, the opposite of the gradient of the error function.

## Theorem

## - The gradient of $f$ is normal to the level curves of $f$.
## - $\nabla f$ points in the direction of steepest increase.
## - $ \lvert \nabla f \rvert $ is the "slope" of the tangent plane to the graph of $f$ in the direction of the gradient.
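### The first two claims can be checked numerically on an example. This sketch assumes the sample function $f(x, y) = x^2 + y^2$, whose level curves are circles centred at the origin; the point, the step `h`, and the one-degree sampling of directions are arbitrary choices.

```python
import math

def f(x, y):
    """Illustrative function whose level curves are circles around the origin."""
    return x**2 + y**2

def grad_f(x, y):
    return (2.0 * x, 2.0 * y)

x0, y0 = 1.0, 2.0
g = grad_f(x0, y0)

# A tangent vector to the circle x^2 + y^2 = 5 at (x0, y0) is <-y0, x0>.
tangent = (-y0, x0)
normal_check = g[0] * tangent[0] + g[1] * tangent[1]  # zero means perpendicular

# Take a small step of length h in each of 360 directions and keep the
# direction in which f ends up largest.
h = 1e-4
angles = [2.0 * math.pi * k / 360 for k in range(360)]
best_angle = max(angles, key=lambda t: f(x0 + h * math.cos(t), y0 + h * math.sin(t)))
grad_angle = math.atan2(g[1], g[0])
```

### Up to the one-degree sampling resolution, `best_angle` matches `grad_angle`, the direction of the gradient.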

# $$ f(x_0 + \Delta x, y_0 + \Delta y) \approx f(x_0, y_0) + f_x(x_0, y_0) \Delta x + f_y(x_0, y_0) \Delta y $$
# $$ \vec{\Delta } = \langle \Delta x, \Delta y \rangle $$
# $$ \nabla f = \langle f_x(x_0, y_0), f_y(x_0, y_0) \rangle $$
# $$ f(x_0 + \Delta x, y_0 + \Delta y) \approx f(x_0, y_0) + \nabla f \cdot \vec{\Delta}  $$
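### A quick numerical check of the linear approximation, as a sketch. The function $f(x, y) = x^2 y$, the base point, and the displacement are assumptions chosen for illustration.

```python
def f(x, y):
    """Sample function f(x, y) = x^2 * y."""
    return x**2 * y

def grad_f(x, y):
    """Partial derivatives: f_x = 2xy, f_y = x^2."""
    return (2.0 * x * y, x**2)

x0, y0 = 1.0, 2.0
dx, dy = 0.01, -0.02            # the small displacement <dx, dy>

g = grad_f(x0, y0)              # <4, 1> at the point (1, 2)
approx = f(x0, y0) + g[0] * dx + g[1] * dy  # f(x0, y0) + grad f . <dx, dy>
exact = f(x0 + dx, y0 + dy)
error = abs(exact - approx)     # small, because the displacement is small
```

### The error shrinks roughly quadratically as the displacement gets smaller, which is what makes the approximation useful near $(x_0, y_0)$.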

## Linear approximation in the $y$ direction
# $$ f(x, y+ \Delta y) \approx f(x, y) + f_y(x, y) \Delta y \text{ (3.98)} $$
## Linear approximation in the $\hat{u}$ direction
### Now suppose we are at the point $(x, y)$ and we want to move a small amount $\Delta s$ in the direction of an arbitrary unit vector $\hat{u}$. We will denote this move by the vector $\vec{s}$ which lies along the unit vector $\hat{u} = \langle u_1, u_2 \rangle$ and has magnitude $\Delta s$.
![Tangent Plane](img/directional-derivative-1.png)
### Because $\vec{s}$ is parallel to $\hat{u}$, we know that $\vec{s}$ is a scalar multiple of $\hat{u}$. Since $\lvert \vec{s} \rvert = \Delta s$, we can say $\vec{s} = (\Delta s) \hat{u}$. Then
# $$ \langle x, y \rangle + \vec{s} = \langle x, y \rangle + ( \Delta s ) \hat{u} \text{ (3.100) } $$
# $$ = \langle x, y \rangle + \langle u_1 \Delta s, u_2 \Delta s \rangle  \text{ (3.101)}  $$
# $$ = \langle x + u_1 \Delta s, y + u_2 \Delta s \rangle \text{ (3.102) }  $$
![Tangent Plane](img/directional-derivative-2.png)
### Now we want to figure out how $f(x, y)$ changes when we move from $(x, y)$ to a point that is $\Delta s$ units along $\hat{u}$. This puts us at the end of the vector
# $$ \langle x, y \rangle + \vec{s} = \langle x + u_1 \Delta s, y + u_2 \Delta s \rangle $$
### So the quantity we want to approximate is
# $$ f(x + u_1 \Delta s, y + u_2 \Delta s) \text{ (3.104)} $$
### This gives
# $$ f(x + u_1 \Delta s, y + u_2 \Delta s) \approx f(x, y) + f_x u_1 \Delta s + f_y u_2 \Delta s \text{ (3.105)} $$
# $$ = f(x, y) + (f_x u_1  + f_y u_2) \Delta s \text{ (3.106)} $$
# $$ = f(x, y) + D_{\hat{u}} f \Delta s $$

## Directional derivative in the $\hat{u}$ direction
### By thinking of the directional derivative as the rate of change of $f$ when we move a distance $\Delta s$ in the direction of $\hat{u}$, we have

# $$ D_{\hat{u}}f(x, y) = f_x u_1 + f_y u_2 \text{ (3.107)} $$
### Notice that we can also write this as the dot product
# $$ D_{\hat{u}}f(x, y) = \nabla f \cdot \hat{u} \text{ (3.108) } $$

## Definition 3.1
## The directional derivative of a function $f(x, y)$ in the direction of the unit vector $\hat{u}$ at the point $(x, y)$ is given by

# $$ D_{\hat{u}}f(x, y) = \nabla f \cdot \hat{u} $$
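### The definition can be sketched in code and compared against a finite-difference estimate. The function $f(x, y) = x^2 y$, the helper names, and the step size `h` are assumptions for illustration. The helper normalizes its direction argument first, since the definition requires a unit vector.

```python
import math

def f(x, y):
    """Sample function f(x, y) = x^2 * y."""
    return x**2 * y

def grad_f(x, y):
    """Partial derivatives: f_x = 2xy, f_y = x^2."""
    return (2.0 * x * y, x**2)

def directional_derivative(grad, u):
    """Dot product of the gradient with the unit vector in the direction of u."""
    norm = math.hypot(u[0], u[1])
    u_hat = (u[0] / norm, u[1] / norm)
    return grad[0] * u_hat[0] + grad[1] * u_hat[1]

x0, y0 = 1.0, 2.0
u = (3.0, 4.0)                                  # direction; unit vector is <0.6, 0.8>
d = directional_derivative(grad_f(x0, y0), u)   # 4 * 0.6 + 1 * 0.8

# Finite-difference estimate: change in f over a small step h along u_hat.
h = 1e-6
estimate = (f(x0 + 0.6 * h, y0 + 0.8 * h) - f(x0, y0)) / h
```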

## Properties of the directional derivative

### The directional derivative behaves much like a partial derivative and satisfies the same general properties.

### - For a function $f$ and a real number $c$, we have $ D_{\hat{u}} (c f) = c D_{\hat{u}} f$
### - For functions $f$ and $g$, we have $ D_{\hat{u}} (f + g) = D_{\hat{u}} f + D_{\hat{u}} g $
### - For functions $f$ and $g$, we have $ D_{\hat{u}} (f g) = g D_{\hat{u}} f + f D_{\hat{u}} g$
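
### The product rule, for instance, follows from the dot-product form $D_{\hat{u}} f = \nabla f \cdot \hat{u}$ together with the ordinary product rule applied to each partial derivative. A short derivation sketch, assuming $f$ and $g$ are both differentiable:
# $$ \nabla (f g) = \langle (f g)_x, (f g)_y \rangle = \langle g f_x + f g_x, g f_y + f g_y \rangle = g \nabla f + f \nabla g $$
# $$ D_{\hat{u}} (f g) = \nabla (f g) \cdot \hat{u} = g (\nabla f \cdot \hat{u}) + f (\nabla g \cdot \hat{u}) = g D_{\hat{u}} f + f D_{\hat{u}} g $$
### The first two properties follow in the same way from the linearity of the partial derivatives.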