## 0. Basics

** Continuity & Differentiability **

* Continuity: A function $f$ is *continuous at a* if $\lim_{x\rightarrow a}f(x) = f(a)$.
* Differentiability: A function $f$ is *differentiable at a* if $f'(a)$ exists.
* If $f$ is differentiable at a, then $f$ is continuous at a.

** Root Finder (Newton-Raphson) **

* Question: Given an equation $f(x)=0$ (e.g. $x^3-2x-5=0$), solve for $x$, i.e. find a root of $f$ (here, the root of a cubic).
* Idea: The closer a point $x_i$ is to the real root $x$, which lies on the $x$-axis, the closer the $x$-intercept of the tangent at $x_i$ is to the real root.
* Algorithm:
    * Given a function $f(x)$, take an initial guess $x_1$,
    * Set up the tangent to $f$ at $x_1$, i.e. $y-f(x_1)=f'(x_1)(x-x_1)$,
    * Find the $x$-intercept $x_2$ of the tangent by setting $y=0$ and $x=x_2$, i.e. $0-f(x_1)=f'(x_1)(x_2-x_1)$,
    * Solve for $x_2$: $x_2=x_1-\frac{f(x_1)}{f'(x_1)}$ (assuming $f'(x_1)\neq 0$),
    * Iterate by the recursion $x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}$ until convergence.

In [54]:
from __future__ import division  # __future__ imports must come before other statements
import theano.tensor as T
import numpy as np
from theano import shared, function

In [80]:
x = T.fscalar()
y1 = T.pow(x,3) - 2*x - 5
y2 = T.pow(x,6) - 2
f1 = function([x], y1)
f2 = function([x], y2)
grad1, grad2 = T.grad(y1, x), T.grad(y2, x)
f_grad1, f_grad2 = function([x],grad1), function([x],grad2)

In [84]:
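def find_root(f, f_grad, x_0, prec=1e-8, max_iter=100, verbose=0):
    # Newton-Raphson: iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is below prec.
    x_0 = np.array(x_0, dtype='float32')  # match the fscalar input of the compiled functions
    count = 0
    while count < max_iter:  # guard against a non-converging iteration
        x_1 = x_0 - f(x_0) / f_grad(x_0)
        if abs(x_1 - x_0) < prec:
            break
        x_0 = x_1
        count += 1
    if verbose: print("#iters = %d" % count)
    print("%.10f" % x_0)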

In [85]:
find_root(f1,f_grad1,2, verbose=1)
find_root(f2,f_grad2,1, verbose=1)

#iters = 3
2.0945515633
#iters = 4
1.1224620342


** Basic Max & Min **

* The Extreme Value Theorem: If $f$ is continuous on a closed interval $[a,b]$, then $f$ attains an absolute maximum value $f(c)$ and an absolute minimum value $f(d)$ at some numbers $c$ and $d$ in $[a,b]$.
* Fermat's Theorem: If $f$ has a local maximum or minimum at $c$, and if $f'(c)$ exists (i.e. differentiable), then $f'(c)=0$.
* Critical Number: 
    * A critical number (i.e. "point of transition" or "breaking point") of a function $f$ is a number $c$ in the domain of $f$ such that either $f'(c)=0$ or $f'(c)$ does not exist. 
    * If $f$ has a local maximum or minimum at $c$, then $c$ is a critical number of $f$.
* Closed Interval Method (for finding the absolute extrema of a continuous function $f$ on $[a,b]$):
    * Find the values of $f$ at the critical numbers of $f$ in $(a,b)$,
    * Find the values of $f$ at the endpoints of the interval,
    * The largest of the values from the previous steps is the absolute maximum value (the smallest the absolute minimum).
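
The Closed Interval Method above can be sketched in a few lines of Python. This is only an illustration: the helper name `closed_interval_extrema` and the sample function $f(x)=x^3-3x^2+1$ on $[-\frac{1}{2},4]$ are assumptions chosen for the example, and the critical numbers are found by hand and passed in.

```python
def closed_interval_extrema(f, critical_numbers, a, b):
    """Return ((x_max, f_max), (x_min, f_min)) for a continuous f on [a, b].

    critical_numbers are supplied by the caller (found analytically);
    only those strictly inside (a, b) are used.
    """
    candidates = [c for c in critical_numbers if a < c < b] + [a, b]
    values = [(x, f(x)) for x in candidates]
    best_max = max(values, key=lambda pair: pair[1])
    best_min = min(values, key=lambda pair: pair[1])
    return best_max, best_min


# Hypothetical example: f(x) = x^3 - 3x^2 + 1 on [-1/2, 4].
# f'(x) = 3x^2 - 6x = 3x(x - 2), so the critical numbers are 0 and 2.
f = lambda x: x**3 - 3 * x**2 + 1
abs_max, abs_min = closed_interval_extrema(f, [0.0, 2.0], -0.5, 4.0)
print(abs_max)  # (4.0, 17.0): the maximum is at an endpoint
print(abs_min)  # (2.0, -3.0): the minimum is at a critical number
```

Note that the maximum here sits at an endpoint, which is why the method checks the endpoints as well as the critical numbers.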

## I. Series

** Definitions **

* [DEF] A *sequence* $\{a_n\}$ converges if $\lim_{n\rightarrow\infty}a_n = L$ exists. It otherwise diverges.
* [DEF] Let $\sum_{n=1}^\infty a_n = a_1 + a_2 + ...$, and $s_n=\sum_{i=1}^n a_i = a_1 + ... + a_n$, then $\sum_{n=1}^\infty a_n$ is convergent if $\lim_{n\rightarrow\infty}s_n=s$ exists as a real number. Then $\sum_{n=1}^\infty a_n = s$, and $s$ is the *sum of the series*.
* [DEF] A *series* (the sum of an infinite sequence) $\sum a_n$ is *absolutely convergent* if the series of absolute values $\sum|a_n|$ is convergent; it is *conditionally convergent* if it is convergent but not absolutely convergent.
* [DEF] A *power series about $a$* is in the form $\sum_{n=0}^\infty c_n(x-a)^n = c_0 + c_1(x-a) + c_2(x-a)^2 + ...$.



** Theorems **

* [THM] If $\lim_{x\rightarrow\infty}f(x)=L$ and $f(n)=a_n$ when $n$ is an integer, then $\lim_{n\rightarrow\infty}a_n=L$.
* [THM] If $\lim_{n\rightarrow\infty}a_n=L$ and the function $f$ is continuous at $L$, then $\lim_{n\rightarrow\infty}f(a_n) = f(L)$.
* [THM] Every bounded, monotonic sequence is convergent.
* [THM] If $\sum_{n=1}^\infty a_n$ is convergent, then $\lim_{n\rightarrow\infty}a_n =0$.
* [THM] If $\sum_{n=1}^\infty a_n$ and $\sum_{n=1}^\infty b_n$ are convergent, so are $c\sum_{n=1}^\infty a_n$, $\sum_{n=1}^\infty (a_n+b_n)$ and $\sum_{n=1}^\infty (a_n-b_n)$.
* [THM] If a series $\sum a_n$ is absolutely convergent, then it is convergent.
* [THM] For a given power series $\sum_{n=0}^\infty c_n(x-a)^n$ there are only three possibilities:
    * The series converges only when $x=a$,
    * The series converges for all $x$,
    * There is a positive number $R$ s.t. the series converges if $|x-a|<R$ and diverges if $|x-a|>R$ ($R$ is the power series' *radius of convergence*).
* [THM] If the power series $\sum_{n=0}^\infty c_n(x-a)^n$ has radius of convergence $R>0$, then the function $f$ defined by $f(x) = c_0 + c_1(x-a) + c_2(x-a)^2 + ... = \sum_{n=0}^\infty c_n(x-a)^n$ is differentiable (and therefore continuous) on the interval $(a-R,a+R)$ and
    * $f'(x) = c_1 + 2c_2(x-a) + 3c_3(x-a)^2 + ... = \sum_{n=1}^\infty nc_n(x-a)^{n-1}$,
    * $\int f(x)dx = C + c_0(x-a) + c_1\frac{(x-a)^2}{2} + c_2\frac{(x-a)^3}{3} + ... = C + \sum_{n=0}^\infty c_n\frac{(x-a)^{n+1}}{n+1}$.
* [THM] If $f$ has a power series representation (expansion) at $a$, that is, if $f(x) = \sum_{n=0}^\infty c_n(x-a)^n,|x-a|<R$, then its coefficients are given by the formula $c_n = \frac{f^{(n)}(a)}{n!}$ (pf. 777).
* [THM] If $f(x) = T_n(x)+R_n(x)$, where $T_n$ is the $n$th-degree Taylor polynomial of $f$ at $a$ and $\lim_{n\rightarrow\infty}R_n(x) = 0$ for $|x-a|<R$, then $f$ is equal to the sum of its Taylor series on the interval $|x-a|<R$. (NB: this is for checking whether the Taylor approximation of a function is legitimate.)
* [THM] *Taylor's Inequality*: If $|f^{(n+1)}(x)|\leq M$ for $|x-a|\leq d$, then the remainder $R_n(x)$ of the Taylor series satisfies the inequality $|R_n(x)| \leq\frac{M}{(n+1)!}|x-a|^{n+1}$ for $|x-a|\leq d$. (NB: used together with the previous theorem).
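
As a numerical sanity check of Taylor's Inequality, the sketch below compares the actual remainder of the Maclaurin polynomial of $\mathtt{sin}x$ with the bound $\frac{M}{(n+1)!}|x|^{n+1}$. It assumes $a=0$ and $M=1$ (every derivative of $\mathtt{sin}$ is bounded by $1$); the function names and the sample point $x=0.5$ are illustrative choices.

```python
import math


def taylor_sin(x, n):
    """Degree-n Maclaurin polynomial T_n(x) of sin x."""
    total = 0.0
    k = 0
    while 2 * k + 1 <= n:
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        k += 1
    return total


def taylor_bound(x, n, M=1.0):
    """Taylor's Inequality bound M/(n+1)! * |x - a|^(n+1) with a = 0."""
    return M * abs(x) ** (n + 1) / math.factorial(n + 1)


x = 0.5
results = []
for n in (1, 3, 5, 7):
    remainder = abs(math.sin(x) - taylor_sin(x, n))
    bound = taylor_bound(x, n)
    results.append((n, remainder, bound))
    print(n, remainder, bound)  # the remainder stays below the bound
```

Both columns shrink quickly as $n$ grows, which is the behaviour the previous theorem needs in order to conclude that the function equals its Taylor series.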

** Examples **

* [EXP] $\sum_{n=1}^\infty \frac{1}{n(n+1)} = 1$ (pf. 732).
* [EXP] *Geometric Series*: $\sum_{n=1}^\infty ar^{n-1} = a + ar + ar^2 + ...$. It is convergent if $|r| < 1$ and its sum is $\sum_{n=1}^\infty ar^{n-1} = \frac{a}{1-r},|r|<1$ (pf. 730).
* [EXP] *p-Series*: $\sum_{n=1}^\infty \frac{1}{n^p}$ is convergent if $p > 1$, divergent otherwise.
* [EXP] *Taylor Series*: $f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!}(x-a)^n = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + ...$. (NB: this is for expanding functions and computing derivatives/integrals term by term, e.g. for functions which do not have elementary antiderivatives).
* [EXP] *Maclaurin Series*: $f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(0)}{n!}x^n = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + ...$. 
* [EXP] *Important Maclaurins*:
    * $\frac{1}{1-x} = \sum_{n=0}^\infty x^n = 1 + x + x^2 + x^3 + ...$, $R=1$.
    * $e^x = \sum_{n=0}^\infty \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + ...$, $R=\infty$.
    * $\mathtt{sin}x = \sum_{n=0}^\infty (-1)^n\frac{x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + ...$, $R=\infty$.
    * $\mathtt{cos}x = \sum_{n=0}^\infty (-1)^n\frac{x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + ...$, $R=\infty$.
    * $\mathtt{tan}^{-1}x = \sum_{n=0}^\infty (-1)^n\frac{x^{2n+1}}{2n+1} = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + ...$, $R=1$.
    * $\mathtt{ln}(1+x) = \sum_{n=1}^\infty (-1)^{n-1}\frac{x^n}{n} = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + ...$, $R=1$.
    * $(1+x)^k = \sum_{n=0}^\infty\binom{k}{n}x^n = 1 + kx + \frac{k(k-1)}{2!}x^2 + \frac{k(k-1)(k-2)}{3!}x^3 + ...$, $R=1$.
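
The Maclaurin series listed above can be checked numerically by comparing partial sums with the standard library functions. The sketch below is illustrative only: the helper `partial_sum`, the choice of $40$ terms and the sample point $x=0.5$ (inside every radius of convergence listed) are assumptions.

```python
import math


def partial_sum(term, x, n_terms, start=0):
    """Sum term(n, x) for n = start, ..., start + n_terms - 1."""
    return sum(term(n, x) for n in range(start, start + n_terms))


# name -> (general term, first index, reference function)
series = {
    "1/(1-x)": (lambda n, x: x ** n, 0, lambda x: 1.0 / (1.0 - x)),
    "exp": (lambda n, x: x ** n / math.factorial(n), 0, math.exp),
    "sin": (lambda n, x: (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1), 0, math.sin),
    "cos": (lambda n, x: (-1) ** n * x ** (2 * n) / math.factorial(2 * n), 0, math.cos),
    "atan": (lambda n, x: (-1) ** n * x ** (2 * n + 1) / (2 * n + 1), 0, math.atan),
    "ln(1+x)": (lambda n, x: (-1) ** (n - 1) * x ** n / n, 1, math.log1p),
}

x = 0.5
errors = {}
for name, (term, start, exact) in series.items():
    approx = partial_sum(term, x, 40, start)
    errors[name] = abs(approx - exact(x))
    print(name, approx, errors[name])
```

Moving $x$ toward $\pm 1$ slows the series with $R=1$ down considerably, while those with $R=\infty$ keep converging quickly.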

** Tests **


* [TST] *Test for Divergence*: If $\lim_{n\rightarrow\infty}a_n$ does not exist or if $\lim_{n\rightarrow\infty}a_n\neq 0$, then the series $\sum_{n=1}^\infty a_n$ is divergent.
* [TST] *Integral Test*: Let $f$ be continuous, positive, decreasing on $[1,\infty)$ and let $a_n=f(n)$, then the series $\sum_{n=1}^\infty a_n$ is,
    * convergent if $\int_1^\infty f(x)dx$ is convergent,
    * divergent if $\int_1^\infty f(x)dx$ is divergent.
    * *Remainder*: $\int_{n+1}^\infty f(x)dx \leq R_n \leq \int_n^\infty f(x)dx$ (grpf. 742), which implies $s_n + \int_{n+1}^\infty f(x)dx \leq s \leq s_n + \int_n^\infty f(x)dx$, since $R_n = s - s_n$.
* [TST] *Comparison Test*: Let $\sum a_n$ and $\sum b_n$ be series with positive terms,
    * If $\sum b_n$ is convergent and $a_n\leq b_n$ for all $n$, then $\sum a_n$ is also convergent,
    * If $\sum b_n$ is divergent and $a_n\geq b_n$ for all $n$, then $\sum a_n$ is also divergent.
* [TST] *Limit Comparison Test*: Let $\sum a_n$ and $\sum b_n$ be series with positive terms. If $\lim_{n\rightarrow\infty}\frac{a_n}{b_n} = c$, where $c$ is a finite number and $c>0$, then either both series converge or both diverge.
* [TST] *Alternating Series Test*: Given the series $\sum_{n=1}^\infty (-1)^{n-1} b_n = b_1-b_2+b_3-b_4+b_5-b_6+...,b_n>0$, if
    * $b_{n+1}\leq b_n,\forall n$,
    * $\lim_{n\rightarrow\infty}b_n=0$,
    
    then the series is convergent.
* [TST] *Ratio Test*:
    * If $\lim_{n\rightarrow\infty}\left|\frac{a_{n+1}}{a_n}\right| = L < 1$, then the series $\sum_{n=1}^\infty a_n$ is absolutely convergent.
    * If $\lim_{n\rightarrow\infty}\left|\frac{a_{n+1}}{a_n}\right| = L > 1$ or $\lim_{n\rightarrow\infty}\left|\frac{a_{n+1}}{a_n}\right| = \infty$, then the series $\sum_{n=1}^\infty a_n$ is divergent.
    * If $\lim_{n\rightarrow\infty}\left|\frac{a_{n+1}}{a_n}\right| = 1$, then the test is inconclusive.
* [TST] *Root Test*:
    * If $\lim_{n\rightarrow\infty}\sqrt[n]{|a_n|} = L < 1$, then the series $\sum_{n=1}^\infty a_n$ is absolutely convergent.
    * If $\lim_{n\rightarrow\infty}\sqrt[n]{|a_n|} = L > 1$ or $\lim_{n\rightarrow\infty}\sqrt[n]{|a_n|} = \infty$, then the series $\sum_{n=1}^\infty a_n$ is divergent.
    * If $\lim_{n\rightarrow\infty}\sqrt[n]{|a_n|} = 1$, then the test is inconclusive.
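
The remainder estimate from the Integral Test can be illustrated on the $p$-series with $p=2$. The sketch below assumes $f(x)=\frac{1}{x^2}$, whose tail integral $\int_n^\infty f(x)dx=\frac{1}{n}$ is evaluated by hand, and uses the known sum $\frac{\pi^2}{6}$ as a reference value; the names are illustrative.

```python
import math


def partial_sum(n):
    """s_n = sum_{i=1}^{n} 1/i^2."""
    return sum(1.0 / i ** 2 for i in range(1, n + 1))


def tail_integral(n):
    """Integral of 1/x^2 from n to infinity, evaluated analytically as 1/n."""
    return 1.0 / n


n = 10
s_n = partial_sum(n)
lower = s_n + tail_integral(n + 1)  # s_n + integral from n+1 to infinity
upper = s_n + tail_integral(n)      # s_n + integral from n to infinity
s = math.pi ** 2 / 6                # known sum of the p-series with p = 2
print(lower, s, upper)              # s lies between the two bounds
```

With only ten terms the two bounds already pin the sum down to an interval of width $\frac{1}{10}-\frac{1}{11}$.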

## II. Vector & Geometry

** Definitions **

* [DEF] A *vector* $v,v\in\mathbf{R}^d$ is a quantity that has both magnitude (length of vector) and direction.
* [DEF] A *unit vector* is a vector whose length is $1$: $u = \frac{1}{|a|}a = \frac{a}{|a|}$. If $a\in\mathbf{R}^3$, with the angles between $a$ and $x,y,z$ axes being $\alpha,\beta,\gamma$, then $u = <\mathtt{cos}\alpha,\mathtt{cos}\beta,\mathtt{cos}\gamma>$, because, $\mathtt{cos}\alpha = \frac{a\cdot i}{|a||i|} = \frac{a_1}{|a|}$, for instance.
* [DEF] *Projection*:
    * The *scalar projection* (i.e. the magnitude of the projection) of vector $b$ onto vector $a$, where the angle between $a$ and $b$ is $\theta$: $\mathtt{comp}_ab = \frac{a\cdot b}{|a|} = \frac{|a||b|\mathtt{cos}\theta}{|a|} = |b|\mathtt{cos}\theta$. (cf. 828).
    * The corresponding *vector projection* (i.e. the projection as a vector in the direction of $a$): $\mathtt{proj}_ab = \mathtt{comp}_ab\frac{a}{|a|} = \left(\frac{a\cdot b}{|a|}\right)\frac{a}{|a|} = \frac{a\cdot b}{|a|^2}a$.
* [DEF] *Cross Product*: If $a = <a_1,a_2,a_3>$, $b=<b_1,b_2,b_3>$, then the cross product of $a$ and $b$ is $a\times b = <a_2b_3-a_3b_2,a_3b_1-a_1b_3,a_1b_2-a_2b_1>$, obtained by solving $\begin{cases}a\cdot c = a_1c_1+a_2c_2+a_3c_3=0\\b\cdot c = b_1c_1+b_2c_2+b_3c_3=0\end{cases}$. For mnemonics, $a\times b = \begin{vmatrix} i&j&k\\a_1&a_2&a_3\\b_1&b_2&b_3 \end{vmatrix} = \begin{vmatrix}a_2&a_3\\b_2&b_3\end{vmatrix}i - \begin{vmatrix}a_1&a_3\\b_1&b_3\end{vmatrix}j + \begin{vmatrix}a_1&a_2\\b_1&b_2\end{vmatrix}k$.
* [DEF] *Scalar Triple Product*: $a\cdot (b\times c) = \begin{vmatrix}a_1&a_2&a_3\\b_1&b_2&b_3\\c_1&c_2&c_3\end{vmatrix}$, and the volume of the parallelepiped determined by $a,b,c$ is $V = Ah = |b\times c||a||\mathtt{cos}\theta| = |a\cdot(b\times c)|$, where $\theta$ is the angle between $a$ and $b\times c$.
* [DEF] *Vector Equation of Line*: Given a point $r_0,r_0\in\mathbf{R}^3$ on a line and a direction vector $v,v\in\mathbf{R}^3$ which is parallel to the line, the vector equation of the line is written as $r = r_0 + \mathsf{t}v$, where $\mathsf{t}$ is a scalar. Changes in $\mathsf{t}$ trace out the line. Note that the vector equation of a line is not unique (it changes with different point and direction vector). Let $r = <x,y,z>, r_0=<x_0,y_0,z_0>$, and $v=<a,b,c>$ then
    * *Parametric Equations*: $x=x_0+a\mathsf{t},y=y_0+b\mathsf{t},z=z_0+c\mathsf{t}$,
    * *Symmetric Equations*: 
        * point-vector: $\frac{x-x_0}{a}=\frac{y-y_0}{b}=\frac{z-z_0}{c}$.
        * point-point: $\frac{x-x_0}{x_1-x_0}=\frac{y-y_0}{y_1-y_0}=\frac{z-z_0}{z_1-z_0}$, where the direction vector is actually $\vec{r_0r_1}=<x_1-x_0,y_1-y_0,z_1-z_0>$.
        * In a space $\mathbf{R}^3$, if only $2$ of $x,y,z$ appear in the symmetric equations (i.e. only $2$ of the $3$ symmetric expressions), the equation represents a plane in which the missing variable can take any value.
    * *Line Segment (between $[0,1]$)*: $r(\mathsf{t})=r_0+\mathsf{t}(r_1-r_0) = (1-\mathsf{t})r_0+\mathsf{t}r_1,0\leq \mathsf{t}\leq 1$.
    * *Skew Lines*: Lines that neither intersect nor are parallel (the former is shown by checking that their parametric equations have no common solution, the latter by checking that the direction vectors of the lines are not parallel, i.e. their components are not proportional).
* [DEF] *Vector Equation of Plane*: Given a point $r_0,r_0\in\mathbf{R}^3$ on a plane and a normal vector $n,n\in\mathbf{R}^3$ which is orthogonal to the plane, the vector equation of the plane is written as $n\cdot(r-r_0) =0$ (or $n\cdot r=n\cdot r_0$), where $r,r\in\mathbf{R}^3$ is another point on the plane (i.e. $r-r_0$ is a vector on the plane, and the fact $n\cdot(r-r_0) =0$ states that $n$ is orthogonal to every such vector, hence to the plane) (cf. 844:fig.6).
    * *Scalar Equation*: $<a,b,c>\cdot<x-x_0,y-y_0,z-z_0> = 0\Rightarrow a(x-x_0) + b(y-y_0) + c(z-z_0) = 0$ (cf. 844),
    * *Linear Equation*: $ax+by+cz+d=0$, which is arithmetically derived from the scalar equation. Note that if $a,b,c$ can be shown to not all be $0$, then the linear equation represents a plane with the normal vector $<a,b,c>$.
* [DEF] *Distance between a Point $p = (x_1,y_1,z_1)$ and a Plane $ax + by + cz + d = 0$*:
    * $D = \frac{|ax_1+by_1+cz_1+d|}{\sqrt{a^2+b^2+c^2}}$.
    * Derivation:
        * Find an arbitrary point $p_0 = (x_0,y_0,z_0)$ on the plane,
        * Find the vector determined by $p_0,p$: $v = <x_1-x_0,y_1-y_0,z_1-z_0>$,
        * $\begin{align} D &= |\mathtt{comp}_nv| = \frac{|n\cdot v|}{|n|} \\ &= \frac{|a(x_1-x_0)+b(y_1-y_0)+c(z_1-z_0)|}{\sqrt{a^2+b^2+c^2}} \\ &= \frac{|(ax_1+by_1+cz_1)-(ax_0+by_0+cz_0)|}{\sqrt{a^2+b^2+c^2}} \\ &= \frac{|ax_1+by_1+cz_1+d|}{\sqrt{a^2+b^2+c^2}},\end{align}$ where $n=<a,b,c>$ is the normal vector and the last step holds because $p_0$ lies in the plane and therefore satisfies $ax_0+by_0+cz_0+d=0$, i.e. $-(ax_0+by_0+cz_0)=d$.
* [DEF] *Surface*:
    * *Cylinder*: A surface that consists of all lines that are parallel to a given line and pass through a given plane curve.
    * *Quadric Surface*: The graph of a second-degree equation in three variables $x,y,z$. In standard form, $Ax^2+By^2+Cz^2+J=0$ or $Ax^2+By^2+Iz=0$, where the capitalized letters represent constants.
* [DEF] *Trace* (of a surface): A trace is the curve of intersection of the surface with planes parallel to the coordinate planes.
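
The projection and point-to-plane distance definitions above can be sketched with plain tuples. Everything concrete here is an assumption made for illustration: the helper names, the plane $x+2y+2z-3=0$, the point $(1,2,3)$ and the on-plane point $(3,0,0)$. The block computes the distance both by the closed formula and by the scalar projection used in the derivation.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def scalar_projection(a, b):
    """comp_a b = (a . b) / |a|."""
    return dot(a, b) / norm(a)


def vector_projection(a, b):
    """proj_a b = ((a . b) / |a|^2) a."""
    scale = dot(a, b) / dot(a, a)
    return tuple(scale * ai for ai in a)


def distance_point_plane(p, n, d):
    """Distance from point p to the plane n . r + d = 0."""
    return abs(dot(n, p) + d) / norm(n)


# Hypothetical data: plane x + 2y + 2z - 3 = 0 and point (1, 2, 3).
n, d = (1.0, 2.0, 2.0), -3.0
p = (1.0, 2.0, 3.0)
p0 = (3.0, 0.0, 0.0)  # lies on the plane: 3 + 0 + 0 - 3 = 0
v = tuple(pi - qi for pi, qi in zip(p, p0))  # vector from p0 to p

D_formula = distance_point_plane(p, n, d)
D_projection = abs(scalar_projection(n, v))
print(D_formula, D_projection)  # both equal 8/3 up to rounding
```

Any other point on the plane could replace `p0` without changing the result, which mirrors the "arbitrary point" step of the derivation.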

** Theorems **

* [THM] *Angle between Vectors*: If $\theta$ is the angle between the vectors $a$ and $b$, then 
    * $a\cdot b = |a||b|\mathtt{cos}\theta$. Therefore $a$ and $b$ are perpendicular/orthogonal if $a\cdot b=0$.
    * $|a\times b| = |a||b|\mathtt{sin}\theta$. Therefore $a$ and $b$ are parallel if $a\times b = 0$.
* [THM] *Cross Product Rules*:
    * $a\times b = -b\times a$,
    * $(\mathsf{c}a)\times b = \mathsf{c}(a\times b) = a\times (\mathsf{c}b)$, $\mathsf{c}$ is a scalar,
    * $a\times (b+c) = a\times b + a\times c$,
    * $(a+b)\times c = a\times c+b\times c$,
    * $a\cdot (b\times c) = (a\times b)\cdot c$,
    * $a\times (b\times c) = (a\cdot c)b-(a\cdot b)c$.
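
A few of the cross product rules above can be spot-checked on integer vectors, where the arithmetic is exact. The vectors and helper names below are arbitrary illustrative choices; a check on one triple is a sanity test, not a proof.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(a, b):
    """a x b from the component formula."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)

# a x b = -(b x a)
anticommutative = cross(a, b) == tuple(-x for x in cross(b, a))

# a . (b x c) = (a x b) . c
triple_swap = dot(a, cross(b, c)) == dot(cross(a, b), c)

# a x (b x c) = (a . c) b - (a . b) c
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c))

# Volume of the parallelepiped determined by a, b, c
volume = abs(dot(a, cross(b, c)))

print(anticommutative, triple_swap, lhs == rhs, volume)
```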

** Examples **

* [EXP] Vector as the linear combination of the *standard basis vectors*: $<1,-2,6> = i - 2j + 6k$, where $i = <1,0,0>, j=<0,1,0>, k=<0,0,1>$.
* [EXP] Finding a plane given $3$ points: 
    * Find two vectors $a,b$ with the points,
    * Find the vector that's orthogonal to the two vectors using cross product: $n = a\times b$,
    * Find the plane using the scalar equation.
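
The three steps above translate directly into code. In this sketch the function name `plane_through` and the sample points are illustrative assumptions; the result is returned as the coefficients $(a,b,c,d)$ of the linear equation $ax+by+cz+d=0$.

```python
def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_through(p, q, r):
    """Return (a, b, c, d) with ax + by + cz + d = 0 passing through p, q, r."""
    n = cross(sub(q, p), sub(r, p))  # normal vector n = a x b
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = -dot(n, p)                   # from the scalar equation n . (r - p) = 0
    return n + (d,)


# Hypothetical points
P, Q, R = (1, 3, 2), (3, -1, 6), (5, 2, 0)
plane = plane_through(P, Q, R)
print(plane)  # (12, 20, 14, -100), i.e. 6x + 10y + 7z = 50 after dividing by 2
```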

** Interpretations **

* [INT] *Dot Product*: "Combined magnitude" of vectors (with respective unit lengths and $\mathtt{cos}$ angles).
* [INT] *Cross Product*: The vector that is perpendicular/orthogonal to the vectors. The length of a cross product $a\times b$ is the area of the parallelogram determined by $a$ and $b$.
* [INT] *Line in 2D & 3D*: 
    * Line in 2D Space: determined by a point in $\mathbf{R}^2$ on the line and the direction (i.e. slope) of the line, written in point-slope form.
    * Line in 3D Space: determined by a point in $\mathbf{R}^3$ on the line and the parallel direction vector in $\mathbf{R}^3$, where the line is parallel to the direction vector (cf. 841:fig.1,2).

## III. Vector Functions

** Definitions **

* [DEF] *Vector Function*: A mapping from a set of real numbers $\mathsf{t}\in\mathbf{R}$ to a set of vectors $r\in\mathbf{R}^d$.
    * $d=3$ case: $r(\mathsf{t}) = <f(\mathsf{t}),g(\mathsf{t}),h(\mathsf{t})> = f(\mathsf{t})i+g(\mathsf{t})j+h(\mathsf{t})k$, where $f,g,h$ are the *component functions of $r$*.
    * *Limit of Vector function*: $\lim_{\mathsf{t}\rightarrow a}r(\mathsf{t}) = \left<\lim_{\mathsf{t}\rightarrow a}f(\mathsf{t}),\lim_{\mathsf{t}\rightarrow a}g(\mathsf{t}),\lim_{\mathsf{t}\rightarrow a}h(\mathsf{t})\right>$. $r(\mathsf{t})$ is *continuous at $a$* if $\lim_{\mathsf{t}\rightarrow a}r(\mathsf{t}) = r(a)$.
* [DEF] *Space Curve*: Let $C$ be the set of all points $(x,y,z)$ with $x=f(\mathsf{t}),y=g(\mathsf{t}),z=h(\mathsf{t})$, where $\mathsf{t}$ varies throughout an interval $I$. Then $C$ is a space curve, the equations are the *parametric equations of $C$*, and $\mathsf{t}$ is a *parameter*.
* [DEF] *Derivative*: $\frac{dr}{d\mathsf{t}} = r'(\mathsf{t}) = \lim_{h\rightarrow 0}\frac{r(\mathsf{t}+h)-r(\mathsf{t})}{h}$, which is the *tangent vector* to the curve defined by $r$ at the point parametrized at $\mathsf{t}$, and a *unit tangent vector* is $T(\mathsf{t}) = \frac{r'(\mathsf{t})}{|r'(\mathsf{t})|}$.
* [DEF] *Integral*: $\int_a^br(\mathsf{t})d\mathsf{t} = \left(\int_a^bf(\mathsf{t})d\mathsf{t}\right)i + \left(\int_a^bg(\mathsf{t})d\mathsf{t}\right)j + \left(\int_a^bh(\mathsf{t})d\mathsf{t}\right)k$.
* [DEF] *Arc Length*:
    * $\begin{align}L &= \int_a^b\sqrt{[f'(\mathsf{t})]^2+[g'(\mathsf{t})]^2+[h'(\mathsf{t})]^2}d\mathsf{t} \\ &= \int_a^b\sqrt{\left(\frac{dx}{d\mathsf{t}}\right)^2 + \left(\frac{dy}{d\mathsf{t}}\right)^2 + \left(\frac{dz}{d\mathsf{t}}\right)^2}d\mathsf{t}\end{align}$.
    * Compact Form: $L = \int_a^b|r'(\mathsf{t})|d\mathsf{t}$.
* [DEF] *Smooth*: 
    * A parametrization $r(\mathsf{t})$ is smooth on an interval $I$ if $r'$ is continuous and $r'(\mathsf{t})\neq 0$ on $I$. 
    * A curve is smooth if it has a smooth parametrization.
* [DEF] *Curvature*:
    * Curvature measures how quickly a curve changes direction at a point. It is defined as the magnitude of the rate of change of the unit tangent vector (which indicates direction) wrt. arc length (i.e. how much the direction changes per unit of arc length).
    * $\kappa = \left|\frac{dT}{ds}\right| = \left|\frac{dT/d\mathsf{t}}{ds/d\mathsf{t}}\right|$, where $T(\mathsf{t}) = \frac{r'(\mathsf{t})}{|r'(\mathsf{t})|}$. As $\frac{ds}{d\mathsf{t}} = |r'(\mathsf{t})|$, $\kappa = \frac{|T'(\mathsf{t})|}{|r'(\mathsf{t})|}$.
    * Easy computation: $\kappa(\mathsf{t}) = \frac{|r'(\mathsf{t})\times r''(\mathsf{t})|}{|r'(\mathsf{t})|^3}$ (pf. 880).
    * Plane Curve: For a plane curve $y=f(x)$, we have $r(x) = xi+f(x)j$, therefore $\kappa(x) = \frac{|f''(x)|}{[1+(f'(x))^2]^{3/2}}$. (cf. 881)
* [DEF] *Normality*: 
    * The *principal unit normal vector* $N(\mathsf{t})$ is the unit vector of $T'(\mathsf{t})$: $N(\mathsf{t}) = \frac{T'(\mathsf{t})}{|T'(\mathsf{t})|}$.
    * The *binormal vector* is a unit vector $B$ that is perpendicular to both $T$ and $N$, where $B(\mathsf{t}) = T(\mathsf{t})\times N(\mathsf{t})$.
    * The plane determined by $B$ and $N$ at a point $P$ on a curve $C$ is called the *normal plane of $C$ at $P$*. It consists of all the lines that are orthogonal to the tangent vector $T$. (NB: for related concepts *osculating plane/circle*, cf. 883).
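
The "easy computation" formula for curvature and the compact arc length formula can be tried on a circular helix. The curve $r(\mathsf{t})=<\mathtt{cos}\mathsf{t},\mathtt{sin}\mathsf{t},\mathsf{t}>$, the hand-computed derivatives, the midpoint rule and all names below are assumptions made for this sketch.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(u):
    return math.sqrt(sum(ui * ui for ui in u))


def helix_derivatives(t):
    """r'(t) and r''(t) for r(t) = <cos t, sin t, t>, derived by hand."""
    r1 = (-math.sin(t), math.cos(t), 1.0)
    r2 = (-math.cos(t), -math.sin(t), 0.0)
    return r1, r2


def curvature(r1, r2):
    """kappa = |r' x r''| / |r'|^3."""
    return norm(cross(r1, r2)) / norm(r1) ** 3


def arc_length(speed, a, b, steps=10000):
    """Midpoint-rule approximation of the integral of |r'(t)| over [a, b]."""
    h = (b - a) / steps
    return sum(speed(a + (i + 0.5) * h) for i in range(steps)) * h


kappa = curvature(*helix_derivatives(1.0))
length = arc_length(lambda t: norm(helix_derivatives(t)[0]), 0.0, 2 * math.pi)
print(kappa)   # approximately 0.5, independent of t for this helix
print(length)  # approximately 2*sqrt(2)*pi for one full turn
```

For this helix $|r'(\mathsf{t})|=\sqrt{2}$ is constant, so both results can be confirmed by hand.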
    

** Theorems **

* [THM] *Derivative*: If $r(\mathsf{t}) = <f(\mathsf{t}),g(\mathsf{t}),h(\mathsf{t})> = f(\mathsf{t})i+g(\mathsf{t})j+h(\mathsf{t})k$, where $f,g,h$ are differentiable, then $r'(\mathsf{t}) = <f'(\mathsf{t}),g'(\mathsf{t}),h'(\mathsf{t})> = f'(\mathsf{t})i+g'(\mathsf{t})j+h'(\mathsf{t})k$.
* [THM] *Orthogonality with Constant*: If $|r(\mathsf{t})| = c$ (a constant), then $r'(\mathsf{t})$ is orthogonal to $r(\mathsf{t})$ for all $\mathsf{t}$, i.e. $r(\mathsf{t})\cdot r'(\mathsf{t}) = 0$. (i.e. If a curve lies on a sphere centered at the origin, then the tangent vector is always perpendicular to the position vector.)

## IV*. Partial Derivatives 

** Definitions **

* [DEF] *Graph*: If $f$ is a function of two variables with domain $D$, then the graph of $f$ is the set of all points $(x,y,z)$ in $\mathbf{R}^3$ such that $z=f(x,y)$ and $(x,y)$ is in $D$.
* [DEF] *Level Curve*: The level curves of a function $f$ of two variables are the curves with equation $f(x,y)=k$, where $k$ is a constant (in the range of $f$).
* [DEF] *Limit of Bivariate Function*: 
    * Let $f$ be a function of two variables whose domain $D$ includes points arbitrarily close to $(a,b)$. Then the limit of $f(x,y)$ as $(x,y)$ approaches $(a,b)$ is $L$, i.e. $\lim_{(x,y)\rightarrow(a,b)}f(x,y)=L$ if for every number $\varepsilon > 0$ there is a corresponding number $\delta > 0$ s.t. if $(x,y)\in D$ and $0<\sqrt{(x-a)^2+(y-b)^2}<\delta$ then $|f(x,y)-L|<\varepsilon$.
    * For the limit to exist, $f(x,y)\rightarrow L_i$ as $(x,y)\rightarrow(a,b)$ along any path $i$ must produce the same value $L_i$. If two paths give different limits, the limit doesn't exist.
* [DEF] *Continuity of Bivariate Function*:
    * A bivariate function $f$ is continuous at $(a,b)$ if $\lim_{(x,y)\rightarrow(a,b)}f(x,y)=f(a,b)$.
    * $f$ is *continuous on $D$* if $f$ is continuous at every point $(a,b)$ in $D$.
* [DEF] *Bivariate Polynomial Function*: A sum of terms of the form $cx^my^n$.
* [DEF] *Rational Function*: A ratio of polynomials.
* [DEF] *Tangent Plane* (space=$\mathbf{R}^3$, derivation: 939): 
    * The tangent plane to the surface $S$ at the point $P$ is the plane that contains all the tangent lines at $P$.
    * Suppose $f$ has continuous partial derivatives. An equation of the tangent plane to the surface $z=f(x,y)$ at the point $P(x_0,y_0,z_0)$ is $z-z_0 = f_x(x_0,y_0)(x-x_0) + f_y(x_0,y_0)(y-y_0)$, where $f_x(x_0,y_0)$ and $f_y(x_0,y_0)$ are the slopes at $P$ of the curves cut from the surface by the planes $y=y_0$ and $x=x_0$, respectively.
* [DEF] *Linear Approximation* (or *Tangent Plane Approximation*):
    * Given $z=f(x,y)$ for a surface $S$, we know that the tangent plane of the surface at $(a,b)$ is $z-f(a,b) = f_x(a,b)(x-a)+f_y(a,b)(y-b) \Rightarrow z = f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$. Then, the *linearization of $f$ at $(a,b)$* (i.e. linear approx.) is the function $f(x,y)\simeq L(x,y) = f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$.
* [DEF] *Differentiability*: 
    * If $z=f(x,y)$, then $f$ is differentiable at $(a,b)$ if $\Delta z$ can be expressed in the form $\Delta z = f_x(a,b)\Delta x + f_y(a,b)\Delta y + \varepsilon_1\Delta x + \varepsilon_2\Delta y$, where $\varepsilon_1$ and $\varepsilon_2$ $\rightarrow 0$ as $(\Delta x,\Delta y)\rightarrow (0,0)$.
    * A differentiable function is one for which the linear approximation is a good approximation when $(x,y)$ is near $(a,b)$.
* [DEF] *Differential of $z=f(x,y)$* (visualization. 944:fig.7):
    * Recall that for a single variable differentiable function $y=f(x)$, $dy = f'(x)dx$.
    * $dz = f_x(x,y)dx + f_y(x,y)dy = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy$.
    * We know that in linear approx., $f(x,y) \simeq f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$. Now this can be written as $f(x,y) \simeq f(a,b) + dz$. 
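
A small numerical illustration of the linearization: the sketch below assumes the sample function $f(x,y)=xe^{xy}$ with hand-computed partial derivatives, linearizes it at $(1,0)$ and compares $L$ with $f$ at the nearby point $(1.1,-0.1)$. The function and names are illustrative choices.

```python
import math


def f(x, y):
    return x * math.exp(x * y)


def f_x(x, y):
    # product rule: d/dx [x e^{xy}] = e^{xy} + x y e^{xy}
    return math.exp(x * y) + x * y * math.exp(x * y)


def f_y(x, y):
    # d/dy [x e^{xy}] = x^2 e^{xy}
    return x ** 2 * math.exp(x * y)


def linearization(a, b):
    """Return L(x, y) = f(a,b) + f_x(a,b)(x - a) + f_y(a,b)(y - b)."""
    fa, fxa, fya = f(a, b), f_x(a, b), f_y(a, b)
    return lambda x, y: fa + fxa * (x - a) + fya * (y - b)


L = linearization(1.0, 0.0)   # here L(x, y) = 1 + (x - 1) + y = x + y
approx = L(1.1, -0.1)
exact = f(1.1, -0.1)
print(approx, exact)          # the two values agree to about 0.015
```

The gap between `approx` and `exact` shrinks as the evaluation point moves closer to $(1,0)$, which is what differentiability at $(a,b)$ expresses.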

** Theorems **

* [THM] *Continuity Production*: The sums, products, ratios (where the denominator is nonzero) and compositions of continuous functions are continuous too.
* [THM] *Clairaut's Theorem*: Suppose $f$ is defined on a disk $D$ that contains the point $(a,b)$. If the functions $f_{xy}$ and $f_{yx}$ are both continuous on $D$, then $f_{xy}(a,b) = f_{yx}(a,b)$.
* [THM] *Differentiability*: If the partial derivatives $f_x$ and $f_y$ exist near $(a,b)$ and are continuous at $(a,b)$, then $f$ is differentiable at $(a,b)$.
* [THM] *Chain Rule*:
    * Case 1: If $z=f(x,y)$, where $x=g(\mathsf{t}),y=h(\mathsf{t})$, all functions are differentiable, then $\frac{dz}{d\mathsf{t}}=\frac{\partial z}{\partial x}\frac{dx}{d\mathsf{t}}+\frac{\partial z}{\partial y}\frac{dy}{d\mathsf{t}}$.
    * Case 2: If $z=f(x,y)$, where $x=g(\mathsf{s},\mathsf{t}),y=h(\mathsf{s},\mathsf{t})$, all functions are differentiable, then $\frac{\partial z}{\partial \mathsf{s}}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial \mathsf{s}}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial \mathsf{s}}$ and $\frac{\partial z}{\partial \mathsf{t}}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial \mathsf{t}}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial \mathsf{t}}$.
    * General: If $u=f(x_1,...,x_i,...,x_n)$, where each $x_i=g_i(\mathsf{t}_1,...,\mathsf{t}_j,...,\mathsf{t}_m)$, and all functions are differentiable, then $\frac{\partial u}{\partial \mathsf{t}_j} = \frac{\partial u}{\partial x_1}\frac{\partial x_1}{\partial \mathsf{t}_j} + ... + \frac{\partial u}{\partial x_n}\frac{\partial x_n}{\partial \mathsf{t}_j}$ for each $j=1,...,m$.
* [THM] *Implicit Function Theorem*:
    * If $y=f(x)$ is defined implicitly by $F(x,y)=0$, i.e. $F(x,f(x))=0$, then (provided $F_y\neq 0$) $\frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} = -\frac{F_x}{F_y}$. (pf. "advanced", 953)
    * If $z=f(x,y)$ is defined implicitly by $F(x,y,z)=0$, i.e. $F(x,y,f(x,y))=0$, then (provided $F_z\neq 0$),
        * $\frac{\partial z}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z}$, and
        * $\frac{\partial z}{\partial y} = -\frac{\partial F/\partial y}{\partial F/\partial z}$.
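
Case 1 of the chain rule and the first implicit differentiation formula can be checked numerically. The sketch assumes the sample composite $z=x^2y+3xy^4$ with $x=\mathtt{sin}2\mathsf{t},y=\mathtt{cos}\mathsf{t}$ and the sample curve $x^3+y^3-6xy=0$; the partial derivatives are computed by hand and a central difference serves as an independent check.

```python
import math


def z_of_t(t):
    x, y = math.sin(2 * t), math.cos(t)
    return x ** 2 * y + 3 * x * y ** 4


def dz_dt_chain(t):
    """dz/dt = z_x * dx/dt + z_y * dy/dt (chain rule, case 1)."""
    x, y = math.sin(2 * t), math.cos(t)
    z_x = 2 * x * y + 3 * y ** 4
    z_y = x ** 2 + 12 * x * y ** 3
    dx_dt = 2 * math.cos(2 * t)
    dy_dt = -math.sin(t)
    return z_x * dx_dt + z_y * dy_dt


def central_difference(g, t, h=1e-6):
    return (g(t + h) - g(t - h)) / (2 * h)


chain = dz_dt_chain(0.0)                    # 3 * 2 + 0 * 0 = 6
numeric = central_difference(z_of_t, 0.0)


def dy_dx_implicit(x, y):
    """dy/dx = -F_x / F_y for F(x, y) = x^3 + y^3 - 6xy."""
    F_x = 3 * x ** 2 - 6 * y
    F_y = 3 * y ** 2 - 6 * x
    return -F_x / F_y


slope = dy_dx_implicit(3.0, 3.0)            # (3, 3) lies on the curve: 27 + 27 = 54
print(chain, numeric, slope)
```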

### *Topic A. Gradient

** Definitions **

* [DEF] *Directional Derivative*: 
    * The directional derivative of $f$ at $(x_0,y_0)$ in the direction of a unit vector $u = <a,b>$ is $D_uf(x_0,y_0) = \lim_{h\rightarrow0}\frac{f(x_0+ha,y_0+hb)-f(x_0,y_0)}{h}$, if this limit exists.
    * The partial derivatives wrt. $x$ and $y$ are just special cases of the directional derivative, where $f_x = D_if$ with $u=i=<1,0>$ and $f_y=D_jf$ with $u=j=<0,1>$.
    * General Form: $D_{\mathbf{u}}f(\mathbf{x}_0) = \lim_{h\rightarrow0}\frac{f(\mathbf{x}_0+h\mathbf{u})-f(\mathbf{x}_0)}{h}$.
* [DEF] *Gradient*:
    * Given $f(x,y)$, the gradient of $f$ is the vector function $\nabla f(x,y) = <f_x(x,y),f_y(x,y)> = \frac{\partial f}{\partial x}i+\frac{\partial f}{\partial y}j$.
    * The directional derivative in the direction of $u=<a,b>$ can be written as $D_uf(x,y) = f_x(x,y)a + f_y(x,y)b = <f_x(x,y),f_y(x,y)>\cdot <a,b> = \nabla f(x,y)\cdot u$. i.e. $D_uf(x,y)$ is the scalar projection of $\nabla f(x,y)$ onto $u$.
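
The gradient form of the directional derivative can be compared with the limit definition. This sketch assumes the sample function $f(x,y)=x^2y^3-4y$, the point $(2,-1)$ and the direction $<2,5>$ (normalized to a unit vector inside the helper); the difference quotient is a symmetric variant of the one in the definition, evaluated at a small fixed $h$.

```python
import math


def f(x, y):
    return x ** 2 * y ** 3 - 4 * y


def grad_f(x, y):
    """Gradient <f_x, f_y>, computed by hand."""
    return (2 * x * y ** 3, 3 * x ** 2 * y ** 2 - 4)


def directional_derivative(grad, v):
    """D_u f = grad . u, where u = v / |v| is the unit vector along v."""
    length = math.hypot(v[0], v[1])
    u = (v[0] / length, v[1] / length)
    return grad[0] * u[0] + grad[1] * u[1], u


def difference_quotient(func, x0, y0, u, h=1e-6):
    """Symmetric difference quotient approximating the limit definition."""
    forward = func(x0 + h * u[0], y0 + h * u[1])
    backward = func(x0 - h * u[0], y0 - h * u[1])
    return (forward - backward) / (2 * h)


x0, y0 = 2.0, -1.0
D, u = directional_derivative(grad_f(x0, y0), (2.0, 5.0))
numeric = difference_quotient(f, x0, y0, u)
print(D, numeric)  # both close to 32/sqrt(29)
```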

** Theorems **

* [THM] *Derivative*:
    * If $f(x,y)$ is differentiable, then $f$ has a directional derivative in the direction of any unit vector $u=<a,b>$ and $D_uf(x,y) = f_x(x,y)a+f_y(x,y)b$.
    * Derivation: Write $g(h)=f(x,y)$ with $x=x_0+ha,y=y_0+hb$, then by the chain rule we have $g'(h) = \frac{\partial f}{\partial x}\frac{dx}{dh} + \frac{\partial f}{\partial y}\frac{dy}{dh} = f_x(x,y)a+f_y(x,y)b$.
    * If $u$ makes an angle $\theta$ with the positive $x$-axis, then $u=<\mathtt{cos}\theta,\mathtt{sin}\theta>$.
* [THM] *Maximal Directional Derivative*:
    * Given $f(x,y)$, the maximum value of the directional derivative $D_uf(\mathbf{x})$ is $|\nabla f(\mathbf{x})|$ and it occurs when $\mathbf{u}$ has the same direction as the gradient vector $\nabla f(\mathbf{x})$.
    * Derivation: $D_uf=\nabla f\cdot\mathbf{u}=|\nabla f||\mathbf{u}|\mathtt{cos}\theta=|\nabla f|\mathtt{cos}\theta$, therefore $D_uf$ has the maximum $|\nabla f|$ when $\theta=0$.
* [THM] *Tangent Plane* (visual. 964:fig.9):
    * A surface $S$ is defined by $F(x,y,z)=k$. If we have a point $P(x_0,y_0,z_0)$ on the surface, and a differentiable vector function describing a curve $C$ on $S$ through $P$: $r(\mathsf{t})=<x(\mathsf{t}),y(\mathsf{t}),z(\mathsf{t})>$, we have,
        * $\frac{d}{d\mathsf{t}}F(x(\mathsf{t}),y(\mathsf{t}),z(\mathsf{t})) = \nabla F\cdot r'(\mathsf{t}) = 0$ (differentiating $F(x(\mathsf{t}),y(\mathsf{t}),z(\mathsf{t}))=k$ by the chain rule), so the gradient $\nabla F$ is perpendicular to the tangent vector $r'(\mathsf{t})$ of every such curve through the point $(x(\mathsf{t}),y(\mathsf{t}),z(\mathsf{t}))$, i.e. perpendicular to the surface there. Therefore the line in the direction of the gradient (*the normal line to $S$*) is the line that is orthogonal to the tangent plane through the point.
        * The Normal Line: Given that the direction of the normal line at point $(x_0,y_0,z_0)$ is $\nabla F(x_0,y_0,z_0)$, by symmetric equation, the line is described by $\frac{x-x_0}{F_x(x_0,y_0,z_0)}=\frac{y-y_0}{F_y(x_0,y_0,z_0)}=\frac{z-z_0}{F_z(x_0,y_0,z_0)}$.
        * By scalar equation, the tangent plane at point $(x_0,y_0,z_0)$ is $\nabla F(x_0,y_0,z_0)\cdot <(x-x_0),(y-y_0),(z-z_0)> = F_x(x_0,y_0,z_0)(x-x_0)+F_y(x_0,y_0,z_0)(y-y_0)+F_z(x_0,y_0,z_0)(z-z_0) = 0$, where $\nabla F(x_0,y_0,z_0)$ is the direction of the normal line to $S$ and $<(x-x_0),(y-y_0),(z-z_0)>$ is a vector in the tangent plane (NB: the vector obtained by subtracting the point $(x_0,y_0,z_0)$ from the point $(x,y,z)$).
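
The tangent plane and normal line formulas above can be worked through on a concrete level surface. The sketch assumes the ellipsoid $\frac{x^2}{4}+y^2+\frac{z^2}{9}=3$ and the point $(-2,1,-3)$ on it; the gradient is computed by hand and the helper names are illustrative.

```python
def grad_F(x, y, z):
    """Gradient of F(x, y, z) = x^2/4 + y^2 + z^2/9, computed by hand."""
    return (x / 2.0, 2.0 * y, 2.0 * z / 9.0)


def tangent_plane(grad, p0):
    """Return (a, b, c, d) of a x + b y + c z + d = 0 with normal grad through p0."""
    a, b, c = grad
    d = -(a * p0[0] + b * p0[1] + c * p0[2])
    return (a, b, c, d)


def normal_line(grad, p0, t):
    """Point on the normal line r(t) = p0 + t * grad."""
    return tuple(pi + t * gi for pi, gi in zip(p0, grad))


p0 = (-2.0, 1.0, -3.0)        # on the surface: 4/4 + 1 + 9/9 = 3
n = grad_F(*p0)               # <-1, 2, -2/3>
plane = tangent_plane(n, p0)  # -x + 2y - (2/3)z - 6 = 0, i.e. 3x - 6y + 2z + 18 = 0
print(n)
print(plane)
print(normal_line(n, p0, 1.0))
```

Multiplying the plane equation by $-3$ clears the fraction and gives $3x-6y+2z+18=0$.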