# The Binomial Theorem - raising binomials to powers

$(a+b)^2 = (a^2 + 2ab + b^2)$

$(a+b)^3 = (a^3 + 3a^2b + 3ab^2 + b^3)$

$(a+b)^4 = (a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4)$

So there's a pattern.  

## Variable Portion of each term

If $n$ is the power we're raising the binomial to, then for each subsequent term in the resulting polynomial, the exponent of $a$ starts at $n$ and goes down to $0$, while the exponent of $b$ starts at $0$ and climbs to $n$.

Also, for any $n$, you end up with a polynomial with $n + 1$ terms.

So, zero-indexing the terms where $n$ is the power, and $k$ is the zero-based index of the term, each term's variable portion is:
$$a^{n-k}b^{k}$$

## Coefficients

If we lay out all of the polynomials obtained by taking $a+b$ from power $0$ and up:
$$1$$
$$a+b$$
$$a^2 + 2ab + b^2$$
$$a^3 + 3a^2b + 3ab^2 + b^3$$
$$a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$$

...and then extract just the coefficients, we get
$$1$$
$$1 \quad 1$$
$$1 \quad 2 \quad 1$$
$$1 \quad 3 \quad 3 \quad 1$$
$$1 \quad 4 \quad 6 \quad 4 \quad 1$$

This is Pascal's triangle.  Taking the row of that triangle whose zero-based index is equal to the power we're after gives the correct coefficients.
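A minimal sketch of building those rows, assuming the usual rule that each inner entry of Pascal's triangle is the sum of the two entries above it (the rule is not stated in these notes; `pascal_row` is just an illustrative name):

```python
def pascal_row(n):
    """Return row n (zero-based) of Pascal's triangle as a list of ints."""
    row = [1]
    for _ in range(n):
        # Each inner entry is the sum of the two entries above it.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

# Rows 0..4 should match the coefficient lists above.
for n in range(5):
    print(pascal_row(n))
```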

## Why?

When we're calculating a binomial to a power, it expands to a multiplication of binomials:
$$(a+b)^3 = (a+b)(a+b)(a+b)$$

When you multiply binomials, you're saying "every term in every binomial needs to be multiplied by every term in every other binomial". To do this manually, you'd go through a systematic approach:
$$
(a+b)(a+b)(a+b) 
= (a \times a \times a) + (a \times a \times b ) + (a \times b \times a) + (a \times b \times b) + ... + (b \times b \times b)
$$

So, all possible combinations of terms whose powers add up to $n$ will exist in our answer:
- $a^3$
- $a^2b$
- $ab^2$
- $b^3$

Some of them will occur more than once.  We'll only get one $a^3$, but we'll get three $ab^2$.
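The counting can be checked by brute force. The sketch below picks either `a` or `b` from each factor in every possible way and tallies the resulting products; `expand_counts` is a hypothetical helper name, not anything standard.

```python
from collections import Counter
from itertools import product

def expand_counts(n):
    """Count how often each a^i b^j product appears when (a+b)^n is multiplied out."""
    counts = Counter()
    # Pick 'a' or 'b' from each of the n factors, in every possible way.
    for picks in product("ab", repeat=n):
        num_b = picks.count("b")
        counts[(n - num_b, num_b)] += 1  # key: (power of a, power of b)
    return counts

counts = expand_counts(3)
# Tallies for a^3, a^2 b, a b^2, b^3
print(counts[(3, 0)], counts[(2, 1)], counts[(1, 2)], counts[(0, 3)])
```

For $n = 3$ the tallies come out as 1, 3, 3, 1, matching the one $a^3$ and three $ab^2$ noted above.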

Each term's coefficient is determined by how many times that combination arises in the full multiplying-out. For $(a + b)^5$, the factor $(a+b)$ appears $5$ times in the product:
$$(a+b)(a+b)(a+b)(a+b)(a+b)$$

Pascal's triangle gives the correct coefficient for a given power and term index; however, so does ${n \choose k}$.

The function ${n \choose k}$ ("$n$ choose $k$") is a function that returns the number of possible distinct sets of $k$ elements from a set of $n$.  So, if you have a set of 3 elements, ${3 \choose 1}$ will give you 3: for any set of 3 elements, there are three subsets that contain one element.

Let's pick some term that will be in the result, say $a^2b^3$.  What's the coefficient? We know that, when multiplying out, we would have chosen the $a$ from two of the $(a+b)$ factors, and the $b$ from the remaining three. So, given a set of $5$ factors, how many distinct subsets are there with three members? It's an ${n \choose k}$ problem.

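Also, note that ${5 \choose 3}$ gives the same value as ${5 \choose 2}$ (both are $10$): choosing which three factors supply the $b$ is the same as choosing which two supply the $a$. Hence the palindromic shape of each row of Pascal's triangle.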
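As a quick check, the subsets can be enumerated directly. This sketch labels the five factors 0 to 4 (an arbitrary labelling chosen for illustration) and compares the count with `math.comb` from the standard library:

```python
from itertools import combinations
from math import comb

factors = range(5)  # label the five (a+b) factors 0..4

# Each subset of three factors is one way to pick which factors contribute b.
b_choices = list(combinations(factors, 3))
print(len(b_choices), comb(5, 3), comb(5, 2))
```

All three numbers should agree at $10$.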

### Proof for ${n \choose k}$

This proof relies on thinking about the total possible set of orderings of any given set.

To calculate ${n \choose k}$, imagine the total possible set of orderings of a set of $n$ elements as a space. There will be $n!$ orderings in this space.

Now, imagine that we figure out by some means the complete set of subsets of $k$ elements from the $n$. If we have a set of (1, 2, 3), the total set of ${3 \choose 2}$ subsets is (1, 2), (1, 3) and (2, 3). We then break the total ordering space up into groups: one group for each possible subset. Each group contains every ordering of the $n$ elements whose first $k$ elements are, in some order, the members of that subset.

In our example:
* Group 1:
    * (1,2)(3)
    * (2,1)(3)
* Group 2:
    * (1,3)(2)
    * (3,1)(2)
* Group 3:
    * (2,3)(1)
    * (3,2)(1)
    
We'll have ${n \choose k}$ groups, because that's how many subsets we have. So how many orderings are going to be in each group?

Each $k$ element prefix will have $k!$ possible orderings. Each $n-k$ element suffix will have $(n-k)!$ possible orderings; so, each group will have $k!(n-k)!$ possible orderings.

As the complete set of $n!$ orderings is divided up into ${n \choose k}$ groups of equal size, we now know that:

$$n! = {n \choose k}k!(n-k)!$$

Solve for ${n \choose k}$, and you get

$$ {n \choose k} = {n! \over {k!(n-k)!}} $$

...which is the ${n \choose k}$ formula.
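The grouping argument can be reproduced mechanically. This is only a sketch of the idea above, with `group_orderings` as a made-up helper: it buckets every ordering of $1..n$ by the set formed by its first $k$ elements, so the number of buckets and their sizes can be inspected.

```python
from collections import defaultdict
from itertools import permutations
from math import comb, factorial

def group_orderings(n, k):
    """Group all orderings of 1..n by the set formed by their first k elements."""
    groups = defaultdict(list)
    for ordering in permutations(range(1, n + 1)):
        groups[frozenset(ordering[:k])].append(ordering)
    return groups

n, k = 3, 2
groups = group_orderings(n, k)
for prefix_set, orderings in sorted(groups.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(prefix_set), orderings)

# Number of groups, size of each group, and total number of orderings.
print(len(groups), comb(n, k), factorial(k) * factorial(n - k), factorial(n))
```

With $n = 3$, $k = 2$ this gives three groups of two orderings each, the same as the example above.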


## In an equation
So, any expression of $(a+b)^n$ will give rise to a set of terms: each will be some combination of the variables where the powers add up to $n$, and the coefficient will be equal to ${n \choose k}$, where $n$ is the power being raised to and $k$ is the power of $b$ (by symmetry, using the power of $a$ instead gives the same coefficient):
$${n \choose k}(a^{n-k}b^k)$$

The final answer will be the sum of all such terms, with $k$ starting at $0$ and ending at $n$:
$$(a+b)^n = \sum_{k=0}^{n}{{n \choose k}(a^{n-k}b^k)}$$
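A small numeric check of the sum over $k$ from $0$ to $n$, using `math.comb` for ${n \choose k}$. The function name `binomial_expand` and the test values are arbitrary choices for illustration:

```python
from math import comb

def binomial_expand(a, b, n):
    """Evaluate (a+b)^n as the sum over k of C(n, k) * a^(n-k) * b^k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))

# Both values should be equal.
print(binomial_expand(2, 3, 5), (2 + 3) ** 5)
```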

# Taylor Series

[This video](https://www.youtube.com/watch?v=3d6DsjIBzJ4)

Given a quadratic equation $f(x)=ax^2 + bx +c$, the impact of the constants is:
* a = the "steepness" of the parabola
* b = has a disproportionate effect when $x$ is small. Biggest impact is at the y-intercept (ie. $x \to 0$), so it determines the derivative of $f$ at the intercept: $f'(0) = b$.
* c = the intercept.

## Approximating a function with a polynomial

* You can set the intercept by setting $c$
* Set the highest y of the parabola to a given value by setting its derivative to be 0 at that point; if you don't know the intercept, this gives you a function that relates $a$ and $b$. Plugging values for that in lets you solve for $c$.
* Set the curve by setting the second derivative at the turnaround to be the same as that of the function you're approximating

The zen of powers and coefficients in an equation is that the higher powers have a greater effect with larger values of $x$.  So, you can use the lower powers to give you the curve early on, then the higher powers later but with very small coefficients to minimise their impact in the early stages.

## At $x=0$ only one term impacts any derivative
Adding higher order terms to a polynomial has no effect on the lower-order derivatives of that polynomial at $x = 0$. 
Eg. for $f(x) = ax^4 + bx^3 + cx^2 + dx + e$, if you take the second derivative then the power rule will take $dx$ and $e$ to zero, the $cx^2$ term will become the constant $2c$ (because it'll contain $x^0$), and the higher order terms will evaluate to zero at $x = 0$ because they still have an $x$ in them.

So, at $x = 0$, only one term in a polynomial controls a given derivative.
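To see this concretely, here is a sketch that differentiates a polynomial stored as a coefficient list and evaluates at $x = 0$. The list representation (index $i$ holds the coefficient of $x^i$) and the sample coefficients are assumptions made for the example:

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (c_i multiplies x^i)."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def nth_derivative_at_zero(coeffs, n):
    """Take n derivatives, then evaluate at x = 0 (only the constant term survives)."""
    for _ in range(n):
        coeffs = derivative(coeffs)
    return coeffs[0] if coeffs else 0

# f(x) = 7 + 5x + 3x^2 + 2x^3 + 1x^4  (coefficients chosen arbitrarily)
f = [7, 5, 3, 2, 1]
print([nth_derivative_at_zero(f, n) for n in range(5)])
print([factorial(n) * f[n] for n in range(5)])
```

Both lines print the same list: the $n$th derivative at $0$ is $n!$ times the coefficient of $x^n$, and no other coefficient is involved.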

To achieve the same effect at values of $x$ that are not zero, you have to offset $x$ by the value you're optimising for. For example, if that value is $\pi$:

$f(x) = a + b(x - \pi) + c(x - \pi)^2 + ...$

## Taylor polynomials

A Taylor series is an infinite series expressed in terms of a function's derivatives at a particular point, starting with the value of the function itself and then adding a term for each higher order derivative at that point. In the following, $a$ is the point the series is optimised for:

$$g(x) = f(a) + {f'(a)(x-a) \over 1!} + {f''(a)(x-a)^2 \over 2!} + {f'''(a)(x-a)^3 \over 3!} + ...$$

So, the derivatives of cosine and their values at $x=0$ are:

$\cos(0) = 1$

$-\sin(0) = 0$

$-\cos(0) = -1$

$\sin(0) = 0$

$\cos(0) = 1$

$...$

...and putting this into a polynomial that approximates $\cos$ at $x = 0$ gives us the Taylor polynomial:

$f(x) = 1 + 0{ x^1 \over 1! } - 1{ x^2 \over 2! } + 0{ x^3 \over 3! } + 1{ x^4 \over 4! } + ... $

Each derivative of this polynomial, at $x = 0$, will only return a value from the single term that corresponds to that derivative due to the cancelling out effect above. 
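A rough numeric comparison against `math.cos`. Since the odd-order derivatives are zero at $x = 0$, the sketch only sums the even powers; `cos_taylor` and the sample point $x = 1$ are illustrative choices.

```python
from math import cos, factorial

def cos_taylor(x, terms):
    """Sum the first `terms` nonzero terms of the Taylor polynomial for cos at 0."""
    # The derivative values at 0 cycle 1, 0, -1, 0, so only even powers survive.
    return sum((-1) ** m * x ** (2 * m) / factorial(2 * m) for m in range(terms))

x = 1.0
for terms in (1, 2, 3, 6):
    print(terms, cos_taylor(x, terms), cos(x))
```

The approximation should get closer to `cos(x)` as more terms are included.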

Put generally, if you want to approximate a given function $g$ near $x = 0$, you do

$$f(x) = g(0) + {dg \over dx}(0){x^1 \over 1!} + {d^2g \over dx^2}(0){x^2 \over 2!} + {d^3g \over dx^3}(0){x^3 \over 3!} + ... $$

ie. compute all of the derivatives of the function you want to approximate. The $n$th term of the approximation function is the $n$th derivative of the original function, evaluated at $x=0$ (or some other value, if that's where you're optimising), multiplied by $x^n$ and divided by $n!$.

The same function, including the possibility of optimising at some value of $x$ other than 0 ($a$):
$$f(x) = g(a) + {dg \over dx}(a){(x - a)^1 \over 1!} + {d^2g \over dx^2}(a){(x - a)^2 \over 2!} + {d^3g \over dx^3}(a){(x - a)^3 \over 3!} + ... $$
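A sketch of the shifted form, assuming the derivative values at $a$ are supplied as a list (entry $n$ is the $n$th derivative at $a$, with entry $0$ being $g(a)$). The example uses $e^x$ because every derivative of $e^x$ is $e^x$; `taylor_eval` and the choice $a = 1$ are illustrative only.

```python
from math import exp, factorial

def taylor_eval(derivs_at_a, a, x):
    """Evaluate sum_n derivs_at_a[n] * (x - a)^n / n!, where derivs_at_a[0] = g(a)."""
    return sum(d * (x - a) ** n / factorial(n) for n, d in enumerate(derivs_at_a))

a = 1.0
# Every derivative of e^x is e^x, so all derivative values at a equal e^a.
derivs = [exp(a)] * 10
print(taylor_eval(derivs, a, 1.5), exp(1.5))
```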

All of the terms of a Taylor polynomial, continued to $\infty$, are known as a Taylor series. For some functions, such as $f(x) = a^x$ (with $a > 0$), the Taylor series converges to the function for all possible values of $x$. We say that the function equals its Taylor series in this case.

For others, it only converges within some distance of the optimisation point; this distance is called the radius of convergence of the Taylor series.
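As an illustration that is not taken from the notes above: the Taylor series of $\ln(1+x)$ at $0$ is $x - {x^2 \over 2} + {x^3 \over 3} - ...$, which converges only for $-1 < x \le 1$. The sketch below compares partial sums inside and outside that range.

```python
from math import log

def log1p_partial(x, terms):
    """Partial sum of x - x^2/2 + x^3/3 - ..., the Taylor series of ln(1+x) at 0."""
    return sum((-1) ** (n + 1) * x ** n / n for n in range(1, terms + 1))

for x in (0.5, 2.0):
    partials = [round(log1p_partial(x, t), 4) for t in (5, 10, 20)]
    print(x, partials, round(log(1 + x), 4))
```

At $x = 0.5$ the partial sums settle towards $\ln(1.5)$; at $x = 2$ they swing further away from $\ln(3)$ with each extra batch of terms.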

### Euler's number


### Alternative notation for $f'(x)$

* Lagrange's notation
  * For $f(x) = x + 2$, derivatives are $f'(x) = ...$, $f'''(x)$, $f^{(4)}(x)$
  * For $y = x + 2$, derivatives are $y'$, ...

* Leibniz is the original. Very common for $ y = x + 2 $, etc.
  * ${{dy} \over {dx}} = [ x + 2 ]$, ${{d^2y} \over {dx^2}} = [...], ...$
  * Also sometimes just $d \over {dx}$, $d^2 \over dx^2$, ...
  
* Euler's notation
  * $f(x) = x + 2$
  * $D_xf(x) = 1$
  * $D_x^2f(x) = 0$
  
* Newton's notation / dot notation; rarely used outside physics, where it usually denotes derivatives with respect to time.
  * for $v(t)$, derivatives are $\dot{v}$, $\ddot{v}$, $\dddot{v}$, $\ddddot{v}$, ...

## Sequences and series

Notations for a sequence are:
1. $\{ a_1, a_2, a_3, \dots, a_n \}$
2. $\{ a_n \}$ - denotes the complete sequence where the starting value is implied
3. $\{a_n\}^\infty_{n=1}$

Sequences can often be described by a function whose values match the terms of the sequence. In this, $n$ is the $x$/input value, and the term $a_n$ is the $y$.

$$ \Bigl\{ { {n + 1} \over {n^2} } \Bigr\}_{n=1}^\infty \equiv f(x)={ {x + 1} \over {x^2} }$$
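A tiny sketch that lists the first few terms of this sequence by treating it as a function of $n$. `Fraction` is used only to keep the values exact, and `term` is an illustrative name:

```python
from fractions import Fraction

def term(n):
    """n-th term of the sequence (n + 1) / n^2, for n = 1, 2, 3, ..."""
    return Fraction(n + 1, n * n)

# First five terms, starting at n = 1.
print([str(term(n)) for n in range(1, 6)])
```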

Limits: https://www.youtube.com/watch?v=XLtkDkqk54Y


## Negative and fractional exponents

Binomials raised to negative or fractional exponents cannot be expressed as finite polynomials; their expansions are all infinite series.  This video explains using the Taylor series to solve:

https://www.youtube.com/watch?v=6kykYQK19No
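A sketch of one standard way to handle such exponents, the binomial series $(1+x)^r = \sum_k {r \choose k} x^k$ for $|x| < 1$, where ${r \choose k} = {r(r-1)\cdots(r-k+1) \over k!}$ is the generalised binomial coefficient. This may not be the exact route the video takes, and `gen_binom` and `binomial_series` are illustrative names.

```python
def gen_binom(r, k):
    """Generalised binomial coefficient r(r-1)...(r-k+1) / k! for any real r."""
    result = 1.0
    for i in range(k):
        result *= (r - i) / (i + 1)
    return result

def binomial_series(x, r, terms):
    """Partial sum of gen_binom(r, k) * x^k, approximating (1+x)^r for |x| < 1."""
    return sum(gen_binom(r, k) * x ** k for k in range(terms))

print(binomial_series(0.2, 0.5, 10), 1.2 ** 0.5)  # fractional exponent
print(binomial_series(0.2, -1, 10), 1 / 1.2)      # negative exponent
```

When $r$ is a non-negative integer the coefficients become zero once $k > r$, so the series collapses back to the finite sum from the binomial theorem.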

Working through this:
https://tutorial.math.lamar.edu/Classes/CalcII/SeriesIntro.aspx