# Big-O Math Deep Dive

## Lesson Overview

*NOTE: **This lesson is entirely optional.** The content covers the mathematics behind much of the complexity analysis in the standard material. The questions are extremely difficult and out of scope.*

**Limit notation**

When analyzing the efficiency of an algorithm, it is important to understand how the time and space requirements of the algorithm change as it handles more data. For example, suppose you have an algorithm for sorting integers that you want to deploy to production. You should know how long it takes and how much memory it requires both for small $n$ (where $n$ is the number of integers to sort) and also for increasingly large $n$, or as $n \to \infty$.

You may have seen **limit notation** such as this before. If you haven't, don't worry. Writing $n \to \infty$ is just shorthand for writing "as $n$ gets larger and larger".

> A statement $S(n)$ is true as $n \to \infty$ if there exists an $N$ such that $S(n)$ is true for all $n \geq N$.

Remember that it is not sufficient for the statement to be true only *at* some large value $N$; it must be true for all values of $n \geq N$.

**Conceptual definition**

In a mathematical context, big-O notation is used to compare the growth of two functions. The growth of a function is, conceptually, how the function behaves as the input increases towards infinity.

> $f(n) \in O(g(n))$ if $f(n)$ grows at most as quickly as $g(n)$, as $n \to \infty$.

$O(g(n))$ is the set of all of the functions that grow at most as quickly as $g(n)$. The $\in$ notation is used to denote membership in a set, and $f(n)$ is one of the functions in the set.

In most contexts in computer science, it is more common to write $f(n) = O(g(n))$ than $f(n) \in O(g(n))$. Therefore throughout this lesson, we will use $=$ instead of $\in$ for big-O comparisons.

**Mathematical definition 1**

The mathematical definition of big-O is just a formalization of the conceptual definition above.

> $f(n) \in O(g(n))$ as $n \to \infty$ if, for any given $N$, there exists a positive number $M$ such that $|f(n)| \leq Mg(n)$ for all $n \geq N$.

Using this definition, $n^2 = O(2^n)$ because, for any $N$, you can find an $M$ such that $n^2 \leq M \cdot 2^n$ for all $n \geq N$. For example:

- If $N = 3$, you can choose $M = 2$ so that $n^2 \leq 2 \cdot 2^n$ for all $n \geq 3$. (In fact, there are smaller values of $M$ that you could choose. $M = 1.5$ would work too, as would any value $M \geq \frac{9}{8}$.)
- For any $N \geq 4$, you can choose $M = 1$ so that $n^2 \leq 1 \cdot 2^n$ for all $n \geq N$.
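The choices of $M$ above can be spot-checked numerically. The following is a minimal sketch, not a proof: it only tests a finite range, and the helper name `bound_holds` and the cutoff `n_max` are assumptions made for illustration.

```python
from fractions import Fraction


def bound_holds(M, N, n_max=200):
    """Return True if n^2 <= M * 2^n for every integer n in [N, n_max]."""
    return all(n**2 <= M * 2**n for n in range(N, n_max + 1))


# M = 9/8 is the smallest constant that works from N = 3 (tight at n = 3).
print(bound_holds(Fraction(9, 8), 3))
# M = 1 fails at n = 3 (9 > 8) but works from N = 4 onwards.
print(bound_holds(1, 3))
print(bound_holds(1, 4))
```

Exact fractions are used for $M$ so that the tight case at $n = 3$ is not affected by floating-point rounding.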

**Mathematical definition 2**

This alternate mathematical definition of big-O may be simpler to understand, but it relies on limit notation.

> $f(n) \in O(g(n))$ if $\lim\limits_{n \to \infty} \frac{f(n)}{g(n)} < \infty$.

- $\lim\limits_{n \to \infty} \frac{f(n)}{g(n)}$ denotes the value that $\frac{f(n)}{g(n)}$ approaches or becomes closer and closer to as $n \to \infty$.
- $< \infty$ just means that the limit should be a finite number.

Here's another way to write this:

> For any $N$, there exists an $M$ such that $\frac{f(n)}{g(n)} < M$ for all $n > N$.

This definition means that the ratio of $f(n)$ to $g(n)$ must *not* grow towards infinity as $n \to \infty$. For example, $\lim\limits_{n \to \infty} \frac{n^2}{2^n} = 0$, so $n^2 = O(2^n)$.

But the limit of the ratio does not need to be zero. Even if $f(n) > g(n)$ for all $n$, it can still be true that $f(n) = O(g(n))$, as long as $f(n)$ does not *grow* faster than $g(n)$. For example, $100n = O(n)$, since $\lim\limits_{n \to \infty} \frac{100n}{n} = \lim\limits_{n \to \infty} 100 = 100$. This can also be shown using the initial definition of big-O by choosing $M = 100$.
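The behavior of these ratios can be sampled numerically. The sketch below is an illustration only: sampling a few values of $n$ suggests a limit but does not prove it, the helper `ratio_samples` is a hypothetical name, and the third ratio ($2^n / n^2$) is added here purely as a contrasting case.

```python
def ratio_samples(f, g, ns=(10, 20, 40, 80)):
    """Evaluate f(n) / g(n) at a few increasing values of n."""
    return [f(n) / g(n) for n in ns]


# n^2 / 2^n shrinks towards 0, consistent with n^2 = O(2^n).
shrinking = ratio_samples(lambda n: n**2, lambda n: 2**n)
# 100n / n stays at 100, consistent with 100n = O(n).
constant = ratio_samples(lambda n: 100 * n, lambda n: n)
# 2^n / n^2 keeps growing, which suggests 2^n is not O(n^2).
growing = ratio_samples(lambda n: 2**n, lambda n: n**2)

print(shrinking)
print(constant)
print(growing)
```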

As a general rule, constants can be ignored when applying big-O notation. More formally, $M f(n)$ always has the same big-O properties as $f(n)$ itself, so you can ignore $M$.

**Using derivatives to compare growth**

One of the most effective ways to examine a function's growth is through its **derivative**. The derivative of a function $f(n)$ is itself a function, denoted $f'(n)$, that tells you the rate of change of $f(n)$. ([Here is a quick guide](https://www.mathsisfun.com/calculus/derivatives-rules.html) to derivatives of common functions.) For example, if $f(n) = n^2$ and $g(n) = 2^n$,

\begin{align*}
f'(n) &= 2n \\
g'(n) &= \ln(2) \cdot 2^n \\
\end{align*}

where $\ln$ is the [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm). Remember that constants like $\ln(2)$ can be ignored in big-O analysis. 

While it may be harder to tell by inspection that $g(n)$ grows faster than $f(n)$, it should be more straightforward to see that $g'(n) > f'(n)$ for all large $n$, and therefore $f(n) = O(g(n))$.
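To see what "for all large $n$" means in this example, the two derivatives can be evaluated directly. This is a rough numerical sketch over a finite range of integers; the names `f_prime`, `g_prime`, and `last_exception` are chosen here for illustration.

```python
import math


def f_prime(n):
    """Derivative of f(n) = n^2."""
    return 2 * n


def g_prime(n):
    """Derivative of g(n) = 2^n."""
    return math.log(2) * 2**n


# Largest integer n in the sampled range where f'(n) is still >= g'(n).
last_exception = max(n for n in range(1, 200) if f_prime(n) >= g_prime(n))
print(last_exception)
```

For small $n$ the derivative of $n^2$ is actually the larger one, which is why the comparison is only claimed for large $n$.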

## Assessment Questions

## Question 1

Using derivatives, relate $f(n) = \sqrt n$ and $g(n) = \log_2(n)$ using big-O notation. See *Using derivatives to compare growth* above for an example.

### Hint

Remember that $\sqrt n = n^{\frac{1}{2}}$.

### Solution

Using [this guide for calculating derivatives](https://www.mathsisfun.com/calculus/derivatives-rules.html):

\begin{align*}
f'(n) &= \frac{1}{2\sqrt n} \\
g'(n) &= \frac{1}{n \ln(2)} \\
\end{align*}

In general for large $n$, $\sqrt{n} < n$, so $\frac{1}{\sqrt n} > \frac{1}{n}$. As always, constants like $\frac{1}{2}$ and $\frac{1}{\ln(2)}$ can be ignored. Therefore, since $f'(n) > g'(n)$ for all large $n$, $g(n) = O(f(n))$.
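The inequality $f'(n) > g'(n)$ does not hold for every $n$, only from some point onwards. The sketch below locates that point numerically; it checks a finite range only, and the variable names are assumptions made for this example.

```python
import math


def f_prime(n):
    """Derivative of f(n) = sqrt(n)."""
    return 1 / (2 * math.sqrt(n))


def g_prime(n):
    """Derivative of g(n) = log2(n)."""
    return 1 / (n * math.log(2))


# Solving 1 / (2 * sqrt(n)) = 1 / (n * ln 2) for n gives the crossover point.
crossover = (2 / math.log(2)) ** 2

smaller_before = all(f_prime(n) < g_prime(n) for n in range(1, 9))
larger_after = all(f_prime(n) > g_prime(n) for n in range(9, 10_000))
print(crossover, smaller_before, larger_after)
```

The crossover lies between $n = 8$ and $n = 9$, so here "large $n$" already starts at $n = 9$.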

## Question 2

Use any method to show that $f(n) = n^{\frac{3}{2}}$ grows faster than $g(n) = n\log_2(n)$.

### Solution

There are a few ways to do this. The first way is a heuristic. As per the common complexities table in the Lesson Overview, $\sqrt{n}$ grows faster than $\log_2(n)$, so if both expressions are multiplied by $n$, it follows that $n^{\frac{3}{2}}$ grows faster than $n \log_2(n)$.

A second more formal approach uses derivatives. As seen in the lesson that defines big-O notation, if we can show that $f'(n) > g'(n)$ for all large values of $n$, then $f$ grows faster than $g$. Given $f$ and $g$ defined as in the question, we have
 
\begin{align*}
f'(n) &= \frac{3}{2} n^{\frac{1}{2}} \\
&= O(\sqrt{n}). \\
\end{align*}

Using [log laws](https://en.wikipedia.org/wiki/List_of_logarithmic_identities#Using_simpler_operations), $\log_2(n) = \frac{\ln(n)}{\ln(2)}$. Thus we can rewrite $g$ as

\begin{align*}
g(n) &= \frac{1}{\ln(2)} n \ln(n). \\
\end{align*}

Using the product rule to calculate the derivative of $g$ yields

\begin{align*}
g'(n) &= \frac{1}{\ln(2)} \left( \ln(n) \cdot 1 + \frac{1}{n} \cdot n \right) \\
&= \frac{1}{\ln(2)}(\ln(n)+1) \\
&= \log_2(n) + \frac{1}{\ln(2)} \\
&= O(\log_2(n)). \\
\end{align*}

As shown via derivatives in an exercise from the big-O definition lesson, $\sqrt{n}$ grows faster than $\log_2(n)$, so for large enough $n$, $\frac{3}{2}\sqrt{n} > \log_2(n) + \frac{1}{\ln(2)}$. Therefore, since $f'(n) > g'(n)$ for large $n$, $f$ grows faster than $g$.
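As a cross-check, the second mathematical definition can be applied directly to the ratio $\frac{g(n)}{f(n)}$. The derivation below is a sketch that assumes L'Hôpital's rule, a standard calculus result that is not introduced in this lesson, to evaluate the limit of $\frac{\ln(n)}{\sqrt n}$:

\begin{align*}
\lim_{n \to \infty} \frac{n \log_2(n)}{n^{\frac{3}{2}}} &= \lim_{n \to \infty} \frac{\log_2(n)}{\sqrt n} \\
&= \frac{1}{\ln(2)} \lim_{n \to \infty} \frac{\ln(n)}{\sqrt n} \\
&= \frac{1}{\ln(2)} \lim_{n \to \infty} \frac{1/n}{1/(2\sqrt n)} \\
&= \frac{1}{\ln(2)} \lim_{n \to \infty} \frac{2}{\sqrt n} \\
&= 0 \\
\end{align*}

The limit is finite, so $n \log_2(n) = O(n^{\frac{3}{2}})$, which agrees with the derivative-based argument.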

## Question 3

[Advanced] Relate $f(n) = 100n^{100}$ and $g(n) = 2^n$ using big-O notation.

### Hint

What happens if you take the derivative over and over again?

### Solution

This can be simplified by recognizing that constants can be ignored, so we can compare $f(n) = n^{100}$ to $g(n) = 2^n$. We covered in the Lesson Overview that $n^2 = O(2^n)$, so how does that change as the exponent of $n$ changes?

These functions exemplify why we can't always rely on a visualization alone. See below for a graph of both functions on a log scale for values up to 100. For this range, it appears as if $n^{100}$ is growing much faster than $2^n$. If we expand the $x$-axis to larger values of $n$, we may hit computational issues. ($100^{100} = 10^{200}$ is already a very large number, equal to a [googol](https://en.wikipedia.org/wiki/Googol) squared.)

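```python
import matplotlib.pyplot as plt

# Compare n^100 and 2^n for n = 1..100 on a logarithmic y-axis.
N = 100
n = list(range(1, N + 1))

# Convert the exact (very large) integers to floats for plotting.
f = [float(i**100) for i in n]
g = [float(2**i) for i in n]

plt.plot(n, f, color='blue', label='n^100')
plt.plot(n, g, color='red', label='2^n')
plt.yscale('log')
plt.legend()
plt.show()
```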

For this comparison, let's try taking the derivative:

\begin{align*}
f'(n) &= 100n^{99} \\
g'(n) &= \ln(2) \cdot 2^n \\
\end{align*}

And the second derivative:

\begin{align*}
f''(n) &= 9900n^{98} \\
g''(n) &= \ln(2)^2 \cdot 2^n \\
\end{align*}

Let $f^{(m)}(n)$ be the $m^{\textrm{th}}$ derivative of $f(n)$. By repeating this, we will see that:

\begin{align*}
f^{(101)}(n) &= 0 \\
g^{(101)}(n) &= \ln(2)^{101} \cdot 2^n \\
\end{align*}

So while all derivatives of $g(n)$ grow exponentially, the derivatives of $f(n)$ eventually have no growth. We can deduce from this that while $f(n)$ may have larger values than $g(n)$ for some $n$, eventually $g(n)$ will grow faster than $f(n)$, so $n^{100} = O(2^n)$.
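The point where $2^n$ finally overtakes $n^{100}$ can also be found by brute force. The sketch below relies on Python's arbitrary-precision integers, so no overflow or rounding is involved; the function name `first_crossover` and its parameters are hypothetical, introduced only for this example.

```python
def first_crossover(power=100, start=2):
    """Smallest integer n >= start with 2^n > n^power, using exact integers.

    The search starts at 2 because 2^1 > 1^power holds trivially at n = 1.
    """
    n = start
    while 2**n <= n**power:
        n += 1
    return n


crossover = first_crossover()
print(crossover)
```

The crossover lies between $n = 900$ and $n = 1000$, far outside the range plotted above, which is why the graph alone is misleading.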

## Question 4

[Advanced] Relate $f(n)$ to $g(n)$ where:

\begin{align*}
f(n) &= 0.135 \cdot 2^{2n+3} + 10^{100} \sqrt n - 56 \\
g(n) &= 10n^{1000} + 9n^{999} + 8\pi \\
\end{align*}

### Hint

When a function is the sum of many functions, it only grows as fast as its fastest growing term. This can be formally stated as follows:

> If $f(n) = \sum\limits_{i=1}^k f_i(n)$ for a fixed number of terms $k$, and there exists some $m$ such that $f_i(n) = O(f_m(n))$ for all $i$, then $f(n) = O(f_m(n))$.

### Solution

First, let's drastically simplify this problem by ignoring the multiplicative and additive constants.

\begin{align*}
f(n) &= 0.135 \cdot 2^{2n+3} + 10^{100} \sqrt n - 56 \\
&= O(2^{2n+3} + \sqrt n) \\
&= O(2^3 \cdot (2^2)^n + \sqrt n) \\
&= O(4^n + \sqrt n) \\
g(n) &= 10n^{1000} + 9n^{999} + 8\pi \\
&= O(n^{1000} + n^{999}) \\
\end{align*}

Now we can compare $f(n) = 4^n + \sqrt n$ to $g(n) = n^{1000} + n^{999}$.

Using the hint, if $f(n)$ can be broken down into a sum of other functions, then we only need to know which of those functions grows the fastest. Once we find that function $f_m(n)$, we know that $f(n) = O(f_m(n))$.

Using similar approaches to above (either by derivatives or a visualization), we can show that $\sqrt n = O(n^2)$, and as we saw in the Lesson Overview, $n^2 = O(2^n)$. It should then make intuitive sense that $2^n = O(4^n)$ since $4^n = (2^n)^2$, therefore $\sqrt n = O(n^2) = O(2^n) = O(4^n)$. Using the above logic, $f(n) = O(4^n)$.

Since $n^{1000} = n \cdot n^{999}$, it should make sense that $n^{999} = O(n^{1000})$, therefore $g(n) = O(n^{1000})$. As we saw in Question 3, exponential growth beats any polynomial growth in the long run, so $n^{1000} = O(4^n)$. Since the dominant term of $g$ is bounded by the dominant term of $f$, $g(n) = O(f(n))$.
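The dominant terms $4^n$ and $n^{1000}$ are far too large to compare directly as floats, but their logarithms are easy to compare. The following sketch works in base-2 logarithms; the helper name `log2_gap` and the sample values of $n$ are assumptions chosen for illustration.

```python
import math


def log2_gap(n):
    """log2(n^1000) - log2(4^n): positive while the polynomial term is larger."""
    return 1000 * math.log2(n) - 2 * n


for n in (10, 1_000, 10_000, 100_000):
    print(n, log2_gap(n))
```

The gap is positive for the smaller sample values and negative for the larger ones, which shows the exponential term eventually overtaking the polynomial term.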

## Question 5

Show that if $f(n) \to \infty$ as $n \to \infty$, then $f(n) + K = O(f(n))$ for a constant $K$.

This is a formalization of what you have already seen above, namely that additive constants can be ignored for big-O comparisons. For example, $O(n^2 + 3) = O(n^2)$.

### Solution

For this example, it is probably easiest to use the second definition of big-O. We need to show that if $f(n) \to \infty$ as $n \to \infty$ then $\lim\limits_{n \to \infty} \frac{f(n) + K}{f(n)} < \infty$.

\begin{align*}
\lim_{n \to \infty} \frac{f(n) + K}{f(n)} &= \lim_{n \to \infty} \left( \frac{f(n)}{f(n)} + \frac{K}{f(n)} \right) \\
&= \lim_{n \to \infty} \left( 1 + \frac{K}{f(n)} \right) \\
&= \lim_{n \to \infty} 1 + \lim_{n \to \infty} \frac{K}{f(n)} \\
&= 1 + K \lim_{n \to \infty} \frac{1}{f(n)} \\
\end{align*}

Since $f(n) \to \infty$ as $n \to \infty$, $\frac{1}{f(n)} \to 0$ as $n \to \infty$. Therefore the above expression reduces to 1, which is $< \infty$.
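A quick numerical illustration of this limit, using the example $f(n) = n^2$ and $K = 3$ from the question. This is only a sketch: the helper names are hypothetical, and a handful of sample points illustrates the limit rather than proving it.

```python
def square(n):
    """Example function f(n) = n^2, which tends to infinity with n."""
    return n**2


def shifted_ratio(f, K, n):
    """Evaluate (f(n) + K) / f(n)."""
    return (f(n) + K) / f(n)


ratios = [shifted_ratio(square, 3, n) for n in (1, 10, 100, 1000)]
print(ratios)
```

The ratio starts well above 1 and approaches 1 as $n$ grows, matching the derivation above.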

## Question 6

Show that $Kf(n) = O(f(n))$ for a constant $K$.

Again, this is a formalization of what you have seen above, that multiplicative constants can be ignored for big-O comparisons. For example, $O(3n^2) = O(n^2)$.

### Solution

Again, this is most easily shown using the second mathematical definition above. We need to show that $\lim\limits_{n \to \infty} \frac{Kf(n)}{f(n)} < \infty$.

\begin{align*}
\lim_{n \to \infty} \frac{Kf(n)}{f(n)} &= K \lim\limits_{n \to \infty} \frac{f(n)}{f(n)} \\
&= K \lim_{n \to \infty} 1 \\
&= K \\
\end{align*}

Since $K$ is a constant, it is $< \infty$.

## Question 7

Show that if $f_1(n) = O(g_1(n))$ and $f_2(n) = O(g_2(n))$, then $f_1(n) f_2(n) = O(g_1(n) g_2(n))$.

This is an important result since it shows that if you can break up a function into distinct parts, you can analyze the big-O notation of each part independently, then take the product. For example, $O(n^2 2^n) = O(n^2) O(2^n)$.

### Solution

This is easiest to show when using the first definition of big-O.

- $f_1(n) = O(g_1(n))$, therefore there exist $M_1$ and $N_1$ such that $|f_1(n)| \leq M_1g_1(n)$ for all $n \geq N_1$.
- $f_2(n) = O(g_2(n))$, therefore there exist $M_2$ and $N_2$ such that $|f_2(n)| \leq M_2g_2(n)$ for all $n \geq N_2$.

We need to find an $M$ and an $N$ such that $|f_1(n) f_2(n)| \leq Mg_1(n)g_2(n)$ for all $n \geq N$.

Since the [absolute value of a product is the product of absolute values](https://proofwiki.org/wiki/Absolute_Value_of_Product), we know that:

$$|f_1(n) f_2(n)| = |f_1(n)||f_2(n)|$$

Plugging in the inequalities above, we have that for $n \geq \max(N_1, N_2)$:

\begin{align*}
|f_1(n)||f_2(n)| &\leq M_1g_1(n) M_2 g_2(n) \\
&= M_1M_2 g_1(n) g_2(n)
\end{align*}

We can therefore choose $M = M_1M_2$ and $N = \max(N_1, N_2)$ to satisfy the inequality, which shows that $f_1(n) f_2(n) = O(g_1(n) g_2(n))$.
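The construction of the constants can be tried out on a concrete pair of functions. The functions and constants below are hypothetical examples chosen for illustration, and the check covers a finite range only.

```python
def f1(n):
    return 3 * n + 5


def g1(n):
    return n


def f2(n):
    return 2 * n**2 - 7


def g2(n):
    return n**2


M1, N1 = 4, 5   # |f1(n)| <= 4 * g1(n) for all n >= 5
M2, N2 = 2, 2   # |f2(n)| <= 2 * g2(n) for all n >= 2

# Combine the constants exactly as in the proof.
M, N = M1 * M2, max(N1, N2)

product_bound_holds = all(
    abs(f1(n) * f2(n)) <= M * g1(n) * g2(n) for n in range(N, 10_000)
)
print(M, N, product_bound_holds)
```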

## Question 8

Show that if $f_1(n) = O(g_1(n))$ and $f_2(n) = O(g_2(n))$ then $f_1(n) + f_2(n) = O(g_1(n) + g_2(n))$.

This is another result that you have already seen above. It shows that if a function can be split into a sum of other functions, it only grows as quickly as its fastest growing component. For example, $O(n^2 + 2^n) = O(2^n)$.

### Solution

This is a similar proof to the solution to Question 7.

- $f_1(n) = O(g_1(n))$, therefore there exist $M_1$ and $N_1$ such that $|f_1(n)| \leq M_1g_1(n)$ for all $n \geq N_1$.
- $f_2(n) = O(g_2(n))$, therefore there exist $M_2$ and $N_2$ such that $|f_2(n)| \leq M_2g_2(n)$ for all $n \geq N_2$.

By the [triangle inequality](https://en.wikipedia.org/wiki/Triangle_inequality):

$$|f_1(n) + f_2(n)| \leq |f_1(n)| + |f_2(n)|$$

Plugging in the inequalities above, we have that for $n \geq \max(N_1, N_2)$:

\begin{align*}
|f_1(n)| + |f_2(n)| &\leq M_1g_1(n) + M_2g_2(n) \\
&\leq 2\max(M_1, M_2) \max(g_1(n), g_2(n)) \\
\end{align*}

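We have now found the constant $M = 2\max(M_1, M_2)$ to show that $f_1(n) + f_2(n) = O(\max(g_1(n), g_2(n)))$. Since the bounds above force $g_1(n)$ and $g_2(n)$ to be non-negative for large $n$, $\max(g_1(n), g_2(n)) \leq g_1(n) + g_2(n)$, and therefore $f_1(n) + f_2(n) = O(g_1(n) + g_2(n))$.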
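The same kind of finite spot-check works for the sum. The functions and constants below are hypothetical examples, not part of the proof, and only a finite range of $n$ is tested.

```python
def f1(n):
    return 3 * n + 5


def g1(n):
    return n


def f2(n):
    return 2 * n**2 - 7


def g2(n):
    return n**2


M1, N1 = 4, 5   # |f1(n)| <= 4 * g1(n) for all n >= 5
M2, N2 = 2, 2   # |f2(n)| <= 2 * g2(n) for all n >= 2

# Combine the constants as in the proof: M = 2 * max(M1, M2).
M, N = 2 * max(M1, M2), max(N1, N2)

max_bound_holds = all(
    abs(f1(n) + f2(n)) <= M * max(g1(n), g2(n)) for n in range(N, 10_000)
)
sum_bound_holds = all(
    abs(f1(n) + f2(n)) <= M * (g1(n) + g2(n)) for n in range(N, 10_000)
)
print(M, N, max_bound_holds, sum_bound_holds)
```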