feat(book/big-o): add new chapter on how to determine big o from code.
amejiarosario committed Sep 29, 2020
1 parent 2e77826 commit 68c73d4
Showing 8 changed files with 309 additions and 73 deletions.
77 changes: 42 additions & 35 deletions book/content/part01/algorithms-analysis.asc

Large diffs are not rendered by default.

78 changes: 41 additions & 37 deletions book/content/part01/big-o-examples.asc
@@ -5,7 +5,7 @@ endif::[]

=== Big O examples

There are many kinds of algorithms. Most of them fall into one of the eight time complexities that we are going to explore in this chapter.
There are many kinds of algorithms. Most of them fall into one of the eight time complexities that we will explore in this chapter.

.Eight Running Time Complexities You Should Know
- Constant time: _O(1)_
@@ -22,9 +22,10 @@ We are going to provide examples for each one of them.
Before we dive in, here’s a plot with all of them.

.CPU operations vs. Algorithm runtime as the input size grows
image::image5.png[CPU time needed vs. Algorithm runtime as the input size increases]
// image::image5.png[CPU time needed vs. Algorithm runtime as the input size increases]
image::big-o-running-time-complexity.png[CPU time needed vs. Algorithm runtime as the input size increases]

The above chart shows how the running time of an algorithm is related to the amount of work the CPU has to perform. As you can see, O(1) and O(log n) are very scalable. However, O(n^2^) and worse can make your computer run for years [big]#😵# on large datasets. We are going to give some examples so you can identify each one.
The above chart shows how the algorithm's running time is related to the work the CPU has to perform. As you can see, O(1) and O(log n) are very scalable. However, O(n^2^) and worse can convert your CPU into a furnace 🔥 for massive inputs.

[[constant]]
==== Constant
@@ -53,13 +54,13 @@ As you can see in both examples (array and linked list), if the input is a colle
==== Logarithmic
(((Logarithmic)))
(((Runtime, Logarithmic)))
Represented in Big O notation as *O(log n)*, when an algorithm has this running time it means that as the size of the input grows the number of operations grows very slowly. Logarithmic algorithms are very scalable. One example is the *binary search*.
Represented in Big O notation as *O(log n)*, when an algorithm has this running time, it means that as the input size grows, the number of operations grows very slowly. Logarithmic algorithms are very scalable. One example is the *binary search*.
indexterm:[Runtime, Logarithmic]

[[logarithmic-example]]
===== Searching on a sorted array

The binary search only works for sorted lists. It starts searching for an element on the middle of the array and then it moves to the right or left depending if the value you are looking for is bigger or smaller.
The binary search only works for sorted lists. It starts searching for an element in the middle of the array, and then it moves to the right or left depending on whether the value you are looking for is bigger or smaller than the middle element.

// image:image7.png[image,width=528,height=437]

@@ -68,15 +69,15 @@ The binary search only works for sorted lists. It starts searching for an elemen
include::{codedir}/runtimes/02-binary-search.js[tag=binarySearchRecursive]
----

This binary search implementation is a recursive algorithm, which means that the function `binarySearchRecursive` calls itself multiple times until the solution is found. The binary search splits the array in half every time.
This binary search implementation is a recursive algorithm, which means that the function `binarySearchRecursive` calls itself multiple times until the program finds a solution. The binary search splits the array in half every time.

Finding the runtime of recursive algorithms is not always obvious. It may require tools like recursion trees or the https://adrianmejia.com/blog/2018/04/24/analysis-of-recursive-algorithms/[Master Theorem]. The `binarySearchRecursive` function divides the input in half each time. As a rule of thumb, when you have an algorithm that divides the data in half on each call you are most likely looking at a logarithmic runtime: _O(log n)_.
Finding the runtime of recursive algorithms is not always obvious. It may require tools like recursion trees or the https://adrianmejia.com/blog/2018/04/24/analysis-of-recursive-algorithms/[Master Theorem]. The `binarySearchRecursive` function divides the input in half each time. As a rule of thumb, when you have an algorithm that divides the data in half on each call, you are most likely looking at a logarithmic runtime: _O(log n)_.
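To make the halving concrete, here is a minimal sketch of a recursive binary search. It is not the book's `02-binary-search.js` listing (which is not shown in this diff); the name `binarySearchSketch` and its `low`/`high` parameters are assumptions made for illustration.

```javascript
// Hypothetical sketch: returns the index of `target` in `sortedArray`, or -1.
// Each call discards half of the remaining range, so there are O(log n) calls.
function binarySearchSketch(sortedArray, target, low = 0, high = sortedArray.length - 1) {
  if (low > high) return -1; // empty range: the value is not present

  const mid = Math.floor((low + high) / 2);
  if (sortedArray[mid] === target) return mid;

  return sortedArray[mid] < target
    ? binarySearchSketch(sortedArray, target, mid + 1, high) // search right half
    : binarySearchSketch(sortedArray, target, low, mid - 1); // search left half
}

console.log(binarySearchSketch([1, 3, 5, 7, 9, 11], 7)); // 3
```

Doubling the array length adds only one more call, which is what the logarithmic rule of thumb predicts.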

[[linear]]
==== Linear
(((Linear)))
(((Runtime, Linear)))
Linear algorithms are one of the most common runtimes. It’s represented as *O(n)*. Usually, an algorithm has a linear running time when it iterates over all the elements in the input.
Linear algorithms are one of the most common runtimes. Their Big O representation is *O(n)*. Usually, an algorithm has a linear running time when it visits every input element a fixed number of times.

[[linear-example]]
===== Finding duplicates in an array using a map
@@ -91,19 +92,19 @@ include::{codedir}/runtimes/03-has-duplicates.js[tag=hasDuplicates]
----

.`hasDuplicates` has multiple scenarios:
* *Best-case scenario*: first two elements are duplicates. It only has to visit two elements.
* *Best-case scenario*: the first two elements are duplicates. It only has to visit two elements and return.
* *Worst-case scenario*: no duplicates or duplicates are the last two. In either case, it has to visit every item in the array.
* *Average-case scenario*: duplicates are somewhere in the middle of the collection. Only half of the array will be visited.
* *Average-case scenario*: duplicates are somewhere in the middle of the collection.

As we learned before, the big O cares about the worst-case scenario, where we would have to visit every element on the array. So, we have an *O(n)* runtime.

Space complexity is also *O(n)* since we are using an auxiliary data structure. We have a map that in the worst case (no duplicates) will hold every word.
Space complexity is also *O(n)* since we are using an auxiliary data structure. We have a map that, in the worst case (no duplicates), will hold every word.
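As a rough illustration of the scenarios above, the following sketch checks for duplicates with a `Map`. The book's actual `hasDuplicates` listing is not visible in this diff, so the name `hasDuplicatesSketch` and the details below are assumptions.

```javascript
// Hypothetical sketch: one pass over the input, remembering each word seen.
// Time is O(n) in the worst case; the map can grow to n entries, so space is O(n).
function hasDuplicatesSketch(words) {
  const seen = new Map();
  for (const word of words) {
    if (seen.has(word)) return true; // best case: returns on the second element
    seen.set(word, true);
  }
  return false; // worst case: every element was visited and stored
}

console.log(hasDuplicatesSketch(['a', 'b', 'a'])); // true
console.log(hasDuplicatesSketch(['a', 'b', 'c'])); // false
```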

[[linearithmic]]
==== Linearithmic
(((Linearithmic)))
(((Runtime, Linearithmic)))
An algorithm with a linearithmic runtime is represented as _O(n log n)_. This one is important because it is the best worst-case runtime a comparison-based sort can achieve! Let’s look at merge sort.
You can represent linearithmic algorithms as _O(n log n)_. This one is important because it is the best worst-case runtime a comparison-based sort can achieve! Let’s look at merge sort.

[[linearithmic-example]]
===== Sorting elements in an array
@@ -117,7 +118,7 @@ The ((Merge Sort)), like its name indicates, has two functions merge and sort. L
----
include::{codedir}/algorithms/sorting/merge-sort.js[tag=splitSort]
----
<1> If the array only has two elements we can sort them manually.
<1> If the array only has two elements, we can sort them manually.
<2> We divide the array into two halves.
<3> Merge the two parts recursively with the `merge` function explained below

@@ -134,15 +135,15 @@ The merge function combines two sorted arrays in ascending order. Let’s say th
.Mergesort visualization. Shows the split, sort and merge steps
image::image11.png[Mergesort visualization,width=500,height=600]

How do we obtain the running time of the merge sort algorithm? The mergesort divides the array in half each time in the split phase, which gives _log n_ levels, and the merge function joins the splits at each level, which takes _n_ work. The total work is *O(n log n)*. There are more formal ways to reach this runtime, like using the https://adrianmejia.com/blog/2018/04/24/analysis-of-recursive-algorithms/[Master Method] and https://www.cs.cornell.edu/courses/cs3110/2012sp/lectures/lec20-master/lec20.html[recursion trees].
How do we obtain the running time of the merge sort algorithm? The merge sort divides the array in half each time in the split phase, which gives _log n_ levels, and the merge function joins the splits at each level, which takes _n_ work. The total work is *O(n log n)*. There are more formal ways to reach this runtime, like using the https://adrianmejia.com/blog/2018/04/24/analysis-of-recursive-algorithms/[Master Method] and https://www.cs.cornell.edu/courses/cs3110/2012sp/lectures/lec20-master/lec20.html[recursion trees].
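The split and merge phases can be sketched in a few lines. This is a simplified illustration, not the book's `merge-sort.js`: the names `mergeSketch` and `mergeSortSketch` are hypothetical, and the base case here stops at one element instead of sorting two elements by hand.

```javascript
// Hypothetical sketch: combine two already-sorted arrays in ascending order.
// Every element is copied once, so one merge costs O(n).
function mergeSketch(left, right) {
  const merged = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    merged.push(left[i] <= right[j] ? left[i++] : right[j++]);
  }
  // One side is exhausted; append whatever remains of the other.
  return merged.concat(left.slice(i), right.slice(j));
}

// Halving the array gives about log n levels; each level merges n elements.
function mergeSortSketch(array) {
  if (array.length <= 1) return array.slice(); // assumed base case
  const mid = Math.floor(array.length / 2);
  return mergeSketch(
    mergeSortSketch(array.slice(0, mid)),
    mergeSortSketch(array.slice(mid)),
  );
}

console.log(mergeSortSketch([5, 2, 9, 1, 5, 6])); // [ 1, 2, 5, 5, 6, 9 ]
```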

[[quadratic]]
==== Quadratic
(((Quadratic)))
(((Runtime, Quadratic)))
Running times that are quadratic, O(n^2^), are the ones to watch out for. They usually don’t scale well when they have a large amount of data to process.
Quadratic running times, O(n^2^), are the ones to watch out for. They usually don’t scale well when they have a large amount of data to process.

Usually they have double-nested loops, where each one visits all or most elements in the input. One example of this is a naïve implementation to find duplicate words in an array.
Usually, they have double-nested loops, where each one visits all or most elements in the input. One example of this is a naïve implementation to find duplicate words in an array.
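The double-nested loop pattern can be sketched as follows. This is only an illustration under assumed names (`hasDuplicatesNaiveSketch` is hypothetical) and may differ from the book's own listing.

```javascript
// Hypothetical sketch: compare every word with every later word.
// With n words this performs up to n * (n - 1) / 2 comparisons, which is O(n^2).
function hasDuplicatesNaiveSketch(words) {
  for (let outer = 0; outer < words.length; outer++) {
    for (let inner = outer + 1; inner < words.length; inner++) {
      if (words[outer] === words[inner]) return true;
    }
  }
  return false;
}

console.log(hasDuplicatesNaiveSketch(['x', 'y', 'z', 'y'])); // true
```

Unlike the map-based approach, this version needs no extra memory, but it pays for that with the quadratic number of comparisons.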

[[quadratic-example]]
===== Finding duplicates in an array (naïve approach)
@@ -165,34 +166,37 @@ Let’s say you want to find a duplicated middle name in a phone directory book
==== Cubic
(((Cubic)))
(((Runtime, Cubic)))
Cubic *O(n^3^)* and higher polynomial functions usually involve many nested loops. An example of a cubic algorithm is a multi-variable equation solver (using brute force):
Cubic *O(n^3^)* and higher polynomial functions usually involve many nested loops. Examples of cubic algorithms are a brute-force multi-variable equation solver and finding three elements in an array that add up to a given number.

[[cubic-example]]
===== Solving a multi-variable equation
===== 3 Sum

Lets say we want to find the solution for this multi-variable equation:
Let's say you want to find 3 items in an array that add up to a target number. One brute-force solution is to visit every possible combination of 3 elements and add them up to see if they equal the target.

_3x + 9y + 8z = 79_

A naïve approach to solve this will be the following program:

//image:image13.png[image,width=528,height=448]

.Naïve implementation of multi-variable equation solver
[source, javascript]
----
include::{codedir}/runtimes/06-multi-variable-equation-solver.js[tag=findXYZ]
function threeSum(nums, target = 0) {
  const ans = [];

  // Try every combination of three distinct positions i < j < k.
  for (let i = 0; i < nums.length - 2; i++) {
    for (let j = i + 1; j < nums.length - 1; j++) {
      for (let k = j + 1; k < nums.length; k++) {
        if (nums[i] + nums[j] + nums[k] === target) {
          ans.push([nums[i], nums[j], nums[k]]);
        }
      }
    }
  }

  return ans;
}
----

WARNING: This is just an example, there are better ways to solve multi-variable equations.
As you can see, three nested loops usually translate to O(n^3^). If we had four nested loops (4sum), it would be O(n^4^) and so on. We refer to a runtime of the form _O(n^c^)_, where _c > 1_, as a *polynomial runtime*.

As you can see three nested loops usually translates to O(n^3^). If you have a four variable equation and four nested loops it would be O(n^4^) and so on. When we have a runtime in the form of _O(n^c^)_, where _c > 1_, we refer to this as a *polynomial runtime*.
NOTE: You can improve the runtime of 3sum from _O(n^3^)_ to _O(n^2^)_ if you sort the items first and then use one loop and two pointers to find the solutions.
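One way to realize the sort-plus-two-pointers idea is sketched below. The function name `threeSumSorted` is hypothetical, and this version makes an extra assumption: it reports each distinct triplet of values once, whereas the brute-force `threeSum` reports one triplet per matching combination of positions.

```javascript
// Hypothetical sketch: O(n log n) sort, then an O(n) two-pointer scan
// for each first element, giving O(n^2) overall.
function threeSumSorted(nums, target = 0) {
  const sorted = [...nums].sort((a, b) => a - b); // copy so the input is untouched
  const ans = [];

  for (let i = 0; i < sorted.length - 2; i++) {
    if (i > 0 && sorted[i] === sorted[i - 1]) continue; // skip repeated first values

    let lo = i + 1;
    let hi = sorted.length - 1;
    while (lo < hi) {
      const sum = sorted[i] + sorted[lo] + sorted[hi];
      if (sum === target) {
        ans.push([sorted[i], sorted[lo], sorted[hi]]);
        lo++;
        hi--;
        while (lo < hi && sorted[lo] === sorted[lo - 1]) lo++; // skip repeats
        while (lo < hi && sorted[hi] === sorted[hi + 1]) hi--;
      } else if (sum < target) {
        lo++; // need a bigger sum: move the left pointer right
      } else {
        hi--; // need a smaller sum: move the right pointer left
      }
    }
  }

  return ans;
}

console.log(threeSumSorted([-1, 0, 1, 2, -1, -4])); // [ [ -1, -1, 2 ], [ -1, 0, 1 ] ]
```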

[[exponential]]
==== Exponential
(((Exponential)))
(((Runtime, Exponential)))
Exponential runtimes, O(2^n^), mean that every time the input grows by one the number of operations doubles. Exponential programs are only usable for a tiny number of elements (<100) otherwise they might not finish in your lifetime. [big]#💀#
Exponential runtimes, _O(2^n^)_, mean that every time the input grows by one, the number of operations doubles. Exponential programs are only usable for a tiny number of elements (<100); otherwise, they might not finish in your lifetime. [big]#💀#

Let’s do an example.

@@ -209,21 +213,21 @@ Finding all distinct subsets of a given set can be implemented as follows:
include::{codedir}/runtimes/07-sub-sets.js[tag=snippet]
----
<1> The base case is an empty element.
<2> For each element from the input append it to the results array.
<2> For each element from the input, append it to the results array.
<3> The new results array is the previous results plus a copy of each previous result with the element appended.

//.The way this algorithm generates all subsets is:
//1. The base case is an empty element (line 13). E.g. ['']
//2. For each element from the input append it to the results array (line 16)
//2. For each element from the input, append it to the results array (line 16)
//3. The new results array will be what it was before + the duplicated with the appended element (line 17)

Every time the input grows by one the resulting array doubles. That’s why it has an *O(2^n^)* runtime.
Every time the input grows by one, the resulting array doubles. That’s why it has an *O(2^n^)* runtime.
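The doubling is easy to see in a small sketch. The book's `07-sub-sets.js` listing is not shown in this diff, so the code below is an assumed reconstruction from the three callouts above, with the hypothetical name `subsetsSketch` and a string input.

```javascript
// Hypothetical sketch: build all subsets of the characters in `word`.
function subsetsSketch(word = '') {
  let results = ['']; // base case: the empty subset
  for (const char of word) {
    // Copy every subset found so far and append the new character to the copy.
    const withChar = results.map((subset) => subset + char);
    // Previous results plus the copies: the array doubles on each iteration.
    results = results.concat(withChar);
  }
  return results;
}

console.log(subsetsSketch('ab'));         // [ '', 'a', 'b', 'ab' ]
console.log(subsetsSketch('abc').length); // 8
```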

[[factorial]]
==== Factorial
(((Factorial)))
(((Runtime, Factorial)))
Factorial runtime, O(n!), is not scalable at all. Even with input sizes of ~10 elements, it will take a couple of seconds to compute. It’s that slow! [big]*🍯🐝*
The factorial runtime, _O(n!)_, is not scalable at all. Even with input sizes of ~10 elements, it will take a couple of seconds to compute. It’s that slow! [big]*🍯🐝*

.Factorial
****
@@ -240,7 +244,7 @@ A factorial is the multiplication of all the numbers less than itself down to 1.
===== Getting all permutations of a word
(((Permutations)))
(((Words permutations)))
One classic example of an _O(n!)_ algorithm is finding all the different words that can be formed with a given set of letters.
One classic example of an _O(n!)_ algorithm is finding all the different words formed with a given set of letters.

.Word's permutations
// image:image15.png[image,width=528,height=377]
@@ -251,7 +255,7 @@ include::{codedir}/runtimes/08-permutations.js[tag=snippet]

As you can see in the `getPermutations` function, the size of the resulting array is the factorial of the word length.

Factorial starts very slow, and quickly becomes uncontrollable. A word size of just 11 characters would take a couple of hours on most computers!
Factorial starts very slow and quickly becomes unmanageable. A word size of just 11 characters would take a couple of hours on most computers!
[big]*🤯*
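For reference, a permutation generator might look like the sketch below. It is not the book's `getPermutations` from `08-permutations.js` (not shown in this diff); the name `permutationsSketch` and the `prefix` parameter are assumptions, and the sketch does not remove duplicates when letters repeat.

```javascript
// Hypothetical sketch: pick each letter as the next character, then permute the rest.
// A word of n letters produces n * (n - 1) * ... * 1 = n! results.
function permutationsSketch(word = '', prefix = '') {
  if (word.length <= 1) return [prefix + word];

  const results = [];
  for (let i = 0; i < word.length; i++) {
    const rest = word.slice(0, i) + word.slice(i + 1); // the word without letter i
    results.push(...permutationsSketch(rest, prefix + word[i]));
  }
  return results;
}

console.log(permutationsSketch('abc'));         // [ 'abc', 'acb', 'bac', 'bca', 'cab', 'cba' ]
console.log(permutationsSketch('abcd').length); // 24
```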

==== Summary
@@ -265,7 +269,7 @@ We went through 8 of the most common time complexities and provided examples for
|===
|Big O Notation
|Name
|Example(s)
|Example(s)

|O(1)
|<<part01-algorithms-analysis#constant>>