# Lesson 6: Big O Notation 
---


# Review
---

1. What is an array?
2. In Python, arrays and lists differ in syntax. Do you remember how?
3. What do you think linked lists are?
4. Below, instantiate an object from the class BeautifulSoup and pass in a document called mywiki using html.parser.

In [None]:
soup = BeautifulSoup(mywiki, 'html.parser')

# Concept 1: Big O Notation
---



## What is it?

Big O Notation is used to describe the complexity and performance of an algorithm. There are two types of complexity: time complexity and space complexity.

1. Time complexity describes how an algorithm's running time grows as the input size grows.
2. Space complexity describes how much extra space the algorithm needs (to store things like temporary variables, buffers, etc.) as the input size grows.
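
To make the idea of extra space concrete, here is a minimal sketch. The function names `total_in_place` and `running_totals` are made up for illustration. Both read the same input, but only the second builds a new list that grows with the input.

```python
def total_in_place(numbers):
    # Uses one extra variable no matter how long the input is: O(1) extra space.
    total = 0
    for value in numbers:
        total += value
    return total


def running_totals(numbers):
    # Builds a new list with one entry per input element: O(n) extra space.
    totals = []
    total = 0
    for value in numbers:
        total += value
        totals.append(total)
    return totals


print(total_in_place([1, 2, 3, 4]))   # 10
print(running_totals([1, 2, 3, 4]))   # [1, 3, 6, 10]
```

Both functions take time proportional to the length of the input; they differ only in how much additional memory they need.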

Efficiency is a big part of software engineering, and of life too. We don't want apps to run slowly or take up a huge amount of memory on our phones. Big O Notation (more formally, asymptotic notation) lets us gauge the efficiency of algorithms.

What we really want to know is how long these algorithms take. The actual running time depends on many factors: the speed of the computer, the programming language, and the compiler that translates the program into code that runs directly on the computer, among others. Because those factors change from machine to machine, we describe how the number of steps grows with the input size instead of measuring seconds.

Big O Notation allows us to check the efficiency of algorithms.

## Examples:
---

1. Look at the code below. In the worst case (the value is missing or sits in the last position), this code runs until it has traversed the entire array. Let's say this array has N elements. For simplicity, we can say this algorithm runs in N time. The array itself takes up N spaces in memory; counting only the extra space the function needs (as defined above), the space complexity is constant.

In [None]:
# Returns the index if the value is found in the array
def findElement(array, targetValue):
  for n in range(len(array)): 
    if array[n] == targetValue: 
      return n 
    
  return -1

2. Look at the code below. Given what you know from the previous example, what can you tell about this? 
> Hint: The nested loops are a big deal.

In [None]:
adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]

for a in adj:
  for b in fruits:
    print(a, b)

red apple
red banana
red cherry
big apple
big banana
big cherry
tasty apple
tasty banana
tasty cherry


Let O be each element of adj.
Let X be each element of fruits.

O | X X X 

O | X X X 

O | X X X 

You can see we have a total of 9 X's, and 3^2 is 9. Both lists have N = 3 elements, so the inner print runs N × N times. This small example shows that the code is N^2.
> Note: N is an arbitrary number. In production this can be in the millions or billions.

> Note 2: Just because this is a nested loop doesn't always mean it will be N^2. 
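
To illustrate Note 2, here is a small sketch of a nested loop that is not N^2. The function name `count_nested_steps` and the fixed inner bound of 3 are assumptions chosen for the example. Because the inner loop always runs 3 times, the total is 3N steps, which grows linearly.

```python
def count_nested_steps(items):
    # The outer loop runs once per item; the inner loop always runs 3 times.
    steps = 0
    for _item in items:
        for _repeat in range(3):
            steps += 1
    return steps


print(count_nested_steps(range(10)))    # 30
print(count_nested_steps(range(100)))   # 300
```

Making the input 10 times larger makes the step count 10 times larger, not 100 times larger.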

## DIY:
---

1. What is the run time of this code?

In [None]:
x = [2, 4, 6, 8, 10, 12]

y = [2, 2, 2, 2, 2, 2]

plot(x, y, 'Rainfall in inches')

# Concept 2: The Notation
---




## What are they?

There are 3 types of asymptotic notation that you should know about:
1. Big Omega (Lower Bound) - The running time grows at least this fast; informally linked to the best case
2. Big Theta (Tight Bound) - The running time grows exactly this fast, bounded from above and below
3. Big O (Upper Bound) - The running time grows at most this fast; informally linked to the worst case

There is more to it than knowing what is best or worst, but you will learn more about this in your advanced math classes.

Click on [this](https://images.app.goo.gl/QQgTkHk99557WktDA) to see a graph.
In this graph, the higher a curve climbs, the less efficient the algorithm. In mathematical terms, visualize y=x^2 and y=3. As x grows, y=x^2 climbs far above the flat line y=3, so a running time that grows like x^2 is much less efficient than a constant one.

Here is an example:
> Suppose you have 10 dollars in your pocket. You go up to your friend and say, "I have an amount of money in my pocket, and I guarantee that it's no more than one million dollars." Your statement is absolutely true, though not terribly precise. One million dollars is an upper bound (Big O) on 10 dollars. Now think of efficiency and imagine that dollars are seconds. In computer science terms, we can say that in the worst case this algorithm runs for a million seconds. The lower bound is 0 dollars because you can't have a negative amount of money. The tight bound is \$10 because it shows you the exact amount.


You should know the general differences between these 3, but in industry, and for simplicity, we normally use Big O so we can prepare for the "worst case" while still acknowledging better cases.

Example:
> Someone can say, "This algorithm is generally O(n^2), but in some cases it can be O(n)."
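
One way to see how a single algorithm can be O(n^2) in general yet O(n) in some cases is to count its steps. The sketch below uses insertion sort, which is not covered in this lesson and is chosen here only as an illustration; the function name `insertion_sort_comparisons` is made up.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values and return how many comparisons were made."""
    items = list(values)
    comparisons = 0
    for i in range(1, len(items)):
        j = i
        # Move items[j] left until the element before it is not larger.
        while j > 0:
            comparisons += 1
            if items[j - 1] <= items[j]:
                break
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return comparisons


print(insertion_sort_comparisons([1, 2, 3, 4, 5, 6]))  # already sorted: 5
print(insertion_sort_comparisons([6, 5, 4, 3, 2, 1]))  # reversed: 15
```

With 6 elements, the sorted input needs n - 1 = 5 comparisons (linear), while the reversed input needs 1 + 2 + 3 + 4 + 5 = 15 (quadratic growth).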

## The following are some of the most common Big-O functions:

Organized by efficiency, from most to least efficient.
> Note: This ordering is the general rule of thumb. In some cases, such as small inputs, the logarithmic functions can perform better or worse than their place in this list suggests.

| Name | Big O |
| --- | --- |
| Constant | O(1) |
| Logarithmic | O(log(n)) |
| Linear | O(n) |
| Log Linear | O(n log(n)) |
| Quadratic | O(n^2) |
| Cubic | O(n^3) |
| Exponential | O(2^n) |
| Factorial | O(n!) |
| | O(n^n) |
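
To get a feel for how differently these functions grow, the sketch below evaluates several of them for a few small values of n. The helper name `growth_row` is made up, and base-2 logarithms are an assumption (the base does not change the Big O class).

```python
import math


def growth_row(n):
    # Evaluate several common growth functions at a single input size n.
    return {
        "O(1)": 1,
        "O(log n)": math.log2(n),
        "O(n)": n,
        "O(n log n)": n * math.log2(n),
        "O(n^2)": n ** 2,
        "O(2^n)": 2 ** n,
        "O(n!)": math.factorial(n),
    }


for n in (4, 8, 16):
    print(n, growth_row(n))
```

Even at n = 16 the factorial entry is already in the trillions, while the logarithmic entry is only 4.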


### Example of O(1)

No matter what, it always takes the same time.

O(1) describes an algorithm that will always execute in the same time (or space) regardless of the size of the input data set.

```
def IsFirstElementNone(elements):
    return elements[0] is None
```
### Example of O(n)

Scales with the size of a list.

O(N) describes an algorithm whose performance will grow linearly and in direct proportion to the size of the input data set. The example below also demonstrates how Big O favours the worst-case performance scenario: a matching value could be found during any iteration of the for loop and the function would return early, but Big O notation will always assume the upper limit where the algorithm will perform the maximum number of iterations.

```
def ContainsValue(elements, value):
    # Check each element in turn and stop early on the first match.
    for element in elements:
        if element == value:
            return True
    return False
```
### Example of O(n^2)

Grows with the square of the input size.

O(N^2) represents an algorithm whose performance is directly proportional to the square of the size of the input data set. This is common with algorithms that involve nested iterations over the data set. Deeper nested iterations will result in O(N^3), O(N^4), etc.
```
def printElements(elements):
    # The inner loop runs once per element for every index of the outer loop.
    for i in range(len(elements)):
        for j in elements:
            print(i, j)
```
### Example of Logarithms

Halves every time.

Click [here](https://images.app.goo.gl/bHZJ8pULzWWbmNHA6) for an image of a log graph. Algorithms that run in logarithmic time grow in proportion to the log function. Logarithms express repeated halving (dividing).

> Example: Log base 2 of 8 => $\log_2(8) = 3$. How many 2s do we multiply together to get 8? Three. Equivalently, we can divide 8 by 2 three times: 8 -> 4 -> 2 -> 1.

We will see examples of logarithmic running times in more complex algorithms.
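
In the meantime, the halving idea itself can be sketched in a few lines. The function name `halving_steps` is made up for illustration; it counts how many times a number can be halved (with integer division) before reaching 1.

```python
def halving_steps(n):
    # Count how many integer halvings it takes to get from n down to 1.
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


print(halving_steps(8))          # 3
print(halving_steps(1_000_000))  # 19
```

Going from 8 to 1,000,000 multiplies the input by 125,000, yet the number of steps only rises from 3 to 19. That slow growth is what makes logarithmic algorithms attractive.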

### Significant Numbers

Drop constants and keep only the dominant term.

Since we are looking at Big O, the upper bound, we only consider the fastest-growing term of an algorithm's running time. When the input (which has length n in this case) becomes extremely large, constants become insignificant, just as twice or half of infinity is still infinity. Therefore, we can ignore the constants. If an algorithm takes 3n + 5 steps, we drop the insignificant numbers and just say that the algorithm is O(n). Let's look at the first example below.
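
As a quick numeric check of this rule, the sketch below counts the steps of a hypothetical algorithm that does 5 setup steps plus 3 steps per element (the function name `count_operations` and those numbers are assumptions matching the 3n + 5 case).

```python
def count_operations(n):
    # Hypothetical algorithm: 5 setup steps plus 3 steps per element.
    operations = 5
    for _ in range(n):
        operations += 3
    return operations


for n in (10, 1_000, 100_000):
    total = count_operations(n)
    # The ratio total / n settles toward the constant 3 as n grows.
    print(n, total, total / n)
```

As n grows, the "+ 5" stops mattering and the total stays a fixed multiple of n, which is why the whole expression is simply called O(n).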




## Examples:
---

In [None]:
# Returns the index if the value is found in the array

def findElement(array, targetValue):
  for n in range(len(array)): # O(n)
    if array[n] == targetValue: # O(1)
      return n # O(1)
    
  return -1 # O(1)

# The loop runs up to n times with O(1) work each time, plus an O(1) return: O(n) overall

Here are the running times of some operations we might perform on a phone book, from fastest to slowest:

* O(1) (in the worst case): Given the page that a business's name is on and the business name, find the phone number.

* O(1) (in the average case): Given the page that a person's name is on and their name, find the phone number.

* O(log n): Given a person's name, find the phone number by opening to a point about halfway through the part of the book you haven't searched yet and checking whether the person's name is there. If it is not, repeat the process on the half of the book where the person's name must lie. (We halve the book each time.)

* O(n): Find all people whose phone numbers contain the digit "5".
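
The O(log n) phone book lookup can be sketched as a binary search. This is only an illustration under assumptions: the phone book is modeled as a list of (name, number) pairs already sorted by name, and the names, numbers, and the function name `find_number` are all made up.

```python
def find_number(phone_book, name):
    """Return (number, pages_checked) for name, or (None, pages_checked)."""
    low, high = 0, len(phone_book) - 1
    checks = 0
    while low <= high:
        middle = (low + high) // 2   # open the book about halfway
        checks += 1
        entry_name, number = phone_book[middle]
        if entry_name == name:
            return number, checks
        if entry_name < name:
            low = middle + 1         # the name must be in the later half
        else:
            high = middle - 1        # the name must be in the earlier half
    return None, checks


book = [("Ada", "555-0101"), ("Ben", "555-0102"), ("Cleo", "555-0103"),
        ("Dev", "555-0104"), ("Eli", "555-0105"), ("Fay", "555-0106"),
        ("Gus", "555-0107")]
print(find_number(book, "Fay"))   # ('555-0106', 2)
print(find_number(book, "Zed"))   # (None, 3)
```

With 7 entries, no lookup needs more than 3 checks, because each check discards about half of the remaining entries.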


## DIY:
---

1. What is the running time of this algorithm? What is its space complexity?

In [None]:
def linear_algo(items):
    for item in items:
        print(item)

    for item in items:
        print(item)

linear_algo([4, 5, 6, 8])

2. What is the running time of this algorithm? What is its space complexity?

In [None]:
def complex_algo(items):

    for i in range(5):
        print("Python is awesome")

    for item in items:
        print(item)

    for item in items:
        print(item)

    print("Big O")
    print("Big O")
    print("Big O")

complex_algo([4, 5, 6, 8])

# Concept 3: Analyze our Previous Code
---



## Importance of Analysis

Answer this: Why do you think it is important to analyze your code?

## The Common List Operations And Their Big-O Notations
1. Insertion: O(n)
2. Deletion: O(n)
3. Getting an item: O(1)
4. Iteration: O(n)
5. Get length: O(1)

## The Common Stack Operations And Their Big-O Notations
1. Insertion: O(1)
2. Deletion: O(1)
3. Getting an item: O(n)
4. Iteration: O(n)
5. Get length: O(n)

## The Common Queue Operations And Their Big-O Notations
1. Insertion: O(1)
2. Deletion: O(1)
3. Getting an item: O(n)
4. Iteration: O(n)
5. Get length: O(n)
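
Here is a minimal sketch of these three structures in Python. Using a plain `list` as the stack and `collections.deque` as the queue is an assumption made for this example, not the only way to build them. Note that Python's `list` and `deque` both store their length, so `len()` is O(1) for them; the O(n) length listed above applies to implementations that have to walk through every item to count.

```python
from collections import deque

# A Python list used as a stack: push and pop both happen at the end.
stack = []
stack.append("a")        # push
stack.append("b")        # push
top = stack.pop()        # pop removes the most recently added item

# A deque used as a queue: add at the back, remove from the front.
queue = deque()
queue.append("first")
queue.append("second")
front = queue.popleft()  # removes the oldest item

# Inserting at the front of a list shifts every existing element: O(n).
numbers = [2, 3, 4]
numbers.insert(0, 1)

print(top, front, numbers)   # b first [1, 2, 3, 4]
```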

## Examples:
---

For each example, describe the time and space complexity.

In [None]:
from datetime import date

# date object of today's date
today = date.today() 

print("Current year:", today.year)
print("Current month:", today.month)
print("Current day:", today.day)

In [None]:
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
 
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
 
<p class="story">...</p>
"""
 
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')
soup.find_all('a')

In [None]:
presidents = ["Washington", "Adams", "Jefferson", "Madison", "Monroe", "Adams", "Jackson"]
for num, name in enumerate(presidents, start=1):
    print("President {}: {}".format(num, name))

In [None]:
class Dinner:
  def __init__(self, food, drink):
    self.f = food
    self.d = drink
init = Dinner(1, 2)

In [None]:
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
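
For the recursive example, counting the function calls can help with the analysis. The sketch below is a variation written for illustration (the name `count_fib_calls` is made up); it returns both the Fibonacci value and how many calls the naive recursion makes.

```python
def count_fib_calls(n):
    """Return (fibonacci(n), number of calls made by the naive recursion)."""
    if n <= 1:
        return n, 1
    a, calls_a = count_fib_calls(n - 1)
    b, calls_b = count_fib_calls(n - 2)
    # One call for this level plus all the calls made below it.
    return a + b, calls_a + calls_b + 1


for n in (5, 10, 20):
    print(n, count_fib_calls(n))
```

Compare how the call count changes each time n grows: it multiplies rather than adding a fixed amount.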

# Summary:
---


1. What is Big O notation?
2. What are the differences between Big Theta, Big Omega, and Big O?
3. Which one is faster: O(n) or O(log(n))?
4. Explain time complexity.
5. Explain space complexity.

# Homework:
---



1. Watch this [video](https://www.youtube.com/watch?v=__vX2sjlpXU) on Big-O Notation (5 min). 
2. You are given two sets: Set A = {1, 2, 3, 4, 5, 6} and Set B = {2, 3, 4, 5, 6, 7, 8}.
How many elements are present in A union B? In A intersection B? (Refer to Lesson 3 for help.)
3. Create a function that calculates the area of a triangle. Formatting the code is up to you.
4. Analyze the time and space complexity of the code below. In your homework file, just print out a statement stating the complexities. For example (this only shows the format, not the answer):
```
The time complexity is O(23) and the space complexity is O(n^3).
```

In [None]:
def bubbleSort(arr):
    n = len(arr)
 
    # Traverse through all array elements
    for i in range(n):
 
        # Last i elements are already in place
        for j in range(0, n-i-1):
 
            # traverse the array from 0 to n-i-1
            # Swap if the element found is greater
            # than the next element
            if arr[j] > arr[j+1] :
                arr[j], arr[j+1] = arr[j+1], arr[j]
 
# Driver code to test above
arr = [64, 34, 25, 12, 22, 11, 90]
 
bubbleSort(arr)

# Notes on homework:
---



I will check in on Thursday through email to see how you are progressing. Respond with any questions you might have. Otherwise, a simple “all good” is appropriate if you have no questions or comments.

You will need to upload your coding homework assignments to GitHub.
1. In gitbash, change directories to the homework directory: tomas_python/homework
* TIP: use ‘cd’ to change directories
* Use ‘cd ..’ to move up to the parent directory
* Use ‘pwd’ to show the full pathname of the current working directory
* Use ‘ls’ to list the files and directories in the current directory
2. Once you’re in that directory, type in ‘git pull’
* This ensures you have all updated files
* If there is an error involved, email me immediately so we can try resolving it.
* Otherwise, type your code below and we’ll resolve issues next class
3. To create a new file, type in ‘touch hw01.py’ or the appropriate file name
* ‘touch’ creates a new, empty file
4. Open up the python file and start coding!

Note: Become familiar with these actions. This is essentially what happens in the backend when you right-click and create a new folder/file!