## Analysis of Algorithms

## Outline

- Justification for analysis
- Growth functions
- Counting machine instructions
- Landau symbols
- Big-$\Theta$ as an equivalence relation
- Little-$o$ as a weak ordering

## Background

- Given two algorithms, how can we tell which is better?

- We could implement both algorithms, run them both
    - Expensive and error prone

- Preferably, we should analyze them mathematically
    - Algorithm analysis

## Asymptotic Analysis

- In general, we will always analyze algorithms with respect to one or more variables

- We will begin with one variable:
    - The number of items $n$ currently stored in an array or other data structure
    - The number of items expected to be stored in an array or other data structure
    - The dimensions of an $n \times n$ matrix
    
- Examples with multiple variables:
    - Dealing with $n$ objects stored in $m$ memory locations
    - Multiplying a $k \times m$ and an $m \times n$ matrix


## Maximum Value

- For example, finding the largest object in an array of $n$ random integers takes on the order of $n$ operations

```cpp
int find_max( int *array, int n ) {
    int max = array[0];
    for ( int i = 1; i < n; ++i ) {
        if ( array[i] > max ) {
            max = array[i];
        }
    }
    return max;
}
```

## Linear and binary search

- There are other algorithms which are significantly faster as the problem size increases
- This plot shows maximum and average number of comparisons to find an entry in a sorted array of size $n$
    - Linear search (blue)
    - Binary search (red)
![](../img/search-comp.png)
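
The comparison counts behind such a plot can be reproduced with a small sketch. The function names below are hypothetical and the code assumes a sorted `std::vector<int>`; it is an illustration of the two searches, not the code used to generate the figure.

```cpp
#include <cstddef>
#include <vector>

// Number of element comparisons a linear search makes before it finds
// `key` or runs out of entries.
std::size_t linear_search_comparisons( const std::vector<int> &sorted, int key ) {
    std::size_t comparisons = 0;
    for ( int value : sorted ) {
        ++comparisons;
        if ( value == key ) {
            break;
        }
    }
    return comparisons;
}

// Number of loop iterations (one three-way comparison each) a binary
// search makes on the half-open range [lo, hi) of a sorted array.
std::size_t binary_search_comparisons( const std::vector<int> &sorted, int key ) {
    std::size_t comparisons = 0;
    std::size_t lo = 0;
    std::size_t hi = sorted.size();
    while ( lo < hi ) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++comparisons;
        if ( sorted[mid] == key ) {
            break;
        } else if ( sorted[mid] < key ) {
            lo = mid + 1;   // key can only be in the upper half
        } else {
            hi = mid;       // key can only be in the lower half
        }
    }
    return comparisons;
}
```

For a sorted array of 1024 entries, the linear search needs 1024 comparisons to find the last entry, while the binary search never needs more than 11 iterations.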

## Asymptotic Analysis

- Given an algorithm:
    - We need to be able to describe these values mathematically
    - We need a systematic means of using the description of the algorithm together with the properties of an associated data structure
    - We need to do this in a machine-independent way

- For this, we need Landau symbols and the associated asymptotic analysis

## Quadratic Growth

- Consider the two functions <span style="color:red"> $f(n) = n^2$ </span> and <span style="color:blue"> $g(n) = n^2 - 3n + 2$ </span>
- Around $n = 0$, they look very different
![](../img/qgrowth1.png)

- Yet on the range $n \in [0, 1000]$, they are (relatively) indistinguishable:
![](../img/qgrowth2.png)

- The **absolute difference** is large, for example,

    - $f(1000) = 1\,000\,000$
    - $g(1000) = 997\,002$
    
- but the **relative difference** is very small

$$\left| \frac{f(1000)-g(1000)}{f(1000)} \right| = 0.002998 < 0.3\%$$

- and this difference goes to zero as $n  \to \infty$
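
- A short check of that claim, using only the two functions above: for $n \geq 1$,

$$\left| \frac{f(n)-g(n)}{f(n)} \right| = \frac{3n - 2}{n^2} = \frac{3}{n} - \frac{2}{n^2} \to 0 \text{ as } n \to \infty$$

- At $n = 1000$ this gives $2998/10^6 = 0.002998$, matching the value above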

## Polynomial Growth

- To demonstrate with another example,
  - <span style="color:red"> $f(n) = n^6$ </span>
  - <span style="color:blue"> $g(n) = n^6 - 23n^5 + 193n^4 -729n^3 + 1206n^2 - 648n$ </span>

- Around $n = 0$, they look very different
![](../img/pgrowth1.png)

- Still, around $n = 1000$, the relative difference is less than 3%
![](../img/pgrowth2.png)

- Both pairs of polynomials are similar because, in each case, the two functions have the same leading term:
    - $n^2$ in the first case, $n^6$ in the second

- Suppose, however, that the coefficients of the leading terms were different
    - In this case, both functions would exhibit the same rate of growth, but one would always be proportionally larger


## Comparing Growth Rates

| constant | logarithm | linear | n-log-n | quadratic | cubic | exponential |
|:--------:|:---------:|:------:|:-------:|:---------:|:-----:|:-----------:|
| 1        | $\log n$  | $n$    |$n$ $\log n$| $n^2$     | $n^3$ | $a^n$       |

![](../img/growth.png)

## Counting Instructions

- Suppose we had two algorithms which sorted a list of size $n$ and the run time (in ms) is given by
    - Bubble sort 
        - <span style="color:red"> $b_{worst}(n) = 4.7n^2 - 0.5n + 5$ </span>
        - <span style="color:red"> $b_{best}(n) = 3.8n^2 + 0.5n + 5$ </span>
    - Selection sort
        - <span style="color:blue"> $s(n) = 4n^2 + 14n + 12$ </span>

- The smaller the value, the fewer instructions are run
    - For $n \leq 21$, <span style="color:red"> $b_{worst}(n)$ </span> $<$ <span style="color:blue"> $s(n)$ </span>
    - For $n \geq 22$, <span style="color:red"> $b_{worst}(n)$  </span> $>$ <span style="color:blue"> $s(n)$ </span>
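
The crossover point can be checked numerically. The sketch below is a minimal illustration under one assumption: the linear terms of the three run-time models are read as $0.5n$ and $14n$. The function names are hypothetical.

```cpp
// Run-time models from the slide (assumed linear terms: 0.5n and 14n).
double bubble_worst( double n ) { return 4.7*n*n - 0.5*n + 5.0; }
double bubble_best( double n )  { return 3.8*n*n + 0.5*n + 5.0; }
double selection( double n )    { return 4.0*n*n + 14.0*n + 12.0; }

// Smallest n in [1, limit] at which the worst case of bubble sort
// exceeds selection sort, or 0 if there is no such n in that range.
int first_crossover( int limit ) {
    for ( int n = 1; n <= limit; ++n ) {
        if ( bubble_worst( n ) > selection( n ) ) {
            return n;
        }
    }
    return 0;
}
```

Under that assumption, `first_crossover( 1000 )` returns 22, which agrees with the two inequalities above.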

- With small values of $n$, the algorithm described by  <span style="color:blue"> $s(n)$ </span> requires more instructions than even the worst-case for bubble sort
![](../img/perf1.png)


- Near $n = 1000$, <span style="color:red"> $b_{worst}(n)$ </span> ≈ 1.175 <span style="color:blue"> $s(n)$ </span>, and <span style="color:red"> $b_{best}(n)$ </span> ≈ 0.95 <span style="color:blue"> $s(n)$ </span>

![](../img/perf2.png)

- Is this a serious difference between these two algorithms?
    - Because we can count the number of instructions, we can also estimate how much time is required to run one of these algorithms on a computer
    
- Suppose we have a 1 GHz computer
    - Sorting a list of up to $n = 10\,000$ objects takes under 0.5 s
    - Sorting a list with one million elements takes about 1 hour
    ![](../img/perf3.png)

- What if we run selection sort on a faster computer?
    - For large values of $n$, selection sort on a faster computer will always be faster than bubble sort.
    ![](../img/perf4.png)

- Why? Justification?

## Counting Instructions (cont.)

- If  <span style="color:red"> $f(n)$ </span> $= a_kn^k + \cdots$ and  <span style="color:blue"> $g(n)$  </span> $= b_kn^k + \cdots$, for large enough $n$, it will always be true that
$$f(n) < Mg(n)$$
    - where we choose $M = a_k/b_k + 1$

- In this case, we only need a computer which is $M$ times faster (or slower)

## Weak Ordering

- Consider the following definitions:

    - We will consider two functions to be equivalent, $f \sim g$, if
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = c, \text{ where } 0 < c < \infty $$
				
    - We will state that $f < g$ if
$$\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0 $$


- For functions we are interested in, these define a **weak ordering**

- Let $f(n)$ and $g(n)$ describe the run times of two algorithms
    - If $f(n) \sim g(n)$, then it is always possible to improve the performance of one function over the other by purchasing a faster computer
    - If $f(n) < g(n)$, then you can **never** purchase a computer fast enough so that the second function always runs in less time than the first

- Note that for small values of $n$, it may be reasonable to use an algorithm that is asymptotically more expensive, but we will consider these on a one-by-one basis

## [Big-O Notation](https://en.wikipedia.org/wiki/Big_O_notation)

- A function $f(n) = O(g(n))$ if there exist constants $n_0$ and $c > 0$ such that
$$f(n) \leq c g(n)	\text{,  whenever  } n \geq n_0$$
- The function $f(n)$ has a rate of growth **no greater than** that of $g(n)$ up to a constant factor and in the **asymptotic** sense as $n$ grows toward infinity.    

![](../img/big-o.png)

## Assumptions    

- Our functions will describe the time or memory required to solve a problem of size $n$
- We conclude we are restricting ourselves to certain functions:
    - They are defined for $n \geq 0$
    - They are strictly positive for all $n$
        - In fact, $f(n) > c$ for some value $c > 0$
        - That is, any problem requires at least one instruction and byte
    - They are *increasing* (monotonic increasing)

## Example

- If $f(n)$ is a polynomial of degree $d$,
$$f(n) = a_dn^d + \cdots + a_1n +a_0$$
and $a_d > 0$, then $f(n)$ is $O(n^d)$

- Justification
    - for $n \geq 1$, we have $n^d \geq n^{d-1} \geq \cdots \geq n \geq 1$; hence
    $$a_dn^d + \cdots + a_1n +a_0 \leq (|a_d| + \cdots + |a_1| + |a_0|)n^d$$
    - therefore $f(n)$ is $O(n^d)$ by defining $c = |a_d| + \cdots + |a_1| + |a_0|$ and $n_0 = 1$
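
The bound can be spot-checked numerically. This is only a sketch with hypothetical function names: it takes $c$ as the sum of the absolute values of the coefficients (so that negative lower-order coefficients are covered) and tests the inequality for integer $n$ from $n_0 = 1$ up to a chosen limit. A finite check is not a proof.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Evaluate a_0 + a_1 n + ... + a_d n^d with Horner's rule;
// coefficients[i] holds a_i.
double evaluate( const std::vector<double> &coefficients, double n ) {
    double result = 0.0;
    for ( std::size_t i = coefficients.size(); i-- > 0; ) {
        result = result*n + coefficients[i];
    }
    return result;
}

// Check f(n) <= c n^d for every integer n in [1, limit], where c is the
// sum of |a_i|. Assumes at least one coefficient is supplied.
bool bound_holds( const std::vector<double> &coefficients, int limit ) {
    double c = 0.0;
    for ( double a : coefficients ) {
        c += std::fabs( a );
    }
    const double d = static_cast<double>( coefficients.size() - 1 );
    for ( int n = 1; n <= limit; ++n ) {
        if ( evaluate( coefficients, n ) > c*std::pow( n, d ) ) {
            return false;
        }
    }
    return true;
}
```

For $n^2 - 3n + 2$, stored as `{2.0, -3.0, 1.0}`, the constant is $c = 6$ and the check passes for every $n$ tested.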

## Landau Symbols: Big Theta

- A function $f(n) = \Theta(g(n))$ if there exist constants $n_0$, $c_1 > 0$ and $c_2 > 0$ such that
$$c_1 g(n)< f(n) < c_2 g(n)	\text{,  whenever  } n \geq n_0$$
- The function $f(n)$ has a *rate of growth* ***equal*** to that of $g(n)$.
- Limit Definition
    $$\lim_{n \to \infty}\frac{f(n)}{g(n)} = c \text{, where } 0 < c < \infty$$


- Knuth: "For all the applications I have seen so far in computer science, a stronger requirement ... is much more appropriate"

## Limits of polynomial functions

- If $f(n)$ is a polynomial of degree $d$,

$$f(n) = a_d n^d + \cdots + a_1 n +a_0$$

- Then
$$\lim_{n\to \pm \infty} f(n) = \lim_{n\to \pm \infty} a_d n^d = \pm \infty$$

$$\lim_{n\to \infty} f(n) = \lim_{n\to \infty} (a_d n^d + \cdots + a_1 n +a_0)$$
$$ = \lim_{n\to \infty} a_d n^d \left(1 + \frac{a_{d-1}n^{d-1}}{a_dn^d} + \cdots + \frac{a_1 n}{a_d n^d} + \frac{a_0}{a_d n^d} \right)$$
$$ = \lim_{n\to \infty} a_d n^d \lim_{n\to \infty}\left(1 + \frac{a_{d-1}}{a_d n} + \cdots + \frac{a_1}{a_d n^{d-1}} + \frac{a_0}{a_d n^d} \right) $$
$$ = \lim_{n\to \infty} a_d n^d \cdot 1$$
$$ = \lim_{n\to \infty} a_d n^d = \infty$$

## Example

- If $f(n) = 42n^2 + 32$ then $f(n)$ is $\Theta(n^2)$

- Justification
$$\lim_{n \to \infty}\frac{f(n)}{g(n)} = \lim_{n \to \infty}\frac{42n^2 + 32}{n^2}$$
$$ = \lim_{n \to \infty} \left(42 + \frac{32}{n^2} \right) = \lim_{n \to \infty} (42 + 0) = 42$$

%%html
<style>
table.custom td.custom {
    border: 1px solid black;
}
th.custom {
    border: 1px solid black;
    text-align: center;
}
td.custom {
    border: 1px solid black;
    text-align: center;    
}
</style>

## Landau Notation

We will at times use five possible descriptions

<table class="custom">
    <tr>
        <th class="custom">Symbol</th>
        <th class="custom">Limit</th>
        <th class="custom">Description</th>
        <th class="custom">Analogous Relational Operator</th>
    </tr>
    <tr>
        <td class="custom" style="width:25%">$f(n) = \omega(g(n))$</td>
        <td class="custom" style="width:25%">$$\lim_{n \to \infty}\frac{f(n)}{g(n)}=\infty$$</td>
        <td class="custom" style="width:35%; text-align:center">$f$ grows significantly faster than $g$</td>
        <td class="custom" style="width:15%; text-align:center">$>$</td>
    </tr>
    <tr>
        <td class="custom" style="width:25%">$f(n) = \Omega(g(n))$</td>
        <td class="custom" style="width:25%">$$ 0 < \lim_{n \to \infty}\frac{f(n)}{g(n)} $$</td>
        <td class="custom" style="width:35%; text-align:center">$f$ grows at the same rate as or faster than $g$</td>
        <td class="custom" style="width:15%; text-align:center">$\geq$</td>
    </tr>
    <tr>
        <td class="custom" style="width:25%">$f(n) = \Theta(g(n))$</td>
        <td class="custom" style="width:25%"> $$0 < \lim_{n \to \infty}\frac{f(n)}{g(n)} < \infty$$</td>
        <td class="custom" style="width:35%; text-align:center">$f$ grows at the same rate as $g$</td>
        <td class="custom" style="width:15%; text-align:center">$=$</td>
    </tr>
    <tr>
        <td class="custom" style="width:25%">$f(n) = O(g(n))$</td>
        <td class="custom" style="width:25%">$$\lim_{n \to \infty}\frac{f(n)}{g(n)}  < \infty $$</td>
        <td class="custom" style="width:35%; text-align:center">$f$ grows at the same rate as or slower than $g$</td>
        <td class="custom" style="width:15%; text-align:center">$\leq$</td>
    </tr>
    <tr>
        <td class="custom" style="width:25%">$f(n) = o(g(n))$</td>
        <td class="custom" style="width:25%">$$\lim_{n \to \infty}\frac{f(n)}{g(n)}=0$$</td>
        <td class="custom" style="width:35%; text-align:center">$f$ grows significantly slower than $g$</td>
        <td class="custom" style="width:15%; text-align:center">$<$</td>
    </tr>
</table>

Graphically, we can summarize these as follows:
![](../img/landau.png)

Some other observations we can make are:
$$f(n) = \Theta(g(n)) \Leftrightarrow g(n) = \Theta(f(n))$$
$$f(n) = O(g(n)) \Leftrightarrow g(n) = \Omega(f(n))$$
$$f(n) = o(g(n))  \Leftrightarrow g(n) = \omega(f(n))$$


## Big-$\Theta$ as an Equivalence Relation

- If we look at the first relationship, we notice that $f(n) = \Theta(g(n))$ seems to describe an equivalence relation:
1. $f(n) = \Theta(g(n))$ if and only if $g(n) = \Theta(f(n))$
2. $f(n) = \Theta(f(n))$
3. If $f(n) = \Theta(g(n))$ and $g(n) = \Theta(h(n))$, it follows that $f(n) = \Theta(h(n))$

- Consequently, we can group all functions into equivalence classes, where all functions within one class are big-theta $\Theta$ of each other

## Example

- For example, all of these functions are big-$\Theta$ of each other
$$100000 n^2 - 4n + 19$$
$$n^2 + 1000000$$
$$323 n^2 - 4 n \ln(n) + 43 n + 10$$
$$42n^2 + 32$$
$$n^2 + 61 n \log^2(n) + 7n + 14 \log^3(n) + \log(n)$$


- For example
$$42n^2 + 32 = \Theta( 323 n^2 - 4 n \ln(n) + 43 n + 10)$$

## Logarithms and Exponentials

- Recall that all logarithms are scalar multiples of each other
    - Therefore $\log_b(n) = \Theta(\ln(n))$ for any base $b$
- In contrast, there is no single equivalence class for exponential functions:
    - If $1 < a < b$,
    $$ \lim_{n\to\infty} \frac{a^n}{b^n} = \lim_{n\to\infty} \left(\frac{a}{b}\right)^n = 0$$ 

    - Therefore $a^n = o(b^n)$ 

- However, we will see that it is almost universally undesirable to have an exponentially growing function!

- We can show that, for any $p > 0$
$$\ln( n ) = o( n^p )$$


- Proof: Using [l’Hôpital's rule](https://en.wikipedia.org/wiki/L%27H%C3%B4pital%27s_rule), we have
$$\lim_{n\to\infty} \frac{\ln(n)}{n^p} = \lim_{n\to\infty} \frac{1/n}{p n^{p-1}} = \lim_{n\to\infty} \frac{1}{p n^p} = \frac{1}{p}\lim_{n\to\infty} n^{-p} = 0$$
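
The limit is asymptotic: for small $p$ the ratio $\ln(n)/n^p$ first grows and only starts to shrink once $\ln(n) > 1/p$. The sketch below (the function name is hypothetical) lets one evaluate the ratio at chosen points to see this.

```cpp
#include <cmath>

// The ratio ln(n) / n^p for n > 0 and p > 0.
double log_over_power( double n, double p ) {
    return std::log( n )/std::pow( n, p );
}
```

With $p = 0.1$ the ratio is still increasing between $n = 10^2$ and $n = 10^4$, peaks near $n = e^{10} \approx 22026$, and is far smaller by $n = 10^{30}$.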

- Conversely, $1 = o( \ln( n ))$

- Other observations:
    - If $p$ and $q$ are real positive numbers where $p < q$, it follows that
    $$n^p = o(n^q)$$

- For example, the standard algorithm for multiplying two $n \times n$ matrices is $\Theta(n^3)$, but a refined algorithm (Strassen's) is $\Theta(n^{\log_2(7)})$ where $\log_2(7) \approx 2.81$

- Also, $n^p = o(\ln(n)n^p)$, but $\ln(n)n^p = o(n^q)$
    - $n^p$ has a slower rate of growth than $\ln(n)n^p$, but
    - $\ln(n)n^p$ has a slower rate of growth than $n^q$ for $p < q$

## Little-$o$ as a Weak Ordering

- If we restrict ourselves to functions $f(n)$ which are $\Theta(n^p)$ or $\Theta(\ln(n)n^p)$, we note:
    - It is never true that $f(n) = o(f(n))$
    - If $f(n) \neq \Theta(g(n))$, it follows that either 
      $$f(n) = o(g(n)) \text{ or } g(n) = o(f(n))$$
    - If $f(n) = o(g(n))$ and $g(n) = o(h(n))$, it follows that $f(n) = o(h(n))$

- This defines a *weak ordering*!




Graphically, we can show this relationship by marking these against the real line
![](../img/weak-o.png)

## Algorithm Analysis

- The goal of algorithm analysis is to take a block of code and determine the **asymptotic run time** or **asymptotic memory requirements** based on various parameters
- E.g. given an array of size $n$:
    - Selection sort requires $\Theta(n^2)$ time 
    - Merge sort, quick sort, and heap sort all require $\Theta(n \ln(n))$ time (for quick sort, in the average case)
    - However:
        - Merge sort requires $\Theta(n)$ additional memory 
        - Quick sort requires $\Theta(\ln(n))$ additional memory
        - Heap sort requires  $\Theta(1)$ memory 

- We will use Landau notation to describe the complexity of algorithms
    - E.g., adding a list of $n$ doubles will be said to be a $\Theta(n)$ algorithm

- An algorithm is said to have **polynomial time complexity** if its run-time may be described by $O(n^d)$ for some fixed $d \geq 0$
    - We will consider such algorithms to be *efficient*

- Problems that have no known polynomial-time algorithms are said to be **intractable**
    - Traveling salesman problem: find the shortest path that visits $n$ cities
    - Best known run time for an exact solution: $\Theta(n^2 2^n)$


## Motivation

- The asymptotic behavior of algorithms indicates the ability to **scale**
    - Suppose we are sorting an array of size $n$
    - Selection sort has a run time of $\Theta(n^2)$
        - $2n$ entries requires $(2n)^2 = 4n^2$
            - Four times as long to sort
        - $10n$ entries requires $(10n)^2 = 100n^2$
            - One hundred times as long to sort
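
The factor of four can be observed by counting comparisons instead of timing. The following is a minimal sketch of selection sort instrumented with a counter. The function name is hypothetical, and comparisons are used here as a stand-in for run time.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort that returns the number of element comparisons performed.
std::size_t selection_sort_comparisons( std::vector<int> &data ) {
    std::size_t comparisons = 0;
    for ( std::size_t i = 0; i + 1 < data.size(); ++i ) {
        std::size_t smallest = i;
        for ( std::size_t j = i + 1; j < data.size(); ++j ) {
            ++comparisons;
            if ( data[j] < data[smallest] ) {
                smallest = j;
            }
        }
        std::swap( data[i], data[smallest] );
    }
    return comparisons;
}
```

The count is $n(n-1)/2$ regardless of the input order: 4950 for $n = 100$ and 19900 for $n = 200$, a ratio of roughly 4.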

- The other sorting algorithms have $\Theta(n \ln(n))$ run times
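    - $2n$ entries require $(2n) \ln(2n) = (2n) (\ln(n) + \ln(2)) = 2(n \ln(n)) + 2\ln(2)\,n$
    - $10n$ entries require $(10n) \ln(10n) = (10n) (\ln(n) + \ln(10)) = 10(n \ln(n)) + 10\ln(10)\,n$

- In each case, scaling the input by a constant factor scales the run time by that factor plus only a $\Theta(n)$ additional term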

- However
    - Merge sort will require twice and 10 times as much memory
    - Quick sort will require one or four additional memory locations
    - Heap sort will not require any additional memory       

- If we are required to store both objects and relations, both memory and time will **increase**
    - Our goal will be to *minimize* this increase

- To properly investigate the determination of run times asymptotically, we'll discuss
    - Machine instructions
    - Operations
    - Control statements
    - Conditional statements and loops
    - Functions
    - Recursive functions

## Machine Instructions

- Any processor is capable of performing only a limited number of *operations*
- These operations are called **instructions**
- Any instruction runs in a **fixed** amount of time (an integral number of CPU cycles)
- Assembly language has an almost one-to-one translation to machine instructions
    - Assembly language is a low-level programming language
- The C programming language (C++ without objects and other abstractions) can be referred to as a mid-level programming language
    - There is abstraction, but the language is closely tied to the standard capabilities of the underlying processor
    - There is a closer relationship between operators and machine instructions

## Operators

- Because each machine instruction can be executed in a fixed number of cycles, we may assume each operation requires a *fixed number of cycles*
- The time required for any operator is $\Theta(1)$ including:
    - Retrieving/storing variables from memory
    - Variable assignment, `=`
    - Integer operations, `+ - * / % ++ --`
    - Logical operations, `&& || !`
    - Bitwise operations, `& | ^ ~`
    - Relational operations, `== != < <= >= >`
    - Memory allocation and deallocation, `new delete`
- Memory allocation and deallocation are the slowest by a significant factor
    - Note that after memory is allocated, the constructor is run
        - The constructor may not run in $\Theta(1)$ time

## Blocks of Operations

- If each operation runs in $\Theta(1)$ time, then any fixed number of operations also run in $\Theta(1)$ time, for example:
```c
// Swap variables a and b
int tmp = a;
a = b;
b = tmp;
```

- Seldom will you find large blocks of operations without any additional control statements:
    - Remove a node from a doubly linked list
```cpp
p->prev->next = p->next;
p->next->prev = p->prev;
delete p;
```
    - Run time: $\Theta(1)$
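
The three-line snippet above assumes `p` has both a predecessor and a successor. A slightly fuller sketch, with a hypothetical `Node` type and a `head` pointer that are not part of the original, handles the ends of the list and still performs only a fixed number of operations:

```cpp
// Hypothetical node of a doubly linked list of ints.
struct Node {
    int value;
    Node *prev;
    Node *next;
};

// Unlink p from the list and free it. Every path executes a bounded
// number of pointer updates, so the run time is Θ(1).
void remove_node( Node *&head, Node *p ) {
    if ( p == nullptr ) {
        return;
    }
    if ( p->prev != nullptr ) {
        p->prev->next = p->next;
    } else {
        head = p->next;          // p was the first node
    }
    if ( p->next != nullptr ) {
        p->next->prev = p->prev;
    }
    delete p;
}
```

The extra conditionals add a constant number of comparisons, so the $\Theta(1)$ conclusion is unchanged.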

## Blocks in Sequence

- Suppose you have now analyzed a number of blocks of code run in sequence

```cpp
// Member function of a container class template; `array` and
// `array_capacity` are assumed to be its member variables
template <typename T>
void update_capacity( int delta ) {
    T *array_old = array;                      //-------
    int capacity_old = array_capacity;         //  Θ(1)
    array_capacity += delta;                   //
    array = new T[array_capacity];             //-------
    for ( int i = 0; i < capacity_old; ++i ) { //-------
        array[i] = array_old[i];               //  Θ(n)
    }                                          //-------
    delete[] array_old;                        // Θ(1) or Ω(n)
}
```
```
- To calculate the total run time, add the entries: $\Theta(1 + n + 1) = \Theta(n)$
    - When considering a sum, *take the dominant term*
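
For reference, here is a self-contained sketch of a container in which such a function could live. The class name `Dynamic_array`, its constructor, and its accessors are hypothetical, and `delta` is taken as an unsigned amount so the capacity can only grow. It is one possible setting for the function, not the original class.

```cpp
#include <cstddef>

// Hypothetical minimal container that owns a heap-allocated array of T.
template <typename T>
class Dynamic_array {
    public:
        explicit Dynamic_array( std::size_t capacity ):
        array( new T[capacity]() ),
        array_capacity( capacity ) {
        }

        ~Dynamic_array() {
            delete[] array;
        }

        // Copying is disabled to keep the sketch short.
        Dynamic_array( const Dynamic_array & ) = delete;
        Dynamic_array &operator=( const Dynamic_array & ) = delete;

        // Grow the capacity by delta: Θ(1) setup, a Θ(n) copy loop,
        // then the deallocation of the old array.
        void update_capacity( std::size_t delta ) {
            T *array_old = array;
            std::size_t capacity_old = array_capacity;
            array_capacity += delta;
            array = new T[array_capacity]();
            for ( std::size_t i = 0; i < capacity_old; ++i ) {
                array[i] = array_old[i];
            }
            delete[] array_old;
        }

        T &operator[]( std::size_t i ) { return array[i]; }
        std::size_t capacity() const { return array_capacity; }

    private:
        T *array;
        std::size_t array_capacity;
};
```

After `update_capacity( 4 )` on an array of capacity 4, the first four entries are preserved and the capacity is 8.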


## Control Statements

- These are statements which potentially alter the execution of instructions
    - Conditional statements `if, switch`
    - Condition-controlled loops `for, while, do-while`
    - Collection-controlled loops `for ( auto i : array ) {...}`
    
- Given any collection of nested control statements, it is always necessary to **work inside out**
    - Determine the run times of the inner-most statements and work your way out

## Conditional Statement

- Given
```cpp
if ( condition ) {
    // true body
} else {
    // false body
}
```
- The run time of a conditional statement is:
    - the run time of the *condition* (the test), **plus**
    - the run time of the *body which is run*

- In most cases, the run time of the condition is $\Theta(1)$

## Example

Sometimes, it's less obvious
```cpp
int find_max( int *array, int n ) {
    int max = array[0];
    for ( int i = 1; i < n; ++i ) {
        if ( array[i] > max ) {     //---------            
            max = array[i];         //  Θ(???)
        }                           //---------
    }
    return max;
}
```

- If we had information about the distribution of the entries of the array, we may be able to determine it
    - if the list is sorted (ascending) it will always be run
    - if the list is sorted (descending) the assignment is never run, since `max` already holds the first (largest) entry
    - if the list is uniformly randomly distributed, then???
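
One way to explore these cases is to instrument the loop. The sketch below uses a hypothetical function name and returns the number of times the assignment inside the conditional executes. It assumes `n >= 1`, as `find_max` does.

```cpp
// Same loop as find_max, but returns how many times `max = array[i]` runs.
int count_max_updates( const int *array, int n ) {
    int updates = 0;
    int max = array[0];
    for ( int i = 1; i < n; ++i ) {
        if ( array[i] > max ) {
            max = array[i];
            ++updates;
        }
    }
    return updates;
}
```

For a strictly ascending array of $n$ entries it returns $n - 1$; for a strictly descending array it returns 0. For a random permutation of distinct values, a standard result (stated here without proof) is that the expected count is $H_n - 1 \approx \ln(n)$, where $H_n$ is the $n$th harmonic number. In every case the loop as a whole remains $\Theta(n)$ because the comparison runs on each iteration.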

## Condition-controlled Loops

- The C++ for loop is a condition-controlled statement:
```cpp    
for ( int i = 0; i < n; ++i ) {
    // ...
}
```    
is identical to
```cpp
int i = 0;        // initialization
while ( i < n ) { // condition
    // ...
    ++i;            // increment
}
```
- Assuming there are no break or return statements in the loop, the run time is $\Omega(n)$

- The initialization, condition, and increment usually are single statements running in $\Theta(1)$
```cpp    
for ( int i = 0; i < n; ++i ) {
    // code that runs in Θ(f(m))
}
```

- If the body **does not depend** on the variable (in this example, `i`), then the run time of the loop is $\Theta(n f(m))$
- If the body is $O(f(m))$, then the run time of the loop is $O(n f(m))$

- For example,
```cpp
int sum = 0; 
for ( int i = 0; i < n; ++i ) {
    sum += 1; // Θ(1)
}
```
- This code has run time $\Theta(n \cdot 1) = \Theta(n)$


- For example,
```cpp
int sum = 0; 
for ( int i = 0; i < n; ++i ) { 
    for ( int j = 0; j < n; ++j ) {
        sum += 1; // Θ(1)
    }
}
```
- The previous example showed that the inner loop is $\Theta(n)$, thus the outer loop is $\Theta(n \cdot n) = \Theta(n^2)$


## Repetition Statements

- Suppose that, with each iteration of the loop, we perform a linear search of an array of size $m$:
```cpp
int sum = 0; 
for ( int i = 0; i < n; ++i ) {
    // search through an array of size m, O(m)
}
```
- The inner loop is $O(m)$ and thus the outer loop is $O(n \cdot m)$

- If the body **does depend** on the variable (in this example, `i`), then the run time of 
```cpp
for ( int i = 0; i < n; ++i ) {
    // code which is Θ(f(i,n))
}
```
is $\Theta\left(1 + \sum_{i=0}^{n-1}(1+f(i,n))\right)$
- If the body is $O(f(i, n))$, the result is
$$O\left(1 + \sum_{i=0}^{n-1}(1+f(i,n))\right)$$

- For example,
```cpp
int sum = 0; 
for ( int i = 0; i < n; ++i ) { 
    for ( int j = 0; j < i; ++j ) {
        sum += i + j;
    }
}
```
- The inner loop is $\Theta(1 + i(1 + 1) ) = \Theta(i)$ hence the outer is
$$\Theta\left(1 + \sum_{i=0}^{n-1}(1+i)\right) = \Theta\left(1 + n + \sum_{i=0}^{n-1}i\right) =$$
$$\Theta\left(1 + n + \frac{n(n-1)}{2}\right) = \Theta(n^2)$$
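
The sum can be confirmed by counting how often the innermost statement runs. This sketch uses a hypothetical function name and replaces the body `sum += i + j` with a counter:

```cpp
// Number of times the body of the inner loop executes for a given n.
long long count_inner_iterations( int n ) {
    long long count = 0;
    for ( int i = 0; i < n; ++i ) {
        for ( int j = 0; j < i; ++j ) {
            ++count;
        }
    }
    return count;
}
```

The result equals $n(n-1)/2$, for example 4950 when $n = 100$, which is the quadratic term in the derivation above.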

## Serial Statements

- Suppose we run one block of code followed by another block of code
    - Such code is said to be run **serially**

- If the first block of code is $O(f(n))$ and the second is $O(g(n))$, then the run time of two blocks of code is
$$O( f(n) + g(n) )$$
which usually (for algorithms not including function calls) simplifies to one or the other

- What is the proper means of describing the run time of these two algorithms?
    - if the leading term is big-$\Theta$, then the result must be big-$\Theta$, otherwise
    - if the leading term is big-$O$, we can say the result is big-$O$  

- For example,
$$O(n) + O(n^2) + O(n^4) = O(n + n^2 + n^4) = O(n^4)$$
$$O(n) + \Theta(n^2) = \Theta(n^2)$$
$$O(n^2) + \Theta(n) = O(n^2)$$
$$O(n^2) + \Theta(n^2) = \Theta(n^2)$$

## Functions

- A function (or subroutine) is a block of code packaged so that a repeated operation can be invoked wherever it is needed
- Because a function can be called from anywhere, we must:
    - prepare the appropriate environment
    - deal with arguments (parameters)
    - jump to the subroutine
    - execute the subroutine
    - deal with the return value
    - clean up

- Modern processors perform most of these steps in a single instruction
    - Thus, we will assume that the overhead required to make a function call and to return is $\Theta(1)$

- Because any function requires the overhead of a function call and return, we will always assume that
$$T_f = \Omega(1)$$
    - That is, it is impossible for any function call to have a zero run time
    
- Thus, given a function $f(n)$ (the run time of which depends on $n$), we will describe the run time of $f(n)$ by some function $T_f(n)$
    - We may abbreviate this to $T(n)$

- Because the run time of any function is $\Omega(1)$, we will include the time required to both call and return from the function in the run time


## Recursive Functions

- A function is relatively simple if it simply performs operations and calls other functions
- Most interesting functions designed to solve problems usually end up calling themselves
    - Such a function is said to be **recursive**
    
```cpp
int factorial( int n ) {
    if ( n <= 1 ) {
        return 1;                       // Θ(1)
    } else {
        return n * factorial( n - 1 );  // T(n-1) + Θ(1)
    }
}
```

- Thus, we may analyze the run time of this function as follows:
$$T(n) = \begin{cases}
\Theta(1) & n \leq 1 \\
T(n-1) + \Theta(1) & n > 1 \\
\end{cases}$$    
- We don't have to worry about the time of the conditional ($\Theta(1)$) nor is there a probability involved with the conditional statement

- The analysis of the run time of this function yields a recurrence relation
$$T(n) = T(n-1) + \Theta(1), \; \; \; T(1) = \Theta(1)$$

- Replace each Landau symbol with a representative function
$$T(n) = T(n-1) + 1, \; \; \; T(1) = 1$$

- Try to solve equations by examining the first few steps:
$$T(n) = T(n - 1) + 1$$
$$ = T(n - 2) + 1 + 1 = T(n - 2) + 2$$
$$ = T(n - 3) + 3$$

- From this, we see a pattern:
$$T(n) = T(n - k) + k$$


- If $k = n - 1$ then
$$T(n) = T(n - (n - 1)) + n - 1$$
$$= T(1) + n - 1 = 1 + n - 1 = n$$
- Thus, $T(n) = \Theta(n)$
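
The solution $T(n) = n$ of the simplified recurrence can be checked by counting calls. The variant below is a sketch with a hypothetical name and an added `calls` parameter. It returns an `unsigned long long` so that results up to $20!$ fit.

```cpp
// Factorial that also counts how many times the function is entered.
unsigned long long factorial_counted( int n, int &calls ) {
    ++calls;
    if ( n <= 1 ) {
        return 1;
    }
    return n * factorial_counted( n - 1, calls );
}
```

Starting with `calls` set to 0, evaluating `factorial_counted( 10, calls )` leaves `calls` equal to 10: one call per value of $n$ down to 1, matching $T(n) = n$.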