# Chapter 4 Algorithm Analysis
* **The execution time depends on several factors:**
1. **The amount of data that must be processed directly affects the execution time.**
1. **Execution times can vary depending on the type of hardware and the time of day a computer is used.**
1. **The choice of programming language and compiler used to implement an algorithm can also influence the execution time.**

## 4.1 Complexity Analysis

### 4.1.1 Big-O Notation
* **Order of magnitude:**
    * **This classification approximates the actual number of steps required for execution or the actual storage requirement in terms of variable-sized data sets.**
    * **The term $\pmb{big-O}$, which is derived from the expression "on the order of," is used to specify an algorithm's classification.**

#### Defining Big-O
* **Suppose there exists a function $f(n)$ defined for the integers $n \geq 0$ such that, for some constant $c$ and some constant $m$,
$$ T(n) \leq c f(n) $$
for all sufficiently large values of $n \geq m$, where $T(n)$ is the number of operations the algorithm requires for an input of size $n$. Then, such an algorithm is said to have a *time-complexity* of, or executes on the order of, $f(n)$ relative to the number of operations it requires.**
    * **The function $f(n)$ indicates the rate of growth at which the run time of an algorithm increases as the input size, $n$, increases.**
    * **To specify the time complexity of an algorithm, which runs on the order of $f(n)$, we use the notation:**
    $$ O(f (n))$$
    * **The objective is to find a function $f(\cdot)$ that provides the tightest (lowest) *upper bound* or limit for the run time of an algorithm. The big-O notation is intended to indicate an algorithm's efficiency for large values of $n$.**
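
As a small illustration of the definition, the sketch below checks the inequality numerically for a hypothetical step count $T(n) = n^2 + n$ with $f(n) = n^2$. The function `T`, the constants `c = 2` and `m = 1`, and the tested range are assumptions chosen for this example; they do not come from the text.

```python
# Hypothetical step count of some algorithm (an assumption for illustration)
def T(n):
    return n * n + n

# Candidate growth function f(n) = n^2
def f(n):
    return n * n

c = 2   # constant of proportionality chosen for this example
m = 1   # threshold from which the bound is checked

# T(n) <= c * f(n) should hold for every n >= m in the tested range
bound_holds = all(T(n) <= c * f(n) for n in range(m, 10001))
print(bound_holds)
```

A finite check like this only supports the claim. The bound itself follows from $n \leq n^2$ for all $n \geq 1$, so this $T(n)$ is $O(n^2)$.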

#### Constant of Proportionality
* **The constant of proportionality is only crucial when two algorithms have the same $f(n)$. It usually makes no difference when comparing algorithms whose growth rates are of different magnitudes.**
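
To see why the constant matters little across different magnitudes, the sketch below compares two hypothetical step counts, $100n$ and $n^2$. Both cost functions are assumptions made up for this illustration.

```python
# Hypothetical step counts (assumptions for illustration only)
def linear_cost(n):
    return 100 * n      # large constant, linear growth

def quadratic_cost(n):
    return n * n        # small constant, quadratic growth

# Find the first n at which the quadratic algorithm becomes more expensive
crossover = next(n for n in range(1, 10**6)
                 if quadratic_cost(n) > linear_cost(n))
print(crossover)
```

Despite a constant that is 100 times larger, the linear algorithm is cheaper for every $n > 100$, and the gap keeps widening as $n$ grows.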

#### Constructing $T(n)$
* **We assume that each basic operation or statement, at the abstract level, takes the same amount of time and, thus, each is assumed to cost $\pmb{constant time}$.**
* **The total number of operations required by an algorithm can be computed as a sum of the times required to perform each step:
$$ T(n) = f_1(n) + f_2(n)+\dots+ f_k(n) $$**

#### Choosing the Function
* **The function $f(n)$ used to categorize a particular algorithm is chosen to be the *dominant term* within $T(n)$. That is, the term that is so large for big values of $n$ that we can ignore the other terms when computing a big-O value.**
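
The effect of the dominant term can be observed numerically. The sketch below uses a hypothetical $T(n) = n^2 + 5n + 10$ (an assumption for illustration) and prints the ratio $T(n) / n^2$ for growing $n$.

```python
# Hypothetical total step count with a quadratic dominant term
def T(n):
    return n * n + 5 * n + 10

# Ratio of the full expression to its dominant term alone
ratios = [T(n) / (n * n) for n in (10, 100, 1000, 10000)]
for ratio in ratios:
    print(ratio)
```

The ratio approaches 1 as $n$ increases, which is why the lower-order terms $5n$ and $10$ can be ignored when classifying this $T(n)$ as $O(n^2)$.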

#### Classes of Algorithms

![Screen%20Shot%202020-11-19%20at%201.31.52%20PM.png](attachment:Screen%20Shot%202020-11-19%20at%201.31.52%20PM.png)

![Screen%20Shot%202020-11-19%20at%201.32.36%20PM.png](attachment:Screen%20Shot%202020-11-19%20at%201.32.36%20PM.png)

### 4.1.2 Evaluating Python Code
* **Basic Operations: include the statements and function calls whose execution time does not depend on the specific values of the data that is used or manipulated by the given instruction.**
* **Efficiency of String Operations: Most of the string operations have a time-complexity that is proportional to the length of the string. For most problems that do not involve string processing, string operations seldom have an impact on the run time of an algorithm. Thus, we assume the string operations, including the use of the** print( ) **function, only require constant time, unless explicitly stated otherwise.**

#### Linear Time Examples

In [1]:
def ex1( n ):
    total = 0
    for i in range(n):
        total += n
    return total

def ex2(n):
    count = 0
    for i in range(n):
        count += 1
    for j in range(n):
        count += 1
    return count

#### Quadratic Time Examples

In [2]:
def ex3(n):
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count

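# Not every nested loop is quadratic: the inner loop here always runs
# 25 times, so ex4 performs 25n increments and is O(n)
def ex4(n):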
    count = 0
    for i in range(n):
        for j in range(25):
            count += 1
    return count

* **Special case of nested loops**

In [3]:
def ex5(n):
    count = 0
    for i in range(n):
        for j in range(i + 1):
            count += 1
    return count

* **Since the inner loop varies from 1 to $n$ iterations by increments of 1, the total number of times the increment statement will be executed is equal to the sum of the first $n$ positive integers, which gives a quadratic run time of $O(n^2)$:**
$$ T(n)=\frac{n(n+1)}{2}=\frac{n^2+n}{2}$$
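
The closed form can be checked against the loop itself. The sketch below repeats `ex5` so that it is self-contained and compares its result with $n(n+1)/2$; the helper name `closed_form` and the tested range are assumptions for this example.

```python
def ex5(n):
    count = 0
    for i in range(n):
        for j in range(i + 1):
            count += 1
    return count

# Sum of the first n positive integers
def closed_form(n):
    return n * (n + 1) // 2

# Compare the counted iterations with the formula for a range of sizes
matches = all(ex5(n) == closed_form(n) for n in range(0, 200))
print(matches)
```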

#### Logarithmic Time Examples

In [4]:
#instead of incrementing by one, it cuts the loop variable in half
#each time through the loop
def ex6(n):
    count = 0
    i = n
    while i >= 1:
        count += 1
        i = i // 2
    return count

* **When the size of the input is reduced by half in each subsequent iteration, the number of iterations required to reach a size of one will be equal to**
$$ \lfloor \log_2 n \rfloor + 1$$
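
The iteration count can be compared with the formula directly. The sketch below repeats `ex6` so that it is self-contained; the helper name `predicted` and the tested range are assumptions for this example.

```python
import math

def ex6(n):
    count = 0
    i = n
    while i >= 1:
        count += 1
        i = i // 2
    return count

# Number of iterations predicted by floor(log2 n) + 1
def predicted(n):
    return math.floor(math.log2(n)) + 1

matches = all(ex6(n) == predicted(n) for n in range(1, 10001))
print(matches)
```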

$ O(n \log n) $ **run time**

In [5]:
def ex7(n):
    count = 0
    for i in range(n):
        count += ex6(n)
    return count 
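
Because the loop in `ex7` calls `ex6(n)` exactly $n$ times, the value it returns should equal $n(\lfloor \log_2 n \rfloor + 1)$. The sketch below checks this; it repeats both functions to stay self-contained and relies on the fact that `n.bit_length()` equals $\lfloor \log_2 n \rfloor + 1$ for positive integers. The tested range is an assumption for this example.

```python
def ex6(n):
    count = 0
    i = n
    while i >= 1:
        count += 1
        i = i // 2
    return count

def ex7(n):
    count = 0
    for i in range(n):
        count += ex6(n)
    return count

# n.bit_length() == floor(log2 n) + 1 for every integer n >= 1
matches = all(ex7(n) == n * n.bit_length() for n in range(1, 300))
print(matches)
```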

#### Different Cases
* **Some algorithms can have run times that are different orders of magnitude for different sets of input of the same size. These algorithms can be evaluated for their best, worst, and average cases.**
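
A sequential search is a common example of an algorithm whose cost depends on the data and not only on its size. The sketch below counts comparisons for a best-case and a worst-case input of the same length; the function name `linear_search_steps` and the sample data are assumptions for this illustration.

```python
def linear_search_steps(values, target):
    """Return the number of comparisons a sequential search performs."""
    steps = 0
    for item in values:
        steps += 1
        if item == target:
            break
    return steps

data = list(range(1000))
best = linear_search_steps(data, 0)     # target is the first element
worst = linear_search_steps(data, -1)   # target is not in the list
print(best, worst)
```

For the same $n = 1000$, the best case takes a single comparison, $O(1)$, while the worst case examines every element, $O(n)$.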

## 4.2 Evaluating a Python List

![Screen%20Shot%202020-11-19%20at%202.04.01%20PM.png](attachment:Screen%20Shot%202020-11-19%20at%202.04.01%20PM.png)

### List Traversal
* **Since all operations within the loop only require constant time, including the element access operation, a complete list traversal requires $O(n)$ time.**

### List Allocation
* **Creating an empty list can be accomplished in constant time.**

In [6]:
temp = list()

* **Creates a list of $n$ elements, with each element initialized to 0. The actual allocation of the $n$ elements can be done in constant time, but the initialization of the individual elements requires a list traversal. Thus, the allocation of a vector with $n$ elements requires $O(n)$ time.**

In [8]:
valueList = [0] * 10

#### Appending to a List
* **The** append( ) **operation adds a new item to the end of the sequence**.
* **The operation has a best case time of $O(1)$ since it only requires a single element access.**
* **In the worst case, there are no available slots and the array has to be expanded, which requires $O(n)$ time.**
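
The occasional expansion can be observed indirectly. The sketch below assumes CPython, where `sys.getsizeof` reports the memory currently reserved for a list object, and treats every change in that size during a run of appends as a reallocation of the underlying array. The exact growth pattern is an implementation detail and may differ between Python versions.

```python
import sys

values = []
resizes = 0
last_size = sys.getsizeof(values)

for i in range(1000):
    values.append(i)
    size = sys.getsizeof(values)
    if size != last_size:   # the reserved memory changed: array was expanded
        resizes += 1
        last_size = size

print(resizes)
```

Only a small fraction of the 1000 appends trigger an expansion; the rest find a free slot and complete in constant time.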

### Extending to a list
* **The** extend( ) **operation adds the entire contents of a source list to the end of the destination list.**
    * **When the destination list has sufficient capacity to store the new items, the entire contents of the source list can be copied in $O(n)$ time.**
    * **If there is not sufficient capacity, the underlying array of the destination list has to be expanded to make room for the new items. The expansion requires $O(n)$ time since there are currently $n$ items in the destination list.**
    
### Inserting and Removing Items
* **Both operations require linear time, $O(n)$, in the worst case, since the items that follow the affected position must be shifted.**
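
The linear cost comes from shifting elements. The sketch below is a hand-written insert, not the built-in `list.insert()`, that counts how many items are moved; the function name `insert_with_shift_count` is an assumption for this illustration.

```python
def insert_with_shift_count(values, index, item):
    """Insert item at index by shifting later items right; return the shift count."""
    values.append(None)              # make room for one more element
    shifts = 0
    for i in range(len(values) - 1, index, -1):
        values[i] = values[i - 1]    # move one item a slot to the right
        shifts += 1
    values[index] = item
    return shifts

data = [1, 2, 3, 4, 5]
front_shifts = insert_with_shift_count(data, 0, 0)          # every item moves
end_shifts = insert_with_shift_count(data, len(data), 6)    # nothing moves
print(front_shifts, end_shifts, data)
```

Inserting at the front shifts all $n$ items, which is the worst case; inserting at the end shifts none.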

## 4.3 Amortized Cost
* ***aggregate method*: we can tally or compute the total running time by considering the time required for each individual append operation.**
* **Amortized analysis is the process of computing the time-complexity for a sequence of operations by computing the average cost over the entire sequence.**
    * **For this technique to be applied, the cost per operation must be known and it must vary such that many of the operations in the sequence contribute little cost and only a few operations contribute a high cost.**
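
The aggregate method can be sketched with a simple cost model. The code below assumes an array that starts with capacity 1 and doubles whenever it is full, charging one unit to store an item and one unit per item copied during an expansion. The doubling policy and the function name `total_append_cost` are assumptions for this illustration; Python's actual list growth policy differs in its details.

```python
def total_append_cost(n):
    """Total cost of n appends to an array that doubles when full."""
    capacity = 1
    size = 0
    cost = 0
    for _ in range(n):
        if size == capacity:
            cost += size      # copy existing items into the larger array
            capacity *= 2
        cost += 1             # store the new item
        size += 1
    return cost

n = 16
total = total_append_cost(n)
print(total, total / n)
```

Under this model the total cost of $n$ appends stays below $3n$, so the average (amortized) cost per append is $O(1)$ even though an individual append can cost $O(n)$.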

## 4.4 Evaluating the Set ADT
![Screen%20Shot%202020-11-19%20at%202.33.59%20PM.png](attachment:Screen%20Shot%202020-11-19%20at%202.33.59%20PM.png)

### Simple Operations
* **The** add( ) **method also requires** $O(n)$ **time in the worst case since it uses the $in$ operator to determine if the element is unique and the** append( ) **method to add the unique item to the underlying list, both of which require linear time in the worst case.**

### Operations of Two Sets
* **The** isSubsetOf( ) **method determines if $A$ is a subset of $B$. It iterates over the $n$ elements of set $A$, during which the $in$ operator is used to determine if the given element is a member of set $B$. Since there are $n$ repetitions of the loop and each use of the $in$ operator requires $O(n)$ time, the** isSubsetOf( ) **method has a quadratic run time of $O(n^2)$.**
* **The set equality operation is also** $ O(n^2) $ **since it calls** isSubsetOf( ) **after determining the two sets are of equal size.**
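
The two operations described above can be sketched on plain Python lists standing in for the underlying list of the set. The function names `is_subset_of` and `sets_equal` are assumptions for this illustration and are not the methods of the Set ADT itself.

```python
def is_subset_of(a_items, b_items):
    """Return True if every element of a_items also appears in b_items."""
    for element in a_items:            # n iterations
        if element not in b_items:     # each `in` test is a linear scan
            return False
    return True

def sets_equal(a_items, b_items):
    """Two sets are equal if they have the same size and one is a subset of the other."""
    if len(a_items) != len(b_items):
        return False
    return is_subset_of(a_items, b_items)

print(is_subset_of([1, 2], [3, 2, 1]))
print(sets_equal([1, 2, 3], [3, 2, 1]))
```

The nested linear scan inside the loop is what produces the $O(n^2)$ worst case.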

### Set Union Operation
* **The set** Union( ) **operation creates a new set, $C$, that contains all of the unique elements from both set $A$ and set $B$. It requires three steps.**
    1. **The first step creates the new set $C$, which can be done in constant time.**
    1. **The second step fills set $C$ with the elements from set $A$, which requires $O(n)$ time since the** extend( ) **list method is used to add the elements to $C$.**
    1. **The last step iterates over the elements of set $B$, during which the $in$ operator is used to determine if the given element is a member of set $A$. If the element is not a member of set $A$, it's added to set $C$ by applying the** append( ) **list method. Given that the loop is performed $n$ times and each iteration requires $n+1$ time, this step requires $O(n^2)$ time.**
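
The three steps can be sketched on plain Python lists standing in for the underlying list of each set. The function name `union` and the sample data are assumptions for this illustration.

```python
def union(a_items, b_items):
    """Return a new list holding the unique elements of both inputs."""
    c_items = []                       # step 1: create the new set, O(1)
    c_items.extend(a_items)            # step 2: copy the elements of A, O(n)
    for element in b_items:            # step 3: n iterations
        if element not in a_items:     # linear scan of A
            c_items.append(element)
    return c_items

print(union([1, 2, 3], [3, 4, 5]))
```

The third step dominates, so the whole operation is $O(n^2)$ in the worst case.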

## 4.5 Application: The Sparse Matrix
* **A matrix containing a large number of zero elements is called a *sparse matrix*.**
    1. **A sparse matrix is formally defined to be an $ m \times n$ matrix that contains k non-zero elements such that $ k \ll m \times n$.**
* **One approach is to organize and store the non-zero elements of the matrix within a single list instead of a 2-D array.**
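
The storage idea can be sketched with a small example. The code below converts a dense 2-D list into a single list of `(row, col, value)` tuples that holds only the non-zero entries; the sample matrix and the use of tuples instead of a storage class are assumptions for this illustration.

```python
# A small 3 x 4 matrix with k = 2 non-zero elements
dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [7, 0, 0, 0],
]

# Keep only the non-zero entries, together with their positions
triples = [(r, c, value)
           for r, row in enumerate(dense)
           for c, value in enumerate(row)
           if value != 0]

print(triples)
print(len(triples), "stored entries instead of", len(dense) * len(dense[0]))
```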

### 4.5.1 List-Based Implementation

#### Constructor
* **The** \_elementList **field stores** \_MatrixElement **objects representing the non-zero elements. Instances of the storage class contain not only the value for a specific element, but also the row and column indices indicating its location within the matrix.**
* **The** \_numRows **and** \_numCols **fields are used to store the dimensions of the matrix**

#### Helper Method
* \_findPosition( ) **performs a linear search by iterating through the element list looking for an entry with the given row and column indices. If found, it returns the list index of the cell containing the element; otherwise,** None **is returned to indicate the absence of the element.**

![Screen%20Shot%202020-11-19%20at%204.14.13%20PM.png](attachment:Screen%20Shot%202020-11-19%20at%204.14.13%20PM.png)

#### Modifying an Element
* **If the entry is in the list, we either change the corresponding element to the new value if it is non-zero or we remove the entry from the list when the new value is zero.**
* **If there is no entry for the given element, then a new** \_MatrixElement **object must be created and appended to the list.**

#### Matrix Scaling
* **Scaling a matrix requires multiplying each element in the matrix by a given scale factor. The implementation traverses the list of** \_MatrixElement **objects and scales the corresponding values.**

#### Matrix Addition
1. **Verify the size of the two matrices to ensure they are the same as required by matrix addition**
1. **Create a new** SparseMatrix **object with the same number of rows and columns as the other two**
1. **Duplicate the elements of the** self **matrix and store them in the new matrix**
1. **Iterate over the element list of the right hand side $(rhsMatrix)$ to add the non-zero values to the corresponding elements in the new matrix.**

In [27]:
# Implementation of the Sparse Matrix ADT using a list
class SparseMatrix:
    # Create a sparse matrix of size numRows x numCols initialized to 0
    def __init__(self, numRows, numCols):
        self._numRows = numRows
        self._numCols = numCols
        self._elementList = list()
        
    # Return the number of rows in the matrix
    def numRows(self):
        return self._numRows

    # Return the number of columns in the matrix
    def numCols(self):
        return self._numCols
    
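    # return the value of element (i, j): x[i, j]
    # entries that are not stored in the list are zero
    def __getitem__(self, ndxTuple):
        ndx = self._findPosition(ndxTuple[0], ndxTuple[1])
        if ndx is not None:
            return self._elementList[ndx].value
        return 0.0

    # set the value of element (i, j) to the value s: x[i, j] = s
    def __setitem__(self, ndxTuple, scalar):
        ndx = self._findPosition(ndxTuple[0], ndxTuple[1])
        if ndx is not None: # if the element is found in the list
            if scalar != 0.0:
                self._elementList[ndx].value = scalar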
            else: 
                self._elementList.pop(ndx) 
        else:
            if scalar != 0.0:
                element = _MatrixElement(ndxTuple[0], ndxTuple[1], scalar)
                self._elementList.append(element)
                    
    # scale the matrix by the given scalar
    def scaleBy(self, scalar):
        for element in self._elementList:
            element.value *= scalar
        
    # helper method used to find a specific matrix element (row, col)
    # in the list of non-zero entries
    def _findPosition(self, row, col):
        n = len( self._elementList )
        for i in range(n):
            if row == self._elementList[i].row and col == self._elementList[i].col:
                return i
        return None
    
    # Implementing the add operation
    def __add__(self, rhsMatrix):
        assert rhsMatrix.numRows() == self.numRows() and rhsMatrix.numCols() == self.numCols(), "Matrix sizes not compatible for add operation"
        
        #create the new matrix
        newMatrix = SparseMatrix(self.numRows(), self.numCols() )
        
        # Duplicate the lhs matrix. The elements are mutable, thus we
        # must create new objects and not simply copy the references
        for element in self._elementList:
            dupElement = _MatrixElement(element.row, element.col, element.value)
            newMatrix._elementList.append(dupElement)
            
        # Iterate through each non-zero element of the rhsMatrix
        for element in rhsMatrix._elementList:
            # Get the value of the corresponding element in the new matrix
            value = newMatrix[ element.row, element.col ]
            value += element.value
            #store the new value back to the new matrix
            newMatrix[ element.row, element.col ] = value
        
        # return the new matrix
        return newMatrix
    
class _MatrixElement:
    def __init__(self, row, col, value):
        self.row = row
        self.col = col
        self.value = value

### 4.5.2 Efficiency Analysis
* **The** \_findPosition( ) **helper method performs a sequential search over the list of non-zero entries. The worst case run time for the helper method is** $ O(k) $.
* **The** \_\_setitem\_\_ **method calls** \_findPosition( ), **which requires $k$ time. It then changes the value of the target entry, which is a constant time operation, or removes the entry, which requires $O(k)$ time in the worst case since the items that follow it in the list must be shifted.**
* **It changes the value of the target entry which is a constant time operation, or either removes th