- Simply put, a data structure is a systematic way of organizing and accessing data, and an algorithm is a step-by-step procedure for performing some task in a finite amount of time.

## 3.1 Experimental Studies

In [None]:
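from time import time
start_time = time()               # record the starting time
# run algorithm here (placeholder for the code being measured)
end_time = time()                 # record the ending time
elapsed = end_time - start_time   # compute the elapsed time in seconds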

- Measuring elapsed time in this way is a reasonable first approach, but it is by no means perfect. The time function measures relative to what is known as the "wall clock." Because many processes share use of a computer's CPU, the elapsed time will depend on what other processes are running on the computer when the test is performed.
- Python includes a more advanced module, named timeit, to help automate such evaluations with repetition to account for such variance among trials.
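
- As a minimal sketch of how timeit might be used, the example below times a small function over several repeated trials. The function sum\_of\_squares and the chosen repeat and number values are assumptions made for illustration only.

```python
import timeit

def sum_of_squares(n):
    """Hypothetical workload: sum of k*k for k in range(n)."""
    return sum(k * k for k in range(n))

# Run 3 independent trials; each trial calls the function 100 times.
trials = timeit.repeat(lambda: sum_of_squares(1000), repeat=3, number=100)

# The minimum is usually the least disturbed by other processes.
best_per_call = min(trials) / 100
print(f"best time per call: {best_per_call:.2e} seconds")
```

- Taking the minimum over the trials, rather than the mean, is a common choice because slower trials mostly reflect interference from other processes.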

** Challenge of Experimental Analysis **

- While experimental studies of running times are valuable, especially when fine-tuning production-quality code, there are three major limitations to their use for algorithm analysis:

- Experimental running times of two algorithms are difficult to directly compare unless the experiments are performed in the same hardware and software environments.
- Experiments can be done only on a limited set of test inputs; hence, they leave out the running times of inputs not included in the experiment.
- An algorithm must be fully implemented in order to execute it to study its running time experimentally.

- This last requirement is the most serious drawback to the use of experimental studies. At early stages of design, when considering a choice of data structures or algorithms, it would be foolish to spend a significant amount of time implementing an approach that could easily be deemed inferior by a higher-level analysis.

### 3.1.1 Moving Beyond Experimental Analysis

- Our goal is to develop an approach to analyzing the efficiency of algorithms that: 
1. Allows us to evaluate the relative efficiency of any two algorithms in a way that is independent of the hardware and software environment.
2. Is performed by studying a high-level description of the algorithm without need for implementation.
3. Takes into account all possible inputs.

** Counting Primitive Operations **

- To analyze the running time of an algorithm without performing experiments, we perform an analysis directly on a high-level description of the algorithm. We define a set of primitive operations such as the following:

- Assigning an identifier to an object
- Determining the object associated with an identifier 
- Performing an arithmetic operation 
- Comparing two numbers
- Accessing a single element of a Python list by index
- Calling a function 
- Returning from a function

- Formally, a primitive operation corresponds to a low-level instruction with an execution time that is constant. Ideally, this might be the type of basic operation that is executed by the hardware, although many of our primitive operations may be translated to a small number of instructions. Instead of trying to determine the specific execution time of each primitive operation, we will simply count how many primitive operations are executed, and use this number t as a measure of the running time of the algorithm.

- This operation count will correlate to an actual running time in a specific computer, for each primitive operation corresponds to a constant number of instructions, and there are only a fixed number of primitive operations. The implicit assumption in this approach is that the running times of different primitive operations will be fairly similar. Thus, the number, t, of primitive operations an algorithm performs will be proportional to the actual running time of that algorithm.
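
- As a rough illustration of counting, the sketch below tallies operations while scanning a list for its largest element. The function name count\_ops\_max and the exact accounting (one unit per index access, assignment, comparison, and return) are assumptions made for this example, not a fixed standard.

```python
def count_ops_max(data):
    """Return (largest element, number of primitive operations counted)."""
    ops = 2                      # index access data[0] + assignment to biggest
    biggest = data[0]
    for val in data:
        ops += 2                 # assignment to val + comparison val > biggest
        if val > biggest:
            ops += 1             # assignment to biggest
            biggest = val
    ops += 1                     # returning from the function
    return biggest, ops

print(count_ops_max([1, 2, 3, 4]))   # increasing input: an update on most passes
print(count_ops_max([4, 3, 2, 1]))   # decreasing input: no updates at all
```

- Under this accounting, an increasing list of n elements costs 3n + 2 operations and a decreasing one costs 2n + 3, so the count grows in proportion to n either way.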

** Measuring Operations as a Function of Input Size **

- To capture the order of growth of an algorithm's running time, we will associate with each algorithm a function f(n) that characterizes the number of primitive operations that are performed as a function of the input size n. Section 3.2 will introduce the seven most common functions that arise, and Section 3.3 will introduce a mathematical framework for comparing functions to each other.

** Focusing on the Worst-Case Input **

- An algorithm may run faster on some inputs than it does on others of the same size. Thus, we may wish to express the running time of an algorithm as the function of the input size obtained by taking the average over all possible inputs of the same size. Unfortunately, such an average-case analysis is typically quite challenging. It requires us to define a probability distribution on the set of inputs, which is often a difficult task. Figure 3.2 schematically shows how, depending on the input distribution, the running time of an algorithm can be anywhere between the worst-case time and the best-case time. For example, what if inputs are really only of types "A" or "D"?


- An average-case analysis usually requires that we calculate expected running times based on a given input distribution, which usually involves sophisticated probability theory. Therefore, for the remainder of this book, unless we specify otherwise, we will characterize running times in terms of worst-case, as a function of the input size, n, of the algorithm.

- Worst-case analysis is much easier than average-case analysis, as it requires only the ability to identify the worst-case input, which is often simple. Also, this approach typically leads to better algorithms. Making the standard of success for an algorithm to perform well in the worst case necessarily requires that it will do well on every input. That is, designing for the worst case leads to stronger algorithmic "muscles," much like a track star who always practices by running up an incline.

## 3.2 The Seven Functions Used in This Book

** The Constant Function **

- The simplest function we can think of is the constant function. 

$$ f(n) = c, $$

- for some fixed constant c, such as c = 5, c = 27, or c = $2^{10}$. That is, for any argument n, the constant function f(n) assigns the value c. In other words, it does not matter what the value of n is; f(n) will always be equal to the constant value c.

- Because we are most interested in integer functions, the most fundamental constant function is g(n) = 1, and this is the typical constant function we use in this book. Note that any other constant function, f(n) = c, can be written as a constant c times g(n). That is, f(n) = cg(n) in this case.

** The Logarithm Function **

** The Linear Function **

** The N-Log-N Function **

** The Quadratic Function **

- Nested Loops and the Quadratic Function

** The Cubic Function and Other Polynomials **

- coefficients, degree

** Summations **

$$ \sum $$

** The Exponential Function **

- base, exponent

** Geometric Sums **

### 3.2.1 Comparing Growth Rates

![main](classes_of_functions.png "main")

- Ideally, we would like data structure operations to run in times proportional to the constant or logarithm function, and we would like our algorithms to run in linear or n-log-n time. Algorithms with quadratic or cubic running times are less practical, and algorithms with exponential running times are infeasible for all but the smallest sized inputs. Plots of the seven functions are shown in Figure 3.4.
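
- To get a feel for how quickly these functions separate, the sketch below tabulates the seven functions for a few small input sizes. The dictionary growth\_functions and the helper growth\_row are names chosen for this example, and logarithms are taken base 2 as an assumption.

```python
import math

# The seven functions, in order of increasing growth rate.
growth_functions = {
    "1": lambda n: 1,
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
}

def growth_row(n):
    """Return the value of each of the seven functions at input size n."""
    return [f(n) for f in growth_functions.values()]

print("n", list(growth_functions))
for n in (8, 16, 32, 64):
    print(n, growth_row(n))
```

- Even at n = 64 the exponential column already has twenty digits, while the logarithm is still only 6.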

** The Ceiling and Floor Functions **

- One additional comment concerning the functions above is in order. When discussing logarithms, we noted that the value is generally not an integer, yet the running time of an algorithm is usually expressed by means of an integer quantity, such as the number of operations performed. Thus, the analysis of algorithms may sometimes involve the use of the floor function and ceiling function, which are defined respectively as follows:

- $\lfloor x \rfloor$ = the largest integer less than or equal to x.
- $\lceil x \rceil$ = the smallest integer greater than or equal to x.
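
- As a small illustration, the sketch below applies Python's math.floor and math.ceil to a base-2 logarithm and compares the floor with an integer count: the number of times n can be halved (using integer division) before reaching 1. The helper name halvings is an assumption made for this example.

```python
import math

def halvings(n):
    """Count how many times n can be halved with // before reaching 1."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

n = 10
x = math.log2(n)                 # not an integer (a little over 3.3)
print(math.floor(x), math.ceil(x))
print(halvings(n))               # matches the floor of log2(n)
```

- Here the real-valued logarithm is rounded down by the floor and rounded up by the ceiling, and the halving count agrees with the floor.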

## 3.3 Asymptotic Analysis

- In algorithm analysis, we focus on the growth rate of the running time as a function of the input size n, taking a "big-picture" approach. For example, it is often enough just to know that the running time of an algorithm grows proportionally to n.
- We analyze algorithms using a mathematical notation for functions that disregards constant factors. Namely, we characterize the running times of algorithms by using functions that map the size of the input, n, to values that correspond to the main factor that determines the growth rate in terms of n. This approach reflects that each basic step in a pseudo-code description or a high-level language implementation may correspond to a small number of primitive operations. Thus, we can perform an analysis of an algorithm by estimating the number of primitive operations executed up to a constant factor, rather than getting bogged down in language-specific or hardware-specific analysis of the exact number of operations that execute on the computer.

- As a tangible example, we revisit the goal of finding the largest element of a Python list; we first used this example when introducing for loops on page 21 of Section 1.4.2. Code Fragment 3.1 presents a function named find\_max for this task.

In [1]:
def find_max(data):
    """Return the maximum element from a nonempty Python list."""
    biggest = data[0]
    for val in data:
        if val > biggest:
            biggest = val
    return biggest
        

- This is a classic example of an algorithm with a running time that grows proportionally to n, as the loop executes once for each data element, with some fixed number of primitive operations executing for each pass. In the remainder of this section, we provide a framework to formalize this claim.

### 3.3.1 The "Big-Oh" Notation

![main](bigoh_function.png "main")

- The big-Oh notation allows us to say that a function f(n) is "less than or equal to" another function g(n) up to a constant factor and in the asymptotic sense as n grows toward infinity. This ability comes from the fact that the definition uses "<=" to compare f(n) to g(n) times a constant, c, for the asymptotic cases when n >= $n_0$.
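
- Recall the standard definition: f(n) is O(g(n)) if there is a real constant c > 0 and an integer constant $n_0$ >= 1 such that f(n) <= c g(n) for every n >= $n_0$. The sketch below spot-checks this inequality over a finite range for the illustrative function f(n) = 8n + 5 with g(n) = n. The helper satisfies\_big\_oh and the bound n\_max are assumptions for this example; a finite check is only evidence, not a proof.

```python
def satisfies_big_oh(f, g, c, n0, n_max=10_000):
    """Return True if f(n) <= c * g(n) for every integer n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

def f(n):
    return 8 * n + 5

def g(n):
    return n

print(satisfies_big_oh(f, g, c=9, n0=5))   # 8n + 5 <= 9n holds once n >= 5
print(satisfies_big_oh(f, g, c=9, n0=1))   # fails at n = 1, since 13 > 9
print(satisfies_big_oh(f, g, c=8, n0=1))   # c = 8 never works: 8n + 5 > 8n
```

- The pair c = 9, $n_0$ = 5 works because 8n + 5 <= 9n is equivalent to n >= 5; many other valid pairs exist, such as c = 13 with $n_0$ = 1.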

- However, it is considered poor taste to say "f(n) <= O(g(n))," since the big-Oh already denotes the "less-than-or-equal-to" concept. Likewise, although common, it is not fully correct to say "f(n) = O(g(n))," with the usual understanding of the "=" relation, because there is no way to make sense of the symmetric statement, "O(g(n)) = f(n)." It is best to say,

** "f(n) is O(g(n))." **

** Characterizing Running Times Using the Big Oh Notation **

- The big-Oh notation is used widely to characterize running time and space bounds in terms of some parameter n, which varies from problem to problem, but is always defined as a chosen measure of the "size" of the problem. For example, if we are interested in finding the largest element in a sequence, as with the find\_max algorithm, we should let n denote the number of elements in that collection. Using the big-Oh notation, we can write the following mathematically precise statement on the running time of algorithm find\_max for any computer:

- The algorithm find\_max, for computing the maximum element of a list of n numbers, runs in O(n) time.

** Characterizing Functions in Simplest Terms **

- without constant

** Big-Omega **

- Just as the big-Oh notation provides an asymptotic way of saying that a function is "less than or equal to" another function, the following notations provide an asymptotic way of saying that a function grows at a rate that is "greater than or equal to" that of another.

** Big-Theta **

- In addition, there is a notation that allows us to say that two functions grow at the same rate, up to constant factors.

### 3.3.2 Comparative Analysis

- Suppose two algorithms solving the same problem are available: an algorithm A, which has a running time of O(n), and an algorithm B, which has a running time of O($n^{2}$). Which algorithm is better?

** Some Words of Caution **

- A few words of caution about asymptotic notation are in order at this point. First, note that the use of the big-Oh and related notations can be somewhat misleading should the constant factors they "hide" be very large. For example, while it is true that the function 10000n is O(n), if this is the running time of an algorithm being compared to one whose running time is 10n log n, we should prefer the O(n log n)-time algorithm, even though the linear-time algorithm is asymptotically faster. The reason is that 10n log n is smaller than 10000n whenever log n < 1000, which covers every input size that could arise in practice.
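
- The effect of a large hidden constant can be checked numerically. The sketch below evaluates the two running-time functions from this example at a few input sizes; the function names linear\_cost and nlogn\_cost are chosen for illustration, and the logarithm is assumed to be base 2.

```python
import math

def linear_cost(n):
    """Running time 10000n, which is O(n)."""
    return 10_000 * n

def nlogn_cost(n):
    """Running time 10 n log n, which is O(n log n)."""
    return 10 * n * math.log2(n)

for n in (10**3, 10**6, 10**9):
    faster = "n log n" if nlogn_cost(n) < linear_cost(n) else "linear"
    print(n, linear_cost(n), round(nlogn_cost(n)), faster)
```

- For each of these sizes the n-log-n algorithm performs fewer operations, because the ratio of the two costs is (log n) / 1000.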

** Exponential Running Times **

### 3.3.3 Examples of Algorithm Analysis

- Now that we have the big-Oh notation for doing algorithm analysis, let us give some examples by characterizing the running time of some simple algorithms using this notation. Moreover, in keeping with our earlier promise, we illustrate below how each of the seven functions given earlier in this chapter can be used to characterize the running time of an example algorithm.

- Rather than use pseudo-code in this section, we give complete Python implementations for our examples. We use Python's list class as the natural representation for an "array" of values. In Chapter 5, we will fully explore the underpinnings of Python's list class, and the efficiency of the various behaviors that it supports. In this section, we rely on just a few of its behaviors, discussing their efficiencies as introduced.

** Constant-Time Operations **

- Given an instance, named data, of the Python list class, a call to the function len(data) is evaluated in constant time. This is a very simple algorithm because the list class maintains, for each list, an instance variable that records the current length of the list. This allows it to immediately report that length, rather than take time to iteratively count each of the elements in the list. Using asymptotic notation, we say that this function runs in O(1) time; that is, the running time of this function is independent of the length, n, of the list.


- Another central behavior of Python's list class is that it allows access to an arbitrary element of the list using syntax, data[j], for integer index j. Because Python's lists are implemented as array-based sequences, references to a list's elements are stored in a consecutive block of memory. The jth element of the list can be found, not by iterating through the list one element at a time, but by validating the index, and using it as an offset into the underlying array. In turn, computer hardware supports constant-time access to an element based on its memory address. Therefore, we say that the expression data[j] is evaluated in O(1) time for a Python list.

** Revisiting the Problem of Finding the Maximum of a Sequence **

** Prefix Averages **

- The next problem we consider is computing what are known as *prefix averages* of a sequence of numbers. Namely, given a sequence S consisting of n numbers, we want to compute a sequence A such that A[j] is the average of elements S[0], ..., S[j], for j = 0, ..., n - 1, that is,

$$ A[j] = \frac{\sum_{i=0}^{j} S[i]}{j + 1}. $$

- Computing prefix averages has many applications in economics and statistics. For example, given the year-by-year returns of a mutual fund, ordered from recent to past, an investor will typically want to see the fund's average annual returns for the most recent year, the most recent three years, the most recent five years, and so on. Likewise, given a stream of daily Web usage logs, a Web site manager may wish to track average usage trends over various time periods. We analyze three different implementations that solve this problem but with rather different running times.

** A Quadratic-Time Algorithm **

- Our first algorithm for computing prefix averages, named prefix\_average1, is shown in Code Fragment 3.2. It computes every element of A separately, using an inner loop to compute the partial sum.

In [2]:
def prefix_average1(S):
    """Return list such that, for all j, A[j] equals average of S[0], ..., S[j]"""
    n = len(S)
    A = [0] * n
    for j in range(n):
        total = 0 
        for i in range(j + 1):
            total += S[i]
        A[j] = total / (j + 1)
    return A
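
- To see why this nested-loop structure is quadratic, the sketch below counts how many times the body of the inner loop runs for an input of size n and compares the count with the closed form n(n + 1)/2. The helper name count\_inner\_additions is an assumption made for this example.

```python
def count_inner_additions(n):
    """Count executions of the inner-loop body for an input of size n."""
    count = 0
    for j in range(n):
        for i in range(j + 1):   # same loop bounds as the prefix-average loops
            count += 1
    return count

for n in (10, 100, 1000):
    print(n, count_inner_additions(n), n * (n + 1) // 2)
```

- The two columns agree, since the inner body runs 1 + 2 + ... + n = n(n + 1)/2 times, which is O($n^2$).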

In [3]:
def prefix_average2(S):
    """Return list such that, for all j, A[j] equals average of S[0], ..., S[j]"""
    n = len(S)
    A = [0] * n
    for j in range(n):
        A[j] = sum(S[0:j+1]) / (j + 1)
    return A

- This approach is essentially the same high-level algorithm as in prefix\_average1, but we have replaced the inner loop by using the single expression sum(S[0:j+1]) to compute the partial sum, S[0] + ... + S[j]. While the use of that function greatly simplifies the presentation of the algorithm, it is worth asking how it affects the efficiency. Asymptotically, this implementation is no better. Even though the expression, sum(S[0:j+1]), seems like a single command, it is a function call and an evaluation of that function takes O(j+1) time in this context. Technically, the computation of the slice, S[0:j+1], also uses O(j+1) time, as it constructs a new list instance for storage. So the running time of prefix\_average2 is still dominated by a series of steps that take time proportional to 1 + 2 + 3 + ... + n, and thus is O($n^2$).

** Linear-Time Algorithm **

- Our final algorithm, prefix\_average3, is given in Code Fragment 3.4. Just as with our first two algorithms, we are interested in computing, for each j, the *prefix sum* S[0] + S[1] + ... + S[j], denoted as total in our code, so that we can then compute the prefix average A[j] = total / (j + 1). However, there is a key difference that results in much greater efficiency.

In [4]:
def prefix_average3(S):
    """Return list such that, for all j, A[j] equals average of S[0], ..., S[j]"""
    n = len(S)
    A = [0] * n
    total = 0
    for j in range(n):
        total += S[j]
        A[j] = total / (j + 1)
    return A
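
- The same running-total idea can also be expressed with the standard library. The sketch below is an alternative formulation, not one of the three implementations analyzed here: it uses itertools.accumulate to produce the prefix sums in a single pass, and the function name prefix\_average\_accumulate is an assumption made for this example.

```python
from itertools import accumulate

def prefix_average_accumulate(S):
    """Return list A such that A[j] is the average of S[0], ..., S[j]."""
    # accumulate(S) yields the running totals S[0], S[0]+S[1], ... in O(n) time.
    return [total / (j + 1) for j, total in enumerate(accumulate(S))]

print(prefix_average_accumulate([1, 2, 3, 4]))
```

- Each prefix sum is obtained from the previous one with a single addition, so the overall running time is O(n).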

** Three-Way Set Disjointness **

- Suppose we are given three sequences of numbers, A, B, and C. We will assume that no individual sequence contains duplicate values, but that there may be some numbers that are in two or three of the sequences. The **three-way set disjointness** problem is to determine if the intersection of the three sequences is empty, namely, that there is no element x such that x is in A, x is in B, and x is in C. A simple Python function to determine this property is given below.

In [6]:
def disjoint1(A, B, C):
    """Return True if there is no element common to all three lists."""
    for a in A:
        for b in B:
            for c in C:
                if a == b == c:
                    return False 
    return True

- This simple algorithm loops through each possible triple of values from the three sets to see if those values are equivalent. If each of the original sets has size n, then the worst-case running time of this function is O($n^3$).

- We can improve upon the asymptotic performance with a simple observation. Once inside the body of the loop over B, if selected elements a and b do not match each other, it is a waste of time to iterate through all values of C looking for a matching triple. An improved solution to this problem, taking advantage of this observation, is presented in Code Fragment 3.6.

In [8]:
def disjoint2(A, B, C):
    """Return True if there is no element common to all three lists."""
    for a in A:
        for b in B:
            if a == b:              # only check C if we found a match from A and B
                for c in C:
                    if a == c:
                        return False
    return True

- In the improved version, it is not simply that we save time if we get lucky. We claim that the worst-case running time for disjoint2 is O($n^2$). There are quadratically many pairs (a, b) to consider. However, if A and B are each sets of distinct elements, there can be at most O(n) such pairs with a equal to b. Therefore, the innermost loop, over C, executes at most n times.

- To account for the overall running time, we examine the time spent executing each line of code. The management of the for loop over A requires O(n) time. The management of the for loop over B accounts for a total of O($n^2$) time, since that loop is executed n different times. The test a == b is evaluated O($n^2$) times. The rest of the time spent depends upon how many matching (a, b) pairs exist. As we have noted, there are at most n such pairs, and so the management of the loop over C, and the commands within the body of that loop, use at most O($n^2$) time. By our standard application of Proposition 3.9, the total time spent is O($n^2$).
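
- The difference between the two approaches can be observed by counting comparisons directly. The sketch below is an instrumented version of both strategies; the names count\_disjoint1 and count\_disjoint2 and the step counters are assumptions added for this illustration. The test input makes A and B identical and C disjoint from both, so every element of A finds a match in B.

```python
def count_disjoint1(A, B, C):
    """Triple-loop strategy; return (is_disjoint, comparisons performed)."""
    steps = 0
    for a in A:
        for b in B:
            for c in C:
                steps += 1                  # one test of a == b == c
                if a == b == c:
                    return False, steps
    return True, steps

def count_disjoint2(A, B, C):
    """Scan C only for matching (a, b) pairs; return (is_disjoint, comparisons)."""
    steps = 0
    for a in A:
        for b in B:
            steps += 1                      # one test of a == b
            if a == b:
                for c in C:
                    steps += 1              # one test of a == c
                    if a == c:
                        return False, steps
    return True, steps

n = 20
A = list(range(n))
B = list(range(n))
C = list(range(n, 2 * n))
print(count_disjoint1(A, B, C))   # n**3 comparisons
print(count_disjoint2(A, B, C))   # n**2 pair tests plus n scans of C
```

- On this input the first strategy performs $n^3$ = 8000 comparisons, while the second performs $n^2$ + n * n = 800.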

** Element Uniqueness **

- A problem that is closely related to the three-way set disjointness problem is the element uniqueness problem. In the former, we are given three collections and we presumed that there were no duplicates within a single collection. In the element uniqueness problem, we are given a single sequence S with n elements and asked whether all elements of that collection are distinct from each other.

- Our first solution to this problem uses a straightforward iterative algorithm. The unique1 function, given in Code Fragment 3.7, solves the element uniqueness problem by looping through all distinct pairs of indices j < k, checking if any of those pairs refer to elements that are equivalent to each other. It does this using two nested for loops, such that the first iteration of the outer loop causes n - 1 iterations of the inner loop, the second iteration of the outer loop causes n - 2 iterations of the inner loop, and so on. Thus, the worst-case running time of this function is proportional to

$$ (n - 1) + (n - 2) + \cdots + 2 + 1, $$

- which is O($n^2$).

In [9]:
def unique1(S):
    """Return True if there are no duplicate elements in sequence S."""
    for j in range(len(S)):
        for k in range(j+1, len(S)):
            if S[j] == S[k]:
                return False
    return True

** Using Sorting as a Problem-Solving Tool **

- An even better algorithm for the element uniqueness problem is based on using sorting as a problem-solving tool. In this case, by sorting the sequence of elements, we are guaranteed that any duplicate elements will be placed next to each other. Thus, to determine if there are any duplicates, all we need to do is perform a single pass over the sorted sequence, looking for *consecutive* duplicates. A Python implementation of this algorithm is as follows:

In [10]:
def unique2(S):
    """Return True if there are no duplicate elements in sequence S."""
    temp = sorted(S)
    for j in range(1, len(temp)):
        if temp[j-1] == temp[j]:       # compare neighbors in the sorted copy, not in S
            return False
    return True

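- Sorting takes O(n log n) time in the worst case, and the subsequent loop runs in O(n) time, so the entire unique2 algorithm runs in O(n log n) time.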

## 3.4 Simple Justification Techniques

- Sometimes, we will want to make claims about an algorithm, such as showing that it is correct or that it runs fast. In order to rigorously make such claims, we must use mathematical language, and in order to back up such claims, we must justify or prove our statements.

### 3.4.1 By Example
- counterexample

### 3.4.2 The "Contra" Attack

- Contrapositive, contradiction (DeMorgan's Law)

### 3.4.3 Induction and Loop Invariants

** Induction **

** Loop Invariants **

In [11]:
def find(S, val):
    """Return index j such that S[j] == val, or -1 if no such element."""
    n = len(S)
    j = 0
    while j < n:
        if S[j] == val:
            return j
        j += 1
    return -1
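
- One way to make a loop invariant concrete is to assert it at the top of every iteration. The sketch below is a variant of find, here given the hypothetical name find\_checked, that asserts the invariant "val is not equal to any of the first j elements of S." Choosing this particular invariant is an assumption for the example, and the assertion itself costs O(j) time per pass, so it is meant for illustration rather than production use.

```python
def find_checked(S, val):
    """Return index j such that S[j] == val, or -1 if no such element."""
    n = len(S)
    j = 0
    while j < n:
        # Loop invariant: val does not occur among S[0], ..., S[j-1].
        assert all(S[i] != val for i in range(j))
        if S[j] == val:
            return j
        j += 1
    # The invariant with j == n means val is nowhere in S.
    return -1

print(find_checked([4, 7, 9], 9))
print(find_checked([4, 7, 9], 5))
```

- The invariant holds trivially when j = 0, and each pass that does not return extends it by one element; when the loop ends with j = n, it says exactly that val does not appear in S, which justifies returning -1.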