# Chapter 03: Algorithm Analysis

* A **data structure** is a systematic way of organizing and accessing data.
* An **algorithm** is a step-by-step procedure for performing some task in a finite amount of time.

* Limitations of Experimental Analysis
  1. Experimental running times are not an objective measure for comparing several algorithms, since they rely heavily on the hardware and software environments.
  2. Experiments can be done only on a limited set of test inputs. We don't know what will happen for cases not included in the experiment. 
  3. Experimental analysis requires actual implementation of code for measuring performance.
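
The limitations listed above are easy to observe in practice. The following is a minimal sketch of an experimental measurement, assuming a hypothetical helper `time_call` and a toy workload `sum_of_squares` (neither name comes from the text). The printed times will differ from machine to machine and from run to run, and they only cover the three input sizes that were actually tried.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time (in seconds) over several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def sum_of_squares(n):
    """Toy workload (an assumption for illustration): sum k*k for k in 0..n-1."""
    total = 0
    for k in range(n):
        total += k * k
    return total


# The measured values depend on the hardware, the interpreter, and system load.
for n in (1_000, 10_000, 100_000):
    print(n, time_call(sum_of_squares, n))
```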

Therefore, our approach to analyzing the efficiency of algorithms should overcome the limitations above.

1. Counting Primitive Operations
  
  We define a set of **primitive operations** such as the following:
  * Assigning an identifier to an object
  * Determining the object associated with an identifier
  * Performing an arithmetic operation
  * Comparing two numbers
  * Accessing a single element of a Python `list` by index
  * Calling a function (excluding operations executed within the function)
  * Returning from a function
  
  Formally, a primitive operation corresponds to a low-level instruction with an execution time that is constant. Instead of trying to determine the specific execution time of each primitive operation, we will simply count how many primitive operations are executed and use this number, $t$, as a measure of the running time of the algorithm. The implicit assumption is that the running times of different primitive operations are fairly similar, so the number, $t$, of primitive operations an algorithm performs will be proportional to the actual running time of that algorithm.
  
2. Measuring Operations as a Function of Input Size

  To capture the order of growth of an algorithm's running time, we will associate, with each algorithm, a function $f(n)$ that characterizes the number of primitive operations that are performed as a function of the input size $n$.
  
3. Focusing on the Worst-Case Input

  An algorithm may run faster on some inputs than it does on others of the same size. Thus, we may wish to express the running time of an algorithm as the function of the input size obtained by taking the average over all possible inputs of the same size. Unfortunately, such an **average-case** analysis is typically quite challenging. It requires us to define a probability distribution on the set of inputs, which is often a difficult task.
  
  An average-case analysis usually requires that we calculate expected running times based on a given input distribution, which usually involves sophisticated probability theory. Therefore, we will characterize running times in terms of the **worst case**, as a function of the input size, $n$, of the algorithm.
  
  Worst-case analysis is much easier than average-case analysis, as it requires only the ability to identify the worst-case input, which is often simple.
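
The three ideas above can be sketched together on a small example. The function `linear_search_count` below is a hypothetical illustration, not code from the text. As a simplification, it counts only element comparisons rather than every primitive operation. For an input of size $n$, the count ranges from 1 (target at the front) to $n$ (target at the end or absent), so the worst-case count is $f(n) = n$.

```python
def linear_search_count(data, target):
    """Return (index, comparisons); index is -1 if target is not in data.

    Only element comparisons are counted, as a stand-in for the full
    count of primitive operations (a simplifying assumption).
    """
    comparisons = 0
    for i in range(len(data)):
        comparisons += 1              # one comparison of data[i] with target
        if data[i] == target:
            return i, comparisons
    return -1, comparisons


n = 8
data = list(range(n))
print(linear_search_count(data, 0))      # best case: (0, 1)
print(linear_search_count(data, n - 1))  # target is last: (7, 8)
print(linear_search_count(data, -5))     # target absent, worst case: (-1, 8)
```

Identifying the worst-case input here needs no probability distribution: any target that is last or missing forces all $n$ comparisons.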

## 3.2 The Seven Functions Used in This Book

1. The Constant Function
    
    $$f(n) = c$$

2. The Logarithm Function

    $$f(n) = \log_b n$$
    
3. The Linear Function

    $$f(n) = n$$

4. The N-Log-N Function

    $$f(n) = n \log n$$

5. The Quadratic Function

    $$f(n) = n^2$$

6. The Cubic Function and Other Polynomials

    $$f(n) = n^3$$
    
    $$f(n) = a_0 + a_1n + a_2n^2 + a_3n^3 + \cdots + a_dn^d$$
    
7. The Exponential Function

    $$f(n) = b^n$$
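
To get a feel for how differently these seven functions grow, the sketch below evaluates each of them at a few input sizes. It assumes $c = 1$ for the constant function and $b = 2$ for both the logarithm and the exponential; these choices, and the names `growth_functions` and `growth_row`, are illustrative only.

```python
import math

# Each entry pairs a label with one of the seven functions (c = 1, b = 2 assumed).
growth_functions = [
    ("constant", lambda n: 1),
    ("log n", lambda n: math.log2(n)),
    ("n", lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2", lambda n: n ** 2),
    ("n^3", lambda n: n ** 3),
    ("2^n", lambda n: 2 ** n),
]


def growth_row(n):
    """Evaluate all seven functions at input size n, in the order listed above."""
    return [f(n) for _, f in growth_functions]


print([label for label, _ in growth_functions])
for n in (8, 16, 32):
    print(n, growth_row(n))
```

Even at $n = 32$, the exponential value is already in the billions, while the logarithm is only 5.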
    

## 3.3 Asymptotic Analysis

### 3.3.1 The "Big-Oh" Notation
Let $f(n)$ and $g(n)$ be functions mapping positive integers to positive real numbers. We say that $f(n)$ is $O(g(n))$ if there is a real constant $c > 0$ and an integer constant $n_0 \geq 1$ such that

$$f(n) \leq cg(n), \quad \text{for} \quad n \geq n_0$$

This definition is often referred to as the "big-Oh" notation, for it is sometimes pronounced as "$f(n)$ is **big-Oh** of $g(n)$."

The big-Oh notation allows us to say that a function $f(n)$ is "less than or equal to" another function $g(n)$ up to 
a constant factor and in the **asymptotic** sense as $n$ grows toward infinity.
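
As a worked illustration of the definition, take $f(n) = 8n + 5$ and $g(n) = n$ (an example chosen here, not taken from the text). Since $8n + 5 \leq 9n$ exactly when $n \geq 5$, the constants $c = 9$ and $n_0 = 5$ witness that $8n + 5$ is $O(n)$. The sketch below spot-checks the inequality over a finite range using a hypothetical helper `holds_big_oh`. Passing such a check is not a proof, although a failure does show that a particular pair $(c, n_0)$ does not work.

```python
def holds_big_oh(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= n_max.

    This is a finite spot-check of the big-Oh inequality, not a proof.
    """
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def f(n):
    return 8 * n + 5


def g(n):
    return n


print(holds_big_oh(f, g, c=9, n0=5))  # True: 8n + 5 <= 9n whenever n >= 5
print(holds_big_oh(f, g, c=9, n0=1))  # False: at n = 1, 13 > 9
```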
