# Chapter 2 - The Mathematics of Algorithms

## Size of a Problem Instance
A problem instance is a particular input data set given to a program. For most problems, the execution time increases with the size of this data set. At the same time, an overly compact representation (such as one using compression) may unnecessarily slow down execution.

- When evaluating an algorithm, assume as far as possible that the encoding of the problem instance is not the determining factor in whether the algorithm can be implemented efficiently.
- The representation of a problem instance should depend only on the type and variety of operations that need to be performed.
- Designing efficient algorithms often starts with selecting the proper data structures in which to represent the problem.

Because we cannot formally define the size of an instance, we assume an instance is encoded in some generally accepted and concise manner.


**Example:** When sorting $n$ integers, we adopt the general convention that each number fits in a 32-bit word on the computing platform, so the size of an instance to be sorted is $n$. If some numbers require more than one word, but only a constant, fixed number of words, then this measure of the size of an instance is off only by a multiplicative constant. An algorithm that works on 64-bit integers may therefore take about twice as long as a similar one that uses 32-bit integers.
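As a rough illustration of the multiplicative constant, the sketch below compares the encoded size of $n$ integers stored as 32-bit and as 64-bit words. The helper name `encoded_size` is hypothetical, and the sketch measures storage only, not running time.

```python
import struct

def encoded_size(n, fmt):
    """Bytes needed to store n integers, each packed with struct format fmt."""
    return n * struct.calcsize(fmt)

# "<i" is a 4-byte (32-bit) integer, "<q" is an 8-byte (64-bit) integer.
for n in (10, 1_000, 100_000):
    small = encoded_size(n, "<i")
    large = encoded_size(n, "<q")
    # The ratio is the same for every n: the encodings differ by a constant factor.
    print(n, small, large, large / small)
```

Whichever encoding is used, the size of the instance is still proportional to $n$.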

## Rate of Growth of Functions
The rate of growth of an algorithm describes how its execution time increases as the size of the input problem instance increases.

Characterising an algorithm's performance this way is a common abstraction that ignores numerous details, such as:
- Computer running it, CPU, data cache, FPU, and other on-chip features. 
- Programming Language and Compiler/Interpreter and optimization settings for generated code. 
- OS
- Background Processes

We assume that changing the platform will change the execution time by a constant factor, and that we can therefore ignore platform differences in conformance with the asymptotically equivalent principle described earlier. 

**Example:** Sequential Search:
- There are $n$ distinct elements in the list. 
- The list contains the desired value $v$.
- Each element in the list is equally likely to be the desired value $v$.

To understand the performance of sequential search, we must know how many elements it examines on average. Since $v$ is known to be in the list and each element is equally likely to be $v$, the average number of elements examined, $E(n)$, is the sum of the number of elements examined for each of the $n$ values, divided by $n$. Mathematically:
$$ E(n) = \frac{1}{n} \sum_{i=1}^{n} i = \frac{n(n+1)}{2n} = \frac{1}{2}n + \frac{1}{2} $$

Thus, sequential search examines about half of the elements in a list of $n$ distinct elements, subject to these assumptions. If the number of elements doubles, sequential search should examine about twice as many elements. The expected number of probes is approximately a linear function of $n$, i.e. $c \times n$ for some constant $c$; here, $c = 0.5$.
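The formula for $E(n)$ can be checked by brute force. The sketch below, with hypothetical helper names `probes` and `average_probes`, searches for every element of a list of $n$ distinct values in turn and averages the number of elements examined.

```python
def probes(items, v):
    """Number of elements sequential search examines before finding v."""
    for count, item in enumerate(items, start=1):
        if item == v:
            return count
    raise ValueError("v is assumed to be in the list")

def average_probes(n):
    """Average probes when each of the n distinct elements is equally likely to be v."""
    items = list(range(n))
    return sum(probes(items, v) for v in items) / n

for n in (10, 20, 40):
    # Measured average next to the closed form n/2 + 1/2.
    print(n, average_probes(n), 0.5 * n + 0.5)
```

Each doubling of $n$ roughly doubles the measured average, as the linear form predicts.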

The fundamental fact is that the constant $c$ is unimportant in the long run.

As $n$ gets larger, the error in claiming that

$$ \frac{1}{2}n \approx \frac{1}{2}n+\frac{1}{2} $$

becomes less significant. The ratio between the two sides of this approximation approaches 1, i.e.:
$$ \lim_{n \to \infty} \frac{\big(\frac{1}{2}n\big)}{\big(\frac{1}{2}n+\frac{1}{2}\big)} = 1$$
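The limit follows from a short simplification, sketched here for completeness:

$$ \frac{\frac{1}{2}n}{\frac{1}{2}n+\frac{1}{2}} = \frac{n}{n+1} = 1 - \frac{1}{n+1} $$

The gap between the ratio and 1 is $\frac{1}{n+1}$, which shrinks as $n$ grows: the ratio is $0.9$ for $n = 9$ and $0.999$ for $n = 999$.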

When using the abstraction of rate of growth, remember:
- Constants matter: that is why we use supercomputers and upgrade our computers on a regular basis.
- The size of $n$ is not always large. **Example:** The running time of Quick Sort grows more slowly than that of Insertion Sort, yet Insertion Sort outperforms Quick Sort on small arrays.
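Both points can be seen in a toy cost model. The constants below ($1$ for the quadratic model, $8$ for the $n \log n$ model) are made up for illustration and are not measurements of any real sort; the function names are hypothetical as well.

```python
import math

def quadratic_cost(n, a=1.0):
    """Hypothetical cost a * n^2, standing in for a simple quadratic sort."""
    return a * n * n

def linearithmic_cost(n, b=8.0):
    """Hypothetical cost b * n * log2(n), with a larger constant factor."""
    return b * n * math.log2(n)

def crossover(limit=1000):
    """Smallest n >= 2 at which the linearithmic model becomes the cheaper one."""
    for n in range(2, limit):
        if linearithmic_cost(n) < quadratic_cost(n):
            return n
    return None

print(crossover())
```

With these assumed constants, the quadratic model is cheaper for every $n$ from 2 to 43 and loses from $n = 44$ onwards. Changing the constants moves the crossover point but does not change which model wins for large $n$.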

Consider four sorting algorithms, each sorting a block of $n$ random strings. 50 trials were run, and the average running time is plotted below:

![image.png](attachment:2765393d-e52d-4a1e-8cb6-134946cdcadb.png)

The variance between runs is surprising. 

One way to interpret these results is to design a function that predicts the performance of each algorithm on a problem instance of size $n$.
We use commercially available software to compute a trend line with a statistical process known as **regression analysis**.

The fitness of a trend line to the actual data is measured by a value between 0 and 1, known as the $R^2$ value. Values near 1 indicate a good fit; values near 0 indicate a poor fit.

**Example:** If $R^2 = 0.9948$, the trend line accounts for about $99.48\%$ of the variation in the measured times, leaving only about $0.52\%$ unexplained.
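To make the $R^2$ value less abstract, here is a minimal sketch of a least-squares trend line and its $R^2$, written in plain Python rather than with the commercial software mentioned above. The function name `fit_line` is hypothetical, and the timings are made up; they are not the measurements behind the plot.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

# Made-up timings for illustration only.
sizes = [8, 16, 32, 64, 128]
times = [10.1, 13.2, 19.8, 35.1, 70.3]

# To fit y = a*n*log(n) + b, regress the times against x = n*log2(n).
xs = [n * math.log2(n) for n in sizes]
a, b, r2 = fit_line(xs, times)
print(f"y = {a:.4f} n log(n) + {b:.4f}, R^2 = {r2:.4f}")
```

Fitting a different model, such as a quadratic, to the same data and comparing the $R^2$ values is one way to decide which trend line describes an algorithm better.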

Sort 4 ($y=0.0053n^2 - 0.3601n + 39.212$, $R^2 = 0.9948$) is clearly the worst performing of these sorting algorithms. $R^2$ so close to 1 indicates that this is an accurate estimate. 

Sort 2 ($y=0.05765n\log(n)+7.9653$) offers the fastest implementation over the given range of data points.
It marginally outperforms Sort 3 initially, and its ultimate behaviour is perhaps 10% faster than Sort 3.

Sort 3 has two distinct behavioural patterns. For blocks of 39 or fewer strings, the behaviour is characterised by $y = 0.0016n^2+0.2939n+3.1838$, $R^2 = 0.9761$. With 40 or more strings, however, it is $y = 0.0798n\log(n)+142.7818$.

The numeric coefficients are platform dependent. The long-term trend as $n$ increases dominates the computation of these behaviours.

The real world behaviour may not be apparent until $n$ is large enough. 

## Analysis in Best, Average and Worst Case
Will these results hold for all problem instances? How will the behaviour of, say, Sort 2 change with different input problem instances of the same size?
- Data could contain large runs of elements already in sorted order. 
- Input could contain duplicate values. 
- Regardless of the size of the input, the elements could be drawn from a much smaller set and contain a significant number of duplicates.
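The effect of such differences can be observed by counting comparisons. The notes do not say which algorithm Sort 2 is, so the sketch below uses Python's built-in `sorted` as a stand-in and counts comparisons through `functools.cmp_to_key`. The helper name `count_comparisons` and the three test inputs are assumptions made for illustration.

```python
import functools
import random

def count_comparisons(data):
    """Sort a copy of data with Python's built-in sort and count comparisons."""
    counter = 0

    def compare(a, b):
        nonlocal counter
        counter += 1
        return (a > b) - (a < b)

    sorted(data, key=functools.cmp_to_key(compare))
    return counter

n = 1000
rng = random.Random(0)  # fixed seed so the run is repeatable

already_sorted = list(range(n))
shuffled = already_sorted[:]
rng.shuffle(shuffled)
few_distinct = [rng.randrange(4) for _ in range(n)]  # heavy duplication

for name, data in [("sorted", already_sorted),
                   ("shuffled", shuffled),
                   ("few distinct", few_distinct)]:
    print(name, count_comparisons(data))
```

All three inputs have the same size, yet the amount of work differs, which is why a single measurement per $n$ cannot describe an algorithm completely.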


### Worst Case
A problem instance on which the algorithm exhibits its worst runtime behaviour.

For a given program and a given value $n$, the worst-case execution time is the maximum execution time, where the maximum is taken over all instances of size $n$.
If $S_n$ is the set of instances $s_i$ of size $n$, and $t()$ is a function that measures the work done by an algorithm on each instance, then the work done by the algorithm on $S_n$ in the worst case is the maximum of $t(s_i)$ over all $s_i \in S_n$. Denoting this worst-case performance on $S_n$ by $T_{wc}(n)$, the rate of growth of $T_{wc}(n)$ defines the worst-case complexity of the algorithm.

### Average Case
Defines the expected behaviour when executing the algorithm on random problem instances.

### Best Case
A problem instance on which the algorithm exhibits its best runtime behaviour.
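The three cases in this section can be made concrete on a small scale. The sketch below uses Insertion Sort as an example algorithm, takes $S_n$ to be all permutations of $n$ distinct values (an assumption made to keep the enumeration finite), and uses the number of comparisons as the work function $t()$. The function names are hypothetical.

```python
from itertools import permutations

def insertion_sort_comparisons(seq):
    """Sort a copy of seq with insertion sort; return the number of comparisons."""
    a = list(seq)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break  # element is in place, stop shifting
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return comparisons

def case_analysis(n):
    """Best, average and worst comparison counts over all instances of size n."""
    work = [insertion_sort_comparisons(p) for p in permutations(range(n))]
    return min(work), sum(work) / len(work), max(work)

for n in range(2, 7):
    print(n, case_analysis(n))
```

Under these assumptions the best case is the already sorted input with $n - 1$ comparisons, and the worst case is the reversed input with $n(n-1)/2$ comparisons. Enumerating all $n!$ permutations is only feasible for very small $n$, which is why the cases are normally analysed mathematically.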