# Algorithm efficiency / complexity

Algorithms are evaluated in terms of how efficiently they solve a task: the more resources they require, the less efficient and the more "*complex*" they are considered.
There are several dimensions along which we can consider the efficiency or complexity of an algorithm:

- Run-time
- Memory
- Storage

In machine learning, we also need to consider each of these in the context of both training and inference.

There are several approaches we can take to measuring this efficiency:

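1. **Big-O** $O(n)$: An upper bound on growth, most often quoted for the worst case
2. **Big-Theta** $\Theta(n)$: A tight bound (both upper and lower), often loosely associated with the average case
3. **Big-Omega** $\Omega(n)$: A lower bound on growth, often loosely associated with the best case

Strictly speaking, these notations bound how a function grows, and any of them can be applied to the best, average, or worst case of an algorithm. We generally focus on **Big-O** because an upper bound gives us a guarantee, and because upper bounds compose: we can reason about the behaviour of combinations of algorithms in terms of their upper bounds in ways that we can't when looking only at best or average cases.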
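For reference, the three notations have standard formal definitions. The block below is a sketch of the usual textbook form; the symbols $f$ (the resource usage being measured), $g$ (the comparison function), $c$, and $n_0$ are introduced here for illustration and do not appear elsewhere in these notes.

$$
\begin{aligned}
f(n) \in O(g(n)) &\iff \exists\, c > 0,\ n_0 : \ 0 \le f(n) \le c\, g(n) \quad \text{for all } n \ge n_0 \\
f(n) \in \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 : \ 0 \le c\, g(n) \le f(n) \quad \text{for all } n \ge n_0 \\
f(n) \in \Theta(g(n)) &\iff f(n) \in O(g(n)) \ \text{and} \ f(n) \in \Omega(g(n))
\end{aligned}
$$

For example, $f(n) = 3n + 5$ is in $O(n)$: taking $c = 4$ and $n_0 = 5$ gives $3n + 5 \le 4n$ for all $n \ge 5$.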

## Big O Notation

***Big O*** notation serves to describe and compare how the running time of different algorithms scales (i.e. how the running time will grow as we increase the size of the inputs). It does **not** tell us how fast an algorithm will run; rather, it tells us how fast its running time will grow as we apply it to larger inputs.

We look at running time through this lens because, with modern hardware, pretty much everything can run fast on 10 items. If one approach takes 20 ms to return an answer, and an alternative approach takes 24 ms, then at that scale I probably won't care which approach is selected.

However, we expect that the running time will grow as we work with more data points. As we start to work with much larger datasets, the rate at which running time grows can be the difference between being able to compute the result on your laptop versus requiring millions of dollars of compute time in a data center.
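To make this concrete, here is a minimal sketch that extrapolates the 20 ms and 24 ms figures above. The growth rates are assumptions made purely for illustration: the hypothetical approach A (20 ms at 10 items) is assumed to scale quadratically, and the hypothetical approach B (24 ms at 10 items) is assumed to scale linearly. The function names are likewise invented for this example.

```python
def projected_ms_quadratic(n, base_ms=20.0, base_n=10):
    """Hypothetical approach A: 20 ms at n=10, ASSUMED to grow as n^2."""
    return base_ms * (n / base_n) ** 2


def projected_ms_linear(n, base_ms=24.0, base_n=10):
    """Hypothetical approach B: 24 ms at n=10, ASSUMED to grow as n."""
    return base_ms * (n / base_n)


for n in (10, 1_000, 1_000_000):
    a_seconds = projected_ms_quadratic(n) / 1000
    b_seconds = projected_ms_linear(n) / 1000
    print(f"n={n:>9,}  A (quadratic): {a_seconds:>16,.2f} s   B (linear): {b_seconds:>10,.2f} s")
```

Under these assumptions, A is slightly faster at 10 items, but at a million items B would finish in well under an hour while A would need years.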

### Common scaling functions

- $O(1)$ **Constant Time**: Resource requirements are independent of $n$.
- $O(\log n)$ **Logarithmic Time**: Resource requirements grow less than proportionally with $n$: doubling $n$ adds only a constant amount of work.
- $O(n)$ **Linear Time**: Resource requirements grow in proportion to $n$.
- $O(n \log n)$ **Log-Linear Time**: If the algorithm has to do work of a certain complexity (e.g. $\log n$) for each of its $n$ inputs, then that work is repeated $n$ times, which is where the multiplication comes from.
- $O(n^2)$ **Quadratic Time**: Resource requirements grow in proportion to the square of $n$.
- $O(2^n)$ **Exponential Time**: Resource requirements grow by a constant factor (in this case, they double) with each additional input.
- $O(n!)$ **Factorial Time**: Resource requirements grow with the factorial of $n$. For example, any algorithm that generates all permutations of an array of $n$ items is $O(n!)$.
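The classes above can be illustrated with one small function each. This is only a sketch: the function names are made up for this example, and each function is one of many possible instances of its class rather than a canonical one.

```python
from itertools import permutations


def first_item(items):
    """O(1): one step regardless of how many items there are."""
    return items[0]


def binary_search_steps(sorted_items, target):
    """O(log n): each comparison halves the remaining search range.

    Returns the number of comparisons made, not the position found.
    """
    low, high, steps = 0, len(sorted_items) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


def linear_max(items):
    """O(n): looks at every item once."""
    best = items[0]
    for item in items[1:]:
        if item > best:
            best = item
    return best


def merge_sort(items):
    """O(n log n): about log n levels of merging, each moving n items.

    Returns (sorted_list, number_of_items_moved_while_merging).
    """
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_moves = merge_sort(items[:mid])
    right, right_moves = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, left_moves + right_moves + len(merged)


def all_pairs(items):
    """O(n^2): pairs every item with every item."""
    return [(a, b) for a in items for b in items]


def all_subsets(items):
    """O(2^n): each extra item doubles the number of subsets."""
    if not items:
        return [[]]
    rest = all_subsets(items[1:])
    return rest + [[items[0]] + subset for subset in rest]


def all_orderings(items):
    """O(n!): every permutation of the input."""
    return list(permutations(items))


data = [5, 3, 8, 1, 7, 2, 6, 4]
print("binary search comparisons:", binary_search_steps(sorted(data), 6))
print("merge sort moves:", merge_sort(data)[1])
print("pairs:", len(all_pairs(data)))
print("subsets:", len(all_subsets(data)))
print("orderings:", len(all_orderings(data)))
```

Even with only eight items, the sizes of the last three results differ by orders of magnitude, which is the gap the notation is meant to capture.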


```python
import math

import numpy as np
import plotly.graph_objects as go

# Small input sizes keep n! on the same chart as the slower-growing curves.
input_sizes = [1, 2, 3, 4, 5]

log_n = np.log(input_sizes)
linear = input_sizes
n_squared = np.square(input_sizes)
# Use the standard library's math.factorial; the np.math alias is not
# available in NumPy 2.x.
factorial = [math.factorial(n) for n in input_sizes]

# One line per scaling function, all plotted against the same input sizes.
fig = go.Figure()
fig.add_trace(
    go.Scatter(x=input_sizes, y=log_n, mode='lines', name='O(log n)')
)
fig.add_trace(
    go.Scatter(x=input_sizes, y=linear, mode='lines', name='O(n)')
)
fig.add_trace(
    go.Scatter(x=input_sizes, y=n_squared, mode='lines', name='O(n^2)')
)
fig.add_trace(
    go.Scatter(x=input_sizes, y=factorial, mode='lines', name='O(n!)')
)
fig.show()
```
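The chart stops at $n = 5$ and leaves out $n \log n$ and $2^n$. As a complement, the sketch below tabulates all of the scaling functions at a few slightly larger sizes using only the standard library. The helper name `growth_table` is made up for this example, and base-2 logarithms are used here, which differ from the natural logarithm in the plot only by a constant factor.

```python
import math


def growth_table(sizes):
    """Return one row per n holding the value of each common scaling function."""
    rows = []
    for n in sizes:
        rows.append({
            "n": n,
            "log n": math.log2(n),
            "n log n": n * math.log2(n),
            "n^2": n ** 2,
            "2^n": 2 ** n,
            "n!": math.factorial(n),
        })
    return rows


for row in growth_table([1, 5, 10, 20]):
    print("  ".join(f"{name}={value:,.0f}" for name, value in row.items()))
```

By $n = 20$ the factorial column is already far larger than every other column, which is why the plot has to use such small input sizes to keep all the curves visible.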