## What's the relationship between run-time, counting operations and Big-O?

### What makes a good program/code?

- Readable
- Functional - it works according to the problem specification (verified by test cases)
- Ethical Coding Practices
- Usability - UI/UX Principles
- Maintainability
- Efficiency
     - resource efficiency
          - Space
          - Runtime(Time)

### Runtime Efficiency
- How long does it take ? (Using Time as a unit)

In [None]:
### Find the sum of 1 to n with a loop, O(n)
def sum_n1 (n):
    ret=0
    for i in range(1,n+1):
        ret = ret + i
    return ret

In [None]:
### Find the Sum of 1 to n using formula for Arithmetic Series, O(1)
def sum_n2 (n):
    return n * (n + 1) // 2  # integer arithmetic keeps the result exact for large n

In [None]:
def sum_n3(n):
    return sum( range(1,n+1) )

In [None]:
n = 10**7

In [None]:
%%timeit -n 1 -r 5
sum_n1(n)

1.2 s ± 31.1 ms per loop (mean ± std. dev. of 5 runs, 1 loop each)


In [None]:
%%timeit -n 1 -r 1
sum_n2(n)

3.7 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [None]:
%%timeit -n 1 -r 5
sum_n3(n)

674 ms ± 10.2 ms per loop (mean ± std. dev. of 5 runs, 1 loop each)
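
The `%%timeit` magic above only works inside IPython/Jupyter. As a rough sketch of the same comparison in a plain script, the standard-library `timeit` module can be used. The function names `sum_loop`, `sum_formula` and `sum_builtin` are stand-ins introduced here for illustration, and the smaller `n` is an assumption chosen so the sketch finishes quickly; the absolute times will differ from machine to machine.

```python
import timeit

def sum_loop(n):
    """Add 1..n one term at a time (n additions in Python code)."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):
    """Closed form n(n+1)/2, using integer arithmetic."""
    return n * (n + 1) // 2

def sum_builtin(n):
    """Still n additions, but the loop runs inside the built-in sum()."""
    return sum(range(1, n + 1))

n = 10**5  # assumption: smaller than the notebook's 10**7 to keep this quick
for func in (sum_loop, sum_formula, sum_builtin):
    # best of 3 repeats, 1 call per repeat
    seconds = min(timeit.repeat(lambda: func(n), number=1, repeat=3))
    print(f"{func.__name__:12s} {seconds:.6f} s")
```

All three functions return the same value; only the amount of work differs.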


### Misconception

The word *runtime* is misleading because it suggests using time as the unit of measurement. The time it takes to execute code depends mostly on the CPU, compiler, language, memory caching and other factors that have nothing to do with the algorithm.

Instead, what we really want to measure is the workload generated by the algorithm for a particular input size. Workload is measured by the number of primitive operations that need to be performed.

### How many primitive operations ? (Using integer value as a unit)

- A primitive operation is an operation that a CPU can perform

In [None]:
import dis
dis.dis(sum_n1)

In [None]:
import dis
dis.dis(sum_n2)

In [None]:
import dis
dis.dis(sum_n3)

### Counting Operations

To count the number of operations in a block of code, we **add** up the operations of the statements that run one after another.

Let's use a function $f(n)$ to denote this count with respect to the input size $n$.

### Example

How many operations are there in the following block of code?

In [None]:
n = 5
operations = 0
a = n/2
operations+=1
b = 1+n
operations+=1
print(a * b, end="")
operations+=1
print(f"\nOperations count={operations}")

15.0
Operations count=3


$f(n) = 3$

In [None]:
n=5
operations = 0
for i in range(n):  # n times
    operations+=1
    print('a',end=",")

for i in range(n):  # n times
    operations+=1
    print('b',end=",")
print(f"\nOperations count={operations}")

$f(n) = 2n$

On the other hand, if we have nested statements or loops, we **multiply** the number of operations.

### Example
How many operations are there in the following block of code?

In [None]:
n = 5
operations = 0
for i in range(n): # n times
    operations+=1
    for j in range(n): # n times
        operations+=1
        print('a',end=",")
    print()
print(f"\nOperations count={operations}")

a,a,a,a,a,
a,a,a,a,a,
a,a,a,a,a,
a,a,a,a,a,
a,a,a,a,a,

Operations count=30


$f(n) = n^2 + n$ , 5 outer + 25 inner

#### What about this?

In [None]:
def find_duplicates_with_early_exit(arr):
    """
    Find if there are any duplicate elements in the array.
    Different input values lead to different execution paths.
    """
    n = len(arr)
    operations = 0  # the two setup statements are ignored for simplicity; otherwise initialise operations to 2

    # Outer loop - always executes
    for i in range(n):
        operations += 1  # Outer loop iteration

        # Inner loop - execution depends on input values
        for j in range(i + 1, n):
            operations += 1  # Comparison operation

            if arr[i] == arr[j]:
                print(f"Duplicate found at positions {i}, {j}! Operations: {operations}")
                return True  # EARLY EXIT - different execution path!

    print(f"No duplicates found. Operations: {operations}")
    return False


#### Test with arrays of the same size (n=5) but different values

In [None]:
# BEST CASE: Duplicate found immediately

print("BEST CASE - Duplicate in first two elements:")
best_case = [1, 1, 3, 4, 5]
find_duplicates_with_early_exit(best_case)
print()


BEST CASE - Duplicate in first two elements:
Duplicate found at positions 0, 1! Operations: 2



$f(n) = 2$: 1 outer iteration + 1 comparison

In [None]:
# AVERAGE CASE: Duplicate found after more comparisons
# Need to position duplicate further to get more operations
print("AVERAGE CASE - Duplicate found after several comparisons:")
avg_case = [1, 2, 3, 4, 1]  # duplicate at positions 0 and 4 (last element)
find_duplicates_with_early_exit(avg_case)
print()


AVERAGE CASE - Duplicate found after several comparisons:
Duplicate found at positions 0, 4! Operations: 5



1 outer iteration + 4 comparisons

$f(n) = 1 + (n-1) = n$

In [None]:
# WORST CASE: No duplicates, must check all pairs
#
print("WORST CASE - No duplicates, check all pairs:")
worst_case = [1, 2, 3, 4, 5]
find_duplicates_with_early_exit(worst_case)
print()



5 outer iterations + (4 + 3 + 2 + 1) comparisons

$f(n) = n + \frac{n-1}{2}\left(1 + (n-1)\right)$

$f(n) = \frac{1}{2}(n^2 + n)$


### So why don't we simply count operations to measure time complexity, now that the count is independent of CPU/hardware?

- The actual code execution for a given input size $n$ will take different ***paths***
    - You can use a trace table/tree to trace the execution paths
    - In the above example, for the same array size, the algorithm will have different execution paths depending on the order of the values in the array.
- which means that for the same code and input size:

$\begin{aligned}
    & f(n) = 2 \\
    & f(n) = n \\
    & f(n) = \frac{n^2}{2} + \frac{n}{2} \\
\end{aligned}$
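
To see these three counts side by side for more than one input size, here is a small sketch. The helper `count_operations` is a hypothetical variant of `find_duplicates_with_early_exit` that returns the counter instead of printing it; the three test arrays are built to mirror the best, intermediate and worst cases used above.

```python
def count_operations(arr):
    """Same loops as find_duplicates_with_early_exit, but returns the count."""
    n = len(arr)
    operations = 0
    for i in range(n):
        operations += 1              # outer loop iteration
        for j in range(i + 1, n):
            operations += 1          # comparison
            if arr[i] == arr[j]:
                return operations    # early exit on the first duplicate
    return operations

for n in (5, 10, 100):
    best = [1, 1] + list(range(2, n))     # duplicate at positions 0 and 1
    middle = list(range(1, n)) + [1]      # duplicate at positions 0 and n-1
    worst = list(range(1, n + 1))         # no duplicates at all
    print(n, count_operations(best), count_operations(middle), count_operations(worst))
# prints:
# 5 2 5 15
# 10 2 10 55
# 100 2 100 5050
```

The worst-case column matches $\frac{1}{2}(n^2 + n)$, while the other two columns stay at $2$ and $n$ for the same input size.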

## The Big-O Notation

We can use Big-O notation to represent the above set of functions $f(n)$:

$f(n) \in O(g(n))$ if there exist constants $c > 0$ and $n_0 \ge 0$ such that

$f(n) \le c \cdot g(n)$ for all $n \ge n_0$

**In words,**

- $f(n) \in O(g(n))$ means $f(n)$ is in the set of functions that grow no faster than $g(n)$, up to a constant factor, for large $n$

OR

- $f(n)$ is asymptotically bounded above by $g(n)$
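
As a concrete instance of the definition, take the worst-case count $f(n) = \frac{1}{2}(n^2 + n)$ from the duplicate-finding example and $g(n) = n^2$. One possible choice of constants is $c = 1$ and $n_0 = 1$, because $\frac{1}{2}(n^2 + n) \le n^2$ is equivalent to $n \le n^2$, which holds for every $n \ge 1$. The sketch below spot-checks that choice numerically; a finite check like this illustrates the inequality but does not replace the algebra.

```python
def f(n):
    """Worst-case operation count from the duplicate-finding example."""
    return (n * n + n) // 2   # n*n + n is always even, so this is exact

def g(n):
    """Candidate bounding function."""
    return n * n

c, n0 = 1, 1  # one possible pair of constants; many other pairs also work
holds = all(f(n) <= c * g(n) for n in range(n0, 10_000))
print(holds)  # prints: True
```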


[Asymptotic Graph of $O(n^2)$](https://claude.ai/public/artifacts/55b7565d-1174-432e-87a7-6fd324c1c3fc)

### To simplify our analysis
-  ***we shall just consider how the runtime of an algorithm grows in relation to the size of the input***
- Big-O notation is used to describe the **order of growth** by classifying groups of functions using an asymptotic function

- so we measure time complexity by looking at how the workload grows in relation to the input size


Two further very important principles in working with the Big-O notation:
- Constant factors don't matter. In other words, if $f\left(n\right)$ is $O\left(c \cdot g\left(n\right)\right)$ for some constant $c>0$, then $f\left(n\right)$ is $O\left(g\left(n\right)\right)$, i.e. we ignore the multiplicative constants.
- The low-order terms don't matter. For example, if $f\left(n\right)$ is $O\left(n^3+n\right)$, then $f\left(n\right)$ is $O\left(n^3\right)$. In particular, we can ignore the additive constants as well.


## Example

Determine the order of growth of algorithms with the following running times:
- $n^2+2n+2$ is $O(n^2)$
- $n^2+10000n+3^{10000}$ is $O(n^2)$
- $\log(n)+n+4$ is $O(n)$
- $0.0001n\log(n)+300n$ is $O(n\log n)$, because $n\log n$ eventually outgrows $n$ no matter how small its coefficient is
- $2n^{30}+3^n$ is $O(3^n)$
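
One informal way to see why low-order terms can be dropped is to look at the ratio $f(n)/g(n)$ as $n$ grows: if it settles towards a constant, $g(n)$ captures the order of growth. The sketch below does this for two of the running times above. The helper name `ratio_table` and the sample sizes are assumptions made for this illustration.

```python
import math

def ratio_table(f, g, sizes=(10, 1_000, 100_000, 10_000_000)):
    """Return f(n)/g(n) for each n in sizes."""
    return [f(n) / g(n) for n in sizes]

# n^2 + 2n + 2 compared with n^2
quadratic = ratio_table(lambda n: n**2 + 2*n + 2, lambda n: n**2)
# log(n) + n + 4 compared with n
linear = ratio_table(lambda n: math.log(n) + n + 4, lambda n: n)

print(quadratic)
print(linear)
```

In both lists the ratios shrink towards 1 as $n$ grows, so the extra terms stop mattering for large inputs.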


The following table lists the common running times for algorithms and their names.

<center>

| Big-O | Name |
|-|-|
| $O\left(1\right)$ | constant |
| $O\left(\log n \right)$ | logarithmic |
| $O\left(n\right)$ | linear |
| $O\left(n\log n\right)$ | log-linear |
| $O\left(n^2\right)$ | quadratic |
| $O\left(n^k\right)$ | polynomial, $k\in \mathbb{Z}^+$ |
| $O\left(k^n\right)$ | exponential, $k > 1$ |

</center>

The entries in the table above are arranged in order of decreasing <b>efficiency</b> (increasing order of growth), i.e. the lower an entry's position in the table, the slower the running time.

For most cases, the following holds for algorithms:
- $O(1)$ - algorithm doesn't depend on input size
- $O(\log n)$ - problem gets reduced in half each time through the process
- $O(n)$ - simple iterative or recursive programs
- $O(n^k)$ - nested loops or recursive calls
- $O(k^n)$ - multiple recursive calls at each level
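
To illustrate the last rule of thumb, here is a sketch that counts the calls made by a naive recursive Fibonacci function, which makes two recursive calls at each level. This example is not from the notes above; `fib_calls` is a hypothetical helper written for this illustration. The call count stays below $2^n$, so the algorithm is $O(2^n)$, although the true growth base is smaller (about 1.618).

```python
def fib_calls(n):
    """Return (fib(n), number of calls made by the naive recursion)."""
    if n < 2:
        return n, 1                      # base case: a single call
    a, calls_a = fib_calls(n - 1)        # first recursive call
    b, calls_b = fib_calls(n - 2)        # second recursive call
    return a + b, 1 + calls_a + calls_b

for n in (5, 10, 15, 20):
    value, calls = fib_calls(n)
    print(n, value, calls, 2**n)
```

Each time $n$ increases by 5 the number of calls grows roughly elevenfold, which is the signature of exponential growth.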

### What is the runtime complexity of the following code?

In [None]:
## Ex 1 O(n)
def f(n):
    x = n
    while ( x > 0 ): # n,n-1,n-2, .. 1
        x = x - 1

In [None]:
## Ex 2 O(log n)
def f(n):
    i = n
    while ( i > 0 ): # n, n/2, (n/2)/2, .. 1
        i = i // 2

#### Number of times line 5 executes
$n, \dfrac{n}{2^1}, \dfrac{n}{2^2}, \dfrac{n}{2^3}, \dots, 1$

After $x$ halvings the value reaches 1:

$\dfrac{n}{2^x} = 1$

$n = 2^x$

$\log_2 n = \log_2 2^x$

$x = \log_2 n$

The base of the logarithm does not matter. For any base $k > 1$ we can express,

$\log_2 n = \dfrac{\log_k n}{\log_k 2}$

$\log_2 n = \dfrac{1}{\log_k 2}\log_k n$

$\log_2 n = C\log_k n$, where $C = \dfrac{1}{\log_k 2}$ is a constant

Therefore we can say, $\log_2 n$ is $O(\log n)$
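
The derivation can be checked by counting how often the loop body of Ex 2 actually runs. In this sketch the helper name `halving_steps` is introduced for illustration. Because `//` rounds down and the loop only stops at 0, the exact count for $n \ge 1$ is $\lfloor \log_2 n \rfloor + 1$, which equals Python's `int.bit_length()`.

```python
def halving_steps(n):
    """Count how many times the body of the Ex 2 loop runs."""
    steps = 0
    i = n
    while i > 0:
        i = i // 2
        steps += 1
    return steps

for n in (1, 8, 1000, 10**6):
    print(n, halving_steps(n), n.bit_length())
# prints:
# 1 1 1
# 8 4 4
# 1000 10 10
# 1000000 20 20
```

Going from 1,000 to 1,000,000 multiplies the input size by a thousand but only adds 10 steps.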


In [None]:
## Ex 3
def f(n): ## O(n**2)
    for i in range(n-1): ## i = 0, 1, ..., n-2: (n-1) times
        for j in range(n-1-i): ## n-1, n-2, ..., 1 times
            print(f"{'*'*n}")

f(2)

In [None]:
## Ex 4
def f(n): ## O(n**2)
    x = n
    while ( x > 0 ): # n times
        y = n
        while ( y > 0 ): # n times
            y = y - 1
        x = x - 1

In [None]:
## Ex 5
def f(n): ## O(nlogn)
    x = n
    while ( x > 0 ): # n times
        y = n
        while ( y > 0 ): # n, n/2, (n/2)/2, .. 1: about log n times
            y = y // 2
        x = x - 1
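
As a check on the $O(n\log n)$ claim for Ex 5, the sketch below counts how many times the inner loop body runs. The name `count_inner_steps` is introduced here for illustration. The outer loop runs $n$ times and each inner loop runs $\lfloor \log_2 n \rfloor + 1$ times, so the total should equal `n * n.bit_length()`.

```python
def count_inner_steps(n):
    """Count executions of the inner-loop body in Ex 5."""
    steps = 0
    x = n
    while x > 0:            # n times
        y = n
        while y > 0:        # floor(log2(n)) + 1 times
            y = y // 2
            steps += 1
        x = x - 1
    return steps

for n in (4, 64, 1024):
    print(n, count_inner_steps(n), n * n.bit_length())
# prints:
# 4 12 12
# 64 448 448
# 1024 11264 11264
```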
