# Code Optimization

Code optimization is the process of making an application work more efficiently, usually without modifying its functionality and accuracy. Code optimization is usually concerned with the speed of processing, but can also be used to minimize the usage of different resources, such as memory, disk space, or network bandwidth.

Jaworski, Michal, and Tarek Ziadé. *Expert Python Programming*. Packt Publishing, 2019.


Even the best logging, metrics, and tracing systems will give you only a rough overview of a performance problem. If you decide to fix it, you will have to perform a careful profiling process that uncovers detailed resource usage patterns.

What are the main performance killers?
- Excessive complexity
- Excessive resource allocation and resource leaks
- Excessive I/O and blocking operations


## Code complexity
The two most popular ways to define application complexity are as follows:

**Cyclomatic complexity** - a measure of the number of independent paths through the code, which is very often correlated with application performance.

**The Landau notation** - also known as big O notation, an algorithm classification method that is useful for objectively judging code performance.

## Cyclomatic complexity
Short and sweet: higher complexity usually means lower performance.

|Cyclomatic complexity value|Complexity class|
|:---|:---|
|1 to 10|Not complex|
|11 to 20|Moderately complex|
|21 to 50|Really complex|
|Above 50|Too complex|

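As a rough illustration of how such a value can be obtained, the sketch below counts branching constructs with the standard library's `ast` module. The function name `approx_cyclomatic_complexity`, the chosen node types, and the sample code are assumptions made for this example; dedicated tools use more complete rules, so treat the number as an approximation.

```python
import ast

# Node types treated as decision points in this simplified count (an assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source):
    """Return 1 plus the number of branching constructs found in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # every extra operand of and/or adds one more possible path
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(values):
    result = []
    for v in values:
        if v < 0 and v % 2:
            result.append("negative odd")
        elif v == 0:
            result.append("zero")
        else:
            result.append("other")
    return result
'''

print(approx_cyclomatic_complexity(SAMPLE))  # prints 5
```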
## The big O notation
Big O notation describes how the running time (or memory use) of an algorithm grows with the size of the input.
To determine it, all constants and low-order terms are dropped in order to focus on the term that really matters when the size of the input data grows very large.

The most common complexity classes are:

| Big O        | Name                | Example                |
|--------------|---------------------|------------------------|
| O(1)         | Constant time        | Accessing an array element |
| O(log n)     | Logarithmic time     | Binary search           |
| O(n)         | Linear time          | Iterating over a list   |
| O(n log n)   | Linearithmic time    | Efficient sorting (e.g., mergesort) |
| O(n²)        | Quadratic time       | Nested loops over data  |
| O(2ⁿ), O(n!) | Exponential / factorial | Recursive combinatorics |

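To make the difference between two rows of the table tangible, the following sketch counts how many elements a linear scan and a binary search inspect in the worst case. The helper names `linear_search_steps` and `binary_search_steps` are hypothetical and exist only for this illustration.

```python
def linear_search_steps(items, target):
    """Return how many elements are inspected before the target is found."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items, target):
    """Return how many midpoints are inspected in a sorted list."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


for n in (1_000, 1_000_000):
    data = list(range(n))
    worst = n - 1  # the last element is the worst case for the linear scan
    print(n, linear_search_steps(data, worst), binary_search_steps(data, worst))
```

The linear count grows in step with n, while the binary search count grows only with the logarithm of n.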
### Example
```python
def function(n):
    for i in range(n):
        print(i)

```

The `print()` function is executed n times, therefore the function is O(n).

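For contrast, a small sketch of a quadratic case: nesting one loop inside another makes the inner statement run n * n times. The counter functions below are illustrative names, not part of the original material.

```python
def count_linear(n):
    """One loop: the body runs n times, so the function is O(n)."""
    ops = 0
    for i in range(n):
        ops += 1
    return ops


def count_quadratic(n):
    """Two nested loops: the body runs n * n times, so the function is O(n^2)."""
    ops = 0
    for i in range(n):
        for j in range(n):
            ops += 1
    return ops


# doubling n doubles the linear count but quadruples the quadratic one
for n in (10, 20, 40):
    print(n, count_linear(n), count_quadratic(n))
```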
## Profiling CPU usage

There are two ways to profile the code:

**Macro-profiling** - This profiles the whole program while it is being used and generates statistics.

**Micro-profiling** - This measures a precise part of the program by instrumenting it manually.

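Manual instrumentation can be as simple as wrapping the code of interest with a timer. The sketch below is one possible approach using `time.perf_counter`; the `timed` decorator and its `durations` attribute are assumptions made for this example, not a standard API.

```python
import time
from functools import wraps


def timed(func):
    """Decorator that records the wall-clock duration of each call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # store the elapsed time even if the call raises
            wrapper.durations.append(time.perf_counter() - start)
    wrapper.durations = []
    return wrapper


@timed
def light():
    time.sleep(0.001)


for _ in range(5):
    light()

mean = sum(light.durations) / len(light.durations)
print(f"calls: {len(light.durations)}, mean: {mean:.4f} s")
```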



## Macro-profiling
Several tools can do this; two of them are available in Python's standard library:

- `profile`: A pure-Python profiler suitable for teaching or light use.
- `cProfile`: A C-optimized profiler, more efficient and widely used in practice.

In [None]:
import time
class runIT(object):
    def __init__(self):
        for i in range(5): 
            self.heavy() 

    def medium(self): 
        time.sleep(0.01) 
     
    def light(self): 
        time.sleep(0.001) 
     
    def heavy(self): 
        for i in range(100): 
            self.light() 
            self.medium() 
            self.medium() 
        time.sleep(2) 

In [None]:
%%prun -s cumulative -q -l 10 -T prun0
runs = runIT()

In [None]:
print(open('prun0', 'r').read())

The meaning of each column is as follows:


| Column        | Description                                                                 |
|---------------|-----------------------------------------------------------------------------|
| `ncalls`      | Total number of calls to the function                                        |
| `tottime`     | Total time spent in the function (excluding subcalls)                        |
| `percall`     | `tottime` divided by `ncalls` (avg time per direct call)                    |
| `cumtime`     | Cumulative time including all subcalls                                      |
| `percall`     | `cumtime` divided by `ncalls` (avg time including subcalls)                 |


In [None]:
import cProfile
profiler = cProfile.Profile()
profiler.runcall(runIT)
profiler.print_stats()

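Outside a notebook, the report can be sorted and trimmed with the standard library's `pstats` module. The following is a minimal sketch under the assumption that a small stand-in workload (`light` and `heavy` below) is enough to show the idea; it sorts by the `cumtime` column and keeps the first five rows.

```python
import cProfile
import io
import pstats
import time


def light():
    time.sleep(0.001)


def heavy():
    for _ in range(10):
        light()


profiler = cProfile.Profile()
profiler.runcall(heavy)

# write the report into a string buffer instead of stdout
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)

report = buffer.getvalue()
print(report)
```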
## Profiling your code line-by-line with line_profiler

Sometimes we need an even more detailed analysis of code performance.

This is particularly useful when optimizing algorithms, loops, or numerically intensive routines.

To profile code line-by-line, we need an external Python module named `line_profiler`.


```bash
pip install line_profiler
```


In [None]:
%%writefile runIT.py 
import time
def medium(): 
    time.sleep(0.01) 
 
def light(): 
    time.sleep(0.001) 
 
def heavy(): 
    for i in range(100): 
        light() 
        medium() 
        medium() 
    time.sleep(2) 
 
def runit(n): 
    for i in range(n): 
        heavy()

In [None]:
from runIT import runit

In [None]:
import numpy as np
%load_ext line_profiler

The `line_profiler` package integrates with IPython via the `%lprun` magic command.

In [None]:
#%lprun -T lprof0 -f simulate simulate(50)
%lprun -T lprof0 -f runit runit(5)

In [None]:
print(open('lprof0', 'r').read())

In [None]:
#%%writefile rates.py
# condensed from Lecture 11
import threading
import requests
import json
def fetch_rate(bases, symbols =['eur','jpy','usd'] ):
    for base in bases:
        web = "http://www.floatrates.com/daily/"+str(base)+".json"
        response = requests.get(web)
        rate = response.json()
        rate[base]= {'rate':1}
        
        #create a line to output the rate
        rates_line = ", ".join(
            [f"{symbol}{float(rate[symbol]['rate']):10.04}" 
             for symbol in symbols]
        )
        print(f"{base} = {rates_line}")

In [None]:
%lprun -T lprof1 -f fetch_rate fetch_rate(['eur','jpy','usd','rub','cad'])

In [None]:
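# to profile multiple functions within the code you can do the following
from line_profiler import LineProfiler
from runIT import light, medium, heavy, runit

lp = LineProfiler()
lp.add_function(light)
lp.add_function(medium)
lp.add_function(heavy)
lpp = lp(runit)  # wrap the entry point so it is profiled as well
lpp(1)

lp.print_stats()
# also save the report so that the next cell can read it back
with open('lprof2', 'w') as f:
    lp.print_stats(stream=f)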


In [None]:
print(open('lprof2', 'r').read())

# Profiling the Memory Usage

Memory profiling helps identify unnecessary allocations that can degrade long-running systems.

Writing memory-efficient code is critical for performance, especially in high-throughput or data-intensive applications (e.g., when working with large NumPy arrays or data frames).

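When installing an extra package is not an option, the standard library's `tracemalloc` module gives a comparable first impression. Note that this is a different tool from `memory_profiler`, shown here only as a sketch; the `build_list` function is a hypothetical workload.

```python
import tracemalloc


def build_list(n):
    """Allocate n small strings so there is something to measure."""
    return [str(i) for i in range(n)]


tracemalloc.start()
data = build_list(100_000)
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")

# the three source lines responsible for the most allocated memory
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```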

The `memory_profiler` package integrates with IPython via the `%mprun` and `%memit` magic commands: `%mprun` reports the memory usage of a function line by line, and `%memit` measures a single statement.


```bash
pip install memory_profiler
```

In [None]:
%load_ext memory_profiler
from rates import fetch_rate

In [None]:
%mprun -T mprof0 -f fetch_rate fetch_rate(['eur','jpy','usd','rub','cad'])

In [None]:
%mprun -T mprof1 -f runit runit(5)

In [None]:
%mprun -T mprof2 -f my_func my_func()

The `memory_profiler` IPython extension also comes with a `%memit` magic command that lets us benchmark the memory used by a single Python statement.

In [None]:
%%memit 
with open('ItemData.json') as f:
    items = json.load(f)


# Practical Speed Improvements

In [None]:
import random
l = [random.normalvariate(0,1) for i in range(100000)]
print(type(l))

In [None]:
# function that computes the sum of all numbers in the list
def sum1():
    res = 0
    for i in range(len(l)):
        res = res + l[i]
    return res
a = %timeit sum1()

In [None]:
# same function, but using the fact that Python can iterate over
# the elements of a list with "for x in l" instead of using an index
def sum2():
    res = 0
    for x in l:
        res = res + x
    return res
b = %timeit sum2()

In [None]:
# using Python's built-in function to
# compute the sum of all elements in a list
def sum3():
    return sum(l)
c = %timeit sum3()

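The same comparison can be run outside IPython with the standard library's `timeit` module. This is a sketch under the assumption that 20 repetitions are enough for a rough figure; the names `values`, `sum_index`, and `sum_builtin` are chosen for this example only.

```python
import random
import timeit

values = [random.normalvariate(0, 1) for _ in range(100_000)]


def sum_index():
    """Index-based loop, comparable to sum1 above."""
    res = 0
    for i in range(len(values)):
        res = res + values[i]
    return res


def sum_builtin():
    """Built-in sum, comparable to sum3 above."""
    return sum(values)


repeats = 20
for func in (sum_index, sum_builtin):
    seconds = timeit.timeit(func, number=repeats) / repeats
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```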
## Strings

In [None]:
strings = ['%.3f' % x for x in l]

In [None]:
# a function concatenating all strings in the list
def concat1():
    cat = strings[0]
    for s in strings[1:]:
        cat = cat + ', ' + s
    return cat
%timeit concat1()

In [None]:
# a function using Python's built-in str.join method
def concat2():
    return ', '.join(strings)

%timeit concat2()

In [None]:
print(concat1()[:24])
print(concat2()[:24])

In [None]:
print(strings[:3])
print(l[:3])

I hope this shows you the power of built-ins such as `sum()` and `str.join()`, which run their loops in C instead of in the Python interpreter.

# OMG ITS SO FAST

[Numba](https://numba.pydata.org/) is a Just-in-Time (JIT) compiler that translates a subset of Python (primarily numerical code) into optimized machine code using LLVM.


Speedups compared to pure Python code can reach one to three orders of magnitude (10x to 1000x) and may even outmatch manually vectorized NumPy code.

In this section, we will show you how to accelerate pure Python code generating a Mandelbrot fractal.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
size = 400
iterations = 100

To enable JIT acceleration, decorate your functions with:

```python
from numba import jit

@jit
def compute(x):
    # code goes here
    return x
```


In [None]:
def mandelbrot_python(size, iterations):
    m = np.zeros((size, size))
    for i in range(size):
        for j in range(size):
            c = (-2 + 3. / size * j +
                 1j * (1.5 - 3. / size * i))
            z = 0
            for n in range(iterations):
                if np.abs(z) <= 10:
                    z = z * z + c
                    m[i, j] = n
                else:
                    break
    return m

In [None]:
m = mandelbrot_python(size, iterations)
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
ax.imshow(np.log(m), cmap=plt.cm.hot)
ax.set_axis_off()

In [None]:
%timeit mandelbrot_python(size, iterations)

In [None]:
#!pip install numba
from numba import jit

In [None]:
# compile the same function with Numba; the first call triggers the compilation
mandelbrot_numba = jit(mandelbrot_python)
m = mandelbrot_numba(size, iterations)

In [None]:
%timeit mandelbrot_numba(size, iterations)  # do it again with Numba

## NOTE on JIT Compilation: 
Python bytecode is normally interpreted at runtime by the Python interpreter (most often, CPython). 

Numba functions are instead parsed and translated directly to machine code just before they are first executed, using a powerful compiler architecture named LLVM (originally short for Low Level Virtual Machine).

Numba generally gives the most impressive speedups on functions that involve tight loops on NumPy arrays.



To be fully compiled, a `@jit` function can only call code that is itself decorated with `@jit` or that has a known replacement inside Numba.

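As a small sketch of that rule, the tight loop below calls a helper that is itself compiled. `njit` is Numba's shorthand for nopython-mode `jit`; the fallback to plain Python when Numba is not installed, and the names `square` and `sum_of_squares`, are assumptions made so the example runs anywhere.

```python
try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: run as plain Python when Numba is missing.
    def njit(func):
        return func


@njit
def square(x):
    return x * x


@njit
def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += square(i)  # the helper is compiled too, so this call is allowed
    return total


print(sum_of_squares(10))  # prints 285
```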
Please see http://numba.pydata.org/numba-doc/latest/user/5minguide.html#will-numba-work-for-my-code for a quick description of what Numba supports.