## Profiling (7 pts + 2 bonus pts)

Before we go any further and look at how vectorization makes your program faster, we need to talk about profiling. Profiling is the act of measuring the performance of a program, either by timing it or by looking at its memory usage, depending on what you are trying to measure.

(Follow the instructions here: https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html to set up the profilers.)

# **Remember to save your file after generating all the required results. Then we can directly see your results.**

### Time

This is the most common profiler. In Python code, you import the time module and record the start and end times. In IPython, we can use the %time, %%time, and %%timeit magics.

In [2]:
%time?
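Outside IPython, the same start/end measurement can be sketched with the standard time module. This is a minimal sketch, not part of the assignment: sum_of_squares is a hypothetical workload chosen only for illustration, and time.perf_counter() is used here as the wall-clock source.

```python
import time

def sum_of_squares(n):
    """Hypothetical workload: add up i*i for every i below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()            # wall-clock reference point
result = sum_of_squares(100_000)
elapsed = time.perf_counter() - start  # seconds between the two readings

print(f"result={result}, wall time={elapsed:.6f} s")
```

A single reading like this is noisy; %%timeit repeats the statement many times for that reason.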

# Question 1 (0.5 pts)
Run the following code and explain its output

In [1]:
%%time
total = 0
for i in range(1000):
    for j in range(1000):
        total += i * (-1) ** j

CPU times: user 1.48 s, sys: 0 ns, total: 1.48 s
Wall time: 1.48 s


#### Your answer goes here

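In this case, the cell printed the time needed to execute the given statements. The user and sys values are the CPU time spent in user code and in the kernel, total is their sum, and Wall time is the elapsed real time. Here CPU time and wall time are both about 1.48 s, so the loop spent essentially all of its time computing rather than waiting.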

# Question 2 (0.5 pts)
There are two blocks of code below performing the same function on a given input. Explain why the second sort is much faster.

In [3]:
import random
L = [random.random() for i in range(100000)]
%time L.sort()

CPU times: user 124 ms, sys: 0 ns, total: 124 ms
Wall time: 138 ms


In [4]:
%time L.sort()

CPU times: user 16 ms, sys: 0 ns, total: 16 ms
Wall time: 19.6 ms


Your answer goes here
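The difference is that the first call sorts a random, unsorted list, while the second call sorts a list that is already sorted, because L.sort() sorts in place. Python's built-in sort (Timsort) detects runs that are already in order, so an already sorted list needs only about one pass of comparisons, which explains the discrepancy.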
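One way to check this explanation without relying on noisy timings is to count comparisons. The sketch below is an illustration under assumptions: Counted and count_sort_comparisons are hypothetical helpers, and the list is shorter than the one in the cells above to keep it quick.

```python
import random

class Counted:
    """Wraps a value and counts how many '<' comparisons list.sort() makes."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value

def count_sort_comparisons(values):
    """Sort a wrapped copy of values and return the number of comparisons used."""
    Counted.comparisons = 0
    wrapped = [Counted(v) for v in values]
    wrapped.sort()
    return Counted.comparisons

random.seed(0)
data = [random.random() for _ in range(10_000)]

random_count = count_sort_comparisons(data)          # unsorted input
sorted_count = count_sort_comparisons(sorted(data))  # already sorted input

print("comparisons on random input:", random_count)
print("comparisons on sorted input:", sorted_count)
```

The sorted input should need far fewer comparisons, which is consistent with the drop from 124 ms to 16 ms seen above.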

# Question 3 (1 pts)
Use Python memory_profiler to profile your own code and explain the results

In [12]:
!pip3 install memory_profiler

Defaulting to user installation because normal site-packages is not writeable


In [13]:
%load_ext memory_profiler

The memory_profiler extension is already loaded. To reload it, use:
  %reload_ext memory_profiler


In [19]:
def sum_of_lists(N):
    total = 0
    for i in range(10):
        L = [i ^ (i) for i in range(N)]
        total -= sum(L)
    return total
%prun sum_of_lists(300000)

 

Your answer goes here

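%prun outputs a table of function calls ordered by total time, which shows where the execution spends the most time. In this case it is the list comprehension, <ipython-input-19-68e8321615cf>:4(<listcomp>). Note that %prun reports time rather than memory; the memory_profiler magics for memory are %memit and %mprun.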

# Question 4 (7 pts)
Run the following code to measure execution time and memory usage, then answer the following questions.
Note: Make sure to install any missing Python packages.

1. This code snippet defines and runs a simple Python function hello() that prints 'hello world!'. It also employs the memory_profiler module to profile the memory usage of the hello() function with a specified precision.

In [30]:
%%time
%%file helloworld.py
from memory_profiler import profile

@profile(precision=4)
def hello():
	print("hello world!") 

hello()

Overwriting helloworld.py


In [29]:
%run -i helloworld.py

hello world!
Filename: /home/eric/Downloads/helloworld.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     3  75.6211 MiB  75.6211 MiB           1   @profile(precision=4)
     4                                         def hello():
     5  75.6211 MiB   0.0000 MiB           1   	print("hello world!") 




2. This code snippet demonstrates memory profiling for a Python function my_func() that creates, manipulates, and deletes large lists, showcasing how memory usage changes with these operations.

In [36]:
%%time
%%file expressions.py
from memory_profiler import profile
@profile(precision=4)
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a
my_func()

Overwriting expressions.py
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 4 ms


In [23]:
%run -i expressions.py

Filename: /home/eric/Downloads/expressions.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     2  61.7695 MiB  61.7695 MiB           1   @profile(precision=4)
     3                                         def my_func():
     4  69.1094 MiB   7.3398 MiB           1       a = [1] * (10 ** 6)
     5 221.7344 MiB 152.6250 MiB           1       b = [2] * (2 * 10 ** 7)
     6  69.2891 MiB -152.4453 MiB           1       del b
     7  69.2891 MiB   0.0000 MiB           1       return a




3. This code snippet profiles memory usage of the function math_funcs(), which demonstrates the application of logarithmic, cosine, and reciprocal functions from the NumPy library on an array of numbers, and prints the results for each operation.

In [33]:
%%time
%%file math_funcs.py
from memory_profiler import profile
import math
import numpy as np

@profile(precision=4)
def math_funcs():
	inp_arr = [10, 20, 30, 40, 50] 
	print ("Array input elements:\n", inp_arr) 

	res_arr = np.log(inp_arr) 
	print ("Applying log function:\n", res_arr)

	res_arr2 = np.cos(inp_arr) 
	print ("Applying cos function:\n", res_arr2)

	res_arr3 = np.reciprocal(inp_arr) 
	print ("Applying reciprocal function:\n", res_arr3)


math_funcs()

Overwriting math_funcs.py
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.82 ms


In [25]:
%run -i math_funcs.py

Array input elements:
 [10, 20, 30, 40, 50]
Applying log function:
 [ 2.30258509  2.99573227  3.40119738  3.68887945  3.91202301]
Applying cos function:
 [-0.83907153  0.40808206  0.15425145 -0.66693806  0.96496603]
Applying reciprocal function:
 [0 0 0 0 0]
Filename: /home/eric/Downloads/math_funcs.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     5  75.2500 MiB  75.2500 MiB           1   @profile(precision=4)
     6                                         def math_funcs():
     7  75.2500 MiB   0.0000 MiB           1   	inp_arr = [10, 20, 30, 40, 50] 
     8  75.2539 MiB   0.0039 MiB           1   	print ("Array input elements:\n", inp_arr) 
     9                                         
    10  75.2539 MiB   0.0000 MiB           1   	res_arr = np.log(inp_arr) 
    11  75.2539 MiB   0.0000 MiB           1   	print ("Applying log function:\n", res_arr)
    12                                         
    13  75.2539 MiB   0.0000 MiB           1   	res_arr2 = np.co
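The zeros printed for the reciprocal step above come from the integer input: np.reciprocal keeps the integer dtype, so 1/x truncates to 0 for every x greater than 1. A minimal sketch of the difference, using the same five input values (the variable names are chosen here for illustration):

```python
import numpy as np

int_values = np.array([10, 20, 30, 40, 50])   # integer dtype, as in math_funcs()
float_values = int_values.astype(float)       # same values as floats

int_result = np.reciprocal(int_values)        # integer arithmetic: every entry is 0
float_result = np.reciprocal(float_values)    # floating-point reciprocals

print("integer input:", int_result)
print("float input:  ", float_result)
```

Converting the input to floats first, or writing 1.0 / int_values, gives the expected fractional values.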

4. This code snippet, using memory profiling, demonstrates a nested loop in Python where it iterates through combinations of adjectives and fruit names, printing each pair.

In [31]:
%%time
%%file loops.py
from memory_profiler import profile
import numpy as np
import ctypes
import math
import time

@profile(precision=4)
def my_loops():
    adj = ["red", "big", "tasty"]
    fruits = ["apple", "banana", "cherry"]

    for x in adj:
        for y in fruits:
            print(x, y)


my_loops()

Overwriting loops.py
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 3.4 ms


In [27]:
%run -i loops.py

red apple
red banana
red cherry
big apple
big banana
big cherry
tasty apple
tasty banana
tasty cherry
Filename: /home/eric/Downloads/loops.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     7  75.6094 MiB  75.6094 MiB           1   @profile(precision=4)
     8                                         def my_loops():
     9  75.6094 MiB   0.0000 MiB           1   	adj = ["red", "big", "tasty"]
    10  75.6094 MiB   0.0000 MiB           1   	fruits = ["apple", "banana", "cherry"]
    11                                         
    12  75.6094 MiB   0.0000 MiB           4   	for x in adj:
    13  75.6094 MiB   0.0000 MiB          12    		 for y in fruits:
    14  75.6094 MiB   0.0000 MiB           9      			 print(x, y)




## Question 4.1 (1.5 pts)
Modify each of the above functions to capture its execution time (both CPU and wall). You can modify the code directly, if required.

In [None]:
# You can modify the code in-place or re-write the code here
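One possible approach, sketched under assumptions rather than given as the required solution: a hypothetical timed decorator that records CPU time with time.process_time() and wall time with time.perf_counter(). It is applied here to a copy of my_func() with a smaller second list so the example stays light on memory.

```python
import functools
import time

def timed(func):
    """Hypothetical helper: report CPU and wall time for each call of func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        try:
            return func(*args, **kwargs)
        finally:
            cpu = time.process_time() - cpu_start
            wall = time.perf_counter() - wall_start
            wrapper.last_timing = (cpu, wall)  # kept for later inspection
            print(f"{func.__name__}: CPU {cpu * 1e3:.3f} ms, wall {wall * 1e3:.3f} ms")
    return wrapper

@timed
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 6)  # smaller than the original 2 * 10 ** 7
    del b
    return a

result = my_func()
```

The same decorator could be stacked with @profile(precision=4) on the other three functions, though the profiler's own overhead would then be included in the measured times.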

## Question 4.2 (1.5 pts)

What patterns did you notice across the above functions with respect to latency, memory usage, and code?


Your answer goes here

Lines that only assign hardcoded values or print did not change the reported memory usage (at most 0.0039 MiB for a print call), so in the rows shown hello(), math_funcs(), and my_loops() stay flat at their starting baseline of about 75 MiB. Memory usage only moved noticeably in my_func(): creating a added about 7.3 MiB, creating b added about 152.6 MiB, and del b released about 152.4 MiB. For latency, each %%time cell reported 4 ms of CPU time, with wall times of 4 ms, 1.82 ms, and 3.4 ms, so the wall time was at or below the CPU time. Because those %%time cells wrap %%file, they time writing the script rather than running the profiled function.

## Question 4.3 (2 pts)
Using the %time magic command, we can measure overall code execution time. Sometimes you need deeper insight to identify performance bottlenecks. Write your own code and profile its execution time line by line.

In [35]:
import random
%time values = [random.random()-50 for i in range(256)]
%time values.sort()
%time values.sort()


CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 186 µs
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 143 µs
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 50.5 µs


## (Bonus) Question 4.4 (2 pts)
Memory usage of a program can also be reported as a function of time. Profile memory of any of the above code as a function of time.
Submit your profile results and a plot of the results (Mem used vs Time).

In [None]:
# Your code goes here

Plot goes here
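A rough sketch of collecting memory-versus-time samples, using the standard tracemalloc module as a stand-in for memory_profiler. This is an assumption-laden illustration: tracemalloc only counts allocations made by Python, so its numbers will differ from the process-level values in the tables above, and sample is a hypothetical helper. The steps mirror my_func() with a smaller second list.

```python
import time
import tracemalloc

def sample(samples, start):
    """Hypothetical helper: append (seconds since start, traced MiB) to samples."""
    current, _peak = tracemalloc.get_traced_memory()
    samples.append((time.perf_counter() - start, current / 2 ** 20))

tracemalloc.start()
start = time.perf_counter()
samples = []

sample(samples, start)       # baseline
a = [1] * (10 ** 6)
sample(samples, start)       # after creating a
b = [2] * (2 * 10 ** 6)
sample(samples, start)       # after creating b
del b
sample(samples, start)       # after deleting b

tracemalloc.stop()

for seconds, mib in samples:
    print(f"{seconds:.6f} s  {mib:.3f} MiB")
```

The (time, MiB) pairs can then be passed to any plotting library to produce the Mem used vs Time plot.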