In [None]:
import time
import matplotlib.pyplot as plt
import numpy as np
import sys

# Making code faster

Code optimization is the process of modifying a program to make some aspect of it work more efficiently. In general, a computer program may be optimized to run faster or to consume fewer resources (e.g. CPU, memory, electricity).

Today, we will take a closer look at what can make our code run faster.

You will use the cell magic `%%timeit` to measure the runtime of your code. Placing `%%timeit` at the beginning of a code cell reports the mean time for executing the entire cell. The magic automatically chooses the number of executions needed to get sufficiently stable timings. Further documentation can be found [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit).
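Outside a notebook, the standard-library `timeit` module gives a comparable measurement. The following is a minimal sketch, not what `%%timeit` does internally: the helper name `mean_runtime` and its default counts are assumptions made for this illustration.

```py
import timeit

def mean_runtime(stmt, number=1000, repeat=5):
    """Return the mean time in seconds of one execution of `stmt`.

    `stmt` is a zero-argument callable. `number` and `repeat` are
    illustrative defaults (assumptions), not values chosen by %%timeit.
    """
    # timeit.repeat returns one total per repeat, each covering `number` executions
    totals = timeit.repeat(stmt, number=number, repeat=repeat)
    return sum(totals) / (repeat * number)

seconds = mean_runtime(lambda: sum(range(100)))
print(f"{seconds * 1e6:.2f} microseconds per execution")
```

Unlike `%%timeit`, this sketch uses fixed counts, so very fast or very slow statements may need different values of `number`.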

### 18.1. Compare three ways to combine 3 lists
Consider the three lists below, `list1`, `list2`, and `list3`, with the range size `n` set to `10`.
```py
list1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
list2 = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
list3 = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
```
Write a program that returns a list of tuples, where each tuple contains one element from each of the 3 lists. The indices should increment at the same time. For input lists with range size 10, the output list will be:

```py
combList = [(0, 10, 20),
 (1, 11, 21),
 (2, 12, 22),
 (3, 13, 23),
 (4, 14, 24),
 (5, 15, 25),
 (6, 16, 26),
 (7, 17, 27),
 (8, 18, 28),
 (9, 19, 29)]
```

Implement three different solutions using the following methods:
1. Using a standard loop-based approach
2. Using a list comprehension
3. Using the `zip` function

Experiment with the range size `n` set to the values `50, 100, 1000, 10000, 100000`.
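As a starting point, the three approaches might look like the sketch below. It is one possible shape rather than a reference solution, and the result names `comb_loop`, `comb_comp`, and `comb_zip` are assumptions introduced here.

```py
n = 10  # small range size so the result is easy to inspect
list1 = range(n)
list2 = range(n, 2 * n)
list3 = range(2 * n, 3 * n)

# 1. Standard loop: index all three sequences in step
comb_loop = []
for i in range(n):
    comb_loop.append((list1[i], list2[i], list3[i]))

# 2. List comprehension: same indexing, written as one expression
comb_comp = [(list1[i], list2[i], list3[i]) for i in range(n)]

# 3. zip: pairs elements by position without explicit indices
comb_zip = list(zip(list1, list2, list3))

# All three approaches build the same list of tuples
print(comb_loop == comb_comp == comb_zip)
```

When timing these, put each approach in its own `%%timeit` cell so the measurements stay separate.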

In [None]:
n = 50                   # increment n as described

list1 = range(n)         # for n = 50, range(50)
list2 = range(n,2*n)     # for n = 50, range(50,100)
list3 = range(2*n,3*n)   # for n = 50, range(100,150)

In [None]:
%%timeit
#for loops

In [None]:
%%timeit
#list comprehension

In [None]:
%%timeit
#zip

## Lists vs. Numpy arrays

### 18.2. Find the mean of elements in a list
Consider the list `py_list = list(range(1, n))`. Find the mean of the elements in the list.
Measure the runtime for range sizes `n` of `50, 100, 1000, 10000, 100000`.

In [None]:
n = 50
py_list = list(range(1, n))  # materialize the range so this really is a list

In [None]:
%%timeit
# mean using list

### 18.3.  Find the mean of elements in a numpy array
Turn the generated `py_list` into a numpy array. Find the mean of the elements in the numpy array.
Measure the runtime for range sizes `n` of `50, 100, 1000, 10000, 100000`.

- Compare the runtime for the operation on the numpy array vs. the python list
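The comparison can also be run as a plain script with the `timeit` module. This is a hedged sketch: the repetition count of 20 and the names `t_list` and `t_numpy` are arbitrary choices for illustration.

```py
import timeit
import numpy as np

n = 100000
py_list = list(range(1, n))
numpy_arr = np.array(py_list)

# Mean of a Python list: built-in sum divided by the length
list_mean = sum(py_list) / len(py_list)
# Mean of a NumPy array: the array's own mean method
array_mean = numpy_arr.mean()

# Total time for 20 executions of each variant
t_list = timeit.timeit(lambda: sum(py_list) / len(py_list), number=20)
t_numpy = timeit.timeit(lambda: numpy_arr.mean(), number=20)
print(f"list: {t_list:.4f} s, numpy: {t_numpy:.4f} s")
```

The array is built once outside the timed statement, so the conversion cost is not part of the measurement. Including `np.array(py_list)` in the timed code would measure something different.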

In [None]:
numpy_arr = np.array(py_list)

In [None]:
%%timeit
# mean using numpy array

##  Python, Numpy and Multiprocessing
- The following questions might look familiar to you, as you were introduced to some of those as introductory numpy exercises.
- For the following questions, write a program using both numpy and standard python data structures (lists, tuples,  dictionaries, etc.).
- In both cases, use `%%timeit` and observe the performance difference between them.
    -  Reverse a vector (first element becomes last)
    -  Create a 3x3 matrix with values ranging from 0 to 8 
    -  Find indices of non-zero elements from [1,2,0,0,4,0]
    -  Create a 5x5 matrix with values 1,2,3,4 just below the diagonal
    -  Find common values between two arrays
    -  Find the closest value (to a given scalar) in a vector
    -  Consider a random vector with shape (100,2) representing coordinates, find point by point distances
    -  Subtract the mean of each row of a matrix from the elements of that row
    -  Compute averages using a sliding window over an array
    -  Find the most frequent value in an array
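To illustrate the expected shape of an answer, here is a sketch covering two of the tasks above (reversing a vector and finding non-zero indices) in both styles. The variable names are assumptions for this example, and the remaining tasks are left open.

```py
import numpy as np

values = [1, 2, 0, 0, 4, 0]

# Pure Python: slicing reverses, enumerate finds the non-zero positions
reversed_list = values[::-1]
nonzero_list = [i for i, v in enumerate(values) if v != 0]

# NumPy: the same two tasks on an array
arr = np.array(values)
reversed_arr = arr[::-1]
nonzero_arr = np.nonzero(arr)[0]  # np.nonzero returns a tuple of index arrays

print(reversed_list, nonzero_list)
print(reversed_arr.tolist(), nonzero_arr.tolist())
```

For inputs this small the two versions are hard to tell apart, so larger vectors give a clearer picture when timing.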


In [None]:
%%timeit
#for questions above

In [None]:
stepsize = 500
import random
random.seed(5126423231512)
#Creates a list of random numbers
list4 = []  # start empty; the loop below appends stepsize numbers
for i in range(stepsize):
    list4.append(int(random.random() * 12353215321000 % 752293777))

Run through list4 and find if any number equals 79. Print True when you find it.

In [None]:
#%%timeit
def onepass(list4):
    for i in list4:
        #find 79
        pass
onepass(list4)

Check whether any two items in list4 are the same. The basic way of doing this is to have one for loop run through the list while another runs through the list again, comparing the two items against each other.

In [None]:
#%%timeit
def Nsquared(list4):
    for item1 in list4:
        for item2 in list4:
            #Compare item1 vs item2. If they are the same, print True
            pass
Nsquared(list4)

Run through list4 and add 5 to each number. Then run through it again and check whether the number 39381552 appears.

In [None]:
#%%timeit
def runtwice(list4):
    for item1 in list4:
        #add 5
        pass
    for item in list4:
        if(item == 39381552):
            print(True)
runtwice(list4)

Pop each element from list5. Do the same for list6, but use `pop(0)` instead. Time the results using timeit and explain why one performs better than the other.
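One way to set up the measurement is sketched below with the `timeit` module. The function names `pop_from_end` and `pop_from_front` and the repetition count are assumptions for this illustration.

```py
import timeit

stepsize = 5000

def pop_from_end(items):
    # list.pop() removes the last element
    while items:
        items.pop()

def pop_from_front(items):
    # list.pop(0) removes the first element; the remaining elements move up one position
    while items:
        items.pop(0)

# Each run gets a fresh list because popping empties it
t_end = timeit.timeit(lambda: pop_from_end(list(range(stepsize))), number=20)
t_front = timeit.timeit(lambda: pop_from_front(list(range(stepsize))), number=20)
print(f"pop(): {t_end:.4f} s, pop(0): {t_front:.4f} s")
```

Both timings include the cost of building the list, which is the same for the two variants and so does not affect the comparison.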

In [None]:
stepsize = 5000

#use pop()
list5 = list(range(stepsize))  # a range object has no pop(), so build a real list
#use pop(0)
list6 = list(range(stepsize))

In [None]:
#%%timeit
def pop(list5):
    #use pop for each item in the list
    pass
pop(list5)

In [None]:
#%%timeit
def pop0(list6):
    #use pop(0) for each item in the list
    pass
pop0(list6)

Compare the time difference of the 3 tasks on list4, using different step sizes. Which one takes the longest? Which one takes twice as long as the other?

In [None]:
step_sizes = [1,10,100,1000,10000]

Create your own tester class that measures how long a function takes to run and plots the result against the others.
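The core timing step can be sketched as a standalone function before it goes into a class. This is an illustration under stated assumptions: the name `time_function` is hypothetical, it uses `time.perf_counter()` rather than `time.time()`, and the built-in `sorted` stands in for the functions under test.

```py
import time

def time_function(function, step_sizes, runs=3):
    """Return the mean runtime in seconds of `function` for each step size."""
    results = []
    for step_size in step_sizes:
        total = 0.0
        for _ in range(runs):
            alist = list(range(step_size))  # fresh input for every run
            start = time.perf_counter()
            function(alist)
            total += time.perf_counter() - start
        results.append(total / runs)
    return results

# `sorted` is only a placeholder for onepass, runtwice, and the others
timings = time_function(sorted, [1, 10, 100, 1000])
print(timings)
```

Averaging over several runs smooths out noise from other processes, which matters most for the smallest step sizes.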

In [None]:
class tester:
    def __init__(self):
        self.tests = {} # key is the name of the test; value is a list of its times at the different step sizes
        self.step_sizes = [1,10,100,1000,10000]
        self.runs = 10 #part of bonus
    def addtestfunction(self,functionname,function):
        self.tests[functionname] = [] #initialize results to an empty list so we can add to it.
        for step_size in self.step_sizes:
            alist = list(range(step_size)) #a real list, so pop() and pop(0) work on it
                
            #TODO
            #test how long each function took and set elapsed equal to that
            #HINT: time.time() gets current time
            
            #call function by:
            function(alist)

            #BONUS: run multiple tests and plot times or use average of times.
            #named elapsed (not time) so the time module is not shadowed
            elapsed = 0 #set to average time, whether you did 1 or 10 runs.
            
            #adds to test (i.e. results)
            self.tests[functionname].append(elapsed)

    def plot(self):
        #TODO
        #plot all tests done
        
        
        #BONUS you can use yscale and xscale to make a logarithmic graph
        #https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.yscale.html
        plt.show()
        

tester = tester()
tester.addtestfunction('onepass',onepass)
tester.addtestfunction('runtwice',runtwice)
tester.addtestfunction('Nsquared',Nsquared)
tester.addtestfunction('pop',pop)
tester.addtestfunction('pop0',pop0)
tester.plot()
#comment out %%timeit in each function's cell; otherwise Python won't be able to find the functions

### References

- https://towardsdatascience.com/a-hitchhiker-guide-to-python-numpy-arrays-9358de570121
- https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises.ipynb