# Pure Python vs. Numpy - Lab

## Introduction 

Numpy, Scipy, and Pandas are considerably more efficient for complex mathematical operations than Python's built-in arithmetic. In this lab, you will measure and compare the time needed to calculate a dot product using basic arithmetic operations in Python and using Numpy's `.dot()` method.

## Objectives
You will be able to:

- Compare the performance of high-dimensional matrix operations in Numpy vs. pure Python

## Problem 

Write a routine to calculate the dot product of two $200 \times 200$ matrices using:

a) Pure Python (no libraries)

b) Numpy's `.dot()` method 


### Create two $200 \times 200$ matrices in Python and fill them with random values using `np.random.rand()` 

In [13]:
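import random
import numpy as np

def new_matrix(a, b):
    # Build an a x b matrix of random integers from 1 to 8 (stored as floats)
    M = np.zeros((a, b))
    for x in range(M.shape[0]):
        for y in range(M.shape[1]):
            M[x][y] = random.randrange(1, 9)  # the upper bound 9 is excluded
    return M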

In [14]:
# Build two small 9x9 test matrices before scaling up to 200x200
import numpy as np

# Set up the variables
A = new_matrix(9,9)
B = new_matrix(9,9)

In [15]:
len(A)

9

In [16]:
A.shape

(9, 9)

In [10]:
# first attempt: multiplies a vector A by a matrix B
def multiply(A, B):
    result = []
    for i in range(len(B[0])):  # loop through the columns of the matrix
        total = 0
        for j in range(len(A)):  # loop through vector coordinates and matrix rows
            total += A[j] * B[j][i]
        result.append(total)
    return result

In [17]:
# second attempt with help
def multiply_matx(X, Y):
    # I still need numpy to get me the zeros!
    # The product has as many rows as X and as many columns as Y
    result = np.zeros((X.shape[0], Y.shape[1]))
    # iterate through rows of X
    for i in range(len(X)):
        # iterate through columns of Y
        for j in range(len(Y[0])):
            # iterate through rows of Y
            for k in range(len(Y)):
                result[i][j] += X[i][k] * Y[k][j]
    return result

In [18]:
# Mini-test
print(A, '\n')
print(B, '\n')
C = multiply_matx(A,B)
print('Python multiplication:', '\n', C, '\n')
D = A.dot(B)
print('Numpy dot product:', '\n',D)

[[1. 4. 2. 4. 5. 4. 2. 3. 6.]
 [5. 6. 2. 7. 7. 7. 3. 4. 7.]
 [3. 6. 7. 6. 2. 7. 1. 2. 7.]
 [2. 6. 6. 5. 6. 3. 7. 6. 4.]
 [5. 3. 6. 6. 6. 5. 6. 3. 4.]
 [3. 3. 4. 1. 7. 5. 7. 5. 3.]
 [6. 7. 6. 3. 4. 6. 2. 5. 1.]
 [6. 5. 6. 7. 1. 6. 3. 1. 4.]
 [1. 7. 7. 7. 4. 3. 6. 7. 7.]] 

[[1. 7. 7. 5. 4. 7. 5. 2. 2.]
 [5. 3. 8. 7. 8. 1. 5. 1. 8.]
 [1. 3. 1. 8. 6. 8. 4. 5. 3.]
 [5. 8. 6. 2. 6. 7. 8. 8. 7.]
 [6. 5. 7. 4. 8. 3. 4. 8. 2.]
 [3. 4. 5. 4. 8. 8. 4. 5. 8.]
 [1. 2. 4. 4. 7. 5. 8. 6. 7.]
 [8. 5. 7. 7. 8. 4. 3. 6. 2.]
 [6. 6. 2. 1. 3. 6. 7. 1. 4.]] 

Python multiplication: 
 [[147. 153. 161. 128. 200. 160. 168. 144. 154.]
 [212. 246. 265. 200. 308. 256. 260. 222. 240.]
 [162. 200. 193. 186. 254. 242. 220. 171. 216.]
 [187. 200. 233. 220. 303. 228. 242. 225. 218.]
 [161. 211. 221. 199. 282. 252. 241. 222. 210.]
 [149. 162. 198. 184. 260. 197. 194. 193. 174.]
 [152. 184. 225. 217. 271. 214. 191. 178. 191.]
 [131. 195. 194. 178. 239. 238. 218. 169. 208.]
 [215. 226. 242. 232. 319. 255. 270. 232. 244.]]

## Pure Python

* Initialize a zeros-filled `numpy` matrix
* In Python, calculate the dot product using the formula 


$$ \large C_{i,j}= \sum_k A_{i,k}B_{k,j}$$


* Use Python's `timeit` library to calculate the processing time
* [Visit this link](https://www.pythoncentral.io/time-a-python-function/) for an in-depth explanation on how to time a function or routine in Python

**Hint**: Use a nested `for` loop for accessing, calculating, and storing each scalar value in the resulting matrix. 
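The formula and hint above can be sketched without any libraries by storing each matrix as a list of rows. This is only an illustration: the name `matmul_lists` and the small example matrices are assumptions, not part of the lab.

```python
def matmul_lists(A, B):
    """Multiply two matrices stored as lists of rows, using only built-ins."""
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    if any(len(row) != n_inner for row in A):
        raise ValueError("columns of A must match rows of B")
    # Start from a zero-filled result with the shape of the product
    C = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(n_inner):
                # C[i][j] = sum over k of A[i][k] * B[k][j]
                C[i][j] += A[i][k] * B[k][j]
    return C

# Hypothetical 2x2 example
A_small = [[1, 2], [3, 4]]
B_small = [[5, 6], [7, 8]]
C_small = matmul_lists(A_small, B_small)
print(C_small)  # [[19, 22], [43, 50]]
```

Because nothing here depends on Numpy, timing this version is arguably the closest match to the "no libraries" requirement in the problem statement.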

In [17]:
import timeit

def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

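def costly_func(lst):
    # ** is exponentiation (^ is bitwise XOR); list() forces the lazy map to run
    return list(map(lambda x: x**2, lst))

short_list = range(10)
wrapped = wrapper(costly_func, short_list)

timeit.timeit(wrapped, number=1000)

# timeit needs a callable or a string, so passing the result of a call fails:
#timeit.timeit(multiply_matx(A,B))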

ValueError: stmt is neither a string nor callable

In [20]:
print(timeit.timeit(stmt=multiply_matx(A,B)))

ValueError: stmt is neither a string nor callable

In [21]:
print(timeit.timeit(stmt=A.dot(B)))

ValueError: stmt is neither a string nor callable
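The `ValueError` in the cells above arises because `multiply_matx(A,B)` and `A.dot(B)` are evaluated first, so `timeit.timeit` receives the resulting array rather than something it can call. One way around this is to wrap the call in a `lambda`. The following is a minimal sketch, assuming small random inputs and a modest `number` (the default of 1,000,000 runs would take far too long for the triple loop). The helper `triple_loop` and the names `A_demo` and `B_demo` are stand-ins introduced for illustration.

```python
import timeit
import numpy as np

def triple_loop(X, Y):
    # Stand-in for multiply_matx: explicit triple loop over numpy arrays
    result = np.zeros((X.shape[0], Y.shape[1]))
    for i in range(X.shape[0]):
        for j in range(Y.shape[1]):
            for k in range(Y.shape[0]):
                result[i, j] += X[i, k] * Y[k, j]
    return result

A_demo = np.random.rand(9, 9)
B_demo = np.random.rand(9, 9)

# Pass a callable; timeit runs it `number` times and returns the total seconds
loop_time = timeit.timeit(lambda: triple_loop(A_demo, B_demo), number=100)
dot_time = timeit.timeit(lambda: A_demo.dot(B_demo), number=100)

print('Loop time for 100 runs:', loop_time, 'sec.')
print('Numpy time for 100 runs:', dot_time, 'sec.')
```

Dividing each total by `number` gives an average time per call, which is usually steadier than a single measurement.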

In [None]:
import timeit

# Start the timer
start = None

# Matrix multiplication in pure Python


time_spent = None

print('Pure Python time:', time_spent, 'sec.')

## Numpy 
Set the timer and calculate the time taken by the `.dot()` method for multiplying $A$ and $B$ 


In [18]:
timeit.timeit(A.dot(B))

ValueError: stmt is neither a string nor callable

In [None]:
# Start the timer
start = None

# Matrix multiplication in numpy


time_spent = None
print('Numpy time:', time_spent, 'sec.')

### Your comments

## Summary

In this lab, you performed a quick comparison between calculating a dot product in Numpy and in pure Python. You saw that Numpy is much faster than pure Python loops, because its array operations run in optimized code implemented in other languages rather than in the Python interpreter. You're encouraged to run timing tests like these whenever you want to judge what an additional Python library gains you.

In [1]:
# Compare 200x200 matrix-matrix multiplication speed
import numpy as np
# Set up the variables

SIZE = 200
A = np.random.rand(SIZE, SIZE)
B = np.random.rand(SIZE, SIZE)

In [2]:
import timeit

# Start the timer
start = timeit.default_timer()

# Matrix multiplication in pure Python

out2 = np.zeros((SIZE, SIZE))

for i in range(SIZE):
    for j in range(SIZE):
        for k in range(SIZE):
            # j is the summed index here: out2[i, k] = sum over j of A[i, j]*B[j, k]
            out2[i, k] += A[i, j]*B[j, k]

time_spent = timeit.default_timer() - start

print('Pure Python time:', time_spent, 'sec.')

Pure Python time: 8.752514368999982 sec.


In [3]:
# Start the timer
start = timeit.default_timer()

# Matrix multiplication in numpy
out1 = A.dot(B)

time_spent = timeit.default_timer() - start
print('Numpy time:', time_spent, 'sec.')

Numpy time: 0.051898368000024675 sec.


In [4]:
# Your comments:

# Numpy is much faster than pure Python 

# Numpy provides support for large multidimensional arrays and matrices 
# along with a collection of mathematical functions to operate on these elements. 

# Numpy relies on well-known packages implemented in other languages (like Fortran) to perform efficient computations, 
# bringing the user both the expressiveness of Python and a performance similar to MATLAB or Fortran.

In [None]:
# Again with my code:

In [30]:
SIZE = 8
A = np.random.rand(SIZE, SIZE)
B = np.random.rand(SIZE, SIZE)

In [31]:
import timeit

# Start the timer
start = timeit.default_timer()

# Matrix multiplication in pure Python

result = np.zeros((A.shape[0], B.shape[1]))  # rows of A x columns of B

for i in range(len(A)):
    for j in range(len(B[0])):
        for k in range(len(B)):
            result[i][j] += A[i][k] * B[k][j]

time_spent = timeit.default_timer() - start

print('Pure Python time:', time_spent, 'sec.')

Pure Python time: 0.0015383579998342611 sec.


In [32]:
start = timeit.default_timer()

# Matrix multiplication in numpy
out1 = A.dot(B)

time_spent = timeit.default_timer() - start
print('Numpy time:', time_spent, 'sec.')

Numpy time: 0.00033566899992365506 sec.
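To put the recorded timings side by side, the ratio of the two measurements gives a rough speedup factor. The sketch below only reuses the times printed in this notebook, and the variable names are illustrative. Single-run timings of such small matrices are noisy, so the figures are indicative at best.

```python
# Timings copied from the outputs recorded in this notebook (seconds)
python_200, numpy_200 = 8.752514368999982, 0.051898368000024675
python_8, numpy_8 = 0.0015383579998342611, 0.00033566899992365506

speedup_200 = python_200 / numpy_200
speedup_8 = python_8 / numpy_8

print('Speedup at 200x200:', round(speedup_200), 'x')
print('Speedup at 8x8:', round(speedup_8, 1), 'x')
```

On these runs the gap is far larger for the $200 \times 200$ matrices (roughly 170x) than for the $8 \times 8$ ones (under 5x). That is consistent with the number of loop iterations growing with the cube of the matrix size, while fixed per-call overhead matters more for tiny inputs.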


In [36]:
result = np.zeros((A.shape[0], B.shape[1]))  # rows of A x columns of B

for i in range(len(A)):
    for j in range(len(B[0])):
        for k in range(len(B)):
            result[i][j] += A[i][k] * B[k][j]
print(result)

[[3.35705582 2.87695203 2.78798929 2.93422315 3.79735216 2.5041932
  2.20196063 3.33702929]
 [2.69424616 2.39243307 1.70499916 2.19177813 2.95865457 1.38425914
  2.06711771 2.42564125]
 [3.0436007  3.04889843 2.65020376 2.82269737 4.36498634 2.48889146
  2.43956697 3.31343441]
 [1.25064587 1.33739261 0.93929976 1.31464438 1.77338209 0.86513893
  1.11469134 1.44501809]
 [2.92677623 2.19218214 2.15551666 2.47708399 3.27156386 2.05493869
  1.91712353 3.04509824]
 [2.38743596 2.59728204 2.055106   2.26724297 3.34772731 1.91711765
  1.86939363 2.58995873]
 [2.55084769 2.36001866 1.95907686 2.56190338 3.31139061 1.81402424
  1.91229548 3.22337372]
 [2.88104937 2.51736757 2.23691488 2.05394131 3.21247188 2.09842072
  1.94243865 2.63167756]]


In [41]:
import numpy as np
# Set up the variables

SIZE = 3
A = np.random.rand(SIZE, SIZE)
B = np.random.rand(SIZE, SIZE)

In [44]:
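def dot_prod_matx(A, B):
    # Take the dimensions from the inputs instead of the global SIZE
    n_rows, n_inner = A.shape
    n_cols = B.shape[1]
    out2 = np.zeros((n_rows, n_cols))

    for i in range(n_rows):
        for j in range(n_inner):
            for k in range(n_cols):
                out2[i, k] += A[i, j]*B[j, k]

    return out2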

In [45]:
# Their code

print(A, '\n')
print(B, '\n')
C = dot_prod_matx(A,B)
print('Python multiplication:', '\n', C, '\n')
D = A.dot(B)
print('Numpy dot product:', '\n',D)

[[0.00921187 0.54590681 0.40151519]
 [0.05723937 0.58346159 0.63112579]
 [0.51378959 0.18666599 0.65432702]] 

[[0.66178619 0.6419171  0.74308223]
 [0.54972227 0.6608754  0.26685425]
 [0.26970928 0.78727619 0.37445994]] 

Python multiplication: 
 [[0.41448579 0.68279298 0.30287408]
 [0.52884254 0.91920865 0.43456409]
 [0.61911138 0.96830936 0.67661978]] 

Numpy dot product: 
 [[0.41448579 0.68279298 0.30287408]
 [0.52884254 0.91920865 0.43456409]
 [0.61911138 0.96830936 0.67661978]]
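
Comparing printed matrices by eye works for a $3 \times 3$ example but not for larger ones. Below is a minimal sketch of an automated check, assuming `np.allclose` with its default tolerances is an acceptable notion of equality here. The function `loop_matmul` and the non-square test shapes are illustrative stand-ins, not part of the lab.

```python
import numpy as np

def loop_matmul(A, B):
    # Loop-based product that reads its dimensions from the inputs
    out = np.zeros((A.shape[0], B.shape[1]))
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            for k in range(B.shape[1]):
                out[i, k] += A[i, j] * B[j, k]
    return out

rng = np.random.default_rng(0)  # seeded so the check is repeatable
A_check = rng.random((3, 4))
B_check = rng.random((4, 2))

C_loop = loop_matmul(A_check, B_check)
C_numpy = A_check.dot(B_check)

# allclose tolerates tiny floating-point differences between the two methods
print('Results match:', np.allclose(C_loop, C_numpy))
print('Result shape:', C_loop.shape)
```

Using non-square inputs also exercises the shape of the result, which square test matrices cannot reveal.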
