# Instructions 

For the first part of tonight's assignment, we'll be solving problems with `numpy`. Since `pandas` is built on top of `numpy`, it's good to be familiar with `numpy` and know what it provides. Below, we'll ask that you solve each question **by using `numpy`**. While you could answer these questions without `numpy` arrays, the point is to use them and get familiar with their methods (there are a lot of them). Remember how much faster they are than a plain Python `list`!

As a final note, a lot of the questions below are fairly simple and could be answered in a line or two. As such, you might wonder why we're asking you to write functions for them, or why they are even questions at all. As you've hopefully found throughout this course, programming skills improve a great deal through repetition, which is the main reason we've asked you to write a function for each question below. After tonight, you will hopefully never forget that a function definition in Python starts with `def my_func_name():`. In addition, these one-off functions will give you exposure to some of the most common and useful methods available on a `numpy array`.

# Assignment Questions

**Hint**: [Broadcasting](http://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html) is a concept that will be particularly useful here. 
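As a quick illustration of the broadcasting hint, the sketch below adds a scalar and then a one-dimensional array to a two-dimensional array. The names `grid` and `row` are made up for this example.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])        # shape (3,)

# A scalar is broadcast to every element.
shifted = grid + 1                  # [[1, 2, 3], [4, 5, 6]]

# A 1-D array of length 3 is broadcast across both rows.
combined = grid + row               # [[10, 21, 32], [13, 24, 35]]

print(shifted)
print(combined)
```

No explicit loop is needed in either case; `numpy` stretches the smaller operand to match the larger one.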

Write a function that takes in a one-dimensional `numpy array` and normalizes it (i.e. subtracts off the mean and divides by the standard deviation). Return the normalized array.

In [85]:
import numpy as np

def np_func_1(n):
    return ((n - n.mean()) / n.std())

np_func_1(np.arange(5))

array([-1.41421356, -0.70710678,  0.        ,  0.70710678,  1.41421356])

Now create a version of `1` that takes in a two-dimensional `numpy array` and normalizes it along the columns (i.e. for each column subtract off its mean and divide by its standard deviation). Return the normalized array. 

In [86]:
def np_func_2(n):
    # Subtract each column's mean first, then divide by each column's standard deviation.
    return (n - n.mean(axis=0)) / n.std(axis=0)

np_func_2(np.arange(10).reshape(5,2))

array([[-1.41421356, -1.41421356],
       [-0.70710678, -0.70710678],
       [ 0.        ,  0.        ],
       [ 0.70710678,  0.70710678],
       [ 1.41421356,  1.41421356]])
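One way to sanity-check a column-wise normalization is to confirm that every column of the result has mean 0 and standard deviation 1. The sketch below does that; the helper name `normalize_columns` is hypothetical and only used for this check.

```python
import numpy as np

def normalize_columns(n):
    # Parentheses matter: subtract the column means first, then divide.
    return (n - n.mean(axis=0)) / n.std(axis=0)

data = np.arange(10).reshape(5, 2)
normalized = normalize_columns(data)

# Both checks should print True for a correct column-wise normalization.
print(np.allclose(normalized.mean(axis=0), 0))
print(np.allclose(normalized.std(axis=0), 1))
```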

Write a function that creates a `numpy array` of an inputted shape, filled with an inputted number. Your function should have three parameters - `num_cols`, `num_rows`, and `fill_value`. As an example, if I called your function with `num_cols=4`, `num_rows=3`, and `fill_value=2`, then your function should output a 3 by 4 array of 2s. Return the newly created array.  

In [55]:
def np_func_3(num_cols, num_rows, fill_value):
    return np.array([fill_value for _ in range(num_rows * num_cols)]).reshape(num_rows, num_cols)

np_func_3(3, 4, 2)

array([[2, 2, 2],
       [2, 2, 2],
       [2, 2, 2],
       [2, 2, 2]])
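As an alternative to building a list and reshaping it, `np.full` creates a filled array directly. This is a minimal sketch; the function name `filled_array` is made up here.

```python
import numpy as np

def filled_array(num_cols, num_rows, fill_value):
    # np.full takes the shape as (rows, columns) and the value to repeat.
    return np.full((num_rows, num_cols), fill_value)

result = filled_array(num_cols=4, num_rows=3, fill_value=2)
print(result.shape)
```

Passing the arguments by keyword, as in the call above, avoids mixing up the column and row counts.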

Write a function that takes in a one-dimensional `numpy array`, an `int`, and a mathematical operator (either `+`, `-`, `/`, or `*`) as a string, and then performs the indicated operation on each element of the `array`, using the inputted `int`. For example, if I inputted a `numpy array`, 2, and `'*'`, you should multiply each element of the `array` by 2. If I inputted a `numpy array`, 5, and `'-'`, then you should subtract 5 from every element in the array. Return the resulting array. 

In [56]:
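def np_func_4(array, num, operator):
    if operator == '+':
        a = array + num
    elif operator == '-':
        a = array - num
    elif operator == '/':
        a = array / num
    elif operator == '*':
        a = array * num
    else:
        # Reject anything other than the four supported operators.
        raise ValueError("operator must be one of '+', '-', '/', '*'")

    return a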

np_func_4(np.array([10,20,30]), 10, '+')

array([20, 30, 40])
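The same dispatch can also be written as a dictionary lookup instead of an `if`/`elif` chain. The sketch below assumes the functions from Python's standard `operator` module; the names `OPERATIONS`, `apply_operation`, and `symbol` are hypothetical.

```python
import operator

import numpy as np

# Map each supported symbol to the matching standard-library function.
OPERATIONS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}

def apply_operation(array, num, symbol):
    if symbol not in OPERATIONS:
        raise ValueError(f"Unsupported operator: {symbol!r}")
    # Each function broadcasts num across the whole array.
    return OPERATIONS[symbol](array, num)

print(apply_operation(np.array([10, 20, 30]), 10, '+'))
```

The parameter is called `symbol` here so that it does not shadow the imported `operator` module.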

Write a function that takes in one argument, an `int`, and creates a one-dimensional array that is the inputted number of elements long. Make the one-dimensional array full of random floating point numbers between 0 and 1 (**Hint**: Check out `numpy.random.random()`). Return the resulting array. 

In [44]:
import numpy as np

def np_func_5(num):
    return np.array([np.random.random() for _ in range(num)]) 

np_func_5(5)

array([ 0.22923183,  0.51348291,  0.77124492,  0.2034322 ,  0.46802173])

In [88]:
import numpy as np

def np_func_5b(num):
    return np.random.rand(num)

np_func_5b(5)

array([ 0.58162261,  0.44169539,  0.62293663,  0.0421776 ,  0.65187727])

Now, alter your solution to `5` to take in two parameters that will denote the final shape of your array of random floating point numbers (so now you will potentially end up with a two-dimensional array). Name these parameters `num_rows` and `num_cols`. Return the resulting array. 

In [45]:
import numpy as np

def np_func_6(num_rows, num_cols):
    return np.array([np.random.random() for _ in range(num_rows * num_cols)]).reshape(num_rows, num_cols)

np_func_6(5,5)

array([[ 0.74068397,  0.27355634,  0.47534234,  0.13370679,  0.54055469],
       [ 0.34314396,  0.73313048,  0.78869993,  0.91926869,  0.06625968],
       [ 0.17736876,  0.87531484,  0.70722225,  0.28375736,  0.00851341],
       [ 0.80789371,  0.45827081,  0.67295566,  0.90616918,  0.15908537],
       [ 0.20352071,  0.47932698,  0.16081874,  0.24399131,  0.73417075]])

In [87]:
import numpy as np

def np_func_6b(num_rows, num_cols):
    return np.random.rand(num_rows, num_cols)

np_func_6b(2,4)

array([[ 0.36649528,  0.56871011,  0.87591233,  0.87098379],
       [ 0.14526758,  0.10225472,  0.1577942 ,  0.19533961]])
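Newer versions of `numpy` also provide a `Generator` interface through `np.random.default_rng`. The sketch below shows how questions 5 and 6 might look with it; the seed value is arbitrary and only there to make runs repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, chosen for repeatability

flat = rng.random(5)        # one-dimensional, 5 elements
grid = rng.random((2, 4))   # two-dimensional, 2 rows by 4 columns

print(flat.shape, grid.shape)
```

Note that `rng.random` takes the shape as a single tuple, while `np.random.rand` takes each dimension as a separate argument.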

Write a function that will take in a one-dimensional `numpy array`, as well as an `int`, and randomly sample the inputted integer number of elements from the inputted array (**Hint**: Check out `numpy.random.choice()`). Return the randomly sampled elements. 

In [49]:
import numpy as np

def np_func_7(array, sample):   
    return np.random.choice(array, sample)

np_func_7(np.arange(10),3)

array([4, 3, 3])
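By default `np.random.choice` samples with replacement, which is why the same element can appear more than once in the output above. If repeats are not wanted (the question does not say either way), the `replace=False` argument can be used, as in this sketch:

```python
import numpy as np

values = np.arange(10)

# Default: sampling with replacement, so repeats such as [4, 3, 3] are possible.
with_repeats = np.random.choice(values, 3)

# replace=False draws each element at most once.
no_repeats = np.random.choice(values, 3, replace=False)

print(with_repeats, no_repeats)
```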

Write a function that will take in a one-dimensional `numpy array` and replace the maximum element in it with a `0`. Return the resulting array. 

In [92]:
import numpy as np

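def np_func_8(array):
    # Note: this modifies the input array in place and returns the same object.
    # argmax() gives the index of the first maximum if there are ties.
    array[array.argmax()] = 0
    return array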

print(np_func_8(np.arange(10)))
np_func_8(np.array([20,25,60,1,2,3]))


[0 1 2 3 4 5 6 7 8 0]


array([20, 25,  0,  1,  2,  3])
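`argmax` returns only the index of the first maximum. If every tied maximum should be replaced, and the input should be left untouched (both are assumptions, since the question does not specify), a sketch with `np.where` could look like this. The function name `zero_all_maxima` is made up.

```python
import numpy as np

def zero_all_maxima(array):
    # np.where builds a new array, so the input is left unchanged.
    return np.where(array == array.max(), 0, array)

original = np.array([20, 60, 25, 60, 1])
print(zero_all_maxima(original))   # both 60s become 0
print(original)                    # the original still contains both 60s
```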

Write a function that takes in an `int`, creates a one-dimensional `numpy array` of the numbers from `0` up to `int`, and then returns the cumulative sum of all those numbers. 

In [53]:
import numpy as np

def np_func_9(num):
    return np.arange(num).cumsum()

np_func_9(10)

array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45], dtype=int32)
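The cumulative sum of `0, 1, ..., k` has the closed form `k * (k + 1) / 2`, which gives a quick way to check the output above. A minimal sketch:

```python
import numpy as np

num = 10
k = np.arange(num)

running_totals = k.cumsum()
closed_form = k * (k + 1) // 2   # sum of 0..k for each k

print(np.array_equal(running_totals, closed_form))
```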

Write a function that takes in two two-dimensional `numpy arrays` and performs matrix multiplication (the dot product) between the two. You should construct it such that the first array is multiplied by the second (i.e. the number of columns of the first has to equal the number of rows of the second; you can assume that your inputs will meet this criterion). Return the result of the multiplication.

In [62]:
import numpy as np

def np_func_10(array1, array2):
    print(array1)
    print(array2)
    return np.dot(array1, array2)

print(np_func_10(np.arange(9).reshape(3,3), np.arange(9).reshape(3,3)))
np_func_10(np.arange(10).reshape(2,5), np.arange(10).reshape(5,2))

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[ 15  18  21]
 [ 42  54  66]
 [ 69  90 111]]
[[0 1 2 3 4]
 [5 6 7 8 9]]
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


array([[ 60,  70],
       [160, 195]])
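For two-dimensional inputs, the `@` operator gives the same result as `np.dot`. The sketch below repeats the second example from above using `@`:

```python
import numpy as np

a = np.arange(10).reshape(2, 5)
b = np.arange(10).reshape(5, 2)

product = a @ b   # same result as np.dot(a, b) for 2-D inputs

print(product)
print(np.array_equal(product, np.dot(a, b)))
```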

Write a function that takes in two two-dimensional `numpy arrays` and performs element-wise multiplication between the two. You can assume that the two arrays are of the same size. Return the array resulting from the multiplication. 

In [67]:
import numpy as np

def np_func_11(array1, array2):
    print(array1)
    print(array2)
    return array1 * array2

print(np_func_11(np.arange(9).reshape(3,3), np.arange(9).reshape(3,3)))
np_func_11(np.arange(10).reshape(2,5), np.arange(10).reshape(2,5))

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[ 0  1  4]
 [ 9 16 25]
 [36 49 64]]
[[0 1 2 3 4]
 [5 6 7 8 9]]
[[0 1 2 3 4]
 [5 6 7 8 9]]


array([[ 0,  1,  4,  9, 16],
       [25, 36, 49, 64, 81]])
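It is easy to mix up element-wise multiplication with matrix multiplication. This short sketch shows that `np.multiply` matches `*`, and that both differ from the matrix product on the same 3 by 3 input:

```python
import numpy as np

m = np.arange(9).reshape(3, 3)

elementwise = np.multiply(m, m)   # identical to m * m
matrix = m @ m                    # matrix product, shown for contrast

print(np.array_equal(elementwise, m * m))     # True
print(np.array_equal(elementwise, matrix))    # False
```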

Write a function that takes in a one-dimensional `numpy array` and returns the top 5 elements (you can assume it's an array of numbers). 

In [79]:
import numpy as np

def np_func_12(array):
    array = np.sort(array)[::-1]
    print(array)
    return array[0:5]
    
np_func_12(np.array([10,4,50,20,30,25,17,28,55,8]))

[55 50 30 28 25 20 17 10  8  4]


array([55, 50, 30, 28, 25])
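Sorting the whole array works, but only the five largest values are needed. As a sketch of an alternative, `np.partition` can pull those five out first, so that only they get fully sorted. The name `top_five` is hypothetical, and the code assumes the array has at least five elements.

```python
import numpy as np

def top_five(array):
    # np.partition moves the 5 largest values into the last 5 slots (unordered),
    # so only those 5 need a full sort afterwards.
    largest = np.partition(array, -5)[-5:]
    return np.sort(largest)[::-1]

print(top_five(np.array([10, 4, 50, 20, 30, 25, 17, 28, 55, 8])))
```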

Write a function that takes in a two-dimensional `numpy array` and returns the smallest 5 elements of each column (**Hint**: The `axis` parameter on the `numpy` function you use might come in handy here). 

In [93]:
import numpy as np

def np_func_13(array):
    # Sort down each column (axis=0), then keep the first five rows.
    # If the array has fewer than five rows, every row is returned.
    return np.sort(array, axis=0)[:5]
    
np_func_13(np.array([[10,4,50,15,8,35,5],[30,25,17,28,55,33,2]]))

array([[10,  4, 17, 15,  8, 33,  2],
       [30, 25, 50, 28, 55, 35,  5]])
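The example input above has only two rows, so it cannot show five values per column. The sketch below uses a taller array (6 rows, 2 columns) to illustrate sorting along `axis=0` and keeping the first five rows. The names `smallest_five_per_column` and `table` are made up for this example.

```python
import numpy as np

def smallest_five_per_column(array):
    # axis=0 sorts down each column; the first five rows then hold
    # the five smallest values of every column.
    return np.sort(array, axis=0)[:5]

# 6 rows, 2 columns: column 0 is 10, 8, ..., 0 and column 1 is 11, 9, ..., 1.
table = np.arange(12).reshape(6, 2)[::-1]
print(smallest_five_per_column(table))
```

The largest value in each column (10 and 11) is dropped, and the remaining five come back in ascending order.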

# Extra-Credit 

1. Write a function that takes in an integer, and does the following: 

* Creates an array of the numbers from 0 up to that inputted integer 
* Reshapes it to be the largest `n` * `n` array that it could be, discarding any elements that are extra (i.e. if you want to make a 10 x 10, but have 102 elements, discard the last 2)
* Returns the cumulative sum of the column means
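
One possible sketch of the steps above is shown here. It assumes that "up to" excludes the inputted integer itself (matching `np.arange`), and the function name `cumsum_of_column_means` is made up.

```python
import math

import numpy as np

def cumsum_of_column_means(num):
    # Assumption: "up to" excludes num itself, matching np.arange.
    n = math.isqrt(num)                             # largest n with n * n <= num
    square = np.arange(num)[:n * n].reshape(n, n)   # drop the extra elements
    return square.mean(axis=0).cumsum()

print(cumsum_of_column_means(102))
```

With an input of 102, the last 2 elements are discarded and the remaining 100 form a 10 x 10 array.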