# Exercises: NumPy
The following exercises test your understanding of NumPy. You will use a variety of
NumPy functions and methods to answer the questions. The goal is to become familiar
with what NumPy can do and how to use it effectively in your data science work.

In [1]:
import numpy as np

For exercises 1-10, we will work with a small dataset: the ratings that four customers
gave to five products on a scale of 1-10.

- Customer 0's ratings: 4, 5, 3, 1, 2
- Customer 1's ratings: 6, 7, 8, 7, 5
- Customer 2's ratings: 8, 9, 10, 9, 8
- Customer 3's ratings: 1, 3, 2, 1, 3

## Exercise 1
Use the information above to create the "ratings" matrix.

In [5]:
# TODO: Your code here!
ratings = np.array([[4, 5, 3, 1, 2],
                   [6, 7, 8, 7, 5],
                   [8, 9, 10, 9, 8],
                   [1, 3, 2, 1, 3]])

## Exercise 2
- Find the size and shape of the 'ratings' matrix.
- Print the data type of the 'ratings' matrix.

In [7]:
# TODO: Your code here!
print(f'size: {ratings.size}')
print(f'shape: {ratings.shape}')

print(f'datatype: {ratings.dtype}')

size: 20
shape: (4, 5)
datatype: int64


## Exercise 3
- How did Customer 2 rate the 0th item?
- Extract the ratings for the last product by all customers.

In [9]:
# TODO: Your code here!
customer_2_rating = ratings[2, 0]  # row 2 is Customer 2, column 0 is the 0th item
print(f'customer 2 rating for 0th item: {customer_2_rating}')
print(f'last product ratings: {ratings[:, -1]}')

customer 2 rating for 0th item: 8
last product ratings: [2 5 8 3]


## Exercise 4
- Calculate the average product rating across ALL customers and ALL products.
- On average, how did Customer 1 rate the products?
- What's the average product rating for the last item?

In [13]:
# TODO: Your code here!
print(f'avg: {ratings.mean()}')
print(f'cust 1 avg: {ratings[1].mean()}')  # row 1 is Customer 1
print(f'last prod avg: {ratings[:, -1].mean()}')

avg: 5.1
cust 1 avg: 6.6
last prod avg: 4.5
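
The averages in this exercise are all instances of one idea: `mean` can reduce over the whole array or along a single axis. A minimal sketch, with the ratings matrix rebuilt so the block runs on its own (`per_customer` and `per_product` are illustrative names, not part of the exercise):

```python
import numpy as np

ratings = np.array([[4, 5, 3, 1, 2],
                    [6, 7, 8, 7, 5],
                    [8, 9, 10, 9, 8],
                    [1, 3, 2, 1, 3]])

# axis=1 collapses the columns: one mean per row (customer).
per_customer = ratings.mean(axis=1)   # values: 3.0, 6.6, 8.8, 2.0

# axis=0 collapses the rows: one mean per column (product).
per_product = ratings.mean(axis=0)    # values: 4.75, 6.0, 5.75, 4.5, 4.5

print(per_customer)
print(per_product)
```

Indexing a row or column first and then calling `.mean()` gives the same numbers as picking one entry out of these vectors.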


## Exercise 5
- Reshape 'ratings' to a column vector and call it 'col_vec' (don't change 'ratings').
- Reshape 'ratings' to a row vector and call it 'row_vec' (don't change 'ratings').

In [15]:
# TODO: Your code here!
col_vec = ratings.reshape(4 * 5, 1)
row_vec = ratings.reshape(1, 4 * 5)
print(col_vec.shape, row_vec.shape)

(20, 1) (1, 20)
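
As a side note, `reshape` accepts `-1` for one dimension and infers it from the array's size, which avoids hard-coding `4 * 5`. The sketch below rebuilds the ratings matrix so that it runs standalone; `col_vec` and `row_vec` follow the names used in the exercise.

```python
import numpy as np

ratings = np.array([[4, 5, 3, 1, 2],
                    [6, 7, 8, 7, 5],
                    [8, 9, 10, 9, 8],
                    [1, 3, 2, 1, 3]])

# -1 tells NumPy to infer that dimension from the total number of elements.
col_vec = ratings.reshape(-1, 1)
row_vec = ratings.reshape(1, -1)

print(col_vec.shape, row_vec.shape)  # (20, 1) (1, 20)
print(ratings.shape)                 # (4, 5): the shape of the original is unchanged
```

For an array like this one, `reshape` returns a view on the same data, so writing into `col_vec` would also change `ratings`. Call `.copy()` on the result if an independent array is needed.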


## Exercise 6
Find the median and standard deviation of 'ratings'.

In [20]:
# TODO: Your code here!
std = ratings.std()
print(f'ratings std: {std}')
print(f'median: {np.median(ratings)}')

ratings std: 2.930870177950569
median: 5.0
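
One detail worth knowing here: by default, `std` divides by N (the population standard deviation). Passing `ddof=1` divides by N - 1 instead, which is the usual sample estimate. A short sketch, assuming the same ratings matrix (`pop_std` and `sample_std` are illustrative names):

```python
import numpy as np

ratings = np.array([[4, 5, 3, 1, 2],
                    [6, 7, 8, 7, 5],
                    [8, 9, 10, 9, 8],
                    [1, 3, 2, 1, 3]])

pop_std = ratings.std()           # ddof=0 (default): divide by N
sample_std = ratings.std(ddof=1)  # divide by N - 1

print(f'population std: {pop_std}')
print(f'sample std: {sample_std}')
print(f'median: {np.median(ratings)}')
```

The sample value is always slightly larger than the population value for the same data, and the gap shrinks as the number of elements grows.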


## Exercise 7
Find the unique product ratings of Customer 2.

In [21]:
# TODO: Your code here!
print(np.unique(ratings[2]))  # row 2 is Customer 2

[ 8  9 10]


## Exercise 8
Find the product ratings that Customer 0 and Customer 3 have in common.

In [29]:
# TODO: Your code here!
# Rating values that appear in both customers' rows, regardless of product.
print(np.intersect1d(ratings[0], ratings[3]))

[1 2 3]
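
"In common" can be read in two ways, and the two readings give different answers. The sketch below contrasts them on the same data; `common_values` and `same_product` are illustrative names.

```python
import numpy as np

ratings = np.array([[4, 5, 3, 1, 2],
                    [6, 7, 8, 7, 5],
                    [8, 9, 10, 9, 8],
                    [1, 3, 2, 1, 3]])

# Reading 1: rating values that both customers used at least once.
common_values = np.intersect1d(ratings[0], ratings[3])
print(common_values)  # [1 2 3]

# Reading 2: products that both customers rated identically.
same_product = ratings[0] == ratings[3]
print(np.flatnonzero(same_product))  # [3]
```

`np.intersect1d` returns the sorted unique values shared by both arrays, while the elementwise `==` comparison depends on position.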


## Exercise 9
Make a new matrix called 'binary_mat' based on 'ratings'. If a number in 'ratings' is
less than 6, set it equal to 0 in 'binary_mat', else, set it equal to 1 in 'binary_mat'.

In [31]:
# TODO: Your code here!
binary_mat = np.where(ratings < 6, 0, 1)
binary_mat, ratings

(array([[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0]]),
 array([[ 4,  5,  3,  1,  2],
        [ 6,  7,  8,  7,  5],
        [ 8,  9, 10,  9,  8],
        [ 1,  3,  2,  1,  3]]))
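
An equivalent formulation, shown here as a sketch, builds the boolean mask directly and casts it to integers. `binary_alt` is an illustrative name.

```python
import numpy as np

ratings = np.array([[4, 5, 3, 1, 2],
                    [6, 7, 8, 7, 5],
                    [8, 9, 10, 9, 8],
                    [1, 3, 2, 1, 3]])

# The comparison yields True/False; astype(int) maps them to 1/0.
binary_alt = (ratings >= 6).astype(int)

# Same result as the np.where formulation.
print(np.array_equal(binary_alt, np.where(ratings < 6, 0, 1)))  # True
```

`np.where` is the more general tool when the two output values are something other than 0 and 1.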

## Exercise 10
- Which product has the highest average rating and which has the lowest?


In [None]:
# TODO: Your code here!
avg_prod_ratings = ratings.mean(axis=0)  # one average per product (column)
print(f'Max prod: {avg_prod_ratings.argmax()}')
print(f'Min prod: {avg_prod_ratings.argmin()}')

Max prod: 1
Min prod: 3
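
Note that `argmin` and `argmax` return only the first index when several entries tie. In this dataset products 3 and 4 share the lowest average (4.5). A sketch that reports every tied index (`lowest` and `highest` are illustrative names):

```python
import numpy as np

ratings = np.array([[4, 5, 3, 1, 2],
                    [6, 7, 8, 7, 5],
                    [8, 9, 10, 9, 8],
                    [1, 3, 2, 1, 3]])

avg_prod_ratings = ratings.mean(axis=0)

# flatnonzero returns the indices of all True entries in the mask.
lowest = np.flatnonzero(avg_prod_ratings == avg_prod_ratings.min())
highest = np.flatnonzero(avg_prod_ratings == avg_prod_ratings.max())

print(lowest)   # [3 4]
print(highest)  # [1]
```

Exact equality is safe here because both tied averages are the same exactly representable value. With noisier floating-point data, `np.isclose` would be the more cautious comparison.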


## Exercise 11
Create a 5x5 diagonal matrix with 7s along the diagonal.

In [40]:
# TODO: Your code here!

a = np.zeros((5, 5))
np.fill_diagonal(a, 7)
print(a)

[[7. 0. 0. 0. 0.]
 [0. 7. 0. 0. 0.]
 [0. 0. 7. 0. 0.]
 [0. 0. 0. 7. 0.]
 [0. 0. 0. 0. 7.]]
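
Two other ways to build the same matrix, as a sketch. `diag_int` and `diag_float` are illustrative names.

```python
import numpy as np

# np.diag places a 1D array on the diagonal of a new 2D array (integer dtype here).
diag_int = np.diag(np.full(5, 7))

# Scaling the identity matrix gives the float version.
diag_float = 7 * np.eye(5)

print(diag_int.dtype, diag_float.dtype)
print(np.array_equal(diag_int, diag_float))  # True: same values, different dtypes
```

The `np.zeros` plus `np.fill_diagonal` approach modifies an array in place, while these two create the result in a single expression.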


## Exercise 12
- Create two 1D arrays of size 10, generated from random uniform distribution on [0, 1].
- Compute their dot product.

In [41]:
# TODO: Your code here!
first = np.random.uniform(0, 1, 10)
second = np.random.uniform(0, 1, 10)
print(first @ second)

2.7778733883944047
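
The printed value changes on every run because the arrays are random. A sketch of a reproducible variant using NumPy's `Generator` interface follows; the seed value 42 is arbitrary and the variable names mirror the cell above.

```python
import numpy as np

# A seeded Generator produces the same stream of numbers on every run.
rng = np.random.default_rng(seed=42)

# Samples come from the half-open interval [0, 1).
first = rng.uniform(0, 1, size=10)
second = rng.uniform(0, 1, size=10)

dot = first @ second
print(dot)

# For 1D arrays, @, np.dot, and the explicit sum of products agree.
print(np.isclose(dot, np.dot(first, second)), np.isclose(dot, (first * second).sum()))
```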


## Exercise 13
Given the following array...
- Extract all odd numbers from 'arr'.
- Create a new array where all odd numbers in 'arr' are -1 and the rest are the original numbers from arr. Be sure not to change arr (print it out at the end to make sure it hasn't changed)!

In [45]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
# TODO: Your code here!
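odds = arr[arr % 2 == 1]
print(odds)

# np.where builds a new array, so arr itself is left untouched.
new_arr = np.where(arr % 2 == 1, -1, arr)
print(new_arr)
print(arr)

[ 1  3  5  7  9 11 13 15]
[-1  2 -1  4 -1  6 -1  8 -1 10 -1 12 -1 14 -1]
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]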
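
Another way to get the same result is boolean-mask assignment on an explicit copy. This is only a sketch. The key step is `.copy()`, without which the assignment would overwrite `arr`.

```python
import numpy as np

arr = np.arange(1, 16)  # 1, 2, ..., 15

# Work on a copy so the original array is preserved.
new_arr = arr.copy()
new_arr[new_arr % 2 == 1] = -1

print(new_arr)
print(arr)  # still 1 through 15
```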


## Exercise 14
- Create a list from 0 to 100,000.
- Write a Python loop to calculate the sum of these numbers.
- Convert the list to a NumPy array.
- Compute the sum of the array using a NumPy function.
- Use %timeit to compare the speed of these two approaches.

In [53]:
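# TODO: Your code here!
my_list = list(range(100000))
my_array = np.array(my_list)


def list_sum(my_list):
    # Built-in sum() stands in for an explicit Python loop here.
    return sum(my_list)


def array_sum(my_array):
    # ndarray.sum() runs the reduction in compiled code.
    return my_array.sum()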

In [54]:
%timeit list_sum(my_list)

320 μs ± 1.37 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [55]:
%timeit array_sum(my_array)

12.1 μs ± 30.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
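
For completeness, here is a sketch of the explicit Python loop that the exercise describes, timed with the standard-library `timeit` module so it also runs outside a notebook. `loop_sum` is an illustrative name, and `dtype=np.int64` is set explicitly as a precaution against integer overflow on platforms where the default integer is 32-bit.

```python
import timeit

import numpy as np

my_list = list(range(100000))
my_array = np.array(my_list, dtype=np.int64)


def loop_sum(values):
    """Sum the values with an explicit Python for loop."""
    total = 0
    for value in values:
        total += value
    return total


# Both approaches must agree before the timings mean anything.
assert loop_sum(my_list) == int(my_array.sum()) == 4999950000

# timeit.timeit returns the total time in seconds for `number` calls.
loop_time = timeit.timeit(lambda: loop_sum(my_list), number=20)
numpy_time = timeit.timeit(lambda: my_array.sum(), number=20)

print(f'loop:  {loop_time / 20 * 1e6:.1f} μs per call')
print(f'numpy: {numpy_time / 20 * 1e6:.1f} μs per call')
```

Absolute timings depend on the machine, so no expected numbers are given here. The explicit loop is generally slower than both the built-in `sum` and the NumPy reduction.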
