# Coding Temple's Data Analytics Program  
---
# Advanced Python - Intro to `numpy`
---



## Part 1: Working with `numpy`


### 1.1 Importing `numpy`

We've already used the `numpy` package by importing it and assigning it the standard alias of `np`. Do this again in the following cell - the more you practice typing these lines of code, the easier it will be to remember.

In [1]:
# Import numpy and assign it the standard alias
# YOUR CODE HERE
import numpy as np

### 1.1 Solution - Run this cell to check your answer in 1.1. Please do not edit the values in this cell!

In [2]:
# DO NOT EDIT THIS CELL
assert np.__name__ == 'numpy', 'Make sure that you have properly imported numpy and aliased it as np!'

### 1.2 Generate random numbers

Create a `(5,3)` `numpy` array of random integer values between 0 and 100.

Use `numpy`'s `random` module to generate these integers (for example, `np.random.randint()`). Name your new variable `myarray`. You should also print the array to check its dimensions and values.

In [3]:
# Generate your random numbers
np.random.seed(1) #Seed generated for reproducibility

#YOUR CODE HERE
myarray = np.random.randint(0,100, (5,3))

# Print out the array
myarray

array([[37, 12, 72],
       [ 9, 75,  5],
       [79, 64, 16],
       [ 1, 76, 71],
       [ 6, 25, 50]])

### 1.2 Solution - Run the following cell to check your answer.

In [4]:
#DO NOT EDIT THIS CELL

#Verify the array was created with the correct name and has the proper shape
assert myarray.shape == (5,3), 'Make sure you create an array with the proper shape!'
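
As an aside, recent NumPy versions also offer the `Generator` interface created by `np.random.default_rng()`. The sketch below is an alternative to the `np.random.seed()` / `np.random.randint()` approach used above, not a replacement for it; the names `rng` and `alt_array` are made up for illustration. The two interfaces use different underlying generators, so the same seed will not reproduce the values shown above.

```python
import numpy as np

# A Generator with its own seed; it does not touch NumPy's global random state
rng = np.random.default_rng(1)

# Integers from 0 (inclusive) up to 100 (exclusive), arranged in 5 rows and 3 columns
alt_array = rng.integers(0, 100, size=(5, 3))

print(alt_array.shape)
```

In both interfaces the upper bound is exclusive, so 100 itself is never drawn.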

### 1.3 Calculate BMI 

Using the two lists provided, calculate the BMI (body mass index) of each individual using NDArrays. Save your results in a variable named `bmi`.

The formula for BMI in pounds and inches can be defined as: $BMI= \frac{703 * weight} {(height)^2}$

In [11]:
height = [55, 120, 90, 100]
weight = [170, 180, 190, 200]
np_height = np.array(height)
np_weight = np.array(weight)

In [12]:
bmi = (703* np_weight)/(np_height)**2
bmi

array([39.50743802,  8.7875    , 16.49012346, 14.06      ])

### 1.3 Solution: Run the following cell to check your answer.

In [13]:
assert 'bmi' in dir() , 'Make sure you have saved your results to the proper variable name!'
assert type(bmi) == np.ndarray, 'Make sure that you made the calculation using an NDArray for both height and weight!'
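
To see what the array arithmetic in 1.3 is doing, the sketch below compares it with a plain-Python loop that applies the formula to one person at a time. The names `bmi_loop` and `bmi_vectorized` are made up for this comparison; the input lists are the ones given in the exercise.

```python
import numpy as np

height = [55, 120, 90, 100]
weight = [170, 180, 190, 200]

# Plain-Python version: apply the formula to one (weight, height) pair at a time
bmi_loop = [703 * w / h ** 2 for w, h in zip(weight, height)]

# NumPy version: the same formula applied to whole arrays at once
bmi_vectorized = 703 * np.array(weight) / np.array(height) ** 2

print(np.allclose(bmi_loop, bmi_vectorized))  # True
```

The conversion to arrays matters: dividing one plain list by another raises a `TypeError`, which is why the exercise asks for NDArrays.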

### 1.4 Create a function 

Create a function named `my_func` that takes two parameters and creates a random matrix based on them. Extra: Add parameters that let the user choose the shape and data type of the matrix.

In [24]:
def my_func(ar_min, ar_max, shape=(3, 3), dtype=int):
    # Integers: ar_min is inclusive, ar_max is exclusive
    if dtype == int:
        return np.random.randint(ar_min, ar_max, shape)
    # Any other dtype: floats drawn uniformly from [ar_min, ar_max)
    return np.random.uniform(ar_min, ar_max, shape)
    
print(f'Matrix result: \n{my_func(0,100)} \n')
print(f'Matrix result: \n{my_func(0,100, (9,5),float)} \n')
print(f'Matrix result: \n{my_func(0,1, (9,5),float)} \n')
    

Matrix result: 
[[22 38 41]
 [74 77 70]
 [25 48 50]] 

Matrix result: 
[[ 9.53078106 14.32630062 93.10129333 57.65842164 83.96452138]
 [62.32921128 32.45174964 72.80116434 52.27366296 73.68114658]
 [16.54061148 68.70582924 42.681084   72.85573976 75.63353922]
 [39.76112824 92.52001089 20.3510321   0.80027107 92.63507082]
 [29.45115453 16.69515938  2.41016265 45.20016084 80.83387458]
 [36.8376854  60.92069891  3.48477228 35.45772294  7.85199637]
 [69.31851991  1.27126651 45.95452841 96.13172623 33.41852204]
 [47.20834209 10.53912515 50.30759061 88.56898607 53.43773391]
 [28.14767654 35.45846908 89.62802505 24.1489141   2.38840945]] 

Matrix result: 
[[0.9657268  0.4299679  0.34628852 0.57706763 0.12652616]
 [0.95004331 0.31361083 0.95283112 0.21839323 0.2487002 ]
 [0.86380179 0.23526833 0.81510006 0.54389724 0.19126027]
 [0.58883994 0.04813679 0.01587298 0.04971102 0.39941529]
 [0.57690396 0.86751746 0.78616224 0.25470275 0.08225491]
 [0.157756   0.20949288 0.4173828  0.34769002 0.69988
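
`my_func` above treats only the built-in `int` as an integer type, so a request such as `dtype=np.int32` would fall through to the float branch. One possible variation is sketched below, under the assumption that `np.issubdtype` is an acceptable way to classify the requested type. The name `random_matrix` is hypothetical and not part of the exercise.

```python
import numpy as np

def random_matrix(low, high, shape=(3, 3), dtype=int):
    """Return a random matrix with values in [low, high) and the requested dtype."""
    if np.issubdtype(dtype, np.integer):
        # Any integer type (int, np.int32, ...) gets whole numbers
        return np.random.randint(low, high, size=shape, dtype=dtype)
    # Everything else is drawn as floats, then converted to the requested type
    return np.random.uniform(low, high, size=shape).astype(dtype)

ints = random_matrix(0, 100, dtype=np.int32)
floats = random_matrix(0, 1, shape=(9, 5), dtype=float)
print(ints.dtype, floats.dtype)
```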

### 1.5 Array practice

Time for some more practice. Complete each of these tasks in its own code cell below:

1.  Return the first row
2.  Return the last column
3.  Return the third column values from the 4th and 5th rows
4.  Multiply every value in the array by 2
5.  Divide every value by 3
6.  Increase the values in the first row by 12
7. Calculate the mean of the first column
8. Calculate the median of the array _after_ removing the 2 smallest values in the array
9. Calculate the standard deviation of the first 3 rows
10. Return values greater than 25 in the second column
11. Return values less than 40 in the array

In [26]:
M = my_func(0,100,(9,5))
M

array([[28, 75,  5, 81,  5],
       [42, 86, 52, 57, 56],
       [78, 87, 81, 10, 72],
       [48, 19, 12, 25, 77],
       [16,  4, 88, 27, 50],
       [68,  3, 58, 12,  2],
       [76, 96, 96, 61, 15],
       [74, 12, 18, 30, 59],
       [ 5, 16, 95, 96, 60]])

In [27]:
# 1. Return the first row:
M[0]

array([28, 75,  5, 81,  5])

In [34]:
# 2. Return the last column
M[:,-1]

array([ 5, 56, 72, 77, 50,  2, 15, 59, 60])

In [37]:
# 3. Return the third column values from the 4th and 5th rows

M[3:5,2]

array([12, 88])
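
The indexing in tasks 1 to 3 relies on zero-based positions and half-open slices: `3:5` covers positions 3 and 4, which are the 4th and 5th rows. A small sketch on a made-up array (`demo` is not part of the exercise) makes the mapping easier to see:

```python
import numpy as np

# Hypothetical 5x3 array holding the values 0..14, one row per line
demo = np.arange(15).reshape(5, 3)

first_row = demo[0]          # row at position 0
last_column = demo[:, -1]    # every row, last column
picked = demo[3:5, 2]        # rows at positions 3 and 4, column at position 2

print(first_row)    # [0 1 2]
print(last_column)  # [ 2  5  8 11 14]
print(picked)       # [11 14]
```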

In [38]:
# 4. Multiply every value in the array by 2
print(M*2)

[[ 56 150  10 162  10]
 [ 84 172 104 114 112]
 [156 174 162  20 144]
 [ 96  38  24  50 154]
 [ 32   8 176  54 100]
 [136   6 116  24   4]
 [152 192 192 122  30]
 [148  24  36  60 118]
 [ 10  32 190 192 120]]


In [39]:
# 5. Divide every value by 3
print(M/3)

[[ 9.33333333 25.          1.66666667 27.          1.66666667]
 [14.         28.66666667 17.33333333 19.         18.66666667]
 [26.         29.         27.          3.33333333 24.        ]
 [16.          6.33333333  4.          8.33333333 25.66666667]
 [ 5.33333333  1.33333333 29.33333333  9.         16.66666667]
 [22.66666667  1.         19.33333333  4.          0.66666667]
 [25.33333333 32.         32.         20.33333333  5.        ]
 [24.66666667  4.          6.         10.         19.66666667]
 [ 1.66666667  5.33333333 31.66666667 32.         20.        ]]


In [41]:
# Fix: restore the first row after mistakenly setting all of its values to 12
M[0]=[28, 75,  5, 81,  5]
print(M)

[[28 75  5 81  5]
 [42 86 52 57 56]
 [78 87 81 10 72]
 [48 19 12 25 77]
 [16  4 88 27 50]
 [68  3 58 12  2]
 [76 96 96 61 15]
 [74 12 18 30 59]
 [ 5 16 95 96 60]]


In [43]:
# 6. Increase the values in the first row by 12
M[0] += 12
print(M)

[[40 87 17 93 17]
 [42 86 52 57 56]
 [78 87 81 10 72]
 [48 19 12 25 77]
 [16  4 88 27 50]
 [68  3 58 12  2]
 [76 96 96 61 15]
 [74 12 18 30 59]
 [ 5 16 95 96 60]]
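
Note the difference between computing a shifted row and storing it. As a sketch on a made-up array (`demo` is hypothetical): `demo[0] + 12` builds a new array and leaves `demo` alone, while `demo[0] += 12` writes the result back into the first row.

```python
import numpy as np

demo = np.array([[1, 2, 3],
                 [4, 5, 6]])

shifted = demo[0] + 12   # new array; demo is unchanged
print(demo[0])           # [1 2 3]

demo[0] += 12            # in-place update of the first row
print(demo[0])           # [13 14 15]
```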


In [46]:
# 7. Calculate the mean of the first column
print(f'First column: {M[:,0]}')
np.mean(M[:,0])

First column: [40 42 78 48 16 68 76 74  5]


49.666666666666664

In [59]:
# 8. Calculate the median of the array after removing the 2 smallest values in the array
# Sort a flattened copy (axis=None flattens), so the original matrix M is left untouched
sorted_values = np.sort(M, axis=None)
# Drop the two smallest values, then take the median of what remains
filtered_values = sorted_values[2:]
np.median(filtered_values)

56.0
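
To sanity-check a "drop the two smallest values, then take the median" calculation, it can help to try it on an array small enough to verify by hand. The sketch below uses a made-up array (`demo` is hypothetical) and shows two options: a full sort, and `np.partition`, which only guarantees that the two smallest values end up in the first two positions.

```python
import numpy as np

demo = np.array([[7, 1, 9],
                 [3, 5, 2]])

# Option 1: sort the flattened values, then slice off the first two
kept_sorted = np.sort(demo, axis=None)[2:]

# Option 2: partition so positions 0 and 1 hold the two smallest values
kept_partitioned = np.partition(demo, 2, axis=None)[2:]

print(kept_sorted)                  # [3 5 7 9]
print(np.median(kept_sorted))       # 6.0
print(np.median(kept_partitioned))  # 6.0
```

Comparing a flattened array against an array of a different length (for example with `!=`) does not remove individual values, so slicing a sorted copy is the safer route.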

In [62]:
# Generate a new array to work on
np.random.seed(2) # New seed for new array
N = my_func(0,50,(9,5))
print(f'New Matrix: \n {N}')


New Matrix: 
 [[40 15 45  8 22]
 [43 18 11 40  7]
 [34 49 31 11 21]
 [47 31 26 20 37]
 [39  3 38  4 42]
 [43 39 38 42 33]
 [ 3  5 24  4 46]
 [ 6 31 19 31  2]
 [16 46 12  4 26]]


In [65]:
# 9. Calculate the standard deviation of the first 3 rows
print(f'First 3 rows: \n{N[0:3]}')
print(f'Standard deviation of them: {np.std(N[0:3])}')

First 3 rows: 
[[40 15 45  8 22]
 [43 18 11 40  7]
 [34 49 31 11 21]]
Standard deviation of them: 14.187631546135135
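
By default `np.std` computes the population standard deviation over every element it is given, so the result above is a single number for all 15 values rather than one value per row. The sketch below, on a made-up array (`demo` is hypothetical), spells out that calculation and shows the `ddof=1` option for the sample standard deviation.

```python
import numpy as np

demo = np.array([[2, 4, 4, 4],
                 [5, 5, 7, 9]])

# Population standard deviation written out: sqrt(mean((x - mean)^2))
manual = np.sqrt(np.mean((demo - demo.mean()) ** 2))

print(manual)                 # 2.0
print(np.std(demo))           # 2.0
print(np.std(demo, ddof=1))   # sample version, divides by n - 1
print(np.std(demo, axis=1))   # one value per row
```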


In [69]:
# 10. Return values in the second column greater than 25
mask = N[:,1] > 25
print(f'{N[:,1][mask]}')

[49 31 39 31 46]


In [71]:
# 11. Return values < 40 in the array
mask2 = N < 40
print(f'{N[mask2]}')

[15  8 22 18 11  7 34 31 11 21 31 26 20 37 39  3 38  4 39 38 33  3  5 24
  4  6 31 19 31  2 16 12  4 26]
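
Boolean masks behave differently depending on their shape. In the sketch below (the array `demo` is made up), a 1-D mask built from one column can either pick values from that column or select whole rows, while a mask with the same shape as the array returns the matching values as a flat 1-D array.

```python
import numpy as np

demo = np.array([[10, 30],
                 [50, 20],
                 [40, 60]])

column_mask = demo[:, 1] > 25      # [ True False  True]

print(demo[:, 1][column_mask])     # values from the second column: [30 60]
print(demo[column_mask])           # entire rows where the mask is True
print(demo[demo < 40])             # [10 30 20]
```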


### 1.5 Solution: Run the following cell to view the solution for each of the above tasks.

A new array will be generated to demonstrate the solution, so the values will not match your array, but the code for each task still applies.

In [None]:
# DO NOT EDIT THIS CELL
# SOLUTION 1.5

# SOLUTION: Make sure you have completed all of the above tasks
# Generate your random numbers (NEW ARRAY)
myarray = np.random.randint(100, size=(5, 3))
print('The generated array: \n', myarray)
print('\n')

# 1. Return the first row:
print('1. The first row: ', myarray[0])
print('\n')

# 2. Return the last column
print('2. The last column: ', myarray[:,-1])
print('\n')

# 3. Return the third column values from the 4th and 5th rows
print('3. The 3rd column, 4th & 5th rows: ', myarray[3:5,2])
print('\n')

# 4. Multiply every value in the array by 2
# (operates on the original array)
print('4. Multiply by 2: \n', myarray * 2)
print('\n')

# 5. Divide every value by 3
# (operates on the original array)
print('5. Divide by 3: \n', myarray / 3)
print('\n')

# 6. Increase the values in the first row by 12
# (operates on the original array)
print('6. Add 12 to the first row: \n', myarray[0,:] + 12)
print('\n')

# 7. Calculate the mean of the first column
print('7. The mean of the 1st column: ', myarray[:,0].mean())
print('\n')

# 8. Calculate the median of the array after removing the 2 smallest values in the array
# flatten and sort (axis=None does the flattening)
myarray = np.sort(myarray, axis=None)
# remove two smallest values
myarray = myarray[2:]
# calculate the median
print('8. The median after removing the 2 smallest values: ', np.median(myarray))
print('\n')

# 9. Calculate the standard deviation of the first 3 rows
# Generate new array first:
myarray = np.random.randint(100, size=(5, 3))
# Then calculate the std:
print('9. The standard deviation is: ', np.std(myarray[0:3,:]))
print('\n')

# 10. Return values in the second column greater than 25
# create a Boolean mask where values in the 2nd column > 25 are True
condition = myarray[:,1] > 25
# Apply the mask to the 2nd column only (myarray[condition] would return entire rows)
print('10. All values in 2nd column > 25: \n', myarray[condition, 1])
print('\n')

# 11. Return values < 40 in the array
# create another Boolean mask for values < 40
condition = myarray < 40
# apply the mask
print('11. All values < 40: \n', myarray[condition])