# Coding Temple's Data Analytics Program  
---
# Advanced Python - Intro to `numpy`
---



## Part 1: Working with `numpy`


### 1.1 Importing `numpy`

We've already used the `numpy` package by importing it and assigning it the standard alias of `np`. Do this again in the following cell - the more you practice typing these lines of code, the easier it will be to remember.

In [1]:
import numpy as np

### 1.1 Solution - Run this cell to check your answer in 1.1. Please do not edit the values in this cell!

In [2]:
# DO NOT EDIT THIS CELL
assert np.__name__ == 'numpy', 'Make sure that you have properly imported numpy and aliased it as np!'

### 1.2 Generate random numbers

Create a `(5,3)` `numpy` array of random integer values between 0 and 100.

Use the `randint()` function from `numpy`'s `random` module to generate these integers. Name your new variable `myarray`. You should also print the array to check its dimensions and values.

In [5]:
# Generate your random numbers
np.random.seed(1)  # Set the seed for reproducibility
myarray = np.random.randint(0, 100, (5, 3))

print(myarray)



[[37 12 72]
 [ 9 75  5]
 [79 64 16]
 [ 1 76 71]
 [ 6 25 50]]
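
One detail worth keeping in mind: `np.random.randint(low, high, size)` draws from the half-open interval `[low, high)`, so `high` itself is never returned. The short sketch below checks this on a larger sample. As an aside, it also shows the newer `Generator` interface (`np.random.default_rng`), which this notebook does not otherwise use; the names `sample`, `rng`, and `rng_array` are illustrative only.

```python
import numpy as np

np.random.seed(1)  # fixed seed so the sample is repeatable
sample = np.random.randint(0, 100, (1000, 3))

# The upper bound is exclusive: every value falls in 0..99
print(sample.min() >= 0, sample.max() <= 99)

# Newer interface: a Generator object that carries its own seed
rng = np.random.default_rng(1)
rng_array = rng.integers(0, 100, size=(5, 3))  # 100 is excluded by default here too
print(rng_array.shape)
```

The two interfaces produce different numbers for the same seed, so pick one and use it consistently within a notebook.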


### 1.2 Solution - Run the following cell to check your answer.

In [6]:
#DO NOT EDIT THIS CELL

#Verify the array was created with the correct name and has the proper shape
assert myarray.shape == (5,3), 'Make sure you create an array with the proper shape!'

### 1.3 Calculate BMI 

Using the two lists provided, calculate the BMI (body mass index) of each individual using NDArrays. Save your results in a variable named `bmi`.

The formula for BMI in pounds and inches is: $BMI = \frac{703 \times weight}{height^2}$

In [7]:
height = [55, 120, 90, 100]
weight = [170, 180, 190, 200]

bmi = np.round(703 * np.array(weight) / np.array(height)**2, 2)
bmi

array([39.51,  8.79, 16.49, 14.06])
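
The calculation above works because arithmetic on NDArrays is applied element by element, which plain Python lists do not support. The sketch below compares the two approaches on the same data; the names `bmi_loop` and `bmi_vec` are illustrative and are not part of the exercise.

```python
import numpy as np

height = [55, 120, 90, 100]
weight = [170, 180, 190, 200]

# Plain lists do not support elementwise arithmetic
try:
    height ** 2
except TypeError as err:
    print("Lists cannot be squared directly:", err)

# Plain Python: apply the formula one (weight, height) pair at a time
bmi_loop = [round(703 * w / h**2, 2) for w, h in zip(weight, height)]

# NumPy: apply the same formula to whole arrays at once
bmi_vec = np.round(703 * np.array(weight) / np.array(height) ** 2, 2)

print(bmi_loop)
print(bmi_vec)
print(np.allclose(bmi_loop, bmi_vec))
```

Both versions give the same values; the array version is shorter and avoids the explicit loop.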

### 1.3 Solution: Run the following cell to check your answer.

In [8]:
assert 'bmi' in dir() , 'Make sure you have saved your results to the proper variable name!'
assert type(bmi) == np.ndarray, 'Make sure that you made the calculation using an NDArray for both height and weight!'

### 1.4 Create a function 

Create a function named `my_func` that takes in two parameters and creates a random matrix based on those parameters. Extra: Accept additional parameters that let the user choose the shape and data type of the matrix.

In [15]:
def my_func(start, stop, shape):
    """Return a random integer array of the given shape with values in [start, stop)."""
    return np.random.randint(start, stop, shape)


my_new_array = my_func(0,100, (5,3))
my_new_array



array([[30, 71,  3],
       [70, 21, 49],
       [57,  3, 68],
       [24, 43, 76],
       [26, 52, 80]])
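
The "Extra" part of the task also asks for a data type option. One possible way to add it is sketched below. The function name `random_matrix`, the default shape, and the choice to convert with `astype` are assumptions made for illustration, not the required solution.

```python
import numpy as np

def random_matrix(start, stop, shape=(5, 3), dtype=int):
    """Return an array of the given shape with random values in [start, stop).

    Hypothetical variant of my_func: shape and dtype are optional extras.
    """
    # Draw random integers first, then convert to the requested data type
    values = np.random.randint(start, stop, size=shape)
    return values.astype(dtype)

np.random.seed(3)  # arbitrary seed, used only for repeatability
ints = random_matrix(0, 100)
floats = random_matrix(0, 10, shape=(2, 4), dtype=float)

print(ints.shape, ints.dtype)
print(floats.shape, floats.dtype)
```

Giving `shape` and `dtype` default values keeps the two-parameter call from the main task working unchanged.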

### 1.5 Array practice

Time for some more practice. Complete each of these tasks in its own code cell below:

1.  Return the first row
2.  Return the last column
3.  Return the third column values from the 4th and 5th rows
4.  Multiply every value in the array by 2
5.  Divide every value by 3
6.  Increase the values in the first row by 12
7. Calculate the mean of the first column
8. Calculate the median of the array _after_ removing the 2 smallest values in the array
9. Calculate the standard deviation of the first 3 rows
10. Return values greater than 25 in the second column
11. Return values less than 40 in the array

In [14]:
# 1. Return the first row:
my_new_array[0]


array([30, 71,  3])

In [23]:
# 2. Return the last column
my_new_array[:,-1]


array([ 3, 49, 68, 76, 80])

In [24]:
# 3. Return the third column values from the 4th and 5th rows
my_new_array[3:5,2]


array([76, 80])

In [25]:
# 4. Multiply every value in the array by 2
my_new_array*2

array([[ 60, 142,   6],
       [140,  42,  98],
       [114,   6, 136],
       [ 48,  86, 152],
       [ 52, 104, 160]])

In [26]:
# 5. Divide every value by 3
my_new_array/3

array([[10.        , 23.66666667,  1.        ],
       [23.33333333,  7.        , 16.33333333],
       [19.        ,  1.        , 22.66666667],
       [ 8.        , 14.33333333, 25.33333333],
       [ 8.66666667, 17.33333333, 26.66666667]])

In [32]:
# 6. Increase the values in the first row by 12
my_new_array[:1]+12

array([[42, 83, 15]])
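
Note that an expression such as `row + 12` returns a new array and leaves the source array untouched. The small sketch below, using a made-up array named `demo`, shows the difference between that and an in-place update, and also why `[:1]` gives a 2-D result while `[0]` gives a 1-D one.

```python
import numpy as np

demo = np.array([[1, 2, 3],
                 [4, 5, 6]])

# Arithmetic on a row returns a new array; demo is unchanged
shifted = demo[0] + 12
print(shifted)   # [13 14 15]
print(demo[0])   # [1 2 3]

# Augmented assignment changes demo itself
demo[0] += 12
print(demo[0])   # [13 14 15]

# Indexing with 0 gives a 1-D row; slicing with :1 keeps two dimensions
print(demo[0].shape, demo[:1].shape)   # (3,) (1, 3)
```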

In [44]:
# 7. Calculate the mean of the first column
my_new_array[:,0].mean()

41.4
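
The mean of a column is a single number: the sum of its values divided by how many there are. The sketch below, on a small made-up array named `demo`, shows `mean()` applied to one column and how the `axis` argument controls whether the mean is taken per column, per row, or over everything.

```python
import numpy as np

demo = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(demo[:, 0].mean())   # mean of the first column: 2.5
print(demo.mean(axis=0))   # one mean per column: [2.5 3.5 4.5]
print(demo.mean(axis=1))   # one mean per row: [2. 5.]
print(demo.mean())         # mean of all values: 3.5
```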

In [45]:
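# 8. Calculate the median of the array after removing the 2 smallest values in the array
# Flatten and sort all values (axis=None flattens before sorting)
sorted_values = np.sort(my_new_array, axis=None)
# Drop the two smallest values, then take the median
print(np.median(sorted_values[2:]))
# Median of the full array, for comparison
print(np.median(my_new_array))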

52.0
49.0
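
To remove the smallest values from the whole array, the values first need to be sorted as one flat sequence. By default `np.sort` sorts within each row instead, which is easy to miss. The sketch below illustrates the difference on a made-up array named `demo`.

```python
import numpy as np

demo = np.array([[9, 1, 5],
                 [2, 8, 0]])

# Default: each row is sorted separately, and the result stays 2-D
print(np.sort(demo))

# axis=None: flatten first, then sort all values together
flat_sorted = np.sort(demo, axis=None)
print(flat_sorted)         # [0 1 2 5 8 9]

# Slicing off the first two entries removes the two smallest values
trimmed = flat_sorted[2:]
print(trimmed)             # [2 5 8 9]
print(np.median(trimmed))  # 6.5
```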


In [47]:
# 9. Calculate the standard deviation of the first 3 rows
# Generate a new array to work on
np.random.seed(2) # New seed for new array
new_array = my_func(6,45, (5,3))
std_first_three = np.std(new_array[0:3, :])
print(std_first_three)
print(new_array)

9.226906440900752
[[21 14 28]
 [24 17 13]
 [40 37 17]
 [27 37 32]
 [26 43  9]]
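
By default `np.std` computes the population standard deviation (it divides by `N`). If the sample standard deviation is wanted instead (dividing by `N - 1`), pass `ddof=1`. The sketch below uses a small made-up data set named `values` to show both, along with the population formula written out by hand.

```python
import numpy as np

values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# Population standard deviation (the default, ddof=0)
print(np.std(values))           # 2.0

# The same result written out: square root of the mean squared deviation
by_hand = np.sqrt(np.mean((values - values.mean()) ** 2))
print(by_hand)                  # 2.0

# Sample standard deviation divides by N - 1 instead of N
print(np.std(values, ddof=1))   # roughly 2.138
```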


In [50]:
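# 10. Return values in the second column greater than 25
# Boolean mask: True for rows whose second-column value is > 25
mask = new_array[:, 1] > 25

# Applying the mask returns every row where the condition holds
print(new_array[mask])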

[[40 37 17]
 [27 37 32]
 [26 43  9]]
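
A Boolean mask built from one column selects whole rows when it is applied to the 2-D array. If only the matching second-column values are wanted, the column index can be added to the selection. The sketch below rebuilds the array printed in task 9 as a literal so that it runs on its own; the name `mask` is illustrative.

```python
import numpy as np

new_array = np.array([[21, 14, 28],
                      [24, 17, 13],
                      [40, 37, 17],
                      [27, 37, 32],
                      [26, 43,  9]])

# One True/False entry per row, based on the second column
mask = new_array[:, 1] > 25
print(mask)                 # [False False  True  True  True]

# Mask alone: every row where the condition holds
print(new_array[mask])

# Mask plus column index: only the matching second-column values
print(new_array[mask, 1])   # [37 37 43]
```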


In [51]:
# 11. Return values < 40 in the array
below_40 = new_array < 40
print(new_array[below_40])

[21 14 28 24 17 13 37 17 27 37 32 26  9]
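
When a mask has the same shape as the whole array, the result is always a flat 1-D array of the matching values, read row by row. Masks can also be counted and combined. The sketch below again rebuilds the task 9 array as a literal; the combined condition is an extra illustration and is not one of the listed tasks.

```python
import numpy as np

new_array = np.array([[21, 14, 28],
                      [24, 17, 13],
                      [40, 37, 17],
                      [27, 37, 32],
                      [26, 43,  9]])

below_40 = new_array < 40

# True counts as 1, so summing a mask counts the matches
print(below_40.sum())   # 13

# Combine conditions with & (and) or | (or); the parentheses are required
in_range = (new_array > 20) & (new_array < 40)
print(new_array[in_range])   # [21 28 24 37 27 37 32 26]
```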


### 1.5 Solution: Run the following cell to view the solution for each of the above tasks.

A new array is generated to demonstrate the solution, so its values will differ from those in your array, but the code for each task still applies.

In [52]:
# DO NOT EDIT THIS CELL
# SOLUTION 1.5

# SOLUTION: Make sure you have completed all of the above tasks
# Generate your random numbers (NEW ARRAY)
myarray = np.random.randint(100, size=(5, 3))
print('The generated array: \n', myarray)
print('\n')

# 1. Return the first row:
print('1. The first row: ', myarray[0])
print('\n')

# 2. Return the last column
print('2. The last column: ', myarray[:,-1])
print('\n')

# 3. Return the third column values from the 4th and 5th rows
print('3. The 3rd column, 4th & 5th rows: ', myarray[3:5,2])
print('\n')

# 4. Multiply every value in the array by 2
# (computed from the original array, which is left unchanged)
print('4. Multiply by 2: \n', myarray * 2)
print('\n')

# 5. Divide every value by 3
# (computed from the original array, which is left unchanged)
print('5. Divide by 3: \n', myarray / 3)
print('\n')

# 6. Increase the values in the first row by 12
# (computed from the original array, which is left unchanged)
print('6. Add 12 to the first row: \n', myarray[0,:] + 12)
print('\n')

# 7. Calculate the mean of the first column
print('7. The mean of the 1st column: ', myarray[:,0].mean())
print('\n')

# 8. Calculate the median of the array after removing the 2 smallest values in the array
# flatten and sort (axis=None does the flattening)
myarray = np.sort(myarray, axis=None)
# remove two smallest values
myarray = myarray[2:]
# calculate the median
print('8. The median after removing the 2 smallest values: ', np.median(myarray))
print('\n')

# 9. Calculate the standard deviation of the first 3 rows
# Generate new array first:
myarray = np.random.randint(100, size=(5, 3))
# Then calculate the std:
print('9. The standard deviation is: ', np.std(myarray[0:3,:]))
print('\n')

# 10. Return values in the second column greater than 25
# create a Boolean mask where values in the 2nd column > 25 are True
condition = myarray[:,1] > 25
# Apply the mask
print('10. All values in 2nd column > 25: \n', myarray[condition])
print('\n')

# 11. Return values < 40 in the array
# create another Boolean mask for values < 40
condition = myarray < 40
# apply the mask
print('11. All values < 40: \n', myarray[condition])

The generated array: 
 [[ 4 42 51]
 [38 33 58]
 [67 69 88]
 [68 46 70]
 [95 83 31]]


1. The first row:  [ 4 42 51]


2. The last column:  [51 58 88 70 31]


3. The 3rd column, 4th & 5th rows:  [70 31]


4. Multiply by 2: 
 [[  8  84 102]
 [ 76  66 116]
 [134 138 176]
 [136  92 140]
 [190 166  62]]


5. Divide by 3: 
 [[ 1.33333333 14.         17.        ]
 [12.66666667 11.         19.33333333]
 [22.33333333 23.         29.33333333]
 [22.66666667 15.33333333 23.33333333]
 [31.66666667 27.66666667 10.33333333]]


6. Add 12 to the first row: 
 [16 54 63]


7. The mean of the 1st column:  54.4


8. The median after removing the 2 smallest values:  67.0


9. The standard deviation is:  24.077549606671532


10. All values in 2nd column > 25: 
 [[66 80 52]
 [76 50  4]
 [90 63 79]
 [49 39 46]
 [ 8 50 15]]


11. All values < 40: 
 [ 4 39  8 15]
