# Coding Temple's Data Analytics Program  
---
# Advanced Python 1 - Intro to `numpy`
---



## Part 1: Working with `numpy`


### 1.1 Importing `numpy`

We've already used the `numpy` package by importing it and assigning it the standard alias of `np`. Do this again in the following cell - the more you practice typing these lines of code, the easier it will be to remember.

In [1]:
import numpy as np

### 1.1 Solution - Run this cell to check your answer in 1.1. Please do not edit the values in this cell!

In [2]:
# DO NOT EDIT THIS CELL
assert np.__name__ == 'numpy', 'Make sure that you have properly imported numpy and aliased it as np!'

### 1.2 Generate random numbers

Create a `(5,3)` `numpy` array of random integer values between 0 and 100.

Use the `numpy.random` module (for example, `np.random.randint()`) to generate these integers. Name your new variable `myarray`. You should also print the array to check its dimensions and values.

In [17]:
# Generate your random numbers
# randint's upper bound is exclusive, so this draws integers from 0 to 99.
# Call np.random.seed(<int>) beforehand if the result needs to be reproducible.
myarray = np.random.randint(0, 100, (5, 3))

# Print out the array
print(myarray)

[[82 31 28]
 [94 74  8]
 [99 32  8]
 [84 77 50]
 [79 41 64]]
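
For a repeatable version of this exercise, one option is to seed a generator before drawing. The following is a minimal sketch, assuming NumPy's newer `Generator` interface is available; the seed value `1` and the name `demo_array` are illustrative choices, not part of the assignment.

```python
import numpy as np

# Assumption: seed 1 is arbitrary; any fixed integer makes the draw repeatable.
rng = np.random.default_rng(1)

# integers() excludes the upper bound by default, so values fall in 0..99.
demo_array = rng.integers(0, 100, size=(5, 3))

print(demo_array.shape)
print(demo_array)
```

Re-running the block with the same seed should print the same array each time, which makes it easier to compare results with someone else.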


### 1.2 Solution - Run the following cell to check your answer.

In [18]:
#DO NOT EDIT THIS CELL

#Verify the array was created with the correct name and has the proper shape
assert myarray.shape == (5,3), 'Make sure you create an array with the proper shape!'

### 1.3 Calculate BMI 

Using the two lists provided, calculate the BMI (body mass index) of each individual using NDArrays. Save the variable containing your results as `bmi`.

The formula for BMI in pounds and inches can be defined as: $BMI= \frac{703 * weight} {(height)^2}$

In [46]:
height = np.array([110,120,90,100])
weight = np.array([170,180,190,200])

bmi = weight*703/height**2
bmi

array([ 9.8768595 ,  8.7875    , 16.49012346, 14.06      ])
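
The same formula can be sketched on a second data set to show that the arithmetic is applied element by element. The measurements below are hypothetical values in inches and pounds, and the names `height_in`, `weight_lb`, and `bmi_demo` are illustrative only.

```python
import numpy as np

# Hypothetical measurements: heights in inches, weights in pounds.
height_in = np.array([60, 65, 70, 75])
weight_lb = np.array([120, 150, 170, 200])

# Arithmetic on ndarrays is element-wise, so no explicit loop is needed.
bmi_demo = 703 * weight_lb / height_in ** 2

print(bmi_demo.round(1))
```

Because both inputs are arrays of the same length, the result is an array with one BMI value per person.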

### 1.3 Solution: Run the following cell to check your answer.

In [47]:
# DO NOT EDIT THIS CELL
assert 'bmi' in dir() , 'Make sure you have saved your results to the proper variable name!'
assert type(bmi) == np.ndarray, 'Make sure that you made the calculation using an NDArray for both height and weight!'


### 1.4 Create a function 

Create a function named `my_func` that takes in two parameters and creates a random matrix based on those parameters. Extra: Accept additional parameters that allow the user to choose the shape and data type of the matrix.

In [62]:
# def my_func(lst1, lst2):
#    x = np.random.uniform(lst1, (lst2))
#    return x

# shape = (3,3)
# dt = (0, 10)

# print(my_func(dt, shape))


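def my_func(low, high, shape=(3, 3)):
  # low is the inclusive lower bound and high the exclusive upper bound
  np.random.seed(42)  # fixed seed, so every call returns the same matrix
  info = np.random.randint(low, high, shape)
  return info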

matrix = my_func(0,89)
print(matrix)


[[51 14 71]
 [60 20 82]
 [86 74 74]]
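
The "Extra" part of this task could be approached as in the sketch below. It is one possible design, not the expected answer: the function name `random_matrix` and the `seed` parameter are assumptions added for illustration, and the data type is limited to integer types because the values come from `Generator.integers`.

```python
import numpy as np

def random_matrix(low, high, shape=(3, 3), dtype=np.int64, seed=None):
    """Return a random integer matrix with values in [low, high)."""
    # seed=None gives a different matrix on each call; pass an int to repeat a result.
    rng = np.random.default_rng(seed)
    return rng.integers(low, high, size=shape, dtype=dtype)

small = random_matrix(0, 89, shape=(2, 4), dtype=np.int32, seed=42)
print(small.shape, small.dtype)
```

Keeping the seed as an optional argument, rather than fixing it inside the function, lets the caller decide whether the output should be repeatable.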


### 1.5 Array practice

Time for some more practice. Complete each of these tasks in its own code cell below:

1.  Return the first row
2.  Return the last column
3.  Return the third column values from the 4th and 5th rows
4.  Multiply every value in the array by 2
5.  Divide every value by 3
6.  Increase the values in the first row by 12
7. Calculate the mean of the first column
8. Calculate the median of the array _after_ removing the 2 smallest values in the array
9. Calculate the standard deviation of the first 3 rows
10. Return values greater than 25 in the second column
11. Return values less than 40 in the array

In [66]:
# 1. Return the first row:
myarray = np.random.randint(0,100, (10,5))
print(myarray)
myarray[0]


[[87 99 23  2 21]
 [52  1 87 29 37]
 [ 1 63 59 20 32]
 [75 57 21 88 48]
 [90 58 41 91 59]
 [79 14 61 61 46]
 [61 50 54 63  2]
 [50  6 20 72 38]
 [17  3 88 59 13]
 [ 8 89 52  1 83]]


array([87, 99, 23,  2, 21])

In [107]:
# 2. Return the last column
print(myarray)
myarray[:, -1]

[[31 90 20]
 [37 39 67]
 [ 4 42 51]
 [38 33 58]
 [67 69 88]]


array([20, 67, 51, 58, 88])

In [109]:

# 3. Return the third column values from the 4th and 5th rows
print(myarray)
myarray[3:5,2]

[[31 90 20]
 [37 39 67]
 [ 4 42 51]
 [38 33 58]
 [67 69 88]]


array([58, 88])
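
The indexing used in tasks 1 to 3 is easier to follow on an array whose values are predictable. The sketch below uses a hypothetical array named `grid`, built with `np.arange`, so each result can be checked by eye.

```python
import numpy as np

# Hypothetical 5x3 array holding 0..14, filled row by row.
grid = np.arange(15).reshape(5, 3)

first_row = grid[0]      # a single index selects a whole row
last_col = grid[:, -1]   # ':' keeps every row, -1 picks the last column
picked = grid[3:5, 2]    # rows at index 3 and 4 (the 4th and 5th), third column

print(first_row)
print(last_col)
print(picked)
```

Row and column positions are zero-based, and the end of a slice such as `3:5` is exclusive, which is why the 4th and 5th rows are written as `3:5`.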

In [110]:
# 4. Multiply every value in the array by 2
myarray * 2


array([[ 62, 180,  40],
       [ 74,  78, 134],
       [  8,  84, 102],
       [ 76,  66, 116],
       [134, 138, 176]])

In [111]:
# 5. Divide every value by 3
myarray / 3

array([[10.33333333, 30.        ,  6.66666667],
       [12.33333333, 13.        , 22.33333333],
       [ 1.33333333, 14.        , 17.        ],
       [12.66666667, 11.        , 19.33333333],
       [22.33333333, 23.        , 29.33333333]])

In [112]:
# 6. Increase the values in the first row by 12
myarray[0,:] += 12
myarray

array([[ 43, 102,  32],
       [ 37,  39,  67],
       [  4,  42,  51],
       [ 38,  33,  58],
       [ 67,  69,  88]])
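
Task 6 can be read two ways: compute the increased values, or change the array itself. The sketch below contrasts the two on a small hypothetical array named `scores`.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

# An expression builds a new array and leaves scores untouched.
bumped = scores[0, :] + 12
print(scores[0])   # still the original first row

# Augmented assignment writes the result back into scores.
scores[0, :] += 12
print(scores)
```

The in-place form affects every later cell that uses the same array, which matters when notebook cells are run out of order.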

In [113]:
# 7. Calculate the mean of the first column
print(myarray[:,0].mean())

37.8


In [114]:
# 8. Calculate the median of the array after removing the 2 smallest values in the array
flat_sorted = np.sort(myarray, axis=None) # axis=None flattens before sorting
trimmed = flat_sorted[2:] # drop the 2 smallest values
print(np.median(trimmed))

51.0
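
Task 8 has two steps: drop the two smallest values, then take the median of what remains. A minimal sketch on a small hypothetical array named `values` shows both steps:

```python
import numpy as np

values = np.array([[9, 1, 7],
                   [3, 8, 5]])

# axis=None flattens the array before sorting it in ascending order.
flat_sorted = np.sort(values, axis=None)

# Slicing from index 2 onward skips the two smallest values (1 and 3).
trimmed = flat_sorted[2:]

print(trimmed)
print(np.median(trimmed))
```

Note that a slice such as `flat_sorted[:2]` selects the two smallest values rather than removing them, and a slice has no effect unless its result is assigned to a name.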


In [115]:
# 9. Calculate the standard deviation of the first 3 rows
# Generate a new array to work on
newarray = np.random.randint(0,58, (7,4))
np.random.seed(2) # Seeds later random calls only; newarray above was drawn before this line
print(newarray)
print('\n')
print(np.std(newarray))
print('\n')
print(np.std(newarray[0:3,:])) # Standard deviation for the first 3 rows

[[ 4 46  6 31]
 [19 57 31  2]
 [16 52 46 12]
 [50  4 26 15]
 [49 39 46  8]
 [50 45 15 41]
 [45  8 55 17]]


18.29920679668261


18.893708535441693
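
By default `np.std` treats its input as a whole population (`ddof=0`) and, without an `axis` argument, pools every selected value into one number. The sketch below, using a hypothetical array named `data`, shows how those two options change the result for the first three rows.

```python
import numpy as np

# Hypothetical 5x4 array holding 0..19, filled row by row.
data = np.arange(20).reshape(5, 4)
first_three = data[0:3, :]   # rows at index 0, 1, and 2

pop_std = np.std(first_three)             # one value over all 12 numbers, ddof=0
sample_std = np.std(first_three, ddof=1)  # divides by n - 1 instead of n
per_row = np.std(first_three, axis=1)     # one value per row

print(pop_std, sample_std)
print(per_row)
```

Which variant is appropriate depends on whether the rows are the full set of interest or a sample from a larger one; the task wording does not say, so the default is a reasonable reading.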


In [116]:
# 10. Return values in the second column greater than 25
print(newarray)
print('\n')
newarray[newarray[:,1] > 25, 1] # Boolean mask keeps only the second-column values greater than 25

[[ 4 46  6 31]
 [19 57 31  2]
 [16 52 46 12]
 [50  4 26 15]
 [49 39 46  8]
 [50 45 15 41]
 [45  8 55 17]]




array([46, 57, 52, 39, 45])

In [117]:
# 11. Return values < 40 in the array
print(newarray[newarray < 40]) # Boolean mask keeps only the values below 40

[ 4  6 31 19 31  2 16 12  4 26 15 39  8 15  8 17]
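
Tasks 10 and 11 both rely on Boolean masks. A comparison such as `array > 25` only produces `True`/`False` values; the mask has to be used as an index to get the matching numbers back. The sketch below illustrates this on a small hypothetical array named `table`.

```python
import numpy as np

table = np.array([[10, 30, 50],
                  [20, 15, 45],
                  [60, 40,  5]])

mask = table[:, 1] > 25       # one True/False per row, based on the second column
col_values = table[mask, 1]   # only the second-column values that passed
rows = table[mask]            # the full rows whose second column passed
below_40 = table[table < 40]  # element-wise mask; the result is flattened to 1-D

print(mask)
print(col_values)
print(rows)
print(below_40)
```

Indexing with `table[mask]` returns whole rows, while `table[mask, 1]` returns just the second-column values, so the choice depends on how "return values" is interpreted.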


### Solution 1.5: Run the following cell to view the solution for each of the above tasks.

A new array will be generated to demonstrate the solution, so the values will not be the same as in your array, but the code for each task will still apply.

In [106]:
# DO NOT EDIT THIS CELL
# SOLUTION 1.5

# SOLUTION: Make sure you have completed all of the above tasks
# Generate your random numbers (NEW ARRAY)
myarray = np.random.randint(100, size=(5, 3))
print('The generated array: \n', myarray)
print('\n')

# 1. Return the first row:
print('1. The first row: ', myarray[0])
print('\n')

# 2. Return the last column
print('2. The last column: ', myarray[:,-1])
print('\n')

# 3. Return the third column values from the 4th and 5th rows
print('3. The 3rd column, 4th & 5th rows: ', myarray[3:5,2])
print('\n')

# 4. Multiply every value in the array by 2
# (operates on the original array)
print('4. Multiply by 2: \n', myarray * 2)
print('\n')

# 5. Divide every value by 3
# (operates on the original array)
print('5. Divide by 3: \n', myarray / 3)
print('\n')

# 6. Increase the values in the first row by 12
# (operates on the original array)
print('6. Add 12 to the first row: \n', myarray[0,:] + 12)
print('\n')

# 7. Calculate the mean of the first column
print('7. The mean of the 1st column: ', myarray[:,0].mean())
print('\n')

# 8. Calculate the median of the array after removing the 2 smallest values in the array
# flatten and sort (axis=None does the flattening)
myarray = np.sort(myarray, axis=None)
# remove two smallest values
myarray = myarray[2:]
# calculate the median
print('8. The median after removing the 2 smallest values: ', np.median(myarray))
print('\n')

# 9. Calculate the standard deviation of the first 3 rows
# Generate new array first:
myarray = np.random.randint(100, size=(5, 3))
# Then calculate the std:
print('9. The standard deviation is: ', np.std(myarray[0:3,:]))
print('\n')

# 10. Return values in the second column greater than 25
# create a Boolean mask where values in the 2nd column > 25 are True
condition = myarray[:,1] > 25
# Apply the mask
print('10. All values in 2nd column > 25: \n', myarray[condition])
print('\n')

# 11. Return values < 40 in the array
# create another Boolean mask for values < 40
condition = myarray < 40
# apply the mask
print('11. All values < 40: \n', myarray[condition])

The generated array: 
 [[40 15 72]
 [22 43 82]
 [75  7 34]
 [49 95 75]
 [85 47 63]]


1. The first row:  [40 15 72]


2. The last column:  [72 82 34 75 63]


3. The 3rd column, 4th & 5th rows:  [75 63]


4. Multiply by 2: 
 [[ 80  30 144]
 [ 44  86 164]
 [150  14  68]
 [ 98 190 150]
 [170  94 126]]


5. Divide by 3: 
 [[13.33333333  5.         24.        ]
 [ 7.33333333 14.33333333 27.33333333]
 [25.          2.33333333 11.33333333]
 [16.33333333 31.66666667 25.        ]
 [28.33333333 15.66666667 21.        ]]


6. Add 12 to the first row: 
 [52 27 84]


7. The mean of the 1st column:  54.2


8. The median after removing the 2 smallest values:  63.0


9. The standard deviation is:  23.786083699881697


10. All values in 2nd column > 25: 
 [[31 90 20]
 [37 39 67]
 [ 4 42 51]
 [38 33 58]
 [67 69 88]]


11. All values < 40: 
 [31 20 37 39  4 38 33]
