# Coding Temple's Data Analytics Program  
---
# Advanced Python 1 - Intro to `numpy`
---



## Part 1: Working with `numpy`


### 1.1 Importing `numpy`

We've already used the `numpy` package by importing it and assigning it the standard alias of `np`. Do this again in the following cell - the more you practice typing these lines of code, the easier it will be to remember.

In [1]:
# Import numpy and assign it the standard alias
import numpy as np


### 1.1 Solution - Run this cell to check your answer in 1.1. Please do not edit the values in this cell!

In [3]:
# DO NOT EDIT THIS CELL
assert np.__name__ == 'numpy', 'Make sure that you have properly imported numpy and aliased it as np!'

### 1.2 Generate random numbers

Create a `(5,3)` `numpy` array of random integer values between 0 and 100.

Use `numpy`'s `random` module (for example, `np.random.randint()`) to generate these integers. Name your new variable `myarray`. You should also print the array to check its dimensions and values.

In [14]:
# Generate your random numbers

np.random.seed(1)

myarray = np.random.randint(0, 100, size=(5,3))

print(myarray)



[[37 12 72]
 [ 9 75  5]
 [79 64 16]
 [ 1 76 71]
 [ 6 25 50]]
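
NumPy also offers a newer `Generator` interface for random numbers. The following is a minimal sketch of the same task using `np.random.default_rng`; the names `rng` and `alt_array` are illustrative and not part of the exercise, and the values will differ from those produced with `np.random.seed(1)`.

```python
import numpy as np

# Create a seeded generator so the result is reproducible
rng = np.random.default_rng(1)

# integers() excludes the upper bound by default, like randint()
alt_array = rng.integers(0, 100, size=(5, 3))

print(alt_array.shape)  # (5, 3)
print(alt_array)
```

Either interface satisfies the exercise as long as the result is a `(5,3)` array of integers.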


### 1.2 Solution - Run the following cell to check your answer.

In [15]:
# DO NOT EDIT THIS CELL

# Verify the array was created with the correct name and has the proper shape
assert myarray.shape == (5,3), 'Make sure you create an array with the proper shape!'

### 1.3 Calculate BMI 

Using the two lists provided, calculate the BMI (body mass index) of each individual using NDArrays. Save your results in a variable named `bmi`.

The formula for BMI in pounds and inches can be defined as: $BMI = \frac{703 \times weight}{height^2}$

In [20]:
weight = [150, 170, 200, 210, 180]
height = [65, 70, 72, 75, 68]

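# Convert the lists to NDArrays (weights are in pounds, heights in inches)
weight_lb = np.array(weight)
height_in = np.array(height)

# Element-wise BMI calculation
bmi = 703 * weight_lb / (height_in ** 2)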

print(bmi)


[24.95857988 24.38979592 27.12191358 26.24533333 27.36591696]
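
To see why NDArrays are convenient here, the sketch below compares a plain-Python loop with the vectorized calculation. The names `bmi_loop` and `bmi_vectorized` are illustrative only; the exercise itself only requires `bmi`.

```python
import numpy as np

weight = [150, 170, 200, 210, 180]  # pounds
height = [65, 70, 72, 75, 68]       # inches

# Plain-Python version: one person at a time
bmi_loop = [703 * w / h ** 2 for w, h in zip(weight, height)]

# NDArray version: the whole calculation in one expression
bmi_vectorized = 703 * np.array(weight) / np.array(height) ** 2

# Both approaches should agree to floating-point precision
print(np.allclose(bmi_loop, bmi_vectorized))
```

Dividing two Python lists directly would raise a `TypeError`, which is why the lists are converted to arrays first.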


### 1.3 Solution: Run the following cell to check your answer.

In [26]:
# DO NOT EDIT THIS CELL
assert 'bmi' in dir() , 'Make sure you have saved your results to the proper variable name!'
assert type(bmi) == np.ndarray, 'Make sure that you made the calculation using an NDArray for both height and weight!'


### 1.4 Create a function 

Create a function named `my_func` that takes two parameters and creates a random matrix based on them. Extra: accept additional parameters that let the user choose the shape and data type of the matrix.

In [24]:
import numpy as np

def my_func(start, stop, shape=(3,3), dtype=float):
    """Return a random matrix with values drawn uniformly from [start, stop)."""
    return np.random.uniform(start, stop, size=shape).astype(dtype)
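
A quick way to check a function like this is to call it with and without the optional parameters. The sketch below repeats the definition so it runs on its own; the names `default_matrix` and `int_matrix` are illustrative.

```python
import numpy as np

def my_func(start, stop, shape=(3, 3), dtype=float):
    # Same logic as the function in the cell above
    return np.random.uniform(start, stop, size=shape).astype(dtype)

# Default call: 3x3 matrix of floats in [0, 1)
default_matrix = my_func(0, 1)
print(default_matrix.shape, default_matrix.dtype)

# Custom shape and dtype: casting to int truncates the decimal part,
# so the values here fall in the range 0..9
int_matrix = my_func(0, 10, shape=(2, 4), dtype=int)
print(int_matrix)
```

Because `astype(int)` truncates rather than rounds, the upper bound `stop` never appears in an integer result.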


### 1.5 Array practice

Time for some more practice. Complete each of these tasks in its own code cell below:

1. Return the first row
2. Return the last column
3. Return the third column values from the 4th and 5th rows
4. Multiply every value in the array by 2
5. Divide every value by 3
6. Increase the values in the first row by 12
7. Calculate the mean of the first column
8. Calculate the median of the array _after_ removing the 2 smallest values in the array
9. Calculate the standard deviation of the first 3 rows
10. Return values greater than 25 in the second column
11. Return values less than 40 in the array
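
Before working on a random array, it can help to try the indexing patterns on an array whose values are easy to predict. This is a small sketch using a hypothetical array named `demo`; it is not part of the graded exercise.

```python
import numpy as np

# A small (5, 3) array with predictable values 0..14
demo = np.arange(15).reshape(5, 3)

first_row = demo[0]            # row index 0
last_column = demo[:, -1]      # every row, last column
corner = demo[3:5, 2]          # rows 4 and 5 (indices 3 and 4), third column
first_col_mean = demo[:, 0].mean()

print(first_row)       # [0 1 2]
print(last_column)     # [ 2  5  8 11 14]
print(corner)          # [11 14]
print(first_col_mean)  # 6.0
```

Note that "4th and 5th rows" translates to indices 3 and 4, because NumPy indexing starts at 0 and a slice excludes its stop index.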

In [30]:
print(myarray[0])

[43 32 26]


In [None]:
#Return the last column
print(myarray[:, -1])

In [None]:
print(myarray[3:5, 2])

In [16]:
print(myarray * 2)

[ 94  98 126 144 150 150 164 170 190]


In [None]:
print(myarray / 3)

In [19]:
myarray[0] += 12
print(myarray)

[75 72 75 75 82 85 95]
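
There are two ways to read "increase the values in the first row": compute a new result, or change the array itself. The sketch below, using a hypothetical array `demo`, shows the difference between the expression `demo[0] + 12` and the augmented assignment `demo[0] += 12`.

```python
import numpy as np

demo = np.array([[1, 2, 3],
                 [4, 5, 6]])

# An expression builds a new array; demo is left untouched
shifted = demo[0] + 12
print(shifted)  # [13 14 15]
print(demo[0])  # [1 2 3]

# Augmented assignment writes the result back into demo
demo[0] += 12
print(demo[0])  # [13 14 15]
```

Re-running a cell that uses `+=` adds 12 again each time, so the in-place form is sensitive to how often the cell is executed.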


In [29]:
print('Mean: ', myarray[:,0].mean())

The mean of the 1st column:  34.0


In [21]:
# Flatten and sort into a new variable so the 2-D myarray stays intact
sorted_values = np.sort(myarray, axis=None)
# Drop the two smallest values
trimmed = sorted_values[2:]
print('The median : ', np.median(trimmed))


8. The median after removing the 2 smallest values:  82.0
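
The flatten, sort, and trim steps are easier to follow on a tiny array. This is a minimal sketch with a hypothetical array `demo`; storing the intermediate results under new names leaves the original 2-D array available for later tasks.

```python
import numpy as np

demo = np.array([[5, 1],
                 [9, 3],
                 [7, 2]])

# axis=None flattens the array before sorting: [1 2 3 5 7 9]
sorted_values = np.sort(demo, axis=None)

# Drop the two smallest values: [3 5 7 9]
trimmed = sorted_values[2:]

print(np.median(trimmed))  # 6.0
print(demo.shape)          # (3, 2)
```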


In [23]:
# 9. Calculate the standard deviation of the first 3 rows
# Generate a new array to work on
np.random.seed(2) # New seed for new array
myarray = np.random.randint(100, size=(5, 3))
print('9. The STD is: ', np.std(myarray[0:3,:]))

9. The STD is:  25.81128091014125
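
By default, `np.std` computes the population standard deviation (it divides by N). The sketch below, on a hypothetical array `values`, shows how the `ddof` parameter switches to the sample standard deviation (dividing by N - 1), which may be what you want depending on the analysis.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# Default (ddof=0): divide by N, the population standard deviation
population_std = np.std(values)

# ddof=1: divide by N - 1, the sample standard deviation
sample_std = np.std(values, ddof=1)

print(population_std)
print(sample_std)
```

The sample value is always the larger of the two for the same data, and the gap shrinks as the number of values grows.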


In [24]:
# Return values in the second column greater than 25

mask = myarray[:, 1] > 25
result = myarray[mask, 1]
print(result)


[43 95 47]
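
A Boolean mask built from one column can be applied in two ways, and they return different things. The sketch below uses a hypothetical array `demo` to contrast selecting whole rows with selecting only the second-column values.

```python
import numpy as np

demo = np.array([[10, 30, 50],
                 [20, 10, 60],
                 [30, 40, 70]])

# True where the second-column value is greater than 25
mask = demo[:, 1] > 25

# Indexing rows only returns every column of the matching rows
print(demo[mask])
# Adding the column index returns just the second-column values
print(demo[mask, 1])  # [30 40]
```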


In [26]:
#Return values < 40 in the array
mask = myarray < 40
result = myarray[mask]
print(result)


[15 22  7 34]


### Solution 1.5: Run the following cell to view the solution for each of the above tasks.

A new array will be generated to demonstrate the solution, so the values will not match your array, but the code for each task still applies.

In [33]:
# DO NOT EDIT THIS CELL
# SOLUTION 1.5

# SOLUTION: Make sure you have completed all of the above tasks
# Generate your random numbers (NEW ARRAY)
myarray = np.random.randint(100, size=(5, 3))
print('The generated array: \n', myarray)
print('\n')

# 1. Return the first row:
print('1. The first row: ', myarray[0])
print('\n')

# 2. Return the last column
print('2. The last column: ', myarray[:,-1])
print('\n')

# 3. Return the third column values from the 4th and 5th rows
print('3. The 3rd column, 4th & 5th rows: ', myarray[3:5,2])
print('\n')

# 4. Multiply every value in the array by 2
# (computed from the original array; myarray itself is not modified)
print('4. Multiply by 2: \n', myarray * 2)
print('\n')

# 5. Divide every value by 3
# (computed from the original array; myarray itself is not modified)
print('5. Divide by 3: \n', myarray / 3)
print('\n')

# 6. Increase the values in the first row by 12
# (computed from the original array; myarray itself is not modified,
# use myarray[0] += 12 to change it in place)
print('6. Add 12 to the first row: \n', myarray[0,:] + 12)
print('\n')

# 7. Calculate the mean of the first column
print('7. The mean of the 1st column: ', myarray[:,0].mean())
print('\n')

# 8. Calculate the median of the array after removing the 2 smallest values in the array
# flatten and sort (axis=None does the flattening)
myarray = np.sort(myarray, axis=None)
# remove two smallest values
myarray = myarray[2:]
# calculate the median
print('8. The median after removing the 2 smallest values: ', np.median(myarray))
print('\n')

# 9. Calculate the standard deviation of the first 3 rows
# Generate new array first:
myarray = np.random.randint(100, size=(5, 3))
# Then calculate the std:
print('9. The standard deviation is: ', np.std(myarray[0:3,:]))
print('\n')

# 10. Return values in the second column greater than 25
# create a Boolean mask where values in the 2nd column > 25 are True
condition = myarray[:,1] > 25
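# Apply the mask (this returns every row whose 2nd-column value is > 25;
# use myarray[condition, 1] to get only the 2nd-column values)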
print('10. All values in 2nd column > 25: \n', myarray[condition])
print('\n')

# 11. Return values < 40 in the array
# create another Boolean mask for values < 40
condition = myarray < 40
# apply the mask
print('11. All values < 40: \n', myarray[condition])

The generated array: 
 [[96 49 40]
 [46 59 73]
 [78 94 95]
 [32 16 21]
 [43 58 98]]


1. The first row:  [96 49 40]


2. The last column:  [40 73 95 21 98]


3. The 3rd column, 4th & 5th rows:  [21 98]


4. Multiply by 2: 
 [[192  98  80]
 [ 92 118 146]
 [156 188 190]
 [ 64  32  42]
 [ 86 116 196]]


5. Divide by 3: 
 [[32.         16.33333333 13.33333333]
 [15.33333333 19.66666667 24.33333333]
 [26.         31.33333333 31.66666667]
 [10.66666667  5.33333333  7.        ]
 [14.33333333 19.33333333 32.66666667]]


6. Add 12 to the first row: 
 [108  61  52]


7. The mean of the 1st column:  59.0


8. The median after removing the 2 smallest values:  59.0


9. The standard deviation is:  29.856860162539398


10. All values in 2nd column > 25: 
 [[21 49 63]
 [83 92 51]
 [43 69 59]]


11. All values < 40: 
 [21  8 11  1]
