# Getting Started with NumPy - Lab

## Introduction

Now that we have introduced NumPy, let's put it into practice. In this lab, you will create arrays, perform operations on them, and return new arrays, all using the NumPy library. Let's get started!

## Objectives

You will be able to: 

- Instantiate a numpy array with specified values 
- Use broadcasting to perform a math operation on an entire numpy array 


## Import `NumPy` under the standard alias

In [1]:
# Import numpy using the standard alias
import numpy as np

## Generate some mock data

Create a NumPy array in each of the following ways:

1. Using a range
2. Using a Python list

Below, create a list in Python that has 5 elements (e.g. [0,1,2,3,4]) and assign it to the variable `py_list`.

Next, do the same, but instead of a list, create a range with 5 elements and assign it to the variable `py_range`.

Finally, use the list and range to create NumPy arrays and assign the array from list to the variable `array_from_list`, and the array from the range to the variable `array_from_range`.

In [3]:
# Your code here
py_list = [0,1,2,3,4]
py_range = range(0,5)
array_from_list = np.array(py_list)
array_from_range = np.array(py_range)

array_from_arange = np.arange(0,5) # NumPy also has a built-in function, arange, that builds the array directly

array_from_list, array_from_range, array_from_arange

(array([0, 1, 2, 3, 4]), array([0, 1, 2, 3, 4]), array([0, 1, 2, 3, 4]))
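As a quick sanity check, the three construction routes can be compared directly. This is a minimal sketch; the variable names `from_range`, `from_list`, and `from_arange` are illustrative and not part of the lab.

```python
import numpy as np

# range(5) yields 0..4; np.array consumes it into an integer array
from_range = np.array(range(5))
from_list = np.array([0, 1, 2, 3, 4])
from_arange = np.arange(5)

# np.array_equal checks that shape and values match
print(np.array_equal(from_range, from_list))
print(np.array_equal(from_list, from_arange))
```

If either comparison prints `False`, the range was probably not built with a call such as `range(0, 5)`.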

Next, we have lists of heights and weights that we'd like to use to create a collection of BMIs. However, they are in inches and pounds (imperial units), respectively.

Let's use what we know to create NumPy arrays with the metric equivalent values (height in meters & weight in kg).

> **Remember:** *NumPy can make these calculations a lot easier and with less code than a list!*

> 1.0 inch = 0.0254 meters

> 2.2046 lbs = 1 kilogram

In [4]:
# Use the conversion rate for turning height in inches to meters
list_height_inches = [65, 68, 73, 75, 78]

# Your code here
array_height_inches = np.array(list_height_inches)
array_height_meters = array_height_inches * 0.0254
array_height_meters

array([1.651 , 1.7272, 1.8542, 1.905 , 1.9812])

In [11]:
# Use the conversion rate for turning weight in pounds to kilograms
list_weight_pounds = [150, 140, 220, 205, 265]

# Your code here
array_weight_pounds = np.array(list_weight_pounds)
array_weight_kg = array_weight_pounds / 2.2046 # a single slash (/) is true division and returns floats; a double slash (//) is floor division
array_weight_kg

array([ 68.03955366,  63.50358342,  99.79134537,  92.98739   ,
       120.20321147])
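The difference between `/` and `//` is easy to mix up, so here is a small sketch on a sample array. The name `pounds` and the divisor `2` are only illustrative.

```python
import numpy as np

pounds = np.array([150, 140, 220])

true_div = pounds / 2    # true division: the result is always a float array
floor_div = pounds // 2  # floor division: integer inputs give an integer array

print(true_div, true_div.dtype)
print(floor_div, floor_div.dtype)
```

For unit conversions like pounds to kilograms, true division (`/`) is the one you want, since floor division would discard the fractional part.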

The metric formula for calculating BMI is as follows:

> BMI = weight (kg) ÷ height^2 (m^2)

So, to get BMI we divide weight by the squared value of height. For example, if I weighed 130kg and was 1.9 meters tall, the calculation would look like:

> BMI = 130 / (1.9*1.9)
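The worked example above can be checked in plain Python before moving to arrays. This is a minimal sketch; `weight_kg`, `height_m`, and `bmi` are illustrative names.

```python
weight_kg = 130
height_m = 1.9

# ** binds tighter than /, so this is weight / (height squared)
bmi = weight_kg / height_m ** 2

print(round(bmi, 2))  # 36.01
```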

Use the BMI calculation to create a NumPy array of BMIs

In [13]:
# Your code here
BMI_array = array_weight_kg / array_height_meters**2
# BMI_array = array_weight_kg / (array_height_meters*array_height_meters)  # an equivalent way to square the heights
BMI_array

array([24.9613063 , 21.28692715, 29.02550097, 25.62324316, 30.62382485])

In [18]:
BMI_array2 = array_weight_kg / (array_height_meters*2) # wrong: this doubles the heights instead of squaring them
BMI_array2

array([20.60555835, 18.38339029, 26.90954195, 24.40613911, 30.3359609 ])
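To see why `*2` and `**2` give different results, it helps to apply both to a tiny array. This sketch uses made-up values in an array called `values`.

```python
import numpy as np

values = np.array([1.5, 3.0])

squared = values ** 2  # element-wise power: [2.25, 9.0]
doubled = values * 2   # element-wise multiplication: [3.0, 6.0]

print(squared)
print(doubled)
```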

## Create a vector of ones the same size as your BMI vector using `np.ones()`

In [19]:
# Your code here
identity = np.ones(shape=BMI_array.shape)
# identity = np.ones(len(BMI_array))  # an equivalent alternative for a 1-D array
identity

array([1., 1., 1., 1., 1.])

## Multiply the BMI_array by your vector of ones
The resulting product should have the same values as your original BMI numpy array.

In [20]:
# Your code here
BMI_array*identity

array([24.9613063 , 21.28692715, 29.02550097, 25.62324316, 30.62382485])
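Rather than comparing the printed values by eye, the claim that multiplying by ones leaves the array unchanged can be checked programmatically. This is a sketch with a short stand-in array `bmis`; `np.ones_like` is used here as an alternative to `np.ones(shape=...)`.

```python
import numpy as np

bmis = np.array([24.96, 21.29, 29.03])

# ones_like builds an array of ones with the same shape and dtype as bmis
ones = np.ones_like(bmis)
product = bmis * ones

print(np.array_equal(product, bmis))
```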

## Level Up: Using NumPy to Parse a File
The Pandas library that we've been using is built on top of NumPy; all columns/series in a Pandas DataFrame are built using NumPy arrays. To get a better idea of how a built-in function like `pd.read_csv()` works, we'll try to recreate it here!

In [21]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows)) 
# After using readlines, we must reopen the file
f = open('bp.txt') 
# The file has values separated by tabs; we read the first line and check its length
n_cols = (len(f.readline().split('\t'))) 

f = open('bp.txt')

# Your code here
# Pseudocode outline below
#1) Create a matrix of zeros that is the same size of the file
#2) Iterate through the file: "for line in f:" Hint: using enumerate will also be required
    #3) Update each row of the matrix with the new stream of data
    #Hint: skip the first row (it's just column names, not the data.)
#4) Preview your results; you should now have a NumPy matrix with the data from the file


The file has 21 lines.
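The cell above reopens `bp.txt` after `readlines()` because reading leaves the file position at the end of the file. As a hedged alternative, the sketch below rewinds with `seek(0)` inside a `with` block, which also closes the file automatically. It writes a small hypothetical stand-in file (three lines, three columns) to a temporary path instead of using the real `bp.txt`.

```python
import os
import tempfile

# Hypothetical stand-in for bp.txt: tab-separated, header on the first line
sample = "Pt\tBP\tAge\n1\t105\t47\n2\t115\t49\n"

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

with open(path) as f:
    n_rows = len(f.readlines())  # consumes the whole file
    f.seek(0)                    # rewind to the start instead of reopening
    n_cols = len(f.readline().rstrip("\n").split("\t"))

os.remove(path)  # clean up the temporary file
print(n_rows, n_cols)
```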


### For better understanding, the cell is run separately for each of steps 1-4

In [27]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows)) 
# After using readlines, we must reopen the file
f = open('bp.txt') 
# The file has values separated by tabs; we read the first line and check its length
n_cols = (len(f.readline().split('\t'))) 

f = open('bp.txt')

# Your code here
# Pseudocode outline below
#1) Create a matrix of zeros that is the same size of the file

zeros = np.zeros(shape=(n_rows, n_cols)) ### According to solution matrix = np.zeros([n_rows, n_cols]) 
zeros




#2) Iterate through the file: "for line in f:" Hint: using enumerate will also be required
    #3) Update each row of the matrix with the new stream of data
    #Hint: skip the first row (it's just column names, not the data.)
#4) Preview your results; you should now have a NumPy matrix with the data from the file

The file has 21 lines.


array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])

In [30]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows)) 
# After using readlines, we must reopen the file
f = open('bp.txt') 
# The file has values separated by tabs; we read the first line and check its length
n_cols = (len(f.readline().split('\t'))) 

f = open('bp.txt')

# Your code here
# Pseudocode outline below
#1) Create a matrix of zeros that is the same size of the file

zeros = np.zeros(shape=(n_rows, n_cols)) ### According to solution matrix = np.zeros([n_rows, n_cols]) 




#2) Iterate through the file: "for line in f:" Hint: using enumerate will also be required

for index, line in enumerate(f): ### According to solution for n, line in enumerate(f):
    print(index, line)
    
    
    #3) Update each row of the matrix with the new stream of data
    #Hint: skip the first row (it's just column names, not the data.)
#4) Preview your results; you should now have a NumPy matrix with the data from the file

The file has 21 lines.
0 Pt	BP	Age	Weight	BSA	Dur	Pulse	Stress

1 1	105	47	85.4	1.75	5.1	63	33

2 2	115	49	94.2	2.10	3.8	70	14

3 3	116	49	95.3	1.98	8.2	72	10

4 4	117	50	94.7	2.01	5.8	73	99

5 5	112	51	89.4	1.89	7.0	72	95

6 6	121	48	99.5	2.25	9.3	71	10

7 7	121	49	99.8	2.25	2.5	69	42

8 8	110	47	90.9	1.90	6.2	66	8

9 9	110	49	89.2	1.83	7.1	69	62

10 10	114	48	92.7	2.07	5.6	64	35

11 11	114	47	94.4	2.07	5.3	74	90

12 12	115	49	94.1	1.98	5.6	71	21

13 13	114	50	91.6	2.05	10.2	68	47

14 14	106	45	87.1	1.92	5.6	67	80

15 15	125	52	101.3	2.19	10.0	76	98

16 16	114	46	94.5	1.98	7.4	69	95

17 17	106	46	87.0	1.87	3.6	62	18

18 18	113	46	94.5	1.90	4.3	70	12

19 19	110	48	90.5	1.88	9.0	71	99

20 20	122	56	95.7	2.09	7.0	75	99



In [36]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows)) 
# After using readlines, we must reopen the file
f = open('bp.txt') 
# The file has values separated by tabs; we read the first line and check its length
n_cols = (len(f.readline().split('\t'))) 

f = open('bp.txt')

# Your code here
# Pseudocode outline below
#1) Create a matrix of zeros that is the same size of the file

zeros = np.zeros(shape=(n_rows, n_cols)) ### According to solution matrix = np.zeros([n_rows, n_cols]) 




#2) Iterate through the file: "for line in f:" Hint: using enumerate will also be required

for index, line in enumerate(f): ### According to solution for n, line in enumerate(f):
    
    
    
    #3) Update each row of the matrix with the new stream of data
    
    
    if index == 0:
        continue
    data = line
    print(type(data))
    print(data.split('\t'))#split tab

    
    
    #Hint: skip the first row (it's just column names, not the data.)
#4) Preview your results; you should now have a NumPy matrix with the data from the file

The file has 21 lines.
<class 'str'>
['1', '105', '47', '85.4', '1.75', '5.1', '63', '33\n']
<class 'str'>
['2', '115', '49', '94.2', '2.10', '3.8', '70', '14\n']
<class 'str'>
['3', '116', '49', '95.3', '1.98', '8.2', '72', '10\n']
<class 'str'>
['4', '117', '50', '94.7', '2.01', '5.8', '73', '99\n']
<class 'str'>
['5', '112', '51', '89.4', '1.89', '7.0', '72', '95\n']
<class 'str'>
['6', '121', '48', '99.5', '2.25', '9.3', '71', '10\n']
<class 'str'>
['7', '121', '49', '99.8', '2.25', '2.5', '69', '42\n']
<class 'str'>
['8', '110', '47', '90.9', '1.90', '6.2', '66', '8\n']
<class 'str'>
['9', '110', '49', '89.2', '1.83', '7.1', '69', '62\n']
<class 'str'>
['10', '114', '48', '92.7', '2.07', '5.6', '64', '35\n']
<class 'str'>
['11', '114', '47', '94.4', '2.07', '5.3', '74', '90\n']
<class 'str'>
['12', '115', '49', '94.1', '1.98', '5.6', '71', '21\n']
<class 'str'>
['13', '114', '50', '91.6', '2.05', '10.2', '68', '47\n']
<class 'str'>
['14', '106', '45', '87.1', '1.92', '5.6', '67', 

In [39]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows)) 
# After using readlines, we must reopen the file
f = open('bp.txt') 
# The file has values separated by tabs; we read the first line and check its length
n_cols = (len(f.readline().split('\t'))) 

f = open('bp.txt')

# Your code here
# Pseudocode outline below
#1) Create a matrix of zeros that is the same size of the file

zeros = np.zeros(shape=(n_rows, n_cols)) ### According to solution matrix = np.zeros([n_rows, n_cols]) 




#2) Iterate through the file: "for line in f:" Hint: using enumerate will also be required

for index, line in enumerate(f): ### According to solution for n, line in enumerate(f):
    
    
    
    #3) Update each row of the matrix with the new stream of data
    
     #Hint: skip the first row (it's just column names, not the data.)
    if index == 0:
        continue
    data = line
    
    # print(type(data))
    
     #print(data.split('\t'))#split tab
        
        
    # strip('\n') removes the trailing newline before splitting on tabs;
    # without it, the first row ends with '33\n': ['1', '105', '47', '85.4', '1.75', '5.1', '63', '33\n']
    print(data.strip('\n').split('\t'))
   

    
    
    
#4) Preview your results; you should now have a NumPy matrix with the data from the file
   

The file has 21 lines.
['1', '105', '47', '85.4', '1.75', '5.1', '63', '33']
['2', '115', '49', '94.2', '2.10', '3.8', '70', '14']
['3', '116', '49', '95.3', '1.98', '8.2', '72', '10']
['4', '117', '50', '94.7', '2.01', '5.8', '73', '99']
['5', '112', '51', '89.4', '1.89', '7.0', '72', '95']
['6', '121', '48', '99.5', '2.25', '9.3', '71', '10']
['7', '121', '49', '99.8', '2.25', '2.5', '69', '42']
['8', '110', '47', '90.9', '1.90', '6.2', '66', '8']
['9', '110', '49', '89.2', '1.83', '7.1', '69', '62']
['10', '114', '48', '92.7', '2.07', '5.6', '64', '35']
['11', '114', '47', '94.4', '2.07', '5.3', '74', '90']
['12', '115', '49', '94.1', '1.98', '5.6', '71', '21']
['13', '114', '50', '91.6', '2.05', '10.2', '68', '47']
['14', '106', '45', '87.1', '1.92', '5.6', '67', '80']
['15', '125', '52', '101.3', '2.19', '10.0', '76', '98']
['16', '114', '46', '94.5', '1.98', '7.4', '69', '95']
['17', '106', '46', '87.0', '1.87', '3.6', '62', '18']
['18', '113', '46', '94.5', '1.90', '4.3', '70', 
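The split lists above still contain strings, so one more step is needed before they can go into a numeric matrix. The sketch below applies the whole chain to the first data line of the file; the variable names `fields`, `row`, and `direct` are illustrative.

```python
import numpy as np

# First data line of bp.txt, as shown in the printed output above
line = "1\t105\t47\t85.4\t1.75\t5.1\t63\t33\n"

fields = line.strip("\n").split("\t")    # list of strings, newline removed
row = np.array(fields).astype(float)     # string array converted to floats
direct = np.array(fields, dtype=float)   # same conversion in a single call

print(row)
print(np.array_equal(row, direct))
```

Either form works; passing `dtype=float` simply skips the intermediate string array.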

In [43]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows)) 
# After using readlines, we must reopen the file
f = open('bp.txt') 
# The file has values separated by tabs; we read the first line and check its length
n_cols = (len(f.readline().split('\t'))) 

f = open('bp.txt')

# Your code here
# Pseudocode outline below
#1) Create a matrix of zeros that is the same size of the file

zeros = np.zeros(shape=(n_rows, n_cols)) ### According to solution matrix = np.zeros([n_rows, n_cols]) 




#2) Iterate through the file: "for line in f:" Hint: using enumerate will also be required

for index, line in enumerate(f): ### According to solution for n, line in enumerate(f):
    
    
    
    #3) Update each row of the matrix with the new stream of data
    
     #Hint: skip the first row (it's just column names, not the data.)
    if index == 0:
        continue
    
    
    
    #converting our string data to a list of floats
    
    
    data = line
    
    # print(type(data))
    
     #print(data.split('\t'))#split tab
        
        
    row = np.array(data.strip('\n').split('\t'))
    row = row.astype(float)  # convert the strings to floats
    zeros[5] = row  # test: overwrite the 6th row (index 5) of zeros with this row
    break  # stop after the first data row
    
    
   





    #print(row)
    #for removing the new line section at the end of each list\
    # 1st row of the list ['1', '105', '47', '85.4', '1.75', '5.1', '63', '33\n']
   

    
    
    
#4) Preview your results; you should now have a NumPy matrix with the data from the file
   

The file has 21 lines.


In [44]:
zeros # the 6th row (index 5) of zeros now holds the first data row

array([[  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  1.  , 105.  ,  47.  ,  85.4 ,   1.75,   5.1 ,  63.  ,  33.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0.  ],
       [  0.  ,   0.  ,   0.  ,   0.  ,   0.  ,   0

In [47]:
# Open a text file (csv files are just plaintext separated by commas)
f = open('bp.txt')
n_rows = len(f.readlines())
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows))
# After using readlines, we must reopen the file
f = open('bp.txt')
# The file has values separated by tabs; we read the first line and check its length
n_cols = len(f.readline().split('\t'))

f = open('bp.txt')

# Your code here
#1) Create a matrix of zeros that is the same size as the file
# (n_rows counts the header line, so the last row of the matrix stays all zeros)
zeros = np.zeros(shape=(n_rows, n_cols)) # the solution uses: matrix = np.zeros([n_rows, n_cols])

#2) Iterate through the file; enumerate gives us the index of each line
for index, line in enumerate(f): # the solution uses: for n, line in enumerate(f)
    # Skip the first row (it's just column names, not the data)
    if index == 0:
        continue

    #3) Update each row of the matrix with the new stream of data
    # Strip the trailing newline, split on tabs, and convert the strings to floats
    row = np.array(line.strip('\n').split('\t'))
    row = row.astype(float)
    # The header was skipped, so file line `index` goes into matrix row `index - 1`
    zeros[index - 1] = row

f.close()

#4) Preview your results in the next cell; you should now have a NumPy matrix with the data from the file

The file has 21 lines.


In [48]:
zeros

array([[  1.  , 105.  ,  47.  ,  85.4 ,   1.75,   5.1 ,  63.  ,  33.  ],
       [  2.  , 115.  ,  49.  ,  94.2 ,   2.1 ,   3.8 ,  70.  ,  14.  ],
       [  3.  , 116.  ,  49.  ,  95.3 ,   1.98,   8.2 ,  72.  ,  10.  ],
       [  4.  , 117.  ,  50.  ,  94.7 ,   2.01,   5.8 ,  73.  ,  99.  ],
       [  5.  , 112.  ,  51.  ,  89.4 ,   1.89,   7.  ,  72.  ,  95.  ],
       [  6.  , 121.  ,  48.  ,  99.5 ,   2.25,   9.3 ,  71.  ,  10.  ],
       [  7.  , 121.  ,  49.  ,  99.8 ,   2.25,   2.5 ,  69.  ,  42.  ],
       [  8.  , 110.  ,  47.  ,  90.9 ,   1.9 ,   6.2 ,  66.  ,   8.  ],
       [  9.  , 110.  ,  49.  ,  89.2 ,   1.83,   7.1 ,  69.  ,  62.  ],
       [ 10.  , 114.  ,  48.  ,  92.7 ,   2.07,   5.6 ,  64.  ,  35.  ],
       [ 11.  , 114.  ,  47.  ,  94.4 ,   2.07,   5.3 ,  74.  ,  90.  ],
       [ 12.  , 115.  ,  49.  ,  94.1 ,   1.98,   5.6 ,  71.  ,  21.  ],
       [ 13.  , 114.  ,  50.  ,  91.6 ,   2.05,  10.2 ,  68.  ,  47.  ],
       [ 14.  , 106.  ,  45.  ,  87.1 ,   1.92,   5
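For comparison, NumPy ships a loader that does the same job in one call. The sketch below uses `np.loadtxt` on an in-memory copy of the header and the first three data rows of `bp.txt` (taken from the printed output earlier), so it runs without the file. With the real file, you would pass the path `'bp.txt'` in place of the `io.StringIO` object.

```python
import io

import numpy as np

# Header plus the first three data rows of bp.txt, tab-separated
sample = (
    "Pt\tBP\tAge\tWeight\tBSA\tDur\tPulse\tStress\n"
    "1\t105\t47\t85.4\t1.75\t5.1\t63\t33\n"
    "2\t115\t49\t94.2\t2.10\t3.8\t70\t14\n"
    "3\t116\t49\t95.3\t1.98\t8.2\t72\t10\n"
)

# skiprows=1 drops the header line; delimiter="\t" splits on tabs
data = np.loadtxt(io.StringIO(sample), delimiter="\t", skiprows=1)

print(data.shape)
print(data[0])
```

Because the header is skipped rather than counted, the result has exactly one row per data line and no trailing row of zeros.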

## Summary

In this lab, we practiced creating NumPy arrays from both lists and ranges. We then performed element-wise math operations, such as converting imperial measurements to metric, to create new arrays with new values. Finally, we combined the height and weight arrays element-wise to create a new array containing the BMIs.