To use NumPy on a local machine or in an IDE, first install it with *pip3 install numpy*, then import it under the conventional alias np.

In [3]:
import numpy as np
# To make numpy functions available

In [7]:
array1 = np.array([2,4,5,6,8,10]) # Initializing a numpy array
print(f"This is your array: {array1}")
print(type(array1))

This is your array: [ 2  4  5  6  8 10]
<class 'numpy.ndarray'>


In [12]:
mark_list = [189,278,392,403,395,402, 450]
mark_array = np.array(mark_list) # Converting a list to a numpy array
print(mark_array)
print(type(mark_array))

[189 278 392 403 395 402 450]
<class 'numpy.ndarray'>


NumPy can also handle multi-dimensional arrays. The ndim attribute gives the number of dimensions.

In [58]:
numpy_arr = np.array([[1,2,3,4],
                      [5,6,7,8]])
print(numpy_arr)
numpy_arr.ndim

[[1 2 3 4]
 [5 6 7 8]]


2

NumPy arrays are often preferred to lists because NumPy makes it easy to perform an operation on a whole array at once, with the operation applied to each element.

In [20]:
array_of_ints = np.array([0,1,2,3,4,5,6,7,8,9])
print(array_of_ints + 3)
# Adding 3 to each element in the array

[ 3  4  5  6  7  8  9 10 11 12]


In [37]:
# For element-wise multiplication, both arrays need the same shape (or shapes that can be broadcast together)
# Otherwise, NumPy raises a ValueError
array_1 = np.array([10,15,20,25,45])
array_2 = np.array([30,14,21,19,2])
product_array = array_1 * array_2
print(product_array)
# type(product_array)

[300 210 420 475  90]
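
To make the shape rule concrete, here is a small sketch. The array names and values are made up for illustration. Element-wise multiplication works when the shapes match or when one operand can be broadcast (a single number, for example), while two arrays of different lengths raise a ValueError.

```python
import numpy as np

prices = np.array([10, 15, 20])      # hypothetical example data
quantities = np.array([2, 4, 6])

# Same shape: multiplied element by element
same_shape = prices * quantities     # [20, 60, 120]

# A single number is broadcast to every element
doubled = prices * 2                 # [20, 30, 40]

# Different lengths (3 vs 2) cannot be broadcast together
try:
    prices * np.array([1, 2])
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)

print(same_shape)
print(doubled)
print(mismatch_error)
```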


Now, let's imagine we have a list of temperatures that represent the average high temperature for each month of the year in NYC. Currently, all the temperatures in this list are in Fahrenheit. However, since NYC has such a large international presence and population, it would be great to have these numbers in Celsius as well. Without NumPy, we would have to loop over the list, convert each value to Celsius, and append the result to a new list. With NumPy, we can write the conversion formula once for the whole array, and it is applied to every element.

The formula for converting Fahrenheit to Celsius is below:

T(°C) = (T(°F) - 32) × 5/9

In [41]:
nyc_avg_temps_f = [39, 42, 50, 62, 72, 80, 85, 84, 76, 65, 54, 44]
np_nyc_avg_temps_f = np.array(nyc_avg_temps_f)
np_nyc_avg_temps_c = (np_nyc_avg_temps_f - 32) * (5/9)
print(np_nyc_avg_temps_c)

[ 3.88888889  5.55555556 10.         16.66666667 22.22222222 26.66666667
 29.44444444 28.88888889 24.44444444 18.33333333 12.22222222  6.66666667]
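
For comparison, here is a sketch of the loop-based approach described above next to the NumPy version. The names temps_c_list and temps_c_array are illustrative only. Both apply the same formula, so the results should agree.

```python
import numpy as np

nyc_avg_temps_f = [39, 42, 50, 62, 72, 80, 85, 84, 76, 65, 54, 44]

# Plain Python: visit each element, convert it, and collect the results in a new list
temps_c_list = []
for temp_f in nyc_avg_temps_f:
    temps_c_list.append((temp_f - 32) * 5 / 9)

# NumPy: the same formula applied to the whole array at once
temps_c_array = (np.array(nyc_avg_temps_f) - 32) * 5 / 9

# allclose tolerates tiny floating-point differences between the two approaches
print(np.allclose(temps_c_list, temps_c_array))
print(np.round(temps_c_array, 1))
```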


In [44]:
# Difference between list and array behaviour: * repeats a list, but multiplies each element of an array
list_1 = [1,2,3]
array_1 = np.array(list_1)
print(list_1 * 3)
print(array_1 * 3)

[1, 2, 3, 1, 2, 3, 1, 2, 3]
[3 6 9]


**Numpy Array**

In [47]:
x = np.array([30,45,65])
y = np.array([4,5,6])

In [62]:
np.add(x,y)
# Adds elementwise (30 + 4)(45 + 5)( 65 + 6)

array([34, 50, 71])

In [65]:
np.multiply(x,y)
# Multiplies elementwise (30 * 4)(45 * 5)(65 * 6)

array([120, 225, 390])

In [68]:
np.add(x,1)
# Adds 1 to each element

array([31, 46, 66])
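
As a quick check, the sketch below shows that these functions give the same results as the arithmetic operators used earlier, and that np.subtract and np.divide follow the same element-wise pattern. The variable names difference and quotient are illustrative.

```python
import numpy as np

x = np.array([30, 45, 65])
y = np.array([4, 5, 6])

# Each function has an operator that gives the same result
print(np.array_equal(np.add(x, y), x + y))
print(np.array_equal(np.multiply(x, y), x * y))

# Subtraction and division are element-wise too
difference = np.subtract(x, y)   # (30 - 4)(45 - 5)(65 - 6)
quotient = np.divide(x, y)       # (30 / 4)(45 / 5)(65 / 6), always floats
print(difference)
print(quotient)
```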

**Multidimensional Arrays**

In [71]:
y = np.array([[1, 2], [3, 4]])
print(type(y))
print(y)
print(y.shape)

<class 'numpy.ndarray'>
[[1 2]
 [3 4]]
(2, 2)


The shape attribute of a NumPy array describes the size of the array along each dimension.
It returns a tuple: for a 2D array, a shape of (2, 3) means the array has 2 rows and 3 columns.
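
A minimal sketch of reading the shape tuple, using a made-up 2x3 array called grid:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])   # hypothetical array with 2 rows and 3 columns

# shape is a plain tuple, so it can be unpacked
n_rows, n_cols = grid.shape
print(grid.shape)
print(n_rows, n_cols)

# The number of dimensions equals the length of the shape tuple
print(grid.ndim == len(grid.shape))
```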

In [72]:
y = np.array([[[1, 2],[3, 4],[5, 6]],
             [[1, 2],[3, 4],[5, 6]]
             ])
print(y.shape)
print(y)

(2, 3, 2)
[[[1 2]
  [3 4]
  [5 6]]

 [[1 2]
  [3 4]
  [5 6]]]


In [74]:
np.zeros(5)
# Creates a 1D array of 5 zeros (floats by default)

array([0., 0., 0., 0., 0.])

In [80]:
np.zeros([2,2])
# Creating a 2D; 2x2 matrix array of zeroes

array([[0., 0.],
       [0., 0.]])

In [81]:
np.zeros([3,5])
# 2D, 3x5 matrix array of zeroes

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [83]:
np.zeros([2,3,4])
# 3D, 2x3x4

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [85]:
np.ones([2,2])
# 2x2, 2D Array of ones

array([[1., 1.],
       [1., 1.]])

In [87]:
np.ones([3,4])
# 3 rows, 4 columns

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

The np.full() function creates an array of a given shape filled with a value of your choice.

In [90]:
np.full(5,3)
# 1D array of 5 elements full of 3s

array([3, 3, 3, 3, 3])

In [99]:
np.full([2,4], range(4))
# Creates a 2D, 2x4 array; range(4) is broadcast, so every row holds the values 0 to 3

array([[0, 1, 2, 3],
       [0, 1, 2, 3]])
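
The sketch below illustrates the two ways of filling shown above: a single fill value, and a sequence that is broadcast across the rows. The array names are made up for illustration.

```python
import numpy as np

# A single fill value is repeated everywhere; the dtype follows the value
sevens = np.full((2, 3), 7)
halves = np.full((2, 3), 0.5)

# A sequence is broadcast across the rows, so its length must match the last dimension
rows = np.full((2, 4), [10, 20, 30, 40])

print(sevens)
print(halves.dtype)
print(rows)
```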

**Numpy Array Subsetting**

Subsetting a NumPy array works like slicing a list, with one index or slice per dimension, separated by commas.

In [101]:
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(x.shape)
print(x)

(4, 3)
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [104]:
x[0]
# Retrieving the first row

array([1, 2, 3])

In [106]:
x[1:]
# Everything apart from the first row

array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [110]:
x[:,1]
# All rows, col 1

array([ 2,  5,  8, 11])

In [112]:
x[2:4,1:3]
# Rows 2 and 3, cols 1 and 2 (the stop index of a slice is excluded)

array([[ 8,  9],
       [11, 12]])
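
Here is a short sketch of a few more subsetting patterns on the same 4x3 array. The result names (block, element, and so on) are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# The stop index is excluded, just like list slicing: rows 2 and 3, columns 1 and 2
block = x[2:4, 1:3]

# A row index and a column index together return a single element
element = x[3, 0]      # row 3, column 0

# Negative indices count from the end
last_row = x[-1]       # the final row
last_col = x[:, -1]    # the final column of every row

print(block)
print(element)
print(last_row)
print(last_col)
```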

**3D slicing**

In [115]:
x = np.array([
              [[1,2,3], [4,5,6]],
              [[7,8,9], [10,11,12]]
             ])
print(x)
x.shape

[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]


(2, 2, 3)

In [118]:
x[:,:,-1]
# All blocks, all rows, last column

array([[ 3,  6],
       [ 9, 12]])
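
With three dimensions, each index position selects along one axis. The sketch below walks through a few combinations on the same array, where "block" is just an informal name for the first axis.

```python
import numpy as np

x = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

first_block = x[0]          # block 0, shape (2, 3)
one_row = x[1, 0]           # block 1, row 0
one_value = x[1, 0, 2]      # block 1, row 0, column 2
second_rows = x[:, 1, :]    # row 1 of every block
last_values = x[:, :, -1]   # last column of every row in every block

print(first_block.shape, one_row, one_value)
print(second_rows)
print(last_values)
```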

**Quiz**

Create a NumPy array for each of the following: 

1. Using a range
2. Using a Python list

Below, create a list in Python that has 5 elements (i.e. [0,1,2,3,4]) and assign it to the variable py_list.

Next, do the same, but instead of a list, create a range with 5 elements and assign it to the variable, py_range.

Finally, use the list and range to create NumPy arrays and assign the array from list to the variable array_from_list, and the array from the range to the variable array_from_range.

In [132]:
py_list = [0,1,2,3,4]
py_range = list(range(5))
# Creating a list from a range
array_from_list = np.array(py_list)
array_from_range = np.arange(5)
# Creating an array with np.arange(), NumPy's counterpart to range()

print(py_list)
print(py_range)
print(array_from_list)
print(array_from_range)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0 1 2 3 4]
[0 1 2 3 4]
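
As an alternative sketch, a range object can also be passed to np.array() directly, without converting it to a list first. The name array_from_arange is introduced here only for the comparison.

```python
import numpy as np

py_range = range(5)                      # a lazy range object, not a list
array_from_range = np.array(py_range)    # np.array accepts sequences, including a range
array_from_arange = np.arange(5)         # np.arange builds the array directly

print(py_range)
print(array_from_range)
print(np.array_equal(array_from_range, array_from_arange))
```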


Next, we have a list of heights and a list of weights, and we'd like to use them to create a collection of BMIs. However, they are in imperial units: the heights are in inches and the weights are in pounds.

Let's use what we know to create NumPy arrays with the metric equivalent values (height in meters & weight in kg).

Remember: NumPy can make these calculations a lot easier and with less code than a list!

1.0 inch = 0.0254 meters

2.2046 lbs = 1 kilogram

In [136]:
# Use the conversion rate for turning height in inches to meters
list_height_inches = [65, 68, 73, 75, 78]

np_height_inches = np.array(list_height_inches)
np_height_meters = np_height_inches * 0.0254
np_height_meters

array([1.651 , 1.7272, 1.8542, 1.905 , 1.9812])

In [139]:
# Use the conversion rate for turning weight in pounds to kilograms
# 2.2046 lbs = 1 kg, so the pounds are divided by 2.2046
list_weight_pounds = [150, 140, 220, 205, 265]

np_weight_pounds = np.array(list_weight_pounds)
np_weight_kg = np_weight_pounds / 2.2046
np_weight_kg

array([ 68.03955366,  63.50358342,  99.79134537,  92.98739   ,
       120.20321147])

The metric formula for calculating BMI is as follows:

BMI = weight (kg) / height^2 (m^2)

So, to get BMI we divide weight by the squared value of height. For example, if I weighed 130kg and was 1.9 meters tall, the calculation would look like:

BMI = 130 / (1.9*1.9)

Use the BMI calculation to create a NumPy array of BMIs

In [146]:
np_bmi = np_weight_kg/np.power(np_height_meters,2)
np_bmi

array([24.9613063 , 21.28692715, 29.02550097, 25.62324316, 30.62382485])

Create a vector of ones the same size as your BMI vector using np.ones()

In [148]:
identity = np.ones(5)
identity

array([1., 1., 1., 1., 1.])

Multiply the BMI array by your vector of ones.
The resulting product should have the same values as your original BMI NumPy array.

In [149]:
np_bmi * identity

array([24.9613063 , 21.28692715, 29.02550097, 25.62324316, 30.62382485])
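
The whole conversion can also be sketched as one small function. The helper name bmi_from_imperial and the constant names are hypothetical. The sketch relies on the conversion rates stated above: since 2.2046 lbs = 1 kg, pounds are divided by 2.2046 to get kilograms.

```python
import numpy as np

INCHES_TO_METERS = 0.0254   # 1 inch = 0.0254 meters
POUNDS_PER_KG = 2.2046      # 2.2046 lbs = 1 kilogram


def bmi_from_imperial(height_inches, weight_pounds):
    """Return BMI values for heights in inches and weights in pounds (hypothetical helper)."""
    height_m = np.asarray(height_inches) * INCHES_TO_METERS
    weight_kg = np.asarray(weight_pounds) / POUNDS_PER_KG   # pounds to kg is a division
    return weight_kg / height_m ** 2


bmi = bmi_from_imperial([65, 68, 73, 75, 78], [150, 140, 220, 205, 265])
print(np.round(bmi, 1))
```

For these inputs the values fall between roughly 21 and 31.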

**Level Up: Using NumPy to Parse a File**

The Pandas library that we've been using is built on top of NumPy; the columns/series in a Pandas DataFrame are built on NumPy arrays. To get a better idea of how a built-in function like pd.read_csv() works, we'll try to recreate a simplified version of it here.

In [None]:
# Open a text file (CSV-style files are just plain text with a separator between values)
with open('bp.txt') as f:
    lines = f.readlines()
n_rows = len(lines)
# Print number of lines in the file
print('The file has {} lines.'.format(n_rows))
# The file has values separated by tabs; split the first line to count the columns
n_cols = len(lines[0].split('\t'))

#1) Create a matrix of zeros with one row per data line (the header line is left out)
matrix = np.zeros([n_rows - 1, n_cols])
#2) Iterate through the lines; enumerate supplies the row index
for index, line in enumerate(lines):
    #3) Skip the first row (it's just column names, not the data)
    if index == 0:
        continue
    # Update the matching row of the matrix (assumes every field is numeric)
    matrix[index - 1] = [float(value) for value in line.strip().split('\t')]
#4) Preview your results; you should now have a NumPy matrix with the data from the file
print(matrix)
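
For comparison, NumPy's own np.loadtxt() can do the same job in one call. The sketch below uses made-up, tab-separated sample data held in memory (the column names and values are hypothetical and are not the contents of bp.txt), so it runs without any file on disk.

```python
import io
import numpy as np

# Hypothetical tab-separated content standing in for a file on disk
sample = io.StringIO("age\tsystolic\tdiastolic\n34\t120\t80\n51\t135\t88\n")

# skiprows=1 skips the header line; delimiter="\t" splits each line on tabs
data = np.loadtxt(sample, delimiter="\t", skiprows=1)

print(data.shape)
print(data)
```

With a real file, the file name would be passed in place of the StringIO object.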



**Let's try this one more time**

In [4]:
import numpy as np
# Loading numpy into your environment

In [5]:
a_np = np.array([1,2,3])
print(a_np)

[1 2 3]


In [10]:
b_np = np.array([[8.5,6.2,9.7],[2.2,9.1,6.4]])
b_np
# Creating a 2D array

array([[8.5, 6.2, 9.7],
       [2.2, 9.1, 6.4]])

In [14]:
# Get dimension of the array
print(b_np.ndim)
print(a_np.ndim)

2
1


In [18]:
# Get Shape (rows, cols)
print(b_np.shape)
print(a_np.shape) # 1D, so the shape holds a single value: the number of elements

(2, 3)
(3,)


In [20]:
# Get datatype in the array
b_np.dtype

dtype('float64')

In [26]:
# Total memory occupied by the elements, in bytes (6 elements x 8 bytes each)
b_np.nbytes

48

In [27]:
# Number of elements inside
b_np.size

6
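
These attributes are related: nbytes is the number of elements multiplied by the bytes per element (itemsize). A minimal sketch, where the float32 copy named small is added only for comparison:

```python
import numpy as np

b_np = np.array([[8.5, 6.2, 9.7], [2.2, 9.1, 6.4]])

# Bytes per element: 8 for float64
print(b_np.itemsize)
print(b_np.size * b_np.itemsize == b_np.nbytes)

# A smaller dtype uses less memory for the same number of elements
small = b_np.astype(np.float32)
print(small.nbytes)
```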

**Accessing/Changing Specific Elements in Numpy**

In [36]:
c_np = np.array([[1,2,3,4,5,6,7,8,9,10],[11,12,13,14,15,16,17,18,19,20]])
c_np

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])