Numpy is the fundamental package for numeric computing with Python. It provides powerful ways to create, store, and manipulate data, and it integrates smoothly and quickly with a wide variety of databases. It is also the foundation that Pandas is built on; Pandas is a high-performance, data-centric package that we will learn later in the course.

In this lecture, we will talk about creating arrays with specific data types, manipulating arrays, selecting elements from arrays, and loading a dataset into an array. These operations are useful for manipulating data and for understanding how other common Python data packages work.

# Numpy

In [1]:
import numpy as np
import math

# Array Creation

In [2]:
# Arrays are displayed as a list or a list of lists, and they can be created from lists as well. To create an
# array, we pass a list as an argument to np.array()
a = np.array([1, 2, 3])
print(a)
# We can print the number of dimensions of an array using the ndim attribute
print(a.ndim)

[1 2 3]
1


In [3]:
# If we pass a list of lists to np.array(), we create a multi-dimensional array, for instance, a matrix
b = np.array([[1, 2, 3], [4, 5, 6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

In [4]:
# We can print out the length of each dimension by calling the shape attribute, which returns a tuple
b.shape

(2, 3)
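
As a small sketch of how these attributes relate, the block below rebuilds the same 2x3 matrix and checks `ndim`, `shape`, and `size` against each other. The variable name `m` is only illustrative.

```python
import numpy as np

# Same values as the matrix b above; the name m is illustrative
m = np.array([[1, 2, 3], [4, 5, 6]])

print(m.ndim)    # number of dimensions: 2
print(m.shape)   # length of each dimension: (2, 3)
print(m.size)    # total number of elements: 6

# ndim is the length of the shape tuple, and size is the product of its entries
print(len(m.shape) == m.ndim)
```

Here `size` is 2 * 3 = 6, so the three attributes describe the same layout from different angles.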

In [5]:
# We can also check the type of the items in an array
a.dtype

dtype('int64')

In [6]:
# Floats are accepted in numpy arrays too
c = np.array([2.5, 6.8, 1.0])
print(c.dtype.name)
print(c)

float64
[2.5 6.8 1. ]


In [7]:
# Note that numpy automatically converts integers, like 5, up to floats when they share an array with floats,
# since there is no loss of precision. Numpy will try to give you the best data type format possible to keep
# your data types homogeneous, which means all the same, in the array
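
To make the conversion concrete, here is a minimal sketch that mixes an integer into a list of floats. The values in `mixed` are made up for illustration and do not come from the cells above.

```python
import numpy as np

# Illustrative values: two floats and the integer 5 in the same list
mixed = np.array([2.2, 5, 1.1])
print(mixed.dtype.name)   # float64
print(mixed[1])           # 5.0, the integer was converted to a float

# A list of only integers keeps an integer dtype
# (the exact width, e.g. int64 or int32, depends on the platform)
ints = np.array([1, 2, 3])
print(np.issubdtype(ints.dtype, np.integer))   # True
```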

In [8]:
# Sometimes we know the shape of an array that we want to create, but not what we want to be in it. numpy
# offers several functions to create arrays with initial placeholders, such as zeros or ones.
# Let's create two arrays, both the same shape but with different filler values
d = np.zeros((2,3))
print(d)

print()

e = np.ones((2,3))
print(e)

[[0. 0. 0.]
 [0. 0. 0.]]

[[1. 1. 1.]
 [1. 1. 1.]]
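
The outputs above are floats because `np.zeros` and `np.ones` default to a float dtype. As a hedged sketch, the `dtype` argument can request integers instead, and `np.full` fills an array with any chosen value; the fill value 7 is arbitrary.

```python
import numpy as np

# Ask for integer zeros instead of the default floats
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.dtype.kind)   # 'i' means a signed integer type

# np.full creates an array of a given shape filled with one value (7 is arbitrary)
sevens = np.full((2, 3), 7)
print(sevens)
```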


In [9]:
# We can create an array with random numbers
np.random.rand(2,3)

array([[0.65589638, 0.95272459, 0.72791079],
       [0.42039029, 0.667062  , 0.41147726]])
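
Because the numbers are random, rerunning the cell gives different values from the output shown. If repeatable numbers are needed, one option is a seeded generator. The sketch below assumes NumPy 1.17 or later, where `np.random.default_rng` is available; the seed 42 is arbitrary.

```python
import numpy as np

# Assumption: NumPy >= 1.17, which provides np.random.default_rng
rng = np.random.default_rng(42)        # 42 is an arbitrary seed
first = rng.random((2, 3))             # uniform floats in [0, 1)

# A second generator built from the same seed produces the same numbers
again = np.random.default_rng(42).random((2, 3))
print(np.array_equal(first, again))    # True
```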

In [10]:
# We can also create a sequence of numbers in an array with the arange() function. The first argument is the
# starting bound, the second argument is the ending bound, and the third argument is the difference between
# consecutive numbers

# Let's create an array of every even number from ten (inclusive) to fifty (exclusive)
f = np.arange(10, 50, 2)
f

array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
       44, 46, 48])
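
The exclusive ending bound is easy to overlook, so here is a short check of it. The name `evens` is illustrative.

```python
import numpy as np

evens = np.arange(10, 50, 2)
print(len(evens))    # 20 values
print(evens[-1])     # 48, the ending bound 50 is not included

# With a single argument, arange starts at 0 and steps by 1
print(np.arange(5))  # [0 1 2 3 4]
```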

In [11]:
# If we want to generate a sequence of floats, we can use the linspace() function. In this function the third
# argument isn't the difference between two numbers, but the total number of items you want to generate
np.linspace(0, 2, 15) # 15 numbers from 0 (inclusive) to 2 (inclusive)

array([0.        , 0.14285714, 0.28571429, 0.42857143, 0.57142857,
       0.71428571, 0.85714286, 1.        , 1.14285714, 1.28571429,
       1.42857143, 1.57142857, 1.71428571, 1.85714286, 2.        ])
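
Since both endpoints are included, 15 points leave 14 gaps, so the spacing works out to (2 - 0) / 14. A minimal sketch of that arithmetic, using the `retstep=True` option of `np.linspace` to get the spacing back:

```python
import numpy as np

# retstep=True makes linspace return the spacing along with the array
points, step = np.linspace(0, 2, 15, retstep=True)

print(np.isclose(step, 2 / 14))             # True: 15 points leave 14 gaps
print(np.allclose(np.diff(points), step))   # True: every gap has the same width
print(points[0], points[-1])                # both endpoints are included
```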

# Array Operations

In [12]:
# We can do many things with arrays, such as mathematical manipulation (addition, subtraction, squares,
# exponents), as well as use boolean arrays, which hold True/False values. We can also do matrix manipulation
# such as product, transpose, inverse, and so forth.

In [13]:
# Arithmetic operators on arrays apply elementwise.

a = np.array([10, 15, 20, 25])
b = np.array([5, 8, 9, 14])

# Let's look at a minus b
c = a-b
print(c)

print()

# And let's look at a times b
d = a*b
print(d)

[ 5  7 11 11]

[ 50 120 180 350]


In [14]:
# With arithmetic manipulation, we can convert current data to the form we want.
# Let's create an array of typical Ann Arbor winter Fahrenheit values
fahrenheit = np.array([0, -10, -5, -15, 0])

# The formula for conversion is (°F - 32) * 5/9 = °C
celsius = (fahrenheit - 32) * (5/9)
celsius

array([-17.77777778, -23.33333333, -20.55555556, -26.11111111,
       -17.77777778])

In [15]:
# Another useful and important manipulation is the boolean array. We can apply a comparison operator to an
# array, and a boolean array is returned with one value per element of the original: True where the element
# meets the condition and False otherwise
celsius > -20

array([ True, False, False, False,  True])
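
A boolean array is mostly useful for what it lets us do next. As a sketch, the block below rebuilds the temperature data under illustrative names, counts the True values, and uses the boolean array to pick out the matching elements.

```python
import numpy as np

# Rebuild the temperature data so this block runs on its own
fahrenheit = np.array([0, -10, -5, -15, 0])
celsius = (fahrenheit - 32) * (5 / 9)

mask = celsius > -20

# True counts as 1 and False as 0, so sum() counts the matches
print(mask.sum())       # 2

# Indexing with the boolean array keeps only the elements where it is True
print(celsius[mask])
```

Both selected values come from the two 0 °F readings, which are the only ones warmer than -20 °C.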

In [16]:
# Here's another example: we could use the modulus operator to check whether the numbers in an array are even.
# Recall that modulus does division but throws away everything except the remainder. The celsius values here
# are not whole numbers, so none of them are even
celsius%2 == 0

array([False, False, False, False, False])
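
The even check is easier to see on whole numbers. This sketch applies the same test to integer temperature values; the name `temps` is illustrative.

```python
import numpy as np

# Modulus keeps only the remainder of a division
print(7 % 3)   # 1, because 7 = 2 * 3 + 1

# Integer temperatures (illustrative name)
temps = np.array([0, -10, -5, -15, 0])
is_even = temps % 2 == 0
print(is_even)   # True where the value divides evenly by 2
```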

In [17]:
# Besides elementwise manipulation, it is important to know that numpy supports matrix manipulation. Let's
# look at the matrix product. If we want to do an elementwise product, we use the "*" sign
A = np.array([[1,1], [0, 1]])
B = np.array([[2,0], [3, 4]])
print(A*B)

print()

# If we want to do a matrix product, we use the "@" sign or the dot function
print(A@B)

[[2 0]
 [0 4]]

[[5 4]
 [3 4]]
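
The dot function mentioned in the comment is not shown in the cell, so here is a short sketch comparing it with the "@" sign for these two 2-D arrays.

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

# For 2-D arrays, the @ operator and np.dot compute the same matrix product
print(np.array_equal(A @ B, np.dot(A, B)))   # True

# The elementwise product is a different operation and gives a different result here
print(np.array_equal(A * B, A @ B))          # False
```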


In [18]:
# You don't have to worry about complex matrix operations for this course, but it's important to know that
# numpy is the underpinning of scientific computing libraries in Python, and that it is capable of doing both
# element-wise operations (the asterisk) as well as matrix-level operations (the @ sign). There's more on this
# in a subsequent course.

In [19]:
# A few more linear algebra concepts are worth layering in here. You might recall that the product of two
# matrices is only defined when the inner dimensions of the two matrices are the same, that is, when the
# number of columns of the first equals the number of rows of the second. The dimensions refer to the number
# of elements both horizontally and vertically in the rendered matrices you've seen here. We can use numpy to
# quickly see the shape of a matrix:
A.shape

(2, 2)
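
As a sketch of the inner-dimension rule, the block below multiplies a 2x3 matrix by a 3x4 matrix, then tries the reverse order, where the inner dimensions no longer match. The matrices `M` and `N` are made up for illustration.

```python
import numpy as np

M = np.ones((2, 3))   # 2 rows, 3 columns
N = np.ones((3, 4))   # 3 rows, 4 columns

# Inner dimensions match (3 and 3); the result takes the outer dimensions
print((M @ N).shape)   # (2, 4)

# Reversed, the inner dimensions are 4 and 2, so the product is not defined
mismatch = False
try:
    N @ M
except ValueError as err:
    mismatch = True
    print("cannot multiply:", err)
```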