# 2.2.1: Introduction to NumPy Arrays

NumPy is a Python module designed for scientific computation.

NumPy arrays are an additional data type provided by NumPy,
and they are used for representing vectors and matrices.

Unlike dynamically growing Python lists, NumPy arrays have a size that is fixed when they are constructed.
Elements of NumPy arrays are also all of the same data type, which leads to more efficient and simpler code
than using Python's standard data types.

By default, arrays created with functions such as np.zeros hold floating point numbers; np.array infers the data type from its input.
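
As a small illustration of the points above, the sketch below checks the element type that a few constructors produce. The variable names are made up for this example.

```python
import numpy as np

zeros = np.zeros(3)            # np.zeros defaults to floating point elements
ints = np.array([1, 2, 3])     # np.array infers an integer type from integer input
mixed = np.array([1, 2.5, 3])  # one float is enough to make every element a float

print(zeros.dtype)      # float64
print(ints.dtype.kind)  # i (integer; the exact width depends on the platform)
print(mixed.dtype)      # float64
```

Because every element shares one type, the integers in `mixed` are stored as floats rather than kept as separate Python objects.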

Arrays are a data structure similar to lists. Unlike lists, which can hold objects of different data types, an array holds values of a single data type. ... For example, if you have a list of student names that you want to store, you may want to store them in an array. Arrays are useful if you want to work with many values of the same data type.

In [1]:
# Let's start by constructing an empty vector and an empty matrix.

import numpy as np

# We first define a zero vector using the np.zeros function.

zero_vector = np.zeros(5) # five for the five elements in the vector

# We can also define a two-dimensional array, but in that case the argument (what goes inside the parentheses) must
# be a tuple. The tuple specifies two things: its first element is the number of rows in the table and its
# second element is the number of columns in the table.

zero_matrix = np.zeros((5,3)) # 5 rows and 3 columns, a 5 by 3 table

# Both the zero_vector and the zero_matrix will contain only zeroes as their elements.

zero_vector

array([0., 0., 0., 0., 0.])

In [2]:
zero_matrix

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
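
To confirm the row and column order described above, a minimal sketch can inspect the result directly. The `dtype=int` call is an extra shown only for illustration; it is not used elsewhere in these notes.

```python
import numpy as np

zero_matrix = np.zeros((5, 3))
print(zero_matrix.shape)  # (5, 3): 5 rows, 3 columns
print(zero_matrix.ndim)   # 2

# Assumption for illustration: ask np.zeros for integer zeros instead of floats
int_zeros = np.zeros(4, dtype=int)
print(int_zeros)          # [0 0 0 0]
```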

In [3]:
# We can also construct NumPy arrays using specified values, in which case, we use the np.array function, 
# and the input argument to the function is a sequence of numbers, typically a list of numbers.

# In what follows, we assume that lower case variables are vectors or one-dimensional arrays and upper case variables are
# matrices, or two-dimensional arrays.

x = np.array([1,2,3])

y = np.array([2,4,6])

x

array([1, 2, 3])

In [4]:
y

array([2, 4, 6])

In [6]:
# When you construct a two-dimensional NumPy array, you specify the elements of each row as a list
# and you can then define the entire table as a list that contains as its elements
# each of the lists of the row elements you've defined.

A = np.array([[1,3], [5,9]]) # we have two lists embedded in another list
A

array([[1, 3],
       [5, 9]])

In [7]:
# Finally, sometimes you want to turn the table sideways. This is called taking the transpose of a matrix, which
# means that the first row becomes the first column, the second row becomes the second column, and so on.

A.transpose()

array([[1, 5],
       [3, 9]])

In [9]:
x = np.array([[3,6],[5,7]])
y = x.transpose()
print(y)

[[3 5]
 [6 7]]
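
The transpose is easier to see on a table that is not square. The sketch below uses a hypothetical 2 by 3 array `B` and the `.T` attribute, which is shorthand for `transpose()`.

```python
import numpy as np

B = np.array([[1, 2, 3], [4, 5, 6]])  # 2 rows, 3 columns
print(B.shape)    # (2, 3)
print(B.T.shape)  # (3, 2)
print(B.T)
# [[1 4]
#  [2 5]
#  [3 6]]
```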


# 2.2.2: Slicing NumPy Arrays

It's easy to index and slice NumPy arrays regardless of their dimension,
meaning whether they are vectors or matrices.

With one-dimensional arrays, we can index a given element
by its position, keeping in mind that indices start at 0.

With two-dimensional arrays, the first index
specifies the row of the array and the second index
specifies the column of the array.

We can also slice NumPy arrays.
Remember the indexing logic.
Start index is included but stop index is not,
meaning that Python stops before it hits the stop index.

NumPy arrays can have more dimensions than one or two.
For example, you could have three- or four-dimensional arrays.
With multi-dimensional arrays, you can use the colon character
in place of a fixed value for an index, which
means that the array elements corresponding
to all values of that particular index will be returned.

For a two-dimensional array, using just one index
returns the given row which is consistent with the construction of 2D
arrays as lists of lists, where the inner lists correspond
to the rows of the array.
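
The colon rule can be tried on a three-dimensional array. This is a minimal sketch; the array `T`, built with `np.arange` and `reshape`, is a made-up example holding the numbers 0 to 23.

```python
import numpy as np

T = np.arange(24).reshape(2, 3, 4)  # hypothetical 3D array: 2 blocks, 3 rows, 4 columns

print(T[0, 1, :])  # [4 5 6 7]    all columns of row 1 in block 0
print(T[1, :, 2])  # [14 18 22]   column 2 of every row in block 1
print(T[:, 0, 0])  # [ 0 12]      the first element of each block
print(T[1].shape)  # (3, 4)       a single index returns a whole block
```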

In [73]:
# I'm first going to define two one-dimensional arrays.
x = np.array([1,2,3])

y = np.array([2,4,6])

In [74]:
# And I'm also going to define two two-dimensional arrays

X = np.array([[1,2,3], [4,5,6]])

Y = np.array([[2,4,6], [8,10,12]])

In [75]:
x

array([1, 2, 3])

In [76]:
X

array([[1, 2, 3],
       [4, 5, 6]])

In [12]:
# Let's first see how we would access a single element of the array

x[2] # gives me the element located at position 2 of x

3

In [13]:
# I can also do slicing.

x[0:2]

array([1, 2])

In [14]:
# If you look at the sizes of x and y, each of them has exactly three elements.
# So I can type x plus y, which gives me a new array called z.

z = x+y # first element of x is added to the first element of y, and so on.
z

array([3, 6, 9])

In [15]:
# Now moving on to two-dimensional arrays, we can also investigate individual rows or columns of arrays.

X[:,1] # second column of the table X

array([2, 5])

In [16]:
X[:,0] # first column of the table X

array([1, 4])

In [17]:
Y[:,0] # first column of the table Y

array([2, 8])

In [19]:
# I can also add these two up.

X[:,1] + Y[:,1]  # I have added together the second columns of these two arrays.

array([ 6, 15])

In [20]:
# To extract the first row of X

X[0,:] # 0 comma colon which gives me all of the elements in the first row.

array([1, 2, 3])

In [21]:
X[1,:] + X[1,:]

array([ 8, 10, 12])

In [23]:
# Because two-dimensional arrays are defined as nested rows, I can use a shorthand notation to access the first row of X,
X[0]

array([1, 2, 3])

In [25]:
# what happens if we take two lists and put a plus sign between them?

[2,4] + [6,8] # this concatenates those two lists, resulting in a new longer list

[2, 4, 6, 8]

In [26]:
# What happens if we first turn those lists into NumPy arrays?

np.array([2,4]) + np.array([6,8]) # When we turn them into numpy arrays, now the elements of the lists are added.


array([ 8, 12])

In [27]:
a = np.array([1,2])
b = np.array([3,4,5])
a + b

ValueError: operands could not be broadcast together with shapes (2,) (3,) 
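
The error above comes from the mismatched shapes. As a rough sketch, the shapes can be inspected before adding, and the longer array can be sliced so that the sizes agree.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 4, 5])

try:
    a + b
except ValueError:
    # The shapes (2,) and (3,) cannot be added element by element
    print("cannot add shapes", a.shape, "and", b.shape)

print(a + b[:2])  # [4 6]      slicing b to two elements makes the shapes match
print(b + 10)     # [13 14 15] a single number is added to every element
```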

# 2.2.3: Indexing NumPy Arrays

NumPy arrays can also be indexed with other arrays
or other sequence-like objects like lists.

In [28]:
z1 = np.array([1,3,5,7,9])
z1

array([1, 3, 5, 7, 9])

In [29]:
z2 = z1 + 1  # one added to every single element of the array.
z2

array([ 2,  4,  6,  8, 10])

In [30]:
# I can now define a list called ind, which I will be using to index my z1 and z2.

ind = [0,2,3]
ind

[0, 2, 3]

In [32]:
# I can type z1, square bracket, ind, which gives me access to the elements that are located within z1 at the locations that
# are specified by ind.

z1[ind] # so at locations 0, 2 and 3, in array z1 we will find 1 5 7

array([1, 5, 7])

In [34]:
# In this example, index or ind, was defined as a Python list, but we could also have defined that as a NumPy array.

ind = np.array([0,2,3])
z1[ind]

# In other words, we can index NumPy arrays using either lists or other NumPy arrays.

array([1, 5, 7])
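
An index list does not have to be sorted or free of repeats. The following sketch, with index values chosen only for illustration, shows that the result follows the order of the indices.

```python
import numpy as np

z1 = np.array([1, 3, 5, 7, 9])

print(z1[[3, 0, 0]])  # [7 1 1]  indices may repeat and appear in any order
print(z1[[-1, -2]])   # [9 7]    negative indices count from the end
```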

NumPy arrays can also be indexed using logical indices. Just as we can have an array of numbers, we
can have an array consisting of True and False, which are the two Boolean values.

In [35]:
z1 = np.array([1,3,5,7,9])
z1 > 6 # The element 0 of z1 is not greater than 6, and therefore element 0 of the Boolean array is false and so on

array([False, False, False,  True,  True])

In [36]:
# We can use the Boolean array, also called a logical array, to index another vector.

z1[z1>6] # indexing with the Boolean array returns the elements at the locations where it is True.

array([7, 9])

In [37]:
z2[z1>6] # the Boolean array is True at locations (indices) 3 and 4, so this returns the elements of z2 at those locations

array([ 8, 10])

In [39]:
# I could also define my logical vector in a slightly different way. I could explicitly construct a logical vector.

ind = z1>6
ind # it is a logical vector

array([False, False, False,  True,  True])

In [40]:
z1[ind] # same output as before

array([7, 9])

In [41]:
z2[ind]

array([ 8, 10])
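
Logical arrays can also be combined before they are used as an index. A minimal sketch, assuming the element-wise operators `&` (and), `|` (or) and `~` (not), with each condition wrapped in parentheses:

```python
import numpy as np

z1 = np.array([1, 3, 5, 7, 9])

mask = (z1 > 2) & (z1 < 9)      # True where both conditions hold
print(mask)                     # [False  True  True  True False]
print(z1[mask])                 # [3 5 7]
print(z1[(z1 < 2) | (z1 > 8)])  # [1 9]
print(z1[~mask])                # [1 9]  ~ flips every True and False
```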

One final word about indexing NumPy arrays, and this is really important because it can easily
lead to subtle programming errors.
When you slice an array using the colon operator, you get a view of the object.
This means that if you modify the slice, the original array will also be modified.
This is in contrast with what happens when you index an array with a list or another array, in which case
what is returned to you is a copy of the original data.

In [42]:
z1 = np.array([1,3,5,7,9])

w = z1[0:3] # Sliced z1
w

array([1, 3, 5])

In [43]:
w[0] = 3 # I modify the element at location zero to equal 3
w # I inspect it

array([3, 3, 5])

In [45]:
# However, if I now type z1, you will see that the element at location 0 of that array has also been modified.

z1 # z1 first location was also modified to 3

array([3, 3, 5, 7, 9])

In [47]:
# Let's see what happens if we use indexing and not slicing to access an array.

z1 = np.array([1,3,5,7,9])

ind = np.array([0,1,2])

In [48]:
w = z1[ind]
w

array([1, 3, 5])

In [49]:
w[0] = 3
w # the first element changed

array([3, 3, 5])

In [50]:
z1 # if I look at z1, I see that indexing did not change it

# In summary, when an array is indexed with a list or another array, what is returned
# is a copy of the original data, not a view as one gets for slices.

array([1, 3, 5, 7, 9])
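
One way to check which case you are in is sketched below. It assumes `np.shares_memory` to test whether two arrays use the same underlying data, and the `copy` method to ask for an independent copy of a slice. The variable names are illustrative.

```python
import numpy as np

z1 = np.array([1, 3, 5, 7, 9])

view = z1[0:3]            # slice: a view onto z1
picked = z1[[0, 1, 2]]    # indexed with a list: a copy
safe = z1[0:3].copy()     # slice followed by an explicit copy

print(np.shares_memory(z1, view))    # True
print(np.shares_memory(z1, picked))  # False
print(np.shares_memory(z1, safe))    # False

safe[0] = 100             # changing the copy leaves z1 alone
print(z1)                 # [1 3 5 7 9]
```

Calling `copy` on a slice is a simple way to avoid modifying the original array by accident.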

In [53]:
a = np.array([1,2])
b = np.array([3,4,5])
b[a]

array([4, 5])

In [51]:
c = b[1:]
b[a] is c

# The is comparison operator tests if two objects are the same exact object --- not if they have the same exact values.
# When testing values, you could try b[a] == c or all(b[a] == c).

False
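
The difference between identity and equality can be seen side by side. This sketch assumes `np.array_equal`, which returns a single True or False for two whole arrays.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 4, 5])
c = b[1:]

print(b[a] is c)                # False: two different objects
print(b[a] == c)                # [ True  True]: element-by-element comparison
print(np.array_equal(b[a], c))  # True: same shape and same values
```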

# 2.2.4: Building and Examining NumPy Arrays


NumPy provides a couple of ways to construct arrays with fixed
start and end values, such that the other elements are uniformly
spaced between them.

In [56]:
# To construct an array of 10 linearly spaced elements starting with 0 and ending with 100, 
# we can use the NumPy linspace function.

np.linspace(0, 100, 10) # starting point, ending point, number of points

array([  0.        ,  11.11111111,  22.22222222,  33.33333333,
        44.44444444,  55.55555556,  66.66666667,  77.77777778,
        88.88888889, 100.        ])
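
The spacing above is 100/9 because 10 points leave 9 gaps between 0 and 100. As a sketch, asking for 11 points gives a step of exactly 10. The `retstep=True` option and the `np.arange` comparison are additions for illustration only.

```python
import numpy as np

# retstep=True also returns the spacing between neighbouring points
points, step = np.linspace(0, 100, 11, retstep=True)
print(len(points))  # 11
print(step)         # 10.0

# np.arange takes a step instead of a count, and it excludes the stop value
steps = np.arange(0, 100, 10)
print(steps[-1])    # 90
```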

In [57]:
# Now, to construct an array of 10 logarithmically spaced elements between 10 and 100, we can do the following.

np.logspace(1, 2, 10) # starting point is log10(10) = 1, ending point is log10(100) = 2, number of points

array([ 10.        ,  12.91549665,  16.68100537,  21.5443469 ,
        27.82559402,  35.93813664,  46.41588834,  59.94842503,
        77.42636827, 100.        ])

In [65]:
np.log10(250) # log base 10; not displayed, because a cell only shows the value of its last expression
np.log(250) # natural logarithm

5.521460917862246

In [64]:
np.ln(250) # NumPy has no ln function; np.log is the natural logarithm

AttributeError: module 'numpy' has no attribute 'ln'

In [62]:
# To construct an array of ten logarithmically spaced elements between two numbers, say 250 and 500, we first
# need to take the base 10 logarithm of the numbers, 250 and 500, and then feed those into the logspace function.

np.logspace(np.log10(250), np.log10(500), 10)

array([250.        , 270.01493472, 291.63225989, 314.98026247,
       340.19750004, 367.43362307, 396.85026299, 428.62199143,
       462.93735614, 500.        ])
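
To see what "logarithmically spaced" means, the sketch below rebuilds the same array in two other ways and checks that neighbouring elements share a constant ratio. `np.geomspace`, which takes the end points directly, is shown as an assumed alternative and is not used elsewhere in these notes.

```python
import numpy as np

via_logspace = np.logspace(np.log10(250), np.log10(500), 10)

# logspace is 10 raised to linearly spaced exponents
via_linspace = 10 ** np.linspace(np.log10(250), np.log10(500), 10)
# geomspace takes the start and end values themselves
via_geomspace = np.geomspace(250, 500, 10)

print(np.allclose(via_logspace, via_linspace))   # True
print(np.allclose(via_logspace, via_geomspace))  # True

# Each element is the previous one multiplied by the same factor
ratios = via_logspace[1:] / via_logspace[:-1]
print(np.allclose(ratios, ratios[0]))            # True
```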

In [66]:
X = np.array([[1,2,3], [4,5,6]])

In [67]:
# Often we need to know the shape of an array or the number of elements in an array.

# To find out the shape of the array
X.shape


(2, 3)

In [68]:
# To check the number of elements of an array, use size.
X.size

# Notice that you don't have parentheses following the shape or size in the above examples.
# This is because shape and size are data attributes, not methods of the arrays.

6
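
The two attributes are related: the size is the product of the entries of the shape. A short sketch, which also shows `ndim` and the built-in `len` for comparison:

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

rows, cols = X.shape
print(rows * cols == X.size)  # True
print(X.ndim)                 # 2, the number of dimensions
print(len(X))                 # 2, the number of rows (length of the first dimension)
```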

Sometimes we need to examine whether any or all elements of an array fulfill some logical condition.

Let's generate a small one-dimensional array and check two things.
First, if any of the entries are greater than 0.9,
and second, if all of the entries are greater than or equal to 0.1.

In [69]:
# generate 10 random numbers drawn from the standard uniform distribution, meaning from the interval from 0 to 1.
x = np.random.random(10) 
x

array([0.78022022, 0.41933271, 0.4100051 , 0.8751233 , 0.78292814,
       0.56126632, 0.10810828, 0.28066728, 0.80241202, 0.91689955])

In [70]:
# we can use the np.any function to find out if any of the elements of x are greater than 0.9.

np.any(x > 0.9)

True

In [71]:
# I can then use the np.all function to find out if all of the elements in the array are greater than or equal to 0.1.

np.all(x >= 0.1)

True
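
Because random numbers differ from run to run, the sketch below repeats both checks on the ten values printed above, typed in by hand so the results are reproducible. The `np.sum` line is an extra that counts how many elements satisfy a condition.

```python
import numpy as np

# The ten values shown in the output above, copied by hand
x = np.array([0.78022022, 0.41933271, 0.4100051, 0.8751233, 0.78292814,
              0.56126632, 0.10810828, 0.28066728, 0.80241202, 0.91689955])

print(np.any(x > 0.9))   # True: 0.91689955 is greater than 0.9
print(np.all(x >= 0.1))  # True: the smallest value is 0.10810828
print(np.all(x > 0.5))   # False: several values are below 0.5
print(np.sum(x > 0.5))   # 6: each True counts as 1
```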

In [72]:
x = 20
not np.any([x%i == 0 for i in range(2, x)])

# x%i == 0 tests whether x is divisible by i, that is, whether dividing x by i leaves no remainder.
# If this is false for every i strictly between 1 and x, then x must be prime!

False
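
The same test can be wrapped in a function and tried on several numbers. This is a minimal sketch; `is_prime` is a hypothetical helper name, and the check for numbers below 2 is an added assumption, since the expression above would otherwise report 0 and 1 as prime.

```python
import numpy as np

def is_prime(n):
    """Hypothetical helper: return True if the integer n is prime."""
    if n < 2:
        return False  # 0 and 1 are not prime
    # n is prime when no i strictly between 1 and n divides it evenly
    return not np.any([n % i == 0 for i in range(2, n)])

print(is_prime(20))  # False: 20 is divisible by 2
print(is_prime(19))  # True
print([n for n in range(2, 20) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```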