Introduction to numpy:



Getting started with ndarray


ndarrays are time- and space-efficient multidimensional arrays at the core of numpy. As with the data structures in Week 2, let's start by creating ndarrays using the numpy package.


How to create Rank 1 numpy arrays:

In [3]:
import numpy as np 

k_array = np.array([4,122,543]) # Create a rank 1 array
print(type(k_array))

<class 'numpy.ndarray'>


In [4]:
# check the shape of the array we just created; it should have just one dimension (rank 1)
print(k_array.shape)

(3,)


In [5]:
k_array[0] = 444

In [6]:
print(k_array)

[444 122 543]
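
One detail worth keeping in mind when assigning to elements: an ndarray has a single fixed dtype, so a value written into it is converted to that type. The sketch below is only an illustration; the names int_array and float_array are made up for this example and do not appear elsewhere in the notebook.

```python
import numpy as np

int_array = np.array([4, 122, 543])    # integer dtype is inferred from the values
print(int_array.dtype.kind)            # 'i' means a signed integer type

int_array[0] = 4.7                     # the float is truncated to fit the integer dtype
print(int_array)                       # first element is now 4, not 4.7

float_array = int_array.astype(float)  # astype returns a converted copy
float_array[0] = 4.7
print(float_array)                     # the float copy keeps 4.7
```

If fractional values are needed, it is usually simpler to create the array with floats from the start, for example np.array([4.0, 122.0, 543.0]).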


How to create a Rank 2 numpy array:

A rank 2 ndarray is one with two dimensions. Notice the format below of [ [row] , [row] ]. Two-dimensional arrays are well suited to representing matrices, which are often useful in data science.

In [14]:
new_array = np.array([[4,122,543],[100,200,300]])  # Create a rank 2 array
print(new_array)
print("The shape is 2 rows, 3 columns: ", new_array.shape)  # rows x columns 

print("Accessing elements [0,0], [0,1], and [1,0] of the ndarray: ", new_array[0, 0], ", ",new_array[0, 1],", ",
      new_array[1, 0])

[[  4 122 543]
 [100 200 300]]
The shape is 2 rows, 3 columns:  (2, 3)
Accessing elements [0,0], [0,1], and [1,0] of the ndarray:  4 ,  122 ,  100
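
Besides shape, a few other attributes and index forms are handy for inspecting a rank 2 array. This is a small sketch; the variable name grid is chosen only for this example.

```python
import numpy as np

grid = np.array([[4, 122, 543], [100, 200, 300]])

print(grid.ndim)    # number of dimensions (the rank): 2
print(grid.size)    # total number of elements: 6
print(grid[1])      # a single index returns the whole second row
print(grid[:, 0])   # a colon selects every row, giving the first column
```

Note that grid[:, 0] comes back as a rank 1 array of shape (2,), not as a 2x1 column.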


Here we create several arrays with different shapes and different pre-filled values. numpy has a number of built-in functions which help us quickly and easily create multidimensional arrays.

In [17]:
zero = np.zeros((2,2))
print(zero)

[[0. 0.]
 [0. 0.]]


In [18]:
ex1 = np.full((2,2),9.0)
print(ex1)

[[9. 9.]
 [9. 9.]]


In [25]:
# create a 3x2 matrix with 1s on the diagonal and 0s elsewhere
ex3 = np.eye(3,2)
print(ex3)

[[1. 0.]
 [0. 1.]
 [0. 0.]]


In [26]:
# create an array of ones
ex4 = np.ones((1,2))
print(ex4)

[[1. 1.]]


In [27]:
# notice that the above ndarray (ex4) is actually rank 2, it is a 1x2 array
print(ex4.shape)

# which means we need to use two indexes to access an element
print(ex4[0,1])

(1, 2)
1.0
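
The difference between a rank 1 array and a rank 2 array with a single row or column is easy to miss, so here is a short side-by-side sketch. The names flat, row and col are illustrative only.

```python
import numpy as np

flat = np.ones(2)        # rank 1, shape (2,)
row = np.ones((1, 2))    # rank 2, one row and two columns
col = np.ones((2, 1))    # rank 2, two rows and one column

for name, arr in [("flat", flat), ("row", row), ("col", col)]:
    print(name, arr.ndim, arr.shape)

print(flat[1])     # one index is enough for a rank 1 array
print(row[0, 1])   # a rank 2 array takes a row index and a column index
```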


Array Indexing 



Slice indexing:

Similar to the use of slice indexing with lists and strings, we can use slice indexing to pull out sub-regions of ndarrays.

In [29]:
# Rank 2 array of shape (3, 4)
me_array = np.array([[19,18,17,16], [21,22,23,24], [35,34,33,32]])
print(me_array)

[[19 18 17 16]
 [21 22 23 24]
 [35 34 33 32]]


In [31]:
# Use array slicing to get a subarray consisting of the first 2 rows and columns 1 and 2.
me_slice = me_array[:2,1:3]
print(me_slice)

[[18 17]
 [22 23]]
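
A slice of an ndarray is a view onto the same data rather than a copy, so writing to the slice also changes the original array. The following sketch shows this, along with copy() as a way to get an independent array. The names base, view and independent are made up for this example.

```python
import numpy as np

base = np.array([[19, 18, 17, 16], [21, 22, 23, 24], [35, 34, 33, 32]])

view = base[:2, 1:3]                 # a slice shares memory with the original
view[0, 0] = 1000
print(base[0, 1])                    # the original changed too

independent = base[:2, 1:3].copy()   # copy() gives an independent array
independent[0, 0] = -1
print(base[0, 1])                    # not affected by the edit to the copy
```

This differs from Python lists, where slicing produces a new list.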




Array Indexing for changing elements:

Sometimes it's useful to use an array of indexes to access or change elements.

In [32]:
# Create a new array
an_array = np.array([[11,12,13], [21,22,23], [31,32,33], [41,42,43]])

print('Original Array:')
print(an_array)

Original Array:
[[11 12 13]
 [21 22 23]
 [31 32 33]
 [41 42 43]]


In [37]:
col_indices = np.array([0,1,2,0])
print('\nCol indices picked : ', col_indices)

row_indices = np.arange(4)
print('\nrow indices picked : ', row_indices)


Col indices picked :  [0 1 2 0]

row indices picked :  [0 1 2 3]


In [40]:
for row,col in zip(row_indices,col_indices):
    print(row, ",",col)

0 , 0
1 , 1
2 , 2
3 , 0


In [41]:
# Select one element from each row
print('Values in the array at those indices: ',an_array[row_indices, col_indices])

Values in the array at those indices:  [11 22 33 41]
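
The same pair of index arrays can be used on the left-hand side of an assignment to change the selected elements. This is a minimal sketch using a fresh array; the names grid, row_idx and col_idx, and the increment of 100000, are chosen only for illustration.

```python
import numpy as np

grid = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33], [41, 42, 43]])
row_idx = np.arange(4)
col_idx = np.array([0, 1, 2, 0])

# Add 100000 to one element per row: (0,0), (1,1), (2,2) and (3,0)
grid[row_idx, col_idx] += 100000
print(grid)
```

Only the four selected positions change; every other element keeps its original value.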


In [42]:
# create a 3x2 array
an_array = np.array([[11,12], [21, 22], [31, 32]])
print(an_array)

[[11 12]
 [21 22]
 [31 32]]


In [43]:
# create a boolean mask recording whether each element meets the condition
# (named mask rather than filter, which would shadow the Python built-in)
mask = (an_array > 15)
mask

array([[False, False],
       [ True,  True],
       [ True,  True]])

In [44]:
# we can now select just those elements which meet that condition
print(an_array[mask])

[21 22 31 32]


In [45]:
# add 100 to every even element, in place
an_array[an_array % 2 == 0] += 100
print(an_array)

[[ 11 112]
 [ 21 122]
 [ 31 132]]
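
Boolean conditions can also be combined, and np.where offers a way to build a modified array without changing the original. The sketch below assumes a fresh 3x2 array; the names values, in_range and bumped are illustrative.

```python
import numpy as np

values = np.array([[11, 12], [21, 22], [31, 32]])

# Parentheses are required because & binds more tightly than the comparisons
in_range = (values > 15) & (values < 32)
print(values[in_range])

# np.where picks from the second argument where the condition is True,
# and from the third argument elsewhere; values itself is left unchanged
bumped = np.where(values % 2 == 0, values + 100, values)
print(bumped)
```

Use & for "and" and | for "or" here; the Python keywords and/or do not work element by element on arrays.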


In [46]:
# set up a random 2 x 5 matrix
arr = 10 * np.random.randn(2,5)
print(arr)

[[ -5.95533617  21.32594689 -11.75855052   0.15712011  15.32771295]
 [  7.93538991 -19.64134951 -12.42678089   7.65742698  21.90659022]]
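
Because these values are random, the numbers printed above will differ on every run. If repeatable output is wanted, one option is to use a seeded generator. This sketch uses np.random.default_rng rather than the np.random.randn call from the cell above, and the seed 42 is an arbitrary choice for illustration.

```python
import numpy as np

seed = 42  # arbitrary illustrative value
rng_a = np.random.default_rng(seed)
rng_b = np.random.default_rng(seed)

# Two generators created with the same seed produce the same sequence
first = 10 * rng_a.standard_normal((2, 5))
second = 10 * rng_b.standard_normal((2, 5))

print(first.shape)
print(np.array_equal(first, second))
```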


In [47]:
# create a 10-element array of random values (standard normal distribution)
unsorted = np.random.randn(10)

print(unsorted)

[ 0.9422433   0.85075446  1.00485154  0.41276785  0.43939216 -0.38678395
 -0.43119352  0.12968043  2.20900149 -1.29405558]


In [48]:
# inplace sorting
unsorted.sort() 

print(unsorted)

[-1.29405558 -0.43119352 -0.38678395  0.12968043  0.41276785  0.43939216
  0.85075446  0.9422433   1.00485154  2.20900149]
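
The sort() method above reorders the array itself. When the original order needs to be kept, np.sort returns a sorted copy instead, and np.argsort returns the indices that would sort the array. A short sketch with fixed values so the result is repeatable; the names data, ordered and order are made up for this example.

```python
import numpy as np

data = np.array([3.5, -1.0, 2.25, 0.0])

ordered = np.sort(data)     # returns a sorted copy, data keeps its order
print(data)
print(ordered)

order = np.argsort(data)    # indices that would sort data
print(order)
print(data[order])          # same values as ordered
```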


Broadcasting: 

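Broadcasting lets numpy combine arrays of different shapes in arithmetic by stretching the smaller array across the larger one, without making needless copies of the data.
For more details, please see: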
https://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html

In [49]:
import numpy as np

start = np.zeros((4,3))
print(start)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [50]:
# create a rank 1 ndarray with 3 values
add_rows = np.array([1, 0, 2])
print(add_rows)

[1 0 2]


In [51]:
# add to each row of 'start' using broadcasting
y = start + add_rows 
print(y)

[[1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]]


In [52]:
# create an ndarray which is 4 x 1 to broadcast across columns
add_cols = np.array([[0,1,2,3]])
add_cols = add_cols.T

print(add_cols)

[[0]
 [1]
 [2]
 [3]]


In [53]:
y = start + add_cols 
print(y)

[[0. 0. 0.]
 [1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]]
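
The cell that built add_cols used double brackets because transposing a rank 1 array has no effect. Two other common ways to turn a rank 1 array into a column are reshape and np.newaxis, sketched below. The names offsets, col_a and col_b are illustrative only.

```python
import numpy as np

offsets = np.array([0, 1, 2, 3])    # rank 1, shape (4,)
print(offsets.T.shape)              # still (4,): .T does nothing to a rank 1 array

col_a = offsets.reshape(-1, 1)      # -1 lets numpy infer the number of rows
col_b = offsets[:, np.newaxis]      # inserts a new axis of length 1
print(col_a.shape, col_b.shape)

result = np.zeros((4, 3)) + col_a   # broadcasts across the three columns
print(result)
```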


In [54]:
# a one-element array broadcasts across both dimensions
add_scalar = np.array([1])  
print(start+add_scalar)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
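
Not every pair of shapes can be broadcast. Shapes are compared from the last dimension backwards, and each pair of dimensions must either be equal or have one of them equal to 1. The sketch below tries three cases against a 4x3 array; the names base and failed are made up for this example.

```python
import numpy as np

base = np.zeros((4, 3))

print((base + np.ones(3)).shape)        # (3,) lines up with the last axis
print((base + np.ones((4, 1))).shape)   # the length-1 axis is stretched to 3

failed = False
try:
    base + np.ones(4)                   # 4 does not match the trailing 3
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)
```

This is why add_cols had to be shaped (4, 1) rather than (4,) before it could be added to each column.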
