# <span style = "color:purple"> Array indexing</span>

## <span style = "color:blue"> Slice indexing:</span>

Similar to slice indexing with lists and strings, we can use slice indexing to pull out sub-regions of ndarrays.

In [30]:
import numpy as np

# create a ndarray of rank 2 and shape (3, 4)
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


Use array slicing to get a subarray consisting of the first 2 rows and columns 1 and 2 (a 2 x 2 block):

In [27]:
a_slice = an_array[:2, 1:3]
print(a_slice)

[[12 13]
 [22 23]]


In [20]:
an_array[0,1]

12

In [21]:
a_slice[0,0]

12

When you modify a slice, you actually modify the underlying array, because a slice is a view of the same data:

In [28]:
print("Before:", an_array[0,1])    # inspect the element at 0, 1
a_slice[0, 0] = 1000              # a_slice[0,0] is the same piece of data as an_array[0, 1]
print("After:", an_array[0, 1])

Before: 12
After: 1000


Let's repeat, this time making a copy of the slice with `np.array`:

import numpy as np

an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

In [31]:
a_slice = np.array(an_array[:2,1:3])
print(a_slice)

[[12 13]
 [22 23]]


In [32]:
print("Before:", an_array[0,1])    # inspect the element at 0, 1
a_slice[0, 0] = 1000              # a_slice is now an independent copy, so an_array[0, 1] is unaffected
print("After:", an_array[0, 1])

Before: 12
After: 12
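
The difference between the two runs above is whether the slice shares memory with the original array. A minimal sketch of how to check this, using `np.shares_memory`; the names `base`, `view`, and `independent` are illustrative and not part of the original notebook:

```python
import numpy as np

base = np.array([[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]])

view = base[:2, 1:3]                  # plain slicing returns a view of base
independent = base[:2, 1:3].copy()    # .copy() (like np.array(...)) returns new data

print(np.shares_memory(base, view))          # True
print(np.shares_memory(base, independent))   # False

view[0, 0] = 1000         # writes through to base[0, 1]
independent[0, 1] = -1    # leaves base[0, 2] untouched
print(base[0, 1], base[0, 2])   # 1000 13
```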


## <span style = "color:blue"> Use both integer indexing & slice indexing</span>

We can combine integer indexing and slice indexing to create arrays of different shapes.

In [33]:
# Create a rank 2 array of shape (3,4)
an_array = np.array([[11,12,13,14],[21,22,23,24],[31,32,33,34]])
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


In [34]:
# Using both integer indexing & slicing generates an array of lower rank
row_rank1 = an_array[1,:]

print(row_rank1, row_rank1.shape)    # notice only a single []

[21 22 23 24] (4,)


In [36]:
# Slicing alone: generates an array of the same rank as an_array
row_rank2 = an_array[1:2, :]    # rank 2 view

print(row_rank2, row_rank2.shape)     # notice the [[ ]]

[[21 22 23 24]] (1, 4)


In [None]:
# We can do the same thing for the columns of an array:

print()
col_rank1 = an_array[:, 1]
col_rank2 = an_array[:, 1:2]

print(col_rank1, col_rank1.shape)     # rank 1
print()
print(col_rank2, col_rank2.shape)     # rank 2
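
Since the cell above has no recorded output, here is a small sketch of the shapes one would expect from the two column selections. The names `grid`, `col_1d`, `col_2d`, and `restored` are illustrative only:

```python
import numpy as np

grid = np.array([[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]])

col_1d = grid[:, 1]      # an integer index drops that axis: shape (3,)
col_2d = grid[:, 1:2]    # a slice keeps the axis: shape (3, 1)
print(col_1d.shape, col_2d.shape)   # (3,) (3, 1)

# np.newaxis adds the dropped axis back, giving the same layout as the slice.
restored = col_1d[:, np.newaxis]
print(np.array_equal(restored, col_2d))   # True
```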

## <span style = "color:blue"> Array indexing for changing elements.</span>

Sometimes it's useful to use an array of indexes to access or change elements.

In [105]:
# Create a new array
an_array = np.array([[11,12,13],[21,22,23],[31,32,33],[41,42,43]])

print('Original array:')
print(an_array)

Original array:
[[11 12 13]
 [21 22 23]
 [31 32 33]
 [41 42 43]]


In [99]:
# Create an array of indices:
col_indices = np.array([0, 1, 2, 0])
print('\nCol indices picked : ', col_indices)

row_indices = np.arange(4)
print('\nRows indices picked : ', row_indices)


Col indices picked :  [0 1 2 0]

Rows indices picked :  [0 1 2 3]


In [41]:
# Examine the pairings of row_indices and col_indices. These are the elements we'll change next
for row,col in zip(row_indices, col_indices):
    print(row, ", ", col)

0 ,  0
1 ,  1
2 ,  2
3 ,  0


In [100]:
# Select one element from each row
print('Values in the array at those indices: ', an_array[row_indices, col_indices])

Values in the array at those indices:  [11 22 33 41]


In [106]:
# Change one element from each row using the indices selected.
an_array[row_indices, col_indices] += 100000

print('Changed Array: ')
print(an_array)

Changed Array: 
[[100011     12     13]
 [    21 100022     23]
 [    31     32 100033]
 [100041     42     43]]
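
To make the pairing explicit, the sketch below compares indexing with two index arrays against an element-by-element loop, and also shows that the selected values come back as a copy rather than a view. The names `table`, `rows`, `cols`, `picked`, and `looped` are illustrative:

```python
import numpy as np

table = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33], [41, 42, 43]])
rows = np.arange(4)
cols = np.array([0, 1, 2, 0])

picked = table[rows, cols]                                     # one element per (row, col) pair
looped = np.array([table[r, c] for r, c in zip(rows, cols)])   # same selection, written as a loop
print(picked, np.array_equal(picked, looped))                  # [11 22 33 41] True

# Unlike a slice, the result of indexing with index arrays is a copy:
picked[0] = -1
print(table[0, 0])   # 11
```

Assigning through the index expression itself (as in `an_array[row_indices, col_indices] += 100000`) does modify the original array; only the extracted result is independent.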



<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>
Boolean Indexing
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>
Array Indexing for changing elements:
</p>

In [28]:
# Create a 3x2 array:
an_array = np.array([[11,12], [21,22], [31,32]])
print(an_array)

[[11 12]
 [21 22]
 [31 32]]


In [4]:
# Create a filter which holds boolean values indicating whether each element meets this condition.
filter = (an_array > 15)
filter

array([[False, False],
       [ True,  True],
       [ True,  True]])

Notice that the filter is an ndarray of the same shape as an_array, filled with 'True' for each element whose corresponding element in an_array is greater than 15 and 'False' for each element whose value is less than or equal to 15.
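
A quick way to confirm these properties of a boolean filter is to inspect its shape and dtype. This is only a sketch; `values` and `mask` are illustrative names:

```python
import numpy as np

values = np.array([[11, 12], [21, 22], [31, 32]])
mask = values > 15

print(mask.shape == values.shape)   # True: one boolean per element
print(mask.dtype)                   # bool
print(mask.sum())                   # 4, since True counts as 1 when summed
```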

In [5]:
# We can now select just those elements which meet that criterion.
print(an_array[filter])

[21 22 31 32]


In [6]:
# For short, we could have just used the approach below without the need for the separate filter array.
an_array[an_array > 15]

array([21, 22, 31, 32])

In [16]:
# Get all the values between 20 and 30.
an_array[(an_array > 20) & (an_array < 30)]

array([21, 22])
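
Conditions are combined with the element-wise operators `&` (and), `|` (or), and `~` (not), and each comparison needs its own parentheses because these operators bind more tightly than `>` or `<`. A short sketch with illustrative names:

```python
import numpy as np

values = np.array([[11, 12], [21, 22], [31, 32]])

# "or": values below 20 or above 30
outside = values[(values < 20) | (values > 30)]
print(outside)    # [11 12 31 32]

# "not": everything that is not even
odd = values[~(values % 2 == 0)]
print(odd)        # [11 21 31]
```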

In [17]:
# Asking for even values using the modulo operator.
an_array[an_array % 2 == 0]

array([12, 22, 32])

What is particularly useful is that we can also change elements in the array by applying a similar logical filter. Let's add 100 to all the even values.

In [7]:
an_array[an_array % 2 == 0] += 100
print(an_array)

[[ 11 112]
 [ 21 122]
 [ 31 132]]


In [13]:
print(11%2, 12%2)
print(21%2, 22%2)
print(31%2, 32%2)

1 0
1 0
1 0



<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>
Datatypes and Array <br> <br> Operations
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>
Datatypes:
</p>

In [43]:
exp1 = np.array([11, 12])  # NumPy infers the data type (the default integer size depends on the platform)
print(exp1, exp1.dtype)

[11 12] int32


In [41]:
exp2 = np.array([11.0, 12.78])
print(exp2, exp2.dtype)

[11.   12.78] float64


In [42]:
exp3 = np.array([11, 12], dtype = np.int64)  # You can also tell NumPy the data type
print(exp3.dtype)

int64


In [44]:
# You can use this to force floats into integers (the fractional part is truncated)
exp4 = np.array([11.1, 12.7], dtype = np.int64)
print(exp4.dtype)
print()
print(exp4)

int64

[11 12]
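
For positive numbers, truncation and floor give the same result, which is why `12.7` becomes `12` above. The sketch below uses negative values to show the difference between a cast and `np.floor`; the names are illustrative:

```python
import numpy as np

floats = np.array([-1.7, -0.2, 1.7])

truncated = floats.astype(np.int64)           # cast: drops the fractional part
floored = np.floor(floats).astype(np.int64)   # floor: rounds toward negative infinity

print(truncated)   # [-1  0  1]
print(floored)     # [-2 -1  1]
```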


In [45]:
# You can use this to force integers into floats if you anticipate
# the values may change to floats later.
exp5 = np.array([11, 12], dtype = np.float64)
print(exp5.dtype)
print()
print(exp5)

float64

[11. 12.]


<p style = "font-family: Arial; font-size: 1.75em; color: #2462C0; font-style: bold"><br>
    Arithmetic Array Operations:
</p>

In [47]:
x = np.array([[111, 112], [121, 122]], dtype = np.int64)  # np.int was removed from recent NumPy releases
y = np.array([[211.1, 212.1], [221.1, 222.1]], dtype = np.float64)

print(x)
print()
print(y)

[[111 112]
 [121 122]]

[[211.1 212.1]
 [221.1 222.1]]


In [48]:
# Add
print(x + y)  # The plus sign works
print()
print(np.add(x, y))  # So does the numpy function "add"

[[322.1 324.1]
 [342.1 344.1]]

[[322.1 324.1]
 [342.1 344.1]]
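
Note that the result above is a float array even though `x` holds integers: when dtypes differ, NumPy promotes to a type that can represent both. A small sketch of how to inspect this, with illustrative names `ints`, `floats`, and `total`:

```python
import numpy as np

ints = np.array([[111, 112], [121, 122]], dtype=np.int64)
floats = np.array([[211.1, 212.1], [221.1, 222.1]], dtype=np.float64)

total = ints + floats
print(ints.dtype, floats.dtype, total.dtype)   # int64 float64 float64

# np.result_type reports the promoted dtype without doing the arithmetic.
print(np.result_type(ints, floats))            # float64
```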


In [50]:
# Subtract
print(x - y)
print()
print(np.subtract(x, y))

[[-100.1 -100.1]
 [-100.1 -100.1]]

[[-100.1 -100.1]
 [-100.1 -100.1]]


In [51]:
# Multiply
print(x * y)
print()
print(np.multiply(x, y))

[[23432.1 23755.2]
 [26753.1 27096.2]]

[[23432.1 23755.2]
 [26753.1 27096.2]]


In [52]:
# Divide
print(x / y)
print()
print(np.divide(x, y))

[[0.52581715 0.52805281]
 [0.54726368 0.54930212]]

[[0.52581715 0.52805281]
 [0.54726368 0.54930212]]


In [53]:
# Square root
print(np.sqrt(x))

[[10.53565375 10.58300524]
 [11.         11.04536102]]


In [54]:
# Exponent (e ** x)
print(np.exp(x))

[[1.60948707e+48 4.37503945e+48]
 [3.54513118e+52 9.63666567e+52]]


<p style = "font-family: Arial; font-size: 2.75em; color:purple; font-style: bold"><br> Statistical Methods, Sorting and <br> <br> Set Operations:<br><br></p>

<p style = "font-family: Arial; font-size: 1.75em; color:#2462C0; font-style: bold"><br>Basic Statistical Operations:</p>

In [56]:
# Set up a random 2x5 matrix
arr = 10 * np.random.randn(2,5)
print(arr)

[[ 1.06386214 -1.55057838 -3.64663996 -4.61024227 -0.57887558]
 [18.00740407 10.71814523 13.99862706  3.9252528  -7.33883554]]


In [57]:
# Compute the 'mean' for all elements
print(arr.mean())

2.998811957068853


In [65]:
# Compute the means by row
print(arr.mean(axis = 1))

[-1.86449481  7.86211872]


In [66]:
# Compute the means by column
print(arr.mean(axis = 0))

[ 9.53563311  4.58378342  5.17599355 -0.34249473 -3.95885556]
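
Because the array above is random, the `axis` argument is easier to verify on a small fixed example. In this sketch (with the illustrative name `small`), `axis = 1` averages across each row and `axis = 0` averages down each column:

```python
import numpy as np

small = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

print(small.mean())          # 3.5               (all six elements)
print(small.mean(axis=1))    # [2. 5.]           (one mean per row)
print(small.mean(axis=0))    # [2.5 3.5 4.5]     (one mean per column)
```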


In [67]:
# Sum all the elements
print(arr.sum())

29.98811957068853


In [71]:
# Compute the medians by row
import numpy as np
print(np.median(arr, axis = 1))

[-1.55057838 10.71814523]


<p style = "font-family: Arial; font-size: 1.75em; color: #2462C0; font-style: bold"><br>Sorting:</p>

In [74]:
# Create a 10 element array of random numbers
unsorted = np.random.randn(10)
print(unsorted)

[-0.6711252  -0.16903651  0.23609179  0.7776542  -0.62761402  1.01431839
  0.96245178 -0.25947013  0.39586777 -0.39635024]


In [77]:
# Create a copy of an array and sort it.
# (The name 'sorted_copy' avoids shadowing Python's built-in sorted().)
sorted_copy = np.array(unsorted)
sorted_copy.sort()

print(sorted_copy)
print()
print(unsorted)

[-0.6711252  -0.62761402 -0.39635024 -0.25947013 -0.16903651  0.23609179
  0.39586777  0.7776542   0.96245178  1.01431839]

[-0.6711252  -0.16903651  0.23609179  0.7776542  -0.62761402  1.01431839
  0.96245178 -0.25947013  0.39586777 -0.39635024]


In [78]:
# In-place sorting
unsorted.sort()

print(unsorted)

[-0.6711252  -0.62761402 -0.39635024 -0.25947013 -0.16903651  0.23609179
  0.39586777  0.7776542   0.96245178  1.01431839]
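
The copy-then-sort pattern can also be written with the function `np.sort`, which returns a sorted copy and leaves its input alone, whereas the method `ndarray.sort` sorts in place and returns `None`. A brief sketch with illustrative names:

```python
import numpy as np

data = np.array([3, 1, 2])

ordered = np.sort(data)    # new sorted array; data is unchanged
print(ordered, data)       # [1 2 3] [3 1 2]

result = data.sort()       # sorts data in place and returns None
print(result, data)        # None [1 2 3]
```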


<p style = "font-family: Arial; font-size: 1.75em; color: #2462C0; font-style: bold"><br> Finding Unique elements:</p>

In [79]:
array = np.array([1,2,1,4,2,1,4,2])

print(np.unique(array))

[1 2 4]
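
If the number of occurrences is also of interest, `np.unique` accepts `return_counts=True`. A minimal sketch on the same data, where `items`, `values`, and `counts` are illustrative names:

```python
import numpy as np

items = np.array([1, 2, 1, 4, 2, 1, 4, 2])

values, counts = np.unique(items, return_counts=True)
print(values)   # [1 2 4]
print(counts)   # [3 3 2]
```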


<p style = "font-family: Arial; font-size: 1.75em; color: #2462C0; font-style:bold"><br> Set Operations with 'np.array' data type:</p>

In [80]:
s1 = np.array(['desk','chair','bulb'])
s2 = np.array(['lamp','bulb','chair'])
print(s1,s2)

['desk' 'chair' 'bulb'] ['lamp' 'bulb' 'chair']


In [81]:
# Using 'intersect' function:
print(np.intersect1d(s1,s2))

['bulb' 'chair']


In [82]:
# Using 'union' function:
print(np.union1d(s1,s2))

['bulb' 'chair' 'desk' 'lamp']


In [83]:
# Using 'difference' function:
print(np.setdiff1d(s1,s2))  # elements in s1 that are not in s2.

['desk']


In [85]:
# Using 'isin' function (it supersedes np.in1d, which is deprecated in recent NumPy releases):
print(np.isin(s1,s2))  # which elements of s1 are also in s2.

# Returns an array of booleans.

[False  True  True]


<p style = "font-family: Arial; font-size: 2.75em; color:purple; font-style: bold"><br><br> Broadcasting: <br><br></p>

Introduction to broadcasting. <br>
For more details, please see: <br>
https://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html

In [89]:
import numpy as np

start = np.zeros((4,3))
print(start)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [90]:
# Create a rank 1 ndarray with 3 values:
add_rows = np.array([1, 0, 2])
print(add_rows)

[1 0 2]


In [91]:
# let's do the Broadcasting
y = start + add_rows  # add to each row of 'start' using broadcasting.
print(y)

[[1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]]


In [100]:
# Create an ndarray which is 1x4, then use transpose (T) to convert it
# to a 4x1 ndarray that can be broadcast across columns.
add_cols = np.array([[0,1,2,3]])
add_cols = add_cols.T

print(add_cols)

[[0]
 [1]
 [2]
 [3]]


In [101]:
# Add to each column of 'start' using broadcasting:
y = start + add_cols
print(y)

[[0. 0. 0.]
 [1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]]


In [104]:
# Add a 'scalar' (here a one-element array) using broadcasting:
add_scalar = np.array([1])

print(add_scalar)
print()
print(start + add_scalar)

[1]

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
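
The three additions above work because the shapes are compatible: compared from the trailing dimension backwards, each pair of sizes is either equal or one of them is 1. The sketch below inspects the broadcast shape with `np.broadcast` and shows a shape that does not fit; the names `row`, `col`, and `mismatch_error` are illustrative:

```python
import numpy as np

start = np.zeros((4, 3))
row = np.array([1, 0, 2])               # shape (3,)
col = np.array([[0], [1], [2], [3]])    # shape (4, 1)

print(np.broadcast(start, row).shape)   # (4, 3)
print(np.broadcast(start, col).shape)   # (4, 3)
print(np.broadcast(row, col).shape)     # (4, 3)

# A length-4 rank 1 array does not line up with the trailing dimension (3).
try:
    start + np.array([1, 2, 3, 4])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)    # ValueError
```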


Example from the slides:

In [105]:
# Create a 3x4 matrix:
arrA = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(arrA)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [110]:
# Create a list of 4 values (NumPy treats it as a rank 1 array of shape (4,)):
arrB = [0,1,0,2]
print(arrB)    # Note the commas: this prints as a plain Python list, not an ndarray.

[0, 1, 0, 2]


In [107]:
# Add the two together using 'broadcasting' (arrB is added to each row of arrA):
print(arrA + arrB)

[[ 1  3  3  6]
 [ 5  7  7 10]
 [ 9 11 11 14]]


<p style = "font-family: Arial; font-size: 2.75em; color:purple; font-style: bold"><br><br> Speedtest: ndarrays vs lists <br><br></p>

First, set up the parameters for the speed test. We'll be testing the time taken to sum the elements of an ndarray versus a list.

In [10]:
from numpy import arange
from timeit import Timer

size    = 1000000
timeits = 1000

In [11]:
# Create the ndarray with values 0,1,2...,size-1
nd_array = arange(size)
print( type(nd_array) )

<class 'numpy.ndarray'>


In [12]:
# 'Timer' expects the operation as a parameter,
# here we pass nd_array.sum().
timer_numpy = Timer("nd_array.sum()", "from __main__ import nd_array")

print("Time taken by numpy ndarray: %f seconds" %
     (timer_numpy.timeit(timeits)/timeits))

Time taken by numpy ndarray: 0.000555 seconds


In [14]:
# Create the list with values 0,1,2,...,size-1
a_list = list(range(size))

print(type(a_list))

<class 'list'>


In [17]:
# Timer expects the operation as a parameter, here we pass sum(a_list)
timer_list = Timer("sum(a_list)", "from __main__ import a_list")

print("Time taken by list: %f seconds" %
     (timer_list.timeit(timeits)/timeits))

Time taken by list: 0.029353 seconds
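
The same comparison can be written without the `from __main__ import ...` setup strings by passing the `globals` argument to `timeit.timeit`. This is a sketch under smaller, arbitrarily chosen settings (`size` and `repeats` below are not the values used above), and the timings will vary by machine:

```python
import timeit
import numpy as np

size = 100_000     # smaller than above so the sketch runs quickly
repeats = 20

nd_array = np.arange(size, dtype=np.int64)   # explicit dtype avoids overflow on 32-bit defaults
a_list = list(range(size))

# 'globals' tells timeit where to look up the names used in the statement.
numpy_time = timeit.timeit("nd_array.sum()", globals={"nd_array": nd_array}, number=repeats) / repeats
list_time = timeit.timeit("sum(a_list)", globals={"a_list": a_list}, number=repeats) / repeats

print("Time taken by numpy ndarray: %f seconds" % numpy_time)
print("Time taken by list: %f seconds" % list_time)
```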


<p style = "font-family: Arial; font-size: 2.75em; color:purple; font-style:bold"><br><br> Additional Common 'ndarray'<br><br> Operations.</p>

<p style = "font-family: Arial; font-size: 1.75em; color: #2462C0; font-style: bold"><br><br> Dot Product on Matrices, <br><br>and Inner Product on Vectors:</p>

In [87]:
# Determine the dot product of two matrices:
x2d = np.array([[1,1],[1,1]])
y2d = np.array([[2,2],[2,2]])

print(x2d.dot(y2d))
print()
print(np.dot(x2d,y2d))

[[4 4]
 [4 4]]

[[4 4]
 [4 4]]
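
The heading also mentions the inner product on vectors. As a hedged sketch (the vectors `v` and `w` are made up for illustration), the `@` operator gives the same matrix product as `dot`, and for rank 1 arrays `np.inner`, `np.dot`, and `@` all return the sum of element-wise products:

```python
import numpy as np

x2d = np.array([[1, 1], [1, 1]])
y2d = np.array([[2, 2], [2, 2]])

# The @ operator is matrix multiplication, equivalent to np.dot for 2-D arrays.
print(np.array_equal(x2d @ y2d, np.dot(x2d, y2d)))   # True

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])

# Inner product of two vectors: 1*4 + 2*5 + 3*6 = 32
print(np.inner(v, w), np.dot(v, w), v @ w)           # 32 32 32
```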


In [88]:
x2d = np.array([[1,1],[1,1]])
print(x2d)

[[1 1]
 [1 1]]
