# Boolean Indexing, Set Operations, and Sorting

In [1]:
import numpy as np

In [2]:
X = np.arange(25).reshape(5, 5)

When an ndarray is large (for instance, 1000x1000), it can be difficult to select elements by their position.

For this reason, NumPy supports Boolean indexing, which selects elements by a condition instead of by position:

In [3]:
# We use Boolean indexing to select elements in X:
print('The elements in X that are greater than 10:\n', X[X > 10])
print()
print('The elements in X that are less than or equal to 7:\n', X[X <= 7])
print()
print('The elements in X that are between 10 and 17:\n', X[(X > 10) & (X < 17)])

The elements in X that are greater than 10:
 [11 12 13 14 15 16 17 18 19 20 21 22 23 24]

The elements in X that are less than or equal to 7:
 [0 1 2 3 4 5 6 7]

The elements in X that are between 10 and 17:
 [11 12 13 14 15 16]
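
To make the mechanism more concrete, the following sketch looks at the Boolean mask itself. The names `mask`, `selected`, and `positions` are introduced here only for illustration. Note that each comparison needs its own parentheses, because `&` binds more tightly than `>` and `<`.

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

# The comparison produces a Boolean array with the same shape as X.
mask = (X > 10) & (X < 17)

# Indexing with the mask returns a flat 1-D array of the selected values.
selected = X[mask]

# np.argwhere returns the (row, column) position of every True entry,
# which is handy when the positions are needed as well as the values.
positions = np.argwhere(mask)

print(mask.shape)
print(selected)
print(positions)
```

The result of `X[mask]` is always one-dimensional, whatever the shape of `X`, because the selected elements generally do not form a rectangular block.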


The same mechanism also lets us assign new values to the elements that match the Boolean condition:

In [4]:
X[(X>10) & (X<17)] = 100

In [5]:
X

array([[  0,   1,   2,   3,   4],
       [  5,   6,   7,   8,   9],
       [ 10, 100, 100, 100, 100],
       [100, 100,  17,  18,  19],
       [ 20,  21,  22,  23,  24]])
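
Assigning through a Boolean mask modifies the array in place. When the original values must be preserved, one possible alternative is `np.where`, which builds a new array. This is a minimal sketch; the name `Y` is chosen here for illustration only.

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

# np.where(condition, a, b) takes values from a where the condition is True
# and from b elsewhere. X itself is left untouched.
Y = np.where((X > 10) & (X < 17), 100, X)

print(X[2])  # original row, unchanged
print(Y[2])  # same row with the matching elements replaced
```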

NumPy also provides set operations. These are useful when comparing ndarrays, for example, to find the elements common to two ndarrays.

In [7]:
x = np.array([1,2,3,4,5])
y = np.array([6,7,2,8,4])

In [9]:
# Intersection of the two vectors
np.intersect1d(x,y)

array([2, 4])

In [10]:
# The elements that are in x that are not in y
np.setdiff1d(x,y)

array([1, 3, 5])

In [12]:
# The elements that are in y that are not in x
np.setdiff1d(y,x)

array([6, 7, 8])

In [13]:
# Union of the two vectors
np.union1d(x,y)

array([1, 2, 3, 4, 5, 6, 7, 8])
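
The set functions above return sorted, unique values rather than positions. As a complementary sketch, `np.isin` gives a Boolean mask aligned with the first array, so it combines naturally with Boolean indexing, and `np.setxor1d` returns the elements found in exactly one of the two arrays. The variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 2, 8, 4])

# Boolean mask with one entry per element of x: True where the value occurs in y.
in_y = np.isin(x, y)

# Using the mask as an index keeps the original order of x.
common = x[in_y]

# Symmetric difference: values present in x or in y, but not in both.
only_one = np.setxor1d(x, y)

print(in_y)
print(common)
print(only_one)
```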

Last but not least, NumPy offers the `np.sort()` function, which returns a sorted copy of an ndarray:

In [15]:
v = np.random.randint(1,20,size=(10,))

In [16]:
v

array([ 3, 12, 11, 17, 10, 16,  3,  9, 15, 11])

In [17]:
np.sort(v)

array([ 3,  3,  9, 10, 11, 11, 12, 15, 16, 17])

In [18]:
# np.unique already returns the unique values in sorted order,
# so wrapping it in np.sort is not needed.
np.unique(v)

array([ 3,  9, 10, 11, 12, 15, 16, 17])
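
A few related variations are worth sketching: `np.sort` leaves its input unchanged, the `ndarray.sort()` method sorts in place, and `np.argsort` returns the indices that would sort the array. The array below reuses the values of `v` shown above so that the example does not depend on random numbers; the other names are illustrative.

```python
import numpy as np

v = np.array([3, 12, 11, 17, 10, 16, 3, 9, 15, 11])

# np.sort returns a new array; v keeps its original order.
sorted_copy = np.sort(v)

# Reversing the sorted copy gives descending order.
descending = np.sort(v)[::-1]

# np.argsort returns the indices that put v in ascending order.
order = np.argsort(v)

# The method form sorts in place and returns None, so work on a copy
# when the original order still matters.
w = v.copy()
w.sort()

print(sorted_copy)
print(descending)
print(v[order])
```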

When sorting a matrix, the `axis` argument selects the direction of the sort. If it is omitted, `np.sort` sorts along the last axis.

In [20]:
V = v.reshape(2,5)

In [21]:
V

array([[ 3, 12, 11, 17, 10],
       [16,  3,  9, 15, 11]])

In [25]:
# Sort along axis 0: each column is sorted independently
np.sort(V, axis=0)

array([[ 3,  3,  9, 15, 10],
       [16, 12, 11, 17, 11]])

In [26]:
# Sort along axis 1: each row is sorted independently
np.sort(V, axis=1)

array([[ 3, 10, 11, 12, 17],
       [ 3,  9, 11, 15, 16]])
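
Sorting along an axis treats every row or column independently, so values that shared a row may no longer line up afterwards. The sketch below contrasts this with two other options: `axis=None`, which flattens the array before sorting, and reordering whole rows by one column with `np.argsort`. The choice of the second column as the key is an arbitrary assumption for the example.

```python
import numpy as np

V = np.array([[3, 12, 11, 17, 10],
              [16, 3, 9, 15, 11]])

# With no axis given, np.sort uses the last axis (here, the same as axis=1).
default_sorted = np.sort(V)

# axis=None flattens the array first and returns a sorted 1-D array.
flat_sorted = np.sort(V, axis=None)

# To keep each row intact, sort the row indices by a key column instead.
row_order = np.argsort(V[:, 1])
by_second_col = V[row_order]

print(default_sorted)
print(flat_sorted)
print(by_second_col)
```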