# Boolean Indexing, Set Operations and Sorting

Up to now we have seen how to slice ndarrays and select their elements using indices. This is useful when we know the exact indices of the elements we want to select. However, in many situations we don't know those indices in advance. For example, suppose we have a 10,000 x 10,000 ndarray of random integers ranging from 1 to 15,000 and we only want to select those integers that are less than 20. Boolean indexing helps in these cases by allowing us to select elements using logical conditions instead of explicit indices. Let's see some examples:

In [2]:
import numpy as np

## Boolean Operations

In [10]:
# Create a 5 x 5 ndarray that contains the integers from 0 to 24
x = np.arange(25).reshape((5,5))
print('Original x\n',x)

# Use boolean indexing to select elements in x
print('The elements in x that are greater than 17 :',x[x>17])
print('The elements in x that are less than 10:',x[x<10])
print('The elements in x that are greater than 10 and less than 17:',x[(x>10) & (x<17)])


Original x
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
The elements in x that are greater than 17 : [18 19 20 21 22 23 24]
The elements in x that are less than 10: [0 1 2 3 4 5 6 7 8 9]
The elements in x that are greater than 10 and less than 17: [11 12 13 14 15 16]
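
The scenario described at the start of this section can be sketched in the same way. The example below is only an illustration: it uses a 1,000 x 1,000 ndarray instead of a 10,000 x 10,000 one to keep memory use small, and the names `rng`, `big`, `mask`, `small_values` and `clipped`, as well as the seed value, are assumptions made for this sketch. It also shows that the comparison itself produces a boolean ndarray, and that such a mask can be used for assignment as well as selection.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)

# A smaller stand-in for the 10,000 x 10,000 example: integers from 1 to 15,000
big = rng.integers(1, 15001, size=(1000, 1000))

# The comparison produces a boolean ndarray with the same shape as big
mask = big < 20
print('mask shape:', mask.shape, 'dtype:', mask.dtype)

# Indexing with the mask returns the selected elements as a rank 1 ndarray
small_values = big[mask]
print('Number of elements less than 20:', small_values.size)

# A mask can also be used for assignment: set every selected element to 0
# (done on a copy so that big itself is left unchanged)
clipped = big.copy()
clipped[mask] = 0
print('Smallest nonzero value after assignment:', clipped[clipped > 0].min())
```

When combining conditions, use the element-wise operators `&` and `|` rather than Python's `and` and `or`, and wrap each comparison in parentheses, as in `x[(x>10) & (x<17)]`, because `&` binds more tightly than the comparison operators.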


## Set Operations

In [13]:
# Create a rank 1 ndarray
x = np.array([1,3,5,6])

# Create another rank 1 ndarray
y = np.array([2,3,5,8,9])

print('The elements that are present in both x and y:',np.intersect1d(x,y))
print('The elements that are present in x but not in y:', np.setdiff1d(x,y))
print('All the elements of x and y:', np.union1d(x,y))

The elements that are present in both x and y: [3 5]
The elements that are present in x but not in y: [1 6]
All the elements of x and y: [1 2 3 5 6 8 9]
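
One detail the output above does not make obvious is that these functions treat their inputs as sets: the result contains each value only once and comes back sorted, even when the inputs hold repeated values in arbitrary order. The following sketch, with made-up arrays `a` and `b`, illustrates this.

```python
import numpy as np

# Inputs with repeated values, in no particular order
a = np.array([5, 3, 3, 1, 6, 5])
b = np.array([9, 5, 2, 3, 8, 3])

common = np.intersect1d(a, b)   # values found in both a and b
only_a = np.setdiff1d(a, b)     # values found in a but not in b
combined = np.union1d(a, b)     # values found in a, in b, or in both

print('intersect1d:', common)
print('setdiff1d:  ', only_a)
print('union1d:    ', combined)
```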


## Sorting ndarrays

We can also sort ndarrays in NumPy. We will learn how to use the np.sort() function to sort rank 1 and rank 2 ndarrays in different ways. Like other functions we saw before, sort can also be used as a method. However, the two forms differ in what happens to the original data. When np.sort() is used as a function, it sorts the ndarray out of place, meaning that it returns a sorted copy and doesn't change the original ndarray. When sort is used as a method, ndarray.sort() sorts the ndarray in place, meaning that the original ndarray itself is rearranged into sorted order. Let's see some examples:

In [21]:
# Create an unsorted rank 1 ndarray
x = np.random.randint(1,11,size=10)
print('Unsorted x\n',x)

# Sorting the array without modifying the original array
print('Sorted x\n',np.sort(x))

# Check if original array is modified
print('Original x\n',x)

# Sorting the array by modifying the original array
x.sort()

# Check the original array
print('x\n',x)

Unsorted x
 [ 1  4 10  6  9  3 10  8  8  6]
Sorted x
 [ 1  3  4  6  6  8  8  9 10 10]
Original x
 [ 1  4 10  6  9  3 10  8  8  6]
x
 [ 1  3  4  6  6  8  8  9 10 10]
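
A common pitfall follows from the in-place behavior: because ndarray.sort() modifies the ndarray directly, it returns None, so its result should not be assigned to a variable. The short sketch below uses a small fixed array (the names `sorted_copy` and `result` are chosen only for this illustration) to make the difference visible.

```python
import numpy as np

x = np.array([4, 1, 3, 2])

# np.sort() returns a new sorted ndarray and leaves x as it was
sorted_copy = np.sort(x)
print('sorted_copy:', sorted_copy)
print('x after np.sort(x):', x)

# x.sort() sorts x itself and returns None, so do not assign its result
result = x.sort()
print('result of x.sort():', result)
print('x after x.sort():', x)
```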


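When sorting rank 2 ndarrays, we need to tell the np.sort() function which axis to sort along. This is done by using the axis keyword. With axis=0 the sort runs along the rows axis (top to bottom), so the values within each column end up in ascending order. With axis=1 it runs along the columns axis (left to right), so the values within each row end up in ascending order. The example below labels these two cases as sorting by rows and sorting by columns. Let's see some examples: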

In [27]:
# Create an unsorted rank 2 ndarray
x = np.random.randint(1,13,size=(4,3))
print('x\n',x)

# Sort along axis 0 (down the rows): each column ends up sorted
print()
print('Sorting by rows\n',np.sort(x,axis=0))

# Sort along axis 1 (across the columns): each row ends up sorted
print()
print('Sorting by columns\n',np.sort(x,axis=1))

x
 [[ 7 11  9]
 [ 9  5  9]
 [12  4  2]
 [ 2 10  7]]

Sorting by rows
 [[ 2  4  2]
 [ 7  5  7]
 [ 9 10  9]
 [12 11  9]]

Sorting by columns
 [[ 7  9 11]
 [ 5  9  9]
 [ 2  4 12]
 [ 2  7 10]]
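
Because the random example changes on every run, it can help to check the axis behavior on a small fixed ndarray. The sketch below uses a made-up 3 x 3 array `m`. It also shows that when no axis is given, np.sort() sorts along the last axis, which for a rank 2 ndarray is the same as axis=1.

```python
import numpy as np

m = np.array([[3, 1, 2],
              [9, 7, 8],
              [6, 4, 5]])

# axis=0: values are ordered from top to bottom within each column
down_columns = np.sort(m, axis=0)
print('axis=0\n', down_columns)

# axis=1: values are ordered from left to right within each row
across_rows = np.sort(m, axis=1)
print('axis=1\n', across_rows)

# With no axis argument, np.sort() uses the last axis (axis=-1),
# which for a rank 2 ndarray is the same as axis=1
default_sort = np.sort(m)
print('Default matches axis=1:', np.array_equal(default_sort, across_rows))
```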
