In [48]:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
x

array([1, 2, 3, 4, 5])

In [49]:
x < 3  # less than

array([ True,  True, False, False, False])

In [50]:
x > 3  # greater than

array([False, False, False,  True,  True])

In [51]:
x <= 3  # less than or equal

array([ True,  True,  True, False, False])

In [52]:
x >= 3  # greater than or equal

array([False, False,  True,  True,  True])

In [53]:
x != 3  # not equal

array([ True,  True, False,  True,  True])

In [54]:
x == 3  # equal

array([False, False,  True, False, False])

It is also possible to do an element-wise comparison of two arrays, and to include compound expressions:

In [55]:
(2 * x) == (x ** 2)

array([False,  True, False, False, False])

As in the case of arithmetic operators, the comparison operators are implemented as ufuncs in NumPy; for example, when you write `x < 3`, internally NumPy uses `np.less(x, 3)`.
A summary of the comparison operators and their equivalent ufuncs is shown here:

| Operator    | Equivalent ufunc  | Operator   | Equivalent ufunc |
|-------------|-------------------|------------|------------------|
|`==`         |`np.equal`         |`!=`        |`np.not_equal`    |
|`<`          |`np.less`          |`<=`        |`np.less_equal`   |
|`>`          |`np.greater`       |`>=`        |`np.greater_equal`|
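
As a quick check of the table, the following sketch compares each operator expression with its ufunc call on the small array used above. The names `pairs` and `all_match` are introduced here purely for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Pair each operator expression with the ufunc call from the table.
pairs = [
    (x == 3, np.equal(x, 3)),
    (x != 3, np.not_equal(x, 3)),
    (x < 3, np.less(x, 3)),
    (x <= 3, np.less_equal(x, 3)),
    (x > 3, np.greater(x, 3)),
    (x >= 3, np.greater_equal(x, 3)),
]

# True when every operator result matches its ufunc counterpart.
all_match = all(np.array_equal(op, uf) for op, uf in pairs)
print(all_match)
```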

Just as in the case of arithmetic ufuncs, these will work on arrays of any size and shape.
Here is a two-dimensional example:

In [56]:
rng = np.random.default_rng(seed=1701)
x = rng.integers(10, size=(3, 4))
x

array([[9, 4, 0, 3],
       [8, 6, 3, 1],
       [3, 7, 4, 0]])

In [57]:
x < 6

array([[False,  True,  True,  True],
       [False, False,  True,  True],
       [ True, False,  True,  True]])

In each case, the result is a Boolean array, and NumPy provides a number of straightforward patterns for working with these Boolean results.

## Working with Boolean Arrays

Given a Boolean array, there are a host of useful operations you can do.
We'll work with `x`, the two-dimensional array we created earlier:

In [58]:
print(x)

[[9 4 0 3]
 [8 6 3 1]
 [3 7 4 0]]


### Counting Entries

To count the number of `True` entries in a Boolean array, `np.count_nonzero` is useful:

In [59]:
# how many values less than 6?
np.count_nonzero(x < 6)

8

We see that there are eight array entries that are less than 6.
Another way to get at this information is to use `np.sum`; in this case, `False` is interpreted as `0`, and `True` is interpreted as `1`:

In [60]:
np.sum(x < 6)

8

The benefit of `np.sum` is that, as with other NumPy aggregation functions, the summation can be done along rows or columns as well:

In [61]:
# how many values less than 6 in each row?
np.sum(x < 6, axis=1)

array([3, 2, 3])

This counts the number of values less than 6 in each row of the matrix.
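
The same idea applies down the columns. The sketch below rebuilds the array explicitly so it runs on its own, counts per column with `axis=0`, and shows that `np.count_nonzero` also accepts an `axis` argument. The names `per_column` and `per_row` are illustrative.

```python
import numpy as np

# The two-dimensional array from above, written out explicitly.
x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])

# Count values less than 6 in each column (collapse the row axis).
per_column = np.sum(x < 6, axis=0)

# np.count_nonzero gives the same per-row counts as np.sum with axis=1.
per_row = np.count_nonzero(x < 6, axis=1)

print(per_column)
print(per_row)
```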

If we're interested in quickly checking whether any or all of the values are `True`, we can use (you guessed it) `np.any` or `np.all`:

In [62]:
# are there any values greater than 8?
np.any(x > 8)

True

In [63]:
# are there any values less than zero?
np.any(x < 0)

False

In [64]:
# are all values less than 10?
np.all(x < 10)

True

In [65]:
# are all values equal to 6?
np.all(x == 6)

False

`np.all` and `np.any` can be used along particular axes as well. For example:

In [66]:
# are all values in each row less than 8?
np.all(x < 8, axis=1)

array([False, False,  True])

Here all the elements in the third row are less than 8, while this is not the case for the first two rows.

Finally, a quick warning: as mentioned in [Aggregations: Min, Max, and Everything In Between](02.04-Computation-on-arrays-aggregates.ipynb), Python has built-in `sum`, `any`, and `all` functions. These have a different syntax than the NumPy versions, and in particular will fail or produce unintended results when used on multidimensional arrays. Be sure that you are using `np.sum`, `np.any`, and `np.all` for these examples!
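
To make the warning concrete, here is a small sketch of what the built-in functions do with a two-dimensional mask. The variable names are illustrative. The built-in `sum` iterates over rows and adds them together, so it returns per-column counts instead of a single total. The built-in `any` tries to take the truth value of each row and raises an error.

```python
import numpy as np

x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])
mask = x < 6

# NumPy's version counts every True entry in the array.
numpy_total = np.sum(mask)

# The built-in adds the rows together, giving one count per column.
builtin_result = sum(mask)

# The built-in any() needs a single truth value per row, which is ambiguous.
try:
    any(x > 8)
    builtin_any_failed = False
except ValueError:
    builtin_any_failed = True

print(numpy_total, builtin_result, builtin_any_failed)
```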

### Boolean Operators

We've already seen how we might count, say, all days with less than 20 mm of rain, or all days with more than 10 mm of rain.
But what if we want to know how many days there were with more than 10 mm and less than 20 mm of rain? We can accomplish this with Python's *bitwise logic operators*, `&`, `|`, `^`, and `~`.
As with the standard arithmetic operators, NumPy overloads these as ufuncs that work element-wise on (usually Boolean) arrays.

For example, we can address this sort of compound question as follows:
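
A minimal sketch of the pattern, applied to the small array `x` instead of rainfall measurements: count the values greater than 2 and less than 7, then invert the condition with `~`. The names `between`, `n_between`, and `outside` are illustrative.

```python
import numpy as np

x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])

# Element-wise AND of two Boolean arrays.
between = (x > 2) & (x < 7)
n_between = np.sum(between)

# ~ negates the mask; the same result written with OR and flipped comparisons.
outside = ~between
same_as_or = np.array_equal(outside, (x <= 2) | (x >= 7))

print(n_between, same_as_or)
```

The parentheses matter: `&` binds more tightly than the comparison operators, so `x > 2 & x < 7` would be parsed as `x > (2 & x) < 7` and raise an error.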

In [67]:
x

array([[9, 4, 0, 3],
       [8, 6, 3, 1],
       [3, 7, 4, 0]])

Boolean arrays can also be used to select the values themselves. Suppose we want an array of all values in `x` that are less than, say, 5. We can obtain a Boolean array for this condition easily, as we've already seen:

In [68]:
x < 5

array([[False,  True,  True,  True],
       [False, False,  True,  True],
       [ True, False,  True,  True]])

Now, to *select* these values from the array, we can simply index on this Boolean array; this is known as a *masking* operation:

In [69]:
x[x < 5]

array([4, 0, 3, 3, 1, 3, 4, 0])

What is returned is a one-dimensional array filled with all the values that meet this condition; in other words, all the values in positions at which the mask array is `True`.

We are then free to operate on these values as we wish.
For example, we can compute some relevant statistics on our Seattle rain data:
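
A minimal sketch of the pattern follows. The array `rain_mm` holds a handful of made-up daily values in millimeters, chosen only for illustration; it is not the Seattle dataset, and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical daily rainfall in mm (made-up numbers, not the Seattle data).
rain_mm = np.array([0.0, 12.5, 3.0, 0.0, 25.0, 18.0, 0.0, 7.5])

# Mask of days with any rain at all.
rainy = rain_mm > 0

# Median rainfall over rainy days only.
median_rainy = np.median(rain_mm[rainy])

# Largest rainfall among rainy days with less than 10 mm.
max_light = np.max(rain_mm[rainy & (rain_mm < 10)])

# Number of days with more than 10 mm.
n_heavy = np.sum(rain_mm > 10)

print(median_rainy, max_light, n_heavy)
```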

By combining Boolean operations, masking operations, and aggregates, we can very quickly answer these sorts of questions about our dataset.

## Using the Keywords and/or Versus the Operators &/|

One common point of confusion is the difference between the keywords `and` and `or` on the one hand, and the operators `&` and `|` on the other.
When would you use one versus the other?

The difference is this: `and` and `or` operate on the object as a whole, while `&` and `|` operate on the elements within the object.

When you use `and` or `or`, it is equivalent to asking Python to treat the object as a single Boolean entity.
In Python, all nonzero integers will evaluate as `True`. Thus:

In [70]:
bool(42), bool(0)

(True, False)

In [71]:
bool(42 and 0)

False

In [72]:
bool(42 or 0)

True

When you use `&` and `|` on integers, the expression operates on the bitwise representation of the element, applying the *and* or the *or* to the individual bits making up the number:

In [73]:
bin(42)

'0b101010'

In [74]:
bin(59)

'0b111011'

In [75]:
bin(42 & 59)

'0b101010'

In [76]:
bin(42 | 59)

'0b111011'

Notice that the corresponding bits of the binary representation are compared in order to yield the result.

When you have an array of Boolean values in NumPy, it can be thought of as a string of bits where `1 = True` and `0 = False`, and `&` and `|` operate much as in the preceding examples:

In [77]:
A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)
A | B

array([ True,  True,  True, False,  True,  True])

But if you use `or` on these arrays it will try to evaluate the truth or falsehood of the entire array object, which is not a well-defined value:
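
The failure can be reproduced with a sketch like this one, which catches the exception so that the snippet runs to completion. The exact wording of the error message may differ between NumPy versions, so only the exception type is relied on here.

```python
import numpy as np

A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)

# `or` asks for the truth value of A as a whole, which NumPy refuses to guess.
try:
    A or B
    or_raised = False
except ValueError as err:
    or_raised = True
    print(type(err).__name__, err)
```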

Similarly, when evaluating a Boolean expression on a given array, you should use `|` or `&` rather than `or` or `and`:

In [78]:
x = np.arange(10)
(x > 4) & (x < 8)

array([False, False, False, False, False,  True,  True,  True, False,
       False])

Trying to evaluate the truth or falsehood of the entire array will give the same `ValueError` we saw previously:
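
As a sketch, the same expression written with `and` fails in the same way. The exception is caught here so the snippet completes, and the flag `and_raised` is an illustrative name.

```python
import numpy as np

x = np.arange(10)

# `and` needs a single truth value for (x > 4), which is ambiguous for an array.
try:
    (x > 4) and (x < 8)
    and_raised = False
except ValueError as err:
    and_raised = True
    print(type(err).__name__, err)
```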

So, remember this: `and` and `or` perform a single Boolean evaluation on an entire object, while `&` and `|` perform multiple Boolean evaluations on the content (the individual bits or bytes) of an object.
For Boolean NumPy arrays, the latter is nearly always the desired operation.