# Comparisons, Masks, and Boolean Logic

In many cases, we want to select the elements that satisfy some condition, or to remove outliers that lie above some threshold. Masking is the tool for this. NumPy implements comparison operators such as `<` and `>` as element-wise ufuncs, and the result of a comparison is always an array with a Boolean data type.

In [2]:
import numpy as np
x = np.array([1, 2, 3, 4, 5])

x < 3

x > 3

x <= 3

x >= 3

x != 3

x == 3

array([ True,  True, False, False, False])

array([False, False, False,  True,  True])

array([ True,  True,  True, False, False])

array([False, False,  True,  True,  True])

array([ True,  True, False,  True,  True])

array([False, False,  True, False, False])

When applied to NumPy arrays, Python's comparison operators, as used above, are wrappers for NumPy ufuncs:

| Operator    | Equivalent ufunc    | Operator   | Equivalent ufunc    |
|---------------|---------------------|---------------|---------------------|
|``==``         |``np.equal``         |``!=``         |``np.not_equal``     |
|``<``          |``np.less``          |``<=``         |``np.less_equal``    |
|``>``          |``np.greater``       |``>=``         |``np.greater_equal`` |
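As a quick check of the table above, the following sketch compares each operator with its ufunc on the same small array. The names `pairs` and `all_match` are introduced here only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Pair each operator expression with the ufunc call from the table.
pairs = [
    (x == 3, np.equal(x, 3)),
    (x != 3, np.not_equal(x, 3)),
    (x < 3, np.less(x, 3)),
    (x <= 3, np.less_equal(x, 3)),
    (x > 3, np.greater(x, 3)),
    (x >= 3, np.greater_equal(x, 3)),
]

# np.array_equal returns a plain Python bool, so the built-in all() is safe here.
all_match = all(np.array_equal(op, uf) for op, uf in pairs)
print(all_match)        # True
print((x < 3).dtype)    # bool
```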

## Working with Boolean Arrays

Boolean arrays support several useful operations, such as:
* Counting elements that meet the criteria
* Boolean operators

### Counting elements that meet the criteria

To test different criteria, we can use functions such as `np.count_nonzero`, `np.sum`, `np.any`, and `np.all`.

In [5]:
rng = np.random.RandomState(0)
x = rng.randint(10, size=(3, 4))
x

# how many values less than 6?
np.count_nonzero(x < 6)
np.sum(x < 6) # same task

# are there any values greater than 8?
np.any(x > 8)

# are all values less than 10?
np.all(x < 10)

array([[5, 0, 3, 3],
       [7, 9, 3, 5],
       [2, 4, 7, 6]])

8

8

True

True

Note that Python also has built-in functions `any()` and `all()`. However, these have a different syntax than the NumPy versions and may fail or produce unintended results when used on multidimensional arrays.
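The difference can be seen in a minimal sketch that rebuilds the same 3×4 array by hand (so it does not depend on the random generator). The NumPy functions reduce over the whole array, or along one axis if `axis` is given, whereas the built-in `any()` iterates over the rows and tries to take the truth value of each row. The variable names are illustrative.

```python
import numpy as np

x = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])

# NumPy version: one answer for the whole array.
any_gt8 = np.any(x > 8)

# NumPy version with axis=1: one answer per row.
rows_all_lt8 = np.all(x < 8, axis=1)

# Built-in any() walks over the rows; each row is a multi-element
# Boolean array, whose truth value is ambiguous, so this raises.
try:
    any(x > 8)
    builtin_failed = False
except ValueError:
    builtin_failed = True

print(any_gt8)          # True
print(rows_all_lt8)     # [ True False  True]
print(builtin_failed)   # True
```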

### Boolean operators

We can combine Boolean arrays element-wise with the bitwise logic operators `&`, `|`, `^`, and `~`.

In [8]:
np.sum((x > 3) & (x < 10))

7

In [9]:
np.sum(~((x > 3) & (x < 10)))

5
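The two counts above are complements of each other. As a small sketch on the same array (rebuilt by hand here), the negated condition can also be written with `|` using De Morgan's law, and `^` selects elements where exactly one condition holds. The names `inside`, `outside_not`, `outside_or`, and `exactly_one` are illustrative.

```python
import numpy as np

x = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])

inside = (x > 3) & (x < 10)          # both conditions hold
outside_not = ~inside                # negation of the combined mask
outside_or = (x <= 3) | (x >= 10)    # same mask via De Morgan's law

# XOR: True where exactly one of the two conditions is True.
exactly_one = (x > 3) ^ (x < 6)

print(np.sum(inside), np.sum(outside_not))      # 7 5
print(np.array_equal(outside_not, outside_or))  # True
```

The parentheses around each comparison matter: `&` and `|` bind more tightly than `<` and `>`, so an expression like `x > 3 & x < 10` would not be read as two separate comparisons.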

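## Boolean Arrays as Masks

Indexing an array with a Boolean array selects only the elements where the mask is `True`; the result is a one-dimensional array of the matching values.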

In [10]:
x

x < 5

x[x < 5]

array([[5, 0, 3, 3],
       [7, 9, 3, 5],
       [2, 4, 7, 6]])

array([[False,  True,  True,  True],
       [False, False,  True, False],
       [ True,  True, False, False]])

array([0, 3, 3, 3, 2, 4])
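Masks also cover the outlier case mentioned at the start of this section. The sketch below assumes a cutoff of 6, chosen purely for illustration: it first selects the values at or below the cutoff, then uses the inverted mask to overwrite the outliers in a copy of the array.

```python
import numpy as np

x = np.array([[5, 0, 3, 3],
              [7, 9, 3, 5],
              [2, 4, 7, 6]])

threshold = 6              # hypothetical cutoff, not from the original text
keep = x <= threshold      # Boolean mask, same shape as x

# Selecting with a mask returns a 1-D array of the kept values.
kept_values = x[keep]

# Assigning through a mask changes only the masked positions.
clipped = x.copy()         # work on a copy so x stays untouched
clipped[~keep] = threshold

print(kept_values)         # [5 0 3 3 3 5 2 4 6]
print(clipped)
```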

## Difference between `and`, `or` and `&`, `|`

The difference is this: `and` and `or` gauge the truth or falsehood of an entire object, while `&` and `|` refer to the bits within each object.

In [11]:
bool(42), bool(0)

bool(42 and 0)

bool(42 or 0)

(True, False)

False

True

In [13]:
bin(42)

bin(59)

bin(-1)

bin(42 & 59)

bin(42 | 59)

'0b101010'

'0b111011'

'-0b1'

'0b101010'

'0b111011'

So remember this: ***`and` and `or` perform a single Boolean evaluation on an entire object, while `&` and `|` perform multiple Boolean evaluations on the content (the individual bits or bytes) of an object.*** For Boolean NumPy arrays, the latter is nearly always the desired operation.

In [14]:
A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)
A | B
A or B

array([ True,  True,  True, False,  True,  True])

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
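As the error message suggests, a whole-array truth value has to be requested explicitly. The following sketch, using the same `A` and `B`, shows the element-wise result (via `|` or the equivalent `np.logical_or`), confirms that `or` raises, and then reduces the result with `.any()` and `.all()`. The variable names are illustrative.

```python
import numpy as np

A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)

# Element-wise OR: the operator and the ufunc give the same array.
elementwise = A | B
same_as_ufunc = np.array_equal(elementwise, np.logical_or(A, B))

# `or` asks for the truth value of A as a whole, which is ambiguous.
try:
    A or B
    raised = False
except ValueError:
    raised = True

# Reduce to a single Boolean explicitly when one answer is needed.
any_true = (A | B).any()   # at least one element is True
all_true = (A | B).all()   # every element is True

print(same_as_ufunc, raised)   # True True
print(any_true, all_true)      # True False
```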