# Boolean Arrays

## Objectives

- Illustrate using comparison operators and Boolean arrays in NumPy for filtering data.
- Demonstrate how to count and identify specific elements within arrays using Boolean conditions.
- Explain the application of bitwise Boolean operators for complex condition-based selections.

## Background

This notebook covers Boolean arrays in NumPy, a core tool for filtering and manipulating data. Comparison operators identify the elements that meet a given criterion, which supports tasks such as counting positive, negative, or zero values and selecting ranges of data. The notebook also covers the bitwise operators, which perform element-wise logical operations on arrays and make it possible to combine several conditions in a single selection.

## Datasets Used

Synthetic datasets are generated using NumPy's random number generation capabilities.

## Comparison Operators

In [1]:
import numpy as np

# Fixing seed for reproducibility
np.random.seed(0)

In [2]:
x = np.random.randint(-2, 5, size=6)
x

array([ 2,  3, -2,  1,  1,  1])

x less than 0

In [3]:
x < 0

array([False, False,  True, False, False, False])

x greater than 0

In [4]:
x > 0

array([ True,  True, False,  True,  True,  True])

x greater than or equal to 1

In [5]:
x >= 1

array([ True,  True, False,  True,  True,  True])

x not equal to 1

In [6]:
# not equal to 1
x != 1

array([ True,  True,  True, False, False, False])

x equal to 2

In [7]:
# equal to 2
x == 2

array([ True, False, False, False, False, False])
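Each comparison operator is shorthand for a NumPy ufunc, so `x < 0` and `np.less(x, 0)` should produce the same Boolean array. The following is a minimal sketch of that equivalence; the array `vals` is an illustrative name with hard-coded values, so the output does not depend on the random seed.

```python
import numpy as np

# Hypothetical sample values, hard-coded for illustration
vals = np.array([2, 3, -2, 1, 1, 1])

# Operator form and ufunc form of "less than"
lt_operator = vals < 0
lt_ufunc = np.less(vals, 0)

# Operator form and ufunc form of "not equal"
ne_operator = vals != 1
ne_ufunc = np.not_equal(vals, 1)

print(lt_operator)                            # [False False  True False False False]
print(np.array_equal(lt_operator, lt_ufunc))  # True
print(np.array_equal(ne_operator, ne_ufunc))  # True
print(lt_operator.dtype)                      # bool
```

The result always has the same shape as the input and a `bool` dtype, which is what makes it usable for counting and masking.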

**Working with a two-dimensional array**

In [8]:
x2 = np.random.randint(-2, 5, size=(3,4))
x2

array([[-1,  1,  3,  0],
       [ 2,  4, -2, -2],
       [ 2,  0, -1,  4]])

x2 less than or equal to 0

In [9]:
# less than or equal to 0
x2 <= 0

array([[ True, False, False,  True],
       [False, False,  True,  True],
       [False,  True,  True, False]])

Counting nonzero entries

In [10]:
# counting nonzero entries
np.count_nonzero(x2)

10

Counting negative values

In [11]:
# counting negative values
np.count_nonzero(x2 < 0)

4

Counting positive values

In [12]:
# counting positive values
np.count_nonzero(x2 > 0)

6

Counting positive values, another way

In [13]:
# counting positive values, another way
np.sum(x2 > 0)

6

`sum()` with `axis=0` counts the positive values in each column

In [14]:
# axis=0 collapses the rows, giving one count per column
np.sum(x2 > 0, axis=0)

array([2, 2, 1, 1])

`sum()` with `axis=1` counts the positive values in each row

In [15]:
# axis=1 collapses the columns, giving one count per row
np.sum(x2 > 0, axis=1)

array([2, 2, 2])
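Summing a Boolean array counts its `True` entries because `True` is treated as 1 and `False` as 0. `np.count_nonzero` also accepts an `axis` argument, so either function can give per-column or per-row counts. A small sketch, using a hard-coded array named `grid` (an illustrative name) so the values are fixed:

```python
import numpy as np

# Hypothetical 3x4 array, hard-coded for illustration
grid = np.array([[-1, 1, 3, 0],
                 [2, 4, -2, -2],
                 [2, 0, -1, 4]])

positive = grid > 0  # Boolean array with the same shape as grid

per_column = np.sum(positive, axis=0)          # one count per column
per_row = np.count_nonzero(positive, axis=1)   # one count per row
fraction = positive.mean()                     # share of entries that are positive

print(per_column)  # [2 2 1 1]
print(per_row)     # [2 2 2]
print(fraction)    # 0.5
```

Taking the mean of a Boolean array is a convenient way to get the proportion of entries that satisfy a condition.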

Is any value equal to 0?

In [16]:
np.any(x2 == 0)

True

Columns with at least one 0

In [17]:
# Columns with at least one 0
np.any(x2 == 0, axis=0)

array([False,  True, False,  True])

Rows with at least one 0

In [18]:
# Rows with at least one 0
np.any(x2 == 0, axis=1)

array([ True, False,  True])

Are all values equal to 5?

In [19]:
# Are all values equal to 5?
np.all(x2 == 5)

False
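Like `np.any`, `np.all` accepts an `axis` argument, which makes it possible to test a condition on every row or column at once. The sketch below assumes a hard-coded array `grid` (an illustrative name) and checks which rows contain no zeros.

```python
import numpy as np

# Hypothetical 3x4 array, hard-coded for illustration
grid = np.array([[-1, 1, 3, 0],
                 [2, 4, -2, -2],
                 [2, 0, -1, 4]])

# True only if every entry satisfies the condition
all_at_least_minus_two = np.all(grid >= -2)

# One result per row: True where the row has no zero at all
rows_without_zero = np.all(grid != 0, axis=1)

print(all_at_least_minus_two)  # True
print(rows_without_zero)       # [False  True False]
```

Note that `np.all(grid != 0, axis=1)` is the element-wise negation of `np.any(grid == 0, axis=1)`.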

## Boolean Arrays as Masks

Getting the negative values


In [20]:
x2[x2 < 0]

array([-1, -2, -2, -1])

Getting the positive values

In [21]:
x2[x2 > 0]

array([1, 3, 2, 4, 2, 4])

Getting the values greater than 4

In [22]:
x2[x2 > 4]

array([], dtype=int32)
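Indexing with a Boolean mask returns a one-dimensional copy of the selected values, while assigning through a mask modifies the array in place. The following sketch illustrates both behaviors and, as an alternative that keeps the original shape, uses `np.where`. The names `grid`, `mask`, `clipped`, and `replaced` are illustrative.

```python
import numpy as np

# Hypothetical 3x4 array, hard-coded for illustration
grid = np.array([[-1, 1, 3, 0],
                 [2, 4, -2, -2],
                 [2, 0, -1, 4]])

mask = grid < 0

# Selection: a 1-D copy of the matching values, in row-major order
negatives = grid[mask]
print(negatives)   # [-1 -2 -2 -1]

# Editing the copy leaves the original array untouched
negatives[:] = 99
print(grid[0, 0])  # -1

# Assignment through the mask changes the array in place
clipped = grid.copy()
clipped[mask] = 0

# np.where builds a new array of the same shape instead
replaced = np.where(mask, 0, grid)

print(clipped)
print(np.array_equal(clipped, replaced))  # True
```

Masked assignment is the usual choice when the array can be modified in place; `np.where` is useful when the original must be preserved.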

## Boolean Operators

| a | b | a `and` b | a `or` b |
|---|---|---------|---------|
|False|False|False|False|
|False|True|False|True|
|True|False|False|True|
|True|True|True|True|

| a | `not` a |
|---|---|
|False|True|
|True|False|

**Bitwise Boolean Operators**

| Operator | Equivalent ufunc |
|----------|------------------|
| **`&`** | np.bitwise_and |
| **`\|`** | np.bitwise_or |
| **`^`** | np.bitwise_xor |
| **`~`** | np.bitwise_not |

- `and` and `or` evaluate the truth or falsehood of an entire object.
- `&` and `|` operate on the individual bits within each object.

In Python, every nonzero integer evaluates as `True` in a Boolean context.

In [23]:
a = 9
b = 10

`a and b`: both operands are truthy

In [24]:
# both operands are truthy, so `and` returns the right operand (b)
a and b

10

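The result is nonzero, so it is truthy. Python evaluates the right operand of an `and` expression only if the left operand is truthy, and the expression returns the last operand it evaluated, which is why the result here is `b` (10) rather than `True`.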

`b and a`: the same operands, swapped

In [25]:
# both operands are truthy, so `and` returns the right operand (a)
b and a

9

The result is again nonzero, so it is truthy.
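The behavior of `and` and `or` on plain integers can be checked with a few more cases. This short sketch shows that both operators return one of their operands rather than a strict `True` or `False`, and that a falsy left operand short-circuits `and`.

```python
a = 9
b = 10

print(a and b)        # 10  (left is truthy, so the right operand is returned)
print(b and a)        # 9
print(0 and b)        # 0   (left is falsy, so it is returned and b is never evaluated)
print(a or b)         # 9   (left is truthy, so it is returned immediately)
print(0 or b)         # 10
print(bool(a and b))  # True
```

Wrapping the expression in `bool()` converts the returned operand to an actual Boolean value.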

Bitwise AND

In [26]:
# bitwise AND of the two integers
a & b

8

Let's convert `a` and `b` to binary and work through the bitwise `&` operation by hand.

Binary representation of 9

In [27]:
# binary representation of a
bin(9)

'0b1001'

Binary representation of 10

In [28]:
# binary representation of b
bin(10)

'0b1010'

`1001 & 1010 = 1000`

Notice that the operation works bit by bit: each bit of the result is 1 only where the corresponding bits of both operands are 1.

In [29]:
bin(8)

'0b1000'
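The same bit-by-bit reasoning applies to `|` and `^`. As a small worked sketch, the loop below prints each value as a four-digit binary string next to its decimal form, using Python's built-in format specification.

```python
a, b = 9, 10

results = {
    "a": a,
    "b": b,
    "a & b": a & b,  # 1 where both bits are 1
    "a | b": a | b,  # 1 where at least one bit is 1
    "a ^ b": a ^ b,  # 1 where exactly one bit is 1
}

for name, value in results.items():
    # 04b pads the binary digits to width 4
    print(f"{name:>5} = {value:04b} ({value})")
```

With these inputs, `a & b` is `1000` (8), `a | b` is `1011` (11), and `a ^ b` is `0011` (3).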

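When you have an array of Boolean values in NumPy, you can think of it as a string of bits where `1 = True` and `0 = False`; `&` and `|` then combine two arrays element by element in the same way they combine the bits of two integers. The arrays below hold the integers 0 and 1, which behave the same way under `&` and `|`.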

In [30]:
ar1 = np.random.randint(0, 2, size=10)
ar1

array([1, 1, 0, 0, 1, 1, 1, 1, 0, 1])

In [31]:
ar2 = np.random.randint(0, 2, size=10)
ar2

array([0, 1, 0, 1, 1, 0, 1, 1, 0, 0])

Using the bitwise AND operator (`&`)

In [32]:
# element-wise bitwise AND
ar1 & ar2

array([0, 1, 0, 0, 1, 0, 1, 1, 0, 0])

Using the bitwise OR operator (`|`)

In [33]:
# element-wise bitwise OR
ar1 | ar2

array([1, 1, 0, 1, 1, 1, 1, 1, 0, 1])
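The operators `^` and `~` from the table work element-wise as well, but `~` behaves differently on integer and Boolean arrays: on integers it flips every bit of the number, so 1 becomes -2 and 0 becomes -1. The sketch below assumes two short hard-coded 0/1 arrays (`bits1` and `bits2`, illustrative names) and converts them to `bool` before negating.

```python
import numpy as np

# Hypothetical 0/1 integer arrays, hard-coded for illustration
bits1 = np.array([1, 1, 0, 0, 1])
bits2 = np.array([0, 1, 0, 1, 1])

# Convert to a true Boolean dtype before using ~
flags1 = bits1.astype(bool)
flags2 = bits2.astype(bool)

print(flags1 & flags2)  # [False  True False False  True]
print(flags1 ^ flags2)  # [ True False False  True False]
print(~flags1)          # [False False  True  True False]

# On the integer array, ~ flips all bits of each number instead
print(~bits1)           # [-2 -2 -1 -1 -2]
```

For logical negation of a mask, it is therefore safer to make sure the array has a `bool` dtype first.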

In [34]:
ar1 and ar2         # This will raise an error

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
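The error arises because `and` asks for a single truth value for each whole array, which NumPy refuses to guess for arrays with more than one element. The sketch below catches the error and shows two possible alternatives, depending on the intent: `np.logical_and` for an element-wise result, or `np.any` on each array for a single yes/no answer. The names `left` and `right` are illustrative.

```python
import numpy as np

# Hypothetical 0/1 integer arrays, hard-coded for illustration
left = np.array([1, 1, 0, 0, 1])
right = np.array([0, 1, 0, 1, 1])

try:
    left and right  # needs one truth value per array, which is ambiguous
    raised = False
except ValueError as err:
    raised = True
    print(err)

# Element-wise logical AND, returned as a Boolean array
elementwise = np.logical_and(left, right)
print(elementwise)  # [False  True False False  True]

# A single answer: does each array contain at least one nonzero value?
both_have_nonzero = bool(np.any(left) and np.any(right))
print(both_have_nonzero)  # True
```

Unlike `&`, `np.logical_and` treats any nonzero value as `True`, so it also works on integer arrays that hold values other than 0 and 1.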

In [35]:
# Recall the values in x
x

array([ 2,  3, -2,  1,  1,  1])

Getting the values between -1 and 1 (inclusive)

In [36]:
x[(x >= -1) & (x <= 1)]

array([1, 1, 1])

In [37]:
# Recall the values in x2
x2

array([[-1,  1,  3,  0],
       [ 2,  4, -2, -2],
       [ 2,  0, -1,  4]])

Getting the values between -1 and 1 (inclusive) from the two-dimensional array

In [38]:
x2[(x2 >= -1) & (x2 <= 1)]

array([-1,  1,  0,  0, -1])
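The parentheses around each comparison are required because `&` binds more tightly than the comparison operators. Without them, Python parses `x >= -1 & x <= 1` as the chained comparison `x >= (-1 & x) <= 1`, which fails on arrays. The sketch below demonstrates this with a hard-coded array `vals` (an illustrative name) and also shows `~` used to invert the combined mask.

```python
import numpy as np

# Hypothetical sample values, hard-coded for illustration
vals = np.array([2, 3, -2, 1, 1, 1])

# Correct: each comparison is parenthesized before combining with &
in_range = (vals >= -1) & (vals <= 1)
print(vals[in_range])   # [1 1 1]

# Without parentheses the expression becomes a chained comparison
try:
    vals[vals >= -1 & vals <= 1]
    failed = False
except ValueError:
    failed = True
print(failed)           # True

# ~ inverts the mask: values outside the range
print(vals[~in_range])  # [ 2  3 -2]
```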

## Conclusions

Key Takeaways:
- Boolean arrays enable efficient data filtering in NumPy using comparison operators.
- Functions like `np.count_nonzero`, `np.any`, and `np.all` facilitate counting and identifying elements according to specific criteria.
- Bitwise operators allow for complex condition-based data selections, enhancing data manipulation capabilities.
- These techniques are crucial for preprocessing and analyzing datasets, providing a foundation for advanced data exploration.

## References

- VanderPlas, J. (2017) Python Data Science Handbook: Essential Tools for Working with Data. USA: O’Reilly Media, Inc. chapter 2.