## Boolean operators

Let's go through Python's bitwise logic operators: `&`, `|`, `^`, and `~`. NumPy overloads them so that they work element-wise on arrays.

In [2]:
import numpy as np
import pandas as pd
# Use pandas to extract rainfall inches as a NumPy array

rainfall = pd.read_csv('../../CSV Files/O_Reilly_DS/ch02/Seattle2014.csv')

In [3]:
rainfall = rainfall['PRCP'].values
inches = rainfall / 254  # 1/10 mm -> inches

In [5]:
np.sum((inches > 0.5) & (inches < 1))

29

The parentheses here are important. Because `&` binds more tightly than the comparison operators, with the parentheses removed this expression would be evaluated as follows, which results in an error:

In [6]:
inches > (0.5 & inches) < 1

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
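
To see the precedence issue in isolation, here is a minimal sketch on a small made-up array (`values` is a hypothetical stand-in for `inches`, not the Seattle data). It compares the parenthesized form with the unparenthesized one and records whether the latter raises a `TypeError`:

```python
import numpy as np

# Hypothetical sample values standing in for the rainfall array
values = np.array([0.0, 0.3, 0.7, 1.2])

# Parenthesized: each comparison is evaluated first, then combined with &
in_range = (values > 0.5) & (values < 1)
print(in_range)

# Unparenthesized: Python evaluates 0.5 & values first, which is not
# defined for floating-point inputs and raises a TypeError
try:
    values > 0.5 & values < 1
    raised = False
except TypeError:
    raised = True

print("TypeError raised:", raised)
```

Only the third value (0.7) lies strictly between 0.5 and 1, so the mask is `True` in that position alone.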

Using the equivalence of A AND B and NOT (NOT A OR NOT B) (De Morgan's law, which you may remember if you've taken an introductory logic course), we can compute the same result in a different manner:


In [7]:
np.sum(~((inches <= 0.5) | (inches >= 1)))

29
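
The equivalence can be checked directly on a small example. The sketch below uses a made-up array (`values` is hypothetical, not the rainfall data) and confirms that the `&` form and the negated `|` form give the same mask:

```python
import numpy as np

# Hypothetical sample values, including the boundary cases 0.5 and 1.0
values = np.array([0.0, 0.5, 0.75, 1.0, 1.5])

a = values > 0.5
b = values < 1

direct = a & b            # A AND B
de_morgan = ~(~a | ~b)    # NOT (NOT A OR NOT B)
negated = ~((values <= 0.5) | (values >= 1))  # same idea, comparisons flipped

print(np.array_equal(direct, de_morgan))  # True
print(np.array_equal(direct, negated))    # True
print(np.sum(direct))                     # 1
```

Flipping `>` into `<=` is only the same as negating it when the data contain no NaN values, since every comparison with NaN is `False`.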

In [8]:
print("Number days without rain: ", np.sum(inches == 0))
print("Number days with rain: ", np.sum(inches != 0))
print("Days with more than 0.5 inches:", np.sum(inches > 0.5))
print("Rainy days with < 0.2 inches :", np.sum((inches > 0) & (inches < 0.2)))

Number days without rain:  215
Number days with rain:  150
Days with more than 0.5 inches: 37
Rainy days with < 0.2 inches : 75


![numpy bit wise operators](../../Pictures/Numpy%20bitwise%20compression%20operators.png)

## Boolean Arrays as Masks

In the preceding section, we looked at aggregates computed directly on Boolean arrays. A more powerful pattern is to use Boolean arrays as masks, to select particular subsets of the data themselves. Returning to our x array from before, suppose we want an array of all values in the array that are less than, say, 5:

In [11]:
x = np.random.randint(10, size=(3, 4))

x

array([[8, 5, 6, 4],
       [6, 9, 0, 4],
       [5, 1, 7, 3]])

In [12]:
x < 5

array([[False, False, False,  True],
       [False, False,  True,  True],
       [False,  True, False,  True]])

> To select these values from the array, we can simply index on this Boolean array; this is known as a masking operation:

In [14]:
x[x < 5]

array([4, 0, 4, 1, 3])

What is returned is a one-dimensional array filled with all the values that meet this condition; in other words, all the values in positions at which the mask array is True.
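
As a quick check of that description, the following sketch rebuilds the same `x` by hand (instead of drawing it at random, so the result is reproducible) and inspects the masked result. The names `mask` and `selected` are introduced here only for illustration:

```python
import numpy as np

# Same values as the x array shown above, written out explicitly
x = np.array([[8, 5, 6, 4],
              [6, 9, 0, 4],
              [5, 1, 7, 3]])

mask = x < 5          # Boolean array with the same shape as x
selected = x[mask]    # values where the mask is True, in row-major order

print(selected)                        # [4 0 4 1 3]
print(selected.ndim)                   # 1
print(selected.size == np.sum(mask))   # True
```

The result is always one-dimensional, whatever the shape of `x`, and its length equals the number of `True` entries in the mask.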


We can compute some relevant statistics on our Seattle rain data:

In [15]:
# construct a mask of all rainy days

rainy = (inches > 0)

In [18]:
# construct a mask of all summer days (June 21st is the 172nd day)

summer = (np.arange(365) - 172 < 90) & (np.arange(365) - 172 > 0)

In [19]:
print("Median precip on rainy days in 2014 (inches): ", np.median(inches[rainy]))

print("Median precip on summer days in 2014 (inches): ", np.median(inches[summer]))

print("Maximum precip on summer days in 2014 (inches): ", np.max(inches[summer]))

print("Median precip on non-summer rainy days (inches):", np.median(inches[rainy & ~summer]))

Median precip on rainy days in 2014 (inches):  0.19488188976377951
Median precip on summer days in 2014 (inches):  0.0
Maximum precip on summer days in 2014 (inches):  0.8503937007874016
Median precip on non-summer rainy days (inches): 0.20078740157480315
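
It can help to look at which day indices the `summer` mask actually selects. The sketch below rebuilds the mask without the rainfall data; `days` and `summer_days` are names introduced here for illustration:

```python
import numpy as np

days = np.arange(365)  # zero-based day index: 0, 1, ..., 364

# Same condition as the summer mask above: 0 < days - 172 < 90
summer = (days - 172 < 90) & (days - 172 > 0)

summer_days = days[summer]
print(summer_days.min(), summer_days.max(), summer_days.size)  # 173 261 89
```

Both inequalities are strict, so the mask covers the 89 indices from 173 through 261. Because `np.arange(365)` starts at 0, whether index 172 corresponds exactly to June 21st depends on how days are counted, so the mask is best read as an approximate summer window.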
