# Lecture 16 – Comparisons and Boolean Operators

## Data 6, Summer 2022

## Comparisons

Comparisons in Python allow us to compare two values. The result of evaluating a comparison expression (e.g. `age > 21`) is a **boolean** value (`True` or `False`). Booleans are helpful for controlling the "flow" of Python code (we'll see this in more detail in Lecture 17).

Comparisons also allow us to check **conditions**, like "is this person at least 21?" or "is the password the user entered the correct password?".

In [None]:
# is age at least age_limit?
age_limit = 21
age = 17

age >= age_limit

In [None]:
# is password_guess equal to true_password?
true_password = 'qwerty1093x!'
password_guess = 'QWERTY1093x!'

password_guess == true_password

Above, we saw both the greater than or equal to (`>=`) and the equal to (`==`) comparison operators in action. Notice that `==` (the equality operator) is **not** the same as `=` (the assignment operator).

The code below works...

In [None]:
3 == 3

But this code raises a `SyntaxError`, because you cannot assign a value to a number:

In [None]:
3 = 3

Below are some other examples of comparison operators in action!

In [None]:
'hello' != 'howdy'

In [None]:
-3 > -2

In [None]:
-3 < -2

In [None]:
'alpha' >= 'beta'
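
Strings are ordered character by character, using each character's Unicode code point. The sketch below illustrates this with the built-in `ord` function; the variable names are only for illustration.

```python
# ord() gives the Unicode code point Python uses when ordering characters.
a_code = ord('a')   # 97
b_code = ord('b')   # 98

# The first characters differ, so they decide the comparison.
alpha_vs_beta = 'alpha' >= 'beta'    # False, because 'a' comes before 'b'

# Uppercase letters have smaller code points than lowercase ones.
upper_vs_lower = 'Zebra' < 'apple'   # True, because ord('Z') is 90 and ord('a') is 97

print(a_code, b_code, alpha_vs_beta, upper_vs_lower)
```

This is why a capitalized string can sort before a lowercase one even when that looks out of alphabetical order.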

Recall that `=` and `==` mean different things:

In [None]:
x = 5         # set x equal to 5

In [None]:
x == 5        # is x equal to 5?

This means that the weird-looking code below is actually completely valid!

In [None]:
y = x == 5
y
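
One way to read a line like this is to add parentheses around the comparison: Python evaluates `x == 5` first and then assigns the resulting boolean. A small sketch, where `y_explicit` is a name introduced only for this example:

```python
x = 5

# Python evaluates the comparison first, then assigns the result.
y = x == 5
y_explicit = (x == 5)   # same thing, with parentheses for readability

print(y, y_explicit)
```

The parenthesized version is usually easier to read, even though both lines behave the same way.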

### Comparing Different Types

In Python, you can check for equality (`==` or `!=`) with values of **any** type.

In [None]:
17 == '17'

In [None]:
'zebra' != True

In [None]:
True == 1.0

But you can only use the ordering operators (`<`, `<=`, `>`, `>=`) between values of the **same "category"**. Of the three cells below, the last one raises an error.

In [None]:
5 > True

In [None]:
'alpha' >= 'beta'

In [None]:
'alpha' >= 5

Notice that `5 > True` still works, even though the two values are of different types. We will return to this a bit later.
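
As a brief preview of why this works: in Python, `bool` is a subtype of `int`, with `True` behaving like `1` and `False` behaving like `0`. The sketch below checks this and also shows the `TypeError` from comparing a string with a number; the variable names are only for illustration.

```python
# bool is a subclass of int, so booleans can be compared with numbers.
is_int = isinstance(True, int)      # True

true_as_number = True == 1          # True
false_as_number = False == 0        # True

# 5 > True is therefore evaluated like 5 > 1.
five_vs_true = 5 > True             # True

# Strings and numbers have no shared ordering, so this raises a TypeError.
try:
    'alpha' >= 5
    raised = False
except TypeError:
    raised = True

print(is_int, true_as_number, false_as_number, five_vs_true, raised)
```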

### Floating Point Issues, Revisited

Be careful when comparing floating point numbers. Because of floating point rounding errors, Python might give us some weird results.

In [None]:
0.1 * 2 == 0.2

In [None]:
0.1 * 6

In [None]:
0.1 * 6 == 0.6

Instead of checking for equality directly, we can check if the difference between the two values is **negligible** (i.e. within some small amount):

In [None]:
abs(0.1 * 6 - 0.6) < 0.0001
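
The standard library also provides `math.isclose`, which performs a similar tolerance check. The sketch below compares the three approaches; the default tolerance of `math.isclose` is relative rather than the fixed `0.0001` used above, so the two checks are similar but not identical.

```python
import math

product = 0.1 * 6

exact_match = product == 0.6                   # False because of rounding error
manual_check = abs(product - 0.6) < 0.0001     # True: the difference is tiny
isclose_check = math.isclose(product, 0.6)     # True with the default relative tolerance

print(exact_match, manual_check, isclose_check)
```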

### String Containment

Another useful Python **operator** is the `in` keyword, which checks if the first value is present within the second value:

In [None]:
'berkeley' in 'uc berkeley'

In [None]:
'stanford' in 'uc berkeley'

In [None]:
'berkeley' in 'UC BERKELEY'
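
Because `in` is case-sensitive, one common workaround is to convert the text to a single case before checking. A minimal sketch, assuming lowercase matching is what we want (`school` is a name made up for this example):

```python
school = 'UC BERKELEY'

case_sensitive = 'berkeley' in school            # False: the cases do not match
case_insensitive = 'berkeley' in school.lower()  # True after lowercasing

# `not in` is the negated form of `in`.
not_present = 'stanford' not in school.lower()   # True

print(case_sensitive, case_insensitive, not_present)
```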

### Quick Check 1

What is the value of `not_passed` after running the cell below? Try to figure out the answer before running the cell.

In [None]:
passed = 0.5 * 30 + 0.5 * 100 >= 65
not_passed = passed == False

In [None]:
not_passed

## Boolean Operators

Sometimes, we want to check multiple conditions at once. For example, in order to graduate you must be a `'senior'` and have at least 120 units. We can check both of these conditions separately...

In [None]:
year = 'junior'
units = 125

In [None]:
year_check = year == 'senior'
year_check

In [None]:
units_check = units >= 120
units_check

...Or we could check both conditions at the same time with a **boolean operator** like `and` or `or`.

In [None]:
ready_to_grad = year_check and units_check
almost_ready = year_check or units_check

In [None]:
ready_to_grad

In [None]:
almost_ready

### `and`

The boolean operator `and` requires that both values are `True` for the expression to evaluate to `True`. Otherwise, the value of the expression is `False`.

In [None]:
True and True

In [None]:
True and False

### `or`

The boolean operator `or` requires that at least one of the values is `True`. If both values are `False` then the whole expression is `False`.

In [None]:
True or True

In [None]:
True or False

In [None]:
False or False

### `not`

The boolean operator `not` _negates_ the value immediately after it. If the value is `True`, then the result will be `False`. If the value is `False`, then the result will be `True`.

In [None]:
not True

In [None]:
not False
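
The rules for `and`, `or`, and `not` can be summarized as a truth table. The sketch below builds one by looping over every combination of two boolean values; `truth_table` and `flag` are names introduced only for this example.

```python
# Build every combination of two boolean values and record the results.
truth_table = []
for a in [True, False]:
    for b in [True, False]:
        truth_table.append((a, b, a and b, a or b))
        print(a, b, '| and:', a and b, '| or:', a or b)

# For a boolean value, comparing with False gives the same result as `not`.
flag = False   # hypothetical example value
same_result = (flag == False) == (not flag)
print(same_result)
```

Writing `not flag` is generally preferred over `flag == False` because it reads more naturally.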

### More Examples

(Recall that the modulo operator `%` calculates the remainder of a division.)

In [None]:
n = 12
(n % 2 == 0) and (n % 4 == 0)

In [None]:
(n % 2 == 0) and not (n % 5 == 0)

In [None]:
(n % 3 != 0) and (n % 4 != 0)

`and` and `or` can also be chained across more than two values.

In [None]:
True and False and True and True

In [None]:
True or False or True or True

The following chained comparisons are valid boolean expressions, but they are hard to interpret, so we recommend against them.

In [None]:
3 < 4 <= 5

In [None]:
3 < 4 > 2 < 11 > -1

In [None]:
3 < 4 < 2 > 11 > -1
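
To interpret a chained comparison, it can help to know that Python treats it as the individual comparisons joined with `and`. The sketch below writes two of the chains both ways; the variable names are only for illustration.

```python
# A chained comparison behaves like its pairwise comparisons joined by `and`.
chained = 3 < 4 <= 5
expanded = (3 < 4) and (4 <= 5)

longer_chain = 3 < 4 < 2 > 11 > -1
longer_expanded = (3 < 4) and (4 < 2) and (2 > 11) and (11 > -1)

print(chained, expanded)                # both True
print(longer_chain, longer_expanded)    # both False, since 4 < 2 is False
```

Writing out the expanded form makes it clearer which comparison causes the whole expression to be `False`.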

### Quick Check 3

What are the values of `wear_socks` and `wear_jacket` after running the following lines of code? Try to determine the answer before running the cell.

In [None]:
temp = 67
raining = bool(0)
wear_socks = (not not raining) and (temp < 60)
wear_jacket = (not wear_socks) or (temp > 65)
wear_jacket = wear_jacket and wear_socks

In [None]:
wear_socks

In [None]:
wear_jacket

## Demo

The first cell contains code that's mostly copied from a previous lecture. Ignore it once again!

In [None]:
from datascience import *
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import numpy as np

data = Table.read_table('data/countries.csv')
data = data.relabeled('Country(or dependent territory)', 'Country') \
           .relabeled('% of world', '%') \
           .relabeled('Source(official or UN)', 'Source')
data = data.with_columns(
    'Country', data.apply(lambda s: s[:s.index('[')].lower() if '[' in s else s.lower(), 'Country'),
    'Population', data.apply(lambda i: int(i.replace(',', '')), 'Population'),
    '%', data.apply(lambda f: float(f.replace('%', '')), '%')
)

def first_letter(s):
    return s[0]

def last_letter(s):
    return s[-1]

In [None]:
data

Below, assign `first_or_last` to a string containing a single lowercase letter.
We'll look at the distribution of populations of countries whose names either begin or end with `first_or_last`.

In [None]:
first_or_last = 's' # Assign `first_or_last` to a string containing a single lowercase letter

In [None]:
relevant_countries = data.where(data.apply(
    
    # Ignore `lambda name:` and focus on the boolean expression that follows it!
    lambda name: first_letter(name) == first_or_last or last_letter(name) == first_or_last
    
    
, 'Country')).sort('Population', descending = True)
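
The boolean expression inside the `lambda` can be tried on its own, without the table. The sketch below is a plain-Python approximation: `starts_or_ends_with` and `sample_names` are hypothetical and are not part of the dataset or the `datascience` library.

```python
def starts_or_ends_with(name, letter):
    # True if the first OR the last character of name equals letter.
    return name[0] == letter or name[-1] == letter

# Hypothetical sample of lowercase names, used only for illustration.
sample_names = ['spain', 'laos', 'india', 'sweden', 'peru']

matches = [name for name in sample_names if starts_or_ends_with(name, 's')]
print(matches)
```

Because of the `or`, a name is kept if either check is `True`; replacing `or` with `and` would keep only names that both start and end with the letter.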

In [None]:
relevant_countries

In [None]:
# Ignore everything except the last line!
plt.figure(figsize = (10, 7))
names = relevant_countries.column('Country')
pops = relevant_countries.column('Population')

if relevant_countries.num_rows > 15:
    names = names[:15]
    pops = pops[:15]

sns.barplot(x = pops, y = names, orient = 'h')

# Focus on this part!
plt.title('Populations of countries starting or ending with ' + first_or_last);