# Logic, control flow and filtering

## Comparison operators

### Equality

To check whether two Python values or variables are equal, use `==`. To check for inequality, use `!=`.

In [1]:
# Write code to see if True and 1 are equal
True == 1

True

In [2]:
# Write Python code to check if -5 * 15 is not equal to 75
(-5*15) != 75

True
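
As a small sketch of how `==` behaves across types (the variable names below are illustrative and not part of the original exercise): `bool` is a subclass of `int`, so `True` compares equal to `1`, whereas a string never compares equal to a number.

```python
# Equality compares values for Python's numeric types
bool_vs_int = True == 1      # bool is a subclass of int
int_vs_float = 1 == 1.0      # ints and floats compare by value
str_vs_int = "1" == 1        # a string is never equal to a number

print(bool_vs_int, int_vs_float, str_vs_int)

# For built-in types, != gives the opposite answer to ==
product_differs = (-5 * 15) != 75
print(product_differs)
```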

### Greater and less than

The comparison operators `<` and `>` can be combined with an equals sign to form `<=` (less than or equal to) and `>=` (greater than or equal to).

In [3]:
# Comparison of integers
x = -3 * 6
print(x >= -10)

# Comparison of strings
y = "test"
print('test' <= y)

# Comparison of booleans
print(True > False)

False
True
True
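
String comparisons can be surprising, so here is a minimal sketch of the rules involved. The example strings and variable names are made up for illustration. Python compares strings character by character using Unicode code points, and it refuses to order values of unrelated types such as an `int` and a `str`.

```python
# Strings are compared character by character using Unicode code points
upper_first = "Zebra" < "apple"      # "Z" (90) sorts before "a" (97)
prefix_first = "test" < "testing"    # a prefix sorts before the longer string

print(upper_first, prefix_first)

# Ordering comparisons between unrelated types are not defined
try:
    3 < "test"
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(mixed_ok)
```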


### Compare arrays

Comparison operators also work on NumPy arrays: the comparison is applied element by element and returns a boolean array of the same shape.

In [5]:
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# Using comparison operators, generate boolean arrays that answer the following questions:
# Which areas in my_house are greater than or equal to 18?
print(my_house >= 18)

# Which areas in my_house are smaller than the ones in your_house?
print(my_house < your_house)

[ True  True False False]
[False  True  True False]
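
A boolean array is most useful as a mask. The sketch below reuses the two arrays from the cell above; `big_rooms` and `n_smaller` are names introduced here for illustration.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# A boolean array used as an index keeps only the matching elements
big_rooms = my_house[my_house >= 18]
print(big_rooms)

# True counts as 1, so summing a mask counts the matches
n_smaller = int(np.sum(my_house < your_house))
print(n_smaller)
```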


## Boolean operators

### Boolean operators

A boolean value is either `True` or `False`. With **boolean operators** such as `and`, `or` and `not`, you can **combine boolean results** to perform more advanced queries on your data.

In [6]:
my_kitchen = 18.0
your_kitchen = 14.0

# Write Python expressions to check whether:
# my_kitchen is bigger than 10 and smaller than 18
print(my_kitchen > 10 and my_kitchen < 18)

# my_kitchen is smaller than 14 or bigger than 17
print(my_kitchen < 14 or my_kitchen > 17)

# double the area of my_kitchen is smaller than triple the area of your_kitchen
print(my_kitchen*2 < your_kitchen*3)

False
True
True
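
Two related behaviours are worth a quick sketch: chained comparisons, which are a shorter way of writing two tests joined by `and`, and short-circuit evaluation. The variable names `in_range`, `not_in_range` and `safe` are illustrative.

```python
my_kitchen = 18.0

# A chained comparison is equivalent to joining the two tests with `and`
in_range = 10 < my_kitchen < 18
print(in_range)

# `not` inverts a boolean result
not_in_range = not in_range
print(not_in_range)

# `or` short-circuits: once the left side is True the right side is skipped,
# so the division by zero below is never evaluated
safe = my_kitchen > 17 or (1 / 0) > 1
print(safe)
```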


### Boolean operators with NumPy

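Comparison operators work element-wise on NumPy arrays, but the boolean operators `and`, `or` and `not` do not: they need a single truth value, and NumPy raises a `ValueError` when asked for the truth value of an array with more than one element. To combine boolean arrays element by element, use `np.logical_and()`, `np.logical_or()` and `np.logical_not()`.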

In [7]:
# Generate boolean arrays that answer the following questions:
# Which areas in my_house are greater than 18.5 or smaller than 10?
print(np.logical_or(my_house > 18.5, my_house < 10))

# Which areas are smaller than 11 in both my_house and your_house?
print(np.logical_and(my_house < 11, your_house < 11))

[False  True False  True]
[False False False  True]
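
To make the failure mode concrete, the sketch below first triggers the `ValueError` and then shows the operators `|`, `&` and `~`, which NumPy applies element-wise to boolean arrays, just like the `np.logical_*` functions. The names `either`, `both` and `neither` are illustrative.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# `or` asks for a single truth value, which is ambiguous for a multi-element array
try:
    my_house > 18.5 or my_house < 10
    raised = False
except ValueError:
    raised = True
print(raised)

# Element-wise operators; the parentheses are required because
# | and & bind more tightly than comparison operators
either = (my_house > 18.5) | (my_house < 10)
both = (my_house < 11) & (your_house < 11)
neither = ~either

print(either)
print(both)
print(neither)
```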


## Conditional statements

### The if statement 

The `if` statement lets you run a block of code only when a condition, evaluated as a boolean value, is met.

In [8]:
room = "kit"
area = 14.0

# Print out "looking around in the kitchen." if room equals "kit"
if room == 'kit':
    print("looking around in the kitchen.")

# Write another if statement that prints out "big place!" if area is greater than 15
if area > 15:
    print("big place!")

looking around in the kitchen.
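
The condition does not have to be a comparison. Any value can be tested, and Python converts it to a boolean first. The following sketch uses a hypothetical helper, `describe`, to show which common values count as true and which as false.

```python
def describe(value):
    """Hypothetical helper: report which branch an `if` takes for `value`."""
    if value:
        return "truthy"
    return "falsy"

# Zero, empty strings and empty containers are falsy; most other values are truthy
results = {repr(v): describe(v) for v in [0, 14.0, "", "kit", [], [0]]}
print(results)
```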


### Add else

In [9]:
# Add an else statement to your second control structure so that "pretty small." is printed out if area > 15 evaluates to False
if area > 15:
    print("big place!")
else:
    print("pretty small.")

pretty small.


### Customize further: elif

In [10]:
# Add an elif to the second control structure such that "medium size, nice!" is printed out if area is greater than 10
if area > 15:
    print("big place!")
elif area > 10:
    print("medium size, nice!")
else:
    print("pretty small.")

medium size, nice!
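
Python checks the branches from top to bottom and runs only the first one whose condition is true, which is why an area of 20 prints "big place!" even though it is also greater than 10. As a sketch, the same chain can be wrapped in a function (`describe_area` is a hypothetical name) and tried on several values:

```python
def describe_area(area):
    """Hypothetical helper wrapping the if/elif/else chain."""
    if area > 15:
        return "big place!"
    elif area > 10:
        return "medium size, nice!"
    else:
        return "pretty small."

# One value for each branch
labels = [describe_area(a) for a in (20.0, 14.0, 9.5)]
print(labels)
```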


## Filtering DataFrames

### Filter values from a DataFrame

In [13]:
# From the cars DataFrame extract the drives_right column as a Pandas Series and store it as dr
import pandas as pd
cars = pd.read_csv('./data/cars.csv', index_col=0)
dr = cars['drives_right']

# Use dr, a boolean Series, to subset the cars DataFrame 
# Store the resulting selection in sel
sel = cars[dr]

# Display sel (drives_right should be True in every row) together with the shape of the full cars DataFrame
display(sel, cars.shape)

| | country | drives_right | cars_per_cap |
|---|---|---|---|
| US | United States | True | 809 |
| RU | Russia | True | 200 |
| MOR | Morocco | True | 70 |
| EG | Egypt | True | 45 |


(7, 3)
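
If you do not have `cars.csv` at hand, the filter can be tried on a hand-built stand-in. The DataFrame below is an assumption made for illustration: it contains only the six rows that appear in this section's outputs, so it is not the full seven-row file. The sketch also replaces the visual check with an actual `assert`.

```python
import pandas as pd

# Stand-in for cars.csv (assumption): only rows shown in this section's outputs
cars = pd.DataFrame(
    {
        "country": ["United States", "Australia", "Japan", "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, True, True, True],
        "cars_per_cap": [809, 731, 588, 200, 70, 45],
    },
    index=["US", "AUS", "JPN", "RU", "MOR", "EG"],
)

# A boolean Series used as a row mask keeps the rows where it is True
sel = cars[cars["drives_right"]]

# Check the claim in code instead of reading it off the printout
assert sel["drives_right"].all()
print(sel.index.tolist())

# ~ inverts the mask and selects the remaining rows
drives_left = cars[~cars["drives_right"]]
print(drives_left.index.tolist())
```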

### Using boolean operators with DataFrame

In [14]:
# Select the cars_per_cap column from cars as a Pandas Series and store it as cpc
cpc = cars['cars_per_cap']

# Use cpc in combination with a comparison operator and 500 
# You want to end up with a boolean Series that's True if the corresponding country has a cars_per_cap of more than 500 and False otherwise
# Store this boolean Series as many_cars
many_cars = cpc > 500

# Use many_cars to subset cars, similar to what you did before 
# Store the result as car_maniac
car_maniac = cars[many_cars]

# Print out car_maniac to see if you got it right
car_maniac

| | country | drives_right | cars_per_cap |
|---|---|---|---|
| US | United States | True | 809 |
| AUS | Australia | False | 731 |
| JPN | Japan | False | 588 |


In [16]:
# Create a DataFrame medium, that includes all the observations of cars that have a cars_per_cap between 100 and 500
medium = cars[np.logical_and(cars['cars_per_cap'] > 100, cars['cars_per_cap'] < 500)]

# Print out medium
medium

| | country | drives_right | cars_per_cap |
|---|---|---|---|
| RU | Russia | True | 200 |
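
`np.logical_and()` is not the only way to combine conditions on a pandas Series. The sketch below shows two alternatives: the element-wise `&` operator, and `Series.between()` with `inclusive="neither"` (this string form assumes pandas 1.3 or later). The DataFrame is again a hand-built stand-in containing only the rows shown in this section's outputs, not the real `cars.csv`.

```python
import pandas as pd

# Stand-in for cars.csv (assumption): only rows shown in this section's outputs
cars = pd.DataFrame(
    {
        "country": ["United States", "Australia", "Japan", "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, True, True, True],
        "cars_per_cap": [809, 731, 588, 200, 70, 45],
    },
    index=["US", "AUS", "JPN", "RU", "MOR", "EG"],
)

cpc = cars["cars_per_cap"]

# Element-wise & needs parentheses around each comparison
medium_amp = cars[(cpc > 100) & (cpc < 500)]

# between() expresses the same range test; "neither" excludes both endpoints
medium_between = cars[cpc.between(100, 500, inclusive="neither")]

print(medium_amp.index.tolist())
print(medium_amp.equals(medium_between))
```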
