# Logic, Control Flow and Filtering

## Compare arrays
Out of the box, you can also use comparison operators with NumPy arrays.

This time there are two NumPy arrays: `my_house` and `your_house`. Both contain the areas for the kitchen, living room, bedroom and bathroom in the same order, so you can compare them element by element.

### Instructions

Using comparison operators, generate boolean arrays that answer the following questions:

- Which areas in `my_house` are greater than or equal to `18`?
- You can also compare two NumPy arrays element-wise. Which areas in `my_house` are smaller than the ones in `your_house`?
- Make sure to wrap both commands in a **print()** call so that you can inspect the output!

In [None]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# my_house greater than or equal to 18
print(my_house >= 18)

# my_house less than your_house
print(my_house < your_house)

## Boolean operators with NumPy
Before, the comparison operators like `<` and `>=` worked with NumPy arrays out of the box. Unfortunately, this is not true for the boolean operators `and`, `or`, and `not`.

To use these operators with NumPy, you will need **np.logical_and()**, **np.logical_or()** and **np.logical_not()**. Here's an example on the `my_house` and `your_house` arrays from before to give you an idea:

    np.logical_and(my_house > 13, 
                   your_house < 15)
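
To see why the plain keywords fall short, here is a minimal sketch that reuses the `my_house` and `your_house` values from the previous exercise. It shows that `and` raises a `ValueError` on arrays with more than one element, while the `np.logical_*` functions work element-wise. The names `keyword_failed`, `both`, `either` and `not_big` are only illustrative and are not part of the exercise.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# The `and` keyword needs one truth value, which a multi-element array cannot give
try:
    my_house > 13 and your_house < 15
    keyword_failed = False
except ValueError:
    keyword_failed = True

# The NumPy functions combine the boolean arrays element by element
both = np.logical_and(my_house > 13, your_house < 15)
either = np.logical_or(my_house > 13, your_house < 15)
not_big = np.logical_not(my_house > 13)

print(keyword_failed)
print(both)
print(either)
print(not_big)
```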


### Instructions

Generate boolean arrays that answer the following questions:

- Which areas in `my_house` are greater than `18.5` or smaller than `10`?
- Which areas are smaller than `11` in both `my_house` and `your_house`?
- Make sure to wrap both commands in a **print()** call so that you can inspect the output.

In [None]:
# my_house greater than 18.5 or smaller than 10
print(np.logical_or(my_house > 18.5, my_house < 10))

# Both my_house and your_house smaller than 11
print(np.logical_and(my_house < 11, your_house < 11))

# Filtering pandas DataFrames

## Driving right (1)
Remember the `cars` dataset, containing the cars per 1000 people (`cars_per_cap`) and whether people drive on the right (`drives_right`) for different countries (`country`)? The code that imports this data from a CSV file into Python as a DataFrame is included in the script.

In the video, you saw a step-by-step approach to filter observations from a DataFrame based on boolean arrays. Let's start simple and try to find all observations in `cars` where `drives_right` is `True`.

`drives_right` is a boolean column, so you'll have to extract it as a Series and then use this boolean Series to select observations from `cars`.
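
As a rough illustration of that mechanism, the sketch below builds a tiny stand-in DataFrame and filters it with a boolean Series. The `toy_cars` name, the country names and all values are invented for this example and are not taken from `cars.csv`.

```python
import pandas as pd

# Hypothetical stand-in for the cars data; every value here is made up
toy_cars = pd.DataFrame(
    {
        "cars_per_cap": [600, 150, 40],
        "country": ["Northland", "Eastland", "Southland"],
        "drives_right": [True, False, True],
    },
    index=["AA", "BB", "CC"],
)

dr = toy_cars["drives_right"]  # boolean Series, one value per row
sel = toy_cars[dr]             # keeps only the rows where dr is True

print(dr)
print(sel)
```

Rows whose entry in `dr` is `False` (here only `BB`) are dropped from `sel`.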

### Instructions

- Extract the `drives_right` column as a pandas Series and store it as `dr`.
- Use `dr`, a boolean Series, to subset the `cars` DataFrame. Store the resulting selection in `sel`.
- Print `sel`, and check that `drives_right` is `True` for all observations.

In [None]:
# Import cars data
import pandas as pd
cars = pd.read_csv('datasets/cars.csv', index_col=0)

# Extract drives_right column as Series: dr
dr = cars['drives_right']

# Use dr to subset cars: sel
sel = cars[dr]

# Print sel
print(sel)

## Driving right (2)
The code in the previous example worked fine, but it created a new variable, `dr`, that you don't really need. You can achieve the same result without this intermediate variable: put the code that computes `dr` straight into the square brackets that select observations from `cars`.

### Instructions

- Convert the code to a one-liner that calculates the variable `sel` as before.

In [None]:
# Convert code to a one-liner
sel = cars[cars['drives_right']]

# Print sel
print(sel)

## Cars per capita (1)
Let's stick with the `cars` data a little longer. This time you want to find out which countries have a high cars per capita figure. In other words, in which countries do many people have a car, or maybe multiple cars?

Similar to the previous example, you'll want to build up a boolean Series that you can then use to subset the `cars` DataFrame and select certain observations. If you want to do this in a one-liner, that's perfectly fine!

### Instructions

- Select the `cars_per_cap` column from `cars` as a pandas Series and store it as `cpc`.
- Use `cpc` in combination with a comparison operator and `500`. You want to end up with a boolean Series that's `True` if the corresponding country has a `cars_per_cap` of more than `500` and `False` otherwise. Store this boolean Series as `many_cars`.
- Use `many_cars` to subset `cars`, similar to what you did before. Store the result as `car_maniac`.
- Print out `car_maniac` to see if you got it right.
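
The steps above can be sketched one at a time as follows. This is only an illustration on a small invented DataFrame (`toy_cars`, with made-up values), so that each intermediate variable can be inspected without loading `cars.csv`.

```python
import pandas as pd

# Hypothetical stand-in for the cars data; every value here is made up
toy_cars = pd.DataFrame(
    {"cars_per_cap": [600, 150, 40, 520]},
    index=["AA", "BB", "CC", "DD"],
)

cpc = toy_cars["cars_per_cap"]    # step 1: the column as a Series
many_cars = cpc > 500             # step 2: boolean Series, True above 500
car_maniac = toy_cars[many_cars]  # step 3: rows where many_cars is True

print(many_cars)
print(car_maniac)
```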

In [None]:
# Create car_maniac: observations that have a cars_per_cap over 500
car_maniac = cars[cars['cars_per_cap'] > 500]

# Print car_maniac
print(car_maniac)

## Cars per capita (2)
Remember **np.logical_and()**, **np.logical_or()** and **np.logical_not()**, the NumPy variants of the `and`, `or` and `not` operators? You can also use them on pandas Series to do more advanced filtering operations.

Take this example that selects the observations that have a `cars_per_cap` between 10 and 80. Try out these lines of code step by step to see what's happening.
    
    cpc = cars['cars_per_cap']
    between = np.logical_and(cpc > 10, cpc < 80)
    medium = cars[between]
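
To make the intermediate results visible without the real data, here is a minimal sketch of the same three lines on an invented DataFrame (`toy_cars`, with made-up values). It also checks that the result of `np.logical_and()` is itself a pandas Series, and that it matches the `&` operator pandas offers for boolean Series, an alternative that this course section does not use.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cars data; every value here is made up
toy_cars = pd.DataFrame(
    {"cars_per_cap": [5, 45, 70, 200]},
    index=["AA", "BB", "CC", "DD"],
)

cpc = toy_cars["cars_per_cap"]
between = np.logical_and(cpc > 10, cpc < 80)  # True only where both conditions hold
medium = toy_cars[between]

# Alternative spelling with the element-wise & operator (parentheses are required)
between_alt = (cpc > 10) & (cpc < 80)

print(between)
print(medium)
print(between.equals(between_alt))
```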


### Instructions

- Use the code sample provided to create a DataFrame `medium` that includes all the observations in `cars` that have a `cars_per_cap` between `100` and `500`.
- Print out `medium`.

In [None]:
# Create medium: observations with cars_per_cap between 100 and 500
# One-liner alternative:
# medium = cars[np.logical_and(cars['cars_per_cap'] > 100, cars['cars_per_cap'] < 500)]
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 100, cpc < 500)
medium = cars[between]

# Print medium
print(medium)