# This Jupyter notebook accompanies the DataCamp Logic, Control Flow, and Filtering lesson
<hr>

## Operators
Comparison operators tell you how two Python values relate, and each comparison results in a boolean.

<img src="Images/Operators.jpg"></img>

<hr>

### Number Comparisons

In [1]:
2 < 3  

True

In [2]:
2 == 3

False

In [3]:
2 <= 3

True

In [4]:
3 <= 3

True

In [16]:
3 < 4.1 

True

In [6]:
x = 2
y = 3
x < y

True

<hr>

### String Comparisons: 

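For string comparison, Python determines the relationship based on alphabetical order. Strictly speaking, strings are compared character by character by Unicode code point, so digits sort before uppercase letters and uppercase letters sort before lowercase ones.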

In [10]:
"Carl" < "Chris"

True

In [14]:
str(3) < "Chris"

True
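The ordering rule for strings is easy to check directly. The following is a minimal sketch; the example strings are made up for this note and are not part of the lesson.

```python
# Strings are compared character by character using Unicode code points.
print(ord("3"), ord("C"), ord("a"))  # a digit, an uppercase letter, a lowercase letter

# Uppercase letters have smaller code points than lowercase ones, so
# "Zebra" sorts before "apple" even though z comes after a in the alphabet.
upper_before_lower = "Zebra" < "apple"
print(upper_before_lower)

# Normalising the case on both sides gives a case-insensitive comparison.
case_insensitive = "Zebra".lower() < "apple".lower()
print(case_insensitive)
```

This is also why `str(3) < "Chris"` above is True: the character "3" has a smaller code point than "C".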

<hr>

### Note: Make sure you use comparison operators on values of the same data type
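As an illustration of why this matters, here is a small sketch of what happens in Python 3 when an integer is compared with a string. The variable names are only for demonstration.

```python
# In Python 3, ordering comparisons between unrelated types raise a TypeError.
try:
    result = 3 < "Chris"
except TypeError as err:
    result = None
    print("TypeError:", err)

# Equality never raises; values of different types are simply unequal.
mixed_equal = (3 == "3")
print(mixed_equal)

# Converting to a common type first makes the comparison well defined.
converted = 3 < int("10")
print(converted)
```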

<hr>

### Numpy Comparisons 

In [17]:
import numpy as np

In [18]:
array = np.array([1,2,3,4,5])
array > 3 

array([False, False, False,  True,  True])

NumPy figures out that you want to compare every element of `array` with 3 and returns the corresponding booleans.

Conceptually, NumPy treats the 3 as an array of the same size as the one being compared, filled with the value 3, and then performs an element-wise comparison. This is concise and efficient.
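A short sketch of that idea: comparing against the scalar gives the same result as comparing against an explicitly built array of threes. The name `threes` is introduced here only for illustration.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5])

# Comparison against a scalar: the 3 is applied to every element.
broadcast_result = array > 3

# Equivalent in effect: compare against an explicit array filled with 3.
threes = np.full_like(array, 3)
explicit_result = array > threes

print(broadcast_result)
print(np.array_equal(broadcast_result, explicit_result))
```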

<hr>

## Equality
To check if two Python values, or variables, are equal you can use ==. To check for inequality, you need !=. As a refresher, have a look at the following examples that all result in True. Feel free to try them out in the IPython Shell.
```
2 == (1 + 1)
"intermediate" != "python"
True != False
"Python" != "python"
```
When you write these comparisons in a script, you will need to wrap a print() function around them to see the output.

### Instructions
- In the editor on the right, write code to see if True equals False.
- Write Python code to check if -5 * 15 is not equal to 75.
- Ask Python whether the strings "pyscript" and "PyScript" are equal.
- What happens if you compare booleans and integers? 
- Write code to see if True and 1 are equal.

In [19]:
# Comparison of booleans
print(True == False)

# Comparison of integers
print(-5 * 15 != 75)

# Comparison of strings
print("pyscript" == "PyScript")

# Compare a boolean with an integer
print(True == 1)

False
True
False
True


<hr> 

## Greater and less than
In the video, Filip also talked about the less than and greater than signs, < and > in Python. You can combine them with an equals sign: <= and >=. Pay attention: <= is valid syntax, but =< is not.

All Python expressions in the following code chunk evaluate to True:
```
3 < 4
3 <= 4
"alpha" <= "beta"
```
Remember that for string comparison, Python determines the relationship based on alphabetical order.

### Instructions
- Write Python expressions, wrapped in a print() function, to check whether:
    - x is greater than or equal to -10. x has already been defined for you.
    - "test" is less than or equal to y. y has already been defined for you.
    - True is greater than False.

In [20]:
# Comparison of integers
x = -3 * 6
print(x >= -10)

# Comparison of strings
y = "test"
print("test" <= y)

# Comparison of booleans
print(True > False)

False
True
True


<hr>
 
## Compare arrays
Out of the box, you can also use comparison operators with Numpy arrays.

Remember areas, the list of area measurements for different rooms in your house from the previous course? This time there are two Numpy arrays: my_house and your_house. They both contain the areas for the kitchen, living room, bedroom and bathroom in the same order, so you can compare them.

### Instructions
- Using comparison operators, generate boolean arrays that answer the following questions:
    - Which areas in my_house are greater than or equal to 18?
    - You can also compare two Numpy arrays element-wise. Which areas in my_house are smaller than the ones in your_house?

Make sure to wrap both commands in a print() statement, so that you can inspect the output.

In [21]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# my_house greater than or equal to 18
print(my_house >= 18)

# my_house less than your_house
print(my_house < your_house)

[ True  True False False]
[False  True  True False]


### Note: Comparisons are done element-wise for NumPy arrays

This is not the case with Python lists, where `==` compares the two lists as a whole and returns a single boolean.

**e.g.**

In [23]:
list1 = [1,2,3,4]
list2 = [2,4,6,8]
print(list1 == list2)

False
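To make the difference concrete, here is a small sketch of how plain lists behave and how an element-wise comparison can be written by hand. The variable names are illustrative.

```python
list1 = [1, 2, 3, 4]
list2 = [2, 4, 6, 8]

# == on lists answers a single question: are the two lists identical?
whole_list_equal = list1 == list2

# < on lists compares lexicographically (the first differing element decides).
whole_list_less = list1 < list2

# An element-wise comparison of plain lists needs an explicit loop.
elementwise_less = [a < b for a, b in zip(list1, list2)]

print(whole_list_equal, whole_list_less, elementwise_less)
```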


<hr>

## Boolean Operators 

- ```and``` - returns True if both booleans are True
- ```or``` - returns True if at least one of the booleans is True
- ```not``` - negates the boolean value you use it on 

To check if a value is greater than one value but less than another value, use the ```and``` operator.

In [26]:
x = 12 
x > 5 and x < 15

True

To check if at least one condition is met, use the ```or``` operator.

In [27]:
y = 5 
y < 7 or y > 13

True

The ```not``` operator is useful if you're combining different boolean operations and then want to negate the result.

Examples:

In [29]:
print(not True)
print(not False)

False
True


<hr>

## Numpy Array Boolean Operators

NumPy arrays react differently to boolean operators.

In [30]:
array = np.array([1,2,3,4,5,6,7])

We can find out which values are higher than 4 but lower than 6

In [31]:
array > 4

array([False, False, False, False,  True,  True,  True])

In [32]:
array < 6

array([ True,  True,  True,  True,  True, False, False])

When we combine these comparisons with the ```and``` operator for numpy arrays we get the following error.

<img src="Images/numpy_and.jpg"></img>

The error says that the truth value of an array with more than one element is ambiguous: ```and``` expects a single boolean on each side and cannot work on an array of booleans.
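Since the error is only shown as an image above, here is a small sketch that reproduces it and catches the exception so the cell keeps running.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7])

# `and` needs one True/False per operand, but each comparison yields 7 booleans.
try:
    combined = array > 4 and array < 6
except ValueError as err:
    combined = None
    print("ValueError:", err)
```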

Instead we need to use the following NumPy functions, which are performed element-wise:

- ```logical_and()```
- ```logical_or()```
- ```logical_not()```



In [42]:
np.logical_and(array > 3, array < 6)

array([False, False, False,  True,  True, False, False])

In [46]:
print(array[np.logical_and(array > 3, array < 6)])

[4 5]
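As a side note that goes slightly beyond the lesson: for boolean arrays, NumPy also supports the operators `&`, `|` and `~` as element-wise counterparts of the three functions. The sketch below assumes the same `array` as above.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7])

# & is the element-wise counterpart of np.logical_and for boolean arrays.
# The parentheses are required because & binds tighter than > and <.
mask = (array > 3) & (array < 6)

same_as_function = np.array_equal(mask, np.logical_and(array > 3, array < 6))
print(array[mask])
print(same_as_function)

# | and ~ play the roles of np.logical_or and np.logical_not.
outside = (array <= 3) | (array >= 6)
print(np.array_equal(outside, ~mask))
```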


<hr>

## and, or, not (1)
A boolean is either 1 or 0, True or False. With boolean operators such as and, or and not, you can combine these booleans to perform more advanced queries on your data.

In the sample code on the right, two variables are defined: my_kitchen and your_kitchen, representing areas.

### Instructions
- Write Python expressions, wrapped in a print() function, to check whether:
    - my_kitchen is bigger than 10 and smaller than 18.
    - my_kitchen is smaller than 14 or bigger than 17.
    - double the area of my_kitchen is smaller than triple the area of your_kitchen.

In [47]:
# Define variables
my_kitchen = 18.0
your_kitchen = 14.0

# my_kitchen bigger than 10 and smaller than 18?
print(my_kitchen > 10 and my_kitchen < 18)

# my_kitchen smaller than 14 or bigger than 17?
print(my_kitchen < 14 or my_kitchen > 17)


# Double my_kitchen smaller than triple your_kitchen?
print(my_kitchen * 2 < your_kitchen * 3)

False
True
True


<hr> 

## and, or, not (2)
To see if you completely understood the boolean operators, have a look at the following piece of Python code:
```
x = 8
y = 9
not(not(x < 3) and not(y > 14 or y > 10))
```
What will the result be if you execute these three commands in the IPython Shell?

NB: Notice that ```not``` has a higher priority than ```and``` and ```or```, so it is executed first.

### Instructions
Possible Answers:

a. True

b. False

c. Running these commands will result in an error.

Correct Answer: b. False
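To see why, the expression can be evaluated piece by piece. The intermediate variable names below are introduced only for this walkthrough.

```python
x = 8
y = 9

inner_left = not (x < 3)              # x < 3 is False, so this is True
inner_right = not (y > 14 or y > 10)  # both comparisons are False, so this is True
both = inner_left and inner_right     # True and True
result = not both                     # negating True

print(inner_left, inner_right, both, result)
```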

<hr>

## Boolean operators with Numpy
Before, the comparison operators like < and >= worked with Numpy arrays out of the box. Unfortunately, this is not true for the boolean operators ```and```, ```or```, and ```not```.

To use these operators with Numpy, you will need ```np.logical_and()```, ```np.logical_or()``` and ```np.logical_not()```. Here's an example on the my_house and your_house arrays from before to give you an idea:
```
np.logical_and(your_house > 13, 
               your_house < 15)
```

### Instructions
- Generate boolean arrays that answer the following questions:
    - Which areas in my_house are greater than 18.5 or smaller than 10?
    - Which areas are smaller than 11 in both my_house and your_house?

Make sure to wrap both commands in a print() statement, so that you can inspect the output.




In [48]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# my_house greater than 18.5 or smaller than 10
print(np.logical_or(my_house > 18.5, my_house < 10))

# Both my_house and your_house smaller than 11
print(np.logical_and(my_house < 11, your_house < 11))

[False  True False  True]
[False False False  True]


### Combining more than two NumPy array comparisons

In [50]:
print(np.logical_and(np.logical_and(my_house < 11, your_house < 11), your_house < 9))

[False False False False]
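Nesting calls becomes hard to read as the number of conditions grows. One possible alternative, sketched here, relies on `np.logical_and` being a NumPy ufunc, so its `reduce` method can fold it over a list of boolean arrays. The name `conditions` is just an illustrative choice.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

conditions = [my_house < 11, your_house < 11, your_house < 9]

# Fold np.logical_and over all the masks in the list.
all_three = np.logical_and.reduce(conditions)

# Same result as the nested version used above.
nested = np.logical_and(np.logical_and(my_house < 11, your_house < 11), your_house < 9)
print(all_three)
print(np.array_equal(all_three, nested))
```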


<hr>

## ```if```, ```elif```, ```else```


### ```if``` statement:  
```
if condition : 
    expression
```
Read as: if the condition is met, execute the expression.

**Note:** the condition ends with a colon, and the code to run when the condition holds must be indented (four spaces by convention).

To exit the if statement, simply write Python code without indentation, and Python will know that it's not part of the if statement.

You can have more than one line of code in the if statement.

If the condition does not hold, the expression is not executed.

### ```else``` statement:
```
if condition : 
    expression
else : 
    expression
```
Read as: if the condition is not met, execute the expression in the else statement.

For an else statement you do not have to supply a specific condition. The corresponding expression gets run if the condition of the if statement it belongs to does not hold. 


### ```elif``` statement:
```
if condition : 
    expression
elif condition :
    expression
else : 
    expression
```

If the first condition is false, the second condition (the ```elif``` condition) is checked. If the ```elif``` condition holds, its corresponding expression is executed.

If the ```if``` condition and the ```elif``` condition both evaluate to false, the expression(s) of the ```else``` statement are run.

As soon as Python bumps into a condition that is true, it executes the corresponding code and then leaves the control structure, so no further conditions are tested and no further code is run.

This means that if the first condition is true, the second condition, corresponding to the ```elif```, is never reached.
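A small sketch of this behaviour, using a hypothetical helper `classify` that is not part of the lesson: the number 6 satisfies both conditions, yet only the first branch runs.

```python
def classify(z):
    """Return a message for z; only the first true condition is used."""
    if z % 2 == 0:
        return "divisible by 2"
    elif z % 3 == 0:
        return "divisible by 3"
    else:
        return "divisible by neither 2 nor 3"

# 6 is divisible by both 2 and 3, but the elif branch is never reached.
print(classify(6))
print(classify(9))
print(classify(7))
```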

<hr>

## Warmup 
To experiment with if and else a bit, have a look at this code sample:
```
area = 10.0
if(area < 9) :
    print("small")
elif(area < 12) :
    print("medium")
else :
    print("large")
```
What will the output be if you run this piece of code in the IPython Shell?

### Instructions

Possible Answers:

a. small

b. medium

c. large

d. The syntax is incorrect; this code will produce an error.

Correct Answer: b. medium 

<hr>

## ```if```
It's time to take a closer look around in your house.

Two variables are defined in the sample code: room, a string that tells you which room of the house we're looking at, and area, the area of that room.

### Instructions

- Examine the if statement that prints out "Looking around in the kitchen." if room equals "kit".
- Write another if statement that prints out "big place!" if area is greater than 15.

In [51]:
# Define variables
room = "kit"
area = 14.0

# if statement for room
if room == "kit" :
    print("looking around in the kitchen.")

# if statement for area
if area > 15 : 
    print("big place!")

looking around in the kitchen.


<hr>

## ```else```
On the right, the if construct for room has been extended with an else statement so that "looking around elsewhere." is printed if the condition room == "kit" evaluates to False.

Can you do a similar thing to add more functionality to the if construct for area?

### Instructions
- Add an else statement to the second control structure so that "pretty small." is printed out if area > 15 evaluates to False.

In [52]:
# Define variables
room = "kit"
area = 14.0

# if-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
else :
    print("looking around elsewhere.")

# if-else construct for area
if area > 15 :
    print("big place!")
else: 
    print("pretty small.")

looking around in the kitchen.
pretty small.


## ```elif```
It's also possible to have a look around in the bedroom. The sample code contains an elif part that checks if room equals "bed". In that case, "looking around in the bedroom." is printed out.

It's up to you now! Make a similar addition to the second control structure to further customize the messages for different values of area.

### Instructions
- Add an elif to the second control structure such that "medium size, nice!" is printed out if area is greater than 10.

In [53]:
# Define variables
room = "bed"
area = 14.0

# if-elif-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
elif room == "bed":
    print("looking around in the bedroom.")
else :
    print("looking around elsewhere.")

# if-elif-else construct for area
if area > 15 :
    print("big place!")
elif area > 10 :
    print("medium size, nice!")
else :
    print("pretty small.")

looking around in the bedroom.
medium size, nice!


<hr>

## Filtering Pandas DataFrame

Step 1. Select a column (for example an area column), making sure it is a Series.

Step 2. Perform a comparison on this column.

Step 3. Store the result as a variable so that it can be used to index the DataFrame.
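The three steps can be sketched as follows. To keep the example self-contained, it builds a small stand-in DataFrame in memory instead of reading `cars.csv`; the values are copied from the cars table shown in this notebook, and the threshold of 100 is an arbitrary choice for illustration.

```python
import pandas as pd

# Stand-in for cars.csv, built in memory for this example.
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JAP", "IN", "RU", "MOR", "EG"],
)

# Step 1: select the column as a Series (single square brackets).
cpc = cars["cars_per_cap"]

# Step 2: compare it, which gives a boolean Series.
few_cars = cpc < 100

# Step 3: use the stored boolean Series to index the DataFrame.
result = cars[few_cars]
print(result)
```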


In [55]:
import pandas as pd
cars = pd.read_csv("cars.csv", index_col=0)
cars

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JAP           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True


<hr>

## Driving right
Remember that cars dataset, containing the cars per 1000 people (cars_per_cap) and whether people drive right (drives_right) for different countries (country)? 

In the video, you saw a step-by-step approach to filter observations from a DataFrame based on boolean arrays. Let's start simple and try to find all observations in cars where drives_right is True.

drives_right is a boolean column, so you'll have to extract it as a Series and then use this boolean Series to select observations from cars.

### Instructions
- Extract the drives_right column as a Pandas Series and store it as dr.
- Use dr, a boolean Series, to subset the cars DataFrame. Store the resulting selection in sel.
- Print sel, and assert that drives_right is True for all observations.

In [56]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

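# Extract drives_right column as Series: dr
# (three equivalent ways; each assignment overwrites the previous one)
dr = cars["drives_right"]         # by column label
dr = cars.loc[:, "drives_right"]  # label-based selection with .loc
dr = cars.iloc[:, 2]              # position-based selection with .iloc (third column)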

# Use dr to subset cars: sel
sel = cars[dr == True]

# Print sel
print(sel)

     cars_per_cap        country  drives_right
US            809  United States          True
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True


<hr>

## Driving right (2)
The code in the previous example worked fine, but you actually unnecessarily created a new variable dr. You can achieve the same result without this intermediate variable. Put the code that computes dr straight into the square brackets that select observations from cars.

### Instructions
Convert the code on the right to a one-liner that calculates the variable sel as before.

In [58]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Convert code to a one-liner
dr = cars['drives_right']
sel = cars[dr == True]

# Conversion below
sel = cars[cars['drives_right'] == True] 

# Print sel
print(sel)

     cars_per_cap        country  drives_right
US            809  United States          True
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True
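Because `drives_right` is already a boolean column, the `== True` comparison is not strictly needed. The sketch below uses a small in-memory stand-in for the cars data (an assumption made so the example runs without `cars.csv`) and also shows `~` for selecting the opposite rows.

```python
import pandas as pd

# Reduced stand-in for the cars data, for illustration only.
cars = pd.DataFrame(
    {"country": ["United States", "Australia", "Russia"],
     "drives_right": [True, False, True]},
    index=["US", "AUS", "RU"],
)

# A boolean column can be used as the mask directly.
sel = cars[cars["drives_right"]]

# ~ inverts the boolean Series to select the remaining rows.
drives_left = cars[~cars["drives_right"]]

print(sel.equals(cars[cars["drives_right"] == True]))
print(list(drives_left.index))
```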


<hr>

## Cars per capita (1)
Let's stick to the cars data some more. This time you want to find out which countries have a high cars per capita figure. In other words, in which countries do many people have a car, or maybe multiple cars.

Similar to the previous example, you'll want to build up a boolean Series, that you can then use to subset the cars DataFrame to select certain observations. If you want to do this in a one-liner, that's perfectly fine!

### Instructions
- Select the cars_per_cap column from cars as a Pandas Series and store it as cpc.
- Use cpc in combination with a comparison operator and 500. You want to end up with a boolean Series that's True if the corresponding country has a cars_per_cap of more than 500 and False otherwise. Store this boolean Series as many_cars.
- Use many_cars to subset cars, similar to what you did before. Store the result as car_maniac.
- Print out car_maniac to see if you got it right.

In [60]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Create car_maniac: observations that have a cars_per_cap over 500
cpc = cars["cars_per_cap"]
many_cars = cpc > 500
car_maniac = cars[many_cars]

# one liner
car_maniac = cars[cars["cars_per_cap"] > 500]

# Print car_maniac
print(car_maniac)

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JAP           588          Japan         False


## Cars per capita (2)
Remember about np.logical_and(), np.logical_or() and np.logical_not(), the Numpy variants of the and, or and not operators? You can also use them on Pandas Series to do more advanced filtering operations.

Take this example that selects the observations that have a cars_per_cap between 10 and 80. Try out these lines of code step by step to see what's happening.
```
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 10, cpc < 80)
medium = cars[between]
```
### Instructions
- Use the code sample above to create a DataFrame medium, that includes all the observations of cars that have a cars_per_cap between 100 and 500.
- Print out medium.

In [61]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Import numpy, you'll need this
import numpy as np

# Create medium: observations with cars_per_cap between 100 and 500
medium = cars[np.logical_and(cars["cars_per_cap"] > 100, cars["cars_per_cap"] < 500)]


# Print medium
print(medium)

    cars_per_cap country  drives_right
RU           200  Russia          True
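For comparison, the same filter can be written with the `&` operator, which pandas applies element-wise to boolean Series. This is a sketch on an in-memory stand-in for the cars data (only the `cars_per_cap` column is reproduced), not the lesson's official solution.

```python
import numpy as np
import pandas as pd

# Stand-in for the cars_per_cap column of cars.csv.
cars = pd.DataFrame(
    {"cars_per_cap": [809, 731, 588, 18, 200, 70, 45]},
    index=["US", "AUS", "JAP", "IN", "RU", "MOR", "EG"],
)
cpc = cars["cars_per_cap"]

# Version from the exercise, using np.logical_and.
medium_np = cars[np.logical_and(cpc > 100, cpc < 500)]

# Operator version; the parentheses around each comparison are required.
medium_amp = cars[(cpc > 100) & (cpc < 500)]

print(medium_amp)
print(medium_np.equals(medium_amp))
```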
