# **Logic, Control Flow and Filtering**
Boolean logic is the foundation of decision-making in Python programs. Learn about different comparison operators, how to combine them with Boolean operators, and how to use the Boolean outcomes in control structures. You'll also learn to filter data in pandas DataFrames using logic.

## **1. Comparison Operators**
Comparison operators tell how two Python values relate and evaluate to a boolean.

### **1.1 Numpy recap**


In [15]:
# Code from Intro to Python for Data Science, Chapter 4
import numpy as np
np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2
bmi

array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])

In [16]:
bmi > 23

array([False, False, False,  True, False])

In [17]:
bmi[bmi > 23]

array([24.7473475])

### **1.2 Numeric comparisons**

In [18]:
2 < 3

True

In [None]:
2 == 3

In [19]:
2 <= 3

True

In [20]:
3 <= 3

True

In [21]:
x = 2
y = 3
x < y

True

### **1.3 Other comparisons**

In [None]:
"carl" < "chris" # In alphabetical order, "carl" comes before "chris"

In [None]:
3 < "chris" # TypeError: '<' not supported between instances of 'int' and 'str'

In [None]:
3 < 4.1 # Numeric types are the exception: an int and a float can be compared

In [22]:
bmi

array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])

In [23]:
bmi > 23

array([False, False, False,  True, False])

### **1.4 Comparators**

* strictly less than ( < )
* less than or equal ( <= )
* strictly greater than ( > )
* greater than or equal ( >= )
* equal ( == )
* not equal ( != )

### **1.5 Let's practice!**

#### **1.5.1 Equality**
To check if two Python values, or variables, are equal you can use ==. To check for inequality, you need !=. As a refresher, have a look at the following examples that all result in True. Feel free to try them out.

In [24]:
print(2 == (1 + 1))
print("intermediate" != "python")
print(True != False)
print("Python" != "python")

True
True
True
True


In [27]:
# write code to see if True equals False.
print(True == False)
#  check if -5 * 15 is not equal to 75
print((-5 * 15) != 75)
# Ask Python whether the strings "pyscript" and "PyScript" are equal.
print("pyscript" == "PyScript")
# What happens if you compare booleans and integers? Write code to see if True and 1 are equal.
print(True == 1)

False
True
False
True
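
The last result, `True == 1`, follows from the fact that `bool` is a subclass of `int` in Python. The sketch below illustrates this; the variable names are made up for the example and are not part of the exercise.

```python
# bool is a subclass of int, so True behaves like 1 and False like 0.
is_subclass = issubclass(bool, int)

# Equality compares values, so these hold even though the types differ.
true_equals_one = (True == 1)
false_equals_zero = (False == 0)

# The types themselves are still different.
same_type = type(True) is type(1)

# Because True counts as 1, summing booleans counts how many are True.
flags = [3 < 4, 2 == 3, "a" != "b"]
n_true = sum(flags)

print(is_subclass, true_equals_one, false_equals_zero, same_type, n_true)
# True True True False 2
```

Summing booleans this way is a common trick for counting how many comparisons succeeded.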


#### **1.5.2 Greater and less than**
In the video, Hugo also talked about the less than and greater than signs, **<** and **>**, in Python. You can combine them with an equals sign: **<=** and **>=**. Pay attention: <= is valid syntax, but =< is not.
Remember that for string comparison, Python determines the relationship based on alphabetical order.
All Python expressions in the following code chunk evaluate to True:

In [28]:
print(3 < 4)
print(3 <= 4)
print("alpha" <= "beta")

True
True
True
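
One caveat about "alphabetical order": Python compares strings character by character using Unicode code points, so every uppercase ASCII letter sorts before every lowercase one. The following minimal sketch shows the effect; the example words and variable names are chosen purely for illustration.

```python
# ord() gives the Unicode code point that string comparison is based on.
code_upper_z = ord("Z")  # 90
code_lower_a = ord("a")  # 97

# "Z" has a smaller code point than "a", so "Zebra" sorts before "apple".
mixed_case = "Zebra" < "apple"

# Lower-casing both sides first gives a case-insensitive ordering.
same_case = "Zebra".lower() < "apple".lower()

# Ordering a number against a string is not defined and raises TypeError.
error_name = None
try:
    3 < "chris"
except TypeError as err:
    error_name = type(err).__name__

print(mixed_case, same_case, error_name)
# True False TypeError
```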


In [30]:
# x is greater than or equal to -10. x has already been defined for you.
x = -3 * 6
print(x >= -10)

False


In [31]:
# "test" is less than or equal to y. y has already been defined for you.
y = "test"
"test" <= y

True

In [32]:
# True is greater than False.
True > False

True

#### **1.5.3 Compare arrays**
Out of the box, you can also use comparison operators with Numpy arrays.

Remember areas, the list of area measurements for different rooms in your house from Introduction to Python? This time there are two Numpy arrays: my_house and your_house. They both contain the areas for the kitchen, living room, bedroom and bathroom in the same order, so you can compare them.

In [33]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

In [37]:
# Which areas in my_house are greater than or equal to 18?
my_house >= 18

array([ True,  True, False, False])

In [38]:
# You can also compare two Numpy arrays element-wise. Which areas in my_house are smaller than the ones in your_house?
my_house < your_house

array([False,  True,  True, False])
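
The boolean array that a comparison returns can be reused as a mask, as in the BMI recap at the start of this chapter. A short sketch with the same two arrays (the names `smaller`, `smaller_areas` and so on are only illustrative):

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# Element-wise comparison gives one boolean per room.
smaller = my_house < your_house

# Use the boolean array as a mask to keep only the matching areas.
smaller_areas = my_house[smaller]

# True counts as 1, so sum() counts the rooms that satisfy the condition.
n_smaller = int(smaller.sum())

# any() and all() reduce the array to a single boolean.
any_smaller = bool(smaller.any())
all_smaller = bool(smaller.all())

print(smaller_areas, n_smaller, any_smaller, all_smaller)
```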

## **2. Boolean Operators**

*  and
*  or
*  not





### **2.1. and**

In [41]:
True and True # the only combination for which "and" returns True

True

In [44]:
x = 12
x > 5 and x < 15
# True       True

True

In [45]:
False and True

False

In [46]:
True and False

False

In [47]:
False and False

False

### **2.2. or**

In [49]:
True or True

True

In [50]:
False or True

True

In [51]:
True or False

True

In [52]:
False or False

False

In [53]:
y = 5
y < 7 or y > 13

True

### **2.3. not**

In [54]:
not True

False

In [56]:
not False

True

### **2.4. NumPy**

In [58]:
bmi

array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])

In [59]:
bmi > 21

array([ True, False,  True,  True,  True])

In [60]:
bmi < 22

array([ True,  True,  True, False,  True])

In [61]:
bmi > 21 and bmi < 22 # error

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
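
The error arises because `and` needs a single truth value for its left operand, and an array with several elements does not have one. The sketch below reproduces the failure in a controlled way; the BMI values are rounded copies of the ones above, and the variable names are illustrative.

```python
import numpy as np

# Rounded copies of the BMI values used in this chapter.
bmi = np.array([21.85, 20.98, 21.75, 24.75, 21.44])

error_message = None
try:
    # "and" asks for the truth value of the whole array (bmi > 21), which is ambiguous.
    result = bmi > 21 and bmi < 22
except ValueError as err:
    error_message = str(err)

# Reducing the array to one boolean first removes the ambiguity.
any_above = bool((bmi > 21).any())
all_above = bool((bmi > 21).all())

print(error_message)
print(any_above, all_above)
```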

**The NumPy equivalents of the Boolean operators**


*  logical_and()
*  logical_or()
*  logical_not()

In [62]:
np.logical_and(bmi > 21, bmi <22)

array([ True, False,  True, False,  True])

In [63]:
bmi[np.logical_and(bmi > 21, bmi < 22)] # selecting the bmis

array([21.85171573, 21.75028214, 21.44127836])
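
NumPy also overloads the operators `&`, `|` and `~` for element-wise logic on boolean arrays. The sketch below compares them with `np.logical_and()`; it uses rounded BMI values, and the parentheses are required because `&` and `|` bind more tightly than the comparison operators.

```python
import numpy as np

# Rounded copies of the BMI values used in this chapter.
bmi = np.array([21.85, 20.98, 21.75, 24.75, 21.44])

# Function form, as used in this chapter.
with_function = np.logical_and(bmi > 21, bmi < 22)

# Operator form: each comparison must be wrapped in parentheses.
with_operator = (bmi > 21) & (bmi < 22)

# Element-wise "or" and "not".
either = (bmi < 21) | (bmi > 24)
outside = ~with_operator

print(with_operator)  # [ True False  True False  True]
print(either)         # [False  True False  True False]
print(outside)        # [False  True False  True False]
```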

### **2.5. Let's practice!**

#### **2.5.1. and, or, not (1)**
A boolean is either 1 or 0, True or False. With boolean operators such as and, or and not, you can combine these booleans to perform more advanced queries on your data.

In the sample code below, two variables are defined: my_kitchen and your_kitchen, representing areas.

In [65]:
# Define variables
my_kitchen = 18.0
your_kitchen = 14.0

In [69]:
# my_kitchen is bigger than 10 and smaller than 18.
my_kitchen > 10  and my_kitchen < 18

False

In [70]:
# my_kitchen is smaller than 14 or bigger than 17.
my_kitchen < 14 or my_kitchen > 17

True

In [71]:
# double the area of my_kitchen is smaller than triple the area of your_kitchen
2 * my_kitchen < 3 * your_kitchen

True

#### **2.5.2. and, or, not (2)**
To see if you completely understood the boolean operators, have a look at the following piece of Python code:

*NB: Notice that not has a higher priority than and and or, it is executed first*

In [73]:
x = 8
y = 9
not(not(x < 3) and not(y > 14 or y > 10))
# What will the result be if you execute these three commands in the IPython Shell?

False
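
To make the priority rule concrete: comparisons are evaluated first, then `not`, then `and`, then `or`. The sketch below contrasts the default grouping with explicitly parenthesised versions; the expressions are illustrative and not taken from the exercise.

```python
# "and" binds more tightly than "or".
and_before_or = True or False and False        # read as: True or (False and False)
forced_or_first = (True or False) and False

# "not" binds more tightly than both "and" and "or".
not_before_or = not False or True              # read as: (not False) or True
forced_or_inside = not (False or True)

# Comparisons bind more tightly than "not".
x = 8
comparison_first = not x < 3                   # read as: not (x < 3)

print(and_before_or, forced_or_first)    # True False
print(not_before_or, forced_or_inside)   # True False
print(comparison_first)                  # True
```

When in doubt, adding parentheses costs nothing and makes the intended grouping explicit.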

#### **2.5.3. Boolean operators with Numpy**
Before, the comparison operators like < and >= worked with Numpy arrays out of the box. Unfortunately, this is not true for the boolean operators and, or, and not.

**To use these operators with Numpy, you will need np.logical_and(), np.logical_or() and np.logical_not().**

Here's an example on the my_house and your_house arrays from before to give you an idea:

In [74]:
np.logical_and(my_house > 13, 
               your_house < 15)

array([ True, False, False, False])

In [75]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

In [76]:
# Which areas in my_house are greater than 18.5 or smaller than 10?
np.logical_or(my_house > 18.5, my_house < 10)

array([False,  True, False,  True])

In [77]:
# Which areas are smaller than 11 in both my_house and your_house? Make sure to wrap both commands in print() statement, so that you can inspect the output.

print(np.logical_and(my_house < 11, your_house < 11))

[False False False  True]


## **3. if, elif, else**

### **3.1. Overview**

* Comparison operators
  * <, >, >=, <=, ==, !=

* Boolean Operators
  * and, or, not

* Conditional Statements
  * if, else, elif

### **3.2. if**

    if condition :
        expression

control.py

In [81]:
z = 4
if z % 2 == 0 :     #True
  print("z is even")

z is even


    if condition :
        expression

    expression  # not part of if

control.py

In [83]:
z = 4
if z % 2 == 0 :
  print("checking " + str(z))
  print("z is even")

checking 4
z is even


In [84]:
z = 5
if z % 2 == 0 : # False, so nothing is printed
  print("checking " + str(z))
  print("z is even")

### **3.3 else**
    if condition :
        expression
    else :
        expression

control.py

In [86]:
z = 5
if z % 2 == 0 : # False
  print("z is even")
else :
  print("z is odd") # runs because the condition is not True

z is odd


### **3.4 elif**
    if condition :
      expression
    elif condition :
      expression
    else :
      expression

control.py

In [88]:
z = 3
if z % 2 == 0 :
  print("z is divisible by 2") # False
elif z % 3 == 0 :
  print("z is divisible by 3") # True
else :
  print("z is neither divisible by 2 nor by 3")

z is divisible by 3


In [89]:
z = 6
if z % 2 == 0 :
  print("z is divisible by 2") # True
elif z % 3 == 0 :
  print("z is divisible by 3") # Never reached
else :
  print("z is neither divisible by 2 nor by 3")

z is divisible by 2
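
The z = 6 case shows that an if/elif/else chain runs only the first branch whose condition is True, even when later conditions would also hold. The sketch below wraps the chain in a function and contrasts it with two independent `if` statements; both function names are hypothetical helpers introduced for this illustration.

```python
def describe_divisibility(z):
    """Return the message of the first branch whose condition is True."""
    if z % 2 == 0:
        return "z is divisible by 2"
    elif z % 3 == 0:
        return "z is divisible by 3"
    else:
        return "z is neither divisible by 2 nor by 3"


def all_divisors(z):
    """Two separate if statements are checked independently of each other."""
    found = []
    if z % 2 == 0:
        found.append(2)
    if z % 3 == 0:
        found.append(3)
    return found


messages = {z: describe_divisibility(z) for z in (3, 5, 6)}
print(messages[6])      # z is divisible by 2
print(all_divisors(6))  # [2, 3]
```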


### **3.5. Let's Practice!**

#### **3.5.1. Warmup**
To experiment with if and else a bit, have a look at this code sample:

In [92]:
area = 10.0
if(area < 9) :
    print("small")
elif(area < 12) :
    print("medium")
else :
    print("large")

# What will the output be if you run this piece of code in the IPython Shell?

medium


#### **3.5.2. if**
It's time to take a closer look around in your house.

Two variables are defined in the sample code: room, a string that tells you which room of the house we're looking at, and area, the area of that room.

In [93]:
# Define variables
room = "kit"
area = 14.0

In [94]:
# Examine the if statement that prints out 
#"Looking around in the kitchen." if room equals "kit"
if room == "kit" :
  print("Looking around in the kitchen.")

Looking around in the kitchen.


In [95]:
# Write another if statement that prints out "big place!" if area is greater than 15.
if area > 15 :
  print("big place!") # Will not print anything, since the area is smaller (14.0)

#### **3.5.3. Add else**
Below, the **if** construct for room has been extended with an **else** statement so that "looking around elsewhere." is printed if the condition room == "kit" evaluates to False.

Can you do a similar thing to add more functionality to the if construct for area?

In [99]:
# Define variables
room = "kit"
area = 14.0

# if-else construct for room
if room == "kit" :
    print("looking around in the kitchen.")
else :
    print("looking around elsewhere.")


looking around in the kitchen.


In [100]:
# Add an else statement to the second control structure so that "pretty small." is printed out if area > 15 evaluates to False.
if area > 15 :
  print("big place!")
else :
  print("pretty small.")

pretty small.


#### **3.5.4. Customize further: elif**
It's also possible to have a look around in the bedroom. The sample code contains an **elif** part that checks if room equals "bed". In that case, "looking around in the bedroom." is printed out.

It's up to you now! Make a similar addition to the second control structure to further customize the messages for different values of area.

In [101]:
# Define variables
room = "bed"
area = 14.0

In [102]:
if room == "kit" :
  print("looking around in the kitchen.")
elif room == "bed" :
  print("looking around in the bedroom.")
else :
  print("looking around elsewhere.")


looking around in the bedroom.


In [104]:
# Add an elif to the second control structure such that "medium size, nice!" is printed out if area is greater than 10.
if area > 15 :
  print("big place!")
elif area > 10 :
  print("medium size, nice!")
else:
  print("pretty small.")

medium size, nice!
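
With threshold checks like these, the order of the conditions matters: the widest condition has to come last, otherwise it shadows the narrower one. A small sketch, using the thresholds from the exercise and two hypothetical helper functions:

```python
def describe_area(area):
    """Check the narrowest condition first, as in the exercise."""
    if area > 15:
        return "big place!"
    elif area > 10:
        return "medium size, nice!"
    else:
        return "pretty small."


def describe_area_misordered(area):
    """Same conditions in the wrong order: the second branch is unreachable."""
    if area > 10:
        return "medium size, nice!"
    elif area > 15:
        return "big place!"  # never reached, since area > 15 implies area > 10
    else:
        return "pretty small."


print(describe_area(18.0))             # big place!
print(describe_area_misordered(18.0))  # medium size, nice!
print(describe_area(14.0))             # medium size, nice!
```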


## **4. Filtering pandas DataFrames**

### **4.1 brics**

In [108]:
import pandas as pd
brics = pd.read_csv("https://assets.datacamp.com/production/repositories/287/datasets/b60fb5bdbeb4e4ab0545c485d351e6ff5428a155/brics.csv", index_col = 0)
brics

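|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| IN | India | New Delhi | 3.286 | 1252.0 |
| CH | China | Beijing | 9.597 | 1357.0 |
| SA | South Africa | Pretoria | 1.221 | 52.98 |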


### **4.2. Goal**
* Select countries with area over 8 million km²
* 3 steps
  * Select the area column
  * Do comparison on area column
  * Use result to select countries

### **4.3. Step 1: Get column**

In [110]:
brics["area"]

BR     8.516
RU    17.100
IN     3.286
CH     9.597
SA     1.221
Name: area, dtype: float64

* **Alternatives:**

  brics.loc[:, "area"]
  
  brics.iloc[:,2]
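
The three selections can be checked against each other. The sketch below rebuilds the brics table by hand from the values shown above instead of downloading the CSV, which is an assumption made only to keep the example self-contained.

```python
import pandas as pd

# Hand-built copy of the brics table shown above (not read from the CSV).
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.1, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252.0, 1357.0, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

by_name = brics["area"]        # column label in square brackets
by_loc = brics.loc[:, "area"]  # label-based: all rows, column "area"
by_iloc = brics.iloc[:, 2]     # position-based: all rows, third column

# All three return the same Series.
same = by_name.equals(by_loc) and by_name.equals(by_iloc)

# Double brackets keep a one-column DataFrame instead of a Series.
as_frame = brics[["area"]]

print(same, type(by_name).__name__, type(as_frame).__name__)
# True Series DataFrame
```

A Series is what you need for the comparison in the next step.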

### **4.4 Step 2: Compare**

In [111]:
brics["area"]

BR     8.516
RU    17.100
IN     3.286
CH     9.597
SA     1.221
Name: area, dtype: float64

In [112]:
brics["area"] > 8

BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool

In [113]:
is_huge = brics["area"] > 8

### **4.5. Step 3: Subset DF**

In [115]:
is_huge

BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool

In [116]:
brics[is_huge]

|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| CH | China | Beijing | 9.597 | 1357.0 |


### **4.6. Summary**

In [117]:
brics

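|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| IN | India | New Delhi | 3.286 | 1252.0 |
| CH | China | Beijing | 9.597 | 1357.0 |
| SA | South Africa | Pretoria | 1.221 | 52.98 |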


In [118]:
is_huge = brics["area"] > 8
brics[is_huge]

|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| CH | China | Beijing | 9.597 | 1357.0 |


In [119]:
brics[brics["area"] > 8]

|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| CH | China | Beijing | 9.597 | 1357.0 |
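
The same boolean Series can also be passed to `.loc`, which makes it possible to filter rows and pick a column in one step. This is a minimal sketch on a trimmed, hand-built copy of brics (only the country and area columns), not the chapter's original code.

```python
import pandas as pd

# Trimmed, hand-built copy of brics with the values shown above.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "area": [8.516, 17.1, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Steps 1 and 2: select the column and compare.
is_huge = brics["area"] > 8

# Step 3, with .loc: keep the rows where is_huge is True and only the country column.
huge_countries = brics.loc[is_huge, "country"]

# Summing the boolean Series counts the matching rows.
n_huge = int(is_huge.sum())

print(huge_countries.tolist(), n_huge)
# ['Brazil', 'Russia', 'China'] 3
```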


### **4.7. Boolean operators**

In [121]:
brics

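|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| RU | Russia | Moscow | 17.1 | 143.5 |
| IN | India | New Delhi | 3.286 | 1252.0 |
| CH | China | Beijing | 9.597 | 1357.0 |
| SA | South Africa | Pretoria | 1.221 | 52.98 |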


In [122]:
import numpy as np
np.logical_and(brics["area"] > 8, brics["area"] < 10)

BR     True
RU    False
IN    False
CH     True
SA    False
Name: area, dtype: bool

In [123]:
brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]

|    | country | capital | area | population |
|----|---------|---------|------|------------|
| BR | Brazil | Brasilia | 8.516 | 200.4 |
| CH | China | Beijing | 9.597 | 1357.0 |
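
pandas Series also support the element-wise operators `&`, `|` and `~`, which many people use in place of the `np.logical_*()` functions. The sketch below shows both forms giving the same mask; the frame is a hand-built copy of the relevant brics columns, and each comparison needs its own parentheses.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the relevant brics columns shown above.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "area": [8.516, 17.1, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252.0, 1357.0, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Function form and operator form of the same filter.
with_function = np.logical_and(brics["area"] > 8, brics["area"] < 10)
with_operator = (brics["area"] > 8) & (brics["area"] < 10)
same_mask = bool((with_function == with_operator).all())

# "or" across two different columns, and "not" to invert a mask.
big_or_populous = brics[(brics["area"] > 8) | (brics["population"] > 1000)]
not_between = brics[~with_operator]

print(same_mask)                         # True
print(big_or_populous.index.tolist())    # ['BR', 'RU', 'IN', 'CH']
print(not_between.index.tolist())        # ['RU', 'IN', 'SA']
```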


### **4.8. Let's Practice!**

#### **4.8.1. Driving right (1)**
Remember that cars dataset, containing the cars per 1000 people (cars_per_cap) and whether people drive right (drives_right) for different countries (country)? The code that imports this data in CSV format into Python as a DataFrame is available below.

In the video, you saw a step-by-step approach to filter observations from a DataFrame based on boolean arrays. Let's start simple and try to find all observations in cars where drives_right is True.

drives_right is a boolean column, so you'll have to extract it as a Series and then use this boolean Series to select observations from cars.

In [124]:
# Import cars data
import pandas as pd
cars = pd.read_csv("https://assets.datacamp.com/production/repositories/287/datasets/79b3c22c47a2f45a800c62cae39035ff2ea4e609/cars.csv", index_col = 0)

In [126]:
# Extract the drives_right column as a Pandas Series and store it as dr.
dr = cars["drives_right"]

In [127]:
# Use dr, a boolean Series, to subset the cars DataFrame. Store the resulting selection in sel.
sel = cars[dr]

In [128]:
# Print sel, and assert that drives_right is True for all observations.
print(sel)

     cars_per_cap        country  drives_right
US            809  United States          True
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True


#### **4.8.2. Driving right (2)**
The code in the previous example worked fine, but you actually unnecessarily created a new variable dr. You can achieve the same result without this intermediate variable. Put the code that computes dr straight into the square brackets that select observations from cars.

In [129]:
# Convert the code above to a one-liner that calculates the variable sel as before.
sel = cars[cars["drives_right"]]
print(sel)

     cars_per_cap        country  drives_right
US            809  United States          True
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True
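
Because drives_right is already a boolean column, it can be inverted with `~` to select the opposite group. The sketch below uses a hand-built copy of the cars data, assembled from the rows printed in this chapter instead of the CSV download; the row order is an assumption.

```python
import pandas as pd

# Hand-built copy of the cars data, using the rows shown in this chapter.
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JAP", "IN", "RU", "MOR", "EG"],
)

# The boolean column can be used as a mask directly.
drives_right = cars[cars["drives_right"]]

# ~ flips every True/False, selecting the countries that drive on the left.
drives_left = cars[~cars["drives_right"]]

print(drives_right.index.tolist())  # ['US', 'RU', 'MOR', 'EG']
print(drives_left.index.tolist())   # ['AUS', 'JAP', 'IN']
```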


#### **4.8.3. Cars per capita (1)**
Let's stick to the cars data some more. This time you want to find out which countries have a high cars per capita figure. In other words, in which countries do many people have a car, or maybe multiple cars?

Similar to the previous example, you'll want to build up a boolean Series, that you can then use to subset the cars DataFrame to select certain observations. If you want to do this in a one-liner, that's perfectly fine!

In [135]:
# Select the cars_per_cap column from cars as a Pandas Series and 
# store it as cpc.
cpc = cars["cars_per_cap"]

In [136]:
# Use cpc in combination with a comparison operator and 500. 
# You want to end up with a boolean Series that's True if the corresponding 
# country has a cars_per_cap of more than 500 and False otherwise. 
# Store this boolean Series as many_cars.
many_cars = cpc > 500

In [139]:
# Use many_cars to subset cars, similar to what you did before. Store the result as car_maniac.
car_maniac = cars[many_cars]
car_maniac

|     | cars_per_cap | country | drives_right |
|-----|--------------|---------|--------------|
| US | 809 | United States | True |
| AUS | 731 | Australia | False |
| JAP | 588 | Japan | False |


#### **4.8.4. Cars per capita (2)**
Remember np.logical_and(), np.logical_or() and np.logical_not(), the Numpy variants of the and, or and not operators? You can also use them on Pandas Series to do more advanced filtering operations.

Take this example that selects the observations that have a cars_per_cap between 10 and 80. Try out these lines of code step by step to see what's happening.

In [141]:
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 10, cpc < 80)
medium = cars[between]
medium

|     | cars_per_cap | country | drives_right |
|-----|--------------|---------|--------------|
| IN | 18 | India | False |
| MOR | 70 | Morocco | True |
| EG | 45 | Egypt | True |


In [143]:
# Use the code sample above to create a DataFrame medium, that includes all the 
# observations of cars that have a cars_per_cap between 100 and 500.
cpc = cars['cars_per_cap']
between = np.logical_and(cpc >= 100, cpc <= 500)
medium = cars[between]
medium

|    | cars_per_cap | country | drives_right |
|----|--------------|---------|--------------|
| RU | 200 | Russia | True |
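
For range checks like this one, pandas also offers `Series.between()`, which includes both endpoints by default and therefore matches the `>=` / `<=` version above. The sketch below compares the two on a hand-built copy of the cars data (values taken from the rows shown in this chapter, row order assumed).

```python
import numpy as np
import pandas as pd

# Hand-built copy of the cars data, using the rows shown in this chapter.
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JAP", "IN", "RU", "MOR", "EG"],
)

cpc = cars["cars_per_cap"]

# NumPy form from the exercise: both endpoints included.
with_numpy = np.logical_and(cpc >= 100, cpc <= 500)

# Series.between() includes both endpoints by default.
with_between = cpc.between(100, 500)

same_mask = bool((with_numpy == with_between).all())
medium = cars[with_between]

print(same_mask, medium.index.tolist())
# True ['RU']
```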
