# 🔷 Comparison Operators — Intermediate Python



## 📋 NumPy Recap

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])

bmi = np_weight / np_height ** 2
bmi
```

Output:
```
array([21.852, 20.975, 21.75 , 24.747, 21.441])
```

Check BMI > 23:
```python
bmi > 23
```
Output:
```
array([False, False, False,  True, False])
```

Filter values:
```python
bmi[bmi > 23]
```
Output:
```
array([24.747])
```
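
The mask can also be inspected on its own before it is used for filtering. The sketch below reuses the recap arrays, stores the comparison in a variable, and counts the matches with `np.count_nonzero`. The names `is_high` and `n_high` are illustrative and do not come from the original notes.

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2

# The comparison runs element by element and yields a boolean array.
is_high = bmi > 23            # illustrative name, not from the notes
print(is_high.dtype)          # bool

# Counting the True entries tells how many values pass the test.
n_high = np.count_nonzero(is_high)
print(n_high)                 # 1

# Indexing with the mask keeps only the positions marked True.
print(bmi[is_high].round(3))
```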

---

## 📋 Comparison Operators

Comparison operators test how two values relate to each other. Every comparison evaluates to a boolean, `True` or `False`.

### Numeric comparisons:
```python
2 < 3
```
Output:
```
True
```

```python
2 == 3
```
Output:
```
False
```

```python
2 <= 3
```
Output:
```
True
```

```python
3 <= 3
```
Output:
```
True
```

```python
x = 2
y = 3
x < y
```
Output:
```
True
```

---

### Other comparisons:
```python
"carl" < "chris"
```
Output:
```
True
```

```python
3 < "chris"
```
Output:
```
TypeError: '<' not supported between instances of 'int' and 'str'
```

```python
3 < 4.1
```
Output:
```
True
```
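
The three results above follow from how Python orders each type. As a small sketch: strings are compared character by character by Unicode code point, `int` and `float` compare by numeric value, and an `int` has no defined ordering against a `str`. The strings `"Zoe"` and `"adam"` and the variable `ordering_failed` are made up for illustration.

```python
# Strings are compared character by character, using Unicode code points.
print("carl" < "chris")    # True: "a" sorts before "h" at the second character

# Uppercase letters have lower code points than lowercase ones,
# so capitalisation affects the result.
print("Zoe" < "adam")      # True

# int and float are both numeric, so they compare by value.
print(3 < 4.1)             # True

# An int and a str have no defined ordering, so `<` raises TypeError.
try:
    _ = 3 < "chris"
    ordering_failed = False
except TypeError:
    ordering_failed = True
print(ordering_failed)     # True
```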

---

### BMI comparisons:
```python
bmi
```
Output:
```
array([21.852, 20.975, 21.75 , 24.747, 21.441])
```

```python
bmi > 23
```
Output:
```
array([False, False, False,  True, False])
```

---

## 📋 Common Comparators

| Comparator | Meaning                 |
|------------|-------------------------|
| `<`        | Strictly less than      |
| `<=`       | Less than or equal      |
| `>`        | Strictly greater than   |
| `>=`       | Greater than or equal   |
| `==`       | Equal                   |
| `!=`       | Not equal               |
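
To see all six comparators side by side, the short sketch below applies each one to the same pair of values used earlier (`x = 2`, `y = 3`). The `results` dictionary is only a convenience for printing and is not part of the original notes.

```python
x, y = 2, 3

# One entry per comparator from the table above.
results = {
    "<": x < y,
    "<=": x <= y,
    ">": x > y,
    ">=": x >= y,
    "==": x == y,
    "!=": x != y,
}

for symbol, value in results.items():
    print(f"{x} {symbol} {y} is {value}")
```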

---



# Boolean Operators — Intermediate Python


---

## Boolean Operators in Python

Boolean operators allow you to combine multiple conditions.

Main operators:
- `and` → True only if **both** are True.
- `or` → True if **at least one** is True.
- `not` → flips True ↔ False.

---

## The `and` operator: both conditions must be True

```python
True and True
```
Output:  
`True` → because both sides are True.

---

```python
x = 12
x > 5 and x < 15
```
Output:  
`True` → because x = 12, which is >5 **and** <15.

---

Truth table for `and`:

| Left | Right | Result |
|------|-------|--------|
| True | True  | True   |
| True | False | False  |
| False| True  | False  |
| False| False | False  |

---

## The `or` operator: at least one condition is True

```python
True or True
```
Output:  
`True` → at least one is True.

---

```python
False or True
```
Output:  
`True` → because one is True.

---

```python
y = 5
y < 7 or y > 13
```
Output:  
`True` → because y = 5, which is <7.

---

Truth table for `or`:

| Left | Right | Result |
|------|-------|--------|
| True | True  | True   |
| True | False | True   |
| False| True  | True   |
| False| False | False  |
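
The truth tables can be reproduced by evaluating the operators on every combination of inputs. This is a minimal sketch using `itertools.product` from the standard library; the list names `and_rows`, `or_rows`, and `not_rows` are illustrative.

```python
from itertools import product

and_rows = []
or_rows = []

# product() yields (True, True), (True, False), (False, True), (False, False),
# which is the same row order as the tables above.
for left, right in product([True, False], repeat=2):
    and_rows.append((left, right, left and right))
    or_rows.append((left, right, left or right))

print("left  right and   or")
for (left, right, both), (_, _, either) in zip(and_rows, or_rows):
    # !s converts each boolean to "True"/"False" before padding.
    print(f"{left!s:<5} {right!s:<5} {both!s:<5} {either!s:<5}")

# `not` takes a single operand, so its table has only two rows.
not_rows = [(value, not value) for value in [True, False]]
print(not_rows)
```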

---

## The `not` operator: flips the boolean value

```python
not True
```
Output:  
`False`

---

```python
not False
```
Output:  
`True`

---

## Boolean Operators with NumPy Arrays

When working with arrays, comparisons produce arrays of booleans.

```python
import numpy as np

bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])

bmi > 21
```
Output:  
`array([ True, False,  True,  True,  True])` → checks if each value >21.

---

```python
bmi < 22
```
Output:  
`array([ True,  True,  True, False,  True])` → checks if each value <22.

---

### ⚠️ Important:
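Using `and` directly on arrays does **not work**: `and` expects a single truth value from each operand, but each comparison returns a whole array of booleans.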

```python
bmi > 21 and bmi < 22
```
Output:  
```
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
```
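
The error can be reproduced safely by catching it, which also makes the hint in the message concrete: `.any()` and `.all()` collapse a boolean array to one value. This is a sketch for illustration; the variable `error_raised` is not part of the original notes.

```python
import numpy as np

bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])

# `and` needs one True/False per operand, but each comparison yields five.
try:
    result = bmi > 21 and bmi < 22
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)           # True

# .any() and .all() reduce the array to a single boolean.
print((bmi > 21).any())       # True: at least one value is above 21
print((bmi > 21).all())       # False: 20.975 is not above 21
```

Note that `.any()` and `.all()` answer a different question (one result for the whole array) than an element-wise combination of two conditions.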

---

## Use NumPy logical functions for arrays

✅ For arrays, use:
- `np.logical_and(condition1, condition2)`
- `np.logical_or(condition1, condition2)`
- `np.logical_not(condition)`
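
The notes demonstrate `np.logical_and()`; the other two functions work the same way. Below is a brief sketch on the same `bmi` array. The thresholds and the variable names `low_or_high` and `not_above_21` are chosen here for illustration only.

```python
import numpy as np

bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])

# True where a value is below 21 OR above 24.
low_or_high = np.logical_or(bmi < 21, bmi > 24)
print(low_or_high)            # True at positions 1 and 3
print(bmi[low_or_high])       # the values 20.975 and 24.747

# True where the condition `bmi > 21` is NOT met.
not_above_21 = np.logical_not(bmi > 21)
print(bmi[not_above_21])      # only 20.975
```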

---

Check which BMI values are >21 **and** <22:
```python
np.logical_and(bmi > 21, bmi < 22)
```
Output:  
`array([ True, False,  True, False,  True])` → shows where both conditions are True.

---

Filter the BMI values that are >21 and <22:
```python
bmi[np.logical_and(bmi > 21, bmi < 22)]
```
Output:  
`array([21.852, 21.75 , 21.441])` → only the BMI values that satisfy both conditions.

---

## Summary:
✅ Use `and`, `or`, and `not` to combine single boolean values.  
✅ Use `np.logical_and()`, `np.logical_or()`, and `np.logical_not()` for arrays.  
✅ Comparisons on arrays return arrays of True/False.  

---


## Exercise: Boolean operators with NumPy

Before, the comparison operators like `<` and `>=` worked with NumPy arrays out of the box. Unfortunately, this is not true for the boolean operators `and`, `or`, and `not`.

To use these operators with NumPy, you will need `np.logical_and()`, `np.logical_or()` and `np.logical_not()`. Here's an example on the `my_house` and `your_house` arrays from before to give you an idea:

```python
np.logical_and(my_house > 13,
               your_house < 15)
```

Instructions:

Generate boolean arrays that answer the following questions:
- Which areas in `my_house` are greater than 18.5 or smaller than 10?
- Which areas are smaller than 11 in both `my_house` and `your_house`? Make sure to wrap both commands in a `print()` call so that you can inspect the output.

```python
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# my_house greater than 18.5 or smaller than 10
print(np.logical_or(my_house > 18.5, my_house < 10))

# Both my_house and your_house smaller than 11
print(np.logical_and(my_house < 11, your_house < 11))
```



# if, elif, else — Intermediate Python



---

## The `if` Statement

Use an `if` statement to execute code **only if a condition is True**.

---

### Example: check if a number is even

```python
z = 4
if z % 2 == 0:
    print("z is even")
```

Output:  
`z is even` → because 4 is divisible by 2.

---

### Example: multiple statements under if

```python
z = 4
if z % 2 == 0:
    print("checking " + str(z))
    print("z is even")
```

Output:
```
checking 4
z is even
```

---

### Example: condition is False

```python
z = 5
if z % 2 == 0:
    print("checking " + str(z))
    print("z is even")
```

Output:  
*(Nothing prints)* → because 5 is not divisible by 2.

---

## The `else` Clause

Use `else` to specify code to run if the `if` condition is False.

```python
z = 5
if z % 2 == 0:
    print("z is even")
else:
    print("z is odd")
```

Output:
```
z is odd
```

---

## The `elif` Clause

Use `elif` (else if) to test additional conditions if the first `if` is False.

---

### Example: check divisibility by 2 or 3

```python
z = 3
if z % 2 == 0:
    print("z is divisible by 2")
elif z % 3 == 0:
    print("z is divisible by 3")
else:
    print("z is neither divisible by 2 nor by 3")
```

Output:
```
z is divisible by 3
```

---

### Example: first condition is True, `elif` is skipped

```python
z = 6
if z % 2 == 0:
    print("z is divisible by 2")
elif z % 3 == 0:
    print("z is divisible by 3")
else:
    print("z is neither divisible by 2 nor by 3")
```

Output:
```
z is divisible by 2
```

---

## Summary
✅ `if` → runs if condition is True  
✅ `elif` → checks another condition if the previous ones are False  
✅ `else` → runs if none of the above conditions are True  
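
The same chain can be wrapped in a function so that several inputs are checked in one go. This is a small sketch; the function name `describe_divisibility` is hypothetical, and the messages are the ones used in the examples above.

```python
def describe_divisibility(z):
    """Return a message using the same if/elif/else chain as the examples."""
    if z % 2 == 0:
        return "z is divisible by 2"
    elif z % 3 == 0:
        # Reached only when the first condition was False.
        return "z is divisible by 3"
    else:
        # Reached only when every condition above was False.
        return "z is neither divisible by 2 nor by 3"


for z in [3, 6, 7]:
    print(z, "->", describe_divisibility(z))
```

For `z = 6` both conditions hold, but only the first matching branch runs, so the `elif` branch is never evaluated.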



## Exercise: Customize further with `elif`

It's also possible to have a look around in the bedroom. The sample code contains an `elif` part that checks if `room` equals `"bed"`. In that case, `"looking around in the bedroom."` is printed out.

It's up to you now! Make a similar addition to the second control structure to further customize the messages for different values of `area`.

Instructions:

Add an `elif` to the second control structure such that `"medium size, nice!"` is printed out if `area` is greater than 10.

```python
# Define variables
room = "bed"
area = 14.0

# if-elif-else construct for room
if room == "kit":
    print("looking around in the kitchen.")
elif room == "bed":
    print("looking around in the bedroom.")
else:
    print("looking around elsewhere.")

# if-elif-else construct for area
if area > 15:
    print("big place!")
elif area > 10:
    print("medium size, nice!")
else:
    print("pretty small.")
```

# Filtering Pandas DataFrames — Intermediate Python


---

## Reading the DataFrame

Load the BRICS dataset.

```python
import pandas as pd

brics = pd.read_csv("path/to/brics.csv", index_col=0)
brics
```

Output:
```
            country      capital    area  population
BR           Brazil     Brasilia   8.516      200.40
RU           Russia       Moscow  17.100      143.50
IN            India    New Delhi   3.286     1252.00
CH            China      Beijing   9.597     1357.00
SA     South Africa     Pretoria   1.221       52.98
```

---

## 🎯 Goal
Select only the countries where area > 8 million km².

We’ll do this in 3 steps:
1. Select the `area` column.
2. Compare values to 8.
3. Use the result to filter rows.

---

## Step 1: Select the `area` column

```python
brics["area"]
```

Output:
```
BR     8.516
RU    17.100
IN     3.286
CH     9.597
SA     1.221
Name: area, dtype: float64
```

Alternative ways to get the same column, by label with `.loc` or by position with `.iloc`:
```python
brics.loc[:, "area"]
brics.iloc[:, 2]
```

---

## Step 2: Compare `area` values

Check which countries have area > 8:
```python
brics["area"] > 8
```

Output:
```
BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool
```

We can also save this mask:
```python
is_huge = brics["area"] > 8
```

---

## Step 3: Use mask to filter DataFrame

```python
brics[is_huge]
```

Output:
```
      country   capital    area  population
BR     Brazil  Brasilia   8.516      200.4
RU     Russia    Moscow  17.100      143.5
CH      China   Beijing   9.597     1357.0
```

---

## Summary: One-liner

We can also write the filtering in a single line:
```python
brics[brics["area"] > 8]
```

Output:
```
      country   capital    area  population
BR     Brazil  Brasilia   8.516      200.4
RU     Russia    Moscow  17.100      143.5
CH      China   Beijing   9.597     1357.0
```
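
To try the three steps without the CSV file, the table printed earlier can be rebuilt in memory. The sketch below assumes the values shown in the output above and constructs the DataFrame by hand; the variable names `area` and `huge` are illustrative.

```python
import pandas as pd

# Rebuild the table shown in the notes so the example runs without the CSV.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
        "population": [200.40, 143.50, 1252.00, 1357.00, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

area = brics["area"]      # Step 1: select the column (a Series)
is_huge = area > 8        # Step 2: compare, giving a boolean Series
huge = brics[is_huge]     # Step 3: keep the rows where the mask is True
print(huge)

# The one-liner selects exactly the same rows.
same_rows = huge.equals(brics[brics["area"] > 8])
print(same_rows)
```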

---

## Boolean operators: Multiple conditions

We can also use `np.logical_and()` to filter with **two conditions**.

```python
import numpy as np

np.logical_and(brics["area"] > 8, brics["area"] < 10)
```

Output:
```
BR     True
RU    False
IN    False
CH     True
SA    False
Name: area, dtype: bool
```

Filter rows where area >8 and <10:
```python
brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]
```

Output:
```
      country   capital   area  population
BR     Brazil  Brasilia  8.516      200.4
CH      China   Beijing  9.597     1357.0
```
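
`np.logical_or()` and `np.logical_not()` can be applied to a pandas Series in the same way. The following sketch uses a trimmed, hand-built copy of the `brics` table (only `country` and `area`, with the values printed earlier); the thresholds and the names `outside` and `not_huge` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed in-memory copy of the table from the notes (two columns only).
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Rows where area is below 2 OR above 10.
outside = np.logical_or(brics["area"] < 2, brics["area"] > 10)
print(brics[outside])

# Rows where area is NOT above 8.
not_huge = np.logical_not(brics["area"] > 8)
print(brics[not_huge])
```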

---

## Summary
✅ Filter rows using a boolean Series.  
✅ Use `np.logical_and()` for combining conditions.  
✅ The whole filter fits in one line: `brics[brics["area"] > 8]`.



# Exercises

## Driving right (1)

Remember that `cars` dataset, containing the cars per 1000 people (`cars_per_cap`) and whether people drive right (`drives_right`) for different countries (`country`)? The code that imports this data in CSV format into Python as a DataFrame is included in the script.

In the video, you saw a step-by-step approach to filter observations from a DataFrame based on boolean arrays. Let's start simple and try to find all observations in `cars` where `drives_right` is `True`.

`drives_right` is a boolean column, so you'll have to extract it as a Series and then use this boolean Series to select observations from `cars`.

Instructions:

- Extract the `drives_right` column as a Pandas Series and store it as `dr`.
- Use `dr`, a boolean Series, to subset the `cars` DataFrame. Store the resulting selection in `sel`.
- Print `sel`, and assert that `drives_right` is `True` for all observations.

```python
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Extract drives_right column as Series: dr
dr = cars['drives_right']

# Use dr to subset cars: sel
sel = cars[dr]

# Print sel
print(sel)
```


## Driving right (2)

The code in the previous example worked fine, but it created a new variable `dr` that is not needed. You can achieve the same result without this intermediate variable. Put the code that computes `dr` straight into the square brackets that select observations from `cars`.

Instructions:

Convert the code to a one-liner that calculates the variable `sel` as before.

```python
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Convert code to a one-liner
# dr = cars['drives_right']
# sel = cars[dr]
sel = cars[cars['drives_right']]

# Print sel
print(sel)
```

## Cars per capita (1)

Let's stick to the `cars` data some more. This time you want to find out which countries have a high cars per capita figure. In other words, in which countries do many people have a car, or maybe multiple cars.

Similar to the previous example, you'll want to build up a boolean Series that you can then use to subset the `cars` DataFrame to select certain observations. If you want to do this in a one-liner, that's perfectly fine!

Instructions:

- Select the `cars_per_cap` column from `cars` as a Pandas Series and store it as `cpc`.
- Use `cpc` in combination with a comparison operator and `500`. You want to end up with a boolean Series that's `True` if the corresponding country has a `cars_per_cap` of more than 500 and `False` otherwise. Store this boolean Series as `many_cars`.
- Use `many_cars` to subset `cars`, similar to what you did before. Store the result as `car_maniac`.
- Print out `car_maniac` to see if you got it right.

```python
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Create car_maniac: observations that have a cars_per_cap over 500
cpc = cars['cars_per_cap']
many_cars = cpc > 500
car_maniac = cars[many_cars]

# Print car_maniac
print(car_maniac)
```


## Cars per capita (2)

Remember `np.logical_and()`, `np.logical_or()` and `np.logical_not()`, the NumPy variants of the `and`, `or` and `not` operators? You can also use them on Pandas Series to do more advanced filtering operations.

Take this example that selects the observations that have a `cars_per_cap` between 10 and 80. Try out these lines of code step by step to see what's happening.

```python
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 10, cpc < 80)
medium = cars[between]
```

Instructions:

- Use the code sample provided to create a DataFrame `medium` that includes all the observations of `cars` that have a `cars_per_cap` between 100 and 500.
- Print out `medium`.

```python
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Import numpy, you'll need this
import numpy as np

# Create medium: observations with cars_per_cap between 100 and 500
cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 100, cpc < 500)
medium = cars[between]

# Print medium
print(medium)
```

