<h1>Equality</h1>
To check if two Python values, or variables, are equal you can use ==. To check for inequality, you need !=. As a refresher, have a look at the following examples that all result in True. Feel free to try them out in the IPython Shell.
<pre>
2 == (1 + 1)
"intermediate" != "python"
True != False
"Python" != "python"
</pre>
When you write these comparisons in a script, you will need to wrap a print() function around them to see the output.

In [1]:
# Comparison of booleans
True == False

# Comparison of integers
-5 * 15 != 75

# Comparison of strings
"pyscript" == "PyScript"

# Compare a boolean with an integer
True == 1

True
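
As a hedged illustration of the point about scripts: outside an interactive cell, only printed values are shown, so the four comparisons above could be collected and printed as in this sketch. The list name `results` is an assumption made for the example.

```python
# Each comparison evaluates to a bool; print() makes it visible in a script.
results = [
    True == False,             # booleans
    -5 * 15 != 75,             # integers: -75 is not equal to 75
    "pyscript" == "PyScript",  # strings: comparison is case-sensitive
    True == 1,                 # bool is a subclass of int, so True equals 1
]

for value in results:
    print(value)
```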

<h1>Greater and less than</h1>
In the video, Filip also talked about the less-than and greater-than signs, <code>&lt;</code> and <code>&gt;</code>, in Python. You can combine them with an equals sign: <code>&lt;=</code> and <code>&gt;=</code>. Pay attention: <code>&lt;=</code> is valid syntax, but <code>=&lt;</code> is not.

All Python expressions in the following code chunk evaluate to True:
<pre>
3 < 4
3 <= 4
"alpha" <= "beta"
</pre>
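Remember that for string comparison, Python compares the strings character by character using Unicode code points. For strings in the same case this matches alphabetical order, but every uppercase ASCII letter sorts before every lowercase one.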

In [2]:
# Comparison of integers
x = -3 * 6
print(x >= -10)

# Comparison of strings
y = "test"
print("test" <= y)

# Comparison of booleans
print(True > False)

False
True
True
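
The string case can be made more concrete. The sketch below uses only built-in Python, and the variable names are illustrative. It suggests that comparison follows character code points, so an uppercase letter sorts before any lowercase one, and that lowercasing both sides first gives a case-insensitive comparison.

```python
# Strings are compared character by character using code points.
print(ord("Z"), ord("a"))  # code points of the two first letters

# "Z" has a smaller code point than "a", so "Zebra" sorts first.
upper_first = "Zebra" < "apple"

# Lowercasing both sides compares "zebra" with "apple" instead.
case_insensitive = "Zebra".lower() < "apple".lower()

print(upper_first)
print(case_insensitive)
```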


<h1>Compare arrays</h1>
Out of the box, you can also use comparison operators with NumPy arrays.

Remember <code>areas</code>, the list of area measurements for different rooms in your house from Introduction to Python? This time there are two NumPy arrays: <code>my_house</code> and <code>your_house</code>. They both contain the areas for the kitchen, living room, bedroom and bathroom in the same order, so you can compare them element by element.

In [3]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# my_house greater than or equal to 18
print(my_house>=18)

# my_house less than your_house
print(my_house<your_house)

[ True  True False False]
[False  True  True False]
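
Because the result of an array comparison is itself a boolean array, it can be summarised further. The following is a minimal sketch using the same two arrays; the name `bigger` is an assumption for the example.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# Element-wise comparison: one boolean per room
bigger = my_house > your_house
print(bigger)

# True counts as 1, so sum() gives the number of rooms where my_house is bigger
print(bigger.sum())

# any(): at least one True; all(): every element True
print(bigger.any(), bigger.all())
```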


<h1>and, or, not (1)</h1>
A boolean is either <code>True</code> or <code>False</code>, which behave like <code>1</code> and <code>0</code> in arithmetic. With boolean operators such as <code>and</code>, <code>or</code> and <code>not</code>, you can combine these booleans to perform more advanced queries on your data.

In the sample code below, two variables are defined: <code>my_kitchen</code> and <code>your_kitchen</code>, representing areas.

In [4]:
# Define variables
my_kitchen = 18.0
your_kitchen = 14.0

# my_kitchen bigger than 10 and smaller than 18?
print(my_kitchen>10 and my_kitchen<18)

# my_kitchen smaller than 14 or bigger than 17?
print(my_kitchen<14 or my_kitchen>17)

# Double my_kitchen smaller than triple your_kitchen?
print(2*my_kitchen < 3*your_kitchen)

False
True
True
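
A range check written with <code>and</code> can also be expressed as a chained comparison, and <code>not</code> inverts a combined condition. The sketch below reuses <code>my_kitchen</code>; the other variable names are illustrative.

```python
my_kitchen = 18.0

# Two comparisons joined with `and`
with_and = my_kitchen > 10 and my_kitchen < 18

# The same test as a chained comparison
chained = 10 < my_kitchen < 18

# `not` flips the result of the parenthesised expression
negated = not (my_kitchen < 14 or my_kitchen > 17)

print(with_and, chained, negated)
```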


<h1>Boolean operators with Numpy</h1>
Before, the comparison operators like <code>&lt;</code> and <code>&gt;=</code> worked with NumPy arrays out of the box. Unfortunately, this is not true for the boolean operators <code>and</code>, <code>or</code>, and <code>not</code>.

<p>To use these operators with NumPy, you will need <a href="http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.logical_and.html" target="_blank" rel="noopener noreferrer"><code>np.logical_and()</code></a>, <a href="http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.logical_or.html" target="_blank" rel="noopener noreferrer"><code>np.logical_or()</code></a> and <a href="http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.logical_not.html" target="_blank" rel="noopener noreferrer"><code>np.logical_not()</code></a>. Here's an example on the <code>my_house</code> and <code>your_house</code> arrays from before to give you an idea:</p>

<pre><code>np.logical_and(my_house &gt; 13, 
               your_house &lt; 15)
</code></pre>

In [5]:
# Create arrays
import numpy as np
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# my_house greater than 18.5 or smaller than 10
print(np.logical_or(my_house>18.5, my_house<10))

# Both my_house and your_house smaller than 11
print(np.logical_and(my_house<11, your_house<11))

[False  True False  True]
[False False False  True]


In [6]:
print(my_house[np.logical_or(my_house>18.5, my_house<10)])

[20.   9.5]
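
To see why the plain keywords do not work here, the sketch below tries <code>or</code> on two boolean arrays and catches the resulting error. It also shows the element-wise operator <code>|</code>, which is an alternative not used in the text above; the variable names are illustrative.

```python
import numpy as np

my_house = np.array([18.0, 20.0, 10.75, 9.50])

# `or` needs a single truth value, which a multi-element array cannot provide
raised = False
try:
    my_house > 18.5 or my_house < 10
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Element-wise alternatives: the function and the | operator
mask_func = np.logical_or(my_house > 18.5, my_house < 10)
mask_op = (my_house > 18.5) | (my_house < 10)  # parentheses are required

print(np.array_equal(mask_func, mask_op))
print(my_house[mask_op])
```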


<h1>if</h1>
<div class=""><p>It's time to take a closer look around in your house.</p>
<p>Two variables are defined in the sample code: <code>room</code>, a string that tells you which room of the house we're looking at, and <code>area</code>, the area of that room.</p></div>

In [7]:
# Define variables
room = "kit"
area = 14.0

# if statement for room
if room == "kit":
    print("looking around in the kitchen.")

# if statement for area
if area > 15.0:
    print("big place!")

looking around in the kitchen.


<h1>Add else</h1>
<div class=""><p>In the code below, the <code>if</code> construct for <code>room</code> has been extended with an <code>else</code> statement so that "looking around elsewhere." is printed if the condition <code>room == "kit"</code> evaluates to <code>False</code>.</p>
<p>Can you do a similar thing to add more functionality to the <code>if</code> construct for <code>area</code>?</p></div>

In [8]:
# Define variables
room = "kit"
area = 14.0

# if-else construct for room
if room == "kit":
    print("looking around in the kitchen.")
else:
    print("looking around elsewhere.")

# if-else construct for area
if area > 15:
    print("big place!")
else:
    print("pretty small.")

looking around in the kitchen.
pretty small.


<h1>Customize further: elif</h1>
<div class=""><p>It's also possible to have a look around in the bedroom. The sample code contains an <code>elif</code> part that checks if <code>room</code> equals "bed". In that case, "looking around in the bedroom." is printed out.</p>
<p>It's up to you now! Make a similar addition to the second control structure to further customize the messages for different values of <code>area</code>.</p></div>

In [9]:
# Define variables
room = "bed"
area = 14.0

# if-elif-else construct for room
if room == "kit":
    print("looking around in the kitchen.")
elif room == "bed":
    print("looking around in the bedroom.")
else:
    print("looking around elsewhere.")

# if-elif-else construct for area
if area > 15:
    print("big place!")
elif area > 10:
    print("medium size, nice!")
else:
    print("pretty small.")

looking around in the bedroom.
medium size, nice!
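
The branches of an <code>if</code>-<code>elif</code>-<code>else</code> construct are tested from top to bottom, and only the first matching one runs. As a small sketch, the area check can be wrapped in a function and tried with several values; the function name <code>describe_area</code> is hypothetical and not part of the exercise.

```python
def describe_area(area):
    """Return the message for a room area, testing conditions top to bottom."""
    if area > 15:
        return "big place!"
    elif area > 10:  # only reached when area <= 15
        return "medium size, nice!"
    else:            # only reached when area <= 10
        return "pretty small."


for area in (20.0, 14.0, 8.0):
    print(area, describe_area(area))
```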


<h1>Driving right (1)</h1>
<div class=""><p>Remember the <code>cars</code> dataset, containing the number of cars per 1000 people (<code>cars_per_cap</code>) and whether people drive on the right (<code>drives_right</code>) for different countries (<code>country</code>)? The code that imports this data in CSV format into Python as a DataFrame is available below.</p>
<p>In the video, you saw a step-by-step approach to filter observations from a DataFrame based on boolean arrays. Let's start simple and try to find all observations in <code>cars</code> where <code>drives_right</code> is <code>True</code>.</p>
<p><code>drives_right</code> is a boolean column, so you'll have to extract it as a Series and then use this boolean Series to select observations from <code>cars</code>.</p></div>

In [11]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Extract drives_right column as Series: dr
dr = cars["drives_right"]

# Use dr to subset cars: sel
sel = cars[dr]

# Print sel
print(sel)

     cars_per_cap        country  drives_right
US            809  United States          True
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True
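
If <code>cars.csv</code> is not at hand, the same selection can be tried on a small stand-in DataFrame. The sketch below is an assumption: it contains only the six rows that appear in the printed outputs of this section, not the full file. It also shows <code>~</code>, which inverts a boolean Series.

```python
import pandas as pd

# Stand-in for cars.csv, built only from rows visible in the outputs here
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 200, 70, 45],
        "country": ["United States", "Australia", "Japan",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, True, True, True],
    },
    index=["US", "AUS", "JAP", "RU", "MOR", "EG"],
)

# A boolean Series keeps the rows where its value is True
dr = cars["drives_right"]
sel = cars[dr]
print(sel)

# ~ flips every boolean, selecting the remaining rows
left = cars[~dr]
print(left)
```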


<h1>Driving right (2)</h1>
<div class=""><p>The code in the previous example worked fine, but it created an unnecessary intermediate variable, <code>dr</code>. You can achieve the same result without it. Put the code that computes <code>dr</code> straight into the square brackets that select observations from <code>cars</code>.</p></div>

In [12]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Convert code to a one-liner
sel = cars[cars['drives_right']]

# Print sel
print(sel)

     cars_per_cap        country  drives_right
US            809  United States          True
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True


<h1>Cars per capita (1)</h1>
<div class=""><p>Let's stick to the <code>cars</code> data some more. This time you want to find out which countries have a high <em>cars per capita</em> figure. In other words, in which countries do many people have a car, or maybe multiple cars?</p>
<p>Similar to the previous example, you'll want to build a boolean Series that you can then use to subset the <code>cars</code> DataFrame to select certain observations. If you want to do this in a one-liner, that's perfectly fine!</p></div>

In [13]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Create car_maniac: observations that have a cars_per_cap over 500
cpc = cars["cars_per_cap"]
many_cars = cpc>500
car_maniac = cars[many_cars]

# Print car_maniac
print(car_maniac)

     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JAP           588          Japan         False
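
The one-liner mentioned above could look like the following sketch. It runs on a stand-in DataFrame assembled only from rows printed in this section (an assumption, not the full <code>cars.csv</code>), and the <code>.loc</code> variant that also picks a single column is an addition for illustration.

```python
import pandas as pd

# Stand-in for cars.csv, built only from rows visible in the outputs here
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 200, 70, 45],
        "country": ["United States", "Australia", "Japan",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, True, True, True],
    },
    index=["US", "AUS", "JAP", "RU", "MOR", "EG"],
)

# Comparison and selection in a single expression
car_maniac = cars[cars["cars_per_cap"] > 500]
print(car_maniac)

# .loc takes the same boolean Series plus a column label
maniac_countries = cars.loc[cars["cars_per_cap"] > 500, "country"]
print(maniac_countries)
```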


<h1>Cars per capita (2)</h1>
<div class=""><p>Remember <a href="http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.logical_and.html" target="_blank" rel="noopener noreferrer"><code>np.logical_and()</code></a>, <a href="http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.logical_or.html" target="_blank" rel="noopener noreferrer"><code>np.logical_or()</code></a> and <a href="http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.logical_not.html" target="_blank" rel="noopener noreferrer"><code>np.logical_not()</code></a>, the NumPy variants of the <code>and</code>, <code>or</code> and <code>not</code> operators? You can also use them on pandas Series to do more advanced filtering operations.</p>
<p>Take this example that selects the observations that have a <code>cars_per_cap</code> between 10 and 80. Try out these lines of code step by step to see what's happening.</p>
<pre><code>cpc = cars['cars_per_cap']
between = np.logical_and(cpc &gt; 10, cpc &lt; 80)
medium = cars[between]
</code></pre></div>

In [15]:
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)

# Import numpy, you'll need this
import numpy as np

# Create medium: observations with cars_per_cap between 100 and 500
cpc = cars['cars_per_cap']
medium = cars[np.logical_and(cpc>100, cpc<500)]

# Print medium
print(medium)

    cars_per_cap country  drives_right
RU           200  Russia          True
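
As an alternative to <code>np.logical_and()</code>, pandas Series also support the element-wise <code>&amp;</code> operator. The sketch below compares the two on a stand-in DataFrame built only from rows printed in this section (an assumption, not the full <code>cars.csv</code>); the names <code>with_func</code> and <code>with_op</code> are illustrative.

```python
import numpy as np
import pandas as pd

# Stand-in for cars.csv, built only from rows visible in the outputs here
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 200, 70, 45],
        "country": ["United States", "Australia", "Japan",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, True, True, True],
    },
    index=["US", "AUS", "JAP", "RU", "MOR", "EG"],
)

cpc = cars["cars_per_cap"]

# Function form, as used in the cell above
with_func = np.logical_and(cpc > 100, cpc < 500)

# Operator form; the parentheses matter because & binds tighter than > and <
with_op = (cpc > 100) & (cpc < 500)

print((with_func == with_op).all())

medium = cars[with_op]
print(medium)
```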
