In [None]:
%reload_ext postcell
%postcell register

In [None]:
import pandas as pd

# Conditional statements: if/else/elif

As we have seen many times, Python and almost all programming languages provide `if` and `else` statements:

In [None]:
x = -100 # <= could be a value coming from a database lookup or user input

if x > 0:
    print("You have some money in your account")
else:
    print("You need overdraft protection!")

### `elif`
There are times when you need more than just two conditions. For example:

In [None]:
x = -100

if x > 0:
    print("You have some money in your account")
elif x == 0:
    print("You don't owe us anything, and you don't have any money")
else:
    print("You need overdraft protection!")

In many languages, `elif` is not a separate keyword. Programmers are expected to chain together `if` and `else` statements like this:

```python
#non-python languages
if x > 0: print("you have cash")
else if x == 0: print("you have no cash") # notice the use of "else if" instead of "elif"
else: print("You need overdraft protection!")
```

**Exercise** Please rearrange the code below so that the `if` and `else` branches are swapped. Hint: perhaps _negation_ will help.

In [None]:
%%postcell exercise_025_220_a

data_file_location = "../../datasets/deaths-in-gameofthrones/game-of-thrones-deaths-data.csv"

with open(data_file_location, 'r', encoding='utf8') as file:
    file_contents = file.readlines() # This returns a list of lines in this text file

if len(file_contents) > 0: # <= Swap the two branches
    print("You can proceed with further processing of this file")
else:
    print("This file is empty, are you sure this is the right file?")

### Truthiness

When used in `if` conditions or loops, each of the following evaluates as `False`:
1. `False`
2. Any zero value: `0`, `0.0`, `Decimal(0)`, etc.
3. Any empty container such as lists, dictionaries, sets
4. Empty strings: `''`, `"       ".strip()`
5. `range(0)`
6. `None`

In [None]:
def evaluates_as(i):
    if i: return True
    else: return False

for item in [False, 0, 0.0, list(), [], set(), dict(), "", "   \n   ".strip(), range(0), None, True, 1, [1,2]]:
    print(item, f" evaluates to {evaluates_as(item)}")
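
The built-in `bool()` applies the same truth test that an `if` condition uses, so a helper like `evaluates_as` is not strictly needed. A minimal sketch (the list names are illustrative only):

```python
# bool() applies the same truth test that `if` uses.
falsy_values = [False, 0, 0.0, [], {}, set(), "", range(0), None]
truthy_values = [True, 1, -1, [0], "0", " "]

# Every item in the first list is falsy, every item in the second is truthy.
print(all(not bool(v) for v in falsy_values))
print(all(bool(v) for v in truthy_values))
```

Note that `[0]`, `"0"` and `" "` are all truthy: a container or string only counts as false when it is empty.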

**Exercise** There is a "cleaner" way to write the if/else clause below. Comparing a boolean explicitly against `True` is a common error:

In [None]:
%%postcell exercise_025_220_c

# Re-write the following more cleanly
count_header = True

data_file_location = "../../datasets/deaths-in-gameofthrones/game-of-thrones-deaths-data.csv"
counter = 0

with open(data_file_location, 'r', encoding='utf8') as file:
    for index, line in enumerate(file):
        counter = counter + 1

if count_header == True: # <= Make this "cleaner"
    print(counter - 1)
else:
    print(counter)

**Exercise** You have read in a file, and you need to execute one set of lines if the file actually contains content and another set if it doesn't. Python provides a "cleaner" way to write this:

In [None]:
%%postcell exercise_025_220_c

data_file_location = "../../datasets/deaths-in-gameofthrones/game-of-thrones-deaths-data.csv"

with open(data_file_location, 'r', encoding='utf8') as file:
    file_contents = file.readlines() # This returns a list of lines in this text file

if len(file_contents) > 0: # <= Make this line "cleaner"
    print("You can proceed with further processing of this file")
else:
    print("This file is empty, are you sure this is the right file?")

### How to correctly break up a continuous range (common bug)

Breaking up a continuous scale into discrete sets that still cover the whole scale has a subtlety that some students miss. The idea is similar to what the consulting world calls MECE: "mutually exclusive, collectively exhaustive."

**Exercise** There is a bug (potentially more than one) in the code below. Please fix the code.
Hint: what if Lisa's GPA was 3.5?

In [None]:
%%postcell exercise_025_220_d

# Grade thresholds in a grading system
grade_A = 3.5
grade_B = 3.0
grade_C = 2.5
grade_D = 2.0

# Input from grader for Lisa Simpson
student_grade = 3.75

if   student_grade < 4.0 and student_grade > 3.5: print("Student receives grade A")
elif student_grade < 3.5 and student_grade > 3.0: print("Student receives grade B")
elif student_grade < 3.0 and student_grade > 2.5: print("Student receives grade C")
else: print("Student fails")

The proper way to break up a continuous range of values requires careful (but not difficult) use of "or equal to" operators.
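
As a sketch of the same idea in a different setting, the hypothetical function below sorts a temperature into bands. The cut-off values are invented for illustration. Each boundary value belongs to exactly one band because one side of every interval uses `<=` and the other uses `<`. The form `20 <= celsius < 30` is Python shorthand for `20 <= celsius and celsius < 30`.

```python
def temperature_band(celsius):
    """Classify a temperature into one of four bands (made-up cut-offs)."""
    if celsius >= 30:
        return "hot"
    elif 20 <= celsius < 30:   # 20 is included here, 30 is not
        return "warm"
    elif 10 <= celsius < 20:   # 10 is included here, 20 is not
        return "mild"
    else:                      # everything below 10
        return "cold"

# Check the values that sit exactly on, or just below, a boundary.
for reading in [30, 29.9, 20, 10, 9.9]:
    print(reading, temperature_band(reading))
```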

### Getting the parentheses correct is extremely important in boolean expressions

In [None]:
data_file_location = "../../datasets/life-expectancy/life-expectancy-who.zip2"

who_df = pd.read_csv(data_file_location, compression='zip')
who_df.columns

In [None]:
who_df[who_df.Year == 2015].head()

**Exercise** Let's take a look at data from the World Health Org, from the year 2015 (the most recent data we have available to us).
I am interested in seeing all the countries which are "Developing" or have life expectancy of below 80 years.

How come I'm seeing data from years other than 2015? Please fix it.

Recall that Pandas boolean expressions don't use the `and` and `or` keywords. They use the operators `&` and `|` instead. Also recall that when several boolean expressions are combined, each one has to be enclosed in parentheses. See the pandas notebooks for further details.
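
The parentheses matter because of operator precedence: `&` is applied before `|`, and also before comparison operators such as `==`. A small sketch with plain Python values rather than a DataFrame (all variable names here are made up for illustration):

```python
year = 2015
is_developing = True

# Without parentheses, Python computes `2015 & is_developing` first
# and only then compares the result with `year`.
without_parens = year == 2015 & is_developing
with_parens = (year == 2015) & is_developing
print(without_parens, with_parens)

# `&` is also applied before `|`, just as `and` is applied before `or`.
a, b, c = False, True, True
print(a & b | c)     # grouped as (a & b) | c
print(a & (b | c))   # explicit grouping changes the result
```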

In [None]:
%%postcell exercise_025_220_e

who_df[(who_df.Year == 2015) & (who_df.Status == 'Developing') | (who_df['Life expectancy '] < 80)]

### Why use if/else, instead of just a series of if statements?

Let's revisit an earlier example:

In [None]:
student_grade = 3.3

if   student_grade < 4.0 and student_grade >= 3.5: print("A")
elif student_grade < 3.5 and student_grade >= 3.0: print("B")
elif student_grade < 3.0 and student_grade >= 2.5: print("C")
else : print("Fail")

Notice that we can dramatically simplify our code by remembering that _only_ one of the branches will be taken in the code above:

In [None]:
student_grade = 3.3

if   student_grade >= 3.5: print("A")
elif student_grade >= 3.0: print("B") # <= This branch already excludes anything above (or equal to) 3.5!
elif student_grade >= 2.5: print("C")
else : print("Fail")

However, if we can't be sure that the branches are mutually exclusive, then this logic no longer works:

In [None]:
student_grade = 3.3

if   student_grade >= 3.5: print("A")
if student_grade >= 3.0: print("B") # <= Without `elif`, this branch no longer excludes anything above (or equal to) 3.5!
if student_grade >= 2.5: print("C")
if student_grade < 2.5 : print("Fail")

### Ternary if/else (single line if/else)

There are times when you want a very compact version of an if/else statement. For example:

In [None]:
x = -100

owes_money =  True if x < 0 else False

In [None]:
owes_money

This syntax provides an additional benefit: the full logic of `if` and `else` can be written in a single line and assigned to a variable. In more technical terms, this syntax is an _expression_ rather than a _statement_.

Note that many languages provide a similar ternary operator. In many of them, the code above will look like this: `owes_money = x < 0 ? True : False`
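
Because the ternary form is an expression, it can produce any value, not only `True` or `False`. A brief sketch (the `status` name and its strings are illustrative). When both outcomes are booleans, the comparison on its own already gives the same result:

```python
x = -100

# The comparison already evaluates to a boolean, so no ternary is needed.
owes_money = x < 0

# A ternary is most useful when choosing between two non-boolean values.
status = "overdrawn" if x < 0 else "in credit"
print(owes_money, status)
```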

### Short circuiting
Novice programmers often miss a subtlety related to how boolean expressions are evaluated.

We will write custom functions which return True or False, but also print some debugging information:

In [None]:
def MyTrue():
    print("Executing MyTrue")
    return True

def MyFalse():
    print("Executing MyFalse")
    return False

In [None]:
MyTrue()

In [None]:
if MyFalse(): print("Should print True")

Notice that in order to evaluate the conditional statement, `MyFalse()` had to be executed (this should be obvious).

In [None]:
if MyTrue(): print("Should print True")

This is essentially the same statement as before, so no surprises.

In [None]:
if MyTrue() and MyTrue() and MyTrue() : print("Should print True")

Recall that an `and` expression is true only if _all_ of its sub-expressions are true. In other words, all `MyTrue` functions have to be evaluated. Perhaps not very surprising.

In [None]:
if MyTrue() or MyTrue() or MyTrue() : print("Should print True")

This _should_ be surprising! Recall that in an `or` expression, only one of the values has to be true for the whole expression to be true. This is why, once the first `True` was found, the rest of the expression was not even evaluated (it was _short-circuited_).

**Exercise** What will this expression print?
```python
if MyTrue() and MyTrue() or MyTrue() and MyTrue() or MyTrue() : print("Should print True")
```

**Exercise** What will this expression print?
```python
if MyFalse() and MyTrue() : print("Should print True")
```
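
Short-circuiting is often used deliberately as a guard, so that the right-hand side only runs when it is safe to do so. A minimal sketch, using a hypothetical `first_is_positive` helper:

```python
def first_is_positive(values):
    # `values[0]` would raise IndexError on an empty list, but `and`
    # stops at the falsy left-hand side and never evaluates it.
    return bool(values and values[0] > 0)

print(first_is_positive([]))
print(first_is_positive([5, -1]))
print(first_is_positive([-5, 1]))
```

Swapping the two operands would remove the protection, because `values[0]` would then be evaluated first.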

# None
Almost every programming language has a construct known as "null", which represents a lack of value. Not zero, but a missing value. Python chose to call its version of "null" `None`. When "null" is not available, it often has to be invented. 

#### Null in the real world

Imagine you are in the field, collecting data on how many ounces of milk babies are drinking in each household. There will be times when a household won't have any babies. You will record `0` as the ounces of milk babies are drinking in this household. What about the times when you have information about a household, but they refuse to tell you anything about what the baby is consuming? In your notebook, you may write a `-1` or some other nonsensical value to represent a "missing value."
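
The trouble with a placeholder such as `-1` is that it still looks like a number. A sketch with made-up readings, using Python's `None` for the household that did not answer:

```python
# Ounces of milk per household; the third household gave no answer.
with_sentinel = [8, 0, -1, 12]
with_none = [8, 0, None, 12]

# The -1 placeholder is silently included and distorts the average.
print(sum(with_sentinel) / len(with_sentinel))

# None cannot be summed by accident, so it has to be filtered out explicitly.
known = [v for v in with_none if v is not None]
print(sum(known) / len(known))
```

Calling `sum(with_none)` directly raises a `TypeError`, which is arguably helpful: the missing value cannot slip into a calculation unnoticed.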

#### Null in computers

Recall that so far, we have used the `return` statement in every function. What if a function didn't return anything?

In [None]:
def ok_function(): return 1+1
def bad_function(): 1+1

In [None]:
print(ok_function())

In [None]:
print(bad_function())

Not having a `return` statement is NOT an error; the function simply returns `None`!
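
A common place to meet this is with built-in methods that modify an object in place. `list.sort()` sorts the list itself and returns `None`, whereas `sorted()` returns a new list. A short illustration (the variable names are arbitrary):

```python
numbers = [3, 1, 2]

result = numbers.sort()        # sorts in place; there is no return value
print(result)
print(numbers)

new_list = sorted([3, 1, 2])   # returns a new sorted list
print(new_list)
```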

Just like 1, 2 and 3 belong to the `int` type and "hello" belongs to the `str` type, `None` belongs to the `NoneType` type (in fact, `None` is the _only_ member of that type).

In [None]:
type(None)

`None` can be passed around like any other value:

In [None]:
my_list = [1, 2, 3, "hello", 5, None, 6, True, 7, False, None, None, "yo"]
my_list

In [None]:
print(my_list[5])

In [None]:
None == True

In [None]:
for e in [1, 2, 3, "hello", 5, None, 6, True, 7, False, None, None, "yo"]:
    print(e == True)

Notice that only `1` and `True` compare equal to `True`; all other elements, including `None`, evaluate to `False`.

### `==` vs `is`
Although the details of this distinction are out of scope for this lecture, when comparing values to `None`, you should use the `is` operator rather than `==`.

In [None]:
x = None

In [None]:
x == None

In [None]:
x is None
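
One reason to prefer `is`: `==` asks the object whether it considers itself equal, and a class can answer however it likes, while `is` checks whether two names refer to the very same object. A contrived sketch with a hypothetical `AlwaysEqual` class:

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim to equal everything."""
    def __eq__(self, other):
        return True

x = AlwaysEqual()
print(x == None)   # the custom __eq__ decides the answer
print(x is None)   # identity check: is x the None object itself?
```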