# Conditional statements 

Very often, we need to process lists in a way that varies with list contents. 
This leads to a need for *conditional statements* such as `if`.

# Example: filter a list for missing values
In Python, there is a universal representation for missing values: `None`.
It is a special value that stands for the absence of a value. Real data often has missing entries, and we will represent them with this (pseudo-)value. 
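
As a quick illustration of how `None` behaves, here is a minimal sketch. The variable name `middle_name` is made up for the example.

```python
# `None` is a single built-in object that stands for "no value here".
middle_name = None

# It has its own type, and it is not the same as an empty string or zero.
kind = type(middle_name).__name__      # 'NoneType'
same_as_empty = (middle_name == '')    # False
same_as_zero = (middle_name == 0)      # False

print(kind, same_as_empty, same_as_zero)
```

This is why the code in this notebook has to convert empty strings to `None` explicitly: Python does not treat the two as equal.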

Consider the following:

In [None]:
# here's a data file with some missing values: 
f = open('data.txt', 'r')
for line in f: 
    print(line.strip())
f.close()

In [None]:
# let's split that and represent as a list
f = open('data.txt', 'r')
records = []
for line in f: 
    item, price, tax, state = line.strip().split(',')
    if item == '': 
        item = None
    if price == '': 
        price = None
    if tax == '': 
        tax = None
    if state == '': 
        state = None
    record = (item, price, tax, state)
    records.append(record)
f.close()
records

There are some perhaps strange things going on here: 

* The code 
```
item, price, tax, state = line.strip().split(',')
``` 
takes a `line`, strips leading and trailing whitespace (including the trailing `\n`), and splits it at commas. The results go into the variables `item`, `price`, `tax`, and `state` via what Python calls *sequence unpacking* (sometimes called parallel assignment). The result of `split` is a list, and the assignment places its elements into those variables in order. The list must have exactly four elements, or Python raises a `ValueError`. 

* The code: 
```
if price == '': 
    price = None
```
sets fields to `None` if they're empty. 
* The code: 
```
record = (item, price, tax, state) 
```
creates a tuple of the four elements. 

* The variable `records` is a *list of tuples.* This is common practice in Python for representing data. 
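
The unpacking step described above can be tried on a single line of text. The sketch below uses a made-up sample line (not taken from `data.txt`) and shows that the number of variables must match the number of fields.

```python
# A made-up line in the same comma-separated layout; the price field is empty.
line = 'widget,,0.35,MA\n'

fields = line.strip().split(',')   # ['widget', '', '0.35', 'MA']
item, price, tax, state = fields   # four fields into four variables

if price == '':
    price = None

record = (item, price, tax, state)
print(record)

# Unpacking fails when the counts differ.
try:
    a, b = fields                  # four values, only two variables
except ValueError as err:
    message = str(err)
    print('unpacking failed:', message)
```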

# List or tuple?

Why did I choose tuples for the elements, and lists for the overall container? This is a really good question. In general: 
1. Lists are used when all entries of the list have the same meaning regardless of position in the list. 
2. Tuples are used when the position of an item determines its meaning. 

For example, in the above situation, 
* All list items represent the same kind of thing, but 
* The first, second, third, and fourth tuple elements mean different things, depending on their position in the tuple. 

If `t` is one of those tuples:
* `t[0]` is the item name
* `t[1]` is the price
* `t[2]` is the tax paid
* `t[3]` is the state abbreviation. 
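
A small sketch of the two access styles, using made-up records in the same (item, price, tax, state) layout:

```python
# Made-up records; each tuple is (item, price, tax, state).
sample = [
    ('widget', '4.99', '0.35', 'MA'),
    ('gadget', '12.50', None, 'NY'),
]

# Position in the list carries no meaning: every entry is "a record".
for t in sample:
    # Position in the tuple does carry meaning.
    print(t[0], 'was sold in', t[3])

# Unpacking gives the positions readable names.
item, price, tax, state = sample[1]
```

Unpacking is usually easier to read than indexing when several fields of the same tuple are needed at once.
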

# We now face several quandaries
1. The data are strings. We want numbers. 
2. There are `None`s in the data. We want to skip them. 

So, we need to resort to unusual measures. Consider:

In [None]:
# you need records for this to work
rec2 = []
for r in records: 
    item, price, tax, state = r
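    try: 
        p2 = float(price)
    except (TypeError, ValueError):  # float(None) or float('abc') fails
        p2 = None
    try: 
        t2 = float(tax)
    except (TypeError, ValueError): 
        t2 = None
    rec2.append((item, p2, t2, state))
rec2

Some new ideas: 
* The code
```
try:  
    p2 = float(price)
except (TypeError, ValueError): 
    p2 = None
```
is an *exception handler*.  It means: 
1. Try to do the thing in the `try` block. 
2. If that raises a `TypeError` or a `ValueError`, do the thing in the `except` block instead. 

`float` raises a `ValueError` for a string that is not a number and a `TypeError` for `None`, so in both cases the result is `None`. In effect, this works like *another form of `if` statement!*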
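
One way to avoid repeating the `try`/`except` pattern for every field is to wrap it in a small function. This is only a sketch: `to_float` is a hypothetical helper name, not something defined elsewhere in this notebook.

```python
def to_float(text):
    """Return text converted to a float, or None if it cannot be converted."""
    try:
        return float(text)
    except (TypeError, ValueError):
        # float(None) raises TypeError; float('abc') raises ValueError.
        return None

# Made-up inputs: a number, a missing value, a non-number, an empty string.
converted = []
for v in ['4.99', None, 'abc', '']:
    converted.append(to_float(v))
print(converted)
```

Naming the exception types keeps the handler from silently hiding unrelated errors, such as a misspelled variable name.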

As a result, our tuples are now "clean", in the sense that non-existent values are always `None`, and numeric values are always numbers. 


Let's sum up the tax, without having our code crash. 

Consider: 

In [None]:
tax = 0.0
for r in rec2: 
    tax += r[2]
tax

This looks reasonable, but it fails: some of the `r[2]` values are `None`, and adding `None` to a `float` raises a `TypeError`. Run it to see the error. 
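
To see the failure without stopping the notebook, the addition can be wrapped in an exception handler. The sketch below uses a made-up list of tax values rather than `rec2`.

```python
# Made-up tax values with one missing entry.
taxes = [0.35, None, 1.20]

total = 0.0
try:
    for t in taxes:
        total += t          # fails when t is None
except TypeError as err:
    error_text = str(err)
    print('could not add:', error_text)
```

Note that the loop stops at the first `None`, so `total` holds only the values added before the failure.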

# Manipulating values that can be None
So far, we know that trying to add `None` to a `float` doesn't work. 
You might be tempted to test for `None` with `==` or `!=`: 
```
if v != None: 
    total += v 
```
This usually runs, but it is not reliable: a class can redefine what `==` and `!=` mean, so the comparison may not test what you intend. Python's style guide (PEP 8) recommends comparing to `None` with `is` and `is not`, which check whether `v` is the `None` object itself: 
```
if v is not None: 
    total += v 
```
or 
```
if v is None: 
    ...  # something else
```
So, we need to check whether we're operating on `None`, as follows: 

In [None]:
tax = 0.0
for r in rec2: 
    if r[2] is not None: 
        tax += r[2]
tax
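
The `is` operator tests whether two names refer to the very same object, while `==` asks the object whether it considers itself equal. The sketch below uses a hypothetical class, `AlwaysEqual`, to show how the two can disagree.

```python
class AlwaysEqual:
    """Hypothetical class whose == comparison always answers True."""
    def __eq__(self, other):
        return True

v = AlwaysEqual()

equals_none = (v == None)   # True: the class overrides ==
is_none = (v is None)       # False: v is not the None object

print(equals_none, is_none)
```

Ordinary strings and numbers do not behave this way, but using `is None` and `is not None` everywhere avoids the question entirely.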

# Here are some exercises
First, please run this cell to register yourself with the grader. 


In [None]:
# Don't change this cell; just run it. 
from client.api.notebook import Notebook
ok = Notebook('01-07-conditional-statements.ok')
ok.auth(inline=True)

1. Write code that sums up the prices. Deal with prices that are `None`. 

In [None]:
# write your answer here. Compute the sum of all price fields in rec2. 
# Put the price into the variable "price"
price

In [None]:
# run this to check your work
_ = ok.grade('q1')

2. Write code that generates a list of states from the tuples, omitting states that are `None`. 

In [None]:
# write code that generates a variable 'states', from `rec2`.
states

In [None]:
# run this to check your work
_ = ok.grade('q2')

3. Create a list with only the tuples that have no missing data. 

In [None]:
# use rec2. put your answer into "cleaned"
cleaned

In [None]:
# run this to check your work
_ = ok.grade('q3')

# When you're done, submit the notebook

You can submit a notebook by saving it as PDF. In the cluster environment, use File | Print (Save as PDF) and submit the result to Gradescope (https://www.gradescope.com/courses/182658). On other versions, it may be File | Download As (PDF); then submit to Gradescope.

To submit to Gradescope, log into the [website](https://www.gradescope.com/courses/182658), add course **9W7PW3** (if not already added) and submit. The assignment name should match the name of this notebook.
