# Controlling Program Flow with Conditional Statements

<a href="https://colab.research.google.com/github/bradleyboehmke/uc-bana-4080/blob/main/example-notebooks/15_conditional_statements.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook accompanies [this textbook chapter](https://bradleyboehmke.github.io/uc-bana-4080/16-conditional-statements.html) and allows you to run the code examples interactively.


## Prerequisites

In [None]:
import pandas as pd
import numpy as np


Most of the examples use toy data; however, when illustrating concepts integrated with pandas we will use the Complete Journey transaction data:


In [None]:
from completejourney_py import get_data

df = get_data()['transactions']


## The if Statement


In [None]:
x = -8

if x < 0:
    print('x contains a negative number')

In [None]:
# multiple lines in the statement are fine as long as they have the
# same indentation
if x < 0:
    new_value = abs(x)
    print(f'absolute value of x is {new_value}')

In [None]:
# single line approach
if x < 0: print('x contains a negative number')

In [None]:
# a conditional statement on an empty object is equivalent to False
empty_list = []
if empty_list:
    print("since empty_list is falsy this won't execute")

In [None]:
# a conditional statement on a non-empty object is equivalent to True
non_empty_list = ['not', 'empty']
if non_empty_list:
    print("This list is not empty")
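
The two cells above rely on Python's truth-value testing. As a quick sketch, `bool()` shows how a few common objects are treated when they appear in a condition. The sample values below are illustrative only and are not part of the original examples.

```python
# Empty containers, zero, the empty string, and None are all treated as False.
falsy_values = [[], {}, (), '', 0, 0.0, None]

# Non-empty containers and non-zero numbers are treated as True.
truthy_values = [['not', 'empty'], {'a': 1}, 'text', 42, -1.5]

for value in falsy_values:
    print(repr(value), '->', bool(value))

for value in truthy_values:
    print(repr(value), '->', bool(value))
```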


Python uses the `and` and `or` operators to combine multiple expressions. When every operand is a comparison such as `x < 0`, the result is a single `True` or `False`. Python also short-circuits: it stops evaluating the expressions as soon as the result is known.


In [None]:
x = -1
y = 4
if x < 0 or y < 0:
    print('At least one of these objects is less than zero.')

In [None]:
if x < 0 and y < 0:
    print('Both x and y are less than zero')
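
To see that Python stops evaluating as soon as the result is known, the sketch below records which checks actually run. The helper `is_negative` and the `calls` list are hypothetical names introduced only for this illustration.

```python
calls = []  # records which checks were actually evaluated

def is_negative(name, value):
    """Hypothetical helper: log the check, then test whether value < 0."""
    calls.append(name)
    return value < 0

x = -1
y = 4

# `or` stops at the first True operand, so y is never checked.
either_negative = is_negative('x', x) or is_negative('y', y)
calls_for_or = list(calls)

calls.clear()

# `and` has to continue after a True operand, so both are checked.
both_negative = is_negative('x', x) and is_negative('y', y)
calls_for_and = list(calls)

print(either_negative, calls_for_or)
print(both_negative, calls_for_and)
```

Because `x < 0` is already `True`, the `or` expression never looks at `y`, while the `and` expression must evaluate both checks before it can return `False`.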

## Multiway Branching

In [None]:
x = 22.50

if 0 <= x < 10:
    print('low')
elif 10 <= x < 20:
    print('medium-low')
elif 20 <= x < 30:
    print('medium')
else:
    print('preferred')
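
The same `if`/`elif`/`else` chain is often easier to reuse when it returns a label instead of printing it. The sketch below wraps the chain in a hypothetical function named `spend_tier`; the test amounts are made up for illustration.

```python
def spend_tier(x):
    """Hypothetical helper that returns the label instead of printing it."""
    if 0 <= x < 10:
        return 'low'
    elif 10 <= x < 20:
        return 'medium-low'
    elif 20 <= x < 30:
        return 'medium'
    else:
        # Reached for x >= 30, but also for any negative x.
        return 'preferred'

for amount in [5, 10, 22.50, 30, -3]:
    print(amount, spend_tier(amount))
```

Note that the `else` branch catches everything the earlier conditions miss, so a negative amount is also labeled `'preferred'`. If that is not what you want, add an explicit branch for it.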

## Switch Statements

In [None]:
choice = 'ham'

if choice == 'spam':
    print(1.25)
elif choice == 'ham':
    print(1.99)
elif choice == 'eggs':
    print(0.99)
elif choice == 'bacon':
    print(1.10)
else:
    print('Bad choice')


However, this `if`/`elif` chain is a bit verbose. A more concise alternative is to use a dictionary, which provides the same key-value matching as a `switch` statement.


In [None]:
options = {'spam': 1.25, 'ham': 1.99, 'eggs': 0.99, 'bacon': 1.10}


You can either index this dictionary for the matching key:


In [None]:
options[choice]


Indexing raises a `KeyError` when the key is missing, so a more robust approach is to use the `get()` method. This allows you to provide a default response for the case where the key you are looking for is not in the dictionary:



In [None]:
options.get(choice, 'Bad choice')

In [None]:
choice = 'broccoli'
options.get(choice, 'Bad choice')

In [None]:
def produce_revenue(sqft, visits, trend):
    total = 9.91 * sqft * visits * trend
    return round(total, 2)

def frozen_revenue(sqft, visits, trend):
    prod = produce_revenue(sqft, visits, trend)
    total = 3.28 * sqft * visits * trend - prod * .005
    return round(total, 2)

expected_annual_revenue = {
    'produce':    produce_revenue,
    'frozen':     frozen_revenue,
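    # accept the same arguments as the other entries so every call looks alike
    'pharmacy':   lambda sqft, visits, trend: round(16.11 * visits * trend, 2)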
    }

choice = 'frozen'
expected_annual_revenue.get(choice, 'Bad choice')(sqft=937, visits=465, trend=0.98)
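
Calling the result of `.get(choice, 'Bad choice')` directly works only when the key exists; for an unknown key the default string itself would be called, which raises a `TypeError`. One way to guard against that, sketched below under simplified assumptions, is to look up the function first and call it only if it was found. The names `revenue_functions`, `pharmacy_revenue`, and `expected_revenue` are hypothetical and introduced just for this example.

```python
def produce_revenue(sqft, visits, trend):
    return round(9.91 * sqft * visits * trend, 2)

def pharmacy_revenue(sqft, visits, trend):
    # sqft is accepted but unused so every entry shares one signature
    return round(16.11 * visits * trend, 2)

revenue_functions = {
    'produce':  produce_revenue,
    'pharmacy': pharmacy_revenue,
}

def expected_revenue(choice, **kwargs):
    """Hypothetical wrapper: look up the function, call it only if found."""
    func = revenue_functions.get(choice)
    if func is None:
        return 'Bad choice'
    return func(**kwargs)

pharmacy_total = expected_revenue('pharmacy', sqft=937, visits=465, trend=0.98)
missing_total = expected_revenue('broccoli', sqft=937, visits=465, trend=0.98)

print(pharmacy_total)
print(missing_total)
```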

## Applying in Pandas

In [None]:
df['value'] = df['sales_value'].apply(lambda x: 'high value' if x > 10 else 'low value')
df.head()

In [None]:
df.groupby('value').size()


An alternative and much faster approach is to use `np.where()`, which requires NumPy to be loaded. `np.where()` has been shown to be over 2.5 times faster than `apply()`:


In [None]:
df['value'] = np.where(df['sales_value'] > 10, 'high value', 'low value')
df.head()
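
If you want to check the speed difference on your own machine, a rough sketch with `timeit` is shown below. It uses a synthetic `sales_value` column (randomly generated, not the Complete Journey data), and the timings it prints will vary with your hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for the `sales_value` column (assumed values).
rng = np.random.default_rng(0)
toy = pd.DataFrame({'sales_value': rng.uniform(0, 20, size=100_000)})

def with_apply():
    return toy['sales_value'].apply(lambda x: 'high value' if x > 10 else 'low value')

def with_where():
    return np.where(toy['sales_value'] > 10, 'high value', 'low value')

# Both approaches should label every row identically.
same_labels = np.array_equal(with_apply().to_numpy().astype(str), with_where())

# Time three runs of each approach.
apply_seconds = timeit.timeit(with_apply, number=3)
where_seconds = timeit.timeit(with_where, number=3)

print('labels match:', same_labels)
print(f'apply:    {apply_seconds:.4f} s')
print(f'np.where: {where_seconds:.4f} s')
```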


As our conditions get more complex, it often becomes useful to create a separate function and use `apply()`. This approach is probably the most legible; however, it is not always the fastest when you are working with large data.


In [None]:
def flag(row):
    # `row` is a single transaction (one row of df) passed in by apply
    if (row['quantity'] > 20) or (row['sales_value'] > 10):
        return 'Large purchase'
    elif (row['quantity'] > 10) or (row['sales_value'] > 5):
        return 'Medium purchase'
    elif (row['quantity'] > 0) or (row['sales_value'] > 0):
        return 'Small purchase'
    else:
        return 'Alternative transaction'

# axis=1 applies the function to each row rather than each column
df['purchase_flag'] = df.apply(flag, axis=1)
df.head()

In [None]:
df.groupby('purchase_flag').size()
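
When a row-wise `apply()` becomes too slow, one vectorized option is `np.select()`, which checks a list of conditions in order much like an `if`/`elif` chain. The sketch below uses a small toy DataFrame with made-up values rather than the transaction data. Note that element-wise conditions on whole columns are combined with `|` instead of `or`.

```python
import numpy as np
import pandas as pd

# Toy transactions standing in for the real data (assumed values).
toy = pd.DataFrame({
    'quantity':    [25, 12, 1, 0, 3],
    'sales_value': [4.0, 2.5, 0.99, 0.0, 11.0],
})

# Conditions are checked in order; the first match wins, like if/elif.
conditions = [
    (toy['quantity'] > 20) | (toy['sales_value'] > 10),
    (toy['quantity'] > 10) | (toy['sales_value'] > 5),
    (toy['quantity'] > 0) | (toy['sales_value'] > 0),
]
labels = ['Large purchase', 'Medium purchase', 'Small purchase']

# Rows matching none of the conditions receive the default label.
toy['purchase_flag'] = np.select(conditions, labels, default='Alternative transaction')
print(toy)
```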