In [1]:
# Set up packages for lecture. Don't worry about understanding this code, but
# make sure to run it if you're following along.
import numpy as np
import babypandas as bpd
import pandas as pd
from matplotlib_inline.backend_inline import set_matplotlib_formats
import matplotlib.pyplot as plt
%reload_ext pandas_tutor
%set_pandas_tutor_options {'projectorMode': True}
set_matplotlib_formats("svg")
plt.style.use('fivethirtyeight')

np.set_printoptions(threshold=20, precision=2, suppress=True)
pd.set_option("display.max_rows", 7)
pd.set_option("display.max_columns", 8)
pd.set_option("display.precision", 2)

# Lecture 11 – Booleans and Conditionals, Iteration

## DSC 10, Summer 2022

### Announcements

- Homework 3 is due **Sat at 11:59pm.**
- Lab 4 is due **Tues 4/26 at 11:59pm**.
- The Midterm Project will be released later this week and will be due **Sat 8/6 at 11:59pm**.
    - See [the spreadsheet for finding a partner][pairs].
    - If you work with a partner, you **must** follow these [pair programming guidelines](https://dsc10.com/pair-programming).
    - **Start early and come to office hours!**
- The midterm exam will take place on Fri, July 29 from 11:00-11:50am in class.
    - You can bring one 8.5" by 11" page of handwritten notes (double-sided is ok).
    - We'll also provide [the DSC 10 reference sheet][ref] during the exam.

[ref]: https://drive.google.com/file/d/1mQApk9Ovdi-QVqMgnNcq5dZcWucUKoG-/view
[pairs]: 

### Agenda

- Booleans.
- Conditionals (i.e. `if`-statements).
- Iteration (i.e. `for`-loops).

**Note:** 
- We've finished introducing new DataFrame manipulation techniques. 
- The content we're covering today will become more relevant as we start to cover more ideas in statistics (next week).

## Booleans

- `bool` is a data type in Python, just like `int`, `float`, and `str`. 
    - It stands for "Boolean", named after George Boole, a 19th-century mathematician.
- There are only two possible Boolean values: `True` or `False`.
    - Yes or no.
    - On or off.
    - 1 or 0.
- There are three operators that allow us to perform arithmetic with Booleans – `not`, `and`, and `or`.
- Comparisons result in Boolean values.

In [2]:
x = True

In [3]:
type(x)

bool

In [4]:
3 > 5

False

### The `not` operator

Flips a `True` to a `False`, and a `False` to a `True`.

In [5]:
is_sunny = True

not is_sunny

False

### The `and` operator

- Placed between two `bool`s.
- `True` if **both** are true, otherwise `False`.

In [6]:
is_sunny = True
is_warm = False

is_sunny and is_warm

False

### The `or` operator

- Placed between two `bool`s.
- `True` if **at least one** of them is `True`, otherwise `False`.

In [7]:
is_sunny = True
is_warm = False

is_sunny or is_warm

True

In [8]:
# Both can be True as well!
True or True

True

### Comparisons and Boolean operators

- Remember, comparisons result in Boolean values.
- As usual, use **(parentheses)** to make expressions more clear.
    - By default, the order of operations is `not`, `and`, `or`.

In [9]:
first_name = 'king'
last_name = 'triton'
age = 19

first_name == 'triton' and age >= 21

False

In [10]:
last_name == 'triton' or (first_name == 'triton' and age >= 21)

True

In [11]:
# Different meaning!
(last_name == 'triton' or first_name == 'triton') and age >= 21

False

In [12]:
# Without parentheses, `and` is evaluated before `or`:
last_name == 'triton' or first_name == 'triton' and age >= 21

True

### Be careful!

In [13]:
a = True
b = False

not (a and b)

True

In [14]:
(not a) and (not b)

False
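
The two expressions above give different results because `not` does not simply distribute over `and`. By De Morgan's laws, `not (a and b)` is equivalent to `(not a) or (not b)`. As a quick check (a sketch that is not part of the original lecture, and that borrows the `for`-loop syntax introduced later on), the code below tries all four combinations of `a` and `b`. The name `rows` is only for illustration.

```python
# Each row holds: a, b, not (a and b), (not a) or (not b), (not a) and (not b)
rows = []
for a in [True, False]:
    for b in [True, False]:
        rows.append((a, b, not (a and b), (not a) or (not b), (not a) and (not b)))

for row in rows:
    print(row)
```

In every row the third and fourth entries match, while the fifth entry matches them only when `a` and `b` are equal.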

### Note: `&` and `|` vs. `and` and `or`

When performing Boolean arithmetic...
- Use the `&` and `|` operators between two Series. Arithmetic will be done in an elementwise fashion (i.e. separately for each row).
    - This is relevant when writing DataFrame queries, e.g. `df[(df.get('x') == 2) & (df.get('y') != 'ucsd')]`.
- Use the `and` and `or` operators between two Booleans.
    - e.g. `(x > 2) and (y != 'ucsd')`.
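
A minimal sketch of the difference, using NumPy arrays as a stand-in for Series (they behave the same way for this purpose). The arrays `x` and `y` hold made-up values and are not from the lecture.

```python
import numpy as np

# Hypothetical data: two columns' worth of values.
x = np.array([2, 2, 5])
y = np.array(['ucsd', 'ucla', 'ucsd'])

# `&` and `|` work elementwise: one Boolean per position.
both = (x == 2) & (y != 'ucsd')
either = (x == 2) | (y != 'ucsd')
print(both)
print(either)

# `and` expects a single Boolean on each side, so it raises an error on arrays.
try:
    (x == 2) and (y != 'ucsd')
    and_worked = True
except ValueError:
    and_worked = False
print(and_worked)
```

Note the parentheses around each comparison: `&` and `|` are evaluated before `==` and `!=`, so leaving them out changes the meaning.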

<div class="menti">
<div>

### Discussion Question

  
Suppose we define `a = True` and `b = True`. What does the following expression evaluate to?

```py
not (((not a) and b) or ((not b) or a))
```

A. `True`

B. `False`

C. Could be either one    

</div>
<div>

### To answer, go to **[menti.com](https://www.menti.com/v42ge81t5d)** and enter the code 2863 3386 or use this QR code:

![](images/menti-qr.png)
    
</div>
</div>


## Conditionals

### `if`-statements

- Often, we'll want to run a block of code only if a particular conditional expression is `True`.
- The syntax for this is as follows (don't forget the colon!):

```py
if <condition>:
    <body>
```
            
- Indentation matters!

In [15]:
is_sunny = True

if is_sunny:
    print('Wear sunglasses.')

Wear sunglasses.


### `else`

`else`: Do something else if the specified condition is `False`.

In [16]:
is_sunny = False

if is_sunny:
    print('Wear sunglasses.')
else:
    print('Stay inside.')

Stay inside.


### `elif`

- What if we want to check more than one condition? Use `elif`.
- `elif`: if the specified condition is `False`, check the next condition.
- If that condition is `False`, check the next condition, and so on, until we see a `True` condition.
    - After seeing a `True` condition, Python runs the indented code beneath it and skips all remaining branches.
- If none of the conditions are `True`, the `else` body is run.

In [17]:
is_raining = False
is_warm = True
is_sunny = True

if is_raining:
    print('Grab an umbrella.')
elif is_warm:
    print('Wear shorts.')
elif is_sunny:
    print('Wear sunglasses.')
else:
    print('All conditions are false!')

Wear shorts.


In [18]:
# What happens if you use `if` instead of `elif`?
is_raining = False
is_warm = True
is_sunny = True

if is_raining:
    print('Grab an umbrella.')
if is_warm:
    print('Wear shorts.')
if is_sunny:
    print('Wear sunglasses.')
else:
    print('All conditions are false!')

Wear shorts.
Wear sunglasses.
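
In the cell above, each `if` starts a separate conditional, so more than one body can run, and the `else` belongs only to the last `if` (the one that tests `is_sunny`). Here is a small sketch of the difference, using two hypothetical helper functions (`advice_elif` and `advice_if`, which are not part of the lecture) that collect messages in a list instead of printing them:

```python
def advice_elif(is_raining, is_warm, is_sunny):
    # One chain: at most one branch runs.
    messages = []
    if is_raining:
        messages.append('Grab an umbrella.')
    elif is_warm:
        messages.append('Wear shorts.')
    elif is_sunny:
        messages.append('Wear sunglasses.')
    else:
        messages.append('All conditions are false!')
    return messages


def advice_if(is_raining, is_warm, is_sunny):
    # Three separate conditionals: the else pairs only with the last if.
    messages = []
    if is_raining:
        messages.append('Grab an umbrella.')
    if is_warm:
        messages.append('Wear shorts.')
    if is_sunny:
        messages.append('Wear sunglasses.')
    else:
        messages.append('All conditions are false!')
    return messages


print(advice_elif(False, True, True))
print(advice_if(False, True, True))
print(advice_if(True, False, False))
```

The last call shows why the `if` version is misleading: it reports that all conditions are false even though `is_raining` is `True`.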


### Example: sign function

Below, complete the implementation of the function `sign`, which takes a single number (`x`) and returns `'positive'` if the number is positive, `'negative'` if the number is negative, and `'neither'` if it is neither.

In [19]:
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'neither'

In [20]:
sign(7)

'positive'

In [21]:
sign(-2)

'negative'

In [22]:
sign(0)

'neither'

### Example: percentage to letter grade

Below, complete the implementation of the function, `grade_converter`, which takes in a percentage grade (`grade`) and returns the corresponding letter grade, according to this table:

| Letter | Range |
| --- | --- |
| A | [90, 100] |
| B | [80, 90) |
| C | [70, 80) |
| D | [60, 70) |
| F | [0, 60) |

In [23]:
def grade_converter(grade):
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'

In [24]:
grade_converter(84)

'B'

In [25]:
grade_converter(55)

'F'
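
The order of the branches in `grade_converter` matters. Each `elif` is reached only when every earlier condition was `False`, so `grade >= 80` effectively means "at least 80 but below 90". The sketch below repeats the function so that it runs on its own, then checks the edges of each range in the table. The name `boundary_checks` is only for illustration.

```python
def grade_converter(grade):
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'


# Grades at or just below each cutoff, paired with the letter from the table.
boundary_checks = [(100, 'A'), (90, 'A'), (89.9, 'B'), (80, 'B'),
                   (79.9, 'C'), (70, 'C'), (69.9, 'D'), (60, 'D'),
                   (59.9, 'F'), (0, 'F')]

for grade, expected in boundary_checks:
    print(grade, grade_converter(grade), grade_converter(grade) == expected)
```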

<div class="menti">
<div>

### Discussion Question

```py

def mystery(a, b):
    if (a + b > 4) and (b > 0):
        return 'bear'
    elif (a * b >= 4) or (b < 0):
        return 'triton'
    else:
        return 'bruin'
```

What is returned when `mystery(2, 2)` is called?

A. `'bear'`

B. `'triton'`

C. `'bruin'`

D. More than one of the above
    
</div>
<div>

### To answer, go to **[menti.com](https://www.menti.com/v42ge81t5d)** and enter the code 2863 3386 or use this QR code:

![](images/menti-qr.png)
    
</div>
</div>

## Iteration

### `for`-loops

In [26]:
import time

print("Launching in...")

for x in [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]:
    print("t-minus", x)
    time.sleep(0.5) # Pauses for half a second
    
print("Blast off! 🚀")

Launching in...
t-minus 10
t-minus 9
t-minus 8
t-minus 7
t-minus 6
t-minus 5
t-minus 4
t-minus 3
t-minus 2
t-minus 1
Blast off! 🚀


### `for`-loops

- `for`-loops repeat specified code for every value in a sequence.
    - Lists and arrays are sequences.
    - "Iterate" means "repeat".
- `for`-loop syntax (don't forget the colon!):

```py
for <loop variable> in <sequence>:
    <body>
```

- Indentation matters!


### Example: squares

In [27]:
num = 4
print(num, 'squared is', num ** 2)

num = 2
print(num, 'squared is', num ** 2)

num = 1
print(num, 'squared is', num ** 2)

num = 3
print(num, 'squared is', num ** 2)

4 squared is 16
2 squared is 4
1 squared is 1
3 squared is 9


In [28]:
# The loop variable can have any name
list_of_numbers = [4, 2, 1, 3]
for num in list_of_numbers:
    print(num, 'squared is', num ** 2)

4 squared is 16
2 squared is 4
1 squared is 1
3 squared is 9


The line `print(num, 'squared is', num ** 2)` is run four times:
- On the first iteration, `num` is 4.
- On the second iteration, `num` is 2.
- On the third iteration, `num` is 1.
- On the fourth iteration, `num` is 3.

This happens even though there is no `num = ` anywhere: the `for`-loop assigns each value to `num` automatically.

### Example: colleges

In [29]:
colleges = np.array(['Revelle', 'John Muir', 'Thurgood Marshall', 
            'Earl Warren', 'Eleanor Roosevelt', 'Sixth', 'Seventh'])

In [30]:
for college in colleges:
    print(college + ' College')

Revelle College
John Muir College
Thurgood Marshall College
Earl Warren College
Eleanor Roosevelt College
Sixth College
Seventh College


### Ranges

- Recall, each element of a list/array has a numerical position.
    - The position of the first element is 0, the position of the second element is 1, etc.
- We can write a `for`-loop that accesses each element in an array by using its position.
- `np.arange` will come in handy.

In [31]:
colleges

array(['Revelle', 'John Muir', 'Thurgood Marshall', 'Earl Warren',
       'Eleanor Roosevelt', 'Sixth', 'Seventh'], dtype='<U17')

In [32]:
len(colleges)

7

In [33]:
np.arange(len(colleges))

array([0, 1, 2, 3, 4, 5, 6])

In [34]:
for i in np.arange(len(colleges)):
    print(i)

0
1
2
3
4
5
6


In [35]:
for i in np.arange(len(colleges)):
    print(colleges[i])

Revelle
John Muir
Thurgood Marshall
Earl Warren
Eleanor Roosevelt
Sixth
Seventh


### Building an array by iterating

- **Question: How many letters are in each college's name?**
- We can figure it out one college at a time, but we want to save our results!
- One idea:
    - Create an empty array.
    - For each college, figure out the length of its name, and store this length in the array that we created.
        - Use `np.append`, which returns a new array with an element appended (added) to the end; assign the result back to the same name.
    - At the end, the array we created will contain the length of each college's name.
- We will follow this pattern **very often** when generating data and running experiments or simulations. 
    - It is called the **accumulator pattern**, because the empty array that we created "accumulates" the values we want.

In [36]:
colleges

array(['Revelle', 'John Muir', 'Thurgood Marshall', 'Earl Warren',
       'Eleanor Roosevelt', 'Sixth', 'Seventh'], dtype='<U17')

In [37]:
# Creating an empty array to store our results
lengths = np.array([])

for college in colleges:
    # For each college, calculate the length of its name and add it to the lengths array
    lengths = np.append(lengths, len(college))
    
lengths

array([ 7.,  9., 17., 11., 17.,  5.,  7.])
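
The lengths above are displayed as floats (`7.`, `9.`, and so on) because `np.array([])` creates an empty array of floats by default. If integer lengths are preferred, one option (a sketch, not the approach used in lecture) is to give the empty array an integer data type. The name `int_lengths` is only for illustration.

```python
import numpy as np

colleges = np.array(['Revelle', 'John Muir', 'Thurgood Marshall',
                     'Earl Warren', 'Eleanor Roosevelt', 'Sixth', 'Seventh'])

# Start from an empty array of integers instead of the default floats.
int_lengths = np.array([], dtype=int)

for college in colleges:
    # np.append returns a new array, so assign it back to the same name.
    int_lengths = np.append(int_lengths, len(college))

print(int_lengths)
```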

In [38]:
# Notice how .apply works the same way!
df = bpd.DataFrame().assign(colleges=colleges)
df

|  | colleges |
| --- | --- |
| 0 | Revelle |
| 1 | John Muir |
| 2 | Thurgood Marshall |
| 3 | Earl Warren |
| 4 | Eleanor Roosevelt |
| 5 | Sixth |
| 6 | Seventh |


In [39]:
df.assign(length=df.get('colleges').apply(len))

|  | colleges | length |
| --- | --- | --- |
| 0 | Revelle | 7 |
| 1 | John Muir | 9 |
| 2 | Thurgood Marshall | 17 |
| 3 | Earl Warren | 11 |
| 4 | Eleanor Roosevelt | 17 |
| 5 | Sixth | 5 |
| 6 | Seventh | 7 |


### Working with strings

Strings are sequences, so we can iterate over them, too!

In [40]:
for letter in 'uc san diego':
    print(letter.upper())

U
C
 
S
A
N
 
D
I
E
G
O


In [41]:
'california'.count('a')

2

### Example: vowel count

Below, complete the implementation of the function `vowel_count`, which returns the number of vowels in the input string `s` (including repeats). Example behavior is shown below.

```py
>>> vowel_count('king triton')
3

>>> vowel_count('i go to uc san diego')
8
```

In [42]:
def vowel_count(s):
    # We need to keep track of the number of vowels seen so far somewhere
    number = 0
    # For each of the 5 vowels
    for vowel in 'aeiou':
        # Count the number of occurrences of this vowel in s
        vowel_occurrences = s.count(vowel)
        # Add this count to the variable number
        number = number + vowel_occurrences

    return number

### Reflecting on the previous example

- The implementation of `vowel_count` used the accumulator pattern.
- More generally: if we want to keep track of the number of times something occurred, we can initialize a variable to be 0 and add to it in our `for`-loop.
    - See Question 4 in Lab 4.
- The vast majority of `for`-loops you write in DSC 10 will be similar to this one.
    - Do **not** use `for`-loops to perform mathematical operations on every element of an array or Series; use built-in array/Series methods for that.
    - We will see **lots** of `for`-loops in the second half of the quarter.
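
Two short sketches of the advice above, using made-up data (the array `temperatures` and the other names are hypothetical). The first counts occurrences with a variable that starts at 0. The second does an elementwise calculation with array arithmetic rather than a loop.

```python
import numpy as np

# Hypothetical daily high temperatures, in degrees Fahrenheit.
temperatures = np.array([68, 75, 81, 79, 90])

# Counting: start at 0 and add 1 each time the condition holds.
warm_days = 0
for temp in temperatures:
    if temp >= 80:
        warm_days = warm_days + 1
print(warm_days)

# Elementwise math: array arithmetic handles every element at once.
celsius = (temperatures - 32) * 5 / 9
print(celsius)
```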

## Summary

### Summary

- The `bool` data type has two possible values: `True` and `False`.
- The Boolean operators, `not`, `and`, and `or`, allow us to make expressions that involve multiple Booleans.
- `if`-statements allow us to run pieces of code depending on whether certain conditions are `True`.
- `for`-loops are used to repeat the execution of code for every element of a sequence.
    - Lists, arrays, and strings are examples of sequences.
- **Next time**: Probability.