# Week 6 Notes: Control Flow

In past weeks, the Python interpreter has executed every single one of our lines once, in the order that they appear in our code.  This week, we start to control which lines of code are run under what conditions, and to execute our code many times over.  Read more at [Wikipedia](https://en.wikipedia.org/wiki/Control_flow).  

## Part 1: Boolean data and Boolean logic

Before we start, we need one more piece of the puzzle: a new data type called a Boolean data type or, in Python, `bool`. [Boolean algebra](https://en.wikipedia.org/wiki/Boolean_algebra) is the math of true and false, logical values and operations. You probably learned some in grade school. In Python, a Boolean can be `True` or `False`.  

A Boolean expression is one that evaluates to `True` or `False`, often using Boolean operators. Those include `and`, `or`, and `not`.
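
As a minimal sketch of how the three operators behave, the cell below stores a few results in variables. The variable names are placeholders chosen for this illustration.

```python
both = True and False      # `and` is True only when both sides are True
either = True or False     # `or` is True when at least one side is True
flipped = not True         # `not` flips a Boolean

# Without parentheses, `not` is applied first, then `and`, then `or`,
# so this reads as: True or (False and (not True))
mixed = True or False and not True

print(both, either, flipped, mixed)   # False True False True
```

When in doubt about the order, parentheses make the intent explicit.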

Experiment with combining `bool` values and these three new operators below:

False


Another way to use `bool`s is to evaluate comparisons, such as inequalities. From the [Python docs](https://docs.python.org/3/library/stdtypes.html#comparisons):

| Operation | Meaning |
| --------- | ------- |
| `<`      | strictly less than |
| `<=`     | less than or equal |
| `>`      | strictly greater than |
| `>=`     | greater than or equal |
| `==`     | equal |
| `!=`     | not equal |
| `is`     | object identity |
| `is not` | negated object identity |
| `in`     | membership: is an element of a list or other collection |

We can use these to create <ins>Boolean expressions</ins>, or bits of code that evaluate to `True` or `False`.  Try a few below using familiar numbers:

In [30]:
2 + 2 < 4

False
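
The last three rows of the table are easy to mix up with `==`. Here is a small sketch, using made-up lists `a`, `b`, and `c`, of how equality, identity, and membership differ:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a second list that happens to hold the same values
c = a           # `c` is just another name for the same list object as `a`

same_values = a == b     # True: the contents are equal
same_object = a is b     # False: they are two separate objects
alias_check = a is c     # True: both names point at one object
has_two = 2 in a         # True: 2 is an element of the list
lacks_nine = 9 not in a  # True: 9 is not an element of the list

print(same_values, same_object, alias_check, has_two, lacks_nine)
```

For comparing numbers, `==` is almost always the operator you want; `is` asks a different question.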

Now combine some inequalities along with `and`, `or`, and `not` to create some more complicated Boolean expressions:

In [35]:
(1 > 3) and (2 == 2)

False

Our last new operator for the day is a mathematical operator. There's no need for NumPy: it's one of the arithmetic operators built into Python. The modulo operator, `%`, gives you the remainder after division. So `9 % 3` is 0 because there is no remainder after 9 is divided by 3, but `10 % 3` is 1 because the remainder after dividing 10 by 3 is 1.
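
A quick sketch of the modulo operator next to its partner, floor division (`//`), which is not covered in these notes but makes the remainder easier to check:

```python
print(9 % 3)    # 0
print(10 % 3)   # 1

quotient = 10 // 3    # floor division: how many whole 3s fit in 10
remainder = 10 % 3    # what is left over

# The two pieces reassemble the original number.
rebuilt = quotient * 3 + remainder
print(quotient, remainder, rebuilt)   # 3 1 10

# A common use: testing whether a number is even.
ten_is_even = (10 % 2) == 0
print(ten_is_even)    # True
```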

**Example:** The Summer Olympics are held every four years, and the most recent, in Paris, was in 2024. Write a Boolean expression that evaluates to `True` if the year you type is a Summer Olympics year, and `False` if it is not.  If you have time, write another for the Winter Olympics!  If we have time, we'll talk about "truthy" and "falsey".

In [41]:
year = 2734

is_summer_olympics = (year % 4) == 0
is_winter_olympics = (year % 4) == 2

0

In [50]:
not (     12    %    4   )

True
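
The expression `not (12 % 4)` evaluates to `True` because Python treats the number 0 as "falsey" and any other number as "truthy". The built-in `bool()` function shows how a value will be interpreted. This sketch uses a few arbitrary example values:

```python
print(bool(0), bool(12))         # False True
print(bool(""), bool("rock"))    # False True  (empty string vs. non-empty)
print(bool([]), bool([0]))       # False True  (empty list vs. non-empty)

remainder = 12 % 4
divisible_implicit = not remainder     # relies on 0 being falsey
divisible_explicit = remainder == 0    # states the comparison outright

print(divisible_implicit, divisible_explicit)   # True True
```

Both forms give the same answer here; the explicit comparison is usually easier for a reader to follow.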

## Part 2: The `if` statement

Here it is, our first control structure. We'll let the program decide whether to evaluate some code, conditional on a Boolean expression.  For an in-depth `if` statement tutorial, see [RealPython](https://realpython.com/python-conditional-statements/).  

An `if` statement looks like

```{python}
if <Boolean expression>:
    <some code>
```

There are five parts and all of them are required:
1. The `if` keyword
2. The Boolean expression, which evaluates to `True` or `False`
3. The colon before the line break
4. The indent after the line break
5. The code you want to evaluate if the Boolean expression is `True`. Your code doesn't include the <>.

Python doesn't care how many spaces you put between the numbers and the plus sign in `1 + 1` but it really cares that you indent the code that you want to evaluate if the condition is `True`.  Four spaces is standard. When you hit the `<tab>` key, most code editors will default to four spaces or let you choose what gets inserted.  

### Exercise 2.1: 

According to the trusty [GSA Time Scale](https://rock.geosociety.org/net/documents/gsa/timescale/timescl.pdf?v=2022) and our best available U-Pb geochronology, the Cenozoic Era started 66.02 million years ago (abbreviated Ma). Write a line of code that assigns a number like 12 to the variable `age_in_Ma`. Then write an `if` statement that prints "That age is in the Cenozoic!" if the age is in the Cenozoic Era.  

In [58]:
age_in_Ma = 66

if age_in_Ma < 66.02:
    print("that's in the Cenozoic!")
    print("that's recent!")
    print("compared to the Archean")

print("that was easy!")

that's in the Cenozoic!
that's recent!
compared to the Archean
that was easy!


### Part 2.2 `else:`

Now we'd like to print out some more information about our age, like for instance whether the `age_in_Ma` is before the Cenozoic. Instead of writing a new `if` statement with a new Boolean expression, we can just use the one we already made and tack on an `else` statement.  The new structure should look like:

```{python}
if <Boolean expression>:
    <some code if expression is True>
else:
    <some code if expression is False>
```

Copy and paste your time scale code below and add an `else:` that prints "That age is before the Cenozoic." if the age is pre-Cenozoic.

In [60]:
age_in_Ma = 100

if age_in_Ma < 66.02:
    print("That's in the Cenozoic!")
else:
    print("That's before the Cenozoic")


That's before the Cenozoic


### Part 2.3: `elif:`

That's more informative! Now we want to add more eras to our code. To do that, instead of adding another `if` statement, we'll tack onto the one we already have, since we are still doing just one thing: figuring out what era our age belongs to.  

But! We do need another condition that includes the age of the Permo-Triassic boundary, which marks the end of the Paleozoic Era and the start of the Mesozoic Era.  We can add an additional condition to our `if` statement using `elif`, short for "else if".  

```{python}
if <Boolean expression>:
    <some code if expression is True>
elif <if first Boolean is false, evaluate this next>:
    <and run this code if the second condition is True>
else:
    <some code if all above expressions are False>
```

You can stack as many `elif` statements as you want. They all get tested in sequence, and the first one that is `True` runs the code indented underneath it.  I think of this like a [coin sorter](https://youtu.be/ykvUE8Ad8Ls?feature=shared&t=662) where the series of `if` and `elif` statements are a series of holes and the indented code blocks underneath them are the bins that the coins fall into.
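
The coin sorter idea can be sketched directly in code. The diameter cutoffs below are illustrative values picked to fall between common US coin sizes, not official specifications, and `diameter_mm` is a hypothetical measurement.

```python
diameter_mm = 17.9   # hypothetical measured coin diameter

# 17.9 passes all three tests below, but only the FIRST
# True condition runs its block; the rest are skipped.
if diameter_mm < 18.5:
    coin = "dime"
elif diameter_mm < 20.0:
    coin = "penny"
elif diameter_mm < 23.0:
    coin = "nickel"
else:
    coin = "quarter"

print(coin)   # dime
```

Because the tests run in order, the cutoffs need to go from smallest to largest; putting `diameter_mm < 23.0` first would sort every small coin into the "nickel" bin.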

**Exercise 2.3** Copy and paste your time scale code from above, and modify it so that it tells you if `age_in_Ma` is in the "Cenozoic", "Mesozoic", or "Paleozoic or earlier".  Bonus: add the Proterozoic, Archean, and Hadean. No matter what the value of `age_in_Ma` is, have the computer print "You're in the present. Have a good day!" afterwards.


In [73]:
age_in_Ma = 100

if age_in_Ma < 66.02:
    print("That's in the Cenozoic!")
elif age_in_Ma < 252:
    print("That's in the Mesozoic!")
    print("Time of the dinosaurs!!!")
    print("and mosquitos")
elif age_in_Ma < 541:
    print("That's in the Paleozoic!")
else:
    print("That's before the Paleozoic")

print("You're in the present. Have a good day!")

number = 8
if number % 2 == 0: print("even")



That's in the Mesozoic!
Time of the dinosaurs!!!
and mosquitos
You're in the present. Have a good day!
even


Further reading, from RealPython:

- [One-line `if` statements](https://realpython.com/python-conditional-statements/#one-line-if-statements)
- [Conditional expressions](https://realpython.com/python-conditional-statements/#conditional-expressions-pythons-ternary-operator)
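
As a brief taste of those two topics, here is a sketch using an arbitrary example value for `number`. The one-line `if` is the same form used in the cell above; the conditional expression picks between two values in a single line.

```python
number = 7

# One-line if: the block sits on the same line as the condition.
if number % 2 == 0: print("even")   # prints nothing, because 7 is odd

# Conditional expression: <value if True> if <condition> else <value if False>
parity = "even" if number % 2 == 0 else "odd"
print(number, "is", parity)   # 7 is odd
```

Both are shortcuts; the indented multi-line form is usually easier to read and extend.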

## Part 3: `for` statements

This is the way to repeat a block of code many times -- as many as you like! It's one of several kinds of loops you find in programming languages, and it's probably the most frequently used in scientific programming.  If you know this one, you're covered for most everyday tasks.  

How you set up the for loop depends mostly on what you want to do inside the for loop.  We'll learn a basic template and then see how you can skip a few steps using some Python powers.  

### 3.1 The most basic `for` loop

A basic `for` loop executes a block of code a pre-arranged number of times.  We're going to use NumPy arrays to make the for loop workings transparent, then we'll introduce some new tricks.

If you want to repeat a process 10 times, you might use code that looks something like:

```{python}
loop_count = 10
loop_indices = np.arange(loop_count)

for index in loop_indices:
    print("This is loop ", index)
```

Copy and paste the code above into the cell below, then let's talk through what happens at each step in the loop.


In [4]:
import numpy as np

loop_count = 10
loop_indices = np.arange(loop_count)

for index in loop_indices:
    print("This is loop ", index)

This is loop  0
This is loop  1
This is loop  2
This is loop  3
This is loop  4
This is loop  5
This is loop  6
This is loop  7
This is loop  8
This is loop  9


In [12]:
loop_count = 10
loop_indices = np.arange(loop_count)
loop_indices

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

There were six ingredients to writing this for loop:
1. The `for` keyword
2. A brand new variable that gets created by the `for` structure. In this case, it's called `index`.
3. The keyword `in`
4. The *iterable* or list/collection of things to iterate over. We'll do the loop once for each item in this list/collection. In our example, we used a NumPy array, and we'll explore other iterables below.
5. The colon. Non-optional!
6. An indented code block that gets iterated.  Like the `if` structure, everything indented to the same level is included in the block.

Here's an example of using a list as an iterable:  The first names in our class, input as a Python list, are 
```{python}
class_names = ["Charles", "Baris", "Caden", "Laiken", "Elijah", "Megan", "Dan", "Brooke", "Lily", "Yakov"]
```
Make a for loop that says "Hello!" to everyone in class by name.

In [36]:
class_names = ["Charles", "Baris", "Caden", "Laiken", "Elijah", "Megan", "Dan", "Brooke", "Lily", "Yakov"]

loop_count = len(class_names)  # 10 names; not needed when looping over the list itself
loop_indices = class_names

for name in loop_indices:
    print("Hey ", name, "!")

len(class_names)

Hey  Charles !
Hey  Baris !
Hey  Caden !
Hey  Laiken !
Hey  Elijah !
Hey  Megan !
Hey  Dan !
Hey  Brooke !
Hey  Lily !
Hey  Yakov !


10

Instead of creating a whole NumPy array with a value for each number you want to iterate over, you can use a new data type called a range. These are especially handy in `for` loops. The syntax is 
- `range(stop)` (start at 0, stop just before `stop`, increment by 1)
- `range(start, stop)` (start at `start`, stop just before `stop`, increment by 1)
- `range(start, stop, step)` (start at `start`, stop just before `stop`, increment by `step`)

You can see how long a range is with the `len()` function, and see its `max()` and `min()` values, but `print()` won't display all of its elements at once the way it does for a NumPy array; it shows only the rule, such as `range(0, 10)`.  All those values don't exist in memory at once: the object is just the rule for creating them via `start`, `stop`, and `step`.
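
To inspect what a range will produce, one option is to convert it to a list with the built-in `list()` function. The sketch below tries each of the three call forms; the specific numbers are arbitrary examples.

```python
evens = range(0, 21, 2)

print(evens)                                # range(0, 21, 2)
print(len(evens), min(evens), max(evens))   # 11 0 20

# list() forces every value to be generated so we can look at them.
print(list(range(5)))           # [0, 1, 2, 3, 4]
print(list(range(2, 6)))        # [2, 3, 4, 5]
print(list(range(10, 0, -3)))   # [10, 7, 4, 1]  (a negative step counts down)
```

Note that `stop` itself is never included, which is why `range(0, 21, 2)` is needed to reach 20.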

Here's another loop that runs 10 times:

```{python}
for index in range(10):
    print("This is loop ", index)
```

Exercise: Make a loop that prints out all the even numbers from 0 to 20, inclusive. Bonus: code this in as many ways as you can think of.

In [40]:
for index in range(10):
    print("This is loop ", index)

This is loop  0
This is loop  1
This is loop  2
This is loop  3
This is loop  4
This is loop  5
This is loop  6
This is loop  7
This is loop  8
This is loop  9


In [42]:
for number in range(0, 21, 2):
    print("evens", number)

evens 0
evens 2
evens 4
evens 6
evens 8
evens 10
evens 12
evens 14
evens 16
evens 18
evens 20
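
For the bonus, here are two other possible approaches, sketched under the assumption that you have seen the list method `append()`. The results are collected into lists (names chosen for this example) so the two methods can be compared.

```python
# Way 2: visit every integer and keep the ones with no remainder.
evens_by_modulo = []
for number in range(21):
    if number % 2 == 0:
        evens_by_modulo.append(number)

# Way 3: count from 0 to 10 and double each value.
evens_by_doubling = []
for half in range(11):
    evens_by_doubling.append(2 * half)

print(evens_by_modulo)
print(evens_by_doubling)
print(evens_by_modulo == evens_by_doubling)   # True
```

Way 2 combines this week's two tools, an `if` statement nested inside a `for` loop.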


### Part 3.2: Doing math in loops

Loops are particularly helpful for doing math by brute force.  Say we want to find the cumulative sum of the integers from 1 to $n$, for instance to track the number of food items the very hungry caterpillar has eaten. The closed-form formula is $n(n+1)/2$, but suppose you forgot it. You can use a `for` loop to help you out.

Example: write a `for` loop to evaluate the cumulative sum of the integers from 1 to $n$.  Test that your code works using the formula above.

In [64]:
sum_up_to_integer = 12

# calculate the cumulative sum of integers from 1 to sum_up_to_integer

my_sum = 0

for integer in range(1, sum_up_to_integer + 1):
    my_sum = my_sum + integer

print("my sum =", my_sum)

my sum = 78


In [2]:
# alternative method using the '+=' operator
sum_up_to_integer = 12

# calculate the cumulative sum of integers from 1 to sum_up_to_integer

my_sum = 0

for integer in range(1, sum_up_to_integer + 1):
    my_sum += integer

print("my sum =", my_sum)

my sum = 78


In [84]:
#check with the formula that was given

formula_sum = (sum_up_to_integer * (sum_up_to_integer + 1) / 2)
print("formula sum =", formula_sum)

formula sum = 78.0
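
The formula check above prints `78.0` rather than `78` because `/` always returns a float. A small variation, sketched here with the shorter name `n`, uses floor division (`//`) to stay in integers and lets Python do the comparison. The built-in `sum()` function offers a third check.

```python
n = 12

loop_sum = 0
for integer in range(1, n + 1):
    loop_sum += integer

# n * (n + 1) is always even, so // loses nothing here.
formula_sum = n * (n + 1) // 2

# Python's built-in sum() adds up everything in an iterable.
builtin_sum = sum(range(1, n + 1))

print(loop_sum, formula_sum, builtin_sum)           # 78 78 78
print(loop_sum == formula_sum == builtin_sum)       # True
```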


### For loop no-nos!

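When you're making a for loop, there are a couple things to avoid.
1. Modifying the loop index. Use it but don't change it! Reassigning it inside the block has no effect on the next iteration, so it only makes the code harder to follow.
2. Modifying the iterable itself. This can be done carefully, but Python gives no warning when a list changes mid-loop and may silently skip items (dictionaries and sets raise a `RuntimeError` if their size changes) -- it's easier to find a way not to do this.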
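
Here is a sketch of what each no-no looks like in practice, using small made-up lists. The last loop shows one safer pattern: build a new list instead of editing the one being looped over.

```python
# No-no 1: reassigning the loop variable does not change the next iteration.
seen = []
for index in range(3):
    seen.append(index)
    index = 100          # overwritten again at the top of the next pass
print(seen)              # [0, 1, 2]

# No-no 2: removing items from a list while looping over it skips elements.
numbers = [1, 2, 2, 3]
for value in numbers:
    if value == 2:
        numbers.remove(value)
print(numbers)           # [1, 2, 3]  (one of the 2s survived!)

# Safer: loop over the original and collect what you want to keep.
kept = []
for value in [1, 2, 2, 3]:
    if value != 2:
        kept.append(value)
print(kept)              # [1, 3]
```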

### Part 3.3 Using the loop index to index into an array or list

That loop index is really useful.  For instance, you might have an array called `square_side_length` and you'd like to calculate a new array called `square_areas` by squaring each value in your original array.  You could do that (awkwardly) with a for loop like so

```{python}
square_side_length = np.array([3, 4, 8, 2, 3, 2.2, 9])
n_squares = len(square_side_length)
square_areas = np.zeros(n_squares)

for square_index in range(n_squares):
    square_areas[square_index] = square_side_length[square_index] ** 2
```

This syntax looks awkward but it's quite flexible. It's also close to the syntax that many non-Python languages use to create `for` loops.  In general, we'll try to avoid this altogether. For instance, NumPy arrays let us shorten this to 
```{python}
square_areas = square_side_length ** 2
```
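
When the data live in a plain Python list rather than a NumPy array, the index can often be avoided as well, by looping over the values directly. This is a sketch under that assumption; the loop variable name `side` is a placeholder chosen here.

```python
square_side_length = [3, 4, 8, 2, 3, 2.2, 9]   # a plain list this time

square_areas = []
for side in square_side_length:
    # `side` is the value itself, so no indexing is needed
    square_areas.append(side ** 2)

print(square_areas)
```

The index-based version is still the one to reach for when a step needs a neighboring element, such as `values[i - 1]`.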

### Exercise 3.4

The Fibonacci numbers are a recursively defined sequence where each new number is calculated as the sum of the previous two.  The sequence starts as $[0, \,1, \,\ldots]$ and continues.  Write a `for` loop to calculate the $n$th Fibonacci number, where $n$ is input on the first line of the code.  Bonus: provide an array of all the Fibonacci numbers up to and including the $n$th.

In [129]:
n_numbers = 12

# create an array that we can add fibonacci numbers to
fib_seq = np.zeros(n_numbers + 1)

# fill the first two spots of that array with the starting values 0 and 1
fib_seq[0] = 0
fib_seq[1] = 1

for fib_index in range(2, n_numbers + 1):
    fib_seq[fib_index] = fib_seq[fib_index - 1] + fib_seq[fib_index - 2]
    print(fib_seq[fib_index])

print("The 12th Fibonacci number is", fib_seq[n_numbers])

1.0
2.0
3.0
5.0
8.0
13.0
21.0
34.0
55.0
89.0
144.0
The 12th Fibonacci number is 144.0
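
If only the $n$th number is needed, the array can be skipped. The sketch below is one alternative, not the method used in class: it keeps just the two most recent values and uses Python's tuple assignment to update both at once. Working with plain integers also avoids the trailing `.0` seen above.

```python
n = 12

previous, current = 0, 1    # the 0th and 1st Fibonacci numbers

for step in range(n):
    # The right-hand side is evaluated first, then both names are updated together.
    previous, current = current, previous + current

nth_fibonacci = previous
print("Fibonacci number", n, "is", nth_fibonacci)   # Fibonacci number 12 is 144
```

This also gives the right answer for `n = 0` and `n = 1`, where the loop runs zero times or once.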


### Part 3.5: Using `enumerate()`

Sometimes you want to loop over a list or a pandas Series but you really need that loop index to help out.  You can do that with Python's `enumerate()` function.  Going back to our class roster, if we wanted to count off the students (starting at 0), we could use something like

```{python}
class_names = ["Charles", "Baris", "Caden", "Laiken", "Elijah", "Megan", "Dan", "Brooke", "Lily", "Yakov"]
for index, name in enumerate(class_names):
    print(name, "is student number", index)
```

Now we have two variables, `index` and `name`, that change on each loop iteration!
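
To count off starting at 1 instead of 0, `enumerate()` accepts an optional `start` argument. A short sketch, with `number` as an arbitrary name for the counter:

```python
class_names = ["Charles", "Baris", "Caden", "Laiken", "Elijah", "Megan", "Dan", "Brooke", "Lily", "Yakov"]

for number, name in enumerate(class_names, start=1):
    print(name, "is student number", number)
```

Only the counter changes; the names still come out in list order, from Charles (1) to Yakov (10).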

**Exercise:** If the final grades for the class are (randomly) `class_grades = [99.1, 99.3, 99.7, 99.2, 99.3, 99.9, 99.2, 99.5, 99.8, 99.2]` then print the final grades next to each name on the class roster. Congratulations on your grades!!
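
One possible approach, sketched two ways. The first uses `enumerate()` so the loop index can pick out the matching grade. The second uses Python's built-in `zip()`, which is not covered in these notes and pairs up two lists element by element.

```python
class_names = ["Charles", "Baris", "Caden", "Laiken", "Elijah", "Megan", "Dan", "Brooke", "Lily", "Yakov"]
class_grades = [99.1, 99.3, 99.7, 99.2, 99.3, 99.9, 99.2, 99.5, 99.8, 99.2]

# Option 1: use the index from enumerate() to look up the grade.
for index, name in enumerate(class_names):
    print(name, class_grades[index])

# Option 2: zip() hands back one (name, grade) pair per iteration.
for name, grade in zip(class_names, class_grades):
    print(name, grade)
```

Both options assume the two lists are the same length and in the same order.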

