# Iteration Review

## What are loops and why should we care about them?

Loops are also called _iterations_:

> it·er·ate &ndash; _To say or perform again; repeat._

## Loops are a fundamental control structure

- Loops are a fundamental building block of computational solutions to problems.

- They are an example of a **control structure**. 

- Conditionals are another example of control structures.

- Control structures allow you to control how you do things:
    + Conditionals control *when / whether*
    + Loops (aka iterations) control *how many times*

It's hard to build programs without a concise way to instruct the computer to do *repeated actions*.

Here are some simple examples. Try to think of how you might solve these without loops!

- Put 6 cups of flour into this box

- Stir occasionally until the sauce starts to reduce (user presses `0`)

**Ex.** Put 6 cups of flour into this box

_Without loops:_

In [None]:
def scoop_into_box():
    print("Scooping!")

scoop_into_box()
scoop_into_box()
scoop_into_box()
scoop_into_box()
scoop_into_box()
scoop_into_box()

**Ex.** Put 6 cups of flour into this box

_With loops:_

In [None]:
num_cups = 6
for cup in range(num_cups):
    scoop_into_box()

**Ex.** Stir occasionally until the sauce starts to reduce (user presses `0`)

_Without loops:_

In [None]:
def stir():
    print("stirring")

def check_sauce():
    user_input = input("Checking sauce (0 = reduced): ")
    if user_input == "0":
        return "OK"
    else:
        return "thick"

def serve():
    print("sauce is ready")
    
if check_sauce() == "thick":
    stir()
    if check_sauce() == "thick":
        stir()
    else:
        serve()
else:
    serve()

_With loops:_

In [None]:
thickness = check_sauce()
while thickness == "thick":
    stir()
    thickness = check_sauce()
serve()

With loops these get a LOT easier to specify, and become more robust and reusable too.

Loops also enable many useful algorithms/patterns that go nicely with lists. 

You'll be practicing and applying them in PCEs and Projects this module!

For example:
- Searching through a list
- Filtering a list of items
- Counting occurrences in some collection


---

## Two fundamental kinds of loops: definite and indefinite


### Definite loops (`for` loops)

Quite often we have a list of items or the lines in a file &ndash; effectively a finite set of things. 

We can write a loop to do some operation once for each of the items in a set using the Python `for` construct.

These loops are called “*definite loops*” because they execute an exact number of times.

We say that “definite loops iterate through the members of a set.”

#### Rule of Thumb (definite loops)

Use definite / `for` when you know in advance how many times you want to do something.

This is the use case in our running example.

Other examples:
- Do an action $N$ times
- Take $M$ steps
- Do something for every item in a finite list of length $L$
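As a rough sketch of the "lines in a file" case, the cell below loops once per line. The `text` string is a made-up stand-in for a file's contents (an assumption, so the example runs without a real file), and `line_count` is only there to show that the body runs exactly once per line.

```python
# Hypothetical stand-in for the contents of a file
text = "first line\nsecond line\nthird line"

# splitlines() turns the text into a finite list of lines
lines = text.splitlines()

line_count = 0
for line in lines:
    # the body runs exactly once for each line
    print(len(line), line)
    line_count += 1

print("Number of lines:", line_count)
```

We know the number of repetitions before the loop starts (it is `len(lines)`), which is what makes this a definite loop.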

### Indefinite loops (`while` loops)

Sometimes you want to repeat actions, but you don't know in advance how many times you want to repeat. 

But you do have a *stopping condition*. (You know when you should stop.) 

In this situation, you can use indefinite loops, which are called so because they keep going until a logical condition becomes `False`.

Examples:
- Keep going until I tell you to stop
- Keep stirring until the sauce thickens
- Keep taking candy from the box until your bucket is full or the box is empty
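The candy example can be sketched like this. The starting numbers are made up for illustration. Notice that the loop keeps going only while *both* things are true: the bucket still has room and the box still has candy.

```python
# Hypothetical starting values, chosen only for illustration
candies_in_box = 7
bucket_capacity = 5
candies_in_bucket = 0

# keep going while the bucket has room AND the box is not empty
while candies_in_bucket < bucket_capacity and candies_in_box > 0:
    candies_in_box -= 1      # take one candy out of the box
    candies_in_bucket += 1   # and put it in the bucket

print("In bucket:", candies_in_bucket)
print("Left in box:", candies_in_box)
```

With these numbers the bucket fills up first. Try a smaller `candies_in_box` to see the other condition end the loop instead.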

#### Rule of Thumb (indefinite loops)

Use indefinite / `while` when you don't know in advance how many times you want to do something, but do have a stopping condition you can clearly express.


---

## Anatomy of a definite (`for`) loop in Python  

- The **iteration variable** "iterates" through the **sequence** (ordered set)

- The **block (body)** of code is executed once for each value **in** the **sequence**

- The **iteration variable** moves through all of the values in the **sequence**

In [None]:
nums = [5, 4, 3, 2, 1]

# here, i is the iteration variable
for i in nums: 
    # block/body
    print("taking something from the list")
    print(i) 

The iteration variable is a *variable*, so you can name it whatever you like, subject to Python's naming rules and, of course, our heuristic of choosing names that make the logic of the program legible.

In [None]:
nums = [5, 4, 3, 2, 1]

# here, num is the iteration variable
for num in nums: 
    # block/body
    new_num = num * 20
    print(new_num) 

In [None]:
nums = [5, 4, 3, 2, 1]

# here, num is the iteration variable
for num in nums:
    # begin for loop block/body
    if num % 2 == 0: # check if even
        # begin if branch
        print(num) 

Since lists can hold any type of data, the iteration variable can take on a value of any type as it iterates through a list.

In [None]:
for name in ["john", "terrell", "qian", "malala"]:
    print(name)


---

### Builtin function: `range()`

The `range()` function produces an iterable sequence of numbers (a `range` object).

It can be used in three ways:

1. `range(n)` starts at `0` and stops at `n - 1`  

2. `range(m, n)` starts at `m` and stops at `n - 1` (with `n > m`)

3. `range(m, n, s)` starts at `m` and counts in steps of `s`, stopping before it reaches `n` (when omitted, `s` defaults to `1`)

To convert the range into a list, use the `list()` type converter.

Source: https://docs.python.org/3/library/functions.html
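Here is a small sketch of the three forms, using `list()` so we can see the numbers each one produces. The variable names are just for illustration. The last line uses a negative step, which counts down.

```python
first_five = list(range(5))           # one argument: start at 0
two_to_five = list(range(2, 6))       # two arguments: start at 2, stop before 6
every_third = list(range(0, 10, 3))   # three arguments: count in steps of 3
countdown = list(range(5, 0, -1))     # a negative step counts down

print(first_five)    # [0, 1, 2, 3, 4]
print(two_to_five)   # [2, 3, 4, 5]
print(every_third)   # [0, 3, 6, 9]
print(countdown)     # [5, 4, 3, 2, 1]
```

In every case the stop value itself is never included.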

In [None]:
range(5)

Use this when you want to do something N times. Here, for example, we take a step 5 times:

In [None]:
for i in range(5):
    print("I has the value", i)
    print("Taking a step")

Without `range`, you need to specify the list explicitly:

In [None]:
for i in [0, 1, 2, 3, 4]:
    print("I has the value", i)
    print("Taking a step")


---

## Anatomy of an indefinite (`while`) loop in Python

- The **stopping condition** defines when the loop will stop and go to the next block of code
  - It's composed of a *Boolean expression*
  - Eventually, the Boolean expression should become `False`!

- The **block (body)** of code is executed once for each iteration in the loop

- **Stopping condition update**: It is _essential_ that the body of the loop has some operation in it that modifies what is checked in the stopping condition.

In [None]:
n = 5
while n > 0:  # STOPPING CONDITION
    # body of the loop
    print(n)
    n = n - 1  # STOPPING CONDITION UPDATE
print("Blast off!")

__Ex.__ keep taking steps until you hit a limit:

In [None]:
steps = 0
limit = 20

while steps < limit: #STOPPING CONDITION
    print("Taking a step", steps)
    steps += 1 # STOPPING CONDITION UPDATE
print("Done!")

The stopping condition can use complex boolean expressions.

__Ex.__ The user must enter their guess of a number using the `input()` function. Keep trying until the number entered is the correct one, or until the user enters `"exit"`.

In [None]:
number = 5  # the number to guess

guess = input("Try to guess the number between 1 and 10, or say `exit` to quit: ")
found = False

while guess != "exit" and not found:
    if int(guess) == number:
        print("You got it!")
        found = True
    else:
        guess = input("Try to guess the number between 1 and 10, or say `exit` to quit: ")

if guess == "exit":
    print("Exiting.")
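One thing to watch for in the guessing game: `int(guess)` raises a `ValueError` if the user types something that isn't a number (other than `exit`). Below is a minimal sketch of one way to guard against that using the string method `.isdigit()`. To keep the cell runnable without typing, the list `scripted_guesses` is a made-up stand-in for repeated `input()` calls, and `position` tracks which answer comes next. Note that `.isdigit()` is a simple check: it also rejects negative numbers and decimals.

```python
number = 5  # the number to guess

# Hypothetical scripted answers standing in for repeated input() calls
scripted_guesses = ["seven", "3", "5", "exit"]

found = False
position = 0  # index of the next scripted answer

while position < len(scripted_guesses) and not found:
    guess = scripted_guesses[position]
    position += 1  # stopping condition update
    if guess == "exit":
        break
    elif not guess.isdigit():
        # not a whole number, so don't call int() on it
        print("Please enter a whole number, not", guess)
    elif int(guess) == number:
        print("You got it!")
        found = True
    else:
        print("Nope, try again")

print("Found:", found)
```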


---

## Breaking a loop with the `break` statement

The break statement ends the current loop and jumps to the statement immediately following the loop. 

It is like a loop test that can happen anywhere in the body of the loop.

__Ex.__ User enters a name to be searched in a list of names. Stop when the name is found.

Using `break` in a definite loop:

In [None]:
names = ["Joel", "John", "Jane", "Jamie", "Lisa", "Anna", "Fred"]

to_find = input("enter a name to find: ")

found = False # default is we haven't found it

for name in names:
    print("Checking", name)
    if name == to_find:
        found = True # set found to true
        print("  Found!")
        break

print("We're done with the loop")

if found:
    print("Found", to_find + "!")
else:
    print("Didn't find", to_find)


---

Using `break` in an indefinite loop:

We can use the __`list.pop()`__ method to _consume_ the list: https://docs.python.org/3/tutorial/datastructures.html

In [None]:
names = ["Joel", "John", "Jane", "Jamie", "Lisa", "Anna", "Fred"]

to_find = input("enter a name to find: ")

found = False # default is we haven't found it

while len(names) > 0 and not found:
    name = names.pop(0)
    print(name)
    if name == to_find:
        found = True
        print("  Found!")
        break
    
if found:
    print("Found", to_find + "!")
else:
    print("Didn't find", to_find)


---

`break` is used most often with indefinite loops. With `break`, we can effectively move the stopping condition _inside_ the loop body:

In [None]:
while True:
    line = input('> ')
    if line == 'done' :
        break
    print(line)
print('Done!')


---

Let's use a `break` statement in the number-guessing example:

In [None]:
guess = input("Try to guess the number between 1 and 10, or say `exit` to quit: ")
number = 5
while guess != "exit":
    if int(guess) == number:
        print("You got it!")
        break # we're done, exit the loop
    else:
        guess = input("Try to guess the number between 1 and 10, or say `exit` to quit: ")
print("Thanks for playing!")


---

## Aside: Indentation is key!

The way that Python knows what counts as the body of code for a loop (whether definite or indefinite) is through indentation. 

You must indent all code that goes in the body underneath the `for` / `while` statement (after the colon).

If you fail to indent the first line of code in the body, you will get an `IndentationError`.

If you fail to indent anything _after_ the first line of code in the body, you will be committing a _semantic error_.

Python will not alert you because it is legal code. But your program will probably malfunction.

_This will raise an `IndentationError`:_

In [None]:
for i in range(5):
print(i)


---

_This will work but not in the intended way._

__Ex.__ I want to step through a list of numbers, multiply each of them by 5 and print them out:

In [None]:
nums = [1, 2, 3, 4, 5]
for num in nums:
    new_num = num * 5
print(new_num)
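For comparison, here is a sketch of the version that does what was intended: the `print` is indented, so it belongs to the loop body and runs once per item. The `printed` list is an extra helper added here (not part of the original example) just so we can inspect what was printed.

```python
nums = [1, 2, 3, 4, 5]
printed = []  # hypothetical helper list, only used to inspect the result

for num in nums:
    new_num = num * 5
    print(new_num)           # indented: part of the loop body
    printed.append(new_num)

print("Printed", len(printed), "values")
```

In the unindented version, only the last value (25) is printed, because `print` runs once after the loop has finished.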


---


## Common design patterns with loops

### Counting

A common situation: you have a list of stuff, and you want to count how many times a certain kind of thing shows up in that list. 

In the simplest case: your counting condition is a single value

In [None]:
names = ["Joel", "John", "Jane", "Jamie", "Lisa", "Anna", "Fred"]
count = names.count("John")
print(count)

In the more general case, your counting condition will be based on a boolean expression.

_Iteration is a really helpful way to do this._

__Ex.__ I want to count the number of "high performers" in a list of scores (where high performing means score of 95 or greater).
```python
# initialize count variable

# for every item in list
    # check if we should count it
        # increase count if it checks out
```

In [None]:
# input list
scores = [65, 78, 23, 97, 100, 25, 95]

# minimum score that counts as high performing
threshold = 95

# define the count variable, initialize to 0
n_highperformers = 0

# go through each item
for score in scores:
    # check if it's above my threshold / meets my criteria for being counted
    if score >= threshold:
        # increase the count
        n_highperformers += 1
        
# report result
print(n_highperformers)


When you formulate a problem, this code fits naturally into a function:

In [None]:
def count_high_performers(input_scores, threshold):
    count = 0 # define the count variable, initialize to 0
    for score in input_scores: # go through each item
        if score >= threshold: # check if it's above my threshold / meets my criteria for being counted
            count += 1 # increase the count
    return count

# input list
new_scores = [65, 78, 23, 97, 100, 25, 85, 99, 85, 95]

# call
print(count_high_performers(new_scores, threshold=90))


__Another example:__ I have a list of names, and I want to count how many names don't start with the letter J.

In [None]:
names = ["Joel", "John", "Jane", "Jamie", "John", "Michael", "Sarah", "Joseph", "Chris", "Ray"]

# define the count variable, initialize to 0
count_not_j = 0 

# go through each item
for name in names: 
    # check if name doesn't start with j / meets my criteria for being counted
    if not name.startswith("J"): 
        count_not_j += 1 # increase the count

print(count_not_j)


If you want to count occurrences based on a simple exact match, you can use the `.count()` list method.

In [None]:
names = ["Joel", "John", "Jane", "Jamie", "John", "Michael", "Sarah", "Joseph", "Chris", "Ray"]
names.count("John")


---

### Searching

Another common situation is checking whether you should proceed with a list. Does it contain a value that meets some condition? 

This is a variation on the counting pattern. __Ex.__ find a name in a list of names:

In [None]:
# names to search
names = ["Joel", "John", "Jane", "Jamie"]

# define a found variable, initialize to False
found = False

# go through each item
for name in names: 
    # check if is john / meets my criteria for being found
    if name == "John":
        # if we find john, set found to True
        found = True
        # and exit
        break 

# print out the result
print(found)

# or use it
if found:
    print("Found john!")
else:
    print("Didn't find john")


This pattern becomes more useful when the condition is more complicated than finding a particular value. __Ex.__ find a high performer (score at or above the threshold):

In [None]:
# input list
scores = [65, 78, 23, 97, 25, 85]

threshold = 95
# define a found variable, initialize to False
found = False 

# go through each item
for score in scores:
    # check if it's above my threshold / meets my criteria for being found
    if score >= threshold:
        # if we find a high performer, set found to True
        found = True
        # and exit
        break

# print the result
print(found)

# or use it
if found:
    print("Found a high performer!")
else:
    print("Didn't find a high performer")


Sometimes you also want to find the index position.

In [None]:
scores = [65, 78, 23, 97, 25, 85] # input list

threshold = 95
# define a found variable, initialize to False
found = False
# define a found index, initialize to -1 (not found)
found_index = -1

for i in range(len(scores)):
    if scores[i] >= threshold:
        found = True
        found_index = i
        break
        
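# only report an index if we actually found one
if found:
    print("Found high achiever at", found_index)
else:
    print("Didn't find a high achiever")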


If you only want to find the first occurrence of something, based on an exact match, you can use the `.index()` list method.

In [None]:
scores = [65, 78, 23, 97, 25, 85]
scores.index(23)
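One caveat: `.index()` raises a `ValueError` if the value is not in the list. A minimal sketch of one way to guard against that is to check with `in` first. The `target` value is made up for illustration, and `-1` is used as the "not found" marker, following the same convention as the loop version.

```python
scores = [65, 78, 23, 97, 25, 85]
target = 100  # hypothetical value that is not in the list

if target in scores:
    position = scores.index(target)
else:
    position = -1  # "not found" marker

print(position)
```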


---

### Filtering

We can take the counting and searching cases and go further: what if we want to not only count, but also "grab" the things that meet our criteria?

We'd want to create a new list, and make sure we have a bit of code that adds to that new list based on the criteria we have.

In [None]:
# input list to be filtered
scores = [65, 82, 23, 97, 100, 95]

# output list, initialize to empty list
to_grab = []
threshold = 80

# go through each item
for score in scores:
    # check if it's above my threshold / meets my criteria for being filtered
    if score >= threshold: 
        # add the item to the output list
        to_grab.append(score)

print(to_grab)


For some simple filtering, you can use the `filter()` builtin function.

It takes two arguments:
1. A function that takes an element as argument and returns True if it is to be kept, False otherwise
2. The list of elements to be filtered

The result needs to be converted to a list using the `list()` type conversion function.

In [None]:
scores = [65, 82, 23, 97, 100, 95] # input list to be filtered

def above_threshold(x):
    threshold = 80
    if x > threshold:
        return True
    else:
        return False

to_grab = list(filter(above_threshold, scores))
print(to_grab)


__Ex.__ keep only the names that do not start with J

In [None]:
names = ["Joel", "John", "Lane", "Jamie", "Freddy"]

# output list, initialize to empty list
to_grab = []

# go through each item
for name in names:
    # check if name doesn't start with J / meets my criteria for being filtered
    if not name.startswith("J"):
        # add the item to the output list
        to_grab.append(name)

# print out the result
print(to_grab)


---

### Mapping / transforming

Finally, sometimes you want to transform some or all of the elements of a list into a new list. 

An example might be data cleaning, or data transformation.

__Ex.__ Convert to percentages.

In [None]:
scores = [65, 82, 23, 97, 100, 95]

# output list
percentages = []

# go through every item
for score in scores:
    # apply the transformation
    percent = score / 100
    # add the transformed value to the output list
    percentages.append(percent)

print(percentages)


__Ex.__ Change outliers (those above 1000) to missing ("NA")

In [None]:
scores = [65, 82, 2323, 97, 100, 95000]

# output list
no_outliers = []

# go through every item
for score in scores:
    # check if it's an outlier
    if score > 1000:
        # apply transformation and add to list
        no_outliers.append("NA")
    else:
        no_outliers.append(score)
print(no_outliers)


For some simple transformations, you can use the `map()` built-in function.

It takes two arguments:
1. A function that takes an element as argument and returns a transformed element
2. The list of elements to be transformed

The result needs to be converted to a list using the `list()` type conversion function.

In [None]:
scores = [65, 82, 23, 97, 100, 95]

def to_percentage(x):
    return x / 100

percentages = list(map(to_percentage, scores))

print(percentages)


---

### Coordinated iteration across multiple sequences

One of the Project problems relies on a design pattern I haven't yet explicitly shown you in clear terms. 

So I want to quickly review it. 

How do you go through the elements of a list, index by index? 

In [None]:
# basic iteration through a list using indices
names = ["Joel", "John", "Lane", "Jamie", "Freddy"]
eligibilities = [True, False, True, True, False]

# make a range of numbers that starts at 0 and stops before
# the length of the names list
for index in range(len(names)):
    name = names[index]
    eligible = eligibilities[index]
    print(name, eligible)

You can figure out how this might generalize to the rock paper scissors problem, where you need to go through two lists in lockstep (first item from both lists, then second item from both lists, and so on)
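As an aside, Python also has a built-in `zip()` function that pairs up items from two sequences, which gives another way to walk through two lists in lockstep. This is an alternative to the index-based pattern, not a replacement for it, and the `eligible_names` list below is just an illustration.

```python
names = ["Joel", "John", "Lane", "Jamie", "Freddy"]
eligibilities = [True, False, True, True, False]

# output list, initialize to empty list
eligible_names = []

# zip() yields one (name, eligible) pair per position
for name, eligible in zip(names, eligibilities):
    if eligible:
        eligible_names.append(name)

print(eligible_names)
```

If the two lists have different lengths, `zip()` stops at the end of the shorter one.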


---

# Common errors

### IndexError when looping through a list

This comes up mostly with `while` loops. So, while it's possible to convert any `for` loop into a `while` loop, you want to be careful when doing so.

In [None]:
names = ["Joel", "John", "Jane", "Jamie", "John"]
to_grab = [] # output list, initialize to empty list

# set initial index to zero
index = 0 

# BUG (on purpose): 10 is larger than len(names), so the loop
# runs past the end of the list and names[index] raises IndexError
while index < 10:
    print(index)
    
    # get the name at this index
    name = names[index]
    
    # check if is john / meets my criteria for being filtered
    if name == "John":
        # add the item to the output list
        to_grab.append(name)
    
    # increment the index
    index += 1 

# print out the result
print(to_grab)

In [None]:
# basic iteration through a list using indices
names = ["Joel", "John", "Lane", "Jamie", "Freddy"]

# BUG (on purpose): names has 5 items, so index 5 is out of range
for index in range(6):
    name = names[index]
    print(index, name)
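One way to avoid the `IndexError` is to tie the stopping condition to the actual length of the list instead of a hard-coded number. Here is a sketch of the `while` version with that change:

```python
names = ["Joel", "John", "Jane", "Jamie", "John"]
to_grab = []  # output list, initialize to empty list

index = 0

# stop as soon as index reaches the length of the list
while index < len(names):
    name = names[index]
    if name == "John":
        to_grab.append(name)
    index += 1  # stopping condition update

print(to_grab)
```

The same idea applies to the `for` version: use `range(len(names))` rather than `range(6)`.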


### Infinite loops

Remember that with indefinite loops, we need the **stopping condition** to be `False` at some point. 

Or at least, give the loop a way to exit / `break`. 

Otherwise, it will go forever! 

A common error is to forget to include code in the **body (block)** of the loop that modifies the **stopping condition** or provides a **break** condition.

<div class="alert alert-warning">Run this, then go to <tt>Kernel &rarr; Interrupt</tt> to stop the infinite loop</div>

In [None]:
n = 5
while n > 0:
    print(n)
#    n = n - 1
print("Blast off!")
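While you are still debugging a loop, one defensive option is to add a safety cap so that a forgotten update cannot run forever. This is only a sketch: `max_iterations` and `iterations` are names made up for this example, and the real fix is still to update `n` inside the body.

```python
n = 5
max_iterations = 10  # hypothetical safety cap, for debugging only
iterations = 0

while n > 0:
    print(n)
    # the update to n is still missing here, as in the cell above
    iterations += 1
    if iterations >= max_iterations:
        print("Giving up after", iterations, "iterations")
        break

print("Blast off!")
```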


---

# Coding Challenge

Convert this problem formulation into code. Use a `for` loop.

<img src="https://terpconnect.umd.edu/~gciampag/INST126/images/Week%205%20-%20Problem%20Formulation%20-%20Frame%201.jpg" />

In [None]:
# key variables:
# the input LIST of strings
inputs = [
    "hello sarah@umd.edu",
    "from: giovanniciampaglia@umd.edu",
    "some other text that doesn't have an email"
]

# Your code here
...


---


# Solutions

## Coding Challenge

In [None]:
# key variables:
# the input LIST of strings
inputs = [
    "hello sarah@umd.edu",
    "from: giovanniciampaglia@umd.edu",
    "some other text that doesn't have an email"
]

### BEGIN SOLUTION
# a LIST to store the email addresses
emails = []

# LOOP over every text input
for text_input in inputs:
    
    # extract an email address
    # split the text into whitespace-separated chunks
    chunks = text_input.split()
    
    # LOOP over the list of chunks to check each one
    for chunk in chunks:
        # check if it has @ and .
        if "@" in chunk and "." in chunk:
            # put the chunk in the email list
            emails.append(chunk)
# give the email address back to the user
print(emails)
### END SOLUTION