## APS106 Lecture Notes - Week 6, Lecture 1
# Loops

## Some Programmer History

![Worsley](images/Beatrice_Worsley.jpeg)

(From Wikipedia) [Beatrice Worsley](https://en.wikipedia.org/wiki/Beatrice_Worsley) earned the first PhD in computer science (University of Cambridge, supervised by Alan Turing and Douglas Hartree). "She wrote the first program to run on EDSAC, co-wrote the first compiler for Toronto's Ferranti Mark 1, wrote numerous papers in computer science, and taught computers and engineering at Queen's University and the University of Toronto for over 20 years before her death at the age of 50."

![Worsley](images/worsley.jpeg)

![Worsley](images/worsley_location2.png)

## Lectures This Week


| Lecture | Topics | Reading |
| --- | --- | --- | 
| 6.1 | For-loops: `while` is so last month | Sect 9.3, 9.4 |
| 6.2 | For-loops on indices, nested loops | Sect 9.5-9.9 |
| 6.3 | Design Problem: MadLib |  |

 

Problem: You have had your DNA sequenced and each of your chromosomes is represented by a string of nucleotides: adenine (A), thymine (T), guanine (G), and cytosine (C). Each nucleotide is represented by its corresponding letter. For example:
 
```
chrome_4 = "ATGGGCAATCGATGGCCTAATCTCTCTAAG"
```

You want to analyze your genome. A natural first step (called "Exploratory Data Analysis (EDA)" in data science) is to count the number of occurrences of each letter.

There are a number of ways to do this.

First, string objects have a handy method, `count()`, which counts sub-strings.

In [1]:
help(str.count)

Help on method_descriptor:

count(...)
    S.count(sub[, start[, end]]) -> int
    
    Return the number of non-overlapping occurrences of substring sub in
    string S[start:end].  Optional arguments start and end are
    interpreted as in slice notation.



In [5]:
chrome_4 = "ATGGGCAATCGATGGCCTAATCTCTCTAAG"
print("A",chrome_4.count('A'))
print("C",chrome_4.count('C'))
print("W",chrome_4.count('W'))

A 8
C 7
W 0


`count()` doesn't just apply to sub-strings of size 1.

In [7]:
print("TC",chrome_4.count('TC'))

TC 4
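
The help text above mentions two details that are easy to miss: matches are counted without overlap, and the optional `start` and `end` arguments restrict the search in the same way a slice would. The sketch below tries both; the string `"AAAA"` and the variable names `overlap_demo` and `first_ten` are illustrative choices, not part of the lecture.

```python
chrome_4 = "ATGGGCAATCGATGGCCTAATCTCTCTAAG"

# Non-overlapping: "AA" is found at indices 0 and 2 of "AAAA",
# so the count is 2 (the overlapping match at index 1 is skipped).
overlap_demo = "AAAA".count("AA")
print("AA in AAAA:", overlap_demo)

# start and end behave like a slice: count 'A' only in chrome_4[0:10].
first_ten = chrome_4.count("A", 0, 10)
print("A in first ten:", first_ten)

# Slicing first and then counting gives the same answer.
print(first_ten == chrome_4[0:10].count("A"))
```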


**A Second Way**: since we know about indexing and `while` loops, can we do this with a loop? Let's start by printing each index and the character stored there.

In [10]:
i = 0
while i < len(chrome_4):
    print(i, chrome_4[i])
    i += 1

0 A
1 T
2 G
3 G
4 G
5 C
6 A
7 A
8 T
9 C
10 G
11 A
12 T
13 G
14 G
15 C
16 C
17 T
18 A
19 A
20 T
21 C
22 T
23 C
24 T
25 C
26 T
27 A
28 A
29 G


In [12]:
i = 0
counter = 0
while i < len(chrome_4):
    if chrome_4[i] == 'A':
        counter += 1
    i += 1
    
print('A', counter)

A 8


In fact, we can put this in a function so that it looks a bit like `count()`.

In [14]:
def my_count(target, letter):
    '''
    (str, str) -> int
    Returns the number of times that letter appears in target
    '''
    i = 0
    counter = 0
    while i < len(target):
        if target[i] == letter:
            counter += 1
        i += 1

    return counter

print("A",my_count(chrome_4, 'A'))
print("C",my_count(chrome_4, 'C'))
print("W",my_count(chrome_4, 'W'))

A 8
C 7
W 0


Now let's look at a new way to do this -- a more convenient looping construct: `for`.

## For Loops

The general form of a for loop is:
```
for item in iterable:
    body
```
Here, `iterable` can be anything that can be iterated over. So far, the only iterables we know about are strings.

Similar to `if` and `while` statements, there are two things to note here:
- There must be a colon (:) at the end of the `for` statement.
- The body must be indented.

The best way to understand for loops is to look at a few examples.

**For Loops Over Strings**

The general form of a for loop over a string is:
```
for variable in string:
    body
```
The variable refers to each character of the string in turn, and the body of the loop is executed once for each character. So let's go back to our example.

In [16]:
chrome_4 = "ATGGGCAATCGATGGCCTAATCTCTCTAAG"

for ch in chrome_4:
    print(ch, end=" ")
    
print()


A T G G G C A A T C G A T G G C C T A A T C T C T C T A A G 


This is really just an easier way to do what we did with the `while` loop above. However, notice the differences:
- In the `while` loop, the loop variable (`i`) was the **index** of each character, while in the `for` loop the loop variable (`ch`) is the **value** of each character.
- We do not have to worry about how long the string is (e.g., by calling `len()`) because the `for` loop goes through every character of the string exactly once.
- We do not have to worry about incrementing the loop variable (`i += 1`) because the `for` loop takes care of this.

Let's re-write our `my_count` function.

In [17]:
def my_count(target, letter):
    '''
    (str, str) -> int
    Returns the number of times that letter appears in target
    '''
    counter = 0
    for c in target:
        if c == letter:
            counter += 1

    return counter

print("A",my_count(chrome_4, 'A'))
print("C",my_count(chrome_4, 'C'))
print("W",my_count(chrome_4, 'W'))

A 8
C 7
W 0
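
Returning to the original problem of counting every letter: since a `for` loop can iterate over any string, one possible sketch is to loop over the four nucleotide letters and call `my_count` for each. The names `nucleotide`, `n`, and `total` are illustrative, and the final check assumes the sequence contains only the letters A, T, G, and C.

```python
chrome_4 = "ATGGGCAATCGATGGCCTAATCTCTCTAAG"

def my_count(target, letter):
    '''
    (str, str) -> int
    Returns the number of times that letter appears in target
    '''
    counter = 0
    for c in target:
        if c == letter:
            counter += 1
    return counter

total = 0
for nucleotide in "ATGC":
    n = my_count(chrome_4, nucleotide)
    print(nucleotide, n)
    total += n

# If every character is one of the four letters, the counts
# add up to the length of the string.
print(total == len(chrome_4))
```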


**Example: Numeric Accumulator**

Write a function that takes in a string and returns the number of vowels in the string.

Hint: The `in` operator can be very helpful here.

In [18]:
if 'a' in 'abc':
    print("yes")
else:
    print('no')

if 'w' in 'abc':
    print("yes")
else:
    print('no')


yes
no
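
Two properties of `in` matter for the vowel problem and are worth checking in a quick sketch: the test is case-sensitive, and it also accepts sub-strings longer than one character. The variable names below are illustrative only.

```python
# 'in' is case-sensitive: 'A' and 'a' are different characters.
upper_in_lower = 'A' in 'aeiou'
print(upper_in_lower)

# Listing both cases catches upper-case vowels as well.
upper_in_both = 'A' in 'aeiouAEIOU'
print(upper_in_both)

# 'in' also accepts longer sub-strings; the characters must appear
# next to each other and in the same order.
tc_found = 'TC' in 'ATCG'
ct_found = 'CT' in 'ATCG'
print(tc_found, ct_found)
```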


In [13]:
def count_vowels(s):
    """
    (str) -> int
    Return the number of vowels in s
    """

    num_vowels = 0
    
    for char in s:
        if char in 'aeiouAEIOU':
            num_vowels += 1
    return num_vowels

In [15]:
print(count_vowels('Happy Anniversary!'))

5


In [16]:
print(count_vowels('xyz'))

0


The loop in the function above will loop over each character in `s`, in turn. The body of the loop is executed for each character, and when a character is a vowel, the `if` condition is `True` and the value that `num_vowels` refers to is increased by one.

The variable `num_vowels` is an "accumulator", because it accumulates information. It starts out referring to the value 0 and by the end of the function it refers to the number of vowels in `s`.
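
To watch the accumulator change, here is a sketch of a traced variant that prints `num_vowels` each time it is updated. The function name `count_vowels_traced` is hypothetical and exists only for this illustration.

```python
def count_vowels_traced(s):
    """
    (str) -> int
    Return the number of vowels in s, printing the accumulator
    each time it changes. (Hypothetical helper for illustration.)
    """
    num_vowels = 0
    for char in s:
        if char in 'aeiouAEIOU':
            num_vowels += 1
            # Show the vowel just found and the running total.
            print(char, num_vowels)
    return num_vowels

result = count_vowels_traced('Happy Anniversary!')
print(result)
```

Each printed line corresponds to one pass through the loop body in which the `if` condition was `True`.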

**Example: String Accumulator**

Let's do the same thing, but rather than returning the number of vowels, return a string containing all the vowels encountered.

Hint: Your accumulator needs to be a string variable and you need to add each vowel to the end of it.

In [1]:
def collect_vowels(s):
    """
    (str) -> str
    Return a string containing the vowels in s, in order
    """
    vowels = ''
    for char in s:
        if char in 'aeiouAEIOU':
            vowels += char
    return vowels

In [2]:
print(collect_vowels('Happy Anniversary!'))

aAiea


In [3]:
print(collect_vowels('xyz'))




Variable `vowels` initially refers to the empty string, but over the course of the function it accumulates the vowels from `s`.
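
With a string accumulator, the side on which each new character is added matters. As a sketch, the hypothetical function `collect_vowels_reversed` below adds each vowel to the front of the accumulator instead of the end, so the vowels come out in reverse order.

```python
def collect_vowels_reversed(s):
    """
    (str) -> str
    Return the vowels in s, last vowel first.
    (Hypothetical variant for illustration.)
    """
    vowels = ''
    for char in s:
        if char in 'aeiouAEIOU':
            # Put the new vowel in front of what we have so far.
            vowels = char + vowels
    return vowels

reversed_vowels = collect_vowels_reversed('Happy Anniversary!')
print(reversed_vowels)
```

For `'Happy Anniversary!'` this prints `aeiAa`, the mirror image of the `aAiea` produced by appending with `vowels += char`.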

<div class="alert alert-block alert-info">
<big><b>This Lecture</b></big>
<ul>
    <li>Looping over strings</li>
    <li>Accumulators</li>
</ul>
<b>See Chapter 9 of the Gries textbook. This is all in there.</b>
</div>