# Loops: Describing How to Repeat Steps in a Program

## For Loops:

Like R and Matlab, Python makes it straightforward to repeat commands "for each" element in a data collection.  For example, to print each element in a list:

```python
names = ['Ted', 'Roy', 'Keeley']
for name in names:
   print(name)
# prints "Ted" then "Roy" then "Keeley"
```

**Exercises**

Example: Print each number in the list multiplied by 10.
For example, `[2, 4]` should print `20` then `40`.

In [1]:
nums = [4, 8, 10, 5]

In [2]:
for num in nums:
    print(num * 10)

40
80
100
50


Print the first letter of each name in the list

In [3]:
names = ["John", "Harry", "Moe", "Luke"]

In [4]:
for name in names:
    print(name[0])

J
H
M
L


Print the sum of each tuple in the list:

In [8]:
results = [(3, 5, 2), (2, 3, 7), (8, 9, 1)]

In [9]:
for result in results:
    print(sum(result))

10
12
18


## List-Append Loop Pattern

The same loop can also build a new data collection: start with an empty list and append a transformed value on each pass.  For example:

```python
old_names = ['ted', 'roy', 'keeley']
new_names = []
for name in old_names:
   new_names.append(name.title())
```

**Exercises**

Example: Make a list with the first codon (first three letters) of each sequence

In [10]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

In [12]:
codons = []
for seq in seqs:
    codon = seq[:3]
    codons.append(codon)
codons

['GTA', 'GTA', 'GGT']

Clean up the data.  Make a list that formats each sequence the same way (transform: `str.upper()`).

In [14]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]

In [17]:
newseqs = []
for seq in seqs:
    newseq = seq.upper()
    newseqs.append(newseq)
newseqs

['GTAATCG', 'GTACCAAA', 'GGTAGTACCAC']

Make a list of the number of As in each sequence (transform: `str.count()`)

In [18]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]

In [20]:
acounts = []
for seq in seqs:
    acount = seq.upper().count('A')
    acounts.append(acount)
acounts

[2, 4, 3]

Make one flat list of individual sequences from the list of space-separated sequence combinations (transform: `str.split()`)


In [21]:
seqs = ['GCAGA GATATC', 'GGTAAAA ACTAGA GGTATA', 'GGTAA']

In [23]:
all_seqs = []
for seq in seqs:
    all_seq = seq.split(' ')
    all_seqs.extend(all_seq)
all_seqs

['GCAGA', 'GATATC', 'GGTAAAA', 'ACTAGA', 'GGTATA', 'GGTAA']

## Sequence-Unpacking Loop Pattern

Loops are also useful when you have a collection of collections (e.g. a list of lists).
When it's a collection of same-length sequences, you can break apart each sequence inside the loop and work with each value as its own variable:

```python
pairs = [[4, 5], [7, 8], [2, 9]]
for first, second in pairs:
    print(first + second)
```
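
The unpacking in the loop header only works when every inner sequence has exactly as many items as there are variable names. Here is a small sketch of what happens otherwise; the `ragged` list and the `error_message` variable are made up for illustration:

```python
ragged = [[4, 5], [7, 8, 1]]  # hypothetical data: the second item has three values
totals = []
error_message = None
try:
    for first, second in ragged:
        totals.append(first + second)
except ValueError as err:
    # Raised when an inner list cannot be split into exactly two names
    error_message = str(err)

print(totals)         # the first pair was processed before the error
print(error_message)  # Python's explanation of the failed unpacking
```

If your data might be uneven, it is probably safer to check the lengths first than to rely on catching the error.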

**Exercises**

Example: Make a list of the sum of each number pair

In [26]:
pairs = [[4, 7], [7, 2], [10, 2], [5, 9]]

In [27]:
sums = []
for a, b in pairs:
    sums.append(a + b)
sums

[11, 9, 12, 14]

Make a list of filenames with the corresponding base name and file extension:

In [30]:
parts = [
    ('seq1', '.txt'),
    ('seqA', '.fasta'),
    ('myseq', '.dat'),
]

In [33]:
filenames = []
for base, ext in parts:
    filename = base + ext
    filenames.append(filename)
filenames

['seq1.txt', 'seqA.fasta', 'myseq.dat']

Calculate `y` for each set of coefficients using the formula $ y = a^2 + b + c $:

In [34]:
abcs = [(3, 2, 5), (7, 2, 1), (8, -3, 4)]

In [36]:
ys = []
for a, b, c in abcs:
    ys.append(a**2 + b + c)
ys

[16, 52, 65]

## The Zip Loop Pattern

Most of the time, you don't start with a collection of pairs; you need to build that collection yourself before you can loop over it.  The `zip()` function makes this straightforward!

```python
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21, 22]
for name, age in zip(names, ages):
    print(name, age)
```
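
To see the pairs that `zip()` builds, you can collect them with `list()`. This sketch reuses the names and ages from the example; it also shows that `zip()` stops at the end of the shorter input (the `short_ages` list is invented for illustration):

```python
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21, 22]

pairs = list(zip(names, ages))
print(pairs)  # [('Zanarah', 20), ('Joe', 21), ('Weiwei', 22)]

short_ages = [20, 21]  # hypothetical: one age is missing
short_pairs = list(zip(names, short_ages))
print(short_pairs)  # [('Zanarah', 20), ('Joe', 21)]
```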




**Exercises**

Example: Add each pair of numbers

In [38]:
firsts = [1, 2, 3, 4, 5]
seconds = [10, 20, 30, 40, 50]

In [39]:
sums = []
for first, second in zip(firsts, seconds):
    sums.append(first + second)
sums

[11, 22, 33, 44, 55]

Print the patient number and treatment group of each patient
(e.g. "Patient 32341: control") (transform: `str.format()`)

In [41]:
patients = [32451, 435679, 4211235, 123121]
groups = ['control', 'treatment', 'treatment', 'control']

In [43]:
for patient, group in zip(patients, groups):
    print("Patient {}: {}".format(patient, group))

Patient 32451: control
Patient 435679: treatment
Patient 4211235: treatment
Patient 123121: control


Make a list of True and False values indicating whether the first sequence has more Cs than the second sequence in each pair of sequences. (transform: `str.count()`)

In [44]:
firsts = ['GAGATTACA', 'CAGATGATA', 'GGAGGACCAAG']
seconds = ['GGAACCAA', 'CACAGGAGA', 'GATATAACA']

In [45]:
more_cs = []
for first, second in zip(firsts, seconds):
    more_c = first.count('C') > second.count('C')
    more_cs.append(more_c)
more_cs

[False, False, True]

Make a list of filenames called `filenames`, combining the basename and extension from each matching pair in the list.

In [46]:
basenames = ['requirements', 'README', 'main', 'docs']
extensions = ['.txt', '.md', '.py', '.rst']

In [47]:
filenames = []
for base, ext in zip(basenames, extensions):
    filenames.append(base + ext)
filenames

['requirements.txt', 'README.md', 'main.py', 'docs.rst']

## Enumerate Pattern

Sometimes you want to store the **index** of items in a sequence.  You could calculate this in a loop:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
idx = 0
for bundesland in bundesländer:
    print(idx, bundesland)  # prints 0 Baden-Württemberg
    idx += 1
```

Python's `enumerate()` function generates (index, element) pairs, which `list()` can collect into a list:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
indices_bundesländer = list(enumerate(bundesländer))  # [(0, 'Baden-Württemberg'), ...]
for idx, bundesland in indices_bundesländer:
    print(idx, bundesland)  # prints 0 Baden-Württemberg
```

As with `zip()`, this can be shortened by calling `enumerate()` directly in the header of the for loop:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
for idx, bundesland in enumerate(bundesländer):
    print(idx, bundesland)  # prints 0 Baden-Württemberg
```
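
`enumerate()` also accepts an optional `start` argument, which can be handy when a numbered list should begin at 1 rather than 0. A brief sketch (the `numbered` list exists only for illustration):

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
numbered = []
for number, bundesland in enumerate(bundesländer, start=1):
    # Counting begins at 1 instead of the default 0
    numbered.append('{}. {}'.format(number, bundesland))
print(numbered)  # ['1. Baden-Württemberg', '2. Bayern', '3. Thuringen']
```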



**Exercises**

Example: Print the index and name of each city:

In [48]:
cities = ['Paris', 'Nice', 'Marseille', 'Bordeaux']
for idx, city in enumerate(cities):
    print(idx, city)

0 Paris
1 Nice
2 Marseille
3 Bordeaux


Print the index and name of each U.S. president:

In [49]:
presidents = ['Washington', 'Adams', 'Jefferson', 'Madison', 'Monroe']

In [50]:
for idx, president in enumerate(presidents):
    print(idx, president)

0 Washington
1 Adams
2 Jefferson
3 Madison
4 Monroe


## The `range` Iterator

What if you don't have a collection of data, but you already know how many times you want to repeat a task? To use a for loop, we need to **generate** a sequence for the loop to iterate through. `range()` produces a series of integers using the same closed-open convention as Python slicing: the start value is included and the stop value is excluded. For example:

```python
>>> range(5)  # Create a lazy range object; no list is built yet
range(0, 5)

>>> list(range(5))  # Iterate through the range to make a list
[0, 1, 2, 3, 4]

>>> tuple(range(-2, 2))  # Iterate to make a tuple
(-2, -1, 0, 1)
```
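
A short sketch of how `range()` lines up with slicing, plus the optional third `step` argument. The `letters` list is made up for illustration:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# range(1, 4) yields 1, 2, 3: the same positions the slice [1:4] selects
picked = []
for i in range(1, 4):
    picked.append(letters[i])
print(picked)        # ['b', 'c', 'd']
print(letters[1:4])  # ['b', 'c', 'd']

# A third argument sets the step between values
evens = list(range(0, 10, 2))
print(evens)  # [0, 2, 4, 6, 8]
```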


**Exercises**

Using For-Loops and the range() function, do the following tasks:

Print the numbers 0 through 4:

In [None]:
for x in range(5):
    print(x)

0
1
2
3
4


Print the numbers 0 through 9:

Print "Hello World" five times.

Make a list with this sequence: `['a', 'ab', 'abc', 'abcd', 'abcde', 'abcdef']`

In [None]:
from string import ascii_lowercase
ascii_lowercase

'abcdefghijklmnopqrstuvwxyz'

## Extra: Other Iterators

Any iterator, often created from a generator function, can be used in for-loops.
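
As a minimal sketch of what a generator function looks like, the hypothetical `countdown()` below uses `yield` to hand back one value at a time. The for loop consumes it like any other collection:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1 (a made-up example generator)."""
    for current in range(start, 0, -1):
        yield current

values = []
for value in countdown(3):
    values.append(value)
print(values)  # [3, 2, 1]
```

Calling `countdown(3)` does not run the function body right away; values are produced only as the loop asks for them.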

In [None]:
from itertools import combinations, combinations_with_replacement, product, permutations, repeat

**Exercises**

Example: Repeat "Hello World" five times using an itertools function.

In [None]:
for msg in repeat('Hello World', 5):
    print(msg)

Hello World
Hello World
Hello World
Hello World
Hello World


Repeat "Iterators are neat!" 10 times using an itertools function.

Print all the two-letter sequence combinations of A, B, C, and D using an itertools function:

AB, AC, AD, BC, BD, CD

Print all the different combinations of paired values in the two lists (i.e. the "product")

a1, a2, a3, b1, b2, b3, c1, c2, c3

In [None]:
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

Many more useful iterators exist in the Python ecosystem. For further ideas, see the itertools recipes: https://docs.python.org/3/library/itertools.html#itertools-recipes