# For-Each Loops: Repetition through Iteration

## For-Each Loops

Like R and Matlab, Python makes it straightforward to repeat commands "for each" element in a data collection.  For example, to print each element in a list:

```python
names = ['Ted', 'Roy', 'Keeley']
for name in names:
   print(name)
# prints "Ted" then "Roy" then "Keeley"
```
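
The same loop works on other collections, not only lists. As a small sketch (the names `sequence` and `n_bases` are illustrative and not part of the exercises), looping over a string visits one character at a time:

```python
# Illustrative data; `sequence` and `n_bases` are hypothetical names.
sequence = 'GTA'
n_bases = 0
for base in sequence:
    # On each pass, `base` holds the next character of the string.
    print(base)
    n_bases += 1
# prints "G" then "T" then "A"
```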



**Exercises**

One by one, print each line in this haiku

In [2]:
haiku = ['“A World of Dew” by Kobayashi Issa', '----------------------------------', '', 'A world of dew,', 'And within every dewdrop', 'A world of struggle.']

for line in haiku:
    print(line)


“A World of Dew” by Kobayashi Issa
----------------------------------

A world of dew,
And within every dewdrop
A world of struggle.


Print each number in the list multiplied by 10.
For example, `[2, 4]` should print `20` then `40`.

In [3]:
nums = [4, 8, 10, 5]
for num in nums:
    print(num * 10)

40
80
100
50


Print the first letter of each name in the list

In [None]:
names = ["John", "Harry", "Moe", "Luke"]

## List-Append Loop Pattern

For-each loops can also be used to create new data collections: start with an empty list and append one item on each pass through the loop.  For example:

```python
old_names = ['ted', 'roy', 'keeley']
new_names = []
for name in old_names:
   new_names.append(name.title())
```
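
To see what the pattern produces, the sketch below runs the same kind of loop on made-up data and prints the result. The names `words` and `lengths` are illustrative only.

```python
# Illustrative data; `words` and `lengths` are hypothetical names.
words = ['ted', 'roy', 'keeley']
lengths = []                   # start with an empty list
for word in words:
    lengths.append(len(word))  # add one item per pass through the loop
print(lengths)  # prints [3, 3, 6]
```

The new list ends up with one item per element of the original list, in the same order.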

**Exercises**

Make a list with the first codon of each sequence

In [5]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]
codons = []
for seq in seqs:
    codon = seq[0:3]
    codons.append(codon)
codons

['GTA', 'GTA', 'GGT']

Clean up the data: make a list in which every sequence is formatted the same way.

In [None]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]

Make a list of the number of As in each sequence

In [9]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]
a_list = []
for seq in seqs:
    upper_seq = seq.upper()
    num_as = upper_seq.count('A')
    a_list.append(num_as)
a_list

[2, 4, 3]

Make a list of sequences from the list of sequence combinations.


In [22]:
multi_seqs = ['GCAGA GATATC', 'GGTAAAA ACTAGA GGTATA', 'GGTAA']
new_seqs = []
for multi_seq in multi_seqs:
    for seq in multi_seq.split(' '):
        new_seqs.append(seq)
new_seqs

['GCAGA', 'GATATC', 'GGTAAAA', 'ACTAGA', 'GGTATA', 'GGTAA']

In [20]:
multi_seqs = ['GCAGA GATATC', 'GGTAAAA ACTAGA GGTATA', 'GGTAA']
new_seqs = []
for multi_seq in multi_seqs:
    new_seqs.extend(multi_seq.split(' '))
new_seqs

['GCAGA', 'GATATC', 'GGTAAAA', 'ACTAGA', 'GGTATA', 'GGTAA']

In [21]:
multi_seqs = ['GCAGA GATATC', 'GGTAAAA ACTAGA GGTATA', 'GGTAA']
new_seqs = []
for multi_seq in multi_seqs:
    new_seqs += multi_seq.split(' ')
new_seqs

['GCAGA', 'GATATC', 'GGTAAAA', 'ACTAGA', 'GGTATA', 'GGTAA']

## Sequence-Unpacking Loop Pattern

Loops are also useful when you have a collection of collections (e.g. a list of lists).
When it's a collection of same-length sequences, you can break apart each sequence inside the loop and work with it:

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for pair in pairs:
    first = pair[0]
    second = pair[1]
    sums.append(first + second)
```

This is so common that Python provides some syntactic sugar (i.e. a shortcut) for breaking known-length sequences apart, called **"Tuple Unpacking"**:

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for first, second in pairs:
    sums.append(first + second)
```
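
Unpacking is not limited to pairs. As a hedged sketch with made-up data (`triplets`, `totals`, and `unpack_failed` are hypothetical names), the loop below unpacks three values at a time, and then shows that Python raises a `ValueError` when the number of names does not match the length of each sequence:

```python
# Illustrative data; all names here are hypothetical.
triplets = [(1, 2, 3), (4, 5, 6)]
totals = []
for first, second, third in triplets:
    totals.append(first + second + third)
print(totals)  # prints [6, 15]

# Unpacking needs exactly one name per element; a mismatch raises ValueError.
unpack_failed = False
try:
    for first, second in triplets:
        pass
except ValueError as error:
    unpack_failed = True
    print('Could not unpack:', error)
```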

**Exercises**

Make a list of the sum of each number pair

In [None]:
pairs = [[4, 7], [7, 2], [10, 2], [5, 9]]

Make a list of Trues and Falses, saying whether the first sequence (the wild-type) in each triplet has more G's than both the second and third sequence in the triplet (the two experimental sequences)

In [None]:
seq_collections = [
    ('GACAGGAGATTA', 'GACCAGATA', 'GCCAGAGGATAA'),
    ('gacccatagag', 'CAGATAcaga', 'GAGGAACCaca'),
    ('ACCAGATA', 'GAGAAAGACCA', 'CCAGAGATATTA'),
    ('AGGGACCCCA', 'CGCCCACCACCG', 'CCCATTATC'),
]


Make a Dict of GC Counts, with each sequence as the dict's key and the total number of Gs and Cs as the value.

E.g. `['GCA', 'AAT'] -> {'GCA': 2, 'AAT': 0}`

*Reminder*: To add an entry to a dictionary, just assign to its key: `gc_count['GCA'] = 2`
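
As a minimal sketch of that reminder inside a loop, the example below builds a dict from scratch. It stores sequence lengths rather than GC counts, and the name `seq_lengths` is only illustrative.

```python
# Illustrative example; `seq_lengths` is a hypothetical name.
seq_lengths = {}  # start with an empty dict
for seq in ['GCA', 'AATT']:
    seq_lengths[seq] = len(seq)  # assigning to a new key adds an entry
print(seq_lengths)  # prints {'GCA': 3, 'AATT': 4}
```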

In [6]:
seqs = ['ATCGAGC', 'TAATA', 'GCCATCT', 'CACCT']

## The Zip Loop Pattern

You won't always start with a collection of pairs; sometimes you need to build that collection yourself before you can loop over it.  The `zip()` function makes this straightforward!

```python
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21, 22]
combined = list(zip(names, ages))  # [('Zanarah', 20), ('Joe', 21), ('Weiwei', 22)]

# Unpack each (name, age) pair; the loop variable is `age`, not `ages`.
for name, age in combined:
    print(name, age)
```

To make it more concise, you can also call `zip()` directly in the header of the for-loop:

```python
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21, 22]
# zip() pairs up the two lists element by element.
for name, age in zip(names, ages):
    print(name, age)
```
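
One behavior worth knowing: `zip()` stops as soon as its shortest input runs out, without raising an error. A small sketch, using a deliberately shortened `ages` list and an illustrative `pairs` list, shows the effect:

```python
# Illustrative data; `pairs` is a hypothetical name.
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21]  # one age is missing on purpose
pairs = []
for name, age in zip(names, ages):
    pairs.append((name, age))
print(pairs)  # prints [('Zanarah', 20), ('Joe', 21)]
```

If the lists are supposed to line up one-to-one, comparing their lengths before looping is a simple way to catch missing data.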




**Exercises**

Add each pair of numbers

In [None]:
firsts = [1, 2, 3, 4, 5]
seconds = [10, 20, 30, 40, 50]

Print the patient number and treatment group of each patient
(e.g. "Patient 32341: control")

In [None]:
patients = [32451, 435679, 4211235, 123121]
groups = ['control', 'treatment', 'treatment', 'control']


Compare the number of Cs in each pair of sequences.  Does the first have more or fewer than the second?

In [None]:
firsts = ['GAGATTACA', 'CAGATGATA', 'GGAGGACCAAG']
seconds = ['GGAACCAA', 'CACAGGAGA', 'GATATAACA']

## Enumerate Pattern

Sometimes you want to store the **index** of items in a sequence.  You could calculate this in a loop:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
idx = 0
for bundesland in bundesländer:
    print(idx, bundesland)  # prints 0 Baden-Württemberg
    idx += 1
```

Python's `enumerate()` function generates the (index, element) pairs for you, so the counter no longer needs to be updated by hand:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
for idx, bundesland in enumerate(bundesländer):
    print(idx, bundesland)  # prints 0 Baden-Württemberg
```
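
By default the index starts at 0. `enumerate()` also accepts a `start` argument, which is handy for human-readable numbering. The sketch below uses made-up data (`runners` and `places` are hypothetical names):

```python
# Illustrative data; `runners` and `places` are hypothetical names.
runners = ['Ada', 'Grace', 'Alan']
places = []
for place, runner in enumerate(runners, start=1):
    places.append((place, runner))
    print(place, runner)
# prints "1 Ada" then "2 Grace" then "3 Alan"
```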



**Exercises**

Print the index and name of each person in the list

In [9]:
lost_fisherman = ['Innigo', 'Fezzik', 'Vizzinie']

## What about the C-Style For Loop?

There is another approach to for-loops: looping over only the index of each element and using that index inside the body to look up the element.  For example:

```python
seqs = ['AT', 'GC', 'CA']
for idx in [0, 1, 2]:
    seq = seqs[idx]
    print(seq)  # prints 'AT'
```

Python's `range()` function can automatically calculate the sequence of integers from 0 up to (but not including) any value.  As a result, the `range()` function is great for telling Python *how many times* you want to repeat a loop:

```python
seqs = ['AT', 'GC', 'CA']
for idx in range(3):
    seq = seqs[idx]
    print(seq)  # prints 'AT'
```

Instead of hard-coding the length of the range (e.g. `range(3)`), you can pass the length of the sequence, which automatically gives the right range (e.g. `range(len(seqs))`).

When complete, this pattern looks like:

```python
seqs = ['AT', 'GC', 'CA']
for idx in range(len(seqs)):
    seq = seqs[idx]
    print(seq)  # prints 'AT'
```
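
To check which indices `range()` actually produces, it can be converted to a list. The sketch below (the names `hardcoded`, `from_length`, and `after_append` are illustrative) shows that the stop value itself is excluded, and that `range(len(seqs))` keeps up when the list grows:

```python
# Illustrative names; only `seqs` comes from the example above.
seqs = ['AT', 'GC', 'CA']
hardcoded = list(range(3))
from_length = list(range(len(seqs)))
print(hardcoded)    # prints [0, 1, 2]
print(from_length)  # prints [0, 1, 2]

seqs.append('TT')   # the list now has four items
after_append = list(range(len(seqs)))
print(after_append)  # prints [0, 1, 2, 3]
```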

Let's get some practice using the `range()` function!

**Exercises**:  Using the `range()` function, write code that does the following tasks:

Print 'Hello World!' 5 times

Print the first letter of each name in the list

In [None]:
names = ["John", "Harry", "Moe", "Luke"]

Print the patient number and treatment group of each patient (e.g. "Patient 32341: control")

In [None]:
patients = [32451, 435679, 4211235, 123121]
groups = ['control', 'treatment', 'treatment', 'control']

Make a dictionary with the patient number and treatment group of each patient (e.g. `{32341: "control"}`)

In [None]:
patients = [32451, 435679, 4211235, 123121]
groups = ['control', 'treatment', 'treatment', 'control']

### Extra Exercises: Writing to Files in Loops

The exercises below use for-loops while introducing the `pathlib.Path` type.

Print the file extension of each file in the list

*Hint*: the pathlib.Path class is useful for this:

```python
from pathlib import Path
filename = 'hello.csv'
filepath = Path(filename)
extension = filepath.suffix  # '.csv'
```
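
Besides `suffix`, a `Path` exposes other parts of the filename as attributes. A brief sketch with a made-up path (the file does not need to exist for these attributes to work):

```python
from pathlib import Path

# 'data/hello.csv' is a hypothetical path used only for illustration.
filepath = Path('data/hello.csv')
print(filepath.name)    # prints hello.csv
print(filepath.stem)    # prints hello
print(filepath.suffix)  # prints .csv
```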

In [4]:
filenames = ['virus.fasta', 'birthday.jpg', 'hospital.csv', 'letter.pdf']

Write each sequence to its corresponding file

*Hint*: the `write_text()` method of `pathlib.Path` writes a string to a file:

```python
from pathlib import Path
filepath = Path('hello.txt')
text = 'Hi!'
filepath.write_text(text)
```
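
To confirm that a write worked, the text can be read back with `read_text()`. The sketch below writes into a temporary directory so that nothing is left behind afterwards. Using `tempfile` is an assumption made for this illustration and is not required by the exercises.

```python
from pathlib import Path
from tempfile import TemporaryDirectory

# The temporary directory (and the file in it) is deleted when the block ends.
with TemporaryDirectory() as tmp_dir:
    filepath = Path(tmp_dir) / 'hello.txt'  # join folder and filename with /
    filepath.write_text('Hi!')
    contents = filepath.read_text()
    print(contents)  # prints Hi!
```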

In [None]:
dataset = [
    ('seq1.txt', 'GACCAGTA'),
    ('seqA.txt', 'GGAGAGTATAC'),
    ('myseq.txt', 'GTTTAAC'),
]

Write these sequences to files named after their corresponding animal ID (e.g. `324` becomes `'324.txt'`)

In [7]:
animals = [123, 342, 543, 654]
seqs = ['GADCAG', 'CADFAAD', 'GGGGCVAGDA', 'GGDADCA']

Write all the sequences of each animal to a file named after the corresponding animal ID (e.g. `324` becomes `'324.txt'`)

In [8]:
animals = [123, 342, 543, 654]
seqs = [
    ('GACAG', 'GCCAGT'), 
    ('CAAA', 'GGGGCAGA', 'GGACA'),
    ('GGGATATCA', 'CCACAGATA', 'GGACAAATA'),
    ('GCCATATA', 'CAACTTTATA'),
]
