# Loops: Describing How to Repeat Steps in a Program

## For Loops:

Like R and Matlab, Python makes it straightforward to repeat commands "for each" element in a data collection.  For example, to print each element in a list:

```python
names = ['Ted', 'Roy', 'Keeley']
for name in names:
   print(name)
# prints "Ted" then "Roy" then "Keeley"
```



**Exercises**

Print each number in the list multiplied by 10.
For example, `[2, 4]` should print `20` then `40`.

In [4]:
nums = [4, 8, 10, 5]
#[print(i * 10) for i in nums]
for i in nums:
    print(i * 10)

40
80
100
50


Print the first letter of each name in the list

In [6]:
names = ["John", "Harry", "Moe", "Luke"]
#[print(i[0]) for i in names]
for i in names:
    print(i[0])

J
H
M
L


Print the file extension of each file in the list

*Hint*: the pathlib.Path class is useful for this

In [10]:
from pathlib import Path
a = "data.png"
Path(a).suffix

'.png'

In [11]:
from pathlib import Path
filenames = ['virus.fasta', 'birthday.jpg', 'hospital.csv', 'letter.pdf']
for a in filenames:
    print(Path(a).suffix)

.fasta
.jpg
.csv
.pdf


## List-Append Loop Pattern

A for loop can also be used to create new data collections: start with an empty list and append one result to it on each pass through the loop.  For example:

```python
old_names = ['ted', 'roy', 'keeley']
new_names = []
for name in old_names:
   new_names.append(name.title())
```
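
As a further sketch of the same pattern, the loop below collects the length of each sequence instead of a reformatted copy. The data and the names `seqs` and `lengths` are made up for illustration.

```python
# Hypothetical example data, chosen only for illustration
seqs = ['GATTACA', 'CCG', 'ATAT']

lengths = []                  # 1. start with an empty list
for seq in seqs:              # 2. visit each element in turn
    lengths.append(len(seq))  # 3. append one result per element

print(lengths)  # prints [7, 3, 4]
```

The result always has the same number of items as the original list, because exactly one value is appended per iteration.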

**Exercises**

Make a list with the first codon of each sequence

In [12]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]
new = []
for seq in seqs:
    new.append(seq[:3])
print(new)

['GTA', 'GTA', 'GGT']


Clean up the data.  Make a list in which every sequence is formatted the same way.

In [14]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]
clean=[]
for i in seqs:
    clean.append(i.upper())
print(clean)

['GTAATCG', 'GTACCAAA', 'GGTAGTACCAC']


Make a list of the number of As in each sequence

In [16]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]
new =[]
for i in seqs:
    new.append(i.upper().count("A"))  # upper() so that lowercase 'a' is counted too
print(new)

[2, 4, 3]


Make a flat list of sequences from the list of sequence combinations.


In [42]:
seqs = ['GCAGA GATATC', 'GGTAAAA ACTAGA GGTATA', 'GGTAA']
seqs3=[]
for i in seqs:
    seqs3.append(i.split(' '))
seqs3[0]+seqs3[1]+seqs3[2]

['GCAGA', 'GATATC', 'GGTAAAA', 'ACTAGA', 'GGTATA', 'GGTAA']


## Sequence-Unpacking Loop Pattern

Loops are also useful when you have a collection of collections (e.g. a list of lists).
When it's a collection of same-length sequences, you can break apart each sequence inside the loop and work with it:

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for pair in pairs:
    first = pair[0]
    second = pair[1]
    sums.append(first + second)
```

This is so common that Python provides some syntactic sugar (i.e. a shortcut) for breaking known-length sequences apart, called **"unpacking"**.

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for pair in pairs:
    first, second = pair
    sums.append(first + second)
```

This can even be done in the header of the for-loop!

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for first, second in pairs:
    sums.append(first + second)
```
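
Unpacking is not limited to pairs. The sketch below assumes a made-up list of `(name, start, end)` records; the data and variable names are hypothetical and serve only to show three names in the loop header.

```python
# Hypothetical records: (gene name, start position, end position)
genes = [('geneA', 10, 25), ('geneB', 40, 100), ('geneC', 7, 9)]

gene_lengths = []
for name, start, end in genes:        # three names, three items per record
    gene_lengths.append(end - start)

print(gene_lengths)  # prints [15, 60, 2]
```

The number of names in the header has to match the length of every inner sequence; otherwise Python raises a `ValueError`.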

**Exercises**

Make a list of the sum of each number pair

In [33]:
pairs = [[4, 7], [7, 2], [10, 2], [5, 9]]
sums=[]
for first, second in pairs:
    sums.append(first+second)
sums
    

[11, 9, 12, 14]

Write each sequence to its corresponding file

In [38]:
from pathlib import Path

dataset = [
    ('seq1.txt', 'GACCAGTA'),
    ('seqA.txt', 'GGAGAGTATAC'),
    ('myseq.txt', 'GTTTAAC'),
]
for name, seq in dataset:
    Path(name).write_text(seq)
        


In [40]:
t = Path("seqA.txt").read_text()
t

'GGAGAGTATAC'

Make a list of Trues and Falses, saying whether the first sequence (the wild-type) in each triplet has more G's than both the second and third sequence in the triplet (the two experimental sequences)

In [44]:
seq_collections = [
    ('GACAGGAGATTA', 'GACCAGATA', 'GCCAGAGGATAA'),
    ('gacccatagag', 'CAGATAcaga', 'GAGGAACCaca'),
    ('ACCAGATA', 'GAGAAAGACCA', 'CCAGAGATATTA'),
    ('AGGGACCCCA', 'CGCCCACCACCG', 'CCCATTATC'),
]
new=[]
for wt, second, third in seq_collections:
    # count Gs case-insensitively, since some sequences are lowercase
    wt_g = wt.upper().count("G")
    # True only if the wild-type has more Gs than BOTH experimental sequences
    new.append(wt_g > second.upper().count("G") and wt_g > third.upper().count("G"))
new

[False, False, False, True]

## The Zip Loop Pattern

Most of the time, you don't start with a collection of pairs; you need to build that collection yourself before you can loop over it.  The `zip()` function makes this straightforward!

```python
names = ['Zanarah', 'Joe', 'Weiwei',]
ages = [20, 21, 22]
combined = list(zip(names, ages))  # [('Zanarah', 20), ('Joe', 21), ...]

for name, age in combined:
    print(name, age)
```

To make it more concise, this can also be done inside the header of the for-loop:

```python
names = ['Zanarah', 'Joe', 'Weiwei',]
ages = [20, 21, 22]
for name, age in zip(names, ages):
    print(name, age)
```
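
`zip()` also accepts more than two collections, and it stops as soon as the shortest one runs out. The sketch below illustrates both points; the `cities` list is hypothetical and deliberately one item short.

```python
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21, 22]
cities = ['Berlin', 'Tübingen']  # hypothetical data, one item shorter on purpose

rows = []
for name, age, city in zip(names, ages, cities):
    rows.append((name, age, city))

print(rows)  # prints [('Zanarah', 20, 'Berlin'), ('Joe', 21, 'Tübingen')]
```

Because 'Weiwei' has no matching city, that entry is silently dropped, so it can be worth checking that the lists have equal lengths before zipping them.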




**Exercises**

Add each pair of numbers

In [45]:
firsts = [1, 2, 3, 4, 5]
seconds = [10, 20, 30, 40, 50]
new=[]
for first, second in zip(firsts, seconds):
    new.append(first+second)
new

[11, 22, 33, 44, 55]

Print the patient number and treatment group of each patient
(e.g. "Patient 32341: control")

In [46]:
patients = [32451, 435679, 4211235, 123121]
groups = ['control', 'treatment', 'treatment', 'control']

for pat, gro in zip(patients, groups):
    print(f"Patient {pat}: {gro}")

Patient 32451: control
Patient 435679: treatment
Patient 4211235: treatment
Patient 123121: control


Compare the number of Cs in each sequence.  Does the first have more or fewer than the second?

In [51]:
firsts = ['GAGATTACA', 'CAGATGATA', 'GGAGGACCAAG']
seconds = ['GGAACCAA', 'CACAGGAGA', 'GATATAACA']

for first, second in zip(firsts, seconds):
    # compare the number of Cs, not the sequence lengths
    if first.count("C") > second.count("C"):
        print("First sequence has more Cs")
    elif first.count("C") == second.count("C"):
        print("Same number of Cs")
    else:
        print("First sequence has fewer Cs")

First sequence has fewer Cs
First sequence has fewer Cs
First sequence has more Cs


Write these sequences to filenames named after their corresponding animal id (e.g. `324` becomes `'324.txt'`)

In [54]:
animals = [123, 342, 543, 654]
seqs = ['GADCAG', 'CADFAAD', 'GGGGCVAGDA', 'GGDADCA']

for name, seq in zip(animals, seqs):
    Path(str(name) + ".txt").write_text(seq)
    

Write all the sequences of each animal to filenames named after their corresponding animal id (e.g. 324 becomes '324.txt')

In [78]:
animals = [123, 342, 543, 654]
seqs = [
    ('GACAG', 'GCCAGT'), 
    ('CAAA', 'GGGGCAGA', 'GGACA'),
    ('GGGATATCA', 'CCACAGATA', 'GGACAAATA'),
    ('GCCATATA', 'CAACTTTATA'),
]
for animal, animal_seqs in zip(animals, seqs):
    # one file per animal; writing one sequence per line is a layout choice, not given in the task
    Path(str(animal) + ".txt").write_text("\n".join(animal_seqs))


## Enumerate Pattern

Sometimes you want to store the **index** of items in a sequence.  You could calculate this in a loop:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
idx = 0
for bundesland in bundesländer:
    print(idx, bundesland)  # prints 0 Baden-Württemberg
    idx += 1
```

Python's `enumerate()` function generates (index, element) pairs, which can be collected into a list:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
indices_bundesländer = list(enumerate(bundesländer))  # [(0, 'Baden-Württemberg'), ...]
for idx, bundesland in indices_bundesländer:
    print(idx, bundesland)  # prints 0 Baden-Württemberg
```

As with `zip()`, this can be shortened by calling `enumerate()` directly in the header of the for loop:

```python
bundesländer = ['Baden-Württemberg', 'Bayern', 'Thuringen']
for idx, bundesland in enumerate(bundesländer):
    print(idx, bundesland)  # prints 0 Baden-Württemberg
```
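
Two small variations may be useful here: `enumerate()` takes an optional `start` argument, and it can wrap `zip()` when you need a counter alongside paired items. The sketch below reuses the names and ages from the zip section; the `lines` list is only there to collect the formatted text.

```python
names = ['Zanarah', 'Joe', 'Weiwei']
ages = [20, 21, 22]

lines = []
# start=1 makes the count begin at 1 instead of 0;
# the parentheses unpack each (name, age) pair produced by zip()
for number, (name, age) in enumerate(zip(names, ages), start=1):
    lines.append(f"{number}. {name} is {age}")

for line in lines:
    print(line)
# prints "1. Zanarah is 20" then "2. Joe is 21" then "3. Weiwei is 22"
```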



**Exercises**
