# Loops: Describing How to Repeat Steps in a Program

Programming languages require three essential features, two of which we've covered so far:

   - [x] Conditional Flow (covered so far with `if-else`)
   - [x] Loops  (covered so far with `while`)
   - [ ] Abstraction (next time with `functions`)

So why learn `for` loops if we already have a way to loop? Because `while` loops tend to be hard to reason about, prone to bugs, and difficult to break down into small, manageable parts. `for` loops (specifically, "for-each" loops), on the other hand, help you write loop code that is concise and self-explanatory.

```
# While Loop Example                                # For Loop Example
i = 0
names = ['Nick', 'Yiwen', 'Adrian']                 names = ['Nick', 'Yiwen', 'Adrian']
while i < len(names):                               for name in names:
    name = names[i]                                     print(name)
    print(name)
    i += 1  # without this step, the while loop never ends
```



## For Loops:

Like R and MATLAB, Python makes it straightforward to repeat commands "for each" element in a data collection.  For example, to print each element in a list:

```python
names = ['Ted', 'Roy', 'Keeley']
for name in names:
    print(name)
# prints "Ted" then "Roy" then "Keeley"
```
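
Lists are not the only collections a `for` loop can walk through. As a minimal sketch (the values here are made up for illustration), the same syntax works on a string and on a `range`:

```python
# A string is a collection of characters
for letter in 'DNA':
    print(letter)

# range(3) yields the numbers 0, 1, 2
for i in range(3):
    print(i)
```

In both cases the loop variable (`letter`, `i`) takes each value in turn, just like `name` did above.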



**Exercises**

Print each number plus 100.

In [None]:
nums = [5, 3, 1]
for num in nums:
    print(num + 100)

105
103
101


Print 10 times each number in the list.
For example, `[2, 4]` should print `20` then `40`.

In [None]:
nums = [4, 8, 10, 5]

Print the first letter of each name in the list

In [None]:
names = ["John", "Harry", "Moe", "Luke"]

Print the file extension of each file in the list (Tip: `os.path.splitext()`)
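
To see what `os.path.splitext()` gives back before using it in a loop, here is a minimal sketch on a single made-up filename (`notes.txt` is not part of the exercise data):

```python
import os

# splitext() returns a pair: (everything before the extension, the extension)
parts = os.path.splitext('notes.txt')
print(parts)         # ('notes', '.txt')

extension = parts[1]  # the second item is the extension, dot included
print(extension)      # .txt
```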

In [None]:
filenames = ['virus.fasta', 'birthday.jpg', 'hospital.csv', 'letter.pdf']


Print each positive number in the list.

In [None]:
nums = [5, 2, 4, -1, -5, 8, 9, 1, -6, 3, 7]


Inside the 'Parent' folder, create each of the folders in the list

(Tip: put forward slashes between the folder names, e.g. `Parent/Child`)

(Tip: `os.makedirs(path, exist_ok=True)`)
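
As a rough illustration of the tip, the sketch below creates one nested folder. The folder names (`Outer`, `Inner`) are hypothetical, and the use of a temporary directory is an assumption made only so the example leaves nothing behind:

```python
import os
import tempfile

# Work inside a throwaway directory that is deleted afterwards
with tempfile.TemporaryDirectory() as scratch:
    path = scratch + '/Outer/Inner'   # hypothetical folder names
    os.makedirs(path, exist_ok=True)  # creates Outer and Inner in one call
    os.makedirs(path, exist_ok=True)  # safe to repeat: no error if it already exists
    created = os.path.isdir(path)

print(created)  # True
```

Without `exist_ok=True`, the second call would raise a `FileExistsError`.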

In [None]:
parent = "Parent"
child_folder_names = ['a', 'b', 'c', 'd']


Run the following code to generate some folders and files with random contents, then process them below using the `os` and `glob` packages:

Useful functions: `os.listdir()`, `os.walk()`, `os.getcwd()`, `glob.glob()`, `open(filename).read()`, `open(filename).readlines()`

In [None]:
import os, random
parent_folder = 'ExampleProject'
folders = ['scripts', 'data/raw', 'data/processed', 'data/final']
for folder in folders:
    os.makedirs(parent_folder + '/' + folder, exist_ok=True)
data_nums = [1, 2, 4, 5, 6]
for num in data_nums:
    data_file = open(parent_folder + f'/data/raw/dataset{num}.txt', 'w')
    data_file.write(f"{random.randint(1, 101)}\n")
    data_file.write(f"{random.randint(1, 101)}\n")
    data_file.write(f"{random.randint(1, 101)}\n")
    data_file.close()
readme_file = open(parent_folder + '/README.md', 'w')
readme_file.write('# Example Project\n\nWelcome to the Project!\n')
readme_file.close()
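
Before tackling the questions, it may help to see a few of the useful functions in action. This is only a sketch on two small made-up files (`sample_a.txt`, `sample_b.txt`) in a temporary directory, not on the ExampleProject folder itself:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # Make two small hypothetical text files to look at
    for label in ['a', 'b']:
        f = open(scratch + f'/sample_{label}.txt', 'w')
        f.write('1\n2\n')
        f.close()

    entries = sorted(os.listdir(scratch))            # names only, no folder prefix
    matches = sorted(glob.glob(scratch + '/*.txt'))  # full paths that match a pattern

    f = open(matches[0])
    lines = f.readlines()   # one string per line, newline included
    f.close()

print(entries)  # ['sample_a.txt', 'sample_b.txt']
print(lines)    # ['1\n', '2\n']
```

Note that `readlines()` returns strings, so they need converting (e.g. with `int()`) before doing arithmetic on them.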


In the ExampleProject folder, how many folders and files are there in total?

In the ExampleProject/data/raw folder, how many data files are there?

Open each of the data files in the data/raw folder. What is the sum of the numbers in each file?

## List-Append Loop Pattern

A `for` loop can also be used to create new data collections: start with an empty list, then append one new element on each pass through the loop.  For example:

```python
old_names = ['ted', 'roy', 'keeley']
new_names = []
for name in old_names:
    new_names.append(name.title())
# new_names is now ['Ted', 'Roy', 'Keeley']
```
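
The pattern always has the same three parts, whatever the data. A minimal sketch with hypothetical measurements, converting centimeters to millimeters:

```python
lengths_cm = [12.5, 3.0, 8.25]       # hypothetical measurements
lengths_mm = []                      # 1. start with an empty list
for length in lengths_cm:            # 2. visit each element
    lengths_mm.append(length * 10)   # 3. append the transformed value

print(lengths_mm)  # [125.0, 30.0, 82.5]
```

The original list is left unchanged, so both versions of the data remain available.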

**Exercises**

Make a list with the first codon of each sequence

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]



Clean up the data.  Make a list that formats each sequence the same way.

In [None]:
x = 1
y = 2
x, y = 1, 2

In [None]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]

clean_seqs = []
for seq in seqs:
    clean_seqs.append(seq.upper())
    

lower_seqs = []
for seq in seqs:
    lower_seqs.append(seq.lower())    

clean_seqs

['GTAATCG', 'GTACCAAA', 'GGTAGTACCAC']

In [None]:
clean_seqs = [x.upper() for x in seqs]
clean_seqs

['GTAATCG', 'GTACCAAA', 'GGTAGTACCAC']

Make a list of the number of A's in each sequence

Make a list of the individual sequences from the list of sequence combinations.


In [None]:
seqs = ['GCAGA GATATC', 'GGTAAAA ACTAGA GGTATA', 'GGTAA']


## Sequence-Unpacking Loop Pattern

Loops are also useful when you have a collection of collections (e.g. a list of lists).
When it's a collection of same-length sequences, you can break apart each sequence inside the loop and work with it:

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for pair in pairs:
    first = pair[0]
    second = pair[1]
    sums.append(first + second)
```

This is so common that Python provides some syntactic sugar (i.e. a shortcut) for breaking known-length sequences apart, called **"unpacking"**:

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for pair in pairs:
    first, second = pair
    sums.append(first + second)
```

This can even be done in the header of the for-loop!

```python
pairs = [[4, 5], [7, 8], [2, 9]]
sums = []
for first, second in pairs:
    sums.append(first + second)
```
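
Unpacking is not limited to pairs. The sketch below uses hypothetical three-item records and also shows what happens when the number of names does not match the number of items (the `try`/`except` is only there so the example keeps running after the error):

```python
# hypothetical (name, sequence, length) records
samples = [('s1', 'GATTACA', 7), ('s2', 'CCG', 3)]
labels = []
for name, seq, length in samples:
    labels.append(f'{name}: {seq} ({length} bases)')
print(labels)  # ['s1: GATTACA (7 bases)', 's2: CCG (3 bases)']

# Unpacking needs one name per item: two names cannot hold three items
try:
    first, second = [1, 2, 3]
    message = 'unpacked'
except ValueError as error:
    message = str(error)
print(message)  # an error message beginning "too many values to unpack"
```

This is why the pattern is described as working on known-length sequences: every inner sequence must have exactly as many items as there are names in the loop header.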

**Exercises**

Make a list of the sum of each number pair

In [None]:
pairs = [[4, 7], [7, 2], [10, 2], [5, 9]]

Write each sequence to its corresponding file

In [None]:
dataset = [
    ('seq1.txt', 'GACCAGTA'),
    ('seqA.txt', 'GGAGAGTATAC'),
    ('myseq.txt', 'GTTTAAC'),
]

Make a list of Trues and Falses, saying whether the first sequence (the wild-type) in each triplet has more G's than both the second and third sequence in the triplet (the two experimental sequences)

In [None]:
seq_collections = [
    ('GACAGGAGATTA', 'GACCAGATA', 'GCCAGAGGATAA'),
    ('gacccatagag', 'CAGATAcaga', 'GAGGAACCaca'),
    ('ACCAGATA', 'GAGAAAGACCA', 'CCAGAGATATTA'),
    ('AGGGACCCCA', 'CGCCCACCACCG', 'CCCATTATC'),
]

