# Lesson 3: Avoiding repetition with Lists and Loops

Up to this point, we have focused on variables that hold a single value and operations that are performed only once. But what if we have many values that we want to store, or many operations that we want to perform? This is where lists and loops come in. In this module, we will learn how to harness the power of lists and loops to make our code more efficient and powerful.

Learning objectives of this module:
1. Learn the purpose of lists, how to create one, and how to manipulate them.
2. __Learn how to use loops to repeat operations, including how to use loops to iterate through lists. There are two main types of loops we will discuss:__
    - __For loops__
    - __While loops__
3. __Learn how to combine loops with conditionals to create more complex programs.__
4. Briefly introduce the idea of vectorized operations and numpy arrays: a more efficient way to perform operations on lists of numbers.

### Lesson 3.2: Loops

In the previous lesson, we discussed lists and other similar objects that allow you to store multiple values in a single variable. But with our current tools, if we wanted to perform an operation on each of the values in a list, we would have to do so one line at a time, which would be tedious. Let's say we have a list of length measurements in micrometers that we need to convert to meters. We could write it out like this:

In [1]:
#initialize a list containing measurements in micrometers
measurements = [100, 10, 20, 15]

#convert the measurements to meters
measurements[0] = measurements[0]/1e6
measurements[1] = measurements[1]/1e6
measurements[2] = measurements[2]/1e6
measurements[3] = measurements[3]/1e6
measurements

[0.0001, 1e-05, 2e-05, 1.5e-05]

*If you are unfamiliar with the `1e6` notation, it is scientific notation for 1,000,000, or 1 x 10^6.*

But this isn't ideal, and isn't feasible for larger lists. That's where loops come in! Loops are used to repeat lines of code multiple times, allowing for the same task to be performed on multiple values. These repeated processes are called __iterations__.

In this lesson, we will cover two main types of loops:
- `for` loops: used to iterate a specific number of times or to iterate over a sequence of values, such as those contained in a list
- `while` loops: used to iterate until a certain condition is met

For loops are a little easier to understand and use appropriately, so let's start there:

### For Loops: A Clear End To Repetition

With for loops, we can repeat a block of code a set number of times. The classic notation of for loops usually looks something like this:

```
for i = 0 To 9
    Do something 10 times
```
Here, `i` is incremented by 1 with each iteration, and the loop continues until `i` reaches 9 (so the process is repeated 10 times). In Python, it's the same idea, but instead of counting through a set of numbers, we loop over an __iterable__: any object that can hand back its items one at a time. When the loop starts, Python creates an iterator from the iterable and uses it to step through the items. Iterables include, but are not limited to:
- range objects
- Lists
- Tuples
- Sets
- Strings
- Numpy Arrays (we'll get to these later)
- Pandas DataFrames (we'll get to these later)

Let's see an example of a loop in Python that iterates from 0 to 9 using a range object:

In [2]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


The range object produces the values 0 through 9, so there are a total of 10 iterations. We can also iterate over a list or other iterables. Let's return to our measurements list and convert each measurement, this time from micrometers to millimeters by dividing by 1000:

In [3]:
#initialize list
measurements = [100, 10, 20, 15]

#divide each measurement by 1000, then print it out
for item in measurements:
    item = item/1000
    print(item)

0.1
0.01
0.02
0.015


Here, we have set up a for loop to iterate over our list, meaning that the loop will start with the first item in `measurements`, copy it to `item`, and then perform the indented code (divide `item` by 1000 and then print it out). It will then repeat this with the second item in `measurements`, and so on, until it reaches the end of the list. But did we actually do anything to the list? Let's check:

In [4]:
print(measurements)

[100, 10, 20, 15]


Hmm, the list appears unchanged. Why might that be? Let's briefly talk about what is happening under the hood. When our `measurements` list was used in the `for` clause, it was converted into an iterator, and on each iteration the next element of the list was *copied* into the `item` variable. Reassigning `item` therefore changes only that variable; it is no longer associated with our list. If we want to change the list, we need to do so explicitly within the for loop. To do this, we can make use of the built-in `enumerate()` function, which provides both the index and the value of each item in the list. Let's see how this works:

In [5]:
#initialize list
measurements = [100, 10, 20, 15]

#divide each measurement by 1000, then print it out
for index, item in enumerate(measurements):
    #print out the values of index and item
    print("The current index is", index, "and the current item is", item)

    #convert the item measurement and update our list
    item = item/1000
    measurements[index] = item

measurements

The current index is 0 and the current item is 100
The current index is 1 and the current item is 10
The current index is 2 and the current item is 20
The current index is 3 and the current item is 15


[0.1, 0.01, 0.02, 0.015]

We could also accomplish the same task using a combination of the `range()` and `len()` functions. `len()` returns the number of items in your list. `range()` returns a range object that produces the numbers from 0 up to, but not including, the number you pass in. So, if you pass in the length of your list, you'll get the numbers from 0 to the length of your list minus 1, which are exactly the list's indices. You can then use these numbers to access each item in your list.

In [6]:
#initialize list
measurements = [100, 10, 20, 15]

#divide each measurement by 1000, then print it out
for i in range(len(measurements)):
    #print out the values of index and item
    print("The current index is", i, "and the current item is", measurements[i])

    #convert the item measurement and update our list
    measurements[i] = measurements[i]/1000

measurements

The current index is 0 and the current item is 100
The current index is 1 and the current item is 10
The current index is 2 and the current item is 20
The current index is 3 and the current item is 15


[0.1, 0.01, 0.02, 0.015]
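
To see why `range(len(measurements))` lines up with the list's indices, it can help to look at the values it produces. The short sketch below wraps the range object in `list()` purely for display; the names `n_items` and `indices` are illustrative choices.

```python
measurements = [100, 10, 20, 15]

# len() gives the number of items in the list
n_items = len(measurements)

# range(n_items) yields 0, 1, ..., n_items - 1
# list() is used here only so we can print the values
indices = list(range(n_items))

print(n_items)   # 4
print(indices)   # [0, 1, 2, 3]
```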

It is generally recommended to use `enumerate()` instead of `range()` when iterating over a list, but seeing both approaches highlights an important point: there are many ways to do the same thing in Python (or any language). There is very rarely one right answer, and as long as the code does what you intend and you understand it, it's a good solution. This is also why testing and debugging are important aspects of coding. You want to make sure your code does what you think it does!

Finally, what if we have multiple lists that we would like to iterate over simultaneously? This might happen if we have paired measurements. Let's say we measured both the expression and phosphorylation of ERK1/2 across three replicates. We could store these in two separate lists, and then iterate over both lists at the same time using the `zip()` function, allowing us to calculate a normalized phosphorylation signal.

In [7]:
# initialize lists
expression = [2, 4, 3]
phosphorylation = [1, 6, 6]

#initialize empty list to store normalized values
normed_phosphorylation = []
for exp, phos in zip(expression, phosphorylation):
    #calculate normalized values and add to our new list
    normed_phosphorylation.append(phos/exp)

normed_phosphorylation

[0.5, 1.5, 2.0]
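
One detail worth knowing when pairing lists: if the lists have different lengths, `zip()` stops at the end of the shortest one, so unpaired values are silently skipped. The sketch below uses made-up values to show this, along with a simple length check that would flag a missing measurement.

```python
# hypothetical data: the fourth phosphorylation value has no matching expression value
expression = [2, 4, 3]
phosphorylation = [1, 6, 6, 8]

pairs = []
for exp, phos in zip(expression, phosphorylation):
    pairs.append((exp, phos))

# the unpaired value 8 never appears
print(pairs)  # [(2, 1), (4, 6), (3, 6)]

# a simple check that would flag the mismatch before looping
lengths_match = len(expression) == len(phosphorylation)
print(lengths_match)  # False
```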

#### Exercise 3.2.1:

__Problem 1__: Write a for loop that sums the numbers from 1 to 100. Print the sum after the loop.

*Hint: you can use `range(1,101)` to get an iterator from 1 to 100*

Answer = 5050

In [8]:
my_sum = 0

#your code here!
for i in range(1, 101):
  my_sum = my_sum + i


print(my_sum)

5050
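
As a sanity check, a loop's result can be compared against an independent calculation. The sketch below uses Python's built-in `sum()` and the closed-form formula n(n+1)/2, neither of which is part of the exercise itself; the names `builtin_sum` and `formula_sum` are illustrative.

```python
n = 100

# loop-based sum, as in the exercise
my_sum = 0
for i in range(1, n + 1):
    my_sum = my_sum + i

# two independent ways to get the same number
builtin_sum = sum(range(1, n + 1))
formula_sum = n * (n + 1) // 2

print(my_sum, builtin_sum, formula_sum)  # 5050 5050 5050
```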


__Problem 2__: Write a for loop to iterate over a list of numbers and square each number in the list. Print the list of squared numbers. Try doing this two different ways: (1) editing the original list, and (2) creating a new list and storing the new values in that list.

In [9]:
my_list = [1, 2, 3, 4]

#your code here!
for i, item in enumerate(my_list):
    my_list[i] = item**2

# the original, edited list
print(my_list)

[1, 4, 9, 16]
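
The cell above covers the first approach. A possible sketch of the second approach, which leaves the original list untouched and stores the squares in a new list, might look like this (the name `squared_list` is just an illustrative choice):

```python
my_list = [1, 2, 3, 4]

# start with an empty list, then append each squared value
squared_list = []
for item in my_list:
    squared_list.append(item**2)

print(squared_list)  # [1, 4, 9, 16]
print(my_list)       # [1, 2, 3, 4]
```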


__Problem 3__: You have a sequence of DNA and would like to calculate its GC content. The GC content is the proportion of bases in a DNA sequence that are guanine or cytosine. It is often reported as a percentage, but here we calculate it as a fraction between 0 and 1. Use a for loop to calculate the GC content.

*Hint: count the number of G's and C's in the sequence by including a conditional statement in your loop that checks if the base is a G or C before adding it to the count.*

Answer = 0.5

In [10]:
seq = 'ACCGTTAGCAATGCTCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCAGATCG'

gc_count = 0

# your code here!

for char in seq:
  if char == 'G' or char == 'C':
    gc_count += 1


# using the number of Gs and Cs, calculate the GC content, and print
gc_content = gc_count/len(seq)
print(gc_content)

0.5
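
When a loop contains a conditional, it helps to test it on an input small enough to check by hand. The sketch below runs the same counting logic on a short, made-up sequence in which 4 of the 8 bases are G or C. It uses `base in 'GC'` as an equivalent, slightly shorter form of the condition.

```python
# short hypothetical sequence: 4 of its 8 bases are G or C
test_seq = 'ATGCGCTA'

gc_count = 0
for base in test_seq:
    # 'in' checks whether the base is one of the characters in 'GC'
    if base in 'GC':
        gc_count += 1

gc_content = gc_count / len(test_seq)
print(gc_content)  # 0.5
```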


### While Loops: They'll stop, on one condition...

With for loops, there is a defined number of times the loop will run. However, we may not always have a well-defined end to our loop, but know we want to continue running the loop until a certain condition is met. This is where while loops come in handy.


In [11]:
x = 0
while x < 5:
    print(x)
    x = x + 1

0
1
2
3
4


The above case is an example of a simple while loop. Prior to running the indented code, the while loop checks to see if the condition is true. If it is, the indented code is run. If it isn't, the loop is exited. It is possible for the code in the while loop to never run if the first check of the condition is false (try changing the conditional statement to `x < 0` to see what happens).
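
A minimal sketch of that case: with the condition changed to `x < 0`, the first check is already false, so the body is skipped entirely. The `iterations` counter is added here only to make that visible.

```python
x = 0
iterations = 0

# the condition is False on the very first check, so the body never runs
while x < 0:
    print(x)
    x = x + 1
    iterations = iterations + 1

print("The loop body ran", iterations, "times")  # The loop body ran 0 times
```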

This also introduces a new potential problem: the infinite loop. If the condition provided in the while loop never becomes false, the loop will continue to run forever. If you were to run the code below, the loop would never end, and you would have to stop the program manually:

```python
while True:
    print("Help! I'm stuck in a loop")
```

For this reason, it's important to make sure it is possible for your loop to exit. If you're not careful, you can end up with an infinite loop, which will keep your program from ever finishing.
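
One common safeguard, sketched below, is to count iterations and stop once a limit is reached, even if the main condition is still true. The halving example is hypothetical, and the limit of 1000 and the name `max_iterations` are arbitrary choices for illustration.

```python
value = 100.0
max_iterations = 1000   # arbitrary safety limit
iterations = 0

# keep halving until the value drops below 1, but never exceed the limit
while value >= 1 and iterations < max_iterations:
    value = value / 2
    iterations = iterations + 1

print(iterations)  # 7
print(value)       # 0.78125
```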

Let's walk through another example of a while loop. Let's say we want to know where the first 'h' is located in a word. We could use a while loop to move through a string until we find the first 'h', then print its location.

In [12]:
#our word
word = 'python'

#initialize the starting index
i = 0
#start a while loop which checks if the current letter is h
while word[i] != 'h':
    #print the current letter
    print(word[i])

    #move to the next letter
    i = i + 1

print('The first h is located at index', i)


p
y
t
The first h is located at index 3
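
Note that this loop assumes the word contains an 'h'. If it does not, `i` eventually runs past the end of the string and Python raises an `IndexError`. A possible way to guard against that, sketched here with the hypothetical word 'loop', is to also check that `i` is still a valid index:

```python
word = 'loop'   # hypothetical word with no 'h'

i = 0
# check the index first so word[i] is only evaluated when i is valid
while i < len(word) and word[i] != 'h':
    i = i + 1

if i < len(word):
    print('The first h is located at index', i)
else:
    print('There is no h in', word)
```

Because `and` evaluates its left side first, `word[i]` is never reached once `i` equals `len(word)`.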


#### Exercise 3.2.2

You decide to invest that $20 in the best performing stock ever, which doubles your money every week. Use a while loop to calculate how many weeks it will take until you are a millionaire. Print the number of weeks it will take.

In [14]:
#initialize your starting money and the starting week
my_money = 20
weeks = 0

# Your code here!
while my_money < 1000000:
  my_money *= 2
  weeks += 1

print(weeks)

16
