# Lesson 3: Avoiding repetition with Lists and Loops

To this point, we have focused on variables holding a single value or operations performed only once. But, what if we have many values that we want to store or many operations that we want to perform? This is where lists and loops come in. In this module, we will learn how to harness the power of lists and loops to make our code more efficient and powerful.

Learning objectives of this module:
1. __Learn the purpose of lists, how to create one, and how to manipulate them.__
2. Learn how to use loops to repeat operations, including how to use loops to iterate through lists. There are two main types of loops we will discuss:
    - For loops
    - While loops
3. Learn how to combine loops with conditionals to create more complex programs.
4. Briefly introduce the idea of vectorized operations and numpy arrays: a more efficient way to perform operations on lists of numbers.

## Lists

### Creating lists and accessing the items within them

Let's say we just ran an experiment testing the efficacy of a drug for reducing tumor size in mice, in which 3 mice were treated with the drug and 3 mice were treated with a control solution. We measured the size of each tumor at the end of treatment. We want to store this data in a way that lets us easily access it later. Given what we currently know, we could do this:

In [None]:
# drug replicates
drug_rep1 = 0.1
drug_rep2 = 0.5
drug_rep3 = 0.3

# control replicates
control_rep1 = 1.5
control_rep2 = 1.2
control_rep3 = 0.9

But this is tedious, and if we had any more than three replicates that would be a lot of typing. Instead, we can create a __list__ for each condition to hold all replicates:

In [None]:
drug = [0.1, 0.5, 0.3]
control = [1.5, 1.2, 0.9]

Much easier to work with! The lists we just created are ordered, which means we can quickly access any element in the list by its __index__.

As a generic example, let's say we have a simple list that counts up from 1 to 5:

In [None]:
#initialize list
my_list = [1, 2, 3, 4, 5]

#output the entire list
print(my_list)

[1, 2, 3, 4, 5]


If we want an individual element, we need to access it at the appropriate index using brackets after our list name: `my_list[index]`. The following table shows how you can access each element in this list. Notice that it is possible for us to use negative indices, which will count backwards from the end of the list.

| Value of Item in List | 1 | 2 | 3 | 4 | 5 |
|---------------|---|---|---|---|---|
| Forward Index | 0 | 1 | 2 | 3 | 4 |
| Reverse Index | -5 | -4 | -3 | -2 | -1 |

In [None]:
print(my_list[0]) #first element
print(my_list[1]) #second element
print(my_list[2]) #third element

1
2
3


If we try to access an index that is too large (greater than the size of our list), we will get an error:

In [None]:
print(my_list[10])

IndexError: list index out of range

Notice from our table that it is also possible to use negative indices, which count backwards from the end of the list. For example, if we want the last item of our `drug` list, we have two options. We can access it as above with `drug[2]`, or we can use a negative index, `drug[-1]`.

In [None]:
print(drug[-1], 'is the same as', drug[2])

0.3 is the same as 0.3
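One way to see why this works: a negative index `-k` refers to the same item as the forward index `len(list) - k`. The short sketch below checks this on a made-up list; the name `sizes` is only for illustration, and `len()` is the built-in function that returns the number of items in a list.

```python
# Illustrative list (values chosen only for this sketch)
sizes = [0.1, 0.5, 0.3]

n = len(sizes)             # number of items in the list (3)
last = sizes[-1]           # last item, using a negative index
same_last = sizes[n - 1]   # the same item, using a forward index

print(last == same_last)        # True
print(sizes[-2], sizes[n - 2])  # 0.5 0.5
```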


We can also use __slice__ notation to get a subset of the list, denoted by `myList[start:stop:step]`. This notation tells Python to return all items between the `start` and `stop` indices (including the start item but excluding the stop item), moving through the list with a specific `step` size. By default, the step size is 1 and you do not have to provide it. So if we only want the first two replicates of our drug condition, we can get them like this:

In [None]:
print(drug[0:2])

[0.1, 0.5]


Notice that the item at the last index in our slice is not included in the result. In Python, slices are always exclusive at the end, so `0:2` means output all elements from index 0 up to, but not including, index 2. This can be a point of confusion, so always check that you are grabbing all the elements you want. Note that you can also leave the start or stop blank, which means the slice will start at the beginning or go all the way to the end. For example, `drug[:2]` will return the same thing as above.
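As a small sketch of these rules, the following cell slices a made-up list of letters (the name `letters` and its contents are assumptions for illustration only). It shows an omitted start, an omitted stop, and two slices that use a step size.

```python
# Illustrative list; indices run from 0 ('a') to 5 ('f')
letters = ['a', 'b', 'c', 'd', 'e', 'f']

first_three = letters[:3]     # start omitted: begins at index 0
from_third = letters[2:]      # stop omitted: runs to the end of the list
every_other = letters[::2]    # step of 2 across the whole list
middle_step = letters[1:6:2]  # indices 1, 3, and 5

print(first_three)   # ['a', 'b', 'c']
print(from_third)    # ['c', 'd', 'e', 'f']
print(every_other)   # ['a', 'c', 'e']
print(middle_step)   # ['b', 'd', 'f']
```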

If it still doesn't quite make sense, another way to visualize slicing is through these figures from the Python Simplified tutorial [here](https://pythonsimplified.com/understanding-indexing-and-slicing-in-python/).

```my_list = [10,20,30,40,50,60,70,80,90,100]```

*Slice with a step size of 1*:


![Example of Slice Notation](https://cdn-0.pythonsimplified.com/wp-content/uploads/2021/06/python-slicing-ex1-1024x332.jpg?ezimgfmt=ng:webp/ngcb3)

*Slice with a step size of 2*:

![Example of Slice Notation](https://cdn-0.pythonsimplified.com/wp-content/uploads/2021/06/python-slicing-ex2-1024x425.jpg?ezimgfmt=ng:webp/ngcb3)

#### Exercise 3.1.1

Let's practice indexing on a larger list. Run the code below, which generates a list with 100 items, counting from 1 to 100. Don't worry too much about the details for now, but `range()` is a built-in Python function that generates a sequence of numbers from the provided start value (1) up to, but not including, the stop value (101), similar to the slices discussed previously. The `list()` function converts the range object returned by `range()` into a list.

Editing the below code block, return the following items:
1. The first item in the list = 1
2. The last item in the list = 100
3. The 50th item in the list = 50
4. The 10th, 11th, and 12th item in the list = [10, 11, 12]
5. The last 5 items in the list = [96, 97, 98, 99, 100]
6. The 3rd, 6th, and 9th item in the list = [3, 6, 9]
7. Play around a bit!

In [None]:
#initialize list from 1 to 100
my_long_list = list(range(1, 101))

#edit the code here to output specific elements from the list
my_output = my_long_list[0]
print(my_output)

my_output = my_long_list[-1]
print(my_output)

my_output = my_long_list[49]
print(my_output)

my_output = my_long_list[9:12]
print(my_output)

my_output = my_long_list[-5:]
print(my_output)

my_output = my_long_list[2:9:3]
print(my_output)

1
100
50
[10, 11, 12]
[96, 97, 98, 99, 100]
[3, 6, 9]


### Multidimensional or multi-type lists

Lists are not restricted to holding a single data type. If you wanted, you could have a list consisting of ints, floats, and strings:

In [None]:
my_multitype_list = [1, 2, 3.0, "four", "five", "six"]

In most cases, this isn't recommended, and there aren't many use cases where it would be beneficial. But it also allows us to create __nested lists__, or lists within lists. These let us build 2D matrix-like structures that are useful for multidimensional data. For a simple example, we can create a 2D matrix using nested lists like so:

```python
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```
Which behaves similarly to the following matrix:

| 1 | 2 | 3 |
|---|---|---|
| 4 | 5 | 6 |
| 7 | 8 | 9 |

To access data from nested lists, you need to use multiple square brackets. For example to access number 1 in the above matrix, we would do the following:

In [None]:
#initialize 2D matrix with nested lists
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

#grab the first nested list
print("Nested List:", my_matrix[0])

#grab the first element of the first nested list
print("Element from our Nested List:", my_matrix[0][0])

Nested List: [1, 2, 3]
Element from our Nested List: 1
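The same double-bracket pattern combines with negative indices and slices. The sketch below uses a separate, made-up 3 x 3 list named `grid` (an assumption for illustration) so that it does not interfere with `my_matrix`. The first bracket picks the row and the second picks the position within that row.

```python
# Illustrative 3 x 3 nested list
grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

n_rows = len(grid)           # number of nested lists (rows)
n_cols = len(grid[0])        # number of items in the first row
center = grid[1][1]          # second row, second column
bottom_right = grid[-1][-1]  # last row, last column
middle_tail = grid[1][-2:]   # last two items of the second row

print(n_rows, n_cols)   # 3 3
print(center)           # 5
print(bottom_right)     # 9
print(middle_tail)      # [5, 6]
```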


#### Exercise 3.1.2: Accessing 2D Lists

Given the 5 x 5 matrix provided below, access the following elements of our list/matrix (the expected answer follows each `=`):
1. The fourth nested list = [16, 17, 18, 19, 20]
2. The 4th element in the 3rd list = 14
3. The element in the 4th row and 3rd column = 18
4. The first two elements in the second row = [6, 7]
5. The last two elements in the fourth row = [19, 20]

In [None]:
#initialize the nested lists: it can often make it easier to read if you put each list/row on a separate line and indent it. It runs exactly the same!
my_matrix = [[1,2,3,4,5],
             [6,7,8,9,10],
             [11,12,13,14,15],
             [16,17,18,19,20],
             [21,22,23,24,25]]

### edit code here ###
my_item = my_matrix

#print element
print(my_item[3])

print(my_item[2][3])

print(my_item[3][2])

print(my_item[1][:2])

print(my_item[3][-2:])

[16, 17, 18, 19, 20]
14
18
[6, 7]
[19, 20]


### Editing Lists

Lists are also *mutable*, which means you can change the values of items in a list. For example, whoops, we noticed that our first drug replicate actually had a tumor size of 0.01, not 0.1! How might we change this without having to retype the entire list? Well, like this:

In [None]:
drug[0] = 0.01
print(drug)

[0.01, 0.5, 0.3]


Similar to strings, we can also perform operations on lists. We can combine lists using the `+` operator, and we can repeat a list a given number of times using the `*` operator.

In [None]:
#print original lists
print('Control List:', control)
print('Drug List:', drug)

#combine two lists into one list
combined_data = control + drug
print('Combined list:', combined_data)

#repeat the list three times
repeated_data = [1, 2, 3] * 3
print('Repeated list:',  repeated_data)

Control List: [1.5, 1.2, 0.9]
Drug List: [0.01, 0.5, 0.3]
Combined list: [1.5, 1.2, 0.9, 0.01, 0.5, 0.3]
Repeated list: [1, 2, 3, 1, 2, 3, 1, 2, 3]


But what if we repeated the same experiment with another mouse and want to add this data to our list? Lists have built-in functions (called __methods__) that belong to list objects. To access these methods, we use the following generic notation: `myList.method(parameters)`. For example, we can use the `append()` method to add an item to the end of a list. Each time you run this code, you'll add a new item to the list. Try it out!

In [None]:
drug.append(0.6)
drug

[0.01, 0.5, 0.3, 0.6]

We can reverse the above (i.e. remove an item from the end of the list) using the `pop()` method. You can also remove a specific value from the list using the `remove()` method.

In [None]:
drug.pop()
drug

[0.01, 0.5, 0.3]

In [None]:
drug.remove(0.5)
drug

[0.01, 0.3]

For a complete list of the methods available for lists, see the table below and the following resource [here](https://docs.python.org/3/tutorial/datastructures.html).


| Method | Description | Example Use |
| --- | --- | --- |
| append | Adds an element at the end of the list | `myList.append(1)` |
| pop | Removes and returns element at the indicated index (default is remove last item) | `myList.pop()` |
| insert | Adds an element at the specified position | `myList.insert(1, 2) #first argument is the index to insert before, second is the value`  |
| remove | Removes the first item with the specified value | `myList.remove(1)` |
| clear | Removes all the elements from the list | `myList.clear()` |
| index | Returns the index of the first element with the specified value | `myList.index(1)` |
| count | Returns the number of elements with the specified value | `myList.count(1)` |
| sort | Sorts the list, in ascending order  | `myList.sort()` |
| reverse | Reverses the order of the list | `myList.reverse()` |
| copy | Returns a copy of the list | `myList.copy()` |
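As a rough sketch of how a few of these methods behave together, the cell below applies them in sequence to a made-up list (the name `readings` and its values are assumptions for illustration). Note that `sort()`, `reverse()`, and `insert()` change the list in place, while `copy()` gives an independent list that later changes do not affect.

```python
# Illustrative list with a repeated value
readings = [0.3, 0.1, 0.5, 0.1]

readings.insert(1, 0.9)          # now [0.3, 0.9, 0.1, 0.5, 0.1]
n_low = readings.count(0.1)      # how many times 0.1 appears
first_low = readings.index(0.1)  # index of the first 0.1
backup = readings.copy()         # independent copy, taken before sorting

readings.sort()                  # now [0.1, 0.1, 0.3, 0.5, 0.9]
removed = readings.pop(0)        # removes and returns the item at index 0
readings.reverse()               # now [0.9, 0.5, 0.3, 0.1]

print(n_low, first_low)   # 2 2
print(removed)            # 0.1
print(readings)           # [0.9, 0.5, 0.3, 0.1]
print(backup)             # [0.3, 0.9, 0.1, 0.5, 0.1]
```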


#### Exercise 3.1.3: List Manipulation

Editing the code where denoted, perform the following tasks:
1. Combine the two lists below into a single list
2. Append the value `25` to the end of the combined list
3. Divide the first element in the list by 5
4. Remove the last element in the list

The output after completing each task should look like this:
```
Part 1 solution: [5, 10, 15, 20]
Part 2 solution: [5, 10, 15, 20, 25]
Part 3 solution: [1.0, 10, 15, 20, 25]
Part 4 solution: [1.0, 10, 15, 20]
```

In [None]:
first_list = [5, 10]
second_list = [15, 20]

# Add the two lists together (edit below)
combined_list = first_list + second_list
print('Part 1 solution:', combined_list)

# Append 25 to the combined list (edit below)
combined_list.append(25)
print('Part 2 solution:', combined_list)

# Divide the first element of the list by 5 (edit below)
combined_list[0] = combined_list[0]/5
print('Part 3 solution:', combined_list)

# Remove the last element of the list (edit below)
combined_list.pop()
print('Part 4 solution:', combined_list)

Part 1 solution: [5, 10, 15, 20]
Part 2 solution: [5, 10, 15, 20, 25]
Part 3 solution: [1.0, 10, 15, 20, 25]
Part 4 solution: [1.0, 10, 15, 20]


### Membership operators (in)

We can use the 'in' operator to check whether a value is in a list. It returns a boolean value: `True` if the value is in the list, `False` if it is not. Let's say we obtained a list of all the proteins associated with a particular signaling pathway. We can check whether a protein we are interested in is in this list with the 'in' operator.

In [None]:
#initialize list
egfr_signaling_proteins = ['EGF','EGFR', 'GRB2','ERK', 'MEK', 'RAS']

#Indicate our protein of interest and check to see if it is in the above list
protein_of_interest = 'VEGFR'
protein_of_interest in egfr_signaling_proteins

False

You can also combine it with the 'not' operator to check whether a value is not in a list (the opposite of 'in').

In [None]:
protein_of_interest not in egfr_signaling_proteins

True

As you learned with other booleans, you can combine the 'in' operator with conditional statements (if-else) to change the behavior depending on whether our protein belongs to the signaling pathway or not.

In [None]:
if protein_of_interest in egfr_signaling_proteins:
    print('Yes, ' + protein_of_interest + ' is in the EGFR signaling pathway!')
else:
    print('No, ' + protein_of_interest + ' is not in the EGFR signaling pathway!')

No, VEGFR is not in the EGFR signaling pathway!


### A brief mention of other 'List-like' objects in python

Within Python there are many objects that behave similarly to lists, but with slightly different properties. We will not go into as much depth with these objects, but it is important to know that they exist and how they differ from lists. These include:
- Tuples
- Strings
- Sets
- Dictionaries

#### Tuples

We mentioned that lists are mutable, meaning we can edit the values of items in the list and add or remove items. However, in some cases this may not be the desired behavior, and you might want to avoid accidentally editing your list. In this case, you can use a tuple, which is an immutable version of a list. Tuples are defined using parentheses instead of square brackets.

In [None]:
drug = (0.1, 0.5, 0.3)
control = (1.5, 1.2, 0.9)

We can still access elements of a tuple like a list, but if we try to edit elements, we get an error. Try it with the following code blocks:

In [None]:
print(drug[0])

0.1


In [None]:
drug[0] = 1

TypeError: 'tuple' object does not support item assignment

But we can still combine tuples with the `+` operator, just like we did with lists and strings:

In [None]:
myTuple = (1, 1, 1) + (2, 2, 2)
print(myTuple)

(1, 1, 1, 2, 2, 2)


If you have no intention of manipulating your data in any way (adding, subtracting, changing specific values, etc.), it's generally recommended you use a tuple. In fact, if you do any work in R instead of python, you'll find that vectors in R (the equivalent of lists/tuples) are immutable like tuples in python.
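If a value stored in a tuple does turn out to need correcting, one possible pattern (a sketch, not something this lesson prescribes) is to convert the tuple to a list with the built-in `list()`, make the edit, and convert back with `tuple()`. The names `drug_tuple`, `editable`, and `corrected` below are assumptions for illustration.

```python
# Original measurements stored as an immutable tuple
drug_tuple = (0.1, 0.5, 0.3)

editable = list(drug_tuple)   # make a mutable list copy
editable[0] = 0.01            # edit the copy
corrected = tuple(editable)   # convert back to a tuple

print(drug_tuple)   # (0.1, 0.5, 0.3)  the original is unchanged
print(corrected)    # (0.01, 0.5, 0.3)
```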

#### Strings

We already discussed strings in a previous lesson, and how operations (`+`, `*`) work with strings. But strings also behave similarly to lists in some other ways. For example, the following string can also be thought of as a list of characters, where individual elements can be accessed by indexing. For the string `'python'`, you can also think of it as `('p', 'y', 't', 'h', 'o', 'n')`. Then, we can access the first element of the string like so:

In [None]:
my_string = "python"
print(my_string[0])

p


We can also use slices to get only a part of the string:

In [None]:
print(my_string[0:4])

pyth


However, like tuples, strings are immutable. This means that you can't change the value of a single element of a string as you can with lists:

In [None]:
my_string[0] = 'p'

TypeError: 'str' object does not support item assignment

Similar to lists, strings have many built-in methods that add functionality to strings. Feel free to explore these in the Python documentation [here](https://docs.python.org/3/library/stdtypes.html#string-methods).
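Because strings are immutable, string methods return a new string (or another new object) rather than changing the original. The sketch below tries three standard methods, `upper()`, `replace()`, and `split()`, on made-up values; the variable names are assumptions for illustration.

```python
gene = "egfr"

upper_gene = gene.upper()                   # new string in uppercase
renamed = upper_gene.replace("E", "VE")     # replace each "E" with "VE"
proteins = "EGF,EGFR,GRB2".split(",")       # split into a list at each comma

print(gene)         # egfr  (the original string is unchanged)
print(upper_gene)   # EGFR
print(renamed)      # VEGFR
print(proteins)     # ['EGF', 'EGFR', 'GRB2']
```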

#### Sets

Sets are *unordered* collections of *unique* elements, making them function quite differently from lists, tuples, or strings. This means that there is no "first" element in a set, and no two elements in a set are the same. Sets are mutable, but since they are unordered, they cannot be indexed. Sets are created using curly braces `{}`. Sets are particularly useful for comparing groups of elements to look for overlap (think Venn diagrams).

For a quick tutorial on how these work, see this [interactive tutorial](https://www.w3schools.com/python/python_sets.asp).
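As a minimal sketch of the Venn-diagram idea, the cell below builds two made-up sets of protein names and compares them with the set operators `&` (intersection), `|` (union), and `-` (difference). The groupings are illustrative assumptions, not real pathway annotations. Because sets are unordered, the results are passed through `sorted()` before printing so the output is predictable.

```python
# 'RAS' is written twice in group_a, but a set keeps only one copy
group_a = {'EGFR', 'RAS', 'MEK', 'ERK', 'RAS'}
group_b = {'RAS', 'PI3K', 'AKT'}

shared = group_a & group_b   # items in both sets
either = group_a | group_b   # items in at least one set
only_a = group_a - group_b   # items in group_a but not in group_b

print(len(group_a))     # 4
print(sorted(shared))   # ['RAS']
print(sorted(either))   # ['AKT', 'EGFR', 'ERK', 'MEK', 'PI3K', 'RAS']
print(sorted(only_a))   # ['EGFR', 'ERK', 'MEK']
```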

#### Dictionaries

Dictionaries are extremely useful in Python programming. Many other languages offer a similar structure under names such as hash map or associative array. Dictionaries are a mutable collection like lists, but they store __key-value pairs__. This means that rather than accessing elements by their numerical, ordered index, you access elements by their key, usually a string. Dictionaries are defined using curly braces `{}`, with colons `:` separating each key from its value.

In [None]:
results = {'control': [0.3,0.5, 0.7], 'drug': [0.5, 0.7, 0.9]}

#print the whole dictionary
print('Complete Dictionary:', results)

#print the value of our 'control'
print("Control Data:", results['control'])

Complete Dictionary: {'control': [0.3, 0.5, 0.7], 'drug': [0.5, 0.7, 0.9]}
Control Data: [0.3, 0.5, 0.7]


The value in a key-value pair does not have to be a list; it can be any object. Dictionaries are very helpful when you have lots of related data that you want to organize and access quickly.
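To illustrate, the sketch below rebuilds the same dictionary, adds a new key whose value is an integer rather than a list, edits one stored measurement, and uses 'in' to check for a key. The key `'n_mice'` and the edited value are assumptions made up for this example.

```python
results = {'control': [0.3, 0.5, 0.7], 'drug': [0.5, 0.7, 0.9]}

results['n_mice'] = 3                 # add a new key-value pair (value is an int)
results['drug'][0] = 0.4              # edit the first item of the list stored under 'drug'
has_placebo = 'placebo' in results    # 'in' checks the keys of a dictionary

print(results['drug'])    # [0.4, 0.7, 0.9]
print(results['n_mice'])  # 3
print(has_placebo)        # False
print(sorted(results))    # ['control', 'drug', 'n_mice']
```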