# Avoiding repetition with Lists and Loops

To this point, we have focused on variables holding a single value or operations performed only once. But what if we have many values that we want to store or many operations that we want to perform? This is where lists and loops come in. In this module, we will learn how to harness the power of lists and loops to make our code more efficient and powerful.

Learning objectives of this module:
1. Learn the purpose of lists, how to create them, and how to manipulate them.
2. Learn how to use loops to repeat operations, including how to use loops to iterate through lists. There are two main types of loops we will discuss:
    - For loops
    - While loops
3. Learn how to combine loops with conditionals to create more complex programs.
4. Briefly introduce the idea of vectorized operations and NumPy arrays: a more efficient way to perform operations on lists of numbers.

## Lists

#### Creating lists and accessing the items within them

Let's say we just ran an experiment testing the efficacy of a drug for reducing tumor size in mice, in which 3 mice were treated with the drug and 3 mice were treated with a control solution. We measured the size of each tumor at the end of treatment, and we want to store this data in a way that lets us easily access it later. Given what we currently know, we could do this:

In [None]:
# drug replicates
drug_rep1 = 0.1
drug_rep2 = 0.5
drug_rep3 = 0.3

# control replicates
control_rep1 = 1.5
control_rep2 = 1.2
control_rep3 = 0.9

But this is tedious, and if we had any more than three replicates that would be a lot of typing. Instead, we can create a list for each condition to hold all replicates:

In [2]:
drug = [0.1, 0.5, 0.3]
control = [1.5, 1.2, 0.9]

Much easier to work with! The lists we just created are ordered, which means we can quickly access any element in the list by its index. Python uses a 0-based indexing system, which means that the first element is stored at index 0, the second at index 1, and so forth. *Not all languages use 0-based indexing: for example, in the R language, the first element is stored at index 1.*

The following table summarizes list indexing in Python for a simple list that counts from 1 to 5: `myList = [1,2,3,4,5]`


| Value of Item in List | 1 | 2 | 3 | 4 | 5 |
|---------------|---|---|---|---|---|
| Forward Index | 0 | 1 | 2 | 3 | 4 |
| Reverse Index | -5 | -4 | -3 | -2 | -1 |
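The table can be checked directly in code. This is a minimal sketch using the same `myList`; the variable names on the left are only illustrative:

```python
myList = [1, 2, 3, 4, 5]

# forward indices count up from 0 at the start of the list
first_item = myList[0]     # value 1
third_item = myList[2]     # value 3

# reverse indices count down from -1 at the end of the list
last_item = myList[-1]     # value 5
also_first = myList[-5]    # value 1, same element as myList[0]

print(first_item, third_item, last_item, also_first)
```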


Let's return to our experiment data. If we want to obtain the whole list:

In [3]:
drug

[0.1, 0.5, 0.3]

If we want an individual element:

In [4]:
drug[0]

0.1

We can also use negative indices to count backwards from the end of a list. For example, if we want the last item of the list, we have two options: a forward index, `drug[2]`, or a negative index, `drug[-1]`. Negative indices are especially useful when working with longer lists, where the position of the last item may not be obvious.

In [None]:
print(drug[-1])

But what if we want only a subset of the list? We can use *slice* notation, written `myList[start:stop:step]`. This tells Python to return the items from `start` up to, but not including, `stop`, taking every `step`-th item. The step size defaults to 1, so you can usually omit it. For example, to get only the first two replicates of our drug condition:

In [13]:
drug[0:2]

[0.1, 0.5]

Notice that the last number in our slice is not included in the result. In Python, slices always exclude the stop index, so `0:2` means all elements from index 0 up to, but not including, index 2. This is a common point of confusion, so always check that your slice grabs all the elements you want. You can also leave the start or stop blank, which means the slice starts at the beginning or runs all the way to the end. For example, `drug[:2]` returns the same thing as above.
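To make the `start:stop:step` rules concrete, here is a small sketch using the original three drug replicates. The result names are chosen only for illustration:

```python
drug = [0.1, 0.5, 0.3]

first_two = drug[0:2]       # indices 0 and 1; index 2 is excluded
also_first_two = drug[:2]   # a blank start means "from the beginning"
from_second = drug[1:]      # a blank stop means "through the end"
every_other = drug[::2]     # a step of 2 takes indices 0 and 2

print(first_two, also_first_two, from_second, every_other)
```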

We can also use negative indices in slices. For example, if we want the last two items of the list, we have two options. We can use a forward index as above, `drug[1:]`, or a negative index, `drug[-2:]`, which means take everything from the second-to-last item to the end of the list. This is especially useful when working with longer lists.

In [39]:
drug[-2:]

[0.5, 0.3]

To summarize, let's say we have a simple list that counts up from 1 to 5: `myList = [1,2,3,4,5]`. The following table shows different ways you can access each element in this list:

| Value of Item in List | 1 | 2 | 3 | 4 | 5 |
|---------------|---|---|---|---|---|
| Forward Index | 0 | 1 | 2 | 3 | 4 |
| Reverse Index | -5 | -4 | -3 | -2 | -1 |


#### Exercise 5.1

Let's practice indexing on a larger list. Run the code below, which generates a list of 100 items counting from 1 to 100. Don't worry too much about the details for now, but `range()` is a built-in Python function that generates a sequence of numbers from the provided start value (1) up to, but not including, the stop value (101), much like the slices discussed previously. The `list()` function converts that sequence to a list.

Edit the code block below to return the following items:
1. The first item in the list = 1
2. The last item in the list = 100
3. The 50th item in the list = 50
4. The 10th, 11th, and 12th item in the list = [10, 11, 12]
5. The last 5 items in the list = [96, 97, 98, 99, 100]
6. The 3rd, 6th, and 9th item in the list = [3, 6, 9]
7. Play around a bit! 

In [None]:
#initialize list from 1 to 100
my_long_list = list(range(1, 101))

#edit the code here to output specific elements from the list
my_long_list

#### Editing Lists

Lists are also *mutable*, which means you can change the values of items in a list. For example, whoops, we noticed that our first drug replicate actually had a tumor size of 0.01, not 0.1! How might we change this without having to retype the entire list? Well, like this:

In [17]:
drug[0] = 0.01
drug

[0.01, 0.5, 0.3]

But what if we repeated the same experiment with another mouse and want to add this data to our list? We can use the `append()` method to add an item to the end of a list. Each time you run this code, you'll add a new item to the list. Try it out!

In [30]:
drug.append(0.6)
drug

[0.01, 0.5, 0.3, 0.6]

We can reverse the above (i.e. remove an item from the end of the list) using the `pop()` method. You can also remove a specific value from the list using the `remove()` method, which deletes the first item equal to that value.

In [36]:
drug.pop()
drug

[0.01, 0.5, 0.3]

In [38]:
drug.remove(0.5)
drug

[0.01, 0.3]
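Two details are easy to miss here: `pop()` hands back the item it removed, and `remove()` raises a `ValueError` if the value is not in the list (for instance, if the cell above is run a second time). The sketch below uses a separate, made-up list called `sizes` so that it does not disturb `drug`, and checks membership with the `in` operator (covered in the next section) before removing:

```python
# hypothetical list, used only for this illustration
sizes = [0.01, 0.5, 0.3, 0.6, 0.6]

# pop() removes the last item and returns it
removed = sizes.pop()
print(removed)   # the value that was taken off the end
print(sizes)     # the list is now one item shorter

# remove() deletes the first item equal to the given value
sizes.remove(0.5)
print(sizes)

# 0.5 is no longer present, so a second remove(0.5) would raise ValueError;
# checking membership first avoids the error
if 0.5 in sizes:
    sizes.remove(0.5)
else:
    print("0.5 is not in the list")
```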

#### Membership operators (in)

We can use the `in` operator to check if a value is in a list. Let's say we obtained a list of all the proteins associated with a particular disease. We can check whether a protein we are interested in is in this list or not.
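As a minimal sketch, assume the disease-associated proteins are stored as strings in a list. The list name `disease_proteins` and the gene names in it are placeholders chosen for illustration, not real results:

```python
# hypothetical list of disease-associated proteins (illustrative names only)
disease_proteins = ["TP53", "BRCA1", "EGFR", "KRAS"]

# `in` evaluates to True if the value is found in the list, False otherwise
brca1_found = "BRCA1" in disease_proteins
myc_found = "MYC" in disease_proteins

# `not in` is the opposite check
myc_missing = "MYC" not in disease_proteins

print(brca1_found, myc_found, myc_missing)
```

Because the result is a boolean, a membership check can be used directly as the condition of an `if` statement.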

#### A brief mention of tuples

We mentioned that lists are mutable, meaning we can edit the value of items in the list and add/remove items from the list. However, in some cases this may not be the desired behavior, and you might want to avoid accidentally editing your data. In this case, you can use a tuple, which is an immutable version of a list. Tuples are defined using parentheses instead of square brackets.

In [44]:
drug = (0.1, 0.5, 0.3)
control = (1.5, 1.2, 0.9)

We can still access elements of a tuple like a list, but if we try to edit or add elements, we get an error. Try it with the following code blocks:

In [45]:
drug[0]

0.1

In [46]:
drug[0] = 1

TypeError: 'tuple' object does not support item assignment

In [47]:
drug.append(1.2)

AttributeError: 'tuple' object has no attribute 'append'

If you have no intention of manipulating your data in any way (adding or removing items, changing specific values, etc.), it's generally recommended that you use a tuple. If you do any work in R instead of Python, you'll find a loosely related idea: R vectors (the closest equivalent of lists/tuples) follow copy-on-modify semantics, so changing a vector that is shared under another name creates a modified copy rather than altering the original.
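If you later need to edit data stored in a tuple, one option is to convert between the two types with the built-in `tuple()` and `list()` functions. This is a small sketch; the variable names are illustrative:

```python
drug_list = [0.1, 0.5, 0.3]

# freeze the list into a tuple to guard against accidental edits
drug_tuple = tuple(drug_list)

# to make changes later, convert to a new list and edit that copy
editable = list(drug_tuple)
editable.append(0.6)

print(drug_tuple)   # the tuple itself is unchanged
print(editable)     # the new list has the extra item
```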

#### Other quick notes
__Note1__

Lists can hold different data types in the same list. The following is valid, although generally not recommended:
```python
mixed_list = [1, 2, 3.0, "four", "five", "six"]  # avoid naming a variable `list`, which would hide the built-in list() function
```

__Note2__

Lists can be nested within other lists. For example, for 2D data, you can use a list of lists:
```python
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # avoid naming a variable `list`, which would hide the built-in list() function
```
This is equivalent to the following matrix:

| 1 | 2 | 3 |
|---|---|---|
| 4 | 5 | 6 |
| 7 | 8 | 9 |
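Items in a nested list are reached by indexing twice: the first index picks the inner list (the row) and the second picks the item within it (the column). A short sketch, using the illustrative name `matrix` for the list of lists:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

second_row = matrix[1]         # the inner list at index 1
row1_col2 = matrix[1][2]       # row index 1, then column index 2
bottom_right = matrix[-1][-1]  # negative indices work at both levels

print(second_row, row1_col2, bottom_right)
```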