# 5. Working with lists

Recall: a list is an ordered collection of objects. Commas separate the entries, and the whole list is enclosed in square brackets.
    
For example:

In [None]:
exampleList = [1,5,3,2]

In [None]:
type(exampleList)

## List indexing

You can retrieve an entry in a list by specifying its index. This is done by putting the index in square brackets after the name of the list.

Indices in Python lists start at zero, so to pick the first entry in a list, specify index 0.

e.g. to pick out the first value of the list `exampleList`, defined above:

In [None]:
exampleList[0]

### Exercise 5.1

Pick out the third entry of the list `exampleList`

In [None]:
# Exercise 5.1 a)

What happens if you try to pick an entry that doesn't exist, for example the tenth entry?

In [None]:
# Exercise 5.1 b)

You can also use negative numbers as indices; these count backwards from the end of the list.
This is useful if you don't know how long your list is but want to pick, for example, the last entry:


In [None]:
exampleList[-1]

### Exercise 5.2

Pick out the last-but-one entry in the list `exampleList`, using a negative index

In [None]:
# Exercise 5.2

## Changing entries in a list

We can use list indexing to change the value of a particular entry in a list.

For example, to change the value of the second entry in exampleList from 5 to 32 we do:

In [None]:
exampleList[1] = 32

Now exampleList looks like:

In [None]:
exampleList

### Exercise 5.3

Add 37 to the fourth entry in `exampleList`

In [None]:
# Exercise 5.3 a)

Check that you have done what you expected by looking at exampleList:

In [None]:
# Exercise 5.3 b)

## List slicing

We can also select a section (or slice) of the original list using slicing.

Here, we specify the first index we want to select, then a colon, then one more than the last index we want to select.

For example, to select the second to the third entries in exampleList:

In [None]:
exampleList[1:3]
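
As a further illustration of this convention, here is a minimal sketch using a hypothetical list, `letters`, which is not part of the notes above. It also shows that either index can be left out, in which case the slice runs from the start or to the end of the list.

```python
# 'letters' is a hypothetical list used only for this illustration
letters = ['a', 'b', 'c', 'd', 'e']

middle = letters[1:4]      # indices 1, 2 and 3; index 4 is not included
from_start = letters[:2]   # omitting the first index starts at index 0
to_end = letters[3:]       # omitting the second index runs to the end

print(middle)
print(from_start)
print(to_end)
```

Note that a slice is a new list; the original list is left unchanged.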

### Exercise 5.4
Pick out the first to third entries (inclusive) from `exampleList`

In [None]:
# Exercise 5.4

## Finding the length of a list

To find the length of a list, use the function `len()`:

In [None]:
len(exampleList)
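
Because indices start at zero, the last valid index of a list is one less than its length. The sketch below illustrates this with a hypothetical list, `sample` (an assumption made for this example only), and compares it with a negative index.

```python
# 'sample' is a hypothetical list used only for this illustration
sample = [10, 20, 30, 40]

n = len(sample)                  # number of entries in the list
last_by_length = sample[n - 1]   # the last valid index is n - 1
last_by_negative = sample[-1]    # the same entry, using a negative index

print(n, last_by_length, last_by_negative)
```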

## Adding an entry to the end of a list

What if you want to update your list and add a further entry?

If you want to add a single entry to the list, use the `append()` method. This changes the list itself:

In [None]:
exampleList.append(9)

In [None]:
exampleList

If you have a whole other list of entries that you want to add to the end of your list, you can add the two lists together to make a third list:


In [None]:
exampleList + [5,2,721]
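
Unlike `append()`, adding two lists together does not change either of them; the result is a new list that has to be given a name if you want to keep it. A minimal sketch, using the hypothetical lists `original` and `extra`:

```python
# 'original' and 'extra' are hypothetical lists used only for this illustration
original = [1, 2, 3]
extra = [4, 5]

combined = original + extra   # a new, third list

print(combined)   # contains the entries of both lists
print(original)   # the first list is unchanged
```

To keep the longer list under the same name, assign the result back, e.g. `original = original + extra`.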

## Other functions of lists

Other useful functions you can perform on lists:

- `sorted()`: returns a new list with the entries sorted in ascending order
- `sum()`: sums the values in a list

In [None]:
unsortedlist = [9,2,7,4,1,8,12,6,4,1]
sorted(unsortedlist)

In [None]:
sum(unsortedlist)
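
One point worth checking for yourself is that `sorted()` leaves the original list as it was. The sketch below uses a hypothetical list, `values`, and also shows the optional `reverse=True` argument of `sorted()`, which is not covered in the notes above.

```python
# 'values' is a hypothetical list used only for this illustration
values = [9, 2, 7, 4]

ascending = sorted(values)                  # new list, smallest first
descending = sorted(values, reverse=True)   # new list, largest first
total = sum(values)

print(ascending)
print(descending)
print(values)   # the original order is unchanged
print(total)
```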

### Exercise 5.5

a) Find the average value of the data in the list below:

In [None]:
data = [1,4,6,3,9,12,5,4,20,2,6,4,8,1,1,2,5,4,6,7,3]

# Exercise 5.5 a)

b) Add a new data point `17` to the list

In [None]:
# Exercise 5.5 b)

c) Find the median value of the data

In [None]:
# Exercise 5.5 c)

## Converting strings to lists

So far we have been working with lists of integers or floating point values.

What if we have a string that we want to convert to a list, e.g. a DNA or protein sequence? To convert a string to a list of its individual characters, we use `list()`.

e.g.:

In [None]:
DNA = 'ACAATGCGATACGTATTTGCG'
DNA = list(DNA)
print(DNA)

Then we can use list functions on the DNA list.
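
One reason to do this is that a string cannot be changed in place, whereas a list can. The sketch below uses a hypothetical short sequence, `sequence`, and the string method `join()` (not covered in the notes above) to turn the edited list back into a string.

```python
# 'sequence' is a hypothetical short sequence used only for this illustration
sequence = 'GATTACA'

bases = list(sequence)    # ['G', 'A', 'T', 'T', 'A', 'C', 'A']
bases[0] = 'C'            # list entries can be changed by index
edited = ''.join(bases)   # glue the letters back together into one string

print(bases)
print(edited)
```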

### Exercise 5.6

Find the length of the DNA sequence and change the second-last base to a 'T'

In [None]:
# Exercise 5.6

## Review

In this section we have learnt:
- index numbers in Python lists start at zero
- to pick out values in a list by their index, using *listname*[*index*]
- to pick out a slice of a list by index, using *listname*[ *index1*: *index2* ] (*index2* is one larger than the index of the last entry you want to select)
- to find the length of a list, using len(*listname*)
- to add a single entry to the end of a list, use *listname*.append(*entry*)
- we can concatenate two lists to make a third list
- to sort the entries in a list using sorted(*listname*)
- to sum the entries in a list using sum(*listname*)
- to convert the letters in a string to a list using list(*string*)

# 6.  Using lists in for loops

To iterate through values in a list, we use a `for` loop.

For example, to print every value in a list of names:

In [None]:
for name in ['Jack','Elisa','Rahim','Sterling','Lucas','Anya']:
    print(name)

We can also set up the list in advance and refer back to the list name:

In [None]:
names = ['Jack','Elisa','Rahim','Sterling','Lucas','Anya']
for name in names:
    print(name)
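
A loop can also build up a result as it goes, rather than just printing each entry. Here is a minimal sketch that counts the total number of letters in a hypothetical list of words, `words`, keeping a running total in a variable defined before the loop.

```python
# 'words' is a hypothetical list used only for this illustration
words = ['list', 'loop', 'range']

total_letters = 0   # running total, set up before the loop starts
for word in words:
    total_letters = total_letters + len(word)   # add this word's length

print(total_letters)
```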

### Exercise 6.1
From the list of distances given below, obtain a list of distances squared

In [None]:
# Exercise 6.1
dists = [3.1, 8.4, 9.0, 4.2, 3.0]

*List comprehension* is a handy shortcut when looping over lists:

In [None]:
dist_squared2 = [dist*dist for dist in dists]
print(dist_squared2)

This does the same thing as above (it loops over `dists` and, for each entry `dist`, puts `dist * dist` at the corresponding index of a new list, `dist_squared2`), only in one line instead of three. You can do a similar thing with dictionaries and sets; for a tuple, pass a generator expression to `tuple()`.
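
To make the comparison concrete, the sketch below writes the same calculation both ways for a hypothetical list, `lengths`, doubling each entry. The names used here are assumptions made for this example only.

```python
# 'lengths' is a hypothetical list used only for this illustration
lengths = [1.5, 2.0, 4.0]

# Version 1: an explicit for loop that appends to an empty list
doubled_loop = []
for length in lengths:
    doubled_loop.append(2 * length)

# Version 2: the equivalent list comprehension
doubled_comp = [2 * length for length in lengths]

print(doubled_loop)
print(doubled_comp)
```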

## The `range` function

When using a for loop it can sometimes be useful to have a sequence of consecutive numbers.
The `range` function provides this: `range(n)` gives the integers from 0 to one less than `n`. In Python 3 it returns a `range` object rather than a list, so to see the numbers as a list you can combine it with a list comprehension (or pass it to `list()`).
e.g.:

In [None]:
[ i for i in range(12) ]
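
`range` also accepts a start value and a step, and it is often combined with `len()` to loop over the indices of a list. A minimal sketch, using a hypothetical list, `colours`:

```python
first_five = list(range(5))      # 0 up to, but not including, 5
two_to_five = list(range(2, 6))  # start at 2, stop before 6
evens = list(range(0, 10, 2))    # start at 0, stop before 10, in steps of 2

print(first_five)
print(two_to_five)
print(evens)

# 'colours' is a hypothetical list used only for this illustration
colours = ['red', 'green', 'blue']
for i in range(len(colours)):    # i takes the values 0, 1, 2
    print(i, colours[i])
```

Looping over indices like this is useful when you need the same position in more than one list.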

### Exercise 6.2

Below are two lists, the first gives the amount of growth (in cm) of four pieces of grass in the first twelve hours of the day, the second for the same pieces of grass in the last twelve hours of the day. Find the total amount of growth for each piece of grass.

In [None]:
# Exercise 6.2
am = [4.3, 7.1, 9.5, 8.5]
pm = [1.2, 3.2, 4.2, 3.9]

### Exercise 6.3

Below is a DNA sequence. During transcription, when mRNA is formed, base pairing occurs, so that the resultant mRNA sequence relates to the original DNA sequence in the following way:
DNA -> mRNA
- A -> U
- C -> G
- G -> C
- T -> A

The mRNA sequence is translated to protein by reading it three letters at a time. Each three-letter code gives one amino acid of the protein, e.g.:
- AUG = methionine
- AAA or AAG = lysine

etc.

a) Check that the DNA sequence length is divisible by 3.

b) How many methionines would be in the resultant protein sequence?

c) [if you have time] What would the resultant protein sequence be? [you will need to look online for a codon table giving the mRNA -> amino acid translation codes]

In [None]:
DNA = 'GGCGGCCCCCTAGCGTCGCGCAGGGTCGGGGACTGCGCGGCGGTGCCAGGCCGGGCGTGGGCGAGAGCACGAACGGGCTGCCTGCGGGCTGAGAGCGTCGAGCTGTCACCATGGGTGATCACGCTTGGAGCTTCCTAAAGGACTTCCTGGCCGGGGGCGTCGCCGCTGCCGTCTCCAAGACCGCGGTCGCCCCCATCGAGAGGGTCAAACTGCTGCTGCAGGTGAGGACCGCGCGGTGCAAGAGGCGGGCGCGGGCGCGGCGGGCCGGGCGGGGCGCGCGATGCGGCGCGAGCTGCAGGGCGCGGGGCGCCGCGGAAAATCTGCGCCAGGCCACAGGCCCGGGCGCCCGCCCGCCCGCCCGCCCGCCCGCGGGGGAAGAAGGTGCCCTCTGCGTAGAGACAGGTCCAGCGTCAGTCGCAG'

In [None]:
# Exercise 6.3 a)

In [None]:
# Exercise 6.3 b)

In [None]:
# Exercise 6.3 c)

#### An aside on dictionaries

When defining a dictionary, it has the form:
 `dict_name = {key1: value1, key2: value2, ...}`

Keys can be strings, integers, tuples; values can be strings, integers, tuples, lists, other dictionaries...

In [None]:
key3 = 3
dict_name = {'key1': 1, (2,2): 'value2', key3: [3,3]}
dict_name

Each key and value form a pair. We can use our dictionary to get the 'value' corresponding to a particular key:

In [None]:
print(dict_name['key1'])
print(dict_name[(2,2)])
print(dict_name[3])
print(dict_name[key3])

After first setting up our dictionary, we can then add or change entries:

In [None]:
dict_name['key1'] = 'newvalue'
dict_name['key1']

In [None]:
dict_name['newkey'] = 4
dict_name['newkey']
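
Dictionaries can also be used in `for` loops, and the `in` keyword checks whether a key is present. The sketch below uses a hypothetical dictionary, `atomic_numbers`, which is not part of the notes above.

```python
# 'atomic_numbers' is a hypothetical dictionary used only for this illustration
atomic_numbers = {'H': 1, 'C': 6, 'O': 8}

# Looping over a dictionary gives its keys, in the order they were added
for symbol in atomic_numbers:
    print(symbol, atomic_numbers[symbol])

# 'in' tests whether a key exists, which avoids an error on a missing key
has_carbon = 'C' in atomic_numbers
has_gold = 'Au' in atomic_numbers

print(has_carbon, has_gold)
```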

### Review
We have learnt:
- to iterate through a list using `for`
- to use the `range` function to create a sequence of consecutive numbers