# Lists and list comprehension

This notebook introduces one of Python's most useful features: *list comprehension*, which lets you iterate over a collection concisely without writing an explicit loop.

In [None]:
import pandas as pd

### Lists

We've seen lists a few times throughout this workshop, but we haven't really talked much about them.

A **list** is an object that can hold multiple values.

A list is defined using `[]`, with each value separated by a comma.

For example, below we create a list of four integers.

In [None]:
# create a list, x, containing the integers 2, 5, 18, and 22


In [None]:
# look at x


Since a list is an object in Python, we can check its type using the `type()` function:

In [None]:
# check the type of x


You can also ask how many entries are in a list using the `len()` function.
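
As a minimal sketch of the steps so far, the following uses a hypothetical list named `temperatures` (not part of the workshop data) to show creating a list, checking its type, and counting its entries:

```python
# hypothetical example list (illustrative values only)
temperatures = [14, 17, 21, 19]

# a list reports its type as `list`
print(type(temperatures))   # <class 'list'>

# len() counts the entries in the list
print(len(temperatures))    # 4
```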

In [None]:
# report the length of x


Lists can contain multiple identical entries.

In [None]:
# create a list with multiple repeated values


### Lists containing different types of objects

Lists can contain entries of multiple types too. For example, below we create a list of four values: a string, an integer, a float, and a boolean.
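
To illustrate that each entry keeps its own type, here is a small sketch with a hypothetical list named `mixed` (the values are made up for illustration):

```python
# hypothetical mixed-type list: a string, an integer, a float, and a boolean
mixed = ['goodbye', 7, 1.5, False]

# each entry keeps its own type inside the list
print(type(mixed[0]))  # <class 'str'>
print(type(mixed[1]))  # <class 'int'>
print(type(mixed[2]))  # <class 'float'>
print(type(mixed[3]))  # <class 'bool'>
```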

In [None]:
# Create a list containing 'hello', 3, 5.2, and True


Lists can contain complex objects in addition to single values. The following list contains three entries:

1. A list of two strings `['a', 'b']`

2. The integer `1`

3. A pandas Series containing two values: `apple` and `banana`

In [None]:
# create the list with the three elements above


### Manipulating lists

What do you think the following code will do?

In [None]:
[1, 2, 3] + [3, 3, 3]

What do you think will happen when we run the following code?

In [None]:
[1, 2, 3] * 3
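
For reference once you have made your guesses: with lists, `+` joins two lists end to end and `*` repeats a list. Neither performs element-wise arithmetic. A small sketch with hypothetical lists `first` and `second`:

```python
# hypothetical lists, used only for illustration
first = ['a', 'b']
second = ['c']

# + concatenates the two lists into a new list
joined = first + second
print(joined)    # ['a', 'b', 'c']

# * repeats the list the given number of times
repeated = first * 2
print(repeated)  # ['a', 'b', 'a', 'b']
```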

### Slicing/subsetting lists

Just as we can extract values from a pandas Series or DataFrame, we can extract values from a list using the `[]` operator.

In [None]:
# remind ourselves of x
x

In [None]:
# extract the third entry from x (remember that Python starts counting at 0)


How might you try to extract multiple values from `x`?

In [None]:
# first attempt at extracting the third and fourth entries from x


In [None]:
# another attempt at extracting the third and fourth entries from x 




Unfortunately, you cannot extract multiple values from a list by passing a list of positions to the `[]` operator. Instead, you can use a slice, written with `:`, to extract a range of values from a list.

The slice syntax is `start:stop`. Note that the entry at the `stop` index is not included in the output.
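
A minimal sketch of both points, using a hypothetical list `letters`: passing a list of positions to `[]` raises a `TypeError`, while a `start:stop` range returns the requested entries.

```python
# hypothetical list, used only for illustration
letters = ['a', 'b', 'c', 'd', 'e']

# indexing a plain list with a list of positions is not allowed
try:
    letters[[1, 2]]
except TypeError as err:
    print('TypeError:', err)

# start:stop returns entries from index 1 up to, but not including, index 3
middle = letters[1:3]
print(middle)  # ['b', 'c']
```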

In [None]:
# Extract the first two entries from x using a slice `0:2`


Leaving the `stop` index blank extracts all values from the `start` index to the end of the list.

In [None]:
# extract all entries except the first entry using a slice


Similarly, leaving the `start` index blank extracts all values from the beginning of the list up to (but not including) the `stop` index.
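
The two shortcuts can be sketched side by side on a hypothetical list `letters`:

```python
# hypothetical list, used only for illustration
letters = ['a', 'b', 'c', 'd', 'e']

# blank stop: from index 2 to the end of the list
from_third = letters[2:]
print(from_third)   # ['c', 'd', 'e']

# blank start: from the beginning up to, but not including, index 3
up_to_third = letters[:3]
print(up_to_third)  # ['a', 'b', 'c']
```

Because the `stop` index is excluded, `letters[:n] + letters[n:]` always rebuilds the original list.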

In [None]:
# Extract the first two entries from x using a slice 


You can also use the `start:stop:step` syntax to take every `step`-th value from the list.
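
As a rough illustration of `step`, assuming a hypothetical six-entry list `letters`:

```python
# hypothetical list, used only for illustration
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# indices 0, 2, 4: every second entry from index 0, stopping before index 5
every_second = letters[0:5:2]
print(every_second)  # ['a', 'c', 'e']

# indices 1, 4: every third entry from index 1 to the end (stop left blank)
every_third = letters[1::3]
print(every_third)   # ['b', 'e']
```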

In [None]:
# Extract every second entry starting from the first entry (index 0) and ending with the fourth entry (index 3)


In [None]:
# Extract every second entry starting from the first entry (index 0)
# start:stop:step -- leaving stop blank will go to the end, i.e., start::step


### Negative indexing

In [None]:
# look at x again
x

What do you think the following code will return?

In [None]:
x[-2]

Negative indices count from the *end* of the list. So `x[-1]` gives the last value in the list, `x[-2]` gives the second-to-last value in the list, and so on.
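
A short sketch of negative indices on a hypothetical list `letters`. It also shows that a negative index can serve as the `start` of a slice:

```python
# hypothetical list, used only for illustration
letters = ['a', 'b', 'c', 'd', 'e']

# -1 is the last entry, -2 the second-to-last
last = letters[-1]
second_last = letters[-2]
print(last, second_last)  # e d

# a negative start in a slice counts from the end as well
last_two = letters[-2:]
print(last_two)           # ['d', 'e']
```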

The following code will return the list in reverse order. Can you explain why?

In [None]:
x[::-1]

### Updating values with slicing
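
Assigning to a slice replaces a range of entries in place: the slice on the left selects the positions, and the list on the right supplies the new values. A minimal sketch on a hypothetical list `colors`:

```python
# hypothetical list, used only for illustration
colors = ['red', 'green', 'blue', 'yellow']

# replace the first two entries (indices 0 and 1) in place
colors[0:2] = ['pink', 'orange']
print(colors)  # ['pink', 'orange', 'blue', 'yellow']

# a single entry can be replaced by assigning to its index
colors[3] = 'black'
print(colors)  # ['pink', 'orange', 'blue', 'black']
```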

In [None]:
# take a look at x
x

In [None]:
# replace the second and third entries with apple and banana


In [None]:
# take a look at x now


### Exercise

Consider the following list:

In [None]:
a = [3, 1, 4, 9, 10, 3]

Write some code to do the following:

1. Extract the fifth entry in the list

1. Extract the last four entries in the list

1. Extract the first four entries in the list

1. Extract the third-last entry in the list 

1. Extract every second entry in the list, starting from the second entry

1. Return the list with its entries in reverse order

In [None]:
# extract the fifth entry


In [None]:
# extract the last four entries in the list


In [None]:
# extract the first four entries in the list


In [None]:
# extract the third-last entry in the list


In [None]:
# extract every second entry starting from the second entry


In [None]:
# return the list entries in reverse order


### List comprehension

Suppose that we wanted to add 1 to each entry in `a` from the exercise above. Would the following code work?

In [None]:
a + 1

This fails with a `TypeError`, because `+` applied to a list expects another list rather than a number. To apply an operation to each element of a list, use a list comprehension (a `for` loop is another option).

In [None]:
# list comprehension for adding 1 to each value in a


The general syntax for list comprehension is: 

`[expression for item in list]`

where `expression` is the operation you want to conduct on each item in the list.
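
To make the syntax concrete, here is a sketch that doubles every value in a hypothetical list `prices`, alongside an equivalent `for` loop for comparison:

```python
# hypothetical list, used only for illustration
prices = [3, 10, 4]

# list comprehension: `price * 2` is the expression, `price` is the item
doubled = [price * 2 for price in prices]
print(doubled)  # [6, 20, 8]

# an equivalent for loop that builds the same list step by step
looped = []
for price in prices:
    looped.append(price * 2)
print(looped)   # [6, 20, 8]
```

Both versions leave `prices` unchanged and return a new list.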

In [None]:
# list comprehension for squaring each value in a


### Exercise

Use list comprehension to add the text `', Australia'` to the end of each string in the following list:

In [None]:
my_list = ['Sydney', 'Melbourne', 'Brisbane', 'Adelaide', 'Perth']

In [None]:
# solution:


### Gapminder list comprehension example

Let's look at a more interesting example.

In [None]:
# load in gapminder
gapminder = pd.read_csv('data/gapminder.csv')
# create a copy of gapminder to modify


Suppose that I wanted to change all of the column names to uppercase.

In [None]:
# print out the original columns


In [None]:
# Manually change the names of all columns to uppercase


In [None]:
# reset to the original columns


Instead, I could use a list comprehension together with the `.upper()` string method to do this in one line of code.
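
The idea can be sketched without loading any data. The column names below are hypothetical stand-ins, and the real gapminder columns may differ:

```python
# hypothetical column names; the actual gapminder columns may differ
column_names = ['country', 'year', 'pop']

# apply the .upper() string method to every name in one line
upper_names = [name.upper() for name in column_names]
print(upper_names)  # ['COUNTRY', 'YEAR', 'POP']
```

In the notebook, the same comprehension would iterate over the DataFrame's `columns` attribute instead of a hand-written list.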

In [None]:
# use a list comprehension to create a list of upper case column names


In [None]:
# update the column names with the upper case names


### Exercise

Using list comprehension, write some code to extract the first four letters of each country name in the country column of gapminder (advanced challenge: remove duplicated country names).