# Introduction to Data Science for Public Policy
## Class 3a: Lists
## Thomas Monk

# Overview
- Yesterday we set up our environment and learnt the basics of Python coding, including variables, functions, and conditionals.

- We're going to start today with another useful Python data type: the **list**.
- Lists will remind us of how we previously understood variables in Stata: a vector of items whose order has some meaning.

## Lists

What is a list?

- A list is an *ordered* sequence of values.

We can create a list in the following way:

In [2]:
primes = [2, 3, 5, 7]

Notice that the list is defined through the use of square brackets, `[]`, just as the string was defined through quotation marks, `""`.

## Lists
We can put anything we want into a list.

- Strings

In [4]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

- Integers

In [None]:
primes = [2, 3, 5, 7]

- We can even have lists of lists.

In [9]:
hands = [
    ['J', 'Q', 'K'],
    ['2', '2', '2'],
    ['6', 'A', 'K'], # (Comma after the last element is optional)
]
# (I could also have written this on one line, but it can get hard to read)
hands = [['J', 'Q', 'K'], ['2', '2', '2'], ['6', 'A', 'K']]
print(hands)

[['J', 'Q', 'K'], ['2', '2', '2'], ['6', 'A', 'K']]


We can store any mix of data types within a list.

In [10]:
things = [32, 'saturn', print]
print(things)

[32, 'saturn', <built-in function print>]


Notice that we can even store a function within a list.
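
As a small sketch of what storing a function means in practice, we can pull the function back out of the list and call it. This uses the square-bracket indexing covered in the next section; the names `first_type` and `second_type` are made up for this illustration.

```python
# A list holding an integer, a string, and a function (as in the cell above)
things = [32, 'saturn', print]

# Each item keeps its own type inside the list
first_type = type(things[0])    # the int type
second_type = type(things[1])   # the str type

# The third item is the print function itself, so we can call it
things[2]('Hello from inside a list!')
```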

## List indexing
Now that we have things in lists, we'll want to *access* them.

Python uses *zero-based* indexing. This means the first item in a list is at index `0`, the *0th* item.

In [13]:
my_list = [3, 2, 1]
my_list[0]

3

We access the item we want by putting its index inside square brackets, `[]`, after the list's name.
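
A minimal sketch of zero-based indexing, reusing `my_list` and the `hands` list of lists from earlier. The variable names `first` and `second` are only for illustration.

```python
my_list = [3, 2, 1]

first = my_list[0]    # index 0 is the first item
second = my_list[1]   # index 1 is the second item
print(first, second)  # prints: 3 2

# Indexing a list of lists: the first index picks the inner list,
# the second index picks an item within that inner list
hands = [['J', 'Q', 'K'], ['2', '2', '2'], ['6', 'A', 'K']]
print(hands[0])       # prints: ['J', 'Q', 'K']
print(hands[2][1])    # prints: A
```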

In [15]:
my_second_list = ['UCL', 'LSE', 'QMUL']

What will `my_second_list[2]` give us?

my_second_list[2]

## List indexing
We can also index the list *backwards* by using the `-` sign.

That is, to obtain the last item on the list:

In [17]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
planets[-1]

'Neptune'

We obtain the planet furthest from the sun!
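
To see how negative indices count from the end, here is a short sketch with the same `planets` list. The variable names are illustrative only.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

last = planets[-1]          # the last item
second_last = planets[-2]   # the item before the last
first = planets[-8]         # with 8 items, -8 reaches all the way back to the start

print(last, second_last, first)  # prints: Neptune Uranus Mercury
```

Note that there is no `-0`: counting from the front starts at `0`, but counting from the back starts at `-1`.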

## List slicing
We can chop up lists however we like.

What are the first three planets?

In [18]:
planets[0:3]

['Mercury', 'Venus', 'Earth']

We use the `:` symbol to obtain our slice.

Notice that the indexing is a bit strange! We start at index `0` and continue up to, *but not including*, index `3`.
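
One way to get used to the "up to but not including" rule is to check how many items a slice holds. The sketch below uses `len`, which counts the items in a list (we meet it properly later on); `middle` is just an illustrative name.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

# A slice [start:stop] includes index start but stops just before index stop
middle = planets[2:5]
print(middle)        # prints: ['Earth', 'Mars', 'Jupiter']

# So the slice holds stop - start items: here 5 - 2 = 3
print(len(middle))   # prints: 3

# Slicing gives us a new list; the original is left untouched
print(len(planets))  # prints: 8
```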

## List slicing
The starting and ending indices are both optional. If I leave out the start index, it's assumed to be 0. So I could rewrite the expression above as:

In [20]:
planets[:3]

['Mercury', 'Venus', 'Earth']

Or from Mars to the end...

In [22]:
planets[3:]

['Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

We can even use negative indices in a slice, e.g. from the fourth planet (index `3`) **up to, but not including,** the last in the list:

In [24]:
planets[3:-1]

['Mars', 'Jupiter', 'Saturn', 'Uranus']
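
A couple more slices with negative indices, as a rough sketch. The names `last_three` and `all_but_ends` are made up for this example.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

# Start three from the end and run to the end of the list
last_three = planets[-3:]
print(last_three)     # prints: ['Saturn', 'Uranus', 'Neptune']

# Drop the first and the last item
all_but_ends = planets[1:-1]
print(all_but_ends)   # prints: ['Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus']
```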

## Changing lists
Lists are 'mutable' - this just means they can be changed in place.

Let's say that the name of Mars changes to Marte. Any ideas on how we would do this?

In [25]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

In [28]:
planets[3] = 'Marte'
planets

['Mercury',
 'Venus',
 'Earth',
 'Marte',
 'Jupiter',
 'Saturn',
 'Uranus',
 'Neptune']

We can even change multiple list items at the same time, just by assigning a list to that slice.

In [31]:
planets[:3] = ['Mr', 'V', 'U']
planets

['Mr', 'V', 'U', 'Marte', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
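
The list we assign to a slice does not have to be the same length as the slice it replaces. The sketch below (with a made-up label, `'Inner planets'`) swaps three items for one, so the list gets shorter.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

# Replace the first three items with a single item
planets[:3] = ['Inner planets']

print(planets)       # prints: ['Inner planets', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
print(len(planets))  # prints: 6
```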

## Some useful list functions
How many planets are there?

In [33]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
len(planets)

8

`sorted` returns a sorted version of a list (for strings, in alphabetical order):

In [34]:
sorted(planets)

['Earth', 'Jupiter', 'Mars', 'Mercury', 'Neptune', 'Saturn', 'Uranus', 'Venus']
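
It is worth checking that `sorted` hands back a *new* list rather than rearranging the original. A minimal sketch, which also uses the `reverse=True` option of `sorted`; the variable names are illustrative.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

alphabetical = sorted(planets)                 # A to Z
reverse_alpha = sorted(planets, reverse=True)  # Z to A

print(alphabetical[0])   # prints: Earth
print(reverse_alpha[0])  # prints: Venus

# The original list still has its original order
print(planets[0])        # prints: Mercury
```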

`sum` adds the items of the list together; `max` and `min` return the largest and smallest values.

In [43]:
primes = [2, 3, 5, 7]
print('Sum:', sum(primes), ', Max:', max(primes), ', Min:', min(primes))

Sum: 17 , Max: 7 , Min: 2
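
These functions combine naturally. For instance, a quick sketch of computing the mean and the range of a list of numbers, using the same `primes` list (`mean` and `value_range` are names chosen for this example):

```python
primes = [2, 3, 5, 7]

# Mean: the total divided by the number of items
mean = sum(primes) / len(primes)

# Range: the gap between the largest and smallest values
value_range = max(primes) - min(primes)

print('Mean:', mean)          # prints: Mean: 4.25
print('Range:', value_range)  # prints: Range: 5
```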


## Appending and Popping

We can create an *empty* list simply by assigning `[]` to a variable.
We can then easily add or remove items with the list methods `append` and `pop`.

In [54]:
myList = []
myList.append('LSE')
print(myList)
myList.append('UCL')
print(myList)

['LSE']
['LSE', 'UCL']


In [53]:
# Popping removes the last item on the list - and returns that item, to use if we wish.
popped = myList.pop()
print(myList)
print(popped)

['LSE']
UCL
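
Putting the two together, here is a short sketch that builds a list up and takes items back off. It also shows that `pop` accepts an optional index, which is not used above; the list name `universities` is just for illustration.

```python
universities = []

# append always adds to the end of the list
universities.append('LSE')
universities.append('UCL')
universities.append('QMUL')
print(universities)   # prints: ['LSE', 'UCL', 'QMUL']

# pop() with no argument removes and returns the last item
last = universities.pop()
print(last)           # prints: QMUL

# pop(0) removes and returns the item at index 0 instead
first = universities.pop(0)
print(first)          # prints: LSE

print(universities)   # prints: ['UCL']
```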


## Why lists?
- Why've we talked about lists for so long?

- Because list-like concepts are used heavily once we start working with data.
- Remember, these are like our Stata 'columns' (or vectors) of data.