# Notebook-5: Lists

### Lesson Content 

Welcome to the fifth Code Camp notebook! In this lesson we're going to explore two more advanced *data types*. In previous weeks we looked at numeric (*integers* and *floats*) and textual (*strings*) data, but it's probably been quite difficult to imagine how you'd assemble these simple data into something useful. The new data types (lists and dictionaries) will _begin_ to show you how that can happen, and they will allow you to 'express' much more complex concepts with ease.

### What's the Difference?

Up until now, our variables have only held _one_ thing: a number, or a string. 

In other words, we've got: `myNumber = 5` or `myString = "Hello world!"`. And that's it. 

Now, with **lists** and **dictionaries** we can store _multiple_ things: several numbers, several strings, or some mish-mash of both. This is much as you would with a _real_ dictionary (which holds the definitions of lots of words) or a _real_ list (which holds the things that you need to get done today).

In fact, the simple explanations in parentheses above encompass the main difference between lists and dictionaries:
- A _list_ is an *ordered* collection of 'items' (numbers, strings, etc.). You can ask for the first item in a list, the 3rd, or the 1,000th; it doesn't really matter because your list has an order and it keeps that order.
- A _dictionary_ is an *unordered* collection of 'items' (numbers, strings, etc.). You access items in a dictionary in a similar way to how you access a real dictionary: you have a 'key' (i.e. the word for which you want the definition) and you use this to look up the 'value' (i.e. the definition of the word).

There's obviously a lot more to lists and dictionaries than this, but it's a good starting point: lists = ordered, dictionaries = unordered.
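
To make the 'ordered vs. unordered' idea concrete, here is a minimal sketch. The variable names are made up for illustration, the dictionary syntax (curly braces) is only properly introduced in the next notebook, and the code uses `print(...)` with parentheses so that it should run under both Python 2 and Python 3:

```python
# Two lists with the same items in a different order are NOT equal...
shoppingList = ['milk', 'eggs', 'bread']
reorderedList = ['bread', 'milk', 'eggs']
print(shoppingList == reorderedList)   # False

# ...but two dictionaries with the same key/value pairs ARE equal,
# no matter what order we wrote the pairs in.
prices = {'milk': 1.10, 'eggs': 2.50}
reorderedPrices = {'eggs': 2.50, 'milk': 1.10}
print(prices == reorderedPrices)       # True
```

In other words, position is part of what a list *is*, whereas a dictionary only cares about which key goes with which value.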

Let's start with *lists* in this notebook and we'll continue with *dictionaries* in the following.

### In this Notebook

- Lists
    - Indexing & Slicing
    - List operations
        - Addition and Multiplication
        - You're (not) in the list!
    - List functions
        - insert
        - append
        - index
        - len()
        - range()

----
# Lists

So a *list* is an ordered collection of items that we access by position (what Python calls the _index_). So the first item in a list is always the first item in the list. End of (well, sort of... more on this later). Because lists are a new data type, we create and use them differently from the simple variables we've seen so far. You can always spot a list because it is a series of items separated by commas and grouped together between a pair of square brackets ( *[A, B, C, ..., n]* ).

Here is a list of 4 items assigned to a variable called `myList`:

In [4]:
myList = [1,2,4,5]
print myList

[1, 2, 4, 5]


Lists are pretty versatile: they don't really care what kind of items you ask them to store. So you can use a list to hold elements from all the other data types that we have seen so far! Below we assign a new list to a variable called `myList` and then print it out so that you can see that it 'holds' all of these different types of data:

In [5]:
myList = ['hi I am', 2312, 'mixing', 6.6, 90, 'strings, integers and floats']
print myList

['hi I am', 2312, 'mixing', 6.6, 90, 'strings, integers and floats']


We don't want to get too technical here, but it's important to note one thing: the reason that `myList` can hold all of those different types of data is that it is still just one thing – a list – and not six things (`'hi I am', 2312, 'mixing', 6.6, 90, 'strings, integers and floats'`). Got it? It's like the difference between a person and a crowd: a crowd is one thing that holds many people inside it...
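
If you'd like to check the 'one thing holding many things' idea for yourself, the sketch below is one way to do it. It borrows two functions, `type` and `len`, that are only explained later in this notebook, and `mixedList` is just an illustrative name:

```python
mixedList = ['hi I am', 2312, 'mixing', 6.6]

# The variable holds ONE thing, and that thing is a list...
print(type(mixedList) == list)   # True

# ...which happens to contain four items.
print(len(mixedList))            # 4
```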

## Indexing

To access an element in a list you use an **index**. This is a new and important concept (and it's a term that we're going to use a _lot_ in the future), so try to really get this term into your head before moving forward in the notebook.

### Accessing Elements in a List using Indexes

The index is just the location of the item in the list that you want to access. So let's say that we want to fetch the second item, we access it via the *index notation* like so:

In [6]:
myList = ['hi', 2312, 'mixing', 6.6, 90, 'strings, integers and floats' ]
secondElement =  myList[1]
print secondElement

2312


See what has happened there? We have:

1. Assigned a list with 6 elements to the variable `myList`.
2. Accessed the second element by writing that element's index between a pair of square brackets (next to the list's name). 
3. Assigned the second element to another variable called `secondElement`.
4. Told Python to `print` the value of this new variable.

Wait a sec – didn't we say *second* element? Why then is the index 1???

### Zero-Indexing

Good observation! That's because list indexes are *zero-based*: this is a fancy way to say that the count starts from 0 instead of from 1. So the first element has index 0, and the last element has index _n-1_ (i.e. the count of the number of items in the list [_n_] minus one).

To recap:

In [8]:
myNewList = ['first', 'second', 'third']
print "The first element is: " + myNewList[0]
print "The third element is: " + myNewList[2]

The first element is: first
The third element is: third


### Negative Indexing

Since programmers are lazy, they also have a short-cut for accessing the _end_ of a list. Since positive numbers count from the start of a list, negative numbers count from the end:

In [10]:
print myNewList[-1]
print myNewList[-2]

third
second


You wouldn't write `myNewList[-0]` to get at the last element because `-0` is the same as `0`, so `myNewList[-0]` is the _same_ as `myNewList[0]`, which is the first item in the list. You can remember it this way: the last item in the list is at _n-1_ (where _n_ is the number of items in the list), so `...[-1]` is a sensible way to access the last item.
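
The relationship between negative indexes and _n-1_ can be sketched as follows. The list `seasons` is a made-up example, `len` (which gives the number of items) is covered later in this notebook, and `print(...)` is written with parentheses so the code should run under both Python 2 and Python 3:

```python
seasons = ['spring', 'summer', 'autumn', 'winter']
n = len(seasons)   # n is 4

print(seasons[-1] == seasons[n - 1])   # True: both are 'winter'
print(seasons[-2] == seasons[n - 2])   # True: both are 'autumn'
print(seasons[-0] == seasons[0])       # True: -0 is just 0, so both are 'spring'
```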

#### A challenge for you!

In [1]:
# print the 'second' element in the list
print "The second element is :" + myNewList[???]

SyntaxError: invalid syntax (<ipython-input-1-5ee299ed6e5f>, line 2)

### Index Out of Range

What happens when you try to access an element that doesn't exist? 

We know that `myNewList` has 3 elements, so what if we try to access the 200th element in the list? In that case Python, as usual, will inform us of the problem using an *error message*:

In [20]:
print myNewList[200]

IndexError: list index out of range

#### A challenge for you!

Do you remember the past lesson on *syntax errors* and *exceptions*? What is the error message displayed in the code above? Is it an *exception* or a *syntax error*? Can you find the explanation for what's going on in the [Official Documentation](https://docs.python.org/2/tutorial/errors.html)?

## A String is a List?

Even if you didn't realise it, you have already been working with lists a _bit_ because *strings* are (_basically_) lists! Think about it this way: strings are an ordered sequence of characters because 'hello' and 'olhel' are very different words! It turns out that characters in a string can be accessed the same way we'd access a generic list.

In [2]:
myString = "ABCDEF"
print myString[0]
print myString[-1]

A
F


## Slicing

If you want to access more than one item at a time, then you can specify a _range_ using two indexing numbers instead of just one. If one number pulls out one element of the list, then it stands to reason that using two numbers gives us... what?

The start and end of a _group_ of items. This operation is called *list slicing*, and keep in mind that indexes start from 0!

_Note_: remember too that the error above when we tried to get the 200th element was _index out of range_. So 'range' is how Python talks about more than one list element.

In [3]:
shortSentence = "Now I'll just print THIS word, between the 20th and the 25th character: "
print shortSentence[20:25] 

THIS 
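
Slicing works on ordinary lists in exactly the same way. The sketch below uses a made-up list called `weekdays` to highlight one detail that is easy to miss: the slice starts *at* the first index but stops just *before* the second one.

```python
weekdays = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']

# Start at index 2 ('Wed') and stop BEFORE index 4 ('Fri')
midweek = weekdays[2:4]
print(midweek)        # ['Wed', 'Thu']

# The number of items you get back is simply end - start
print(len(midweek))   # 2

# Slicing copies items out; the original list is left alone
print(weekdays)       # ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
```

That is also why `shortSentence[20:25]` above returns five characters (positions 20, 21, 22, 23 and 24) and not six.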


#### A challenge for you!

In [5]:
# print from the second to the fourth (inclusive) 
# characters in the following string:
shortSentence2 = "A12B34c7.0"
print shortSentence2[???:???] 

SyntaxError: invalid syntax (<ipython-input-5-4958f913e9df>, line 4)

To print the entirety of a list from a starting position onwards, just skip the final index:

In [7]:
stringToPrint = "I will print from HERE onwards"
print "Starting from the 17th position: " + stringToPrint[17:] 

Starting from the 17th position:  HERE onwards


Notice that there are _two_ spaces between "position:" and "HERE" in the printout above? That's because the 17th character is a space. Let's make it a little more obvious:

In [8]:
print "Starting from the 17th position: '" + stringToPrint[17:] + "'"

Starting from the 17th position: ' HERE onwards'


Got it?

#### A challenge for you!

Now, combining what we've seen above, how do you think you would print everything _up to the 8th character from the end_ (which is the space between "HERE" and "onwards")?

You'll need to combine:
1. Negative indexing
2. List slicing

There are two ways to do it, one way uses only one number, the other uses two. Why don't you try to figure them both out? For 'Way 2' below the `???` is a placeholder for a full slicing operation since if we gave you more of a hint it would make it too easy.

In [9]:
print "Up to the 8th character from the end (Way 1): '" + stringToPrint[???:???] + "'"
print "Up to the 8th character from the end (Way 2): '" + ??? + "'"

SyntaxError: invalid syntax (<ipython-input-9-4fd5a77cde30>, line 1)

Strings also have plenty of methods that might prove to be quite useful in the future; for a fuller overview check out [this reference](https://en.wikibooks.org/wiki/Python_Programming/Variables_and_Strings).

## List operations

So far, we've only created a list, but just like a real to-do list most lists don't usually stay the same throughout the day (or the execution of your application). Their _real_ value comes when we start to change them: adding and removing items, updating an existing item, concatenating several lists (i.e. sticking them together), etc.

### Replacing an item

Here's how we replace an item in a list:

In [11]:
myNewList = ['first', 'second', 'third']
print myNewList

# This replaces the item in the 2nd position
myNewList[1] = 'new element'

print myNewList

['first', 'second', 'third']
['first', 'new element', 'third']


This shouldn't surprise you too much: it's just an assignment (via `=`) after all! 

So if you see `list[1]` on the right side of the assignment (the "=") then we are _reading from_ a list, but if you see `list[1]` on the left side of the assignment then we are _writing to_ a list.

Here's an example for a (small) party at a friend's place:

In [12]:
theParty = ['Bob','Doug','Louise','Nancy','Sarah','Jane']
print theParty

theParty[1] = 'Phil' # Doug is replaced at the party by Phil
print theParty

theParty[0] = theParty[1]
print theParty # Phil is an evil genius and manages to replace Bob with a Phil clone

['Bob', 'Doug', 'Louise', 'Nancy', 'Sarah', 'Jane']
['Bob', 'Phil', 'Louise', 'Nancy', 'Sarah', 'Jane']
['Phil', 'Phil', 'Louise', 'Nancy', 'Sarah', 'Jane']


Got it?

### Addition and Multiplication

You can also operate on entire lists at one time, rather than just on their elements individually. For instance, given two lists you might want to add them together like so:

In [13]:
britishProgrammers = ["Babbage", "Lovelace"]
nonBritishProgrammers = ["Torvald", "Knuth"]

famousProgrammers = britishProgrammers + nonBritishProgrammers
print famousProgrammers

['Babbage', 'Lovelace', 'Torvald', 'Knuth']


You can even multiply them, although in this particular instance it is kind of pointless:

In [14]:
print britishProgrammers * 2

['Babbage', 'Lovelace', 'Babbage', 'Lovelace']
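
Multiplication is more useful than it looks here. A common use (sketched below with made-up variable names) is to create a list of a known size filled with a starting value, and the same trick works on strings:

```python
# Five scores, all starting at zero
emptyScores = [0] * 5
print(emptyScores)    # [0, 0, 0, 0, 0]

# Strings can be multiplied too, e.g. to draw a divider line
divider = '-' * 20
print(divider)
print(len(divider))   # 20
```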


#### A challenge for you!

In [15]:
# Check the syntax to properly define a new list 
otherNonBritishProgrammers = ["Wozniak" ??? "Van Rossum"]

# Then print out all the non british programmers
print nonBritishProgrammers ??? otherNonBritishProgrammers

SyntaxError: invalid syntax (<ipython-input-15-27d852a1d1e3>, line 2)

###  You're (not) in the list!

Ever stood outside a party and been told "You're not on/in the list"? Well, Python is like that too. In fact, Python tries as hard as possible to be _like_ English – this isn't by accident, it's by design – and once you've done a _bit_ of programming in Python you can start to guess how to do something by thinking about how you might say it in English. 

So if you want to check if an item exists in a list you can use the **in** operator:

```python
element in list
```

The **in** operator will return `True` if the item is present, and `False` otherwise.

In [19]:
print ('Lovelace' in britishProgrammers)
print ('Lovelace' in nonBritishProgrammers)

letters = ['a','b','c','d','e','f','g','h','i']
print ('e' in letters)
print ('z' in letters)

True
False
True
False


Likewise, if you want to check if an item does not exist in a list then you can use the **not in** operator. Let's go back to our party:

In [20]:
print theParty
print 'Bob' not in theParty
print 'Jane' not in theParty

['Phil', 'Phil', 'Louise', 'Nancy', 'Sarah', 'Jane']
True
False


#### A challenge for you!

In [22]:
# Complete the missing bits so that 
# we print out Ada Lovelace, her full name.
firstProgrammerSurnames = ["Babbage", "Lovelace"]
firstProgrammerNames    = ["Charles", "Ada"]

firstProgrammerSurnames[1] = firstProgrammerNames[1] + " " + firstProgrammerSurnames[1]

print "Lady "+ ???[1] +" is considered to be the first (woman) programmer" 

SyntaxError: invalid syntax (<ipython-input-22-5197da96e5b3>, line 8)

_Note_: Actually, Lady Ada Lovelace is a [fascinating person](https://en.wikipedia.org/wiki/Ada_Lovelace): she isn't just the first female programmer, she was the first programmer full-stop. For many years Charles Babbage got all the credit simply because he was a guy, but lately we realised that Ada was the person who actually saw that Babbage hadn't just invented a better abacus, he'd invented a general purpose computing device! She was so far ahead of her time that the programme she invented in her head couldn't even run on Babbage's 'simple' (i.e. remarkably complex for the time) computer, but it is now recognised as the first computer algorithm. As a result, there is now a day in her honour every year that is celebrated around the world at places like Google and Facebook, as well as at King's and MIT, because we want to recognise the fundamental contribution to computing made by women programmers. They were long overlooked by the men who thought that the hard part was the machine, not the programming (which they erroneously thought was just like typing).

## Extending Lists

We've already seen that we can combine two lists using the `+` operator, but if you wanted to add to your list constantly, having to do something like this would be annoying:
```python
myList = [] # An empty list

myList = myList + ['New item']
myList = myList + ['Another new item']
myList = myList + ['And another one!']

print myList
```
Not just annoying, but also hard to read! So there's an easier way to write this in Python:
```python
myList = [] # An empty list

myList.append('New item')
myList.append('Another new item')
myList.append('And another one!')

print myList
```

Why don't you try typing it all in the coding area below? You'll get the same answer either way, but one is faster to write and easier to read!

Appending to a list using `append(...)` is actually using something called a _function_. We've not really seen this concept before and we're not going to go into it in enormous detail here (there's a whole other notebook to introduce you to functions). The things that you need to notice at _this_ point in time are:

1. That square brackets ('[' and ']') are used for list indexing.
2. That parentheses ('(' and ')') are (normally) used for function calls.

The best thing about functions is that they are like little packages of code that can do a whole bunch of things at once (e.g. add an item to a list by modifying the list directly), but you only need to remember to write `append(...)`. 

What did I mean just now by 'modifying the list directly'? Notice that in the first example above we had to write:
```python
myList = myList + ['New item']
``` 
because we had to write the result of concatenating two lists together back to a variable, but in the second example we could just write: 
```python 
myList.append('New item')
```
and the change was made to `myList` directly! 
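
The difference between the two approaches can be sketched like this. The names `guests`, `biggerParty` and `result` are made up for illustration, and `print(...)` is written with parentheses so the code should run under both Python 2 and Python 3:

```python
guests = ['Bob']

# '+' builds a brand new list and leaves 'guests' untouched
biggerParty = guests + ['Louise']
print(guests)        # ['Bob']
print(biggerParty)   # ['Bob', 'Louise']

# append() changes 'guests' itself and hands back nothing (None)
result = guests.append('Nancy')
print(guests)        # ['Bob', 'Nancy']
print(result)        # None
```

Because `append` hands back `None`, writing `guests = guests.append('Nancy')` is a classic slip: it would replace your list with `None`.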

## Other List Functions

There are many functions that can be applied to lists such as: `range`, `len`, `insert` and `index`. 

You tell Python to *execute a function* by specifying the function's name, followed by a set of parentheses. The parentheses also serve to contain the optional input (the thing for the function to _operate on_). 

The functions `range` and `len` are good examples:
```python
len(theParty)
range(10)
```
Here, the function `len` (lazy short-hand for _length_) is _passed_ the list `theParty` in order to do its magic, while `range` is passed a number (it can't work directly on a list).

The functions `append`, `insert` and `index` are used a bit differently and have to be _called_ using theParty list. We're at risk of joining Alice down the rabbit-hole here, so let's just leave it at: the second set of functions are known as *methods* of the list *class*. We'll stop there. 

<img src="https://www.washingtonpost.com/blogs/answer-sheet/files/2013/01/alice-falling-down-rabbit-hole1.jpg" width="250" />

In order to use a _method_ you always need to 'prepend' the name of the list you want to act upon, like so:

```python
theParty.append("New Guest")
theParty.insert(2, "Anastasia")
```

The logic here is that methods are associated with specific types of things (such as lists) because you can't append something to, for instance, a number, but you can use `len` on a list, on a string, perhaps even on an integer? 

### Append

Reminder: here's appending...

In [23]:
britishProgrammers = ['Lovelace']
britishProgrammers.append("Turing")
print britishProgrammers

['Lovelace', 'Turing']


### Insert

That's cool, but as you noticed `append` only ever inserts the new item as the last element in the list. What if you want it to go somewhere else?

With `insert` you can also specify a position:

In [25]:
print nonBritishProgrammers
nonBritishProgrammers.insert(1, "Swartz")
print nonBritishProgrammers

['Torvald', 'Knuth']
['Torvald', 'Swartz', 'Knuth']
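
To see how the position argument behaves at the two ends of a list, here is a small sketch with a made-up list called `queue`:

```python
queue = ['Louise', 'Nancy']

# Position 0 puts the new item at the very front
queue.insert(0, 'Bob')
print(queue)   # ['Bob', 'Louise', 'Nancy']

# Using the length of the list as the position puts the item
# at the very end, which has the same effect as append()
queue.insert(len(queue), 'Jane')
print(queue)   # ['Bob', 'Louise', 'Nancy', 'Jane']
```

Everything from the chosen position onwards is shuffled one place to the right to make room; nothing is overwritten.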


### Index

Lastly, with the `index` method you can easily ask Python to find the position (index) of a given item:

In [26]:
# Say you want to know where "Knuth" is 
# in the list of non-British programmers...
print nonBritishProgrammers.index("Knuth")

2
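
Two details about `index` are worth knowing, sketched below with a made-up `guests` list: if an item appears more than once you get the position of the *first* match, and asking for an item that isn't there causes an error, so it can be worth checking with `in` first.

```python
guests = ['Phil', 'Phil', 'Louise', 'Nancy']

# 'Phil' appears twice, but we only get the first position
print(guests.index('Phil'))     # 0
print(guests.index('Louise'))   # 2

# 'Doug' isn't in the list, so guests.index('Doug') would
# stop with a ValueError; 'in' lets us check safely.
print('Doug' in guests)         # False
```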


#### A challenge for you!

Add the famous [Grace Hopper](https://en.wikipedia.org/wiki/Grace_Hopper) (inventress of the first compiler!) to the list of non-British programmers, and then print her index position:

In [27]:
nonBritishProgrammers.???("Hopper") 
print nonBritishProgrammers.???("Hopper")

SyntaxError: invalid syntax (<ipython-input-27-7ad5a11775ad>, line 1)

### Length

Cool, so those were some of the *methods you can invoke on* a list. Let's focus now on some *functions* that take lists as an _input_.

With the function `len` you can immediately know the `len`-gth of a given list:

In [28]:
print len(britishProgrammers)

2


In [32]:
length_of_list = len(nonBritishProgrammers)
print "There are " + str(length_of_list) + " elements in the list of non-British Programmers"

There are 3 elements in the list of non-British Programmers


Did you see the `str(length_of_list)`? There's another function! We didn't draw attention to it before, we just told you to use `str(5.0)` to convert the float `5.0` to the string `"5.0"`. We can tell it's a function because it uses the format `functionName(...some input...)`. So the function name is `str` (as in _convert to string_) and the input is a number (in this case it's the length of the list `nonBritishProgrammers`). So now we can easily convert between different types of data: we took an integer and printed it out as a string. You can check that `length_of_list` really is an integer by adding the following line to the code above:
```python
print length_of_list + 1
```
So `length_of_list` is a number, and by calling `str(length_of_list)` we changed it to a string that we could print out. Given that programmers are lazy, can you guess how you'd convert the string "3" to the _integer_ 3?

#### A challenge for you!

In [33]:
# complete the missing bits
length_of_brits = ???(britishProgrammers)
print "There are " ??? " British programmers."

SyntaxError: invalid syntax (<ipython-input-33-7437217d5617>, line 2)

### Range

The function `range` is used instead to create a sequence of values. It accepts *at least one* parameter as input and (in Python 2, which this notebook uses) always returns a list as output. It's quite handy, for instance, when you need a list of ordered numbers:

In [34]:
range(10)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [35]:
# To check if the output of range()
# is a list we can use the type() 
# function.
type(range(10))

list

Notice that before I said that it accepts "*at least one parameter*" as input. If you look at the output you'll notice that the function assumes that you want to start from 0. This is laziness again: for a lot of things that programmers do they want to start from 0 and count up to some number. So the _default_ behaviour for `range(n)` is to produce a range from 0 up to (but not including) _n_.

To define a lower boundary for a range, you have to provide *two input parameters*:

In [36]:
# See how we get a list between 
# any pair of integers?
range(5,10)

[5, 6, 7, 8, 9]

In fact, `range` accepts even a third parameter that allows us to specify a 'step' for the sequence:

In [37]:
# I want a sequence of numbers starting at -5 and
# stopping before 20, with a step of 5
range(-5,20,5)

[-5, 0, 5, 10, 15]
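
Here are two further uses of `range`, as a rough sketch with made-up variable names. The call is wrapped in `list(...)`, which changes nothing in Python 2 but should let the same code produce a list under Python 3 as well:

```python
guests = ['Bob', 'Louise', 'Nancy']

# Combine range() and len() to get every valid index of a list
positions = list(range(len(guests)))
print(positions)   # [0, 1, 2]

# A negative step counts downwards (and still stops BEFORE the end value)
countdown = list(range(10, 0, -2))
print(countdown)   # [10, 8, 6, 4, 2]
```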

#### A challenge for you!

What do you think will be the output of this code?

In [38]:
sequence_of_numbers = range(5,30,5)
print(sequence_of_numbers[2])

15


# Code (Applied Geo-example)

Let's continue our trips around the world! This time though, we'll do things better, and instead of using a simple URL, we are going to use a real-world geographic data type that you can use on a web-map or in your favourite GIS software.

If you look down below at the `KCL_position` variable you'll see that we're assigning it an apparently complex and scary data structure. Don't be afraid! If you look closely enough you will notice that it is just made out of the "building blocks" that we've seen so far: `floats`, `lists`, `strings`... all wrapped comfortably in a cozy `dictionary`!

This is simply  a formalised way to represent a *geographic marker* (a pin on the map!) in a format called `GeoJSON`.
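
To show how those building blocks nest inside one another, here is a minimal sketch using a made-up marker (the name and the rounded coordinates are purely illustrative, and the square-bracket-with-a-key syntax for dictionaries is explained properly in the next notebook):

```python
marker = {
    "geometry": {"type": "Point", "coordinates": [-0.12, 51.51]},
    "properties": {"name": "A made-up place"}
}

# Look up a key in the dictionary, then a key inside THAT dictionary...
coordinates = marker["geometry"]["coordinates"]

# ...and what comes back is an ordinary list that we index as usual
print(type(coordinates) == list)      # True
print(coordinates[0])                 # -0.12
print(coordinates[1])                 # 51.51
print(marker["properties"]["name"])   # A made-up place
```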

According to the awesome [Lyzi Diamond](https://twitter.com/lyzidiamond?lang=en-gb):

>[GeoJSON](http://geojson.org/geojson-spec.html) is an open and popular geographic data format commonly used in web applications. It is an extension of a format called [JSON](http://json.org), which stands for *JavaScript Object Notation*. Basically, JSON is a table turned on its side. GeoJSON extends JSON by adding a section called "geometry" such that you can define coordinates for the particular object (point, line, polygon, multi-polygon, etc). A point in a GeoJSON file might look like this:

    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          -122.65335738658904,
          45.512083676585156
        ]
      },
      "properties": {
        "name": "Hungry Heart Cupcakes",
        "address": "1212 SE Hawthorne Boulevard",
        "website": "http://www.hungryheartcupcakes.com",
        "gluten free": "no"
      }
    }
    
>GeoJSON files have to have both a `"geometry"` section and a `"properties"` section. The `"geometry"` section houses the geographic information of the feature (its location and type) and the `"properties"` section houses all of the descriptive information about the feature (like fields in an attribute table). [Source](https://github.com/lyzidiamond/learn-geojson)


Now, in order to have our first "webmap", we have to re-create such a `GeoJSON` structure. 

As you can see there are two variables containing the King's College longitude/latitude coordinates. Unfortunately they are in the wrong data type. Also, the variable `longitude` is not included in the list `KCLCoords` and the list itself is not assigned as a value to the `KCLGeometry` dictionary.

Take all the necessary steps to fix the code, using the functions we've seen so far.



In [None]:
# don't worry about the following line
# I'm simply requesting a module from Python
# to have additional functions at my disposal
# which usually are not immediately available
import json

# King's College coordinates
# What format are they in? Does it seem appropriate?
# How would you convert them back to numbers?
longitude = '-0.11596798896789551'
latitude = '51.51130657591914'

# use the appropriate function to insert the item 
KCLCoords = [??? , latitude ]

# how can you assign KCLCoords to the key KCLGeometry["coordinates"]?
KCLGeometry = {
        "type": "Point",
        "coordinates": ???
      }

KCL_position = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "marker-color": "#7e7e7e",
        "marker-size": "medium",
        "marker-symbol": "building",
        "name": "KCL"
      },
      "geometry": KCLGeometry
    }
  ]
}

# OUTPUT
# -----------------------------------------------------------
# I'm just using the "imported" module to print the output
# in a nice and formatted way
print(json.dumps(KCL_position, indent=4))
# here I'm saving the variable to a file on your local machine
with open('my-first-marker.geojson', 'w') as outfile:
    json.dump(KCL_position, outfile, indent=4)

After you've run the code, Python will save a new file called `my-first-marker.geojson` in the folder where you are running the notebook.
Try uploading it to [this website (Geojson.io)](http://geojson.io/#map=2/20.0/0.0) and see what it shows!
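
If the website complains about your file, one way to investigate is to read the file back into Python and check that you get a dictionary again. The sketch below is self-contained and deliberately does not use the exercise's data: the `marker` dictionary, the file name `test-marker.geojson` and the temporary folder are all stand-ins made up for illustration.

```python
import json
import os
import tempfile

# A tiny stand-in for KCL_position (hypothetical data, not the exercise answer)
marker = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
    "properties": {"name": "Null Island"}
}

# Write it to a throwaway folder so we don't clutter the notebook folder
path = os.path.join(tempfile.mkdtemp(), 'test-marker.geojson')
with open(path, 'w') as outfile:
    json.dump(marker, outfile, indent=4)

# Read it back: if the file holds valid JSON we get a dictionary again
with open(path) as infile:
    loaded = json.load(infile)

print(type(loaded) == dict)                # True
print(loaded["geometry"]["coordinates"])   # [0.0, 0.0]
```

If `json.load` gives you back a string rather than a dictionary, the data was probably converted to text twice before being saved.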

**Congratulations on finishing your fifth notebook!**


### Further references:

General list of resources
- [Awesome list of resources](https://github.com/vinta/awesome-python)
- [Python Docs](https://docs.python.org/2.7/tutorial/introduction.html)
- [HitchHiker's guide to Python](http://docs.python-guide.org/en/latest/intro/learning/)
- [Python for Informatics](http://www.pythonlearn.com/book_007.pdf)
- [Learn Python the Hard Way - Lists](http://learnpythonthehardway.org/book/ex32.html)
- [Learn Python the Hard Way - Dictionaries](http://learnpythonthehardway.org/book/ex39.html)
- [Codecademy](https://www.codecademy.com/courses/python-beginner-en-pwmb1/0/1)

