# Notebook-6: Dictionaries

- Michele Ferretti (https://github.com/miccferr); Jon Reades (https://github.com/jreades)

### Lesson Content 

In this lesson we'll continue our exploration of more advanced data structures. Last time we took a peek at a way to represent **_ordered_** collections of items via **_lists_**.

This time we'll use **_dictionaries_** to create **_unordered_** collections (this is only the most obvious distinction, and there's much more to it, but it's a good way to start wrapping your head around the subject).

### In this Notebook

- Creating dictionaries
- Accessing elements of dictionaries
- Methods of dictionaries

## Dictionaries

Dictionaries are another type of data structure that is frequently used in Python. Like lists, the dictionary is also found in other programming languages, sometimes under a different name. For instance, Python dictionaries might be referred to elsewhere as "maps", "hashes", or "associative arrays".

According to the [Official Docs](https://docs.python.org/2/tutorial/datastructures.html#dictionaries):

> It is best to think of a dictionary as an unordered set of key-value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}

In other words, dictionaries are not lists: instead of just a checklist, we now have a _key_ and a _value_. We use the key to find the value. So a generic dictionary looks like this:

```python
theDictionary = {
    key1: value1,
    key2: value2,
    key3: value3,
    ...
}
```
Each key/value _pair_ is linked by a ':', and each pair is separated by a ','. It doesn't really matter if you put everything on new lines (as we do here) or all on the same line. We're just doing it this way to make it easier to read.

Here's a real one:

In [None]:
myDict = {
    "key1": "Value 1",
    "key2": "2nd Value",
    3: "3rd Value",
    "Fourth Key": [4.0, 'Jon']
}
print( myDict )

Did you notice that when we printed out `myDict` it didn't print out the elements of the dictionary in the same order in which we put items into it? That's what we mean when we say that dictionaries are _un_-ordered. (If the order _did_ match, you are probably running Python 3.7 or later, where dictionaries remember the order in which items were added; you still can't look items up by position.) Always remember that you can't rely on indexing into a dictionary like you can with a list. Explaining _why_ this works this way is something you'd encounter in a first year Computer Science course.

And notice too that almost _any_ type of data can go into a dictionary: strings, integers, and floats. There's even a _list_ in there (`[4.0, 'Jon']`)! The _only_ constraint is that the **key must be _immutable_**; this means that about the only thing you can't use as a dictionary key is something like a list or a dictionary, whose contents can be _changed_ after it has been created. _Note:_ this doesn't mean that we can't use variables to look things up, just that we can't have the key's _value_ change.

In [None]:
# This will result in an error
myFaultyDict = {
    ["key1", 1]: "Value 1", 
    "key2": "2nd Value", 
    3: "3rd Value", 
    8.0: [5, 'jon']
}

This faulty dictionary doesn't work because you can't use a list (`["key1",1]`) as a key, even though you can use a list as a _value_. For more on the subject of (im)mutability check out [this SO answer](http://stackoverflow.com/a/8059504). But this is _not_ the same as being unable to use variables to look things up:

In [1]:
# This will *not* result in an error
myDict = {
    "key1": "Value 1", 
    "key2": "2nd Value", 
    3: "3rd Value", 
    8.0: [5, 'jon']
}
myVariable = 8.0
print( myDict[myVariable] )

[5, 'jon']
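
If you really do need a key made up of several parts, an immutable _tuple_ will work where a list won't. The sketch below is only an illustration: the names `pairDict` and `result` are made up for this example, and it uses `try`/`except`, which we haven't covered yet, to catch the error instead of letting the cell 'blow up'.

```python
# A tuple is immutable, so it can be used as a key
pairDict = {
    ("key1", 1): "Value 1",
    ("key2", 2): "2nd Value"
}
print( pairDict[("key1", 1)] )

# A list is mutable, so Python refuses to use it as a key
try:
    pairDict[["key3", 3]] = "3rd Value"
    result = "No error"
except TypeError:
    result = "TypeError: a list can't be a dictionary key"
print( result )
```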


## Accessing Dictionaries

Like lists, we access an element in a dictionary using a 'location' marked out by a pair of square brackets (*[...]*). The difference is that the index is no longer an integer indicating the position of the item that we want to access, but is a *key* in the *key:value* pair:

In [None]:
print( myDict["key1"] )
print( myDict[3]      )

Notice how now we just jump straight to the item we want? We don't need to think about "Was that the fourth item in the list? Or the fifth?", we just use a sensible key and we can ask for the associated value directly.
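
To make the contrast concrete, here is a small sketch that stores the same three surnames once in a list and once in a dictionary. The variable names `pioneerList` and `pioneerDict` are invented for this example.

```python
# With a list we need to remember the position...
pioneerList = ["Babbage", "Lovelace", "Turing"]
print( pioneerList[1] )

# ...with a dictionary we just ask for a sensible key
pioneerDict = {
    "Charles": "Babbage",
    "Ada": "Lovelace",
    "Alan": "Turing"
}
print( pioneerDict["Ada"] )
```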

#### A challenge for you!

How would you print out "2nd Value" from `myDict`?

In [None]:
print( myDict[???] )

When it comes to error messages, `dict`s and `list`s behave in similar ways. If you try to access a dictionary using a key that doesn't exist then Python raises an *exception*. 

What is the name of the exception? Can you find it in the [Official Docs](https://docs.python.org/2/library/exceptions.html)?

In [None]:
print( myDict[99] )

Handy, no? Again, Python's error messages are giving you helpful clues about where the problem it's encountering might be! Up above we had a `TypeError` when we tried to create a key using a list. Here, we have a `KeyError` that tells us something must be wrong with using `99` as a key in `myDict`. In this case, it's that there is no key 99!
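
Because the exception has a name, your code can also react to it rather than stopping. This is a minimal sketch, assuming a cut-down dictionary called `smallDict` (a made-up name) and using `try`/`except`, which we haven't formally introduced yet:

```python
smallDict = {"key1": "Value 1", 3: "3rd Value"}

try:
    # There is no key 99, so this line raises a KeyError...
    print( smallDict[99] )
    message = "Found it!"
except KeyError as err:
    # ...and we end up here; err holds the key Python couldn't find
    message = "There is no key " + str(err)

print( message )
```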

## Creating a Simple Phone Book

One of the simplest uses of a dictionary is as a phone book! One of us did this in BASIC back in 1989 (Yes, James is that old! Much older than Jon who definitely never did anything in BASIC, no sirree!).

So here are some useful contact numbers:
1. American Emergency Number: 911
2. British Emergency Number: 999
3. Icelandic Emergency Number: 112
4. French Emergency Number: 112
5. Russian Emergency Number: 102

Now, how would you create a dictionary that allowed us to look up and print out an emergency phone number based on the [two-character ISO country code](http://www.nationsonline.org/oneworld/country_code_list.htm)? It's going to look a little like this:
```python
eNumbers = {
    ...
}
print( "The Icelandic emergency number is " + eNumbers['IS'] )
print( "The American emergency number is " + eNumbers['US']  )
```

In [None]:
eNumbers = {
    ???
}
print( "The Icelandic emergency number is " + eNumbers['IS'] )
print( "The American emergency number is " + eNumbers['US']  )

## Useful Dictionary Methods

We are going to see in the next couple of notebooks how to systematically access values in a dictionary (amongst other things). For now, let's also take in the fact that dictionaries _also_ have utility *methods* similar to what we saw with the `list`. And as with the list, these methods are functions that only make sense _when_ you're working with a dictionary, so they're bundled up in a way that makes them easy to use.

Let's say that you have forgotten what *keys* you put in your dictionary...

In [2]:
dictionary = {
    "Charles": "Babbage",
    "Ada": "Lovelace",
    "Alan": "Turing"
}

print( dictionary.keys() )

['Charles', 'Alan', 'Ada']


Or maybe you just need to access all of the values without the trouble of asking for each key:

In [3]:
print( dictionary.values() )

['Babbage', 'Turing', 'Lovelace']


Or maybe you even need to get them as pairs:

In [4]:
# Output is a list of key-value pairs!
print( dictionary.items() )

[('Charles', 'Babbage'), ('Alan', 'Turing'), ('Ada', 'Lovelace')]
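
The outputs above come from Python 2, where these methods return plain lists. In Python 3 they return 'view' objects instead, which print slightly differently (e.g. `dict_keys([...])`). A minimal sketch, assuming you want an ordinary list whichever version you're running, is to wrap the result in `sorted()` or `list()`. The names `keyList`, `valueList` and `pairList` are made up for this example.

```python
dictionary = {
    "Charles": "Babbage",
    "Ada": "Lovelace",
    "Alan": "Turing"
}

# sorted() always gives back a list, in alphabetical order
keyList   = sorted( dictionary.keys() )
valueList = sorted( dictionary.values() )
# list() turns the pairs into a list of (key, value) tuples
pairList  = list( dictionary.items() )

print( keyList )
print( valueList )
print( len(pairList) )
```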


### Are You on the List? (Part 2)

As with the `list` data type, you can check the presence or absence of a key in a dictionary, using the *in* / *not in* operators... but note that they only work on keys.

In [5]:
print( "Charles" in dictionary )
print( "Babbage" in dictionary )
print( True  not in dictionary )

True
False
True
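
If it's a _value_ that you're looking for, one option is to run the same test against the output of `values()`. A quick sketch (the variable names `inKeys` and `inValues` are made up for this example):

```python
dictionary = {
    "Charles": "Babbage",
    "Ada": "Lovelace",
    "Alan": "Turing"
}

# 'in' on the dictionary itself only looks at the keys
inKeys   = "Babbage" in dictionary
# 'in' on dictionary.values() looks at the values
inValues = "Babbage" in dictionary.values()

print( inKeys )
print( inValues )
```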


### What Do You Do if You're not on the List?

One challenge with dictionaries is that sometimes we have no real idea if a key exists or not. With a list it's pretty easy to figure out whether or not an index exists because we can just ask Python to tell us the _length_ of the list. So that makes it fairly easy to avoid having the list 'blow up' by throwing an exception.

Rather harder for a dictionary though, so that's why we have the dedicated **`get()`** method: it not only allows us to fetch the *value* associated with a *key*, it also allows us to specify a default value in case the key does not exist:

In [6]:
print( dictionary.get("Lady Ada", "Are you sure you spelled that right?" ) )

Are you sure you spelled that right?


See how this works: the key doesn't exist, but unlike what happened when we asked for `myDict[99]` we don't get an exception, we get the default value specified as the _second_ input to the method `get`. 

So you've learned two things here: that functions (the things marked out by parentheses... don't worry, we'll explain what these are _soon_!) can take more than one input: `get()` takes both the key that we're looking for, and a value to return if Python can't find the key; and that different types (or classes) of data have different methods (there's no `get` for lists).
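
Two more details of `get()` are worth a quick look: the default is optional (leave it out and you get `None` back for a missing key), and asking for a missing key doesn't add it to the dictionary. A small sketch, with made-up variable names:

```python
dictionary = {
    "Charles": "Babbage",
    "Ada": "Lovelace",
    "Alan": "Turing"
}

found    = dictionary.get("Ada", "Unknown")       # key exists: default ignored
fallback = dictionary.get("Lady Ada", "Unknown")  # key missing: default returned
nothing  = dictionary.get("Lady Ada")             # key missing, no default: None

print( found )
print( fallback )
print( nothing )
# The dictionary itself hasn't changed
print( "Lady Ada" in dictionary )
```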

## Lists of Lists, Dictionaries of Lists, Dictionaries of Dictionaries... Oh my!

OK, this is where it's going to get a little weird but you're also going to see how programming is a little like Lego: once you get the building blocks, you can make lots of cool/strange/useful contraptions from some pretty simple concepts.

Remember that a list or dictionary can store _anything_: so the first item in your list could itself _be_ a list! For most people starting out on programming this is the point where their brain starts hurting (it happened to us) and you might want to throw up your hands in frustration thinking "I'm never going to understand this!" But if you stick with it, you will. 

And this is really the start of the power of computation.

### A Data Set of City Attributes

Let's start out with what some (annoying) people would call a 'trivial' example of how a list-of-lists (LoLs, though most people aren't laughing) can be useful. Let's think through what's going on below: what happens if we write `cityData[0]`?

In [7]:
# Format: city, country, population, area (km^2)
cityData = [
    ['London','U.K.',8673713,1572],
    ['Paris','France',2229621,105],
    ['Washington, D.C.','U.S.A.',672228,177],
    ['Abuja','Nigeria',1235880,1769],
    ['Beijing','China',21700000,16411],
]

print( cityData[0] )

['London', 'U.K.', 8673713, 1572]


So how would we access something inside the list returned from `cityData[0]`?

Why not try:
```python
cityData[0][1]
```
See if you can figure out how to retrieve and print the following from `cityData`:
1. France
2. 16411
3. Washington, D.C.

Type the code into the coding area below...

### A Phonebook+

So that's a LoL (list-of-lists). Let's extend this idea to what we'll call Phonebook+ which will be a DoL (dictionary-of-lists). In other words, a phonebook that can _do more_ than just give us phone numbers! We're going to build on the emergency phonebook example above.

In [None]:
# American Emergency Number: 911
# British Emergency Number: 999
# Icelandic Emergency Number: 112
# French Emergency Number: 112
# Russian Emergency Number: 102
eNumbers = {
    'IS': ['Icelandic',112],
    'US': ['American',911],
}
print( "The " + eNumbers['IS'][0] + " emergency number is " + str(eNumbers['IS'][1]) )

See if you can create the rest of the `eNumbers` dictionary and then print out the Russian and British emergency numbers.

### Dictionary-of-Dictionaries

OK, this is the last thing we're going to throw at you today: getting your head around 'nested' lists and dictionaries is _hard_. Really hard. But it's the all-important first step to thinking about data the way that a computer 'thinks' about it. This is really abstract: something that you access by keys, which in turn give you access to other keys... this kind of structure is called _nested_, and it's closely related to an idea called recursion. And it's probably one of the cleverest things about computing. 

Here's a bit of a complex DoD, combined with a DoL, and other nasties:

In [None]:
cityData2 = {
    'London' : {
        'population': 8673713,
        'area': 1572, 
        'location': [51.507222, -0.1275],
        'country': {
            'ISO2': 'UK',
            'Full': 'United Kingdom',
        },
    },
    'Paris' : {
        'population': 2229621,
        'area': 105.4,
        'location': [48.8567, 2.3508],
        'country': {
            'ISO2': 'FR',
            'Full': 'France',
        },
    }
}

Try the following code in the coding area below:
```python
print( cityData2['Paris'] )
print( cityData2['Paris']['country']['ISO2'] )
print( cityData2['Paris']['location'][0] )
```
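
If a long chain of square brackets is hard to read, it can help to take it one step at a time. The sketch below uses a small, separate dictionary (`abujaData`, a made-up name, built from the Abuja figures in the list-of-lists example plus its ISO code) so that it runs on its own:

```python
abujaData = {
    'Abuja' : {
        'population': 1235880,
        'area': 1769,
        'country': {
            'ISO2': 'NG',
            'Full': 'Nigeria',
        },
    }
}

# One step at a time...
city    = abujaData['Abuja']  # a dictionary
country = city['country']     # another dictionary
iso     = country['ISO2']     # finally, a string

# ...gives the same answer as chaining the keys together
print( iso )
print( abujaData['Abuja']['country']['ISO2'] )
```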

Now, figure out how to print:
`The population of Paris, the capital of France (FR), is 2229621. It has a density of 21153.899 persons per square km.`

Do the same for London.

## Code (Applied Geo-example)

Let's continue our trips around the world! This time though, we'll do things better, and instead of using a simple URL, we are going to use a real-world geographic data type, that you can use on a web-map or in your favourite GIS software.

If you look down below at the `KCL_position` variable you'll see that we're assigning it an apparently complex and scary data structure. Don't be afraid! If you look closely enough you will notice that it is just made out of the "building blocks" that we've seen so far: `floats`, `lists`, `strings`... all wrapped comfortably in a cozy `dictionary`!

This is simply a formalised way to represent a *geographic marker* (a pin on the map!) in a format called `GeoJSON`.

According to the awesome [Lyzi Diamond](https://twitter.com/lyzidiamond?lang=en-gb):

>[GeoJSON](http://geojson.org/geojson-spec.html) is an open and popular geographic data format commonly used in web applications. It is an extension of a format called [JSON](http://json.org), which stands for *JavaScript Object Notation*. Basically, JSON is a table turned on its side. GeoJSON extends JSON by adding a section called "geometry" such that you can define coordinates for the particular object (point, line, polygon, multi-polygon, etc). A point in a GeoJSON file might look like this:

    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          -122.65335738658904,
          45.512083676585156
        ]
      },
      "properties": {
        "name": "Hungry Heart Cupcakes",
        "address": "1212 SE Hawthorne Boulevard",
        "website": "http://www.hungryheartcupcakes.com",
        "gluten free": "no"
      }
    }
    
>GeoJSON files have to have both a `"geometry"` section and a `"properties"` section. The `"geometry"` section houses the geographic information of the feature (its location and type) and the `"properties"` section houses all of the descriptive information about the feature (like fields in an attribute table). [Source](https://github.com/lyzidiamond/learn-geojson)
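
Seen from Python, that point is just a dictionary-of-dictionaries with a list inside it, so everything from earlier in this notebook applies. Here is a sketch that copies part of the example above into a variable (`cupcakes`, a made-up name, with a couple of properties left out for brevity) and pulls a few values back out. Note that GeoJSON lists the longitude first and the latitude second.

```python
cupcakes = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-122.65335738658904, 45.512083676585156]
    },
    "properties": {
        "name": "Hungry Heart Cupcakes",
        "gluten free": "no"
    }
}

# GeoJSON stores coordinates as [longitude, latitude]
lon = cupcakes["geometry"]["coordinates"][0]
lat = cupcakes["geometry"]["coordinates"][1]

print( cupcakes["properties"]["name"] )
print( lon )
print( lat )
```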


Now, in order to have our first "webmap", we have to re-create such a `GeoJSON` structure. 

As you can see there are two variables containing King's College's longitude/latitude coordinate position. Unfortunately they are in the wrong data type. Also, the variable `longitude` is not included in the list `KCLCoords` and the list itself is not assigned as a value to the `KCLGeometry` dictionary.

Take all the necessary steps to fix the code, using the functions we've seen so far.



In [None]:
# don't worry about the following line
# I'm simply requesting a module from Python
# to have additional functions at my disposal
# which usually are not immediately available
import json

# King's College coordinates
# What format are they in? Does it seem appropriate?
# How would you convert them back to numbers?
longitude = '-0.11596798896789551'
latitude = '51.51130657591914'

# Set this up as a coordinate pair 
KCLCoords = [??? , latitude ]

# How can you assign KCLCoords to 
# the key KCLGeometry["coordinates"]?
KCLGeometry = {
        "type": "Point",
        "coordinates": ???
      }

KCL_position = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "marker-color": "#7e7e7e",
        "marker-size": "medium",
        "marker-symbol": "building",
        "name": "KCL"
      },
      "geometry": KCLGeometry
    }
  ]
}

# OUTPUT
# -----------------------------------------------------------
# I'm just using the "imported" module to print the output
# in a nice and formatted way
print(json.dumps(KCL_position, indent=4))

# here I'm saving the variable to a file on your local machine
# You should see it if you click on the 'Home' tab in your open
# browser window (it's the one where you started this notebook)
with open('my-first-marker.geojson', 'w') as outfile:
    # dump() writes the dictionary straight to the file as JSON
    json.dump(KCL_position, outfile, indent=4)

After you've run the code, Python will have saved a file called `my-first-marker.geojson` in the folder where you are running the notebook. Try to upload it to [this website (Geojson.io)](http://geojson.io/#map=2/20.0/0.0) and see what it shows!
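
If the map stays empty, one thing worth checking is that the JSON text really does turn back into a dictionary. This is a minimal sketch of that round trip, using a cut-down, made-up marker (`testMarker`) rather than the real `KCL_position`:

```python
import json

# A made-up marker, just for testing the round trip
testMarker = {
    "type": "Feature",
    "properties": {"name": "Test"},
    "geometry": {"type": "Point", "coordinates": [0.0, 51.5]}
}

# dumps() turns the dictionary into a string of JSON text...
asText = json.dumps(testMarker, indent=4)
# ...and loads() turns that text back into a dictionary
backAgain = json.loads(asText)

print( type(asText) )
print( backAgain == testMarker )
```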

**Congratulations on finishing your sixth notebook!**


### Further references:

General list of resources
- [Awesome list of resources](https://github.com/vinta/awesome-python)
- [Python Docs](https://docs.python.org/2.7/tutorial/introduction.html)
- [HitchHiker's guide to Python](http://docs.python-guide.org/en/latest/intro/learning/)
- [Python for Informatics](http://www.pythonlearn.com/book_007.pdf)
- [Learn Python the Hard Way - Lists](http://learnpythonthehardway.org/book/ex32.html)
- [Learn Python the Hard Way - Dictionaries](http://learnpythonthehardway.org/book/ex39.html)
- [Codecademy](https://www.codecademy.com/courses/python-beginner-en-pwmb1/0/1)

