# 1.2 Dictionaries

*Estimated time for this notebook: 10 minutes*

## 1.2.1 The Python Dictionary

Python supports a container type called a dictionary.

This is also known as an "associative array", "map" or "hash" in other languages.

In a list, we use a number to look up an element:

In [1]:
names = "Martin Luther King".split(" ")

In [2]:
names[1]

'Luther'

In a dictionary, we look up an element using **another object of our choice**:

A dictionary literal is written in curly brackets. Each entry starts with a **key**, followed by a colon `:` and then the **value** associated with that key.

In [3]:
me = {"name": "James", "age": 39, "Jobs": ["Programmer", "Teacher"]}

In [4]:
me  # the dictionary keeps all of this information together

{'name': 'James', 'age': 39, 'Jobs': ['Programmer', 'Teacher']}

In [6]:
me["Jobs"]  # look up this person's jobs; the value is a list of strings

['Programmer', 'Teacher']

In [8]:
me["age"]  # any other key is looked up in the same way

39
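
Looking up a key that is not in the dictionary raises a `KeyError`. As a minimal sketch, reusing the `me` dictionary from above (the `"height"` key and its value are made up for illustration), the `get` method returns a default instead, and assigning with square brackets adds or replaces an entry:

```python
me = {"name": "James", "age": 39, "Jobs": ["Programmer", "Teacher"]}

# get() returns None (or a default we supply) when the key is missing
missing = me.get("height")
fallback = me.get("height", "unknown")

# Square-bracket lookup of a missing key raises KeyError instead
try:
    me["height"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Assigning to a key adds it if it is new, or replaces the old value
me["age"] = 40
me["height"] = 1.8  # hypothetical value, purely for illustration

print(missing, fallback, lookup_failed)
print(me)
```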

In [10]:
type(me)
dir(me)

['__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__ror__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']
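
A few of the public methods listed above can be tried out as follows. This is only an illustrative sketch: the `"city"` and `"pets"` keys are hypothetical additions, not part of the original example.

```python
me = {"name": "James", "age": 39, "Jobs": ["Programmer", "Teacher"]}

# copy() makes a new dictionary (a shallow copy sharing the same values)
snapshot = me.copy()

# update() merges another mapping in, overwriting existing keys
me.update({"age": 40, "city": "London"})  # "city" is a made-up key

# pop() removes a key and returns its value
removed = me.pop("city")

# setdefault() only inserts the default if the key is absent
me.setdefault("Jobs", [])  # key exists, so the value is left alone
me.setdefault("pets", [])  # made-up key, gets the default

print(removed)
print(me)
print(snapshot["age"])  # the copy still holds the old age
```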

In [14]:
me.items()
isinstance("mama", str)

True

### Keys and Values

The things we can use to look up with are called **keys**:

In [9]:
me.keys()  # ask for all the keys in the dictionary

dict_keys(['name', 'age', 'Jobs'])

The things we can look up are called **values**:

In [None]:
me.values()

dict_values(['James', 39, ['Programmer', 'Teacher']])

When we test for containment on a `dict` we test on the **keys**:

In [None]:
"Jobs" in me

True

In [None]:
"James" in me

False

In [None]:
"James" in me.values()

True
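
Keys and values can also be visited together. As a small sketch using the same `me` dictionary, `items()` yields `(key, value)` pairs that can be unpacked in a `for` loop, while iterating over the dictionary itself visits only the keys:

```python
me = {"name": "James", "age": 39, "Jobs": ["Programmer", "Teacher"]}

# items() yields (key, value) tuples, unpacked here into two names
for key, value in me.items():
    print(key, "->", value)

# Iterating over the dictionary itself gives the keys only
keys_seen = [key for key in me]

# The pairs can be collected into a list of tuples
pairs = list(me.items())
```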

### Immutable Keys Only

The way in which dictionaries work is one of the coolest things in computer science:
the "hash table". The details of this are beyond the scope of this course, but we will consider some aspects in the section on performance programming.

One consequence of this implementation is that you can only use **immutable** things (more precisely, hashable objects) as keys.

Can you tell what type the objects in the round brackets `()` are?

In [None]:
good_match = {("Lamb", "Mint"): True, ("Bacon", "Chocolate"): False}

But if you try to use a list as a key, it will not work, because lists are not immutable:

In [None]:
illegal = {["Lamb", "Mint"]: True, ["Bacon", "Chocolate"]: False}

TypeError: unhashable type: 'list'

Remember -- square brackets denote lists, round brackets denote `tuple`s.
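
The restriction on keys can be checked directly with the built-in `hash` function. The following is a rough sketch rather than a full account of hashing; the variable names are chosen only for illustration:

```python
# A tuple of immutable objects is hashable, so it can be a dictionary key
pair = ("Lamb", "Mint")
tuple_hash_ok = isinstance(hash(pair), int)

# A list is mutable and therefore unhashable
try:
    hash(["Lamb", "Mint"])
    list_hash_ok = True
except TypeError:
    list_hash_ok = False

# A tuple that contains a list is not hashable either
try:
    hash((["Lamb"], "Mint"))
    nested_hash_ok = True
except TypeError:
    nested_hash_ok = False

# Converting a list to a tuple gives a usable key
ingredients = ["Bacon", "Chocolate"]
good_match = {tuple(ingredients): False}

print(tuple_hash_ok, list_hash_ok, nested_hash_ok)
print(good_match)
```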

### Dictionary Order

Dictionaries retain the order in which their elements were inserted (in Python versions >= 3.7).

In [None]:
my_dict = {"0": 0, "1": 1, "2": 2, "3": 3, "4": 4}
print(my_dict)
print(my_dict.values())

{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}
dict_values([0, 1, 2, 3, 4])


In [None]:
rev_dict = {"4": 4, "3": 3, "2": 2, "1": 1, "0": 0}
print(rev_dict)
print(rev_dict.values())

{'4': 4, '3': 3, '2': 2, '1': 1, '0': 0}
dict_values([4, 3, 2, 1, 0])


Python does not consider the order of the elements relevant to equality:

In [None]:
my_dict == rev_dict

True
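
To see both behaviours side by side, here is a short sketch with smaller versions of the two dictionaries. It compares their contents and their key order separately, then shows where new and updated keys end up:

```python
my_dict = {"0": 0, "1": 1, "2": 2}
rev_dict = {"2": 2, "1": 1, "0": 0}

# Equality compares the key-value pairs and ignores their order
same_content = my_dict == rev_dict

# Converting to lists of keys exposes the different insertion orders
same_order = list(my_dict) == list(rev_dict)

# A new key goes to the end; updating an existing key keeps its position
my_dict["3"] = 3
my_dict["0"] = "zero"

print(same_content, same_order)
print(list(my_dict))
```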

## 1.2.2 Sets

A set is an unordered collection which cannot contain the same element twice.
We make one by calling `set()` on any iterable, e.g. a list or string.

In [None]:
name = "James Hetherington"
unique_letters = set(name)

In [None]:
unique_letters

{' ', 'H', 'J', 'a', 'e', 'g', 'h', 'i', 'm', 'n', 'o', 'r', 's', 't'}

Or by defining a literal like a dictionary, but without the colons:

In [None]:
primes_below_ten = {2, 3, 5, 7}

In [None]:
type(unique_letters)

set

In [None]:
type(primes_below_ten)

set
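
One catch with the literal syntax is worth a quick sketch: empty curly brackets create an empty dictionary, not an empty set, so an empty set has to be made with `set()`. Repeated elements in a set literal are simply collapsed. The variable names here are illustrative only:

```python
# Empty curly brackets give a dictionary, not a set
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__, type(empty_set).__name__)

# Duplicates in a set literal are stored only once
primes = {2, 3, 5, 7, 7}
print(len(primes))
```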

In [17]:
unique_letters

{' ', 'H', 'J', 'a', 'e', 'g', 'h', 'i', 'm', 'n', 'o', 'r', 's', 't'}

This will be easier to read if we turn the set of letters back into a string, with `join`:

In [16]:
"".join(sorted(unique_letters))  # sorted() fixes the order, which a set does not guarantee

' HJaeghimnorst'

A set has no particular order, but is really useful for checking or storing **unique** values.

Set operations work as in mathematics:

In [18]:
x = set("Hello")
y = set("Goodbye")

In [19]:
x & y  # Intersection

{'e', 'o'}

In [20]:
x | y  # Union

{'G', 'H', 'b', 'd', 'e', 'l', 'o', 'y'}

In [21]:
y - x  # y intersection with complement of x: letters in Goodbye but not in Hello

{'G', 'b', 'd', 'y'}
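
A few further set operations, sketched with the same `x` and `y`. The set `{"G", "y", "m"}` and the name `letters` are made up for this illustration:

```python
x = set("Hello")
y = set("Goodbye")

# Symmetric difference: letters in exactly one of the two sets
only_one = x ^ y

# Subset tests use the comparison operators
vowels_in_x = {"e", "o"} <= x
x_in_y = x <= y

# isdisjoint() is True when the sets share no elements
no_overlap = x.isdisjoint({"G", "y", "m"})

# Sets are mutable: add() and discard() change them in place
letters = set(x)  # work on a copy so that x is left unchanged
letters.add("p")
letters.discard("H")

print(only_one)
print(vowels_in_x, x_in_y, no_overlap)
print(letters)
```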

Your programs will be faster and more readable if you use the appropriate container type for your data's meaning.
Use a set for collections which can't in principle contain the same item twice, and a dictionary for anything
which feels like a mapping from keys to values.
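
As a closing sketch of matching the container to the question being asked, the same string can be summarised with a set ("which letters occur?") and with a dictionary ("how many times does each letter occur?"). The name `letter_counts` is chosen just for this example:

```python
name = "James Hetherington"

# A set answers: which characters occur at all?
unique_letters = set(name)

# A dictionary answers: how often does each character occur?
letter_counts = {}
for letter in name:
    # get() supplies 0 the first time we meet a character
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

print(len(unique_letters))
print(letter_counts["e"])
```

The keys of `letter_counts` are exactly the elements of `unique_letters`, which is one way to see that a set behaves much like a dictionary with keys and no values.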