# Python Data Structures: Dictionaries Pt. 1

## Recap

### What are Data Structures?

As [Wikipedia summarizes](https://en.wikipedia.org/wiki/Data_structure), a data structure is a tool for organizing, managing, and storing data. Data structures make it possible to organize collections of similar information.

- For example, if we have 20 students in a class, we might store the 20 scores for the first exam in a tuple.
  - Now, all of the data for the exam are grouped together in a structured way!
- This makes it possible for us to more easily calculate various statistics, such as the mean, minimum, or maximum.

In [None]:
# BAD PRACTICE --------------------------------------------------------
## This is not a good idea!
score_1 = 0.89
score_2 = 0.74
score_3 = 0.97
score_4 = 0.83
## etc...
## Calculating summary statistics will involve writing a lot of
## code, and it will be tough to make it reusable.

# BETTER PRACTICE -----------------------------------------------------
## Here, we use a tuple to group our exam scores
## together!
exam_scores = (0.89, 0.74, 0.97, 0.83)

## Calculating the summary statistics will be much easier!
from statistics import mean

print(min(exam_scores)) # Minimum
print(max(exam_scores)) # Maximum
print(mean(exam_scores)) # Mean

> It can be helpful to think of a data structure as a "container" which holds information. For example, a filing cabinet is a sort of real-world data structure!

There are many different types of data structures. Each has its own strengths and weaknesses, such as:

- How fast is it to add or remove information from these data structures?
- How fast is it to retrieve information from the structure?

Python's official [Tutorial](https://docs.python.org/3/tutorial/datastructures.html) lists several important built-in data structures. Let's quickly review a few that we've already seen.

### What Data Structures Have we Already Discussed?

You should already be familiar with the following data structures:

- Tuples
  - Immutable
  - Ordered (in a "sequence")
  - Indexed by position
- Lists
  - Mutable
  - Ordered (in a "sequence")
  - Indexed by position
- Sets
  - Mutable
  - Unordered
  - Unique (no repeated values)

We have already discussed some of the key properties of these data structures in this bootcamp. Please see the relevant notebooks for further details, or check the relevant sections of the [Types Page in the Python Documentation](https://docs.python.org/3/library/stdtypes.html) if you want even more information.

### Names, and the Limitations of Tuples and Lists

Let's revisit our example from earlier:

In [None]:
# BAD PRACTICE
## This is not a good idea! ... right?
score_1 = 0.89
score_2 = 0.74
score_3 = 0.97
score_4 = 0.83
## etc...

# BETTER PRACTICE
exam_scores = (0.89, 0.74, 0.97, 0.83)

The first approach to storing information makes it much harder to organize our data. However, it has one advantage which we lose when we put the information in a tuple: each variable name tells us whose score it holds. With the tuple, we no longer know which score belongs to which student!

It might be nice to associate some *name*, or a **key**, with each of the **values** in our data structure. That way, we would be able to associate two different pieces of data with each other, namely:
- The ID of the student (the **key**)
- Their exam score (the **value**)

Luckily, Python provides us with an incredibly powerful built-in data structure for doing exactly this: [Dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).

> See the documentation page for more details: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict

## Overview of Dictionaries

### Dictionaries Map "Keys" to "Values"

Dictionaries are Python's implementation of a [hash table data structure](https://en.wikipedia.org/wiki/Hash_table). They make it easy to pair a **key** with a **value**. This makes it possible for us to group similar pieces of information together *and* give them some sort of name, or ID. For example:

In [None]:
# BAD PRACTICE --------------------------------------------------------
## This is not a good idea!
score_1 = 0.89
score_2 = 0.74
score_3 = 0.97
score_4 = 0.83
## etc...

# GOOD PRACTICE -------------------------------------------------------
## Our scores are now grouped together, AND we still know which student
## earned each score!
exam_scores = {
    "Student 1": 0.89,
    "Student 2": 0.74,
    "Student 3": 0.97,
    "Student 4": 0.83
}

This is similar to how dictionaries work in real life! If we know a **word**, we can look up the **definition**. In Python, if we know the **key**, we can look up the **value**.

### Examples of Key-Value Pairs

We see relationships between keys and values in a myriad of places in the real world. Here are a few examples.

- Matching employee email addresses (keys) with their legal names (values)
- Matching a sports team (key) with their season record (value)
- Matching a TV show (key) with its episode ratings (values)

### Properties of Dictionaries

Dictionaries have the following properties, relevant to our use in this class:

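- **Unordered** (for our purposes)
  - Their keys and values are not indexed by position. Since Python 3.7, dictionaries do preserve insertion order, but you should look items up by key rather than by position.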
- **Unique Keys, Arbitrary Values**
  - A dictionary can only contain one value for a given key.
  - A dictionary can have many keys pointing to the same value.
- **Indexable through a key-based index**
  - They allow you to access their values by using keys.
- **Mutable**
  - They support in-place mutations and changes to their contained values.
  - They support growing or shrinking operations.
- **Heterogeneous**
  - They can store keys and values of different data types and domains.
    - Dictionary values may be mutable or immutable.
    - Dictionary keys **must be hashable**, which usually means immutable.
- **Nestable**
  - Their values can consist of arbitrary data. For example, you may build a dictionary of dictionaries.
- **Iterable**
  - They support iteration, so you can traverse them using a loop or comprehension while you perform operations on **each of their keys**.
- **Unsliceable**
  - Extracting a subset of elements from a dictionary will require using filtering.
- **Combinable**
  - They support merge operations, so you can combine two or more dictionaries.
  - This can be done to update the existing dictionary, or create a new dictionary.
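
As a rough sketch of the last two properties, the cell below merges two dictionaries into a new one and then "slices" the result by filtering. The names `section_a`, `section_b`, `all_scores`, and `high_scores`, along with the scores themselves, are made up for illustration. It uses `{**a, **b}` unpacking; on Python 3.9 and later, `a | b` is another way to do the same merge.

```python
# Two small, made-up dictionaries of exam scores
section_a = {"Student 1": 0.89, "Student 2": 0.74}
section_b = {"Student 2": 0.80, "Student 3": 0.97}

# Create a NEW dictionary by unpacking both (the originals are untouched).
# When a key appears in both, the value from the right-most dictionary wins.
all_scores = {**section_a, **section_b}
print(all_scores)

# "Slicing" by filtering: keep only the scores of at least 0.85
high_scores = {
    name: score
    for name, score in all_scores.items()
    if score >= 0.85
}
print(high_scores)
```

Note that "Student 2" ends up with the score from `section_b`, because keys must be unique.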

Let's look at some examples of how to use dictionaries in Python.

## Basic Dictionary Use

### Construction

There are two main ways to create a Python dictionary:

1. Using Curly Braces (also called a "literal")
2. Using the `dict()` function

Let's see some examples. The following code will create two empty dictionaries, and then populate them with values.

In [None]:
# Constructing a new Dictionary
dictionary1 = {} # Curly braces method
dictionary2 = dict() # Dict method

# Populate the dictionary
dictionary1["key1"] = "value1"
dictionary2["key2"] = "value2"

print(f"Dictionary 1: {dictionary1}")
print(f"Dictionary 2: {dictionary2}")

We'll talk more about the two assignment statements under "Populate the dictionary" in the [Adding Elements](#adding-elements) section.

What if we already have a collection of keys and values? We can use any of the following methods to create the dictionary:

In [None]:
# Keyword argument list
dictionary1 = dict(key1="value1", key2="value2")

# Dictionary literal
dictionary2 = {"key3":"value3", "key4":"value4"}

# List of tuples
dictionary3 = dict([
    ("key5", "value5"),
    ("key6", "value6")
])

# Display the dictionary
print(f"Dictionary 1: {dictionary1}")
print(f"Dictionary 2: {dictionary2}")
print(f"Dictionary 3: {dictionary3}")

### Indexing

We can think of the act of "looking up" a value for a given key as a type of **indexing**. Actually, we've already seen indexing when we looked at lists and tuples! Let's recap how that works.

- Lists and tuples facilitate **indexing** by *ordering* their values.
- We can use the order of the sequence to index by using integers.
  - The index is the *position* of the value in the sequence. For example: 

In [None]:
# Say we have a list of four food items
food_items = ["apple" , "chicken" , "eggs" , "bread"]
# What is the value in position 2? (remember, we start at 0)
food_items[2]

Python's syntax for indexing values from a list/tuple and from a dictionary is identical! However, instead of using a *position* (remember, dictionaries are not indexed by position), we use the **key** to index the **value**.

In [None]:
# Maybe we want to keep track of how many containers we have of each food?
food_items = {
    "apple": 2,
    "chicken": 1,
    "eggs": 4,
    "bread": 2
}
# How many containers of eggs do we have?
food_items["eggs"]

What happens if we try to index on a key which isn't in the dictionary?

In [None]:
# How much candy do we have?
try:
    food_items["candy"]
except KeyError:
    print("Uh oh! This will give you an error!")

We can see that this causes Python to raise a `KeyError`! One way to avoid this problem is to use the `.get()` method. This allows you to specify a default value to return if the key is not present in the dictionary. You could also use a Default Dictionary, which we will discuss later.

In [None]:
# Let's use .get() instead
food_items.get("candy", 0)
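
Here is a small sketch of how `.get()` behaves, reusing the food example (the dictionary is redefined so the cell stands on its own). It shows what happens with and without a default, and that the dictionary itself is never modified.

```python
food_items = {"apple": 2, "chicken": 1, "eggs": 4, "bread": 2}

# With no default, .get() returns None for a missing key
print(food_items.get("candy"))

# With a default, .get() returns that default instead
print(food_items.get("candy", 0))

# For a key that exists, the default is ignored
print(food_items.get("eggs", 0))

# Either way, .get() never adds the key to the dictionary
print("candy" in food_items)
```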

### Iteration

We can treat Python Dictionaries as *iterables*. This means we can loop over them like we can with lists or tuples.

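- You **should not rely on the elements being presented in a certain order**. (Since Python 3.7, iteration follows insertion order, but your logic should generally not depend on it.)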
- Iterating over a dictionary will iterate over the **keys**, not the values!

In [None]:
# Create a Harry Potter dictionary
harry_potter_dict = {
    "Harry Potter": "Gryffindor",
    "Ron Weasley": "Gryffindor",
    "Hermione Granger": "Gryffindor",
    "Draco Malfoy": "Slytherin"
}

# Let's see which house each character belongs to
for character in harry_potter_dict:
    # We can use the key to get the value when we iterate!
    house = harry_potter_dict[character]
    # Print the result
    print(f"{character} belongs to {house} house.")

### Dictionary Size and Checking for Membership

- We can use the `len()` function to determine how many keys are in our dictionary.
- We can use the `key in d` expression to determine whether a key is in a dictionary.

In [None]:
# How many characters are in our dictionary?
print(len(harry_potter_dict))

# Is Harry Potter one of the keys?
print("Harry Potter" in harry_potter_dict)
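
One detail worth sketching: `in` checks the **keys** of a dictionary, not its values. The short example below (with a trimmed-down, redefined dictionary) shows the difference, and uses `.values()` when we really do want to search the values.

```python
harry_potter_dict = {
    "Harry Potter": "Gryffindor",
    "Draco Malfoy": "Slytherin",
}

# "Gryffindor" is a value, not a key, so this is False
print("Gryffindor" in harry_potter_dict)

# To search the values, ask for them explicitly
print("Gryffindor" in harry_potter_dict.values())
```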

### Adding Elements

We can dynamically add elements to a Python dictionary. Let's look at how we can do that.

- Method 1: "Assign" a value to a non-existent key.
- Method 2: Use the `.update()` method to add many keys and values at the same time.
  - The new key-value pairs can be supplied as a dictionary or as an iterable of key-value pairs.
- Method 3: Use the `.setdefault()` method to add a key and value, but *only if the key does not already exist.*

**IMPORTANT**: Methods 1 and 2 will **remap** the existing values for the given keys! If this is not what you want to have happen, you should implement a check where you see whether a key is already in the dictionary before assigning a value (or use Method 3).

In [37]:
# This is a helper function for printing our dictionaries
from pprint import pprint

# Create a Harry Potter dictionary
harry_potter_dict = {
    "Harry Potter": "Gryffindor",
    "Hermione Granger": "Gryffindor"
}

# Custom printer (ignore)
def pretty_print(text):
    print(text)
    pprint(harry_potter_dict)

# Display the dictionary
pretty_print("Starting Dictionary: ")

Starting Dictionary: 
{'Harry Potter': 'Gryffindor', 'Hermione Granger': 'Gryffindor'}


In [38]:
# METHOD 1 ------------------------------------------------------------
harry_potter_dict["Ron Weasley"] = "Gryffindor"

# Display the dictionary
pretty_print("After Method 1:")

After Method 1:
{'Harry Potter': 'Gryffindor',
 'Hermione Granger': 'Gryffindor',
 'Ron Weasley': 'Gryffindor'}


In [39]:
# METHOD 2 ------------------------------------------------------------
# Use a dictionary to update a dictionary
add_characters_1 = {
    "Albus Dumbledore": "Gryffindor",
    "Luna Lovegood": "Ravenclaw"
}

# Use iterables to update a dictionary
add_characters_2 = [
    ["Draco Malfoy", "Slytherin"],
    ["Cedric Diggory", "Hufflepuff"]
]

# Merge dictionaries
harry_potter_dict.update(add_characters_1)
harry_potter_dict.update(add_characters_2)

# Display the dictionary
pretty_print("After Method 2:")

After Method 2:
{'Albus Dumbledore': 'Gryffindor',
 'Cedric Diggory': 'Hufflepuff',
 'Draco Malfoy': 'Slytherin',
 'Harry Potter': 'Gryffindor',
 'Hermione Granger': 'Gryffindor',
 'Luna Lovegood': 'Ravenclaw',
 'Ron Weasley': 'Gryffindor'}


In [40]:
# METHOD 3 ------------------------------------------------------------
# Adding a new character, but only if the key doesn't already exist
harry_potter_dict.setdefault("Rubeus Hagrid", "Gryffindor")

# Let's try to add a character to the wrong house
harry_potter_dict.setdefault("Harry Potter", "Slytherin")

# Display the dictionary (Notice Harry Potter's House hasn't changed)
pretty_print("After Method 3:")

After Method 3:
{'Albus Dumbledore': 'Gryffindor',
 'Cedric Diggory': 'Hufflepuff',
 'Draco Malfoy': 'Slytherin',
 'Harry Potter': 'Gryffindor',
 'Hermione Granger': 'Gryffindor',
 'Luna Lovegood': 'Ravenclaw',
 'Ron Weasley': 'Gryffindor',
 'Rubeus Hagrid': 'Gryffindor'}
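
The "check before assigning" idea mentioned above can be sketched as follows. The dictionary `houses` and the variables `existing` and `added` are hypothetical names used only for this example. It also shows that `.setdefault()` returns whichever value ends up stored under the key.

```python
houses = {"Harry Potter": "Gryffindor"}

# An explicit membership check before assigning: the key already
# exists, so the assignment is skipped and nothing is remapped
if "Harry Potter" not in houses:
    houses["Harry Potter"] = "Slytherin"

# .setdefault() performs the same check for us, and returns the value
# that is stored under the key afterwards
existing = houses.setdefault("Harry Potter", "Slytherin")
added = houses.setdefault("Luna Lovegood", "Ravenclaw")

print(existing)
print(added)
print(houses)
```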


### Removing Elements

We can also dynamically remove elements from a Python dictionary.

- Method 1: Use the `del` keyword.
  - **WARNING:** This will cause Python to raise an error if the key does not exist in your dictionary already!
- Method 2: Use the `.pop()` method.
  - If you specify a default value, this will not raise an error if the key does not exist.

In [43]:
# METHOD 1 ------------------------------------------------------------
if "Ron Weasley" in harry_potter_dict:
    del harry_potter_dict["Ron Weasley"]

# Display the dictionary
pretty_print("After Method 1:")

After Method 1:
{'Albus Dumbledore': 'Gryffindor',
 'Cedric Diggory': 'Hufflepuff',
 'Draco Malfoy': 'Slytherin',
 'Harry Potter': 'Gryffindor',
 'Hermione Granger': 'Gryffindor',
 'Luna Lovegood': 'Ravenclaw',
 'Rubeus Hagrid': 'Gryffindor'}


In [45]:
# METHOD 2 ------------------------------------------------------------
harry_potter_dict.pop("Draco Malfoy", None)

# This key isn't in the dictionary, so if we don't specify a "Default"
# value (specified as None here), Python will raise an error
harry_potter_dict.pop("Bilbo Baggins", None)

# Display the dictionary
pretty_print("After Method 2:")

After Method 2:
{'Albus Dumbledore': 'Gryffindor',
 'Cedric Diggory': 'Hufflepuff',
 'Harry Potter': 'Gryffindor',
 'Hermione Granger': 'Gryffindor',
 'Luna Lovegood': 'Ravenclaw',
 'Rubeus Hagrid': 'Gryffindor'}
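
Unlike `del`, `.pop()` also hands back the value it removed. Here is a minimal sketch with a small, made-up dictionary called `houses`. It also shows the `KeyError` that occurs when no default is given.

```python
houses = {"Draco Malfoy": "Slytherin", "Luna Lovegood": "Ravenclaw"}

# .pop() removes the key and returns its value
removed = houses.pop("Draco Malfoy")
print(removed)

# With a default, a missing key simply returns the default
missing = houses.pop("Bilbo Baggins", None)
print(missing)

# Without a default, a missing key raises a KeyError
try:
    houses.pop("Bilbo Baggins")
except KeyError as error:
    print("KeyError raised for:", error)
```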


### Mutating

Remember, dictionary keys must be *unique*. If we assign a new value to an existing key, the dictionary will remap the relationship!

In [47]:
# Remapping with assignment
harry_potter_dict["Harry Potter"] = "GRADUATED"

# Remapping with .update()
harry_potter_dict.update({
    "Hermione Granger": "GRADUATED",
    "Albus Dumbledore": "INSTRUCTOR",
    "Rubeus Hagrid": "INSTRUCTOR"
})

# Display the dictionary
pretty_print("Harry Potter has now graduated!")

Harry Potter has now graduated!
{'Albus Dumbledore': 'INSTRUCTOR',
 'Cedric Diggory': 'Hufflepuff',
 'Harry Potter': 'GRADUATED',
 'Hermione Granger': 'GRADUATED',
 'Luna Lovegood': 'Ravenclaw',
 'Rubeus Hagrid': 'INSTRUCTOR'}


If the value associated with a key is mutable, we can change the value and the dictionary value will reflect those changes. Here's an example.

In [58]:
# Create a list, which is mutable
fruits = ["apple", "pear", "blueberry"]

# Create a dictionary, where one of the values is the list
food_types = {
    "fruits": fruits
}

# Add a new value to the list
fruits.append("kiwi")
# Do the same thing by indexing into the dictionary
food_types["fruits"].append("banana")

# What does the dictionary contain now?
print(food_types)

{'fruits': ['apple', 'pear', 'blueberry', 'kiwi', 'banana']}
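
If this sharing is not what you want, one option is to store a *copy* of the list instead. The sketch below compares the two approaches. The names `shared` and `independent` are made up for illustration.

```python
fruits = ["apple", "pear"]

# This dictionary stores the SAME list object
shared = {"fruits": fruits}

# This dictionary stores a separate copy of the list
independent = {"fruits": list(fruits)}

# Changing the original list...
fruits.append("kiwi")

# ...is visible through the shared reference, but not through the copy
print(shared["fruits"])
print(independent["fruits"])
```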


### .keys(), .values(), and .items()

Each of these methods will let you access the contents of the dictionary as an iterable *view*.

- `.keys()` will fetch the keys.
- `.values()` will fetch the values.
- `.items()` will return 2-element tuples containing the key and the value.

**N.B.** In Python 3, these methods return lightweight view objects rather than copies. Converting a view to a list (for example, `list(d.items())`) does copy the data, which can be expensive in terms of memory if you have a very large dictionary.

In [None]:
print(harry_potter_dict.keys())

In [None]:
print(harry_potter_dict.values())

In [None]:
print(harry_potter_dict.items())
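
A common use of `.items()` is to loop over keys and values together. The sketch below uses a small, redefined dictionary called `houses`. It also shows that, in Python 3, the object returned by `.keys()` reflects later changes to the dictionary, whereas `list(...)` makes a separate copy.

```python
houses = {"Harry Potter": "Gryffindor", "Luna Lovegood": "Ravenclaw"}

# Unpack each (key, value) tuple directly in the loop
for character, house in houses.items():
    print(f"{character} belongs to {house} house.")

# Grab the keys once, as a view and as a list
keys_view = houses.keys()
keys_snapshot = list(houses)

# Add a new entry afterwards
houses["Draco Malfoy"] = "Slytherin"

print(len(keys_view))      # the view sees the new key
print(len(keys_snapshot))  # the list does not
```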

## Summary

- Dictionaries let you map keys to values.
  - You can use them to group similar collections of data and associate the data with meaningful IDs, such as names.
- Dictionaries are mutable.
  - You can easily add, change, and remove their contents.
- There are often several ways to perform the same task with dictionaries.
  - Check the documentation if you're not sure how to do a specific operation.

# Python Data Structures: Dictionaries Pt. 2

## More Complex Dictionary Use

### Arbitrary Types

So far, we have mainly seen dictionaries where we have used strings as the keys. However, we can use any hashable (which usually means immutable) type as a key in a dictionary, and *any* type as a value. 
- Here's an example of a dictionary where we use:
  - Integers as keys
  - Lists as the values.

In [None]:
november_birthdays = {
    3: ["John"],
    8: ["Harriet", "Shauna"],
    11: ["Davonte"],
    25: ["Elliott"]
}

# Who has a birthday on November 8th?
november_birthdays[8]
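
To illustrate the "hashable" requirement, here is a short sketch using a hypothetical `birthdays` dictionary keyed by `(month, day)`. A tuple is accepted as a key, while a list is rejected with a `TypeError`.

```python
birthdays = {}

# A tuple is hashable, so it works as a key: here, (month, day)
birthdays[(11, 8)] = ["Harriet", "Shauna"]

# A list is mutable and unhashable, so Python refuses to use it as a key
list_key_rejected = False
try:
    birthdays[[11, 8]] = ["Harriet", "Shauna"]
except TypeError as error:
    list_key_rejected = True
    print("TypeError:", error)

print(birthdays)
```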

### Introduction to Nested Data

As a quick demonstration, let's show that we can store arbitrary values in a dictionary.

The cell below creates a dictionary which maps strings to other, distinct dictionaries. Those dictionaries then map strings to lists. So, the structure is:

- Parent Dictionary:
  - Sci Fi Dictionary
    - List of Books by Author
  - Fantasy Dictionary
    - List of Books by Author

If this seems like a lot to take in, we'll spend more time on this topic later. For now, just know that we can nest data structures like this to create a hierarchy of relationships.

In [50]:
# Here, we use several nested dictionaries to organize a collection
# of fiction
from pprint import pprint  # used below to print the nested data

fiction_catalogue = {
    "Science Fiction": {
        "Gibson": ["Burning Chrome", "Neuromancer", "Count Zero", "Mona Lisa Overdrive"],
        "Le Guin": ["The Left Hand of Darkness", "The Telling"],
        "Stephenson": ["Snow Crash"]
    },
    "Fantasy": {
        "Peake": ["Boy in Darkness", "Titus Groan", "Gormenghast", "Titus Alone"],
        "Le Guin": ["A Wizard of Earthsea", "The Tombs of Atuan", "The Farthest Shore"],
        "Jemisin": ["The Fifth Season", "The Obelisk Gate", "The Stone Sky"],
    }
}


# What if we only want Fantasy books?
fantasy_catalogue = fiction_catalogue["Fantasy"]
print("The internal Fantasy Dictionary:")
pprint(fantasy_catalogue)

# OK, let's see what Ursula K. Le Guin wrote.
le_guin_catalogue = fantasy_catalogue["Le Guin"]
print("Fantasy Books Written by Ursula K. Le Guin:")
print(le_guin_catalogue)

The internal Fantasy Dictionary:
{'Jemisin': ['The Fifth Season', 'The Obelisk Gate', 'The Stone Sky'],
 'Le Guin': ['A Wizard of Earthsea',
             'The Tombs of Atuan',
             'The Farthest Shore'],
 'Peake': ['Boy in Darkness', 'Titus Groan', 'Gormenghast', 'Titus Alone']}
Fantasy Books Written by Ursula K. Le Guin:
['A Wizard of Earthsea', 'The Tombs of Atuan', 'The Farthest Shore']
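
We can also chain the lookups in a single expression, or loop over the nested levels. The sketch below uses a smaller, made-up `catalogue` so that it runs on its own. `first_title` and `books_per_genre` are hypothetical names.

```python
catalogue = {
    "Science Fiction": {
        "Le Guin": ["The Left Hand of Darkness", "The Telling"],
    },
    "Fantasy": {
        "Le Guin": ["A Wizard of Earthsea"],
        "Peake": ["Titus Groan"],
    },
}

# Chained indexing: genre -> author -> position in the list
first_title = catalogue["Fantasy"]["Le Guin"][0]
print(first_title)

# Count the books in each genre by walking the nested levels
books_per_genre = {}
for genre, authors in catalogue.items():
    books_per_genre[genre] = sum(len(titles) for titles in authors.values())
print(books_per_genre)
```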


### Heterogeneous Python Dictionaries

So far, we have mainly seen dictionaries which consistently map one *type* of data to another type.

- For example, the key is a string and the value is an integer.

However, Python dictionaries allow you to mix and match within a dictionary. The dictionary below has two entries with the following structure:

- Key: tuple -> Value: list
- Key: string -> Value: dictionary
  - Key: string -> Value: int
  - Key: int -> Value: tuple

In [51]:
messy_dictionary = {
    (1, 2): ["This", "is", "a", "list", "with", "a", "tuple", "key"],
    "dict key": {
        "example_key": 1,
        0: (3, "10")
    } 
}

- We're showing you this to point out that you *can* do it.
- However, you should probably be thinking very carefully before you do something like this.
- Remember, the point of data structures is usually to **group similar pieces of data together**. If you're mixing and matching data types to this degree, your data might not be meaningfully similar enough to store it in one place.
  - If you do this sort of thing, you should be acutely aware of how your code might produce unexpected results.

### Dictionary Comprehensions

You might remember that we can create data structures, such as lists, by using **comprehensions**. We can do the same thing for dictionaries. This allows us to combine iteration and our use of literals to concisely build a dictionary.

The diagram below shows the syntax rules for constructing dictionaries using comprehensions.

![picture](https://raw.githubusercontent.com/gt-cse-6040/skills_oh_week_02/main/Screenshot%202023-01-22%20074704.png)

Let's look at an example.

In [None]:
# We'll create a dictionary which uses the customers as keys
# and assigns a random value to them in the dictionary
from random import randint
customers = ["Alex","Bob","Carol","Dave","Flow","Katie","Nate"]

# The comprehension
discount_dict = {
    customer: randint(1,100)
    for customer in customers
}

print(discount_dict)

{'Alex': 62, 'Bob': 72, 'Carol': 18, 'Dave': 5, 'Flow': 22, 'Katie': 86, 'Nate': 84}


We can also iterate over two data structures simultaneously to create a dictionary with a comprehension.

![picture](https://raw.githubusercontent.com/gt-cse-6040/skills_oh_week_02/main/Screen%2520Shot%25202022-09-09%2520at%25209.20.39%2520AM.png)

In [None]:
# Start by creating two sequences
days = ["Sunday", "Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]
temp_C = [30.5,32.6,31.8,33.4,29.8,30.2,29.9]

# Creating a dictionary of weekly temperatures
# from the list of temperatures and days
# Note that we will cover zip in some detail next week,
# when we also discuss the enumerate function

weekly_temp = {day: temp for (day, temp) in zip(days, temp_C)}

print(weekly_temp)

{'Sunday': 30.5, 'Monday': 32.6, 'Tuesday': 31.8, 'Wednesday': 33.4, 'Thursday': 29.8, 'Friday': 30.2, 'Saturday': 29.9}
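
Two related variations, sketched with a shortened version of the temperature data. When no transformation is needed, `dict(zip(...))` builds the same mapping directly. A comprehension can also carry an `if` clause to filter entries and an expression to transform values. The 31-degree cutoff and the name `hot_days_F` are arbitrary choices for this example.

```python
days = ["Sunday", "Monday", "Tuesday"]
temp_C = [30.5, 32.6, 31.8]

# Pair the two sequences up without a comprehension
weekly_temp = dict(zip(days, temp_C))
print(weekly_temp)

# Keep only the days above 31 degrees C, converted to Fahrenheit
hot_days_F = {
    day: temp * 9 / 5 + 32
    for day, temp in weekly_temp.items()
    if temp > 31
}
print(hot_days_F)
```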


## Dictionary-Like Containers: Counters and Default Dictionaries

### Why Do These Exist?

Dictionaries are very flexible data structures. There are a few common use-cases for them, and it might be nice to have something slightly customized for those purposes.
- These types are provided through Python's built-in [Collections module](https://docs.python.org/3/library/collections.html#). Follow this link for more details.
- We'll be focusing on the following types here:
  - Counters
  - Default Dictionaries
- **N.B.** Counters and Default Dictionaries are just subclasses of dictionaries
  - We can verify this by running the following code cell
  - It shows us that counters and default dictionaries are, in fact, dictionaries

In [70]:
from collections import Counter, defaultdict
print("Is a counter a dictionary?", isinstance(Counter(), dict))
print("Is a default dictionary a dictionary?", isinstance(defaultdict(), dict))

Is a counter a dictionary? True
Is a default dictionary a dictionary? True


**IMPORTANT!** Sometimes, our autograder may require your solution to be a regular dictionary. In these cases, you may still use a counter or defaultdict to solve the problem. You will simply need to cast your container to a dictionary at the end of your solution. You can do this by calling `dict(my_container)`, where `my_container` is your default dictionary or counter.
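
A minimal sketch of that cast is below. The variable names and the sample data are made up. The contents stay the same, and only the type changes.

```python
from collections import Counter, defaultdict

# A counter, cast to a plain dictionary
letter_counts = Counter("banana")
plain_counts = dict(letter_counts)

# A default dictionary, cast to a plain dictionary
groups = defaultdict(list)
groups["a"].append(1)
plain_groups = dict(groups)

print(type(plain_counts), plain_counts)
print(type(plain_groups), plain_groups)
```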

### Counters

[Counters](https://docs.python.org/3/library/collections.html#collections.Counter) allow us to quickly and easily build dictionaries which store the count of elements contained in an iterable.
- For example, suppose we wish to count the number of occurrences of a character in a string.
  - Here's a sample string: `s = "bbbaaaabaaa"`
  - In this case, `'a'` occurs 7 times and `'b'` occurs 4 times.
- Let's say we want to construct a dictionary `count` such that `count['a'] == 7` and `count['b'] == 4`.
  - Method 1 in the cell below does _not_ work! Try uncommenting it to see.
    - We need to initialize the count to 0 for every new unique key.
  - Method 2 works, but is pretty verbose. Do we really have to write all of this every time we want to count elements and store them in a dictionary?

In [61]:
# Defining our string
s = "bbbaaaabaaa"

# METHOD 1 (does not work!) -------------------------------------------
#count = {}
#for c in s:
#    count[c] += 1
#count

# METHOD 2 (works, but pretty long!) ----------------------------------
# Create an empty dictionary
count = {}
for c in s:
    # Check for membership
    if c not in count:
        count[c] = 0
    assert c in count
    # Update the count
    count[c] += 1
count

{'b': 4, 'a': 7}

Counters let us do this automatically. Here's the same task, but by using a counter.

In [75]:
from collections import Counter

# Create the counter
count = Counter(s)
print ('Initial :', count)

# We can add to it by supplying a new iterable and using .update()
count.update('abcdaab')
print ('Updated:', count)

# If a value hasn't occurred, our counter won't throw an error!
print('How many times have we seen the letter "z"? ', count["z"])

Initial : Counter({'a': 7, 'b': 4})
Updated: Counter({'a': 10, 'b': 6, 'c': 1, 'd': 1})
How many times have we seen the letter "z"?  0
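
Counters work on any iterable of hashable items, not just strings. As a small sketch, the cell below counts the words in a made-up sentence and uses the `.most_common()` method to pull out the most frequent ones.

```python
from collections import Counter

# Split a made-up sentence into a list of words
words = "the cat and the hat and the bat".split()

# Count how many times each word occurs
word_counts = Counter(words)

# The two most frequent words, as (word, count) tuples
top_two = word_counts.most_common(2)
print(top_two)
```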


### Default Dictionaries

Sometimes, you might want to create a dictionary which is guaranteed to behave in certain ways when you try to index on a nonexistent key. We can do this with [Default Dictionaries](https://docs.python.org/3/library/collections.html#defaultdict-objects).

- Remember, we can use `.get()` to get a default value.
  - However, we'll need to specify the default value *each time* we try to retrieve a value
  - The default value will *not* be automatically added to the dictionary
- Default Dictionaries let us automatically insert a value into the dictionary when we try to index on a nonexistent key.
  - We do this by giving it a function, which will return some value by default.
  - Let's look at an example.

In [89]:
# Let's create a counter-like dictionary
default_count = defaultdict(int)

# If a key doesn't exist, it will default to 0 and be added to the
# dictionary
for c in s:
    default_count[c] += 1

print(default_count)



# What if we want to create a dictionary which returns a default 
# string?
# Let's assume we have a starting dictionary
harry_potter_dict = {
    "Harry Potter": "Gryffindor",
    "Ron Weasley": "Gryffindor",
    "Hermione Granger": "Gryffindor",
    "Luna Lovegood": "Ravenclaw",
    "Draco Malfoy": "Slytherin",
    "Cedric Diggory": "Hufflepuff"
}
# Now, create a default dictionary
harry_potter_default = defaultdict(lambda: "UNKNOWN!", harry_potter_dict)
pprint(harry_potter_default)
# What happens if we try to index on a nonexistent key?
print("Dumbledore's house is:", harry_potter_default["Albus Dumbledore"])

defaultdict(<class 'int'>, {'b': 4, 'a': 7})
defaultdict(<function <lambda> at 0x7ff09812d000>,
            {'Cedric Diggory': 'Hufflepuff',
             'Draco Malfoy': 'Slytherin',
             'Harry Potter': 'Gryffindor',
             'Hermione Granger': 'Gryffindor',
             'Luna Lovegood': 'Ravenclaw',
             'Ron Weasley': 'Gryffindor'})
Dumbledore's house is: UNKNOWN!
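
Another common pattern is `defaultdict(list)`, which is handy for grouping. The sketch below inverts a small, made-up character-to-house mapping into a house-to-characters mapping. `house_of` and `members_by_house` are hypothetical names.

```python
from collections import defaultdict

house_of = {
    "Harry Potter": "Gryffindor",
    "Ron Weasley": "Gryffindor",
    "Luna Lovegood": "Ravenclaw",
}

# Each new house starts out with an empty list automatically
members_by_house = defaultdict(list)
for character, house in house_of.items():
    members_by_house[house].append(character)

print(dict(members_by_house))
```

Without the default dictionary, we would need to check whether each house was already a key before appending.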


## Memory, Performance, and Limitations

### Memory Considerations

- In a dictionary, you need to store the key *and* the value.
- In a list or tuple, you only need to store the value.

So, depending on how you organize things, your dictionary may require more memory.

If you use a default dictionary, indexing a new key will use more memory (because you are creating a new record). Keep this in mind if you plan on indexing a lot of arbitrary keys!
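
Here is a rough way to see both effects. Note that `sys.getsizeof` reports only the size of the container itself (not the objects it refers to), and the exact numbers depend on your Python version, so treat the output as indicative only. The names below are made up for this sketch.

```python
import sys
from collections import defaultdict

# The same 1,000 values, stored in a list and in a dictionary
as_list = list(range(1000))
as_dict = {i: value for i, value in enumerate(as_list)}
print("list bytes:", sys.getsizeof(as_list))
print("dict bytes:", sys.getsizeof(as_dict))

# Merely LOOKING UP missing keys grows a default dictionary...
lookups = defaultdict(int)
for key in ["x", "y", "z"]:
    lookups[key]
print(len(lookups))

# ...whereas .get() on a regular dictionary leaves it empty
plain = {}
for key in ["x", "y", "z"]:
    plain.get(key, 0)
print(len(plain))
```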

### What Data Structure is Faster?

Python dictionaries allow us to associate a value with a unique key, and then to quickly access this value. It’s a good idea to use them whenever we want to look up a certain Python object by key. We can also use lists for this purpose, **but they are much slower than dictionaries.**

In [84]:
def find_number_in_list(lst, number):
    # `in` already evaluates to True or False
    return number in lst

def find_number_in_dict(dct, number):
    # For a dictionary, `in` checks the keys
    return number in dct

short_list = list(range(100))
long_list = list(range(10000000))

short_dict = {x:x*5 for x in range(1,100)}
long_dict = {x:x*5 for x in range(1,10000000)}

In [85]:
%timeit find_number_in_list(short_list, 99)

416 ns ± 8.28 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [86]:
%timeit find_number_in_list(long_list, 9999999)

51.8 ms ± 240 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [87]:
%timeit find_number_in_dict(short_dict, 99)

73.4 ns ± 0.863 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [88]:
%timeit find_number_in_dict(long_dict, 9999999)

90.6 ns ± 0.873 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


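This is because Python may have to scan the entire list to find what you want, so the search takes longer as the list grows. A dictionary, however, uses the hash of the key to jump (almost) directly to the entry, without going through all of the keys. That is why the lookup time barely changes between the short and the long dictionary.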

**But keep this in mind: dictionaries still use more memory than lists, since they need space for the keys and the lookup table as well, while lists use space only for the values.**
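
The `%timeit` magic only works in IPython/Jupyter. As a sketch of the same comparison in plain Python, the standard-library `timeit` module can be used instead. The container size and repetition count below are arbitrary choices, and the timings will vary from machine to machine.

```python
import timeit

n = 100_000
numbers_list = list(range(n))
numbers_dict = {x: x * 5 for x in range(n)}

# Worst case for the list: the target is the very last element
target = n - 1

# Total time (in seconds) for 100 membership checks on each container
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
dict_time = timeit.timeit(lambda: target in numbers_dict, number=100)

print(f"list: {list_time:.6f} s, dict: {dict_time:.6f} s")
```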

### Limitations

- Dictionaries are not indexed by position. If you need to work with sequences (for example, slicing or positional access), dictionaries may not do what you want them to do.
- Certain scientific computing tasks, like matrix multiplication, can be sped up by using different data structures. Dictionaries may not be a good fit for this sort of work.
- Dictionary keys must be unique.

Generally, try to avoid using a dictionary if a tuple, list, or set will work instead.

## Summary

- Dictionaries can be used to group other data containers, like lists, tuples, and even other dictionaries.
- The [Collections module](https://docs.python.org/3/library/collections.html#) gives us access to Counters and Default Dictionaries.
  - These make common tasks which use dictionaries even easier.
- Dictionaries tend to be much faster than lists and tuples when we want to check for membership, and they make it fast to add or remove an item by its key.
  - However, lists and tuples will be better suited for tasks which can be understood by ordering the elements.