# Synopsis

In this unit we will learn that:

1. **Dictionaries** are **mutable** collections whose elements are accessed using **keys** rather than by position.

    1. Dictionaries are created using the `{}` syntax.
    
    2. Dictionaries are composed of `key, value` pairs.

    3. Each `key, value` pair is called an *item*.
    
    4. `Items` can be added to a dictionary using the built-in method `update()`.
    
    5. `Items` can be changed using assignment.
    
    6. `Items` can be removed using the `del` statement and the method `pop()`.
    
2. Dictionaries allow nesting with all data types.

3. We can access all `items`, `keys`, and `values` in a dictionary.

# Read libraries and functions

In [None]:
from IPython.display import HTML, YouTubeVideo, display
from pathlib import Path

import datetime
import sys


# Videos

In [None]:
vid = YouTubeVideo('XCcpzWs-CI4', width = 600)
display(vid)

# Dictionaries

A `Python` dictionary is an extraordinarily useful data type that expands on the possibilities offered by lists.  In a list one keeps track of the elements by an index that must be an integer.  **Dictionaries keep track of elements by `key`!**

Each **item** in a dictionary has both a **key** and a **value**. You use the `key` to "look up" the `value`. 

This is just like looking up the meaning of a word in a real dictionary. Also, just like in a real dictionary, all of the `keys` **must** be unique. If a `key` appeared multiple times, we wouldn't know which `value` to look up.

Do you remember `sets`?  **The `keys` in a dictionary form a set!**

The syntax to create a dictionary also uses braces, `{}`. When initializing a dictionary, we enter `key-value` pairs separated by commas.  Within each `item`, the `key` is separated from the `value` by a colon, `:`:

> `a_dict = {key : value, another_key : another_value}`
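
As a minimal sketch of this syntax (the keys and values below are made up for illustration and are not taken from the roster files):

```python
# Hypothetical example data, not from the course files
a_dict = {'Name': 'Ada', 'Department': 'Physics', 'Age': 36}

# Look up a value by its key
print(a_dict['Department'])

# An empty pair of braces creates an empty dictionary, not an empty set
empty = {}
print(type(empty))

# Keys are unique: repeating a key in a literal keeps only the last value
repeated = {'Age': 35, 'Age': 36}
print(repeated)
```

Note that `set()` is the way to create an empty set, since `{}` is already taken by dictionaries.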


## What are dictionaries good for?

Great that you would ask! Recall the project involving all the personnel records?  Dictionaries are **the** data type to deal with records.  What are `Date of Birth` and `Age` if not keys? 

Let's retrieve our code so that we can start seeing how great dictionaries are.

In [None]:
def parse_record( filename ):
    """
    
    """
    record = []
    
    # Statements here!
    
    return record

In [None]:
filename = Path.cwd().parent / 'Data' / 'Roster' / 'Agatha_Young_172.txt'

# open file for reading and read the file contents into a list of strings
#


Let's clean up the data by getting rid of useless lines and cleaning up extra characters in good lines.

The first 4 lines are useless, so we can ignore them.

In [None]:
# Copy code from previous code cell



Each line has the structure `field: value`.  We can extract those into, for example, a tuple.

In [None]:
# Copy code from previous code cell


The list of tuples that we create for each student contains all the information we have available. However, it is not particularly easy to access any given field.  

Imagine we want to find the *Department* where *Agatha Young* works. We need to find the `index` of the `tuple` for which the first item equals *Department* and then print the second item of that `tuple`.

In [None]:
cleaned_lines

In [None]:
cleaned_lines[3][1]

Going from a `list` of `tuples` to a `dictionary` can make all the difference! 

Let's transform our list of tuples into a dictionary and check how much easier it is to retrieve the same information.
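
The difference can be sketched as follows, assuming the cleaned lines are stored as a list of `(field, value)` tuples. The name `sample_lines` and its contents are invented for illustration:

```python
# Hypothetical stand-in for the cleaned lines of one roster file
sample_lines = [('Name', 'Jane Doe'),
                ('Department', 'Chemistry'),
                ('Favorite Color', 'Green')]

# With a list of tuples we must search for the field we want
for field, value in sample_lines:
    if field == 'Department':
        print(value)

# dict() accepts an iterable of (key, value) pairs
sample_record = dict(sample_lines)

# With a dictionary the lookup is a single step
print(sample_record['Department'])
```

This is only one possible way to build the dictionary. A loop that assigns each `value` to its `key` works just as well.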

In [None]:
# Copy code from previous code cell



In [None]:
record['Department']

In [None]:
record['Favorite Color']

In [None]:
record['Email Address']

Let's look into the properties of dictionaries.  Dictionaries are **mutable**. You can change the `value` of an `item` by assigning something to the corresponding `key`.

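Dictionaries are **not indexed by position**. In `Python` 3.7 and later, a dictionary keeps its `items` in the order in which they were inserted, but older versions made no such guarantee. Either way, you look things up by `key`, never by position.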

If you ask for a `key` that does not exist, then you will get an error.  Guess what type of error...

To avoid crashing your code, `Python` dictionaries provide a method `.get()` that acts as a fail-safe for missing `keys`: it returns `None`, or a default value of your choosing, instead of raising an error.

In [None]:
record['Favorite Sport']

In [None]:
print( record.get('Department') )
print( record.get('Favorite Sport') )

In [None]:
print( record.get('Department', 'Unknown' ) )
print( record.get('Favorite Sport', 'Unknown') )

If you want to add a new `key-value` pair to the dictionary, you assign a `value` to a new `key`. 

If you want to add multiple `items` to a dictionary, you can put them into a dictionary and then use the built-in method `update()`.

In [None]:
record['Favorite Sport'] = 'Soccer'
print(record)


record.update( {'Favorite Sports Team': 'S. L. Benfica', 
                'Favorite Sports Team Mascot': 'Eagle'} )
print()
print(record)

To remove a `key-value` pair from a `dict` variable, we can use `del` and provide the `key`. Guess what happens if you provide a `key` that does not exist? 

**Notice that if you fail to provide a `key`, as in `del record`, `del` will delete the variable itself and you lose the entire dictionary**. You are unlikely to truly want to do that. 

Alternatively, you can use the built-in method `pop()` and provide a `key`. This method deletes the `item` with `key` and returns the `value`.


In [None]:
del record['Favorite Sports Team Mascot']
print('--')
print( record.pop('Favorite Sports Team') )
print('--')
print( record.pop('Favorite Sport') )
print('--')
del record['Favorite Sports Team Mascot']

In [None]:
record
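
If you are unsure whether a `key` is present, one possible approach is to test membership with `in` before using `del`, or to give `pop()` a second argument that it returns when the `key` is missing. A small sketch with made-up data:

```python
# Hypothetical example data
sample = {'Name': 'Jane Doe', 'Favorite Sport': 'Soccer'}

# Only delete the item if the key is actually there
if 'Favorite Sport' in sample:
    del sample['Favorite Sport']

# The key is gone now, so pop() returns the default instead of raising an error
removed = sample.pop('Favorite Sport', None)
print(removed)
print(sample)
```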

Dictionaries also provide methods that return their contents in bulk. 

You can access all `items`, or all `keys`, or all `values`.

In [None]:
print(record.keys())   # It looks like a list of strings
print()
print(record.items())  # It looks like a list of tuples
print()
print(record.values()) # It looks like a list

Even though all these objects look like lists, they are not lists. They are **view objects**.  This means that you can loop over them and access each element in turn, but you cannot access their elements by index.

In [None]:
for value in record.values():
    print(value)

print('--')
print( type( record.values() ) )
print('--')
print( list(record.values())[1] )
print('--')
print( record.values()[1] )
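
The objects returned by `keys()`, `values()`, and `items()` stay linked to the dictionary they came from. A short sketch with made-up data, showing that they reflect later changes and that `list()` produces an independent copy that can be indexed:

```python
# Hypothetical example data
sample = {'Name': 'Jane Doe', 'Department': 'Chemistry'}

current_keys = sample.keys()     # stays linked to the dictionary
snapshot = list(sample.keys())   # independent copy that supports indexing

# Add a new item after both objects were created
sample['Favorite Color'] = 'Green'

print(len(current_keys))   # counts the new key as well
print(snapshot)            # still holds only the two original keys
print(snapshot[0])
```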

Working with dictionaries can be challenging when you are starting.  

Accessing information by `key` is less natural for some.  Moreover, things can quickly become rather complex when nesting is involved. Keeping track of the elements in a list of dictionaries that contain lists of lists is no easy task.  

As in many other situations, being organized and working out specific cases with pencil and paper can make all the difference.

In order to gain experience with these challenges, let's create a list of dictionaries using the code for processing the roster files. 

In [None]:
def parse_record( filename ):
    """
    Takes a Path object and returns a dictionary with the cleaned
    information retrieved from the file read
    
    inputs:
        filename -- Path object
        
    output:
        record -- dict
    """
    record = {}
    
    # Statements here!
    
    return record

In [None]:
roster_path = ''
my_paths = get_path_to_records( roster_path )

print(my_paths[:4])

records = []
for filename in my_paths[:4]:
    records.append( parse_record(filename) )
#     birthday = extract_birthday( records[-1] )
#     age = calculate_age(birthday, relevant_date)
      
print()
print(records)

`my_paths` is a list of paths, and the `for` loop iterates over them one `filename` at a time.

`records` is a list of dictionaries. Thus, each element in the list is going to be enclosed inside `{}` and separated by commas. Look for `}, {`. Those mark where a dictionary ends and the next begins.

Inside each dictionary, we have `key: value` pairs separated by commas. 

If you can read them, you can write a command that accesses them!
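
As a sketch of how such access commands are built up, here is a small list of dictionaries. The names, fields, and values are invented for illustration and do not come from the roster files:

```python
# Hypothetical list of dictionaries, similar in shape to `records`
sample_records = [
    {'Name': 'Jane Doe', 'Department': 'Chemistry', 'Hobbies': ['chess', 'running']},
    {'Name': 'John Roe', 'Department': 'Physics', 'Hobbies': ['cooking']},
]

# First pick the dictionary by its position in the list, then the value by key
print(sample_records[1]['Department'])

# A value can itself be a list, so a further index reaches inside it
print(sample_records[0]['Hobbies'][1])

# Loop over the list and look up the same keys in every dictionary
for person in sample_records:
    print(person['Name'], 'works in', person['Department'])
```

Reading the command from left to right mirrors the nesting: list index first, then `key`, then any index inside the `value`.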

# Exercises

You can go back to the **[previous notebook](nb_09_Functions_n_Refactoring.ipynb)** or finish those other functions here.

In [None]:
# Print the Department of the 5th student record
#


In [None]:
# Print the Height of the 15th student record
#


In [None]:
# Print the string "My name is ______ and I was born in _month___ of ___year." 
# for the 15th student record
#


What if you want to actually print the name of the month instead of a number?...
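
One possible approach is a dictionary that maps month numbers to month names. The sketch below uses an invented name, month number, and year; how you extract the month number depends on the date format used in the roster files, which is not assumed here.

```python
# Dictionary mapping each month number to its name
month_names = {1: 'January', 2: 'February', 3: 'March', 4: 'April',
               5: 'May', 6: 'June', 7: 'July', 8: 'August',
               9: 'September', 10: 'October', 11: 'November', 12: 'December'}

# Hypothetical values standing in for data extracted from a record
name = 'Jane Doe'
month_number = 3
year = 1990

sentence = f'My name is {name} and I was born in {month_names[month_number]} of {year}.'
print(sentence)
```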

In [None]:
# Print the string "My name is ______ and I was born in _month___ of ___year." 
# for the 15th student record


In [None]:
# Print the string "My name is ______ and I love _color_." 
# for the 25th student record


In [None]:
# Print the string "My name is ______ and I love _color_." 
# for the first 10 student records 


Find all students that are within 5 years of age of Agatha Young.

Save all the processed records into a JSON file and check that it is all correct.
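
A minimal sketch of the save-and-check round trip, using the standard `json` module. The sample records and the file name `records.json` are assumptions made for illustration, and a temporary directory is used so that the sketch does not touch your data folder:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records standing in for the processed roster
sample_records = [
    {'Name': 'Jane Doe', 'Department': 'Chemistry', 'Age': 41},
    {'Name': 'John Roe', 'Department': 'Physics', 'Age': 38},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / 'records.json'

    # Write the list of dictionaries to disk
    with open(json_path, 'w') as json_file:
        json.dump(sample_records, json_file, indent=4)

    # Read the file back to check that nothing was lost
    with open(json_path) as json_file:
        loaded_records = json.load(json_file)

print(loaded_records == sample_records)
```

Keep in mind that JSON only supports string keys, so a dictionary with integer keys would come back with those keys converted to strings.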