# Dictionaries

Materials developed by Todd Gureckis released under [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license and updated by Shannon Tubridy.

### Containers and Collections

Up to now we have considered single element variables (*numbers*, *strings*) and *lists* which can contain a number of individual elements. Lists are a kind of *container* or *collection* variable and there are several other collection types that are useful to know about for this class: _dictionaries_, _tuples_, and _sets_.

## Dictionaries

Dictionaries are collections of `key`-`value` pairs. 

One easy way to understand the difference between lists and dictionaries is that in a list you look up an entry using an index value. So doing this:

`assessments = ['attention', 'memory', 'decision making', 'motor control']
assessments[2]`

would give us 'decision making', the value at the third position (index 2) in the assessments list.

In a dictionary, values are stored relative to a 'key' rather than an ordered index position. To access the value you give the key. 
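The contrast between the two kinds of lookup can be sketched as follows. The `assessment_minutes` dictionary and its values are invented for this illustration and are not part of the class materials.

```python
# A list is looked up by position (an integer index).
assessments = ['attention', 'memory', 'decision making', 'motor control']
third_assessment = assessments[2]
print(third_assessment)  # decision making

# A dictionary is looked up by key rather than by position.
# These durations are made up for the example.
assessment_minutes = {'attention': 10, 'memory': 15, 'decision making': 20}
memory_minutes = assessment_minutes['memory']
print(memory_minutes)  # 15
```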


<div>
<img src="attachment:dict.png" width="500"/>
</div>



### Creating a dictionary

There are two primary ways to make a dictionary.

A dictionary can be made by using curly braces "{" and giving a sequence of key-value pairs separated by commas. 




In [None]:
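# make a dictionary called 'person' with the following key-value pairs:
# firstname: shannon
# lastname: tubridy
# office: 867
# hair: brown
person = {'firstname': 'shannon',
          'lastname': 'tubridy',
          'office': 867,
          'hair': 'brown'}
person

In [None]:
# lookup a value using dict_name['key-name']
person['firstname']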


In the above example we have four _keys_ (firstname, lastname, office, hair) and their associated _values_. In setting up the dictionary entries the key and its associated values are separated by a colon, and the key-value pairs are separated by commas.

Another way to create a dictionary is with the `dict()` function. In this example we use `dict()` rather than `{}`, and each key-value pair is written as the key name (without quotes), an equals sign, and then the value. Commas separate the dictionary entries.

In [None]:
person = dict(firstname='shannon', 
              lastname='tubridy', 
              office=867, 
              hair='brown')
person

Both of those examples will result in the same dictionary being defined for a variable called person.

**You will see me use the {'key':'value'} syntax most of the time when I make dictionaries.**

The {} dictionary "constructor" is faster to execute and it is a little more compact to type and read.
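As a quick check that the two approaches produce the same result, here is a small sketch. The variable names `person_braces`, `person_dict`, and `test_scores` are chosen just for this example. It also shows one practical difference: the keyword form of `dict()` only works when every key is a valid Python name, so a key containing a space needs the curly-brace syntax.

```python
# Build the same dictionary two ways.
person_braces = {'firstname': 'shannon', 'lastname': 'tubridy',
                 'office': 867, 'hair': 'brown'}
person_dict = dict(firstname='shannon', lastname='tubridy',
                   office=867, hair='brown')

# Two dictionaries are equal when they hold the same key-value pairs.
same = person_braces == person_dict
print(same)  # True

# A key with a space in it cannot be written as dict(working memory=0.8),
# but it is fine inside curly braces. (Hypothetical data.)
test_scores = {'working memory': 0.8}
print(test_scores['working memory'])  # 0.8
```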

### Looking up values in a dictionary

One of the nice things about dictionaries is that you can look up any value you want by its key:

In [None]:
person

In [None]:
# get the firstname:
person['firstname']

In [None]:
# get the hair color
person['hair']

In [None]:
# the key to request can be in a variable
requested_info = 'office'
person[requested_info]


In [None]:
# output can be stored in a variable
this_guys_office = person[requested_info]

print(this_guys_office)

#### Exercise: make a dictionary 
Make a dictionary with keys `classname`, `building`, `room`, and `day` and values for whatever you want (this class is in mercer 194, room 305). Use your dictionary to look up the building and day.

### Get all the keys in some dictionary

Sometimes you want to look inside a dictionary to see all the elements. To get all the keys:

In [None]:
person.keys()

In [None]:
# The 'in' operator can be used to see if a key
# exists in the dictionary
'firstname' in person

### Get all the values in some dictionary

You can get all the values in the dictionary using dict_name.values()

In [None]:
person.values()

### Get all the key-value pairs in a dictionary

dict_name.items() gives all the key-value pairs:

In [None]:
person.items()
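The objects returned by `.keys()`, `.values()`, and `.items()` can be converted to ordinary lists or stepped through one pair at a time. The sketch below assumes the same `person` dictionary as above and uses a `for` loop, which may not have been covered yet; treat it as a preview.

```python
person = {'firstname': 'shannon', 'lastname': 'tubridy',
          'office': 867, 'hair': 'brown'}

# Turn the keys into a regular list (keys come back in insertion order).
key_list = list(person.keys())
print(key_list)  # ['firstname', 'lastname', 'office', 'hair']

# .items() yields (key, value) pairs, which can be unpacked in a loop.
for key, value in person.items():
    print(key, '->', value)
```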

Looking up a key that doesn't exist in the dictionary results in a `KeyError`:

In [None]:
# look at the dictionary
person

In [None]:
# request value for a key that doesn't exist
person['middlename']

#### Exercise: look at the error from `person['middlename']` in the last cell.

Take a minute to make sure you see where the error message points to the line of code that was problematic and to link the error message itself to the concepts we're working on. 

It is a **KeyError** and we have been working with dictionary _keys_, so it tells us there is something wrong with how we were using our keys, and it tells us what the problematic usage was ('middlename', which doesn't exist in the dictionary).
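One way to avoid this error is to check for the key with the `in` operator before looking it up. This is only a sketch: the fallback text `'not recorded'` is an arbitrary choice for the example, and the `if`/`else` statement may be new at this point in the class.

```python
person = {'firstname': 'shannon', 'lastname': 'tubridy',
          'office': 867, 'hair': 'brown'}

key = 'middlename'

# Only index into the dictionary if the key is actually there.
if key in person:
    middlename = person[key]
else:
    middlename = 'not recorded'  # arbitrary fallback for this example

print(middlename)  # not recorded
```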

### Adding to a dictionary

You can add a new key to a dictionary that's already been created.

Use square brackets containing the new key name and then the value on the other side of an equals sign:

In [None]:
# make a dictionary and lookup the 'lastname'
person = { 'firstname': 'shannon', 
          'lastname': 'tubridy', 
          'office': 867, 
          'hair': 'brown'}

# check out an already existing key:
person['lastname']

In [None]:
# lookup the 'department' which isn't in the dictionary -- KEY ERROR
person['department']

In [None]:
# add a new key-value for department = Psychology:
person['department']='Psychology'

# add some more new keys
person['building']='meyer'
person['email']='st704@nyu.edu'

# checkout the resulting dictionary
person

You can also overwrite an existing value:

In [None]:
person['office']='463'
person
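The difference between adding a key and overwriting one can be seen by watching the size of the dictionary. This is a minimal sketch using a shortened version of `person`; the `n_...` variable names are just for illustration.

```python
person = {'firstname': 'shannon', 'office': 867}
n_before = len(person)            # 2 key-value pairs

# Assigning to an existing key replaces its value; the size is unchanged.
person['office'] = '463'
n_after_overwrite = len(person)   # still 2

# Assigning to a key that is not there yet adds a new pair.
person['department'] = 'Psychology'
n_after_add = len(person)         # now 3

print(n_before, n_after_overwrite, n_after_add)  # 2 2 3
```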

In addition to indexing by key using `[]`, you can use the `.get()` method to look up a value by key.  

This is useful because if you use the dict[key-name] syntax and the key does not exist you will get an error and the code stops running.

The dictionary.get() method takes at least one input (a key whose value you want) but if the key doesn't exist the function returns a replacement or null value instead of an error.

It works like this:

#### dict.get('keyname') if keyname exists

Get the value back.

In [None]:
# use .get() attached to our person dict to 
# lookup the value for key firstname
result = person.get('firstname')
print(result)

#### dict.get('keyname') if keyname doesn't exist

If the key doesn't exist in the dictionary and you only give one input to the function, it will by default return a special value called `None`.

In [None]:
result = person.get('birthplace')
print(result)

#### dict.get('keyname', 'alternative output') if keyname doesn't exist

You can also give an optional second input to .get() which is what to return instead of None if the key doesn't exist.

In [None]:
a = person.get('middlename','that key does not exist')
print(a)
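A common use of the second input to `.get()` is tallying: ask for the current count and fall back to 0 the first time a key is seen. The sketch below uses a made-up list of survey responses and a `for` loop; it is one possible pattern, not something required for this class.

```python
# Made-up responses for the example.
responses = ['yes', 'no', 'yes', 'yes', 'maybe']

counts = {}
for response in responses:
    # .get() returns 0 when the response has not been counted yet,
    # so there is no KeyError on the first occurrence.
    counts[response] = counts.get(response, 0) + 1

print(counts)  # {'yes': 3, 'no': 1, 'maybe': 1}
```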

## Dictionary values can be lists, dictionaries or almost anything else

The previous examples showed dictionaries where the values for each key were strings or numbers. Values can be more complicated objects like lists and dictionaries.

In [None]:
# make a dictionary using curly brackets
# to surround comma separated 'key': value pairs 
# 
# Notice that the value for 'completed_tests' is a list
exp_info = {'participant_id': 'A349',
           'age_group': 'older',
           'dob': '22-APR-1951',
           'completed_tests': ['intake', 'working memory', 'semantic fluency']}


print(exp_info)

#### Exercise: use the dictionary .values() method to get a report of all the values in the exp_info dictionary

In [None]:
exp_info.values()

If a requested dictionary value is a list you can interact with it just like any other list

In [None]:
# the 'completed_tests' key has a value that is a list:
exp_info['completed_tests']

In [None]:
# use len to find out how many completed tests there are
len(exp_info['completed_tests'])

In [None]:
# use list indexing to get the last completed_test
exp_info['completed_tests'][-1]

In the previous cell the first part of the code (`exp_info['completed_tests']`) evaluates to a list, and so we can immediately pass an index (`[-1]`) to that result to get into the list.

In [None]:
# store the completed_tests list in a new variable
# and look at the first item
completed_tests = exp_info['completed_tests']
print(completed_tests[0])


**NOTE** In the previous cell the word "completed_tests" is used in two different ways that don't overlap or conflict. In one case it is a variable name, and to Python it's just a bunch of symbols to put on the outside of a storage bucket. In the other it is a string and is treated like text. From the kernel's perspective there is essentially no relationship between the two even though they look exactly the same to us.
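To make the distinction concrete, here is a small sketch in which the variable name and the key string are deliberately different. The names `finished` and `key_name` are arbitrary choices for this example.

```python
exp_info = {'completed_tests': ['intake', 'working memory', 'semantic fluency']}

# The variable on the left can have any name; only the string
# inside the brackets has to match a key in the dictionary.
finished = exp_info['completed_tests']

# The key string can itself be stored in a variable with yet another name.
key_name = 'completed_tests'
also_finished = exp_info[key_name]

# Both names refer to the very same list stored in the dictionary.
print(finished is also_finished)  # True
```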

#### Exercise: add the string 'visual tracking' to the list of completed tests in the exp_info dictionary

Hint: we learned about the .append() function for lists...

### Dictionary values can be dictionaries

There are times when it makes sense to structure data or information in a dictionary of dictionaries.

Consider a situation where you are collecting information about an organization (here, a university) and the information is naturally organized into different categories.

In [None]:
# make a dictionary
# curly brackets, key-value pairs
address = {'number': '6',
          'street': 'Washington Place',
          'city': 'New York',
          'state': 'NY',
          'zip': '10003'}

# another dictionary
administration = {'president': 'Andrew Hamilton',
             'provost': 'Katherine E. Fleming',
             'dean, libraries': 'H. Austin Booth',
             'dean, cas': 'Antonio Merlo'}

# and another
funding = {'tuition': 1000000,
          'endowment': 1000000,
          'grants, federal': 1000000,
          'grants, private': 1000000,
          'other': 1000000}

In [None]:
# get some info from a dictionary
address['city']

#### Exercise: use dictionary[ ] or dictionary.get() to find out how much funding comes from tuition.

It might be preferred to package up all this information about NYU into a single variable. 

A dictionary of dictionaries is a natural way to do this:

In [None]:
# make a new dictionary where the value
# for each key points to one of our 
# existing dictionaries

nyu_info = {'address': address, 
           'administration': administration,
           'funding': funding}

nyu_info

In [None]:
# look up values for a key, get a dictionary back
nyu_info['address']

In [None]:
# The returned value is a dictionary so we can directly
# request a key value from that dictionary
nyu_info['address']['state']

In [None]:
nyu_info['address']['zip']

In [None]:
# the equivalent in a more broken down format

# first get the value for the outer dictionary
# key 'address'
nyu_address = nyu_info['address']

# it's a dictionary:
print(nyu_address)

# now get the state
print(nyu_address['state'])
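Chained lookups raise a `KeyError` if any key along the way is missing. One possible way around that is to chain `.get()` calls and supply an empty dictionary as the fallback for the outer lookup. In this sketch the `'athletics'` and `'mascot'` keys are hypothetical and deliberately absent, and `nyu_info` is a trimmed-down copy of the dictionary above.

```python
nyu_info = {'address': {'city': 'New York', 'state': 'NY', 'zip': '10003'}}

# Ordinary chained lookup: fine because both keys exist.
state = nyu_info['address']['state']
print(state)  # NY

# 'athletics' is not a key, so the first .get() returns the empty
# dictionary {} and the second .get() returns its fallback value.
mascot = nyu_info.get('athletics', {}).get('mascot', 'unknown')
print(mascot)  # unknown
```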

In [None]:
# get the name of the president

# remind ourselves which keys are in the dictionary
print(nyu_info.keys())


In [None]:
# look in the administration dictionary for president
nyu_info['administration']['president']

#### Exercise: get the name of the 'provost'. The provost is a member of the administration.

### Why learn about dictionaries? 

One answer is in some of the examples we've already done: there are some kinds of information where it just makes sense to have this kind of key-value lookup. A person's name, address information, maybe preference settings in a computer, etc.

Another reason is that dictionaries can be a useful way of organizing data. For example, one might naturally think of the columns of an excel spreadsheet or data file as being labeled with 'keys' that have a list of values underneath them.  

This is exactly a data format that `pandas` (a library that we will use in this class; more on this and other libraries later) likes:

In [None]:
student_data = {'student': [1,2,3,4], 'grades': [0.95, 0.27, 0.45, 0.8] }
student_data

In [None]:
student_data['student']
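Because each key holds a whole column, the lists line up by position: the grade at index 2 belongs to the student at index 2. The sketch below uses only plain Python (no pandas) to find one student's grade; `position`, `grade_for_3`, and `grade_by_student` are names invented for this example.

```python
student_data = {'student': [1, 2, 3, 4], 'grades': [0.95, 0.27, 0.45, 0.8]}

# Find where student number 3 sits in the 'student' column...
position = student_data['student'].index(3)

# ...and read the same position in the 'grades' column.
grade_for_3 = student_data['grades'][position]
print(grade_for_3)  # 0.45

# zip() pairs the two columns up, giving a student -> grade dictionary.
grade_by_student = dict(zip(student_data['student'], student_data['grades']))
print(grade_by_student[3])  # 0.45
```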

### From dictionary to dataframe

In the next unit of the class we will start using pandas DataFrames. Here's a glimpse of what that looks like.

In [None]:
# import the pandas library using the standard
# short name pd
import pandas as pd


In [None]:
# make a dictionary
student_data = {'student': [1,2,3,4], 'grades': [0.95, 0.27, 0.45, 0.8] }
student_data

In [None]:
# use the dictionary to make a "dataframe" where the 
# column names are the dictionary keys and the
# column values are the corresponding dictionary values
df = pd.DataFrame(student_data)
df



In [None]:
# get the data from the dataframe 'student' column
df['student']

In [None]:
# get the grade for student number 3

df[df['student']==3]['grades']

### Summary

These pandas examples are included just to preview a context where we will use dictionaries extensively. For now you should primarily be concerned with making sure you are getting familiar with the basic setup for dictionaries:


- dictionaries are composed of key-value pairs
- the primary use of an existing dictionary is to look up the value associated with some key or label
- the values associated with a dictionary key can be numbers, strings, lists, or other dictionaries
- dictionaries can be created in two main ways: curly brackets {} around comma separated key-value pairs or the dict() function syntax outlined in this notebook
