# Dictionaries

*Note: You can explore the [associated workbook](https://mybinder.org/v2/gh/melaniewalsh/Intro-Cultural-Analytics/master?urlpath=lab/tree/book/02-Python/Workbooks/11.5-Dictionaries-WORKBOOK.ipynb) for this chapter in the cloud.*

In this lesson, we're going to learn about Python dictionaries by drawing on Anelise Shrout's [Bellevue Almshouse Dataset](https://www.nyuirish.net/almshouse/the-almshouse-records/), excerpted below.

**Preview The Bellevue Almshouse Dataset**

In [1]:
import pandas
pandas.read_csv("../data/bellevue_almshouse_modified.csv").head(20)


```{margin} The Bellevue Almshouse Dataset 
The Bellevue Almshouse Dataset includes information about Irish-born immigrants who were admitted to the almshouse in the 1840s. The Bellevue Almshouse was part of New York City's public health system, a place where poor, sick, homeless, and otherwise marginalized people were sent — sometimes voluntarily and sometimes forcibly. This dataset was transcribed from the almshouse's own admissions records by Anelise Shrout.
```

We're using the [Bellevue Almshouse Dataset](https://www.nyuirish.net/almshouse/the-almshouse-records/) to practice dictionaries because we want to think deeply about the consequences of reducing human life to data even at this early stage in our Python journey. This immigration data, as Shrout argues in her essay ["(Re)Humanizing Data: Digitally Navigating the Bellevue Almshouse,"](https://crdh.rrchnm.org/essays/v01-10-(re)-humanizing-data/) was "produced with the express purpose of reducing people to bodies; bodies to easily quantifiable aspects; and assigning value to those aspects which proved that the marginalized people to who they belonged were worth less than their elite counterparts."

___

## Dictionary

When we used lists with the Bellevue Almshouse data, we no longer had to assign a separate variable to every value. We could put multiple names into a single list and multiple ages into another list.

By using a Python data collection type called a *dictionary*, we can go even further and group each person's name, age, and profession into a single collection.

**Individual Variables**

In [None]:
person1_name = 'Mary Gallagher'
person2_name = 'John Sanin(?)'
person1_age = 28
person2_age = 19

**Lists**

In [None]:
names = ['Mary Gallagher', 'John Sanin(?)', 'Anthony Clark', 'Margaret Farrell']
ages = [28, 19, 60, 30]
professions = ['married', 'laborer', 'laborer', 'widow']

**Dictionary**

In [None]:
person1 = {"name": "Mary Gallagher",
             "age": 28,
             "profession": "married"}

In [None]:
type(person1)

dict

In [None]:
person2 = {"name": "John Sanin(?)",
             "age": 19,
             "profession": "laborer"}

## Key-Value

A dictionary is made up of key-value pairs. Each key is separated from its value by a colon `:`, and each pair is separated from the next pair by a comma `,`. A dictionary is always enclosed in curly brackets `{}`.

In [None]:
person1 = {"name": "Mary Gallagher",
             "age": 28,
             "profession": "married"}

You can check all the keys in a dictionary by using the `.keys()` method or all the values in a dictionary by using the `.values()` method.

In [None]:
person1.keys()

dict_keys(['name', 'age', 'profession'])

In [None]:
person1.values()

dict_values(['Mary Gallagher', 28, 'married'])
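The objects returned by `.keys()` and `.values()` are "views" rather than ordinary lists. The sketch below, which assumes the same `person1` dictionary as above, shows one way to turn them into lists, check whether a key exists with `in`, and count the pairs with `len()`. The key `"gender"` is used only as an example of a key that is not in `person1`.

```python
person1 = {"name": "Mary Gallagher",
           "age": 28,
           "profession": "married"}

# Convert the key and value views into ordinary lists
key_list = list(person1.keys())
value_list = list(person1.values())
print(key_list)
print(value_list)

# `in` checks whether a key exists in the dictionary
has_age = "age" in person1
has_gender = "gender" in person1  # "gender" is not a key in person1
print(has_age, has_gender)

# len() counts the number of key-value pairs
number_of_pairs = len(person1)
print(number_of_pairs)
```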

## Access Items

You can access a value in a dictionary by using square brackets `[]` and its key name (kind of like how we indexed a string or a list).

In [None]:
person1["name"]

'Mary Gallagher'

In [None]:
person1["age"]

28

In [None]:
person1["profession"]

'married'
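If you ask for a key that does not exist, square brackets raise a `KeyError`. The sketch below shows one way to handle that, using the dictionary method `.get()`, which returns `None` (or a default that you choose) instead of raising an error. The key `"gender"` and the default text `"not recorded"` are illustrative assumptions, not part of the dataset excerpt above.

```python
person1 = {"name": "Mary Gallagher",
           "age": 28,
           "profession": "married"}

# .get() works like square brackets when the key exists
print(person1.get("name"))

# For a missing key, .get() returns None instead of raising an error
missing_value = person1.get("gender")
print(missing_value)

# You can also supply your own default value (hypothetical text here)
fallback_value = person1.get("gender", "not recorded")
print(fallback_value)

# Square brackets raise a KeyError for a missing key
try:
    person1["gender"]
except KeyError as error:
    error_message = f"No such key: {error}"
    print(error_message)
```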

## Change Item

You can change a value in a dictionary by re-assigning a new value to a dictionary key.

In [None]:
person1["age"] = 100

In [None]:
person1

{'name': 'Mary Gallagher', 'age': 100, 'profession': 'married'}

In [None]:
person1['profession'] = 'spinster'

In [None]:
person1

{'name': 'Mary Gallagher', 'age': 100, 'profession': 'spinster'}
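The same assignment syntax also adds a brand new key-value pair when the key does not exist yet, and `del` removes a pair. Here is a minimal sketch using `person2`. The key `"name_uncertain"` is a made-up key for illustration, prompted by the "(?)" in the transcribed name, and is not a column in the dataset.

```python
person2 = {"name": "John Sanin(?)",
           "age": 19,
           "profession": "laborer"}

# Assigning to a key that does not exist yet adds a new key-value pair
person2["name_uncertain"] = True  # hypothetical key, for illustration only
keys_after_add = list(person2.keys())
print(keys_after_add)

# del removes a key and its value from the dictionary
del person2["name_uncertain"]
keys_after_delete = list(person2.keys())
print(keys_after_delete)
```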

## Nested Dictionary

You can also nest a dictionary inside another dictionary.

In [None]:
bellevue_people = {
                "person1":
                  {"name": "Mary Gallagher",
                   "age": 28,
                   "profession": "married"},
                "person2":
                  {"name": "John Sanin(?)",
                   "age": 19,
                   "profession": "laborer"}
                }

In [None]:
bellevue_people['person1']

{'name': 'Mary Gallagher', 'age': 28, 'profession': 'married'}

In [None]:
bellevue_people['person1']['name']

'Mary Gallagher'

In [None]:
bellevue_people['person2']

{'name': 'John Sanin(?)', 'age': 19, 'profession': 'laborer'}

In [None]:
bellevue_people['person2']['age']

19
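A nested dictionary can grow in the same way as a flat one. The sketch below adds a third person, using the values for Anthony Clark from the lists earlier in this lesson. The key name `"person3"` is an assumption that simply follows the existing naming pattern.

```python
bellevue_people = {
    "person1": {"name": "Mary Gallagher",
                "age": 28,
                "profession": "married"},
    "person2": {"name": "John Sanin(?)",
                "age": 19,
                "profession": "laborer"}
}

# Add a whole inner dictionary under a new outer key
bellevue_people["person3"] = {"name": "Anthony Clark",
                              "age": 60,
                              "profession": "laborer"}

# Chain square brackets to reach a value inside the new inner dictionary
person3_age = bellevue_people["person3"]["age"]
print(person3_age)

# The outer dictionary now holds three people
number_of_people = len(bellevue_people)
print(number_of_people)
```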

## Iterate Through Dictionary

In [None]:
for person in bellevue_people.keys():
    print(person)

person1
person2


In [None]:
for person in bellevue_people.values():
    print(person)

{'name': 'Mary Gallagher', 'age': 28, 'profession': 'married'}
{'name': 'John Sanin(?)', 'age': 19, 'profession': 'laborer'}


In [None]:
for person in bellevue_people.values():
    if person['age'] > 20:
        name = person['name']
        age = person['age']
        print(f'{name} is more than 20 years old. She is {age}.')

Mary Gallagher is more than 20 years old. She is 28.


In [None]:
for person in bellevue_people.items():
    print(person)

('person1', {'name': 'Mary Gallagher', 'age': 28, 'profession': 'married'})
('person2', {'name': 'John Sanin(?)', 'age': 19, 'profession': 'laborer'})
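Because `.items()` produces (key, value) pairs, you can unpack each pair into two loop variables instead of one. A short sketch, assuming the same `bellevue_people` dictionary; the variable names `person_id` and `details` are our own choices.

```python
bellevue_people = {
    "person1": {"name": "Mary Gallagher",
                "age": 28,
                "profession": "married"},
    "person2": {"name": "John Sanin(?)",
                "age": 19,
                "profession": "laborer"}
}

summary_lines = []
# Unpack each (key, value) pair into two separate variables
for person_id, details in bellevue_people.items():
    line = f"{person_id}: {details['name']} ({details['profession']})"
    summary_lines.append(line)
    print(line)
```

Looping over the dictionary directly (`for person in bellevue_people:`) gives you the keys, just like `.keys()`.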


## Exercise 1

In [3]:
movie = {'title': 'Selma',
         'site': 'http://www.imdb.com/title/tt1020072/',
         'country': 'US/UK',
         'year_release': 2014,
         'box_office': '$52.1M',
         'director': 'Ava DuVernay',
         'number_of_subjects': 1,
         'subject': 'Martin Luther King, Jr',
         'type_of_subject': 'Activist',
         'race_known': 'Known',
         'subject_race': 'African American',
         'person_of_color': 1,
         'subject_sex': 'Male',
         'lead_actor_actress': 'David Oyelowo'}

Print out all the "keys" in the dictionary `movie`.

In [4]:
for key in movie.keys():
    print(key)

title
site
country
year_release
box_office
director
number_of_subjects
subject
type_of_subject
race_known
subject_race
person_of_color
subject_sex
lead_actor_actress


Print out all the "values" in the dictionary `movie`.

In [8]:
for value in movie.values():
    print(value)

Selma
http://www.imdb.com/title/tt1020072/
US/UK
2014
$52.1M
Ava DuVernay
1
Martin Luther King, Jr
Activist
Known
African American
1
Male
David Oyelowo


Access the value for the key "director".

In [11]:
print(movie['director'])

Ava DuVernay


## Exercise 2

By using the Python library called `pandas`, we can read in the entire "biopics.csv" data from the *538* project and make it into a list of dictionaries.

Don't worry about the `pandas` code at this point. We will get to it in a couple of weeks.

In [17]:
import pandas as pd
biopics_df = pd.read_csv('biopics.csv', encoding='utf-8')
biopics_list = biopics_df.to_dict('records')


In [None]:
biopics_list


In [None]:
type(biopics_list)

In [None]:
type(biopics_list[0])
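If `pandas` is not available, it may help to see what a list of dictionaries looks like on a smaller scale. The sketch below builds one by hand from the `names`, `ages`, and `professions` lists used earlier in this lesson, then loops through it with a condition. The names `people_list` and `person_dict` are our own, and `zip()` is just one possible way to walk through several lists at once.

```python
names = ['Mary Gallagher', 'John Sanin(?)', 'Anthony Clark', 'Margaret Farrell']
ages = [28, 19, 60, 30]
professions = ['married', 'laborer', 'laborer', 'widow']

# Build one dictionary per person and collect them in a list
people_list = []
for name, age, profession in zip(names, ages, professions):
    people_list.append({"name": name, "age": age, "profession": profession})

# Loop through the list; each item is a dictionary
laborer_lines = []
for person_dict in people_list:
    if person_dict["profession"] == "laborer":
        line = f"{person_dict['name']} // {person_dict['age']}"
        laborer_lines.append(line)
        print(line)
```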

Loop through this list of dictionaries (`biopics_list`) and print out the movie *title* and *release year* for all the movies that featured an "African American" subject (note the `subject_race` key). Print the title and release year with a "//" in between them.

In [None]:
for biopic_dict in biopics_list...
    #Your code here
        #Your code here

Now, choose a different **"subject_race"** and print out all the *titles* and *release years* for those movies. 

Print out the movie title and release year with a "//" in between them.

Here's a list of values to consider:

- White  
- [blank]  
- African American  
- Multi racial  
- Hispanic (Latin American)  
- Middle Eastern (White)  
- Middle Eastern  
- African  
- Hispanic (White)  
- Hispanic (Latino)  
- Asian  
- Native American  
- Asian American  
- Indian  
- Caribbean  
- Mediterranean  
- Eurasian  
- Hispanic (Latina)

In [None]:
#Your code here
    #Your code here
        #Your code here

## Exercise 3

The Eviction Lab makes its data [available for download here](https://data-downloads.evictionlab.org/). By using the Python library called pandas, we can read in eviction data about cities in the state of New York and make it into a list of dictionaries.

Don't worry about the pandas code at this point. We will get to it in a couple of weeks.

In [2]:
import pandas as pd
cities_df = pd.read_csv('./../data/02-python/ny_cities_eviction.csv', encoding='utf-8')
cities_list = cities_df.to_dict('records')
cities_list

[{'GEOID': 3600155,
  'year': 2000,
  'name': 'Accord',
  'parent-location': 'New York',
  'population': 622.0,
  'poverty-rate': 4.15,
  'renter-occupied-households': nan,
  'pct-renter-occupied': 27.88,
  'median-gross-rent': 650.0,
  'median-household-income': 52083.0,
  'median-property-value': 98700.0,
  'rent-burden': 19.0,
  'pct-white': 90.19,
  'pct-af-am': 2.41,
  'pct-hispanic': 3.86,
  'pct-am-ind': 0.64,
  'pct-asian': 1.45,
  'pct-nh-pi': 0.0,
  'pct-multiple': 1.45,
  'pct-other': 0.0,
  'eviction-filings': nan,
  'evictions': nan,
  'eviction-rate': nan,
  'eviction-filing-rate': nan,
  'low-flag': 0,
  'imputed': 0,
  'subbed': 0},
 {'GEOID': 3600155,
  'year': 2001,
  'name': 'Accord',
  'parent-location': 'New York',
  'population': 622.0,
  'poverty-rate': 4.15,
  'renter-occupied-households': nan,
  'pct-renter-occupied': 27.88,
  'median-gross-rent': 650.0,
  'median-household-income': 52083.0,
  'median-property-value': 98700.0,
  'rent-burden': 19.0,
  'pct-whit

Loop through this list of dictionaries and print out the *year* and *number of evictions* for the city of *Ithaca* (note the `name` key).

Print out the year and number of evictions with a custom f-string.

In [4]:
for city in cities_list:
    if city['name'] == "Ithaca":
        print(f'Year: {city["year"]} ... # Of Evictions: {city["evictions"]}')

Year: 2000 ... # Of Evictions: 0.0
Year: 2001 ... # Of Evictions: 0.0
Year: 2002 ... # Of Evictions: 0.0
Year: 2003 ... # Of Evictions: 0.0
Year: 2004 ... # Of Evictions: 0.0
Year: 2005 ... # Of Evictions: 0.0
Year: 2006 ... # Of Evictions: 0.0
Year: 2007 ... # Of Evictions: 0.0
Year: 2008 ... # Of Evictions: 0.0
Year: 2009 ... # Of Evictions: 1.47
Year: 2010 ... # Of Evictions: 6.2
Year: 2011 ... # Of Evictions: 9.63
Year: 2012 ... # Of Evictions: 11.98
Year: 2013 ... # Of Evictions: 9.22
Year: 2014 ... # Of Evictions: 4.47
Year: 2015 ... # Of Evictions: 8.73
Year: 2016 ... # Of Evictions: 6.47
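Some records in this dataset have `nan` ("not a number") where a value is missing, as in the Accord entries above. The sketch below shows one possible way to skip those records with `math.isnan()`. The short `sample_cities` list is a hand-typed stand-in that copies only three values from the output above; it is not the full dataset.

```python
import math

# A tiny hand-typed sample, copied from the output shown in this exercise
sample_cities = [
    {"name": "Accord", "year": 2000, "evictions": float("nan")},
    {"name": "Ithaca", "year": 2009, "evictions": 1.47},
    {"name": "Ithaca", "year": 2010, "evictions": 6.2},
]

recorded = []
for city in sample_cities:
    # nan is never equal to anything, even itself, so use math.isnan()
    if math.isnan(city["evictions"]):
        print(f'{city["name"]} {city["year"]}: no eviction value recorded')
    else:
        recorded.append(city)
        print(f'{city["name"]} {city["year"]}: {city["evictions"]}')
```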
