## Dictionaries

Dictionaries provide a way to store and retrieve data using key-value pairs.

1. **[Introduction to dictionaries](#intro)**
2. **[Dictionary Methods](#dict)**
3. **[Exercises](#exercises)**
4. **[Conclusion](#conclusion)**

Reference guide for dictionaries: <a href="https://docs.google.com/document/d/1Ks2KJGVvIk3aXQeuQf6OD1LhoVbPj3Yt/edit?usp=drive_link&ouid=117076395228202702809&rtpof=true&sd=true">link</a>

<a id="intro"></a>
### 1. Introduction to dictionaries

In [2]:
# Create a dictionary with pens as keys and the animals they contain as values.
# Dictionaries can be instantiated using braces.
zoo = {
    'pen_1': 'penguins',
    'pen_2': 'zebras',
    'pen_3': 'lions',
    }

# Selecting the `pen_2` key returns `zebras`, the value stored at that key
zoo['pen_2']

# You cannot look up a dictionary entry by its value using bracket indexing
# because Python interprets whatever is inside the brackets as a key, not a value.
## zoo['zebras']     ->    This will throw a KeyError

'zebras'
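
The failed lookup described in the comments above can be observed safely with `try`/`except`. The following is a minimal sketch, not part of the original lab: it rebuilds a stand-in copy of `zoo`, and the names `lookup_failed` and `pens_with_zebras` are hypothetical. It also shows one possible way to go from a value back to its key by scanning the pairs with `items()`, a method covered in the next section.

```python
# Stand-in copy of the `zoo` dictionary from the cell above.
zoo = {'pen_1': 'penguins', 'pen_2': 'zebras', 'pen_3': 'lions'}

# Using a value inside the brackets raises KeyError, because Python
# searches the keys for 'zebras' and does not find it.
try:
    zoo['zebras']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# To find the key(s) that hold a given value, scan the key-value pairs.
pens_with_zebras = [pen for pen, animal in zoo.items() if animal == 'zebras']

print(lookup_failed)      # True
print(pens_with_zebras)   # ['pen_2']
```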

In [5]:
# Dictionaries can also be instantiated using the dict() function
zoo = dict(
    pen_1='monkeys',
    pen_2='zebras',
    pen_3='lions',
    )

zoo['pen_2']

'zebras'

In [6]:
# Assign a new key: value pair to an existing dictionary
zoo['pen_4'] = 'crocodiles'
zoo

# Dictionaries do not support positional (numerical) indexing: Python treats
# the number inside the brackets as a key, not as a position.
## zoo[2]       ->   This will throw a KeyError

{'pen_1': 'monkeys',
 'pen_2': 'zebras',
 'pen_3': 'lions',
 'pen_4': 'crocodiles'}

In [7]:
# Use the `in` keyword to produce a Boolean of whether a given key exists in a dictionary.
print('pen_1' in zoo)
print('pen_7' in zoo)

True
False


<a id="dict"> </a>
### 2. Dictionary Methods

In [8]:
# Create a list of tuples, each representing the name, age, and position of a
# player on a basketball team.
team = [
    ('Marta', 20, 'center'),
    ('Ana', 22, 'point guard'),
    ('Gabi', 22, 'shooting guard'),
    ('Luz', 21, 'power forward'),
    ('Lorena', 19, 'small forward'),
    ]

In [9]:
# Add new players to the list.
team = [
    ('Marta', 20, 'center'),
    ('Ana', 22, 'point guard'),
    ('Gabi', 22, 'shooting guard'),
    ('Luz', 21, 'power forward'),
    ('Lorena', 19, 'small forward'),
    ('Sandra', 19, 'center'),
    ('Mari', 18, 'point guard'),
    ('Esme', 18, 'shooting guard'),
    ('Lin', 18, 'power forward'),
    ('Sol', 19, 'small forward'),
    ]

In [10]:
# Instantiate an empty dictionary.
new_team = {}

# Loop over the tuples in the list of players and unpack their values.
for name, age, position in team:
    if position in new_team:                    # If position already a key in new_team,
        new_team[position].append((name, age))  # append (name, age) tup to list at that value.
    else:
        new_team[position] = [(name, age)]      # If position not a key in new_team,
                                                # create a new key whose value is a list
                                                # containing (name, age) tup.
new_team

{'center': [('Marta', 20), ('Sandra', 19)],
 'point guard': [('Ana', 22), ('Mari', 18)],
 'shooting guard': [('Gabi', 22), ('Esme', 18)],
 'power forward': [('Luz', 21), ('Lin', 18)],
 'small forward': [('Lorena', 19), ('Sol', 19)]}

In [11]:
# Examine the value at the 'point guard' key.
new_team['point guard']

[('Ana', 22), ('Mari', 18)]

In [12]:
# You can access a dictionary's keys by looping over the dictionary.
for x in new_team:
    print(x)

center
point guard
shooting guard
power forward
small forward


In [13]:
# The keys() method returns the keys of a dictionary.
new_team.keys()

dict_keys(['center', 'point guard', 'shooting guard', 'power forward', 'small forward'])

In [14]:
# The values() method returns all the values in a dictionary.
new_team.values()

dict_values([[('Marta', 20), ('Sandra', 19)], [('Ana', 22), ('Mari', 18)], [('Gabi', 22), ('Esme', 18)], [('Luz', 21), ('Lin', 18)], [('Lorena', 19), ('Sol', 19)]])

In [15]:
# The items() method returns both the keys and the values.
for a, b in new_team.items():
    print(a, b)

center [('Marta', 20), ('Sandra', 19)]
point guard [('Ana', 22), ('Mari', 18)]
shooting guard [('Gabi', 22), ('Esme', 18)]
power forward [('Luz', 21), ('Lin', 18)]
small forward [('Lorena', 19), ('Sol', 19)]


<a id="exercises"> </a>
### 3. Exercises

#### Exercise 1: Create a dictionary to store information

Dictionaries are useful when you need a data structure to store information that can be referenced or looked up.

In this task you'll begin with three `list` objects:

* `state_list` - an ordered list of the state where each data point was recorded
* `county_list` - an ordered list of the county where each data point was recorded
* `aqi_list` - an ordered list of AQI records

As a refresher, here is an example table of some of the information contained in these variables:

| state_name | county_name | aqi |
| --- | --- | --- |
| Arizona | Maricopa | 9 |
| California | Alameda | 11 |
| California | Sacramento | 35 |
| Kentucky | Jefferson | 6 |
| Louisiana | East Baton Rouge | 5 |


##### 1a: Create a list of tuples

Begin with an intermediary step to prepare the information to be put in a dictionary.

* Convert `state_list`, `county_list`, and `aqi_list` to a list of tuples, where each tuple contains information for a single record: `(state, county, aqi)`.

* Assign the result to a variable called `epa_tuples`.



In [20]:
import ada_c2_labs as lab
state_list = lab.fetch_epa('state')
county_list = lab.fetch_epa('county')
aqi_list = lab.fetch_epa('aqi')

epa_tuples = list(zip(state_list, county_list, aqi_list))
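
Because `ada_c2_labs` is a course-specific module, the `zip()` step can be tried without it. The sketch below uses the five rows from the example table above as stand-in data. The `sample_` variable names are hypothetical and are chosen so they do not overwrite the real lists.

```python
# Stand-in data taken from the example table above.
sample_states = ['Arizona', 'California', 'California', 'Kentucky', 'Louisiana']
sample_counties = ['Maricopa', 'Alameda', 'Sacramento', 'Jefferson', 'East Baton Rouge']
sample_aqi = [9, 11, 35, 6, 5]

# zip() pairs up the elements at each position; list() collects the
# resulting tuples so they can be indexed and reused.
sample_tuples = list(zip(sample_states, sample_counties, sample_aqi))

print(sample_tuples[0])    # ('Arizona', 'Maricopa', 9)
print(len(sample_tuples))  # 5
```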

##### 1b: Create a dictionary

Now that you have a list of tuples containing AQI records, use it to create a dictionary that allows you to look up a state and get all the county-AQI pairs associated with that state.

* Create a dictionary called `aqi_dict`:
    * Use a loop to unpack information from each tuple in `epa_tuples`.
    * Your dictionary's keys should be states.
    * The value at each key should be a list of tuples, where each tuple is a county-AQI pair of a record from a given state.

*Example:*
```
[IN]  aqi_dict['Vermont']
[OUT] [('Chittenden', 18.0),
       ('Chittenden', 20.0),
       ('Chittenden', 3.0),
       ('Chittenden', 49.0),
       ('Rutland', 15.0),
       ('Chittenden', 3.0),
       ('Chittenden', 6.0),
       ('Rutland', 3.0),
       ('Rutland', 6.0),
       ('Chittenden', 5.0),
       ('Chittenden', 2.0)]
```

In [21]:
### YOUR CODE HERE ###
aqi_dict = {}
for state, county, aqi in epa_tuples:
    if state in aqi_dict:
        aqi_dict[state].append((county, aqi))
    else:
        aqi_dict[state] = [(county, aqi)]


aqi_dict['Vermont']

[('Chittenden', 18.0),
 ('Chittenden', 20.0),
 ('Chittenden', 3.0),
 ('Chittenden', 49.0),
 ('Rutland', 15.0),
 ('Chittenden', 3.0),
 ('Chittenden', 6.0),
 ('Rutland', 3.0),
 ('Rutland', 6.0),
 ('Chittenden', 5.0),
 ('Chittenden', 2.0)]

#### Exercise 2: Use the dictionary to retrieve information

Now that you have a dictionary of county-AQI readings by state, you can use it to retrieve information and draw further insight from your data.

##### 2a: Calculate how many readings were recorded in the state of Arizona

Use your Python skills to calculate the number of readings that were recorded in the state of Arizona.

*Expected output:*
```
[OUT] 72
```

In [23]:
### YOUR CODE HERE ###
len(aqi_dict['Arizona'])

72

##### 2b: Calculate the mean AQI from the state of California

Use your Python skills to calculate the mean of the AQI readings that were recorded in the state of California. Note that there are many different approaches you can take. Be creative!

*Expected output:*
```
[OUT] 9.412280701754385
```

In [24]:
### YOUR CODE HERE ###
ca_aqi_list = [aqi for county, aqi in aqi_dict['California']]
ca_aqi_mean = sum(ca_aqi_list) / len(ca_aqi_list)
ca_aqi_mean

9.412280701754385
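
As one example of a different approach, the standard library's `statistics.mean()` can replace the manual `sum() / len()` calculation. The sketch below runs on a small stand-in dictionary built from the California rows of the example table, not on the full data set, so its result differs from the expected output above. The names `sample_aqi_dict`, `ca_values`, `manual_mean`, and `library_mean` are hypothetical.

```python
from statistics import mean

# Stand-in dictionary using the two California rows from the example table.
sample_aqi_dict = {
    'California': [('Alameda', 11.0), ('Sacramento', 35.0)],
}

# Keep only the AQI value from each (county, aqi) pair.
ca_values = [aqi for _, aqi in sample_aqi_dict['California']]

manual_mean = sum(ca_values) / len(ca_values)
library_mean = mean(ca_values)

print(manual_mean)   # 23.0
print(library_mean)  # 23.0
```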

#### Exercise 3: Define a `county_counter()` function

You want to be able to quickly look up how many times a county is represented in a given state's readings. Even though you already have a list containing just county names, it's not safe to rely on the counts from that list alone because some states might have counties with the same name. Therefore, you'll need to use the state-specific information in `aqi_dict` to calculate this information.

##### 3a: Write the function

* Define a function called `county_counter` that takes one argument:
    * `state` - a string of the name of a U.S. state

* Return `county_dict` - a `dictionary` object whose keys are counties of the `state` given in the function's argument. For each county key, the corresponding value should be the count of the number of times that county is represented in the AQI data for that state.

*Example:*
```
[IN]  county_counter('Florida')
[OUT] {'Duval': 13,
       'Hillsborough': 9,
       'Broward': 18,
       'Miami-Dade': 15,
       'Orange': 6,
       'Palm Beach': 5,
       'Pinellas': 6,
       'Sarasota': 9}
```

In [25]:
def county_counter(state):
    county_dict = {}
    for county, aqi in aqi_dict[state]:
        if county in county_dict:
            county_dict[county] += 1
        else:
            county_dict[county] = 1
    return county_dict
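
The counting loop above can also be expressed with `collections.Counter`, which tallies how often each item appears in an iterable. This is an alternative sketch rather than the lab's intended solution. It uses a small stand-in dictionary, and the names `county_counter_alt` and `sample_aqi_dict` are hypothetical.

```python
from collections import Counter

# Small stand-in for `aqi_dict`, using a few of the Vermont records shown earlier.
sample_aqi_dict = {
    'Vermont': [('Chittenden', 18.0), ('Chittenden', 20.0), ('Rutland', 15.0)],
}

def county_counter_alt(state, records=sample_aqi_dict):
    """Return a dict mapping each county in `state` to its number of readings."""
    # Counter tallies the county names; dict() converts the result to a plain dictionary.
    return dict(Counter(county for county, _ in records[state]))

print(county_counter_alt('Vermont'))  # {'Chittenden': 2, 'Rutland': 1}
```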

##### 3b: Use the function to check Washington County, PA.

Use the `county_counter()` function to calculate how many AQI readings were from `Washington` County, `Pennsylvania`.

*Expected result:*
```
[OUT] 7
```

In [26]:
### YOUR CODE HERE ###
pa_dict = county_counter('Pennsylvania')
pa_dict['Washington']

7

##### 3c: Use the function to check the different counties in Indiana

Use the `county_counter` function to obtain a list of all the different counties in the state of Indiana.

*Expected result:*
```
[OUT] dict_keys(['Marion', 'St. Joseph', 'Vanderburgh', 'Allen', 'Vigo',
      'Hendricks', 'Lake'])
```

In [27]:
### YOUR CODE HERE ###
county_counter('Indiana').keys()

dict_keys(['Marion', 'St. Joseph', 'Vanderburgh', 'Allen', 'Vigo', 'Hendricks', 'Lake'])

#### Exercise 4: Use sets to determine how many counties share names

In this task, you'll create a list of every county from every state, then use it to determine how many counties have the same name.

##### 4a: Construct a list of every county from every state

1. Use `aqi_dict` and `county_counter()` to construct a list of every county from every state, and assign the result to a variable called `all_counties`.

2. Find the length of `all_counties`.

*Expected result:*
```
[OUT] 277
```

In [28]:
# 1. ### YOUR CODE HERE ###
all_counties = []
for state in aqi_dict.keys():
    counties = list(county_counter(state).keys())
    all_counties += counties
    
# 2. ### YOUR CODE HERE ###
len(all_counties)

277

##### 4b: Calculate how many counties share names

Use `all_counties` and your knowledge of sets and list methods to determine how many counties share a name with at least one county in another state.

*Expected result:*
```
[OUT] 41
```

In [29]:
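# Count every county whose name also appears in at least one other state.
shared_count = 0

for county in set(all_counties):        # Visit each distinct county name once.
    count = all_counties.count(county)  # Number of states with a county of this name.
    if count > 1:
        shared_count += count           # Add every county that shares this name.

shared_count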

41
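
The same tally can be sketched with `collections.Counter`, which also makes it easy to separate two related numbers: how many distinct names are shared, and how many counties carry one of those names (the quantity the loop above computes). The stand-in list and the names `name_counts`, `shared_names`, and `shared_entries` are hypothetical.

```python
from collections import Counter

# Stand-in for `all_counties`: each entry is one county from one state.
sample_all_counties = ['Washington', 'Jefferson', 'Washington',
                       'Orange', 'Jefferson', 'Lake']

# Tally how many times each county name appears.
name_counts = Counter(sample_all_counties)

# Names that appear in more than one state.
shared_names = [name for name, n in name_counts.items() if n > 1]

# Total number of counties carrying one of those shared names.
shared_entries = sum(n for n in name_counts.values() if n > 1)

print(shared_names)    # ['Washington', 'Jefferson']
print(shared_entries)  # 4
```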

<a id="conclusion"> </a>
### 4. Conclusion

- Python has many built-in functions that are useful for building dictionaries and sets.
- Dictionaries in Python are useful for representing data in terms of keys mapped to values.
- A set will not allow duplicate values.
    - The elements of a set must be immutable (hashable), and a set is unordered.
- Functions and loop iteration can be used to perform calculations on dictionary values.
    - Once the values have been calculated, they can be saved to other data types, such as tuples, lists, and sets.
- There are many ways to access data stored inside a dictionary.
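
The set properties summarized above can be checked with a short sketch. The data and the names `sample_counties`, `unique_counties`, and `added` are hypothetical.

```python
# Stand-in list with repeated county names.
sample_counties = ['Orange', 'Lake', 'Orange', 'Washington', 'Lake']

# Converting to a set drops the duplicates.
unique_counties = set(sample_counties)
print(len(sample_counties), len(unique_counties))  # 5 3

# Set elements must be hashable, so a mutable list cannot be added.
try:
    unique_counties.add(['Marion'])
    added = True
except TypeError:
    added = False

print(added)  # False
```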