

### External References:
- sets
    - http://www.diveintopython3.net/native-datatypes.html#sets
    - https://docs.python.org/3/tutorial/datastructures.html#sets
- dictionaries
    - https://automatetheboringstuff.com/chapter5/ (first half)
    - http://www.diveintopython3.net/native-datatypes.html#dictionaries
    - http://greenteapress.com/thinkpython/html/thinkpython012.html

    

# Sets (set)
- mutable
- Can only contain immutable (hashable) values
  - e.g. tuple, str, int, float, bool
- unordered
- cannot contain duplicates
- REALLY fast for checking whether objects are in the collection or not
- useful for Venn-diagram-like questions
    - e.g. what items are in this collection that aren't in that collection?
    - what items are in both collections?
- Surrounded with curly braces `{}` (an empty `{}` is a dict, though, so use `set()` for an empty set)
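
The brace syntax has one catch worth seeing in code. The following is a minimal sketch (the variable names are made up for illustration) showing that braces with values make a set, while empty braces make a dict:

```python
# A set literal with values uses curly braces.
primes = {2, 3, 5, 7}

# Empty braces create a dict, not a set.
empty_braces = {}
empty_set = set()

print(type(primes).__name__)        # set
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
```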


In [None]:
# Creating an empty set
example = set()

len(example)

Adding values to a set one at a time

In [None]:
example.add(3)
example.add(4)
example

Adding a value already in a set has no effect.

In [None]:
example.add(3) # 3 is already in there, so nothing changes
example

Removing things from a set mutates it.

In [None]:
# removing things
example.remove(3)
example

Check for inclusion just like other containers

In [None]:
4 in example

Converting to a set with `set()`.
Works on any iterable (strings, lists, tuples, ...) and drops duplicates.

In [None]:
hello = 'hello'
set(hello)

In [None]:
duplicates = [1, 1, 1, 0, 0, 2, 0, 2]
set(duplicates)

Sets are most useful when you have 2 sets of data.

In [None]:
target_word = set('horse')
guessed = set('hnsje')

What guessed letters are correct?
aka. what values in guessed also appear in target_word?

In [None]:
guessed.intersection(target_word)

In [None]:
guessed & target_word # & is the set operator for intersection

What guessed letters are not correct?

aka what letters appear in guessed but not target_word?

In [None]:
guessed.difference(target_word)

In [None]:
guessed - target_word

What correct letters have not been guessed?
aka What letters appear in target word but not guessed?

In [None]:
target_word.difference(guessed)

In [None]:
target_word - guessed

What are all the letters that have been guessed or are in target word?

In [None]:
target_word.union(guessed)
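
Like intersection (`&`) and difference (`-`), union also has an operator form, `|`. A short sketch, with the two sets redefined so the block runs on its own (`all_letters` is a name introduced here):

```python
target_word = set('horse')
guessed = set('hnsje')

# | is the operator form of union
all_letters = target_word | guessed

# sorted() returns a list, which gives a stable order for display
print(sorted(all_letters))  # ['e', 'h', 'j', 'n', 'o', 'r', 's']
```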

Can be converted and looped over just like other containers

In [None]:
list(target_word)

In [None]:
for letter in target_word:
    print(letter, 'is in our goal')

Sets are **unordered**, so they can't be indexed the way strings, lists, and tuples can. The next cell raises a `TypeError`.

In [None]:
target_word[0]

But remember, no mutable data! Both of the following cells raise a `TypeError`.

In [None]:
s = set()
s.add([])

In [None]:
s.add(set())
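
If a collection really needs to go into a set, an immutable version of it will work. A small sketch, assuming it is acceptable to swap the list for a tuple and the inner set for a `frozenset` (Python's built-in immutable set type):

```python
s = set()

try:
    s.add([1, 2])          # lists are mutable, so this raises TypeError
except TypeError as error:
    print('Cannot add a list:', error)

s.add((1, 2))              # a tuple of immutable values is fine
s.add(frozenset({3, 4}))   # so is a frozenset
print(len(s))              # 2
```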

# Dictionary data type

- `dict` in python
- also called `associative array` and `hash` in other languages.
- accessed with a `key`, not an index.
- keys are unique, immutable.
- keys can be: strings, tuples, numbers
- values can be anything.
- represented with curly braces in python. `{}`

A dictionary is a data structure which associates (or maps) a `key` to a `value`.  The keys of a dictionary are analogous to the variables we use in our code.  They are simply a reference to a value.  A dictionary is a collection of these references.

The key takeaway is that **dictionaries associate one value with another**.  That is most often why we use them.

In [None]:
# Creating a dict with braces literal
example_literal = {'vowels': 'aeiouy', 'date': '11-8-2016'}
example_literal

In [None]:
# Creating a dict with the dict function
example_from_function = dict(vowels='aeiouy', date='11-8-2016')
example_from_function

We access data from our dictionary based on its **key**, not index

Format is:

`dictionary[key_name]`

In [None]:
example_from_function['vowels']

In [None]:
print(example_from_function['vowels'])
print("Today's date is", example_literal['date'])

Accessing a key that isn't in the dictionary raises a `KeyError`

In [None]:
example_from_function['not a key']

We can add a new key-value pair to our dictionary by assigning a value to a new key

In [None]:
example_from_function['pi'] = 3.14
example_from_function

## Dictionary vs lists

#### The differences:
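- dicts are accessed by key, not by position, so they can't be sliced (since Python 3.7 they keep insertion order, but that order doesn't affect equality)
- lists are only accessed with an integer index
- a missing key raises `KeyError`; an out-of-range index raises `IndexError`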

#### The similarities:
- both are mutable
- both can contain *multiple* values
- both can be iterated over

To be equal, two lists must have the same **items** in the same **order**.

To be equal, two dicts must have the same **keys** with the same **values**

##### Adjust the following data structures to make the comparison true. Keep the `==`
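
A quick sketch of both rules, using throwaway data (the variable names are introduced here for illustration):

```python
# Lists: same items, different order -> not equal
lists_match = ['a', 'b'] == ['b', 'a']

# Dicts: same pairs written in a different order -> equal
dicts_match = {'x': 1, 'y': 2} == {'y': 2, 'x': 1}

# Dicts: same keys, but one value differs -> not equal
dicts_differ = {'x': 1, 'y': 2} == {'x': 1, 'y': 3}

print(lists_match, dicts_match, dicts_differ)  # False True False
```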

In [None]:
[3, 2, 1] == [1, 2, 3]

In [None]:
{3:'b', 1:'a', 2:'c'} == {1:'a', 2:'b', 3:'c'}

## keys(), values(), and items()
- dict methods
- used to access dict information in a list-like way

In [None]:
state_abbreviations = {'AK': 'Alaska',
 'AL': 'Alabama',
 'AR': 'Arkansas',
 'AZ': 'Arizona',
 'CA': 'California',
 'CO': 'Colorado',
 'CT': 'Connecticut',
 'DE': 'Delaware',
 'FL': 'Florida',
 'GA': 'Georgia',
 'HI': 'Hawaii',
 'IA': 'Iowa',
 'ID': 'Idaho',
 'IL': 'Illinois',
 'IN': 'Indiana',
 'KS': 'Kansas',
 'KY': 'Kentucky',
 'LA': 'Louisiana',
 'MA': 'Massachusetts',
 'MD': 'Maryland',
 'ME': 'Maine',
 'MI': 'Michigan',
 'MN': 'Minnesota',
 'MO': 'Missouri',
 'MS': 'Mississippi',
 'MT': 'Montana',
 'NC': 'North Carolina',
 'ND': 'North Dakota',
 'NE': 'Nebraska',
 'NH': 'New Hampshire',
 'NJ': 'New Jersey',
 'NM': 'New Mexico',
 'NV': 'Nevada',
 'NY': 'New York',
 'OH': 'Ohio',
 'OK': 'Oklahoma',
 'OR': 'Oregon',
 'PA': 'Pennsylvania',
 'RI': 'Rhode Island',
 'SC': 'South Carolina',
 'SD': 'South Dakota',
 'TN': 'Tennessee',
 'TX': 'Texas',
 'UT': 'Utah',
 'VA': 'Virginia',
 'VT': 'Vermont',
 'WA': 'Washington',
 'WI': 'Wisconsin',
 'WV': 'West Virginia',
 'WY': 'Wyoming'}

In [None]:
len(state_abbreviations)

In [None]:
state_abbreviations.keys()

In [None]:
state_abbreviations.values()

In [None]:
state_abbreviations.items()

In [None]:
# A demonstration loop.
for key, value in state_abbreviations.items():
    print('The abbreviation for', value, 'is:', key)
    

In [None]:
# That was verbose. Let's just get 5 items out of the dict
state_abbreviations.items()[:5]

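The results of `items()`, `keys()`, and `values()` are not lists, which is why the slice above raises a `TypeError`.  They are view objects: they can be looped over and checked with `in` much like lists, but if we want the other functionality (accessing by index, slicing, mutability) we must convert them to lists.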

In [None]:
state_items = state_abbreviations.items()
list(state_items)[:5]
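
To see how a view differs from a list, here is a small sketch with a hypothetical three-entry dictionary (not the full `state_abbreviations`). The view follows later changes to the dict, while the list is a snapshot taken at conversion time:

```python
small = {'AK': 'Alaska', 'AL': 'Alabama', 'AR': 'Arkansas'}

keys_view = small.keys()        # a view object, not a list
keys_list = list(small.keys())  # a real list, copied right now

small['AZ'] = 'Arizona'         # change the dict afterwards

print(len(keys_view))   # 4, the view sees the new key
print(len(keys_list))   # 3, the list does not
print(keys_list[:2])    # ['AK', 'AL'], lists can be sliced
```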

## the 'in' keyword with dicts

`in` and `not in` work with lists.  They also work with dictionaries, where they check the keys by default.

In [None]:
'CA' in state_abbreviations

In [None]:
'CA' in state_abbreviations.keys()

In [None]:
'Oregon' in state_abbreviations

In [None]:
'Oregon' in state_abbreviations.values()

We can manually check if a key is set in a dictionary.

If it isn't, then we can add it.

In [None]:
if 'PR' not in state_abbreviations:
    state_abbreviations['PR'] = 'Puerto Rico'

In [None]:
state_abbreviations['PR']

## Safely accessing dictionary with get() and setdefault()

Sometimes you don't know if a key is set in a dictionary. If we try to access a key that isn't set, we get a `KeyError`. One way to avoid that is to check for the key first:
```python
if some_key in some_dict:
    result = some_dict[some_key]
else:
    result = default_value
```

This can get a little verbose, so we have the `get` method.

```python
result = some_dict.get(some_key, default_value)
```

In [None]:
state_abbreviations['DC']

The next line says: "Get the value of the 'DC' key in the abbreviations dictionary.  If it's not there, give me the default value ('Unknown' in this case) instead."

In [None]:
state_abbreviations.get('DC', 'Unknown')

In [None]:
import data.states
votes = data.states.ELECTORAL_VOTES

In [None]:
sorted(votes.keys())

In [None]:
votes.get('District of Columbia', 0)

In [None]:
votes.get('Guam', 0)

`get` is great to use when you want to assume the value is of a certain type and a default value makes sense.  An example:

In [None]:
# Count the number of times each character appears
speech = 'Four score and seven years ago our forefathers brought forth upon this continent a new nation'
char_count = {} # empty dict

In [None]:
for char in speech:
    count = char_count.get(char, 0)
    char_count[char] = count + 1
print(char_count)
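
The section title also mentions `setdefault`. As a hedged sketch of how it differs from `get`: `some_dict.setdefault(key, default)` returns the value for `key` if it is already set; otherwise it stores `default` under `key` and returns it. The word list and variable names below are made up for illustration:

```python
counts = {}
counts.get('x', 0)          # returns 0, but leaves counts empty
print(counts)               # {}
counts.setdefault('x', 0)   # returns 0 AND stores it under 'x'
print(counts)               # {'x': 0}

# A common use: grouping items into lists
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_letter = {}  # maps first letter -> list of words

for word in words:
    # Fetch the list for this letter, creating an empty one the first time
    by_letter.setdefault(word[0], []).append(word)

print(by_letter['b'])       # ['banana', 'blueberry']
```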

## Nested Dictionaries and Lists

Often when modeling real world data, it makes sense to nest lists and dicts within each other, e.g. a list of dicts or a dict where the values are lists.  This also applies to lists of lists and dicts where the values are dicts.

Example: You are having a picnic. You have a guest list and want to keep track of who's bringing what.  There are several ways to structure your data depending on what information you have and how you want to use it.

In [None]:
# In this example, we only care about who's bringing what.
potluck1 = {'Jim': ['apples', 'bananas'],
            'Sally': ['pears', 'bacon'],
            'Hassan': ['cups']
           }

In [None]:
potluck1.keys() # our guests

In [None]:
potluck1.values() # the lists of foods each guest is bringing

**QUESTION**: how would you generate a collection that contains just the items being brought to the picnic?  What would be a good data structure?

In [None]:
# Here we have a list of dicts, each representing one guest and what they are bringing.
# This format is more verbose, but a great example of what you might get
# if you have your guests fill out a form and then download the results.
potluck2 = [
    {'name': 'Hassan', 'foods': ['cups']},
    {'name': 'Jim', 'foods': ['apples', 'bananas'] },
    {'name': 'Sally', 'foods': ['pears', 'bacon']}
]

In [None]:
# Here we have a dict of dicts, where each value is a dict which maps the type: amount of the food they're bringing.

potluck3 = {
    'Hassan': {'cups': 5},
    'Jim': {'apples': 2, 'bananas': 5},
    'Sally': {'pears': 3, 'bacon': 10}
}

In [None]:
# Here is the same data represented as a list of lists.
# This is a good example of what our data might look like if we download it from an Excel spreadsheet.
potluck4 = [ # guest, item, quantity
    ['Hassan', 'cups', 5],
    ['Sally', 'pears', 3],
    ['Sally', 'bacon', 10],
    ['Jim', 'apples', 2],
    ['Jim', 'bananas', 5]
]
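
These layouts can be converted into one another. As one possible sketch (not the only way to do it), the loop below turns the list-of-lists layout into the dict-of-dicts layout. The rows are repeated so the block runs on its own, and `rows` and `by_guest` are names introduced here:

```python
rows = [  # guest, item, quantity
    ['Hassan', 'cups', 5],
    ['Sally', 'pears', 3],
    ['Sally', 'bacon', 10],
    ['Jim', 'apples', 2],
    ['Jim', 'bananas', 5],
]

by_guest = {}
for guest, item, quantity in rows:
    # The first time we see a guest, give them an empty inner dict
    if guest not in by_guest:
        by_guest[guest] = {}
    by_guest[guest][item] = quantity

print(by_guest['Sally'])  # {'pears': 3, 'bacon': 10}
```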

## A simple example

In [None]:
import data.states

In [None]:
electoral_votes = data.states.ELECTORAL_VOTES
state_abbreviations = data.states.STATE_ABBREVIATIONS

Say we have a list of state abbreviations representing states that have been won by a candidate.

In [None]:
current = ['PA', 'OR', 'TN', 'MN']

Let's calculate the total number of electoral votes these states are worth.

In [None]:
def total_votes(states):
    total = 0
    for state in states: # loop through the list of states we were given
        name = state_abbreviations[state] # get the name of the state from our abbreviations dictionary
        votes = electoral_votes[name] # get the number of votes that state is worth from our votes dictionary
        total += votes # add the votes to our total
    return total

In [None]:
total_votes(current)
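
Because `data.states` is a local module, here is a self-contained sketch of the same two-step lookup. The two small dictionaries are stand-ins covering only four states, and the vote counts are illustrative values that may differ from those in `data.states`. The names `abbreviations`, `votes_by_state`, and `count_votes` are introduced here:

```python
# Stand-in data (illustrative only, not taken from data.states)
abbreviations = {'PA': 'Pennsylvania', 'OR': 'Oregon',
                 'TN': 'Tennessee', 'MN': 'Minnesota'}
votes_by_state = {'Pennsylvania': 20, 'Oregon': 7,
                  'Tennessee': 11, 'Minnesota': 10}

def count_votes(states):
    total = 0
    for state in states:
        name = abbreviations[state]    # abbreviation -> state name
        total += votes_by_state[name]  # state name -> votes
    return total

print(count_votes(['PA', 'OR']))              # 27
print(count_votes(['PA', 'OR', 'TN', 'MN']))  # 48
```

Note that the function loops over its `states` parameter rather than a global list, so it gives the right answer for whatever list is passed in.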

# Review Questions
- What does the code for an empty dictionary look like?
- What happens if you access a key that doesn't exist in a dict, like so: `states['England']`?
- What is the difference between the code `'apple' in foods` and `'apple' in foods.values()`?
- What is the difference between `foo['monkey']` and `foo.get('monkey')`?
- What do lists and dicts have in common?

## Practice:
Update your Recipe data format to use a dictionary instead of a list.