By The End Of This Session You Should Be Able To:
----

- Use nested dictionaries
- Use sets
- Count items with Counters
- Build dicts with dict comprehensions
- Sort dictionaries by key and by value

----
- Gotchas
- Python dictionary keys besides strings and integers? 


Nested dictionaries
------

Sometimes you have hierarchical data...

For example, a bunch of people and their attributes

In [98]:
d = {'alex': {'color':'red', 'food':'tacos'},
     'brian': {'color':'black', 'food':'pizza'},
     'lambda_dog': {'color':'brown', 'food':'kibble'}}

In [100]:
d['lambda_dog']

{'color': 'brown', 'food': 'kibble'}

In [99]:
d['lambda_dog']['food']

'kibble'

In [101]:
d.keys()

dict_keys(['alex', 'brian', 'lambda_dog'])

In [102]:
d['lambda_dog'].keys()

dict_keys(['color', 'food'])

Summary
-----

- dicts can store any object including other dictionaries

- __nested dictionaries__ are a good way to store hierarchical data
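
A minimal sketch of working with the nested dictionary `d` from the cells above: looping over both levels and using `.get()` to avoid a `KeyError` when an inner key is missing. The `'drink'` attribute, its `'unknown'` default, and the `'horchata'` value are hypothetical and only there for illustration.

```python
# Same nested dictionary as in the cells above
d = {'alex': {'color': 'red', 'food': 'tacos'},
     'brian': {'color': 'black', 'food': 'pizza'},
     'lambda_dog': {'color': 'brown', 'food': 'kibble'}}

# Loop over the outer keys and the inner dictionaries together
favorite_foods = {}
for name, attributes in d.items():
    favorite_foods[name] = attributes['food']

# .get() returns a default instead of raising KeyError for a missing key
# ('drink' is a hypothetical attribute that nobody has yet)
drink = d['alex'].get('drink', 'unknown')

# Add a new attribute to one person's inner dictionary (hypothetical value)
d['alex']['drink'] = 'horchata'

print(favorite_foods)
print(drink)
```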

Sets
----

Sometimes you just want a collection of unique values (you don't need the keys)

In [54]:
colors = {'black', 'red', 'brown'}

In [55]:
type(colors)

set

Membership checking is the "killer" feature of sets
----

In [85]:
'red' in colors

True

In [86]:
'rainbow' in colors

False
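
As a rough illustration of why membership checks matter, the sketch below filters a list of values against the `colors` set. The `candidates` list is a made-up input. Each `in` check on a set is a hash lookup, so it does not have to scan the whole collection the way a list would.

```python
colors = {'black', 'red', 'brown'}

# Hypothetical list of values to check against the set
candidates = ['red', 'rainbow', 'brown', 'red']

# Split the candidates by whether they are in the set
known = [c for c in candidates if c in colors]
unknown = [c for c in candidates if c not in colors]

print(known)    # ['red', 'brown', 'red']
print(unknown)  # ['rainbow']
```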

Set Methods
----

In [58]:
# set.<tab>

In [59]:
colors.pop()

'brown'

In [60]:
colors.add('brown')

In [61]:
colors

{'black', 'brown', 'red'}

In [62]:
colors.add('brown')

In [63]:
colors

{'black', 'brown', 'red'}
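
Beyond `pop()` and `add()`, sets also support the usual set algebra. The sketch below shows a few of these operations; `warm_colors` and `'purple'` are hypothetical values used only for illustration.

```python
colors = {'black', 'red', 'brown'}

# Hypothetical second set, for illustration only
warm_colors = {'red', 'orange', 'yellow'}

both = colors & warm_colors         # intersection: items in both sets
either = colors | warm_colors       # union: items in at least one set
only_colors = colors - warm_colors  # difference: items only in colors

# Converting a list to a set drops the duplicates
unique = set(['red', 'red', 'black'])

# discard() removes an item and does nothing if it is absent;
# remove() would raise KeyError instead
colors.discard('purple')

print(both)         # {'red'}
```

Because sets are unordered, `pop()` removes an arbitrary item, so the value it returns may differ from run to run.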

Counters
----

Data Science is mostly counting things
-----

<center><img src="https://muppetmindset.files.wordpress.com/2009/12/count-bats.png" width="700"/></center>

Let's say you want to count the number of times each letter appears in a word...

In [64]:
word = 'abracadabra'

In [65]:
from collections import Counter

In [66]:
Counter(word)

Counter({'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2})

In [67]:
c = Counter(word)

In [68]:
c['a']

5

In [69]:
c.most_common()

[('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]

Count additional items
----

In [70]:
c['a'] = c['a'] + 1

In [71]:
c['a']

6

In [72]:
c['a'] += 1

In [73]:
c['a']

7
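
Incrementing one key at a time works, but a Counter can also absorb a whole batch of new items at once. This is a small sketch using `Counter.update()`; the second word, `'banana'`, is just an example input.

```python
from collections import Counter

c = Counter('abracadabra')

# update() adds the counts from another iterable to the existing counts
c.update('banana')

# Looking up a missing key returns 0 instead of raising KeyError,
# and it does not add the key to the Counter
z_count = c['z']

print(c['a'])   # 5 from 'abracadabra' + 3 from 'banana' = 8
print(z_count)  # 0
```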

Summary
-----

- Counting the number of items is automatic with Counters

- This is useful for 'word count', the common business problem of finding the frequency of words in documents
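
The word count problem can be sketched in a few lines. The `text` below is a made-up sentence standing in for a real document, and splitting on whitespace is a simplifying assumption (real text usually needs lowercasing and punctuation handling first).

```python
from collections import Counter

# Hypothetical document text, for illustration only
text = 'the cat and the hat and the bat'

# Split on whitespace and count each word
word_counts = Counter(text.split())

# most_common(n) returns the n most frequent (word, count) pairs
top_two = word_counts.most_common(2)

print(top_two)  # [('the', 3), ('and', 2)]
```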

<br>
<br> 
<br>

----
dict comprehension
-----

In [89]:
# Make a list of numbers
numbers = []
for item in range(10):
    numbers.append(item)
    
numbers

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [90]:
# list comprehension is a better way
[item for item in range(10)]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [93]:
# dict comprehension is also a thing
{x: x*x for x in range(10)}

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

Summary
-----

A dict comprehension is another way to build a dict.

Use a dict comprehension when the keys or values of the dict need to be computed.
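
A few more dict comprehension patterns, as a sketch: computing values from a list, filtering with an `if` clause, and swapping keys and values. The variable names are illustrative and not part of the original notebook.

```python
# Build a dict from a list, computing each value
words = ['alex', 'brian', 'lambda_dog']
name_lengths = {word: len(word) for word in words}

# Add an `if` clause to keep only some of the items
even_squares = {x: x * x for x in range(10) if x % 2 == 0}

# Swap the keys and values of an existing dict
# (this only works cleanly when the values are unique and hashable)
inverted = {length: word for word, length in name_lengths.items()}

print(name_lengths)  # {'alex': 4, 'brian': 5, 'lambda_dog': 10}
print(even_squares)  # {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}
```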

<br>
<br> 
<br>

----
Sorting dicts by value
-----

In [74]:
high_temps = {}
high_temps['sf'] = 72
high_temps['seattle'] = 73
high_temps['austin'] = 105
high_temps['phoenix'] = 113

In [75]:
# Sorting a dict sorts its keys
sorted(high_temps)

['austin', 'phoenix', 'seattle', 'sf']

In [76]:
sorted(high_temps.values())

[72, 73, 105, 113]

In [77]:
sorted(high_temps.values(), 
       reverse=True)

[113, 105, 73, 72]

In [78]:
# Tuples are compared by their first element, so this still sorts by key
sorted(high_temps.items(), 
       reverse=True)

[('sf', 72), ('seattle', 73), ('phoenix', 113), ('austin', 105)]

In [79]:
# This fails: key must be a function, not a collection of values
sorted(high_temps.items(),
       key=high_temps.values(),
       reverse=True)

TypeError: 'dict_values' object is not callable

In [80]:
# key is called on each (city, temp) pair; item[1] is the temperature
sorted(high_temps.items(),
       key=lambda item: item[1],
       reverse=True)

[('phoenix', 113), ('austin', 105), ('seattle', 73), ('sf', 72)]
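
The `key` argument has to be a function that `sorted` calls on each item, which is why passing the values themselves raised a `TypeError`. As an alternative sketch, `operator.itemgetter(1)` does the same job as `lambda item: item[1]`. The example reuses the letter counts from the Counter section; the variable names are illustrative.

```python
from operator import itemgetter

# Letter counts for 'abracadabra', as computed in the Counter section
letter_counts = {'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1}

# itemgetter(1) picks the value out of each (key, value) pair
by_count = sorted(letter_counts.items(), key=itemgetter(1), reverse=True)

# dict() keeps insertion order (Python 3.7+), so this gives a sorted dict
sorted_counts = dict(by_count)

# To find only the key with the largest value, max() with a key function is enough
top_letter = max(letter_counts, key=letter_counts.get)

print(by_count)    # [('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
print(top_letter)  # a
```

Python's sort is stable, so items with equal counts (such as `'b'` and `'r'`) keep their original relative order.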

Take home message
----

Python's powerful built-in functions and methods can be mixed and matched to conduct advanced analysis

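Summary
------

- `sorted(d)` returns the keys of a dict in sorted order
- `sorted(d.values())` returns the values in sorted order
- `sorted(d.items(), key=lambda item: item[1])` sorts the key-value pairs by value
- `reverse=True` sorts in descending order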

Play with the code
------

[bit.ly/py_dict_intro](http://bit.ly/py_dict_intro)

Further study
------

- The mighty python dictionary  
- The mighty python dictionary made mightier
- Raymond Hettinger: Modern Python Dictionaries, A confluence of a dozen great ideas

Dict keys have to be hashable
------

The following are hashable (and can be keys):
- strings
- numbers (ints & floats)
- tuples (as long as every element is hashable)
- frozensets

The following are not hashable (and can __not__ be keys):
- lists
- dictionaries
- sets
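
A short sketch of these rules in action. The coordinates and labels are hypothetical; the point is only which kinds of objects are accepted as keys.

```python
# Tuples and frozensets are hashable, so they work as keys
# (the coordinates and labels here are hypothetical)
locations = {(37.77, -122.42): 'sf'}
groups = {frozenset({'red', 'black'}): 'dark colors'}

# Using a list as a key raises TypeError
try:
    bad = {['a', 'list']: 'value'}
    message = ''
except TypeError as error:
    message = str(error)

# A tuple that contains a list is not hashable either
try:
    hash((1, [2, 3]))
    tuple_with_list_hashable = True
except TypeError:
    tuple_with_list_hashable = False

print(locations[(37.77, -122.42)])
print(message)
print(tuple_with_list_hashable)
```

If you need list-like or set-like keys, converting them with `tuple(...)` or `frozenset(...)` is the usual workaround.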

<br>
<br> 
<br>

----