# Chapter 7 - Dictionary Tricks

___
## 7.1 Dictionary Default Values

Python’s dictionaries have a ```get()``` method for looking up a key while
providing a fallback value.

### Summary
* Avoid explicit ```key in dict``` checks when testing for membership.
* EAFP-style exception handling or using the built-in ```get()```
method is preferable.
* In some cases, the ```collections.defaultdict``` class from the
standard library can also be helpful.

#### the problem

In [1]:
name_for_userid = {
    382: 'Alice',
    950: 'Bob',
    590: 'Dilbert',
}

In [2]:
def greeting(userid):
    return 'Hi %s' % name_for_userid[userid]

In [3]:
greeting(382)

'Hi Alice'

In [4]:
greeting(33333)

KeyError: 33333

#### Returning a default value

In [5]:
# this is not very pythonic
def greeting(userid):
    if userid in name_for_userid:
        return 'Hi %s!' % name_for_userid[userid]
    else:
        return 'Hi there!'

In [6]:
greeting(382)

'Hi Alice!'

In [7]:
greeting(1)

'Hi there!'

In [8]:
# a more pythonic implementation using EAFP: easier to ask for forgiveness than permission
def greeting(userid):
    try:
        return 'Hi %s!' % name_for_userid[userid]
    except KeyError:
        return 'Hi there!'

### preferred implementation with the get method

In [9]:
# the most pythonic way to do this is with the get() method
def greeting(userid):
    return 'Hi %s!' % name_for_userid.get(
        userid, 'there')

In [10]:
greeting(33333)

'Hi there!'

In [11]:
greeting(382)

'Hi Alice!'
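
The summary also mentions ```collections.defaultdict```, which the cells above do not show. A minimal sketch, assuming some made-up visit data (the names ```visits``` and ```pages_by_user``` are hypothetical):

```python
from collections import defaultdict

# Hypothetical example data: (user, page) pairs.
visits = [('alice', '/home'), ('bob', '/home'), ('alice', '/about')]

# defaultdict(list) calls list() to create the value for a missing key.
pages_by_user = defaultdict(list)
for user, page in visits:
    pages_by_user[user].append(page)

print(dict(pages_by_user))
# {'alice': ['/home', '/about'], 'bob': ['/home']}

# Caveat: merely reading a missing key inserts it with the default value.
pages_by_user['carol']
print('carol' in pages_by_user)  # True
```

Because a lookup inserts missing keys, ```get()``` is usually the safer choice when the dictionary should stay unchanged.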

___
## 7.2 Sorting dictionaries for fun and profit

Python dictionaries preserve insertion order (guaranteed since Python 3.7), but they have no built-in sorted order.

### Summary
* When creating sorted “views” of dictionaries and other collections, you can influence the sort order with a **key func**.
* *Key funcs* are an important concept in Python. The most frequently used ones were even added to the operator module in the standard library.
* Functions are first-class citizens in Python. This is a powerful
feature you’ll find used everywhere in the language.


In [12]:
# default sorted
xs = {'a' : 4, 'c' : 2, 'b' : 3, 'd' : 1}
sorted(xs.items())

[('a', 4), ('b', 3), ('c', 2), ('d', 1)]

In [13]:
sorted(xs.items(), key=lambda x:x[1], reverse=True)

[('a', 4), ('b', 3), ('c', 2), ('d', 1)]

In [14]:
import operator
sorted(xs.items(), key=operator.itemgetter(1), reverse=True)

[('a', 4), ('b', 3), ('c', 2), ('d', 1)]

In [15]:
# a more abstract key func
xs = {'a' : -4, 'c' : 2, 'b' : 3, 'd' : 1}
sorted(xs.items(), key=lambda x: abs(x[1])**(-0.5))

[('a', -4), ('b', 3), ('c', 2), ('d', 1)]
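
A key func can also return a tuple, which is one way to break ties between equal values. The sketch below uses hypothetical data (```scores```) and shows both a lambda and ```operator.itemgetter``` with two indices:

```python
import operator

# Hypothetical data with tied values.
scores = {'dave': 2, 'alice': 3, 'carol': 2, 'bob': 3}

# Sort by value descending, then by key ascending to break ties.
by_score = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
print(by_score)
# [('alice', 3), ('bob', 3), ('carol', 2), ('dave', 2)]

# itemgetter(1, 0) builds the tuple (value, key) for each item.
by_value_then_key = sorted(scores.items(), key=operator.itemgetter(1, 0))
print(by_value_then_key)
# [('carol', 2), ('dave', 2), ('alice', 3), ('bob', 3)]
```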

___
## 7.3 Emulating switch/case statements with dicts

### Summary 
* Python doesn’t have a switch/case statement. But in some cases
you can avoid long if-chains with a dictionary-based dispatch
table.
* Once again Python’s first-class functions prove to be a powerful
tool. But with great power comes great responsibility.

Python doesn’t have switch/case statements so it’s sometimes necessary to write long ```if…elif…else``` chains as a workaround. In this chapter you’ll discover a trick you can use to emulate switch/case statements in Python with dictionaries and first-class functions.

In [16]:
# we make use of python's first-class functions
def myfunc(a,b):
    return a + b
funcs = [myfunc]
funcs[0](2,3)

5

### first attempt at solution using conditional logic

In [22]:
def dispatch_if(operator, x, y):
    if operator == 'add':
        return x + y
    elif operator == 'sub':
        return x - y
    elif operator == 'mul':
        return x * y
    elif operator == 'div':
        return x/y
    # falls through: implicitly returns None for an unknown operator

In [23]:
dispatch_if('mul', 2, 8)

16

### Preferred solution

#### Notes on this implementation
Every time ```dispatch_dict()``` (defined below) is called, it creates a temporary dictionary and a bunch of lambdas for the opcode lookup. This
isn’t ideal from a performance perspective. For code that needs to be
fast, **it makes more sense to create the dictionary once as a constant**
and then to reference it when the function is called. We don’t want to
recreate the dictionary every time we need to do a lookup.

In [24]:
# we can pass functions around as objects
# we also make use of the get function and provide a default function
def dispatch_dict(operator, x, y):
    return {
        'add': lambda: x + y,
        'sub': lambda: x - y,
        'mul': lambda: x * y,
        'div': lambda: x / y,
    }.get(operator, lambda: None)()

In [25]:
dispatch_dict('unknown', 2, 8)
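
The note about creating the dictionary once as a constant could look roughly like this. It is a sketch, not the notebook's own code: ```OPERATIONS``` and ```dispatch_table``` are hypothetical names, and the functions from the standard ```operator``` module stand in for the lambdas.

```python
import operator

# Built once at import time instead of on every call.
OPERATIONS = {
    'add': operator.add,
    'sub': operator.sub,
    'mul': operator.mul,
    'div': operator.truediv,
}

def dispatch_table(op_name, x, y):
    # Look up the function; unknown names fall back to None.
    func = OPERATIONS.get(op_name)
    if func is None:
        return None
    return func(x, y)

print(dispatch_table('mul', 2, 8))      # 16
print(dispatch_table('unknown', 2, 8))  # None
```

Naming the parameter ```op_name``` rather than ```operator``` also avoids shadowing the ```operator``` module inside the function.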

___
## 7.4 The craziest dict expression in the west

“The **Boolean type is a subtype of the integer type**, and
**Boolean values behave like the values 0 and 1**, respectively, in almost all contexts, the exception being that
when converted to a string, the strings ‘False’ or ‘True’
are returned, respectively.”

(Python docs, https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy)

### Summary 
* Dictionaries treat keys as identical if their ```__eq__``` comparison
result says they’re equal and their hash values are the same.
* **Unexpected dictionary key collisions** can and will lead to surprising results.


In [27]:
zen_koan = {True : 'yes', 1 : 'no', 1.0 : 'maybe'}
zen_koan

{True: 'maybe'}

**The original key object is not updated when a new value is associated with it**

In [28]:
# equivalent to this sequence of code
xs = dict()
xs[True] = 'yes'
xs[1] = 'no'
xs[1.0] = 'maybe'
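
Inspecting the result of that sequence makes the point concrete: the first key object (```True```) is kept, while the value comes from the last assignment. A self-contained check:

```python
xs = dict()
xs[True] = 'yes'
xs[1] = 'no'
xs[1.0] = 'maybe'

print(xs)       # {True: 'maybe'}
print(len(xs))  # 1

# The stored key is still the original bool, not the int or the float.
key = next(iter(xs))
print(type(key))  # <class 'bool'>
```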

In [29]:
# they are all equivalent
True == 1 == 1.0

True

In [30]:
['no', 'yes'][True]

'yes'

This behaviour is related to how the ```dict``` class compares keys, using both their hash values and their equality. The experiments below check each of the two in isolation.

### Investigating the hashtable structure 

The ```AlwaysEquals``` class defined below is special in two ways:

* First, because its ```__eq__``` dunder method always returns ```True```, all instances of this class will pretend they’re equal to any other object.
* And second, each ```AlwaysEquals``` instance will also return a unique
hash value generated by the built-in ```id()``` function.

#### Equality comparison test

In [31]:
class AlwaysEquals:
    def __eq__(self, other):
        return True
    
    def __hash__(self):
        return id(self)

In [32]:
AlwaysEquals() == AlwaysEquals()

True

In [33]:
AlwaysEquals() == 42

True

In [34]:
AlwaysEquals() == 'whaaaat?'

True

In CPython, ```id()``` returns the **address of the object in memory**, which
is guaranteed to be unique among objects that exist at the same time.

In [35]:
objects = [AlwaysEquals(),
           AlwaysEquals(),
           AlwaysEquals(),]
[hash(obj) for obj in objects]

[1686759430320, 1686759429936, 1686759428448]

This will allow us to **test if dictionary keys are overwritten** based on their **equality comparison result alone**.

In [37]:
# the keys in the next example are not overwritten even though they compare as equal
{AlwaysEquals(): 'yes', AlwaysEquals(): 'no'}

{<__main__.AlwaysEquals at 0x188bab254c0>: 'yes',
 <__main__.AlwaysEquals at 0x188bab256a0>: 'no'}

#### hash comparison test and overwriting keys

Instances of the ```SameHash``` class defined below return the same hash value but compare as **non-equal**.

In [38]:
class SameHash:
    def __hash__(self):
        return 1

In [39]:
a = SameHash()
b = SameHash()
a == b

False

In [40]:
hash(a), hash(b)

(1, 1)

In [42]:
{a : 'a', b : 'b'}

{<__main__.SameHash at 0x188bab25e50>: 'a',
 <__main__.SameHash at 0x188bab25130>: 'b'}

So the keys are not overwritten because of
hash value collisions alone either.

### Summary of findings

Both the equality comparison and the hash value have to match for a key to be overwritten.

In [46]:
# they all compare equal
True == 1 == 1.0

True

In [47]:
# they all have the same hash value
(hash(True), hash(1), hash(1.0))

(1, 1, 1)

In [48]:
# the combined effect is this strange dictionary expression
{True: 'yes', 1: 'no', 1.0: 'maybe'}

{True: 'maybe'}
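
The same effect can be reproduced with a custom class whose instances both compare equal and share a hash value. ```SameHashAndEqual``` is a hypothetical class written for this illustration:

```python
class SameHashAndEqual:
    """Instances compare equal to each other and share one hash value."""

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        return isinstance(other, SameHashAndEqual)

    def __hash__(self):
        return 1

    def __repr__(self):
        return f'SameHashAndEqual({self.label!r})'


first = SameHashAndEqual('first')
second = SameHashAndEqual('second')

# Equal hash and equal comparison: the second key overwrites the value,
# but the dictionary keeps the first key object.
d = {first: 'yes', second: 'no'}
print(d)  # {SameHashAndEqual('first'): 'no'}
```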

___
## 7.5 So many ways to merge dictionaries

Often using dictionaries as the underlying data structure for representing configuration keys and values is beneficial. And so frequently a way is needed to combine or to **merge the config defaults and the user overrides** into a single dictionary with the final configuration values.

Sometimes you need a way to merge two or more dictionaries into one, so that the resulting dictionary contains a combination of the keys and values of the source dicts.

In [49]:
# config defaults
xs = {'a' : 1, 'b' : 2}
# overrides
ys = {'b' : 3, 'c': 4}

### Classical solution for merging

In [54]:
zs = {}
zs.update(xs)
# note that the default value for b is overwritten
zs.update(ys)

In [55]:
zs

{'a': 1, 'b': 3, 'c': 4}

In [56]:
# conceptually, update() behaves like this simplified implementation
def update(dict1, dict2):
    for key, value in dict2.items():
        dict1[key] = value

### Modern approach 
Another technique combines the ```dict()``` built-in with the ```**``` operator for unpacking.
However, just like making repeated ```update()``` calls, this approach
only works for merging two dictionaries and **cannot be generalized
to combine an arbitrary number of dictionaries in one step**.

In [57]:
zs = dict(xs, **ys)
zs

{'a': 1, 'b': 3, 'c': 4}
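
Dictionary unpacking inside a dict literal (Python 3.5+) does generalize to any number of dictionaries. The sketch below adds a hypothetical third dictionary ```ws``` to illustrate this; the ```|``` operator is guarded because it requires Python 3.9 or newer.

```python
import sys

xs = {'a': 1, 'b': 2}   # config defaults
ys = {'b': 3, 'c': 4}   # overrides
ws = {'c': 5, 'd': 6}   # hypothetical third dictionary

# Later dictionaries win; values are applied left to right.
merged = {**xs, **ys, **ws}
print(merged)  # {'a': 1, 'b': 3, 'c': 5, 'd': 6}

# Unlike dict(xs, **ys), unpacking in a literal also accepts non-string keys.
numbers = {**{1: 'one'}, **{2: 'two'}}
print(numbers)  # {1: 'one', 2: 'two'}

# Python 3.9+ offers the | operator for the same left-to-right merge.
if sys.version_info >= (3, 9):
    print(xs | ys)  # {'a': 1, 'b': 3, 'c': 4}
```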

___
## 7.6 Dictionary Pretty-Printing

### Summary
* The default to-string conversion for dictionary objects in
Python can be difficult to read.
* The ```pprint``` and ```json``` modules are “higher-fidelity” options built
into the Python standard library.
* Be careful with using ```json.dumps()``` and non-primitive keys
and values as this will trigger a ```TypeError```.


In [58]:
# no indentation!
mapping = {'a': 23, 'b': 42, 'c': 0xc0ffee}
str(mapping)

"{'a': 23, 'b': 42, 'c': 12648430}"

### ```json``` module 
The output of ```json.dumps()``` looks nice and readable, but it isn’t a perfect solution. Printing dictionaries with the ```json``` module **only works with dicts that contain primitive types**; you’ll **run into trouble trying to print a dictionary
that contains a non-primitive data type, like a function or a set**.


In [59]:
import json

print(json.dumps(mapping, indent=4, sort_keys=True))

{
    "a": 23,
    "b": 42,
    "c": 12648430
}


In [61]:
mapping['d'] = {1, 2, 3}
try: 
    json.dumps(mapping)
except TypeError as err:
    print(err)

Object of type set is not JSON serializable
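
One possible workaround is the ```default``` parameter of ```json.dumps()```, which is called for every object the encoder cannot serialize. The helper ```encode_extra``` below is a hypothetical name, and converting sets to sorted lists is just one assumption about what the output should look like:

```python
import json

mapping = {'a': 23, 'b': 42, 'c': 0xc0ffee, 'd': {1, 2, 3}}

def encode_extra(obj):
    # Called by json.dumps only for objects it cannot serialize itself.
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(
        f'Object of type {type(obj).__name__} is not JSON serializable')

text = json.dumps(mapping, indent=4, sort_keys=True, default=encode_extra)
print(text)
```

The set is written as a JSON array, so the output is meant for display rather than for round-tripping: loading it back yields a list, not a set.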


### ```pprint``` module

The ```pprint``` module handles non-primitive types and prints the dictionary in a reproducible manner. However, it doesn’t represent nested structures as well as ```json```.

In [62]:
import pprint

pprint.pprint(mapping)

{'a': 23, 'b': 42, 'c': 12648430, 'd': {1, 2, 3}}
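
The dictionary above fits within the default line width of 80 characters, so ```pprint``` keeps it on one line. Passing a smaller ```width``` makes it wrap; the value 20 here is an arbitrary choice for illustration:

```python
import pprint

mapping = {'a': 23, 'b': 42, 'c': 0xc0ffee, 'd': {1, 2, 3}}

# pformat returns the formatted string instead of printing it.
text = pprint.pformat(mapping, width=20)
print(text)
```

Unlike the ```json``` output, this text is still a valid Python literal.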
