# Dictionaries
Notes from [Deep Dive 3](https://www.udemy.com/course/python-3-deep-dive-part-3/) section 3. Topics covered:

1\. [Creating dictionaries](#creating-dictionaries)

* [constructor](#constructor)
* [comprehensions](#comprehensions)
* [fromkeys()](#fromkeys)
    
2\. [Common operations](#common-operations)

* [basic operations](#basic-operations)
* [get()](#get)
* [pop()](#pop)
* [setdefault()](#setdefault)

3\. [Dictionary views](#dictionary-views)

4\.  [Updating, merging and copying](#updating-merging-copying)

* [update()](#update)
* [unpacking dictionaries](#unpacking-dictionaries)

5\.  [Custom classes and hashing](#custom-classes-and-hashing)

* [object hashes](#object-hashes)
* [custom classes](#custom-classes)

<a id='creating-dictionaries'></a>
## 1. Creating dictionaries

<a id='constructor'></a>
### 1.1 Constructor

__Example 1__:

In [1]:
d = {'randy': ['Randy', 'Marsh', 45],
     (0, 0): 'origin',
     'repr': lambda x: x ** 2,
     'eric': {'name': 'Eric Cartman',
              'age': 10}
    }

d

{'randy': ['Randy', 'Marsh', 45],
 (0, 0): 'origin',
 'repr': <function __main__.<lambda>(x)>,
 'eric': {'name': 'Eric Cartman', 'age': 10}}

In [2]:
d = dict(randy=['Randy', 'Marsh', 45],
         repr=lambda x: x ** 2,
         eric={'name': 'Eric Cartman',
               'age': 10},
         twin=dict(name='Eric Cartman', age=10)
        )

d

{'randy': ['Randy', 'Marsh', 45],
 'repr': <function __main__.<lambda>(x)>,
 'eric': {'name': 'Eric Cartman', 'age': 10},
 'twin': {'name': 'Eric Cartman', 'age': 10}}

<hr>

__Example 2__: 

Use `zip()` to pair keys with values and build a dict from the pairs

In [3]:
d = dict(zip("abc", range(1, 4)))
d

{'a': 1, 'b': 2, 'c': 3}

<hr>

<a id='comprehensions'></a>
### 1.2 Comprehensions

__Example 3A__: 

Use `list comprehension` (for comparison) to generate all combinations of (x, y) coordinates.

In [4]:
import math

x_coords = [-2, -1, 0, 1, 2] 
y_coords = [-2, -1, 0, 1, 2] 

In [5]:
grid = [(x, y) for x in x_coords for y in y_coords]

print(grid)

[(-2, -2), (-2, -1), (-2, 0), (-2, 1), (-2, 2), (-1, -2), (-1, -1), (-1, 0), (-1, 1), (-1, 2), (0, -2), (0, -1), (0, 0), (0, 1), (0, 2), (1, -2), (1, -1), (1, 0), (1, 1), (1, 2), (2, -2), (2, -1), (2, 0), (2, 1), (2, 2)]


<hr>

__Example 3B__: 

Generate a dict with a `dictionary comprehension`, where each coordinate tuple is the key and its distance from the origin is the value

In [6]:
grid_extended = {(x, y): math.hypot(x, y) for x, y in grid}
grid_extended

{(-2, -2): 2.8284271247461903,
 (-2, -1): 2.23606797749979,
 (-2, 0): 2.0,
 (-2, 1): 2.23606797749979,
 (-2, 2): 2.8284271247461903,
 (-1, -2): 2.23606797749979,
 (-1, -1): 1.4142135623730951,
 (-1, 0): 1.0,
 (-1, 1): 1.4142135623730951,
 (-1, 2): 2.23606797749979,
 (0, -2): 2.0,
 (0, -1): 1.0,
 (0, 0): 0.0,
 (0, 1): 1.0,
 (0, 2): 2.0,
 (1, -2): 2.23606797749979,
 (1, -1): 1.4142135623730951,
 (1, 0): 1.0,
 (1, 1): 1.4142135623730951,
 (1, 2): 2.23606797749979,
 (2, -2): 2.8284271247461903,
 (2, -1): 2.23606797749979,
 (2, 0): 2.0,
 (2, 1): 2.23606797749979,
 (2, 2): 2.8284271247461903}

<hr>

<a id='fromkeys'></a>
### 1.3 fromkeys()

Creates a dictionary with `specified keys` all assigned to the same value

```
d = dict.fromkeys(iterable, value)
```
* `value` is optional and defaults to `None` if not provided

__Example 4__:

In [7]:
d = dict.fromkeys(['a', (0,0), 100], 'N/A')

d

{'a': 'N/A', (0, 0): 'N/A', 100: 'N/A'}

In [8]:
d = dict.fromkeys((i**2 for i in range(1, 5)), False)

d

{1: False, 4: False, 9: False, 16: False}
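
One caveat worth a short sketch: `fromkeys()` assigns the very same object to every key, which matters when that value is mutable. The names `shared` and `separate` below are illustrative only and not part of the course notes.

```python
# fromkeys() stores one and the same list object under every key
shared = dict.fromkeys('ab', [])
shared['a'].append(1)
print(shared)      # {'a': [1], 'b': [1]}

# a dictionary comprehension builds a fresh list for each key instead
separate = {key: [] for key in 'ab'}
separate['a'].append(1)
print(separate)    # {'a': [1], 'b': []}
```

If every key needs its own mutable value, the comprehension is the safer choice.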

<hr>

<a id='common-operations'></a>
## 2. Common operations

<a id='basic-operations'></a>
### 2.1 Basic operations

* `d[key]`
* `del d[key]`
* `d.popitem()` - removes the last inserted item (order guaranteed from Python 3.7)

__Example 5__: 

Standard way to look up a key in a dictionary: `d[key]`

In [9]:
d = dict(zip("abc", range(1, 4)))
d

{'a': 1, 'b': 2, 'c': 3}

In [10]:
d['a']

1

Looking up a non-existing key raises a `KeyError`

In [11]:
try:
    d['python']
except KeyError as ex:
    print(f'KeyError: {ex} not in d.')

KeyError: 'python' not in d.


<hr>

__Example 6__:

Standard way to remove an element from a dictionary: `del d[key]`

In [12]:
d = dict(zip("abcdef", range(1, 7)))
d

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}

In [13]:
del d['b']
del d['e']
d

{'a': 1, 'c': 3, 'd': 4, 'f': 6}

Deleting a non-existing key also raises a `KeyError`

In [14]:
try:
    del d['python']
except KeyError as ex:
    print(f'KeyError: {ex} not in d.')

KeyError: 'python' not in d.


<hr>

__Example 7__:

Dictionaries preserve insertion order. This is an implementation detail of CPython 3.6 and a language guarantee from Python 3.7 onwards.

`d.popitem()`:
* removes an item from `d`
    * prior to Python 3.7 there is no guarantee which item (key, value) is removed
    * from Python 3.7 onwards the __last inserted__ item (key, value) is removed
* returns the removed item as a tuple (key, value)
* raises `KeyError` if the dictionary is empty

In [15]:
d

{'a': 1, 'c': 3, 'd': 4, 'f': 6}

In [16]:
d.popitem()

('f', 6)

In [17]:
d

{'a': 1, 'c': 3, 'd': 4}
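
As a small sketch of the behaviour listed above (assuming Python 3.7+, where the order is guaranteed), the loop below empties a dictionary with `popitem()` and then shows what happens on an empty one. The names `stack` and `removed` are made up for this illustration.

```python
stack = {'a': 1, 'c': 3, 'd': 4}

# popitem() hands items back in reverse insertion order (last in, first out)
removed = []
while stack:
    removed.append(stack.popitem())
print(removed)    # [('d', 4), ('c', 3), ('a', 1)]

# calling popitem() on an empty dictionary raises KeyError
try:
    stack.popitem()
except KeyError as ex:
    print(f'KeyError: {ex}')
```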

<hr>

<a id='get'></a>
### 2.2 get()

```
d.get(key, default)
```

* returns the value if the key is found, otherwise `default` (`None` if `default` is omitted)

__Example 8__: 

Look up a key in a dictionary using `get()`:
* returns `None` if the key is not in the dictionary (instead of raising `KeyError`)
* returns the specified default value if one is given and the key is not in the dictionary

In [18]:
# get() on a non-existing key returns None
result = d.get('python')

type(result)

NoneType

In [19]:
# get() on a non-existing key returns the specified default value
result = d.get('z', 'N/A')

result

'N/A'

In [20]:
# Calling existing key
result = d.get('a', 'N/A')

result

1

<hr>

__Example 9-A__: 

Create character counter with `get()` function.

* if character doesn't exist in the dictionary, `get()` returns 0 as defined and then adds 1
* if character exists in the dictionary, `get()` returns existing count value and adds 1 

In [21]:
text = "Dictionaries are ubiquitous in Python. Classes are essentially dictionaries, modules are dictionaries, namespaces are dictionaries, sets are dictionaries and many more."

In [22]:
counts = {}

for c in text:
    counts[c] = counts.get(c, 0) + 1

print(counts)

{'D': 1, 'i': 19, 'c': 6, 't': 9, 'o': 9, 'n': 11, 'a': 16, 'r': 11, 'e': 18, 's': 16, ' ': 20, 'u': 4, 'b': 1, 'q': 1, 'P': 1, 'y': 3, 'h': 1, '.': 2, 'C': 1, 'l': 4, 'd': 6, ',': 3, 'm': 4, 'p': 1}


<hr>

__Example 9-B:__

Extended counter that also folds uppercase and lowercase duplicates together and skips whitespace

In [23]:
counts = {}
for c in text:
    
    key = c.lower().strip()
    if key:  # if key not empty
        # if key doesn't exist yet, get() returns 0 and then 1 is added
        counts[key] = counts.get(key, 0) + 1

print(counts)

{'d': 7, 'i': 19, 'c': 7, 't': 9, 'o': 9, 'n': 11, 'a': 16, 'r': 11, 'e': 18, 's': 16, 'u': 4, 'b': 1, 'q': 1, 'p': 2, 'y': 3, 'h': 1, '.': 2, 'l': 4, ',': 3, 'm': 4}
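
For comparison, the same tally can be sketched with `collections.Counter` from the standard library. This is an alternative to the `get()` pattern above, not the approach used in the course example; the shorter `sample` string is an assumption made to keep the sketch small.

```python
from collections import Counter

sample = "Dictionaries are ubiquitous in Python."

# manual tally with get(), as in the example above
manual = {}
for c in sample:
    key = c.lower().strip()
    if key:
        manual[key] = manual.get(key, 0) + 1

# Counter performs the same counting in a single expression
counted = Counter(c.lower() for c in sample if c.strip())

print(dict(counted) == manual)    # True
print(counted.most_common(1))
```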


<hr>

<a id='pop'></a>
### 2.3 pop() 

* use it to remove the element with the specified key from a dictionary
* returns the value of the removed element, or the specified default value if the key is not in the dict

__Example 10__:
* `pop()` removes `a` from the dictionary and returns its value
* `z` is not in the dictionary, hence `pop()` returns the default, 100.

In [24]:
d = dict.fromkeys('abcd', 0)
d

{'a': 0, 'b': 0, 'c': 0, 'd': 0}

In [25]:
result = d.pop('a', 100)
result

0

In [26]:
result = d.pop('z', 100)
result

100

In [27]:
d

{'b': 0, 'c': 0, 'd': 0}
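
The example above always passes a default. As a short sketch of the other case, `pop()` without a default raises `KeyError` when the key is missing; the variable name `missing` is illustrative.

```python
d = {'b': 0, 'c': 0, 'd': 0}

# without a default, a missing key raises KeyError
try:
    d.pop('z')
except KeyError as ex:
    print(f'KeyError: {ex}')    # KeyError: 'z'

# with a default, the dictionary is left unchanged and the default is returned
missing = d.pop('z', None)
print(missing)                  # None
```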

<hr>

<a id='setdefault'></a>
### 2.4 setdefault() 
* use `setdefault()` to insert a key with a default value only if the key is not already in the dictionary
* returns the existing value if the key is found, otherwise sets the new key/value and returns the new value

__Example 11__:
* inserts the new key `x` into the dictionary and returns the new value
* returns the value of the existing key `b` and leaves it unchanged

In [28]:
d.setdefault('x', 100)

100

In [29]:
d.setdefault('b', 100)

0

In [30]:
d

{'b': 0, 'c': 0, 'd': 0, 'x': 100}

<hr>

__Example 12-A__:

Categorise characters as lower, upper, other or None (for a space)

In [31]:
text = "Dictionaries are ubiquitous in Python. Classes are essentially dictionaries, modules are dictionaries, namespaces are dictionaries, sets are dictionaries and many more."

In [32]:
import string

print(string.ascii_lowercase)
print(string.ascii_uppercase)

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ


In [33]:
def cat_key(c):
    """ Categorizes if character is lower, upper, other or None """
    categories = {' ': None,
                  string.ascii_lowercase: 'lower',
                  string.ascii_uppercase: 'upper'}
    
    # each key is a string of characters: return the category
    # of the first key string that contains c
    for key in categories:
        if c in key:
            return categories[key]
    return 'other'

In [34]:
cat_key('A'), cat_key('g'), cat_key('@'), cat_key(' ')

('upper', 'lower', 'other', None)

<hr>

__Example 12-B__:

Alternative cat_key() function to categorise characters:
* create 3 category dictionaries and use `.fromkeys()` to assign 'lower' or 'upper' value to all keys
* unpack all 3 dictionaries to a single categories dict
* use `.get()` to locate character in categories dict or return 'other'

In [35]:
def cat_key2(c):
    """ Categorizes if character is lower, upper, other or None """
    cat_1 = {' ': None}
    # Generates dictionaries with all lower/upper case keys and assigned values
    cat_2 = dict.fromkeys(string.ascii_lowercase, 'lower')
    cat_3 = dict.fromkeys(string.ascii_uppercase, 'upper')
    # Unpacks cat_1, cat_2, cat_3 into a single dictionary
    categories = {**cat_1, **cat_2, **cat_3}
    return categories.get(c, 'other')

In [36]:
cat_key2('A'), cat_key2('g'), cat_key2('@'), cat_key2(' ')

('upper', 'lower', 'other', None)

In [37]:
categories = {}
for c in text:
    
    key = cat_key(c)
    if key:
        categories.setdefault(key, set()).add(c)
    
for cat in categories:
    print(f'{cat}: ', ''.join(categories[cat]))

upper:  DPC
lower:  qsuyobcepnthdmaril
other:  ,.
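
The `setdefault(key, set())` grouping pattern can also be sketched with `collections.defaultdict`, which creates the empty set on first access. This is an alternative technique, not the one used in the notes; `category()` is a hypothetical stand-in for `cat_key()` and the sample sentence is made up.

```python
from collections import defaultdict

def category(c):
    """Hypothetical stand-in for cat_key(): upper, lower, other or None."""
    if c.isspace():
        return None
    if c.isupper():
        return 'upper'
    if c.islower():
        return 'lower'
    return 'other'

# a missing key is created automatically with an empty set as its value
groups = defaultdict(set)
for c in "Sets are dictionaries, too.":
    key = category(c)
    if key:
        groups[key].add(c)

for cat, chars in groups.items():
    print(f'{cat}: ', ''.join(sorted(chars)))
```

One difference in design: `setdefault()` builds a new `set()` on every call even when the key already exists, while `defaultdict` only calls the factory for missing keys.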


<hr>

<a id='dictionary-views'></a>
## 3. Dictionary views

### 3.1 Views and manipulating views


* `keys()` view behaves like a `set`
    * keys are unique and hashable (which is required for sets)
    * `union`, `intersection`, `difference` operations of these key views are possible (like sets)
    

* `values()` view does __not__ behave like a set
    * in general values are not unique
    * in general values are not hashable
    

* `items()` __may__ behave like a set
    * elements of `items()` are guaranteed to be unique (since keys are unique)
    * if all values are hashable, then it behaves like a set
    * if one or more values are unhashable, then it does not behave like a set
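
A short sketch of the points above: views reflect later changes to the dictionary, `items()` supports set operations when the values are hashable, and a set operation that has to hash an unhashable value fails. The dictionaries `other` and `nested` are made up for the illustration.

```python
d = {'a': 1, 'b': 2}
keys = d.keys()

# a view is dynamic: it reflects changes made after it was created
d['c'] = 3
print(list(keys))     # ['a', 'b', 'c']

# items() with hashable values supports set operations
other = {'b': 2, 'c': 30}
common = d.items() & other.items()
print(common)         # {('b', 2)}

# a union has to hash every (key, value) pair, so a list value fails
nested = {'a': [1, 2]}
union_failed = False
try:
    nested.items() | other.items()
except TypeError as ex:
    union_failed = True
    print(f'TypeError: {ex}')
```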
    

<hr>

__Example 13__:

Set operations


In [38]:
s1 = {'a', 'b', 'c'}
s2 = {'b', 'c', 'd'}

In [39]:
# union 
s1 | s2

{'a', 'b', 'c', 'd'}

In [40]:
# intersection 
s1 & s2

{'b', 'c'}

In [41]:
# difference 
s1 - s2

{'a'}

<hr>

__Example 14__:
* Check for common keys in d1 and d2 dictionaries
* If key exists in both dicts, copy the key to a new dictionary and assign values from both dicts


In [42]:
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 2, 'c': 30, 'd': 4}

In [43]:
k1 = d1.keys()
k2 = d2.keys()

key (set) `intersection`

In [44]:
k1 & k2

{'b', 'c'}

In [45]:
# copy reoccurring key to a new dictionary and assign values from origin dicts
new_dict = {key: (d1[key], d2[key]) for key in d1.keys() & d2.keys()}
new_dict

{'b': (2, 2), 'c': (3, 30)}

<hr>

__Example 15__:

Identify the items whose keys are not common to both dictionaries (here the keys `d` and `e`)

In [46]:
d1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
d2 = {'a': 10, 'b': 20, 'c': 30, 'e': 5}

Define an expression to find the keys that occur in only one of the dictionaries:
* keys `union` - keys `intersection`, or
* alternatively use the `symmetric difference`, which is equal to union - intersection

In [47]:
# keys union - keys intersection
k = (d1.keys() | d2.keys()) - (d1.keys() & d2.keys())
k

{'d', 'e'}

In [48]:
# Alternatively use ^ symmetric difference which is equal to union - intersection
k = d1.keys() ^ d2.keys()
k

{'d', 'e'}

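Use `get()` to return the value of an existing key or `None` for a missing key; `or` then picks whichever value was found. This relies on the stored values being truthy, because `or` also skips values such as `0` or `''`.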

In [49]:
# Use get() to return value of existing key
d1.get('d') or d2.get('d')

4

<hr>

In [50]:
results = {}
for key in k:
    results[key] = d1.get(key) or d2.get(key)
print(results)

{'e': 5, 'd': 4}


In [51]:
results = {key: d1.get(key) or d2.get(key) for key in d1.keys() ^ d2.keys()}
print(results)

{'e': 5, 'd': 4}
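
The `get() or get()` idiom treats falsy values such as `0` as missing. The sketch below shows that edge case and one possible alternative based on a membership test; the dictionaries are altered versions of the ones above and the names `with_or` and `explicit` are illustrative.

```python
d1 = {'a': 1, 'd': 0}     # note the falsy value 0
d2 = {'a': 10, 'e': 5}

# `or` skips the 0 stored in d1 and falls through to d2.get('d'), which is None
with_or = {key: d1.get(key) or d2.get(key) for key in d1.keys() ^ d2.keys()}
print(with_or['d'])       # None

# an explicit membership test keeps falsy values intact
explicit = {key: d1[key] if key in d1 else d2[key]
            for key in d1.keys() ^ d2.keys()}
print(explicit['d'])      # 0
```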


<hr>

<a id='updating-merging-copying'></a>
## 4. Updating, Merging and Copying

<a id='update'></a>
### 4.1 update() 
* Updates dict with keys and values from specified dict
    * Overwrites initial values
    * Adds new keys, values

__Example 16__:

`d1.update(d2)` results in:
* updated `b`
* inserted `c`

In [52]:
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 3}

In [53]:
d1.update(d2)

d1

{'a': 1, 'b': 20, 'c': 3}
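
Besides another dictionary, `update()` also accepts an iterable of (key, value) pairs and keyword arguments. A minimal sketch with made-up data:

```python
d = {'a': 1, 'b': 2}

d.update([('b', 20), ('c', 3)])    # iterable of (key, value) pairs
d.update(e=5)                      # keyword arguments become string keys

print(d)    # {'a': 1, 'b': 20, 'c': 3, 'e': 5}
```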

<hr>

<a id='unpacking-dictionaries'></a>
### 4.2 Unpacking dictionaries

* when a key occurs more than once, the last unpacked value wins
* insertion order is preserved (CPython 3.6, guaranteed from Python 3.7)

__Example 17__:

Unpack predefined dictionaries to a new dictionary

In [54]:
d1 = {'a': 1, 'b': 2}
d2 = {'a': 10, (0,0): 'origin'}
d3 = {'b': 20, 'c': 30, 'a': 100}

In [55]:
d = {**d1, **d2, **d3}
d

{'a': 100, 'b': 20, (0, 0): 'origin', 'c': 30}
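
Assuming Python 3.9 or later, the same merge can be sketched with the `|` operator for dictionaries. This operator is not covered in the notes above; it is shown only for comparison with unpacking.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'a': 10, (0, 0): 'origin'}
d3 = {'b': 20, 'c': 30, 'a': 100}

# `|` creates a new dictionary; for duplicate keys the right-hand operand wins
merged = d1 | d2 | d3
print(merged)                          # {'a': 100, 'b': 20, (0, 0): 'origin', 'c': 30}
print(merged == {**d1, **d2, **d3})    # True
```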

<hr>

__Example 18__:

Build a layered configuration by unpacking defaults first, then global settings, then environment-specific settings

In [56]:
conf_defaults = dict.fromkeys(('host', 'port', 'user', 'pwd', 'database'), None)
conf_defaults

{'host': None, 'port': None, 'user': None, 'pwd': None, 'database': None}

In [57]:
conf_global ={'port': 5432, 'database': 'deepdive'}
conf_global

{'port': 5432, 'database': 'deepdive'}

In [58]:
conf_dev = {'host': 'localhost', 'user': 'test', 'pwd': 'test'}
conf_dev

{'host': 'localhost', 'user': 'test', 'pwd': 'test'}

In [59]:
conf_prod = {'host': 'prodpg.deepdive.com', 'user': '$prod_user', 'pwd': '$prod_pwd', 'database': 'deepdive_prod'}
conf_prod

{'host': 'prodpg.deepdive.com',
 'user': '$prod_user',
 'pwd': '$prod_pwd',
 'database': 'deepdive_prod'}

<hr>

The content of the final dictionary depends on the unpacking order.

`conf_defaults` --> `global` --> `dev`

In [60]:
conf = {**conf_defaults, **conf_global, **conf_dev}
conf

{'host': 'localhost',
 'port': 5432,
 'user': 'test',
 'pwd': 'test',
 'database': 'deepdive'}

<hr>

`conf_defaults` --> `global` --> `prod`

In [61]:
conf = {**conf_defaults, **conf_global, **conf_prod}
conf

{'host': 'prodpg.deepdive.com',
 'port': 5432,
 'user': '$prod_user',
 'pwd': '$prod_pwd',
 'database': 'deepdive_prod'}

<hr>

__Example 19__: 

Passing keyword arguments to a function by unpacking a dictionary

In [62]:
def my_func(*, kw1, kw2, kw3):
    print(kw1, kw2, kw3)

In [63]:
d = {'kw2': 20, 'kw1': 10, 'kw3': 30}

In [64]:
my_func(**d)

10 20 30


### 4.3 Copying dictionaries

__Example 20__:

`shallow copies`:
* the container is a new object -> the dictionaries are independent objects
* the keys and values themselves are not copied -> both dictionaries hold `shared references` to the same objects

In [65]:
d= {'a': 1, 'b': 2, 'c':3, 'd': 4}

In [66]:
d_copy = d.copy()
d_copy = {**d}
d_copy = dict(d)

# slower (don't use)
d_copy = {k: v for k, v in d.items()}

In [67]:
d_copy

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

In [68]:
d.popitem()

('d', 4)

In [69]:
d

{'a': 1, 'b': 2, 'c': 3}

In [70]:
d_copy

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

In [71]:
d['c'] = 30
d

{'a': 1, 'b': 2, 'c': 30}

In [72]:
d_copy

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

<hr>

__Example 21__:

`deepcopy`
* nested objects are copied recursively, so the copy has no shared references to mutable objects in the original

In [73]:
from copy import deepcopy

d1 = deepcopy(d)
d1

{'a': 1, 'b': 2, 'c': 30}
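
The difference between the two kinds of copy only becomes visible with mutable values. The sketch below uses made-up list values to show the shared references of a shallow copy next to the independent values of a deep copy.

```python
from copy import deepcopy

d = {'a': [1, 2], 'b': [3, 4]}
shallow = d.copy()
deep = deepcopy(d)

# mutate a value through the original dictionary
d['a'].append(100)

print(shallow['a'])    # [1, 2, 100]  (same list object as d['a'])
print(deep['a'])       # [1, 2]       (independent copy)
print(shallow['a'] is d['a'], deep['a'] is d['a'])    # True False
```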

<hr>

<a id='custom-classes-and-hashing'></a>
## 5. Custom Classes and Hashing

<a id='object-hashes'></a>
### 5.1 Object hashes

<hr>

Basic structure of dictionary elements: `key`: `value`

- value: any Python object (integer, custom class or instance, function, module, ...)
- key: any `hashable` object
    - built-in immutable objects are hashable
        - int, float, str, complex, bytes, Decimal, Fraction, frozenset
        - tuples if all elements are also hashable
    - functions are hashable as well
    - built-in mutable containers are not hashable
        - set, dictionary, list

<hr>

If an object is hashable:
* the hash of the object must be an `integer` value
* if two objects compare `equal` (`==`), the `hashes` must also be `equal`

Important: two objects that do `not` compare equal `may` still have the same hash (hash collision).
The more hash collisions occur, the slower dictionary lookups become.
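
The rule "equal objects must have equal hashes" can be observed with built-in numbers. As a small sketch: `1`, `1.0` and `True` compare equal, share a hash, and therefore address the same dictionary entry, while a list cannot be hashed at all.

```python
print(1 == 1.0 == True)                      # True
print(hash(1) == hash(1.0) == hash(True))    # True

# equal keys with equal hashes map to one entry: the first key is kept,
# the value is overwritten by each later assignment
d = {1: 'int'}
d[1.0] = 'float'
d[True] = 'bool'
print(d)    # {1: 'bool'}

# a mutable built-in container is not hashable
try:
    hash([1, 2, 3])
except TypeError as ex:
    print(f'TypeError: {ex}')    # TypeError: unhashable type: 'list'
```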

<hr>

__Example 22__:

1.  `t1` and `t2` are two different objects
2.  `t1` and `t2` are equal
3.  `t1` and `t2` are not the same object
4.  if two objects are `equal` then they should have the same `hash`
5.  hence `t1` and `t2` used as a key will point to the same value 

In [74]:
t1 = (1, 2, 3)
t2 = (1, 2, 3)

1. Distinct ids show that `t1` and `t2` are two different objects

In [75]:
id(t1), id(t2)

(84519360, 85109504)

2. `t1` and `t2` are equal

In [76]:
t1==t2

True

3. `t1` and `t2` are not the same object

In [77]:
t1 is t2

False

4.  If two objects are `equal` then they should have the same `hash`

In [78]:
hash(t1), hash(t2)

(529344067295497451, 529344067295497451)

In [79]:
hash(t1) == hash(t2)

True

5. Hence `t1` and `t2` can both be used to retrieve the same value from a dict

In [80]:
d = {t1: 100}

`t2` is a different object than `t1`, but since it is `equal` to `t1` and has the same `hash`, it finds the same key

In [81]:
d[t1]

100

In [82]:
d[t2]

100

In [83]:
d[(1, 2, 3)]

100

<hr>

<a id='custom-classes'></a>
### 5.2 Custom classes
To get the same result with class instances as with tuples, that is, to use different but equal objects as interchangeable dict keys, the objects must compare `equal` and have the same `hash` value.

The class needs an `__eq__` method so that its instances can be compared for equality.

Once `__eq__` is defined, Python makes the instances unhashable by implicitly setting `__hash__` to `None`.

The `__hash__` method therefore has to be defined explicitly as well.
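
A minimal sketch of that rule, using a hypothetical class `OnlyEq` that defines `__eq__` but no `__hash__`:

```python
class OnlyEq:
    """Hypothetical class that defines __eq__ but not __hash__."""
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, OnlyEq) and self.x == other.x

# Python has set __hash__ to None, so instances are unhashable
print(OnlyEq.__hash__)    # None

try:
    {OnlyEq(1): 'value'}
except TypeError as ex:
    print(f'TypeError: {ex}')    # TypeError: unhashable type: 'OnlyEq'
```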

__Example 23__:

In [84]:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    
    def __repr__(self):
        return f'({self.x}, {self.y})'
    
    def __eq__(self, other):
        # if other is a tuple of length 2, convert it to a Point
        if isinstance(other, tuple) and len(other) == 2:
            other = Point(*other)
            
        # check if other is instance of Point class
        if isinstance(other, Point):
            return self.x == other.x and self.y == other.y
        
        else:
            return False
    
    def __hash__(self):
        return hash((self.x, self.y))

In [85]:
pt1 = Point(0,0)
pt2 = Point(1,1)
points = {pt1: 'origin', pt2: 'point at (1,1)'}
points

{(0, 0): 'origin', (1, 1): 'point at (1,1)'}

<hr>

Now we can get the same value out of points dictionary in various ways, using:
* Point instance assigned to `pt1`
* another Point instance with the same x and y `Point(0,0)`
* tuple with the same length and values `(0,0)`

In [86]:
points[pt1], points[Point(0, 0)], points[(0,0)]

('origin', 'origin', 'origin')

<hr>

__Example 24__:

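In the following class, both equality and the hash depend only on the `_id` attribute of `Person`.

The `Point` lookups from Example 23 would no longer work if we mutated `x` or `y` of `pt1` or `pt2`, because the new hash of the key would differ from the hash the key was stored under. Basing `__eq__` and `__hash__` on an attribute that is not meant to change avoids this problem.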

In [87]:
class Person:
    def __init__(self, id_, name, age):
        self._id = id_
        self.name = name
        self.age = age
    
    def __repr__(self):
        return f'Person(id={self._id}, name={self.name}, age={self.age})'
    
    def __eq__(self, other):
        if isinstance(other, Person):
            return self._id == other._id
        else:
            return False
    
    def __hash__(self):
        return hash(self._id)

In [88]:
p1 = Person('john', 'John', 28)
p1

Person(id=john, name=John, age=28)

In [89]:
persons = {p1: 'john object'}
persons[p1]

'john object'

In [90]:
# object comparison is based on id attribute which is 'john'
persons[Person('john', 'qwerty', 30)]

'john object'

The lookup is not affected by changes to attributes other than `_id`

In [91]:
p1.name = 'Eric'
p1.age = 70
p1

Person(id=john, name=Eric, age=70)

In [92]:
persons[Person('john', None, None)]

'john object'
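
For contrast, here is a sketch of what goes wrong when an attribute used by `__hash__` is mutated after the object has been stored as a key. `MutablePoint` is a hypothetical, simplified variant of the `Point` class from Example 23.

```python
class MutablePoint:
    """Hypothetical Point-like class whose hash depends on mutable attributes."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return (isinstance(other, MutablePoint)
                and (self.x, self.y) == (other.x, other.y))

    def __hash__(self):
        return hash((self.x, self.y))

pt = MutablePoint(0, 0)
points = {pt: 'origin'}

pt.x = 10    # the hash of the key changes after it was stored

print(pt in points)                    # False: the new hash does not match the stored one
print(MutablePoint(0, 0) in points)    # False: the hash matches, but equality now fails
print(pt in list(points))              # True: the key object is still in the dictionary
```

The entry is still present but can no longer be reached by lookup, which is why the `Person` class hashes only on `_id`.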