# Chapter 3. Dictionaries and Sets
---

## ToC


1. [Variations of dict](#variations-of-dict)    
    1.1. [`collections.OrderedDict`](#collectionsordereddict)  
    1.2. [`collections.ChainMap`](#collectionschainmap)  
    1.3. [`collections.Counter`](#collectionscounter)  
    1.4. [`shelve.Shelf`](#shelveshelf)  
    1.5. [Subclassing `UserDict` Instead of `dict`](#subclassing-userdict-instead-of-dict)
2. [Immutable Mappings](#immutable-mappings)


---

## Variations of dict

This section is an overview of the mapping types included in the standard library, besides `defaultdict`.

### `collections.OrderedDict`

The built-in `dict` has preserved key insertion order since Python 3.6 (as a CPython
implementation detail; it became a language guarantee in Python 3.7), so the most
common reason to use `OrderedDict` is writing code that is backward compatible with
earlier Python versions.
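
Backward compatibility is not the only remaining difference. As a minimal sketch (the keys and values are made up for illustration), the following shows two behaviors that `OrderedDict` has and the built-in `dict` does not: order-sensitive equality and the `move_to_end` method.

```python
from collections import OrderedDict

# Two mappings with the same items inserted in a different order.
plain_a = {'x': 1, 'y': 2}
plain_b = {'y': 2, 'x': 1}
ordered_a = OrderedDict(x=1, y=2)
ordered_b = OrderedDict(y=2, x=1)

# Plain dicts ignore order when compared; two OrderedDicts do not.
plain_equal = plain_a == plain_b          # True
ordered_equal = ordered_a == ordered_b    # False

# move_to_end repositions an existing key; dict has no equivalent method.
ordered_a.move_to_end('x')
keys_after_move = list(ordered_a)         # ['y', 'x']
```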

### `collections.ChainMap`

A ChainMap instance holds a list of mappings that can be searched as one. The lookup
is performed on each input mapping in the order it appears in the constructor call,
and succeeds as soon as the key is found in one of those mappings. For example:

In [15]:
d1 = dict(a=1, b=3)
d2 = dict(a=2, b=4, c=6)
from collections import ChainMap
merged_dict = d1 | d2
chain = ChainMap(d1, d2)
chain

ChainMap({'a': 1, 'b': 3}, {'a': 2, 'b': 4, 'c': 6})

In [16]:
merged_dict

{'a': 2, 'b': 4, 'c': 6}

In [17]:
chain['a']

1

In [18]:
chain['c']

6

The ChainMap instance does not copy the input mappings, but holds references to them. Updates or insertions to a `ChainMap` only affect the first input mapping:

In [19]:
chain['c'] = -1
chain

ChainMap({'a': 1, 'b': 3, 'c': -1}, {'a': 2, 'b': 4, 'c': 6})

In [20]:
d1

{'a': 1, 'b': 3, 'c': -1}

In [21]:
d2

{'a': 2, 'b': 4, 'c': 6}

**Note:** ChainMap is useful to implement interpreters for languages with nested scopes, where each mapping represents a scope context, from the innermost enclosing scope to the outermost scope.

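The documentation for [collections.ChainMap](https://docs.python.org/3/library/collections.html#collections.ChainMap) has several examples. One of them mimics the basic rules of variable lookup in Python:
```python
import builtins
from collections import ChainMap

# Search local names first, then module globals, then built-ins.
pylookup = ChainMap(locals(), globals(), vars(builtins))
```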
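
The nested-scope idea from the note above can be sketched with `ChainMap.new_child` and `ChainMap.parents`. This is a toy illustration, not a real interpreter; the scope names and variables are hypothetical.

```python
from collections import ChainMap

# Hypothetical outermost scope of a toy interpreter.
global_scope = ChainMap({'x': 1, 'y': 2})

# new_child() returns a new ChainMap with an extra mapping in front,
# which plays the role of the innermost scope.
local_scope = global_scope.new_child({'x': 10})

inner_x = local_scope['x']   # found in the innermost scope
inner_y = local_scope['y']   # falls through to the outer scope

# Assignments land in the innermost mapping only.
local_scope['y'] = 20
outer_y = global_scope['y']  # the outer scope is untouched

# .parents skips the innermost mapping, like leaving a scope.
restored = local_scope.parents
```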

### `collections.Counter`

`Counter` is a mapping that holds an integer count for each key. Updating an existing key
adds to its count. It can be used to count instances of hashable objects or as a multiset.

In [28]:
import collections
ct = collections.Counter('abracadabra')
ct

Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})

In [29]:
ct.update('aaaaazzz')
ct

Counter({'a': 10, 'z': 3, 'b': 2, 'r': 2, 'c': 1, 'd': 1})

In [30]:
ct.most_common(3)

[('a', 10), ('z', 3), ('b', 2)]
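
The multiset use mentioned above relies on the arithmetic operators that `Counter` supports. A short sketch, with made-up basket contents:

```python
from collections import Counter

basket_a = Counter(apple=3, pear=1)
basket_b = Counter(apple=1, pear=2, plum=1)

combined = basket_a + basket_b      # add counts key by key
difference = basket_a - basket_b    # subtract, dropping counts <= 0
intersection = basket_a & basket_b  # minimum of each count
union = basket_a | basket_b         # maximum of each count
```

Here `difference` keeps only `apple`, because the `pear` and `plum` counts would be negative and are discarded.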

### `shelve.Shelf`

The `shelve` module provides persistent storage for a mapping of string keys to Python objects serialized in the `pickle` binary format. The curious
name of `shelve` makes sense when you realize that pickle jars are stored on shelves.

The `shelve.open` module-level function returns a `shelve.Shelf` instance: a simple
key-value DBM database backed by the `dbm` module, with these characteristics:
- `shelve.Shelf` subclasses `collections.abc.MutableMapping`, so it provides the essential
methods we expect of a mapping type.
- In addition, `shelve.Shelf` provides a few other I/O management methods, like
`sync` and `close`.
- A `Shelf` instance is a context manager, so you can use a `with` block to make sure
it is closed after use.
- Keys and values are saved whenever a new value is assigned to a key.
- The keys must be strings.
- The values must be objects that the `pickle` module can serialize.
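
The characteristics above can be sketched as follows. The file name `demo_shelf` and the stored values are assumptions for illustration, and the shelf is created in a temporary directory so nothing is left behind.

```python
import os
import shelve
import tempfile

# Use a throwaway directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, 'demo_shelf')  # hypothetical file name

    # The with block closes the shelf on exit.
    with shelve.open(db_path) as shelf:
        shelf['config'] = {'debug': True, 'retries': 3}  # any picklable value
        shelf['primes'] = [2, 3, 5, 7]

    # Reopen the shelf to confirm the values were persisted.
    with shelve.open(db_path) as shelf:
        stored_keys = sorted(shelf)
        stored_config = shelf['config']
```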

![Figure 40](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/40.PNG)

`OrderedDict`, `ChainMap`, `Counter`, and `Shelf` are ready to use but can also be customized
by subclassing. In contrast, `UserDict` is intended only as a base class to be
extended.

### Subclassing `UserDict` Instead of `dict`

It’s better to create a new mapping type by extending `collections.UserDict` rather than `dict`, because the built-in has some implementation shortcuts that end up forcing us to override methods that we can just inherit from `UserDict` with no problems.
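
One such shortcut can be seen in a small experiment. The classes `UpperDict` and `UpperUserDict` below are hypothetical, and the `dict` behavior shown is what CPython does; other implementations may differ.

```python
import collections

class UpperDict(dict):
    """Hypothetical dict subclass that tries to uppercase every value."""
    def __setitem__(self, key, value):
        super().__setitem__(key, value.upper())

class UpperUserDict(collections.UserDict):
    """The same idea, built on UserDict."""
    def __setitem__(self, key, value):
        super().__setitem__(key, value.upper())

from_dict = UpperDict(a='x')      # dict.__init__ bypasses our __setitem__
from_dict.update(b='y')           # so does dict.update
from_dict['c'] = 'z'              # only direct assignment calls it

from_user = UpperUserDict(a='x')  # UserDict.__init__ goes through __setitem__
from_user.update(b='y')           # and so does the inherited update
from_user['c'] = 'z'
```

With `dict` as the base, making the override apply everywhere would also require overriding `__init__` and `update`; with `UserDict`, the single `__setitem__` is enough.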

Note that `UserDict` does not inherit from `dict`, but uses composition: it has an internal
`dict` instance, called `data`, which holds the actual items. This avoids undesired
recursion when coding special methods like `__setitem__`, and simplifies the coding
of `__contains__`.

**Example:** The file `strkeydict0.py` subclasses `dict`, but the file `strkeydict.py` subclasses `UserDict`.
The differences in the latter implementation:
- `__contains__` is simpler: we can assume all stored keys are `str`, and we can check on `self.data` instead of invoking `self.keys()` as we did in `StrKeyDict0`.
- `__setitem__` converts any key to a `str`. This method is easier to override when we can delegate to the `self.data` attribute.
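
The file `strkeydict.py` is not reproduced in these notes. The following is a minimal sketch of what such a class could look like, assuming a `__missing__` method handles non-string keys on lookup; the actual file may differ.

```python
import collections

class StrKeyDict(collections.UserDict):
    """Sketch of a mapping that stores every key as a str."""

    def __missing__(self, key):
        # A missing str key is a real miss; otherwise retry with str(key).
        if isinstance(key, str):
            raise KeyError(key)
        return self[str(key)]

    def __contains__(self, key):
        # All stored keys are str, so one check on self.data is enough.
        return str(key) in self.data

    def __setitem__(self, key, item):
        # Delegate to the internal dict; no recursion to worry about.
        self.data[str(key)] = item

d = StrKeyDict([('2', 'two'), ('4', 'four')])
```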

Tests for item retrieval using `d[key]` notation:

In [23]:
from strkeydict import StrKeyDict
d = StrKeyDict([('2', 'two'), ('4', 'four')])
d

{'2': 'two', '4': 'four'}

In [24]:
d['2']

'two'

In [25]:
d[4]

'four'

In [26]:
d[1]

KeyError: '1'

In [27]:
d['one']

KeyError: 'one'

Tests for item retrieval using `d.get(key)` notation:

In [28]:
d.get('2')

'two'

In [29]:
d.get(4)

'four'

In [30]:
d.get(1, 'N/A')

'N/A'

In [31]:
d.get(1, 'Dummy Key')

'Dummy Key'

Tests for the `in` operator:

In [32]:
2 in d

True

In [33]:
1 in d

False

Tests for item assignment using `d[key] = value` notation:

In [34]:
d[4] = 'Four'
d

{'2': 'two', '4': 'Four'}

In [35]:
d[4]

'Four'

In [36]:
d['4'] = 'FOUR'
d

{'2': 'two', '4': 'FOUR'}

In [37]:
d[2]

'two'

## Immutable Mappings

The mapping types provided by the standard library are all mutable, but you may
need to prevent users from changing a mapping by accident. A concrete use case can
be found, again, in a hardware programming library like *Pingo*:  

**Example:** The `board.pins` mapping represents the physical
GPIO pins on the device. As such, it’s useful to prevent inadvertent updates to
`board.pins` because the hardware can’t be changed via software, so any change in the
mapping would make it inconsistent with the physical reality of the device.

In [9]:
from types import MappingProxyType
d = {1: 'A'}
d_proxy = MappingProxyType(d)
d_proxy

mappingproxy({1: 'A'})

In [10]:
# Items in d can be seen through d_proxy.
d_proxy[1]

'A'

In [11]:
# Changes cannot be made through d_proxy.
d_proxy[2] = 'x'

TypeError: 'mappingproxy' object does not support item assignment

In [12]:
d[2] = 'B'
# d_proxy is dynamic: any change in d is reflected.
d_proxy

mappingproxy({1: 'A', 2: 'B'})

In [13]:
d_proxy[2]

'B'

Here is how this could be used in practice in the hardware programming scenario:
the constructor in a concrete `Board` subclass would fill a private mapping with the pin
objects, and expose it to clients of the API via a public `.pins` attribute implemented
as a `mappingproxy`. That way the clients would not be able to add, remove, or change
pins by accident.
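
A minimal sketch of that design follows. The `Board` class, its constructor argument, and the string stand-ins for pin objects are all hypothetical; they illustrate the pattern and are not Pingo's actual API.

```python
from types import MappingProxyType

class Board:
    """Hypothetical board class exposing a read-only view of its pins."""

    def __init__(self, pin_numbers):
        # Private, mutable mapping filled once by the constructor.
        # Strings stand in for real pin objects.
        self._pins = {number: f'Pin<{number}>' for number in pin_numbers}

    @property
    def pins(self):
        # Clients get a read-only proxy, never the underlying dict.
        return MappingProxyType(self._pins)

board = Board([1, 2, 3])
first_pin = board.pins[1]

# Attempting to add a pin through the public attribute fails.
try:
    board.pins[4] = 'Pin<4>'
    write_allowed = True
except TypeError:
    write_allowed = False
```

Using a property instead of a plain attribute also prevents clients from rebinding `board.pins` to a different mapping.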