# Chapter 5: Common Data Structures in Python

## 5.1 Dictionaries, Maps and Hashtables

“In Python, dictionaries (or “dicts” for short) are a central data structure. Dicts store an arbitrary number of objects, each identified by a unique dictionary key.

Dictionaries are also often called maps, hashmaps, lookup tables, or associative arrays. They allow for the efficient lookup, insertion, and deletion of any object associated with a given key.”

Excerpt From: Dan Bader. “Python Tricks: The Book.” Apple Books. 

### `dict` - Your Go-To Dictionary

“Python’s dictionaries are indexed by keys that can be of any hashable type: A hashable object has a hash value which never changes during its lifetime (see `__hash__`), and it can be compared to other objects (see `__eq__`). In addition, hashable objects which compare as equal must have the same hash value.

Python dictionaries are based on a well-tested and finely tuned hash table implementation that provides the performance characteristics you’d expect: O(1) time complexity for lookup, insert, update, and delete operations in the average case.

Besides “plain” dict objects, Python’s standard library also includes a number of specialized dictionary implementations. These specialized dictionaries are all based on the built-in dictionary class (and share its performance characteristics), but add some convenience features on top of that.”

Excerpt From: Dan Bader. “Python Tricks: The Book.” Apple Books. 

In [1]:
phonebook = {
    'bob': 7387,
    'alice': 3719
}

In [3]:
phonebook['alice']

3719

In [2]:
squares = {x: x * x for x in range(6)}

In [4]:
squares

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
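The hashability rule quoted above can be checked directly. The following is a minimal sketch, not part of the book excerpt; the names `grid`, `numbers`, and `list_key_error` are made up for illustration.

```python
# Tuples of immutable values are hashable, so they can serve as dict keys.
grid = {(0, 0): 'origin', (1, 2): 'point'}
origin_label = grid[(0, 0)]

# Lists are mutable and therefore unhashable; using one as a key fails.
try:
    grid[[0, 0]] = 'oops'
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

# Objects that compare equal must hash equal, so 1 and 1.0 address
# the same entry: the second assignment overwrites the first value.
numbers = {1: 'int'}
numbers[1.0] = 'float'

print(origin_label)
print(type(list_key_error).__name__)
print(len(numbers), numbers[1])
```

The last part illustrates why "equal objects must have the same hash value" matters: the dict treats `1` and `1.0` as one key.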

### `collections.OrderedDict` - Remember the Insertion Order of Keys

“While standard dict instances preserve the insertion order of keys in CPython 3.6 and above, this is just a side effect of the CPython implementation and is not defined in the language spec. So, if key order is important for your algorithm to work, it’s best to communicate this clearly by explicitly using the OrderedDict class.

By the way, OrderedDict is not a built-in part of the core language and must be imported from the collections module in the standard library.”

Excerpt From: Dan Bader. “Python Tricks: The Book.” Apple Books. 

In [10]:
import collections

d = collections.OrderedDict(one=1, two=2, three=3)
d

OrderedDict([('one', 1), ('two', 2), ('three', 3)])

In [11]:
d['four'] = 4
d

OrderedDict([('one', 1), ('two', 2), ('three', 3), ('four', 4)])
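From Python 3.7 on, the language itself guarantees that plain dicts keep insertion order, so the remaining reasons to reach for `OrderedDict` are mostly behavioral. The sketch below shows two such differences, order-sensitive equality and `move_to_end()`; the variable names are illustrative only.

```python
from collections import OrderedDict

a = OrderedDict(one=1, two=2)
b = OrderedDict(two=2, one=1)

# Equality between two OrderedDicts takes key order into account...
ordered_equal = (a == b)
# ...while plain dicts with the same items compare equal in any order.
plain_equal = (dict(a) == dict(b))

# move_to_end() repositions an existing key without deleting it first.
a.move_to_end('one')
keys_after_move = list(a)

print(ordered_equal, plain_equal, keys_after_move)
```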

### `collections.defaultdict` - Return Default Values for Missing Keys

In [12]:
from collections import defaultdict

dd = defaultdict(list)
dd['dogs'].append('Rufus')
dd['dogs'].append('Kathrin')
dd['dogs']

['Rufus', 'Kathrin']
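A common use of `defaultdict` is grouping items without checking whether a key already exists. The sketch below uses made-up sample data (`pets`, including the cat `Felix`) and also shows one behavior worth knowing: simply reading a missing key inserts it, whereas `.get()` does not.

```python
from collections import defaultdict

# Hypothetical sample data for illustration.
pets = [('dogs', 'Rufus'), ('cats', 'Felix'), ('dogs', 'Kathrin')]

by_kind = defaultdict(list)
for kind, name in pets:
    # A missing key is created on first access with list() as its value.
    by_kind[kind].append(name)

# Indexing a missing key calls the default factory and stores the result.
missing = by_kind['birds']
has_birds = 'birds' in by_kind

# .get() bypasses the default factory and leaves the dict unchanged.
fish = by_kind.get('fish')
has_fish = 'fish' in by_kind

print(dict(by_kind), has_birds, fish, has_fish)
```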

### `collections.ChainMap` - Searching Multiple Dictionaries as a Single Mapping

“The collections.ChainMap data structure groups multiple dictionaries into a single mapping. Lookups search the underlying mappings one by one until a key is found. Insertions, updates, and deletions only affect the first mapping added to the chain.”

Excerpt From: Dan Bader. “Python Tricks: The Book.” Apple Books. 

In [13]:
from collections import ChainMap

dict1 = {'one': 1, 'two': 2}
dict2 = {'three': 3, 'four': 4}
chain = ChainMap(dict1, dict2)

chain

ChainMap({'one': 1, 'two': 2}, {'three': 3, 'four': 4})

In [14]:
chain['three']

3
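The excerpt notes that writes only affect the first mapping in the chain. A minimal sketch of that behavior, using hypothetical `defaults` and `overrides` dictionaries:

```python
from collections import ChainMap

defaults = {'color': 'blue', 'size': 'M'}
overrides = {'color': 'red'}
settings = ChainMap(overrides, defaults)

# Lookups search left to right, so overrides wins for 'color'
# and defaults supplies 'size'.
first_hit = settings['color']
fallback = settings['size']

# Assignment goes to the first mapping only; defaults stays untouched.
settings['size'] = 'L'

print(first_hit, fallback)
print(overrides, defaults)
```

This layering makes `ChainMap` a reasonable fit for configuration lookups where user settings shadow defaults without modifying them.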

### `types.MappingProxyType` - A Wrapper for Making Read-Only Dictionaries

In [15]:
from types import MappingProxyType

writable = {'one': 1, 'two': 2}
read_only = MappingProxyType(writable)

read_only['one']

1

In [16]:
writable['one'] = 42
read_only

mappingproxy({'one': 42, 'two': 2})
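The cells above show that the proxy reflects changes made to the underlying dict. The read-only side can be sketched as follows: assigning through the proxy raises a `TypeError`. The name `write_error` is illustrative.

```python
from types import MappingProxyType

writable = {'one': 1, 'two': 2}
read_only = MappingProxyType(writable)

# The proxy exposes no item assignment, so this raises TypeError.
try:
    read_only['one'] = 23
    write_error = None
except TypeError as exc:
    write_error = exc

print(type(write_error).__name__)
print(writable)
```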

### Key Takeaways

- Dictionaries are the central data structure in Python.
- The built-in dict type will be “good enough” most of the time.
- Specialized implementations, like read-only or ordered dicts, are available in the Python standard library.

## 5.2 Array Data Structures

### `list` - Mutable Dynamic Arrays

In [28]:
arr = ['one', 'two', 'three']
arr[0]

'one'

In [29]:
arr

['one', 'two', 'three']

In [30]:
arr[1] = 'hello'
arr

['one', 'hello', 'three']

In [31]:
del arr[1]

In [32]:
arr.append(23)
arr

['one', 'three', 23]

### `tuple` - Immutable Containers

In [44]:
arr = 'one', 'two', 'three'

In [45]:
arr

('one', 'two', 'three')

In [46]:
arr[0]

'one'

In [47]:
del arr[1]

TypeError: 'tuple' object doesn't support item deletion

In [48]:
arr + (23, )

('one', 'two', 'three', 23)

### `array.array` - Basic Typed Arrays

In [50]:
import array

arr = array.array('f', (1.0, 1.5, 2.0, 2.5))
arr[1]

1.5

In [51]:
arr

array('f', [1.0, 1.5, 2.0, 2.5])

In [52]:
arr[1] = 23.0
arr

array('f', [1.0, 23.0, 2.0, 2.5])

In [53]:
del arr[1]
arr

array('f', [1.0, 2.0, 2.5])

In [54]:
arr.append(42.0)
arr

array('f', [1.0, 2.0, 2.5, 42.0])

In [55]:
arr[1] = 'hello'

TypeError: must be real number, not str
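Because every element of an `array.array` shares one C type, the data is stored as a packed block of fixed-size items. The sketch below inspects that layout through `itemsize` and `tobytes()`; the exact byte sizes depend on the platform's C types, so none are assumed here.

```python
import array

floats = array.array('f', (1.0, 1.5, 2.0, 2.5))

# itemsize is the number of bytes used per element for this typecode.
bytes_per_item = floats.itemsize

# tobytes() returns the raw packed buffer: one fixed-size slot per element.
packed = floats.tobytes()

# Typecode 'd' stores C doubles, which are at least as wide as C floats.
doubles = array.array('d', floats.tolist())

# tolist() converts back to an ordinary list of Python floats.
as_list = floats.tolist()

print(bytes_per_item, len(packed), doubles.itemsize, as_list)
```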

### `str` - Immutable Arrays of Unicode Characters

In [56]:
arr = 'abcd'
arr[1]

'b'

In [57]:
arr[1] = 'e'

TypeError: 'str' object does not support item assignment

In [58]:
del arr[1]

TypeError: 'str' object doesn't support item deletion

In [59]:
list('abcd')

['a', 'b', 'c', 'd']

In [60]:
''.join(list('abcd'))

'abcd'

In [61]:
type('acc')

str

In [62]:
type('abc'[0])

str

### `bytes` - Immutable Arrays of Single Bytes

In [63]:
arr = bytes((0, 1, 2, 3))
arr[1]

1

In [64]:
arr

b'\x00\x01\x02\x03'

In [65]:
bytes((0, 300))

ValueError: bytes must be in range(0, 256)

In [66]:
arr[1] = 23

TypeError: 'bytes' object does not support item assignment

In [67]:
del arr[1]

TypeError: 'bytes' object doesn't support item deletion

### `bytearray` - Mutable Arrays of Single Bytes

In [68]:
arr = bytearray((0, 1, 2, 3))
arr[1]

1

In [69]:
arr

bytearray(b'\x00\x01\x02\x03')

In [70]:
arr[1] = 23
arr

bytearray(b'\x00\x17\x02\x03')

In [71]:
del arr[1]
arr

bytearray(b'\x00\x02\x03')

In [72]:
arr.append(42)
arr

bytearray(b'\x00\x02\x03*')

In [73]:
arr[1] = 'hello'

TypeError: an integer is required

In [74]:
arr[1] = 300

ValueError: byte must be in range(0, 256)

In [75]:
bytes(arr)

b'\x00\x02\x03*'

In [76]:
arr = bytes((0, 1, 2, 3))
bytearray(arr)

bytearray(b'\x00\x01\x02\x03')

### Key Takeaways

- You need to store arbitrary objects, potentially with mixed data types? Use a list or a tuple, depending on whether you want an immutable data structure or not.
- You have numeric (integer or floating point) data, and tight packing and performance are important? Try out array.array and see if it does everything you need. Also, consider going beyond the standard library and try out packages like NumPy or Pandas.
- You have textual data represented as Unicode characters? Use Python’s built-in str. If you need a “mutable string,” use a list of characters.
- You want to store a contiguous block of bytes? Use the immutable bytes type, or bytearray if you need a mutable data structure.
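The "mutable string" suggestion in the takeaways can be sketched as a round trip: convert to a list of characters, edit in place, then join back into a `str`. The variable names are illustrative.

```python
# Strings are immutable, so edit a list of characters instead.
chars = list('abcd')

chars[1] = 'e'      # replace 'b' with 'e'
del chars[2]        # remove 'c'
chars.append('!')   # add a character at the end

# Join the characters back into an ordinary string.
result = ''.join(chars)

print(result)
```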