# Understanding Python dictionary resizing

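An exercise in understanding how the hash table behind Python dictionaries is resized. CPython dictionaries
use an open addressing scheme under the hood; its probe sequence is perturbed rather than strictly linear,
but linear probing is a useful mental model for this exercise.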
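
To make the idea concrete, here is a minimal sketch of an open-addressing table with plain linear probing that doubles its capacity once a load-factor threshold is exceeded. `ToyTable`, its method names, and the `max_load=0.5` threshold are all hypothetical choices for illustration; this is not how CPython implements `dict`, and the threshold is deliberately not the one this exercise asks you to find.

```python
class ToyTable:
    """Toy open-addressing hash table with linear probing (illustrative only)."""

    def __init__(self, capacity=8, max_load=0.5):
        self.capacity = capacity
        self.max_load = max_load  # hypothetical threshold, NOT CPython's value
        self.count = 0
        self.slots = [None] * capacity

    def _find_slot(self, key):
        # Start at the hashed index and step forward one slot at a time
        # until we find the key or an empty slot.
        i = hash(key) % self.capacity
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.capacity
        return i

    def _grow(self):
        # Double the capacity and re-insert every stored pair,
        # because each key's slot depends on the table size.
        old = [s for s in self.slots if s is not None]
        self.capacity *= 2
        self.slots = [None] * self.capacity
        for k, v in old:
            self.slots[self._find_slot(k)] = (k, v)

    def put(self, key, value):
        i = self._find_slot(key)
        if self.slots[i] is None:
            self.count += 1
        self.slots[i] = (key, value)
        # Resize once the load factor exceeds the threshold.
        if self.count / self.capacity > self.max_load:
            self._grow()

    def get(self, key):
        slot = self.slots[self._find_slot(key)]
        if slot is None:
            raise KeyError(key)
        return slot[1]
```

Inserting keys one at a time and watching `capacity` change mirrors what the rest of this exercise does with a real `dict`, where the table size is not directly visible and has to be inferred from `sys.getsizeof()`.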

First create a new empty `dict` and use the `sys.getsizeof()` function to assess its size in bytes.

In [1]:
import sys

d = dict()
sys.getsizeof(d)

64

In [None]:
d = dict()
# By default Python dictionaries start with 8 buckets
length = 8

# Here we are going to run a loop and add a new key/value pair at each iteration
# We measure the size of the dict before and after we add a new item. If
# the size changes then we know python has resized the dictionary after the insertion.
# For each iteration we print out information about the number of elements, current load
# factor and the size in bytes.
for i in range(1, 51):
    pre_size = sys.getsizeof(d)
    d[i] = 1
    post_size = sys.getsizeof(d)
    # Ignore the case where pre_size == 64: an empty dict reports 64 bytes here and
    # grows to 224 on the first insertion, because CPython only allocates the 8-slot
    # table at that point (exact byte counts depend on the Python version and build).
    if pre_size != post_size and pre_size != 64:
        length = length * 2
    print(f"Elements {i}\tload factor {i/length:.4f}", end="\t")
    print(f"size (bytes) {post_size}")

# Determine the element insertion number that will trigger the next resizing

Run the code above. It prints one line per insertion, in this format:
```
Elements 1	load factor 0.1250	size (bytes) 224
```
You will see the size of the hash table change several times in this output. Inspect
these results, paying particular attention to the number of elements in the hash table and the
corresponding load factor before and after each resize. Your task is to work out which insertion
will trigger the next resize of the hash table, and to hypothesize about the load factor beyond
which Python initiates a resize.
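
When the output gets long, it can help to print only the insertions at which the reported size changed. The sketch below does that; `resize_events` is a hypothetical helper name, and the byte counts it reports vary between Python versions and builds, so no expected output is shown. Note that on some versions the first reported change is the initial table allocation rather than a true resize.

```python
import sys


def resize_events(n_items):
    """Return (element_count, size_before, size_after) for every
    insertion after which sys.getsizeof() reported a different size."""
    d = {}
    events = []
    for i in range(1, n_items + 1):
        pre_size = sys.getsizeof(d)
        d[i] = 1
        post_size = sys.getsizeof(d)
        if post_size != pre_size:
            events.append((i, pre_size, post_size))
    return events


for count, before, after in resize_events(200):
    print(f"insertion {count}: {before} -> {after} bytes")
```

Comparing the element counts in consecutive events against the doubling table sizes is one way to test your hypothesis about the load factor.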