## Hashing

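- Hashing maps an object to an integer, which lets containers such as sets and dictionaries retrieve items quickly.
- If an object is hashable, you can call the built-in function `hash` on it to get its hash.
- In CPython, small integers hash to themselves; very large ones are reduced modulo `sys.hash_info.modulus` (`2**61 - 1` on 64-bit builds).
- Mutable built-in containers (lists, dicts, sets) are __unhashable__; calling `hash` on one raises a `TypeError`.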


In [14]:
x = ('a', 'b')
print(hash(x))
y = 1_000_000_000_000
print(hash(y))

813276519451644522
1000000000000
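
The points above can be checked with a short sketch. It assumes CPython, where `sys.hash_info.modulus` reports the cutoff for integer hashes; the variable names are only illustrative.

```python
import sys

# A small integer hashes to itself.
small = 1_000_000_000_000
small_hash = hash(small)

# CPython reduces integer hashes modulo sys.hash_info.modulus
# (2**61 - 1 on typical 64-bit builds), so the modulus itself hashes to 0.
modulus = sys.hash_info.modulus
wrapped_hash = hash(modulus)

# Mutable built-in containers such as lists are unhashable.
try:
    hash([1, 2, 3])
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# A tuple is hashable only if every element inside it is hashable.
try:
    hash(('a', ['b']))
    tuple_error = None
except TypeError as exc:
    tuple_error = str(exc)

print(small_hash, wrapped_hash)
print(list_error)
print(tuple_error)
```

Note that string hashes (and hashes of tuples containing strings) are randomized per interpreter run by default, so a value like the tuple hash shown above will differ between sessions.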


## Sets

- Unordered container with no duplicates.
- Backed by a hash table: a block of memory divided into slots.
- When an item is added to a set, its hash determines which slot in that block the object is stored in, so set items must be hashable.

In [27]:
x = set()
x.add('cat')
print(x)

{'cat'}
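
To make the hash-to-slot idea concrete, here is a toy model. It is a simplification: `table_size` and `slot_for` are hypothetical names, and CPython's real set implementation uses a more elaborate scheme for choosing slots and resolving collisions. Integers are used because their hashes are predictable.

```python
# Toy model only: not how CPython actually lays out a set.
table_size = 8  # hypothetical number of slots in the block of memory


def slot_for(item, size=table_size):
    """Map an item to a slot index using its hash."""
    return hash(item) % size


# 3 and 11 land in the same slot (a collision); 20 lands in slot 4.
slots = {n: slot_for(n) for n in (3, 11, 20)}
print(slots)

# Adding an item that is already present leaves the set unchanged.
pets = set()
pets.add('cat')
pets.add('cat')
pets.add('dog')
print(len(pets))
print('cat' in pets)  # membership test goes through the hash, not a scan
```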


## Set unions and intersections

- You can find the union and the intersection of two sets.
- The union contains the values that are in at least one of the sets.
- The intersection contains the values that are in both sets.
- `symmetric_difference` finds the values that are in exactly one of the two sets.

In [6]:
x = {1, 2, 3, 5}
y = {3, 4, 6, 7, 8}
print(x.union(y)) # equivalent to `x | y`
print(x.intersection(y)) # 3 is in both, `x & y`
print(x.symmetric_difference(y)) # union - intersection
print(x.union(y) - x.intersection(y)) # same as above, use x ^ y for XOR

{1, 2, 3, 4, 5, 6, 7, 8}
{3}
{1, 2, 4, 5, 6, 7, 8}
{1, 2, 4, 5, 6, 7, 8}


## Dictionaries

- Containers of key-value pairs.
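- Preserve insertion order (a CPython implementation detail in 3.6, guaranteed by the language since Python 3.7).
- Constructed with `{key: value}`.
- Keys must be hashable (in practice, usually immutable objects); values can be any object.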

In [8]:
x = {}
x['hello'] = 'world'
x['never'] = 'gonna'

print(x)
print(x.get('never'), x['never']) # two ways to get dict items

{'hello': 'world', 'never': 'gonna'}
gonna gonna
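
The two lookup styles differ when the key is missing, and the key rule can be seen directly. A minimal sketch; the key `'give'` and the `grid` dictionary are made up for illustration:

```python
x = {'hello': 'world', 'never': 'gonna'}

# .get returns None (or a supplied default) when the key is missing.
missing = x.get('give')
fallback = x.get('give', 'you up')

# Square-bracket lookup raises KeyError instead.
try:
    x['give']
    key_error = False
except KeyError:
    key_error = True

# A tuple is hashable, so it can be a key; a list is not.
grid = {(0, 0): 'origin'}
try:
    grid[[0, 1]] = 'bad key'
    list_key_error = False
except TypeError:
    list_key_error = True

print(missing, fallback, key_error, list_key_error)
```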


In [10]:
# morrison moment
# sample.txt: words separated by spaces & new lines

input_file_name = 'sample.txt'
output_file_name = input_file_name + '.dump'
items = {}
with open(input_file_name, 'r') as fp:
    for line in fp:
        words = line.split()
        for w in words:
            items[w] = items.get(w, 0) + 1
            
with open(output_file_name, 'w') as fp:
    for k, v in items.items():
        fp.write(f'{k}: {v}\n')
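
As an alternative to the `items.get(w, 0) + 1` pattern, the standard library's `collections.Counter` does the same counting. This is a swapped-in technique, not the approach used in the cell above, and `sample_text` is an in-memory stand-in for the contents of `sample.txt` so the sketch runs without a file.

```python
from collections import Counter

# Hypothetical stand-in for the contents of sample.txt.
sample_text = "the cat sat\non the mat\nthe end\n"

counts = Counter()
for line in sample_text.splitlines():
    # update() adds one to the count of each word in the iterable.
    counts.update(line.split())

# Same "word: count" format as the dump file, in first-seen order.
lines = [f'{word}: {count}' for word, count in counts.items()]
print('\n'.join(lines))
```

A `Counter` is a dict subclass, so `counts.items()` behaves as before, and a missing word reads as `0` instead of raising `KeyError`.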