# Hash Tables

## When to use them

- Storing a set of strings/numbers
- When you want O(1) average-case performance for
  - Lookups
  - Insertions (amortized O(1); a single insertion can be O(n) if the hash table must be resized)
  - Deletions
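
To see why a single insertion can occasionally cost O(n), the sketch below watches a dict's memory footprint as it grows. It assumes CPython, where `sys.getsizeof` reflects the size of the underlying table. `observe_resizes` is a hypothetical helper written for this note, and the exact resize points differ between Python versions.

```python
import sys

def observe_resizes(n_items=1000):
    """Return the item counts at which the dict's reported size changed."""
    d = {}
    last_size = sys.getsizeof(d)
    resize_points = []
    for i in range(n_items):
        d[i] = i
        size = sys.getsizeof(d)
        if size != last_size:
            # The table was reallocated on this insertion (CPython assumption)
            resize_points.append(len(d))
            last_size = size
    return resize_points

resize_points = observe_resizes()
print(resize_points)
```

Only a handful of the 1000 insertions trigger a resize, which is why the cost averages out to O(1) per insertion.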

## Efficiency Tips
Looking up if a key is in a dict?

`if key in dict` is what you should use!

The `in` operation on a dict, or on the dict_keys object you get back from calling keys() on it (in 3.x), is not O(N); it's O(1) on average!

No need to use .get() in an if statement. It also treats keys whose values are falsy (0, '', None) as if they were missing.
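
A small sketch of how `in` and `.get()` can disagree inside an `if`. The `stock` dict and its contents are made up for illustration.

```python
# Hypothetical data: one key maps to a falsy value (0)
stock = {'apples': 0, 'pears': 4}

# Membership test answers "is the key present?"
apples_present = 'apples' in stock

# Truthiness of .get() answers "is the value truthy?", a different question
apples_truthy = bool(stock.get('apples'))

print(apples_present, apples_truthy)  # True False
```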

## Anagrams Example

Using `collections.defaultdict`, we can avoid KeyErrors by giving every new key a default value. In this case, an empty list.

In [11]:
anagrams = ['debitcard', 'elvis', 'silent', 'badcredit', 'lives', 'freedom', 'listen', 'levis', 'money']

import collections

def find_anagrams(anagram_list):
    sorted_string_to_anagrams = collections.defaultdict(list)
    for s in anagram_list:
        # sorted() returns a list of the string's characters in sorted order
        # Lists are not hashable, so join() turns it back into a string we can use as a dictionary key
        sorted_string_to_anagrams[''.join(sorted(s))].append(s)
    
    print(sorted_string_to_anagrams.keys())
    return [group for group in sorted_string_to_anagrams.values() if len(group) > 1]

print(find_anagrams(anagrams))


dict_keys(['abcddeirt', 'eilsv', 'eilnst', 'deefmor', 'emnoy'])
[['debitcard', 'badcredit'], ['elvis', 'lives', 'levis'], ['silent', 'listen']]
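
For comparison, here is a rough sketch of what `defaultdict(list)` saves us from, plus one side effect worth knowing about. The variable names are illustrative only.

```python
import collections

# A plain dict raises KeyError when we append to a key that is not there yet
plain = {}
try:
    plain['eilsv'].append('elvis')
    plain_raised = False
except KeyError:
    plain_raised = True

# defaultdict(list) creates the empty list on first access
groups = collections.defaultdict(list)
groups['eilsv'].append('elvis')
groups['eilsv'].append('lives')

# Side effect: merely reading a missing key inserts it with the default value
missing_before = 'emnoy' in groups
_ = groups['emnoy']
missing_after = 'emnoy' in groups

print(plain_raised, groups['eilsv'], missing_before, missing_after)
# True ['elvis', 'lives'] False True
```

Because reads can insert keys, use `in` rather than indexing when you only want to check whether a group exists.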


## Hash Table Libraries

- dict
- set (only keys, no values)
- collections.defaultdict
- collections.Counter
- collections.OrderedDict


### collections.Counter 

collections.Counter can be used to conveniently count the number of occurrences of each key. The example below shows how two counters combine.

In [12]:
import collections

c = collections.Counter(a=3, b=1)
d = collections.Counter(a=1, b=2)

# Adds 2 counters together
print(c + d)

# Subtract them, keeps only the positive counts
print(c - d)

# Intersection (min(c[x], d[x]))
print(c & d)

# Union (max(c[x], d[x]))
print(c | d)

Counter({'a': 4, 'b': 3})
Counter({'a': 2})
Counter({'a': 1, 'b': 1})
Counter({'a': 3, 'b': 2})
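
The cell above builds counters from keyword arguments. A Counter can also count the items of any iterable directly, as in this short sketch (the `words` list is made up for illustration).

```python
import collections

words = ['elvis', 'lives', 'elvis', 'money', 'elvis', 'money']

# Passing an iterable counts how often each item appears
counts = collections.Counter(words)

print(counts['elvis'])        # 3
print(counts['freedom'])      # 0, missing keys count as zero instead of raising KeyError
print(counts.most_common(2))  # [('elvis', 3), ('money', 2)]
```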


### Set

Sets have these important operations

- s.add(42)
- s.remove(42)
- s.discard(123)

In [22]:
s = set([1, 2, 3])
t = set([3, 4, 5])


s.add(42)
t.remove(4)

# discard() is the same as remove(), but doesn't raise an error if the value doesn't exist
s.discard(100)

print(s, t)

# Checks whether s is a subset of t
print(s <= t)

# Elements in s that aren't in t
print(s - t)



{1, 2, 3, 42} {3, 5}
False
{1, 2, 42}
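
A few more set operations in the same spirit, sketched with fresh sets so the values are easy to follow. It also shows the difference between `remove()` and `discard()` on a missing value.

```python
s = {1, 2, 3}
t = {3, 4, 5}

union = s | t           # elements in either set
intersection = s & t    # elements in both sets
symmetric_diff = s ^ t  # elements in exactly one of the sets

# remove() raises KeyError for a missing value, discard() does not
try:
    s.remove(100)
    remove_raised = False
except KeyError:
    remove_raised = True

s.discard(100)  # silently does nothing

print(union, intersection, symmetric_diff, remove_raised)
```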


### Hash Table functions

- items() iterates over key-value pairs
- keys() iterates over keys
- values() iterates over values

Note: Mutable types like lists cannot be used as keys because they are not hashable. This prevents a key from changing after it has been added, which would make later lookups fail.

In [31]:
somedict = {'a': 1, 'b': 2, 'c': 3}

print(somedict.items())

# Unpack each (key, value) pair while iterating
for key, value in somedict.items():
    print(key, value)

print(somedict.keys())
print(somedict.values())

dict_items([('a', 1), ('b', 2), ('c', 3)])
a 1
b 2
c 3
dict_keys(['a', 'b', 'c'])
dict_values([1, 2, 3])
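
Returning to the note about mutable keys: a quick sketch showing that a list is rejected as a key while an immutable tuple with the same contents works. The `points` dict is illustrative only.

```python
points = {}

# A tuple is immutable and hashable, so it is accepted as a key
points[(1, 2)] = 'tuple key'

# A list is mutable and unhashable, so using it as a key raises TypeError
try:
    points[[1, 2]] = 'list key'
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(points, list_key_allowed)  # {(1, 2): 'tuple key'} False
```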


### OrderedDict

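OrderedDict is useful when you want a dictionary that keeps items in the order in which you inserted them. Since Python 3.7 a regular dict also preserves insertion order, but OrderedDict adds order-aware methods such as `popitem(last=False)`.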

This acts like a queue, but with fast lookup and pop times.

pop(key) is especially useful if you're trying to remove a key in the middle of the data structure quickly. (LRU Cache problem)

Tip! Use `pop(key, None)` to get None back instead of a KeyError when the key doesn't exist.

popitem methods
- popitem() pops the last (most recently inserted) element
- popitem(last=False) pops the first (oldest) element


In [41]:
import collections

ordered_dict = collections.OrderedDict()

ordered_dict['b'] = 2
ordered_dict['a'] = 1
ordered_dict['d'] = 4
ordered_dict['c'] = 3

print(ordered_dict)

ordered_dict.pop('a')

# 'f' is not in the dict, so the None default is returned instead of raising KeyError
print(ordered_dict.pop('f', None))

# State of the dict after popping 'a'
print(ordered_dict)

# Pop the item at the front of the queue, i.e. the first (oldest) item in the dict
ordered_dict.popitem(last=False)

OrderedDict([('b', 2), ('a', 1), ('d', 4), ('c', 3)])
None
OrderedDict([('b', 2), ('d', 4), ('c', 3)])


('b', 2)
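
The LRU Cache problem mentioned above can be sketched with just `pop(key)` and `popitem(last=False)`. This is a minimal illustration under assumed requirements (fixed capacity, `get` returns None on a miss), not a reference solution. The `LRUCache` class and its method names are hypothetical.

```python
import collections

class LRUCache:
    """Least-recently-used cache: the oldest entry sits at the front."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = collections.OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        # Pop from the middle and reinsert at the end to mark as most recent
        value = self.items.pop(key)
        self.items[key] = value
        return value

    def put(self, key, value):
        # Remove any old entry so the key moves to the end on reinsert
        self.items.pop(key, None)
        self.items[key] = value
        if len(self.items) > self.capacity:
            # Evict the least recently used entry (front of the queue)
            self.items.popitem(last=False)

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
first_lookup = cache.get('a')   # 'a' becomes the most recently used
cache.put('c', 3)               # over capacity, so 'b' is evicted
evicted_lookup = cache.get('b')

print(first_lookup, evicted_lookup, list(cache.items))  # 1 None ['a', 'c']
```

OrderedDict also provides `move_to_end(key)`, which does the pop-and-reinsert step in a single call.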

In [45]:
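hmap = {'1': 1, '2': 2}

# .get() returns None for a missing key, so this branch runs;
# `if '3' not in hmap` is the clearer membership test
if not hmap.get('3'):
    print('yo')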

yo
