# Hash

We can use hashing to implement the map and set interfaces. In Python, the built-in `dict` is a hash map and the built-in `set` is a hash set.

> When a question uses the words "unique", "count", or "frequency", it is almost certain that a hashmap or hashset will be useful for solving the problem.
>
> Recall that the difference between a set and a map is that sets contain keys only, while maps contain key-value pairs.


## Motivation

| Operation | TreeMap | HashMap | Array | Notes |
|:----------|:--------|:--------|:------|:------|
| Insert | O(log n) | O(1) | O(n) | For HashMap, O(1) is the average case. The worst case is O(n), when many keys collide in the same bucket, but in practice we treat it as O(1). |
| Remove | O(log n) | O(1) | O(n) | Again, O(1) is the average case |
| Search | O(log n) | O(1) | O(log n), if sorted | Again, O(1) is the average case |
| Inorder Traversal | O(n) | - | - | |
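
To see why the O(1) bounds for HashMap are only average-case, the sketch below forces every key to collide. `CollidingKey` is a hypothetical class made up for this illustration, and its constant `__hash__` is deliberately bad. The map still returns correct answers, but each lookup has to compare against colliding keys one at a time, which is where the O(n) worst case comes from.

```python
class CollidingKey:
    """Hypothetical key type whose hash is always the same (worst case)."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 1  # every instance collides with every other instance

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


table = {CollidingKey(i): i for i in range(100)}

# Lookups are still correct, just slower: Python has to probe past the
# colliding entries and call __eq__ until it finds the matching key.
found = table[CollidingKey(42)]
missing = CollidingKey(1000) in table
```

With a well-distributed hash (such as the built-in one for `int` and `str`), collisions are rare and the average cost stays close to constant.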

## Operations

HashMaps are not ordered (the keys don't have any sorted order) and don't allow duplicate keys.
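
A small sketch of the "no duplicate keys" rule, using a Python `dict` and `set`. The names and phone numbers are made up for illustration.

```python
phonebook = {}
phonebook["alice"] = "555-0100"
phonebook["alice"] = "555-0199"  # same key again: overwrites, no second entry

seen = set()
seen.add("alice")
seen.add("alice")  # adding an existing key to a set is a no-op

print(phonebook)  # one entry for "alice", holding the latest value
print(seen)       # one element
```

Writing to an existing key replaces its value, so a map can never hold two entries for the same key.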

The downside of hash maps is that they are not ordered, so it is not possible to traverse the keys of a hashmap in sorted order directly. To iterate through all the keys in order, you would first need to sort them, which runs in $O(n \log n)$ time, making it slower than an in-order traversal of a tree map, which is $O(n)$.
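
As a minimal sketch of that cost in Python: the keys have to be passed through `sorted` before an ordered traversal is possible. The `count_map` contents here are made up for illustration.

```python
count_map = {"dylan": 1, "brad": 2, "alice": 1}

# sorted() copies the keys into a list and sorts it: O(n log n)
sorted_keys = sorted(count_map)

for name in sorted_keys:
    print(name, count_map[name])
```

Note that Python dicts (3.7+) do remember insertion order, but that is not the same as sorted order: iterating `count_map` directly would yield `dylan`, `brad`, `alice`.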

Example use case (counting how many times each name appears):

```python
names = ["alice", "brad", "collin", "brad", "dylan", "kim"]

countMap = {}
for name in names:
    # If countMap does not contain name
    if name not in countMap:
        countMap[name] = 1
    else:
        countMap[name] += 1
```

---

### Time Complexity

The above counting algorithm is more efficient when implemented with a hash map than with a tree map. With a tree map, a single insertion costs $O(\log n)$ time, so if `n` is the size of the array, the total is $O(n \log n)$ time.

With a hashmap, the total is $O(n)$, since a single insertion is $O(1)$ on average and we go through the entire list of `n` items once.

### Space Complexity

The space consumed by a hash map is proportional to the number of unique keys in the array, which is $O(n)$ in the worst case (when every element is unique).
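
The same counting loop can also be written with `collections.Counter` from the standard library, which is a `dict` subclass built for this purpose. This is offered as an alternative sketch, not a replacement for the loop above; it also makes the space claim easy to check.

```python
from collections import Counter

names = ["alice", "brad", "collin", "brad", "dylan", "kim"]

# One pass over the list, one average-O(1) update per name
count_map = Counter(names)

# One entry per unique name, not per element of the list
unique_keys = len(count_map)

print(count_map["brad"])  # count for a name that appears twice
print(count_map["zoe"])   # missing keys count as 0 instead of raising
print(unique_keys, len(names))
```

Six names go in, but only five keys are stored, because `"brad"` appears twice.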

---



## Problem: Two Sum

Given an array of integers `nums` and an integer `target`, return the indices `i` and `j` such that `nums[i] + nums[j] == target` and `i != j`. The array may contain duplicate values.

You may assume that every input has exactly one pair of indices `i` and `j` that satisfy the condition.

Return the answer with the smaller index first.

```
Input: nums = [3,2,4], target = 6
Output: [1,2]

Input: nums = [3,3], target = 6
Output: [0,1]
```

*My incorrect sol*

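First build a hashmap `all` of `elem: index` for the elements in `nums`. Then iterate through the array, and if the complement `target - n` is in `all`, return both indices. The bug: `all` keeps only the first index of each value, and the lookup never checks that the two indices differ, so an element can be paired with itself. For `nums = [3, 3]` and `target = 6` this returns `[0, 0]` instead of `[0, 1]`.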


```python
# My wrong approach
def twoSum(nums, target):
    # First pass: map each value to the FIRST index where it appears
    all = {}
    for i, n in enumerate(nums):
        if n not in all:
            all[n] = i

    # Second pass: look up the complement of each element.
    # Bug: nothing checks that all[target - n] != ind.
    for ind, n in enumerate(nums):
        if target - n in all:
            return [ind, all[target - n]]

print(twoSum([3, 3], 6))
```

Output:

```
[0, 0]
```
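
One way to repair the two-pass idea, as a sketch: store the last index of each value instead of the first, and skip a match that points back at the current element. The names `two_sum_two_pass` and `last_index` are made up for this example.

```python
def two_sum_two_pass(nums, target):
    # First pass: map each value to the LAST index where it appears
    last_index = {}
    for i, n in enumerate(nums):
        last_index[n] = i  # later duplicates overwrite earlier ones

    # Second pass: find a complement that sits at a different index
    for i, n in enumerate(nums):
        j = last_index.get(target - n)
        if j is not None and j != i:
            return sorted([i, j])  # smaller index first
    return []  # not reached if the input guarantees exactly one pair


print(two_sum_two_pass([3, 3], 6))
print(two_sum_two_pass([3, 2, 4], 6))
```

This still makes two passes over `nums`, so it stays $O(n)$ in time and space.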


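```python
# Neetcode solution
def twoSum(nums, target):
    prevMap = {}  # val -> index, only for elements already visited

    # This way of keeping track of numbers is way cleaner:
    # the complement is always looked up among EARLIER indices,
    # so an element can never be paired with itself.
    for i, n in enumerate(nums):
        diff = target - n
        if diff in prevMap:
            return [prevMap[diff], i]
        prevMap[n] = i

print(twoSum([3, 3], 6))
```

Output:

```
[0, 1]
```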


## Problem: LRU Cache

https://neetcode.io/problems/lru-cache

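Note: in Python, when a key is frequently missing, checking for existence with `if key in dictionary` is generally faster than a `try`/`except KeyError` block, because raising and catching an exception is relatively expensive. When the key is almost always present, a `try` block that never raises costs very little.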
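
The two lookup styles side by side, as a minimal sketch. The helper names `lookup_with_in` and `lookup_with_try` are made up for illustration; both return the same results and differ only in how a missing key is handled.

```python
def lookup_with_in(d, key, default=-1):
    # Membership test first: no exception machinery on a miss
    if key in d:
        return d[key]
    return default


def lookup_with_try(d, key, default=-1):
    # Cheap when the key exists; a miss raises and catches KeyError
    try:
        return d[key]
    except KeyError:
        return default


cache = {1: 10}
print(lookup_with_in(cache, 1), lookup_with_in(cache, 2))
print(lookup_with_try(cache, 1), lookup_with_try(cache, 2))
print(cache.get(2, -1))  # dict.get does the same lookup in one call
```

To compare the two on a specific workload, the standard-library `timeit` module can be used; the result depends on how often the key is missing.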

```python
# My solution using a dynamic list

class LRUCache:

    def __init__(self, capacity: int):
        """Use a list to keep track of the usage history."""
        self.cap = capacity
        self.cache = {}
        self.q = []  # keys ordered from least to most recently used

    def get(self, key: int) -> int:
        if key in self.cache:
            # Update history: move the key to the right end (most recent).
            # Note: list.index and list.pop(pos) are both O(n).
            self.q.append(self.q.pop(self.q.index(key)))
            return self.cache[key]
        return -1

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            self.cache[key] = value
            # Update history
            self.q.append(self.q.pop(self.q.index(key)))
        else:
            self.cache[key] = value
            # Add new item to history
            self.q.append(key)

        if len(self.q) > self.cap:
            # Remove the least recently used item from the dict...
            del self.cache[self.q[0]]
            # ...and from the history (pop(0) is also O(n))
            self.q.pop(0)
```

```python
# Neetcode solution
class Node:
    def __init__(self, key, val):
        # Each node stores the key and value; its position in the
        # linked list encodes the usage history
        self.key, self.val = key, val
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.cache = {}  # map key to node, WOW that is smart

        # Doubly linked list with two dummy nodes:
        # left.next is the least recently used, right.prev the most recent
        self.left, self.right = Node(0, 0), Node(0, 0)
        self.left.next, self.right.prev = self.right, self.left

    def remove(self, node):
        # Unlink this node object from the list in O(1)
        prev, nxt = node.prev, node.next
        prev.next, nxt.prev = nxt, prev

    def insert(self, node):
        # Insert at the end, just before the right dummy node
        prev, nxt = self.right.prev, self.right
        prev.next = nxt.prev = node
        node.next, node.prev = nxt, prev

    def get(self, key: int) -> int:
        if key in self.cache:
            # Move the node to the most-recent end
            self.remove(self.cache[key])
            self.insert(self.cache[key])
            return self.cache[key].val
        return -1

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            # The dictionary stores a reference to the node object itself,
            # so we can reach it and unlink it in O(1) time. So smart!
            self.remove(self.cache[key])
        self.cache[key] = Node(key, value)
        self.insert(self.cache[key])

        if len(self.cache) > self.cap:
            # Remove the first real node (least recently used) from the list
            lru = self.left.next
            self.remove(lru)
            # Delete the matching entry from the dict
            del self.cache[lru.key]
```
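
For comparison, here is a shorter sketch of the same idea using `collections.OrderedDict` from the standard library, which keeps its keys in an order that can be updated in O(1). This is a swapped-in technique, not the neetcode solution; the class name `OrderedDictLRUCache` is made up for this example.

```python
from collections import OrderedDict


class OrderedDictLRUCache:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.cache = OrderedDict()  # oldest entry first, newest last

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # mark as most recently used
        return self.cache[key]

    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.cap:
            # last=False pops from the front: the least recently used entry
            self.cache.popitem(last=False)


lru = OrderedDictLRUCache(2)
lru.put(1, 1)
lru.put(2, 2)
first = lru.get(1)    # key 1 becomes the most recently used
lru.put(3, 3)         # over capacity: evicts key 2
evicted = lru.get(2)
latest = lru.get(3)
```

In an interview the hand-written linked list is usually what is being tested, but the `OrderedDict` version is handy for checking the expected behaviour.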
