# Cache Fundamentals
The idea behind caching is quite simple:
- Store data that is often needed somewhere that can be retrieved very fast.

Caches should be fast along two dimensions:
1. **Maximum use**: Ensure that as many requests as possible are served by the cache (*cache hit*) rather than by main memory (*cache miss*).
2. **Small overhead**: Testing membership and deciding which item to replace should be as fast as possible.

In general, you must fix the size of your cache in advance and implement a strategy for which items to keep and which items to evict.
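
As a small illustration of a fixed-size cache and its hits and misses, here is a sketch using the standard library's `functools.lru_cache`. The function `slow_square` and the access pattern are made up for the example; only `lru_cache` and `cache_info()` come from Python itself.

```python
from functools import lru_cache


@lru_cache(maxsize=2)  # capacity is fixed up front
def slow_square(n):
    # hypothetical stand-in for an expensive lookup
    return n * n


# a made-up access pattern; some calls hit the cache, some miss
for n in [1, 2, 1, 3, 1, 2]:
    slow_square(n)

info = slow_square.cache_info()  # CacheInfo(hits, misses, maxsize, currsize)
hit_rate = info.hits / (info.hits + info.misses)
print(info, hit_rate)
```

With a capacity of 2, the keys 2 and 3 are each evicted once in this sequence, so only the repeated requests for key 1 are hits.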

## LRU Cache
A **Least Recently Used (LRU) Cache** organizes items in *order of use*, allowing you to quickly identify which item hasn't been used for the longest amount of time.
https://www.interviewcake.com/concept/java/lru-cache

- Picture a clothes rack, where clothes are always hung up on one side. To find the least-recently used item, look at the item on the other end of the rack.

#### Strengths:
- **Super fast access.** LRU caches store items in order from most-recently used to least-recently used. That means both the most-recently used and the least-recently used item can be accessed in O(1) time.

- **Super fast updates.** Each time an item is accessed, updating the cache takes O(1) time.

#### Weaknesses:
- **Space heavy.** An LRU cache tracking *n* items requires a linked list of length *n*, AND a hash map holding *n* items. That's O(n) space, but it's still two data structures (as opposed to one).



Let's explore a bit first, using the `LRUCache` class and the `see_inside` helper defined further below!

In [2]:
# Your LRUCache object will be instantiated and called as such:
lru = LRUCache(3)
lru.put(1,1)
lru.put(2,2)
lru.put(3,3)

see_inside(lru)


key: 3, value: 3, order: 0
key: 2, value: 2, order: 1
key: 1, value: 1, order: 2


In [3]:
lru.put(2,4)
see_inside(lru)

key: 2, value: 4, order: 0
key: 3, value: 3, order: 1
key: 1, value: 1, order: 2


In [4]:
lru.put(5,5)
see_inside(lru)

key: 5, value: 5, order: 0
key: 2, value: 4, order: 1
key: 3, value: 3, order: 2


In [5]:
lru.get(3)
see_inside(lru)

key: 3, value: 3, order: 0
key: 5, value: 5, order: 1
key: 2, value: 4, order: 2


In [6]:
lru.put(7,7)
see_inside(lru)

key: 7, value: 7, order: 0
key: 3, value: 3, order: 1
key: 5, value: 5, order: 2


## Implementation:

- Under the hood, an LRU cache is often implemented by pairing a *doubly-linked list* with a *hash map*. In Python, `collections.OrderedDict` offers this combination ready-made.

![ordered-dict](img/ordered-dict.png)
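
The two `OrderedDict` operations an LRU cache relies on are `move_to_end` and `popitem(last=False)`. The sketch below shows them on a toy dictionary (the keys and values are arbitrary), treating the end of the dictionary as "most recently used" and the front as "least recently used".

```python
from collections import OrderedDict

od = OrderedDict()
od["a"] = 1
od["b"] = 2
od["c"] = 3

# "using" a key: move it to the end, the most-recently-used position
od.move_to_end("a")
order_after_move = list(od)

# evicting: pop from the front, the least-recently-used position
oldest = od.popitem(last=False)
remaining = list(od)

print(order_after_move, oldest, remaining)
```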

## Accessing and Evicting

Here are the steps we run through each time an item is accessed:

1. Look up the item in our hash map. If it's there, the item is already in our cache; this is called a "*cache hit*":

    1. Use the hash map to quickly find the corresponding linked list node.

    2. Move the item's linked list node to the head of the linked list, since it's now the most recently used (so it shouldn't get evicted any time soon).

2. If the item isn't in the hash map, we have a "*cache miss*". We need to load the item into the cache:

    1. Create a new linked list node for the item. Insert it at the head of the linked list.

    2. Add the item to our hash map, storing the newly-created linked list node as the value.

    3. Is our cache full? If so, we need to evict something:

        1. Grab the least-recently used cache item; it'll be at the tail of the linked list.

        2. Evict that item from the cache by removing it from the linked list and the hash map.
    
Keeping all the pointers straight as you move around linked list nodes is tricky!

All of those steps are O(1), so put together it takes O(1) time to update our cache each time an element is accessed. Pretty cool!
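
The hit and miss paths above can be sketched as a small read-through cache. This is only an illustration under stated assumptions: the class `ReadThroughLRU`, its `loader` callback, and the hit/miss counters are hypothetical names introduced here, and an `OrderedDict` stands in for the linked list plus hash map (its last position plays the role of the list head).

```python
from collections import OrderedDict


class ReadThroughLRU:
    """Hypothetical helper: an LRU cache that loads missing items itself."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader        # called on a cache miss (e.g. a slow lookup)
        self.items = OrderedDict()  # last position = most recently used
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.items:
            # cache hit: mark the item as most recently used
            self.hits += 1
            self.items.move_to_end(key)
            return self.items[key]

        # cache miss: load the item and insert it as most recently used
        self.misses += 1
        value = self.loader(key)
        self.items[key] = value

        # over capacity: evict the least recently used item (front of the dict)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
        return value


cache = ReadThroughLRU(capacity=2, loader=lambda k: k * 10)
results = [cache.access(k) for k in [1, 2, 1, 3, 2]]
print(results, list(cache.items), cache.hits, cache.misses)
```

Every branch does a constant number of dictionary operations, which matches the O(1) claim for each access.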

In [None]:
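# Shortest solution from LeetCode, extending OrderedDict
from collections import OrderedDict


class LRUCacheOD(OrderedDict):

    def __init__(self, capacity):
        """
        :type capacity: int
        """
        # initialize the underlying OrderedDict before adding our own state
        super().__init__()
        self.capacity = capacity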

    def get(self, key):
        """
        :type key: int
        :rtype: int
        """
        
        # if the key was not found return -1
        if key not in self:
            return -1
        
        # since we accessed the key, move it to the end so it will stay in the cache longer
        self.move_to_end(key)
        
        # return the value
        return self[key]

    def put(self, key, value):
        """
        :type key: int
        :type value: int
        :rtype: None
        """
        # if the key is already in the dict
        if key in self:
            # it was accessed, so move it to the end so it stays in the cache longer!
            self.move_to_end(key)
        
        # reset the value if it already existed, or create a new key-value pair if it did not
        self[key] = value
        
        # check to see if we are over capacity (now that we may have already added a new item)
        if len(self) > self.capacity:
            
            # if so, we have to evict an item!
            # popitem(last=True) pops from the end (LIFO, like a stack).
            # We want the front instead (FIFO, like a queue): that is the least recently used item.
            self.popitem(last=False)


In [1]:

# Simple node object for a doubly linked list
# Each node keeps track of a key/value and pointers to the next and prev nodes
class Node:
    def __init__(self, k, v):
        self.key = k
        self.val = v
        self.prev = None
        self.next = None

class LRUCache:
    def __init__(self, capacity):
        """
        :type capacity: int
        """
        self.capacity = capacity
        self.dic = dict()
        # a neat trick where the cache object itself keeps track of the
        # head and tail directly (acting like a sentinel node)
        # self.prev refers to the front of the LL (most recently used)
        # and self.next refers to the end (least recently used, next in line for deletion)
        self.prev = self.next = self

    # when you add a new node, put it in the front of the LL
    def _add(self, node):
        # self.prev refers to the old front of the LL
        p = self.prev
        # the old front now points forward to our new node
        p.next = node

        # the cache object must update its pointer so the front points to our new node
        self.prev = node

        # and our new node's prev to the old front of the LL
        node.prev = p
        # and the next points to the cache object (like a sentinel)
        node.next = self

    # when removing a node, you just bypass the pointers to it
    # and let the garbage collector handle the rest
    def _remove(self, node):
        p = node.prev
        n = node.next
        p.next = n
        n.prev = p

    # when you access a key, you need to move that node to the front
    # so it will not be deleted soon
    def get(self, key):
        """
        :type key: int
        :rtype: int
        """
        # first check if the key even exists!
        if key in self.dic:
            # if it was found, get a pointer to the correct node from the dict
            n = self.dic[key]
            # then remove all links to that node
            self._remove(n)
            # and add it to the front of the LL
            self._add(n)
            # finally, return the value that was asked for
            return n.val

        # if the key was not found return -1
        return -1

    # When you put a new value in, you also move it to the front of the list
    def put(self, key, value):
        """
        :type key: int
        :type value: int
        :rtype: None
        """
        # if the key is already in the cache
        if key in self.dic:
            # we want to remove links to the node that currently holds its value
            # so that we can just recreate it at the front of the LL
            self._remove(self.dic[key])
        # make a new node to store the key/value pair
        n = Node(key, value)

        # add that node to the LL (inserting it to the front)
        self._add(n)
        # point the dictionary to that new node
        # note that if the key already existed, we are just reassigning its value
        self.dic[key] = n

        # If we added a new entry to our dictionary, we may be over capacity now
        if len(self.dic) > self.capacity:
            # in that case, we want to remove the least recently used node,
            # which will be referenced by self.next (the tail of the LL)
            n = self.next
            # then we remove the pointers to that node
            self._remove(n)
            # and remove it from the dictionary
            # at this point, there will really be no more pointers to that node
            # and the garbage collector will get rid of it.
            del self.dic[n.key]

# Debug helper: walk the LL from most to least recently used and print each entry
def see_inside(lru):
    n = lru.prev
    for i in range(len(lru.dic)):
        print(f'key: {n.key}, value: {n.val}, order: {i}')
        n = n.prev