
In [15]:
from typing import Any
from random import randrange

# LRU Cache

Our LRU cache implementation will use a HashTable and a Queue. First, let's import the hash table from another Colab notebook.

In [16]:
%%capture
from google.colab import drive
drive.mount('mnt', force_remount=True)
%run 'mnt/My Drive/Colab Notebooks/C9-Hash Tables.ipynb'
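
If the C9 notebook is not available (for example, when running outside Colab), a dict-backed stand-in along the following lines may be enough to follow along. It is only a sketch: the method names `HashTableInsert`, `HashTableLookup` and `HashTableRemove` and the `HashTable(size)` constructor are taken from how the cache code in this notebook calls them, while the behaviour on a missing key is an assumption, not a description of the C9 implementation.

```python
from typing import Any, Optional


class HashTable:
    """Dict-backed stand-in; NOT the C9 implementation."""

    def __init__(self, size: int):
        # `size` is accepted only to match the HashTable(max_size) call in
        # LRUCache; a Python dict grows on its own, so it is otherwise unused.
        self.size = size
        self._items = {}

    def HashTableInsert(self, key: Any, value: Any) -> None:
        # Insert a new key or overwrite the value of an existing one.
        self._items[key] = value

    def HashTableLookup(self, key: Any) -> Optional[Any]:
        # Return None on a miss, which is what the cache code checks for.
        return self._items.get(key)

    def HashTableRemove(self, key: Any) -> None:
        # Removing a missing key is a no-op (an assumption about C9's behaviour).
        self._items.pop(key, None)
```

Running this cell after the `%run` line would shadow the imported class, so it is meant as a replacement for that import rather than an addition to it.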

Then let's re-implement the Queue. This is similar to the QueueLL from the "C3-Stacks & Queues" notebook, but the internal linked list is doubly linked, so a node can be removed from the middle of the queue without traversing it.

In [17]:
class QueueListNode:
    def __init__(self, value: Any):
        self.value = value
        self.next = None
        self.prev = None

In [18]:
class CacheQueue:
    def __init__(self):
        self.front = None
        self.back = None

    def Enqueue(self, value: Any) -> None:
        node = QueueListNode(value)
        # If the queue is empty, point both front and back at the new node.
        if self.back is None:
            self.front = node
            self.back = node
        else:
            self.back.next = node
            # Set the node prev pointer to the previous last element (back).
            node.prev = self.back
            self.back = node

    def Dequeue(self) -> Any:
        if self.front is None:
            return None

        value = self.front.value
        self.front = self.front.next
        if self.front is None:
            self.back = None
        else:
            # Set the prev pointer of the new front to None to indicate there's no other node in front of it.
            self.front.prev = None
        return value

    def RemoveNode(self, node: QueueListNode) -> None:
        '''
        Removes a given node from the middle of the queue.
        '''
        if node.prev is not None:
            node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        if node is self.front:
            self.front = self.front.next
        if node is self.back:
            self.back = self.back.prev
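
To see why the `prev` pointer matters, here is a small sketch of the "remove from the middle, re-add at the back" move that the cache relies on. `Node` and `MiniQueue` are condensed copies of `QueueListNode` and `CacheQueue`, repeated only so the block runs by itself, and `ToList` is a hypothetical helper added for inspection.

```python
from typing import Any, List


class Node:
    def __init__(self, value: Any):
        self.value = value
        self.next = None
        self.prev = None


class MiniQueue:
    """Condensed copy of CacheQueue, repeated so this block runs by itself."""

    def __init__(self):
        self.front = None
        self.back = None

    def Enqueue(self, value: Any) -> None:
        node = Node(value)
        if self.back is None:
            self.front = node
        else:
            self.back.next = node
            node.prev = self.back
        self.back = node

    def RemoveNode(self, node: Node) -> None:
        # Splice the node out by rewiring its neighbours; no traversal needed.
        if node.prev is not None:
            node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        if node is self.front:
            self.front = node.next
        if node is self.back:
            self.back = node.prev

    def ToList(self) -> List[Any]:
        # Hypothetical helper for inspection: walk from front to back.
        values, node = [], self.front
        while node is not None:
            values.append(node.value)
            node = node.next
        return values


q = MiniQueue()
q.Enqueue("a")
q.Enqueue("b")
node_b = q.back          # keep a reference to b's node, as CacheEntry does
q.Enqueue("c")

q.RemoveNode(node_b)     # unlink b from the middle
q.Enqueue("b")           # re-add it at the back: b is now the most recent
print(q.ToList())        # ['a', 'c', 'b']
```

With a singly linked list, unlinking `b` would require walking from the front to find its predecessor, so the move would cost time proportional to the queue length instead of a constant number of pointer updates.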



## LRU Cache implementation

The CacheEntry class represents an entry in the LRU cache's hash table. It has three fields: the key used to retrieve the cached value, the cached value itself, and a pointer to the corresponding node in the cache's queue.

In [19]:
class CacheEntry:
    def __init__(self, key: Any, value: Any, node: QueueListNode):
        self.key = key
        self.value = value
        self.node = node

Our LRUCache class exposes a single method, CacheLookup. Every time the user asks for an entry, it is either returned from the cache (a hit) or retrieved from the slower data source and then cached (a miss).

In [20]:
class LRUCache:
    def __init__(self, max_size: int):
        # The HashTable implementation can be found in C9-Hash Tables.
        self.ht = HashTable(max_size)
        # The CacheQueue (a queue backed by a doubly linked list) is defined above.
        self.q = CacheQueue()
        self.max_size = max_size
        self.current_size = 0

    def CacheLookup(self, key: Any) -> Any:
        '''
        Returns the value in the cache for a given key.
        On a cache hit, returns the cached value and marks the key as the
        most recently used.
        On a cache miss, fetches the data, inserts it into the cache and,
        if the cache is full, evicts the least recently used entry.
        '''
        # Check if the key exists in our cache table.
        entry = self.ht.HashTableLookup(key)

        if entry is None:
            # Cache miss. Before fetching the entry from the more expensive
            # data store, evict the least recently used item (the front of
            # the queue) if the cache is full.
            if self.current_size >= self.max_size:
                key_to_remove = self.q.Dequeue()
                self.ht.HashTableRemove(key_to_remove)
                self.current_size -= 1

            # Retrieve the data from the slow data source here.
            # For this example we just generate a random number.
            data = randrange(0, 100)

            # Add the key to the back of the queue.
            self.q.Enqueue(key)

            # Create a new hash table entry from key, data and the back pointer in the queue.
            entry = CacheEntry(key, data, self.q.back)
            self.ht.HashTableInsert(key, entry)
            self.current_size += 1
        else:
            # Cache hit.
            # Reset this key's location in the queue by moving it from its current
            # location to the back of the queue.
            # First remove it from the queue using the node pointer stored in the cache entry.
            self.q.RemoveNode(entry.node)
            # Then re-add the key to the back of the queue.
            self.q.Enqueue(key)
            # Update the CacheEntry's pointer to the queue node.
            entry.node = self.q.back

        return entry.value
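
For comparison, the same hit, miss and eviction behaviour can be sketched with the standard library's `collections.OrderedDict`, which keeps keys in insertion order and supports moving a key to the end. This is a different technique from the HashTable plus CacheQueue design above, not a drop-in copy of it: the class name `OrderedDictLRU` and the `fetch` callback (standing in for the slow data source) are hypothetical, and the sketch assumes `max_size` is at least 1.

```python
from collections import OrderedDict
from typing import Any, Callable


class OrderedDictLRU:
    """LRU cache sketch built on OrderedDict (not the notebook's classes)."""

    def __init__(self, max_size: int, fetch: Callable[[Any], Any]):
        # `fetch` is a hypothetical stand-in for the slow data source.
        # Assumes max_size >= 1.
        self.max_size = max_size
        self.fetch = fetch
        self.entries = OrderedDict()  # oldest key first, newest key last

    def CacheLookup(self, key: Any) -> Any:
        if key in self.entries:
            # Cache hit: mark the key as the most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: evict the least recently used key if the cache is full.
        if len(self.entries) >= self.max_size:
            self.entries.popitem(last=False)
        value = self.fetch(key)
        self.entries[key] = value
        return value


cache = OrderedDictLRU(max_size=2, fetch=lambda key: key.upper())
cache.CacheLookup("a")          # miss, cache order: a
cache.CacheLookup("b")          # miss, cache order: a, b
cache.CacheLookup("a")          # hit,  cache order: b, a
cache.CacheLookup("c")          # miss, evicts b, cache order: a, c
print(list(cache.entries))      # ['a', 'c']
```

The trace shows the property that distinguishes LRU from plain first-in, first-out eviction: `a` was inserted before `b`, but because it was looked up again, `b` is the entry that gets evicted.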