# Details
You are given a stream of integers representing user events arriving in real time.

Design a data structure that supports the following operations efficiently:

- `add(x)` – Add an integer x from the stream

- `remove(x)` – Remove one occurrence of integer x if it exists

- `getTopK(k)` – Return the k most frequent integers, ordered by highest frequency first
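
To make the intended semantics concrete, here is a minimal reference sketch built on the standard library's `collections.Counter`. It is only an illustration of what each operation should return, not the requested design. The helper names `add`, `remove`, and `get_top_k` and the sample values are assumptions chosen for this example.

```python
from collections import Counter

counts = Counter()  # value -> number of occurrences currently in the stream

def add(x):
    counts[x] += 1

def remove(x):
    # Remove one occurrence only if x is currently present
    if x in counts:
        counts[x] -= 1
        if counts[x] == 0:
            del counts[x]

def get_top_k(k):
    # most_common returns (value, count) pairs, highest count first
    return [value for value, _ in counts.most_common(k)]

for x in [5, 7, 5, 9, 7, 5]:
    add(x)
top_before = get_top_k(2)   # 5 occurs three times, 7 twice

remove(5)
remove(5)
top_after = get_top_k(2)    # 7 now leads with two occurrences
```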

## Constraints

1. The stream can contain millions of elements
2. **add** and **remove** must run in close to `O(1)` average time
3. **getTopK(k)** should be faster than sorting the entire dataset every time
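
Constraint 3 can be illustrated with Python's standard `heapq.nlargest`, which selects the k largest items without ordering everything: roughly `O(n log k)` for n distinct values, compared with `O(n log n)` for a full sort. The frequency table `freq` below is made-up sample data, used only to show that both approaches agree on the result.

```python
from heapq import nlargest

# Hypothetical frequency table: value -> count
freq = {"a": 4, "b": 9, "c": 1, "d": 7, "e": 3}
k = 2

# Full sort: orders all n entries, then keeps the first k
by_sort = sorted(freq.items(), key=lambda item: item[1], reverse=True)[:k]

# Heap selection: scans the entries once while tracking only k candidates
by_heap = nlargest(k, freq.items(), key=lambda item: item[1])

print(by_heap)  # [('b', 9), ('d', 7)]
```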

## Tasks

1. Describe the data structures you would use
2. Explain how each operation works
3. Analyze the time and space complexity of each operation

In [None]:
from heapq import nlargest

class EventTracker:
    def __init__(self):
        # Plain dict mapping each event value to its occurrence count
        self.Events = {}

    def add(self, value):
        # Increment the occurrence count, starting from 0 for unseen events
        self.Events[value] = self.Events.get(value, 0) + 1
        return True

    def remove(self, value):
        # Nothing to remove if the event is not currently tracked
        if value not in self.Events:
            return False
        # Decrement the occurrence count and remove if it reaches zero
        self.Events[value] -= 1
        if self.Events[value] == 0:
            del self.Events[value]
        return True

    def getTopK(self, k):
        # nlargest selects the k highest counts in O(n log k) over the n
        # distinct events, avoiding a full sort; the dict keeps that order
        top_k = dict(nlargest(k, self.Events.items(), key=lambda item: item[1]))
        return top_k


In [12]:
sample_events = EventTracker()

sample_event_codes = [
    1012, 2045, 3099, 1012, 4501, 3320, 2045, 7781, 9023, 1187,
    4501, 6722, 3099, 1187, 5566, 7781, 9900, 2045, 3320, 1012,
    4455, 5566, 6722, 7781, 8899, 9900, 1187, 2045, 3099, 4501,
    5566, 6722, 7781, 8899, 9900, 1012, 2045, 3320, 4455, 5566,
    6722, 7781, 8899, 9900, 1187, 2045, 3099, 4501, 5566, 6722,
    7781, 8899, 9900, 1012, 2045, 3320, 4455, 5566, 6722, 7781,
    8899, 9900, 1187, 2045, 3099, 4501, 5566, 6722, 7781, 8899,
    9900, 1012, 2045, 3320, 4455, 5566, 6722, 7781, 8899, 9900,
    1187, 2045, 3099, 4501, 5566, 6722, 7781, 8899, 9900, 1012
]

print("Adding elements to the hashmap")
for i in sample_event_codes:
    sample_events.add(i)

print("done adding events into the hashmap\n")

print(f"the top 10 events are: {sample_events.getTopK(10)}\n\n")

print("remove events from the hashmap\n")
sample_events.remove(2045)

print(f"the NEW top 10 events are: {sample_events.getTopK(10)}\n\n")


Adding elements to the hashmap
done adding events into the hashmap

the top 10 events are: {2045: 10, 7781: 10, 6722: 9, 5566: 9, 9900: 9, 8899: 8, 1012: 7, 3099: 6, 4501: 6, 1187: 6}


remove events from the hashmap

the NEW top 10 events are: {7781: 10, 2045: 9, 6722: 9, 5566: 9, 9900: 9, 8899: 8, 1012: 7, 3099: 6, 4501: 6, 1187: 6}
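
The tracker above scans every distinct event on each `getTopK` call. If that scan becomes a bottleneck, one possible alternative is to group values by their current count, so the most frequent values can be read off directly. The sketch below is not part of the original solution: the class name `BucketedEventTracker` and its attributes are hypothetical, and the order among values with equal counts may differ from the `nlargest` version.

```python
from collections import defaultdict

class BucketedEventTracker:
    """Hypothetical variant that groups values by their current count."""

    def __init__(self):
        self.counts = {}                   # value -> current count
        self.buckets = defaultdict(dict)   # count -> {value: None}, insertion-ordered
        self.max_count = 0                 # highest count currently present

    def _move(self, value, old, new):
        # Take the value out of its old bucket and place it in the new one
        if old > 0:
            del self.buckets[old][value]
            if not self.buckets[old]:
                del self.buckets[old]
        if new > 0:
            self.buckets[new][value] = None
            self.counts[value] = new
        else:
            del self.counts[value]

    def add(self, value):
        old = self.counts.get(value, 0)
        self._move(value, old, old + 1)
        self.max_count = max(self.max_count, old + 1)
        return True

    def remove(self, value):
        old = self.counts.get(value, 0)
        if old == 0:
            return False
        self._move(value, old, old - 1)
        # If the top bucket just emptied, the next level down is the new maximum
        if old == self.max_count and old not in self.buckets:
            self.max_count = old - 1
        return True

    def getTopK(self, k):
        # Walk count levels downward from the maximum until k values are found
        result = {}
        count = self.max_count
        while count > 0 and len(result) < k:
            for value in self.buckets.get(count, ()):
                result[value] = count
                if len(result) == k:
                    break
            count -= 1
        return result


tracker = BucketedEventTracker()
for code in [1012, 2045, 3099, 1012, 2045, 1012]:
    tracker.add(code)
top_two = tracker.getTopK(2)

tracker.remove(1012)
tracker.remove(1012)
top_one = tracker.getTopK(1)
```

Under these assumptions, `add` and `remove` each do a constant number of dictionary operations on average, and `getTopK` costs `O(k)` plus the number of count levels it walks through, which can be large when counts are sparse. Space stays `O(n)` in the number of distinct values.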


