**1. Problem Statement**

You are given a list of event types generated by an application.
Your task is to compute how many times each event type occurred.

This simulates a basic metrics aggregation step in an ETL pipeline.

**Input Format**

A list of strings events

Each string represents an event type (e.g., "login", "logout")

events = [
    "login",
    "logout",
    "login",
    "purchase",
    "login",
    "purchase"
]


**Output Format**

Return a dictionary where:

key = event type

value = number of occurrences

{
    "login": 3,
    "logout": 1,
    "purchase": 2
}


**Constraints**

1 <= len(events) <= 10^5

Event names are lowercase strings

Assume input fits in memory

Use plain dictionaries (do NOT use Counter for this question)

In [0]:
events = [ "login", "logout", "login", "purchase", "login", "purchase" ]

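# Maps each event type to the number of times it has been seen.
event_counts = {}

for e in events:
    # get() returns 0 for an event type that has not been seen yet.
    event_counts[e] = event_counts.get(e, 0) + 1

event_counts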
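
Since the output format asks to return a dictionary, the same loop can be wrapped in a function. The following is a minimal sketch; the function name `count_events` is an assumption made for illustration and does not appear in the problem statement.

```python
def count_events(events):
    """Return a plain dict mapping each event type to its occurrence count."""
    counts = {}
    for event in events:
        # dict.get returns the default (0) when the key is missing,
        # so first and repeated occurrences share one code path.
        counts[event] = counts.get(event, 0) + 1
    return counts
```

The loop visits each element once, so the running time grows linearly with len(events).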

**2. Problem Statement**

You are given a list of HTTP status codes produced by a backend service.
Build a frequency map to count how many times each status code appears.

This mirrors a common log-metrics aggregation in backend data pipelines.

**Input Format**

A list of integers status_codes

status_codes = [
    200, 500, 200, 404, 200, 500, 403
]


**Output Format**

Return a dictionary where:

key = HTTP status code

value = count of occurrences

{
    200: 3,
    500: 2,
    404: 1,
    403: 1
}


**Constraints**

1 <= len(status_codes) <= 10^5

Status codes are integers (e.g., 200, 404, 500)

Order does NOT matter

Use basic dict operations only (no Counter, no defaultdict)

In [0]:
status_codes = [ 200, 500, 200, 404, 200, 500, 403 ]

code_freq = {}

for s in status_codes:
    code_freq[s] = code_freq.get(s, 0) + 1

code_freq
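
An equivalent way to meet the "basic dict operations only" constraint is an explicit membership test instead of get(). This is a sketch of that alternative; the function name `count_status_codes` is hypothetical.

```python
def count_status_codes(status_codes):
    """Count occurrences of each status code using only `in` and item assignment."""
    counts = {}
    for code in status_codes:
        if code in counts:
            counts[code] += 1   # code seen before: increment its count
        else:
            counts[code] = 1    # first occurrence: initialise the count
    return counts
```

Both forms produce the same dictionary; the get() version is shorter, while this one spells out the two cases.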

**3. Problem Statement**

You are given a list of user IDs representing API calls.
Each user ID may appear multiple times.
Build a frequency map of API calls per user.

This simulates counting requests per user before rate-limit checks.

**Input Format**

A list of integers user_ids

user_ids = [
    101, 102, 101, 103, 101, 102, 104
]

**Output Format**

Return a dictionary where:

key = user_id

value = number of API calls made by that user

{
    101: 3,
    102: 2,
    103: 1,
    104: 1
}

**Constraints**

1 <= len(user_ids) <= 10^5

User IDs are integers

Input fits in memory

❌ No Counter

❌ No defaultdict

✅ Use plain dict

In [0]:
user_ids = [101, 102, 101, 103, 101, 102, 104]

id_freq = {}

for u in user_ids:
    id_freq[u] = id_freq.get(u, 0) + 1

id_freq

**4. Problem Statement**

You are processing application logs represented as (user_id, success_flag) tuples.

Count only successful API calls per user.

This mirrors a real ETL filter + aggregation step (filter → aggregate).

**Input Format**

A list of tuples (user_id: int, success: bool)

logs = [
    (101, True),
    (102, False),
    (101, True),
    (101, False),
    (102, True),
    (103, True),
    (103, True)
]

**Output Format**

Return a dictionary where:

key = user_id

value = number of successful calls

{
    101: 2,
    102: 1,
    103: 2
}

**Constraints**

1 <= len(logs) <= 10^5

User IDs are integers

success_flag is boolean

❌ No Counter

❌ No defaultdict

✅ Use plain dictionary

Single pass preferred

In [0]:
logs = [
    (101, True),
    (102, False),
    (101, True),
    (101, False),
    (102, True),
    (103, True),
    (103, True)
]

success_call = {}

for user_id, success in logs:
    # Filter first: only successful calls are counted.
    if success:
        success_call[user_id] = success_call.get(user_id, 0) + 1

success_call
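
The filter and the aggregation can share one pass over the logs. Below is a minimal sketch as a function; the name `count_successful_calls` is an assumption for illustration.

```python
def count_successful_calls(logs):
    """Count successful calls per user in a single pass over (user_id, success) tuples."""
    counts = {}
    for user_id, success in logs:
        if success:  # filter step: skip failed calls
            counts[user_id] = counts.get(user_id, 0) + 1  # aggregate step
    return counts
```

Note that in this sketch a user whose calls all failed does not appear in the result at all, rather than appearing with a count of 0.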

**5. Problem Statement**

You are processing a large log file stream of user activity events.
Each log record is a tuple:

(user_id, event_type, timestamp)


You must compute a frequency map of events per user, but ONLY for events that occurred within the last N seconds relative to the maximum timestamp seen so far.

This simulates late-arriving data + sliding time window aggregation in a real data pipeline.

**Input Format**

An iterable (stream-like) of tuples:

(user_id: int, event_type: str, timestamp: int)


Example:

logs = [
    (101, "click", 100),
    (102, "view", 105),
    (101, "click", 110),
    (101, "purchase", 160),
    (102, "view", 170),
    (101, "click", 190)
]

An integer window_size (seconds)

window_size = 60


**Window Definition**

Only count events where:

timestamp >= (max_timestamp_seen_so_far - window_size)


Using the example above:

max_timestamp = 190

Valid window: >= 130
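
The window arithmetic for the example can be checked with a few lines. This sketch works on the small in-memory list only, to illustrate the cutoff; it does not respect the streaming and memory constraints of the problem. The names `cutoff` and `in_window` are illustrative.

```python
logs = [
    (101, "click", 100),
    (102, "view", 105),
    (101, "click", 110),
    (101, "purchase", 160),
    (102, "view", 170),
    (101, "click", 190),
]
window_size = 60

# The largest timestamp in the example, then the lower bound of the window.
max_timestamp = max(ts for _, _, ts in logs)
cutoff = max_timestamp - window_size

# Records whose timestamp falls inside the window (inclusive lower bound).
in_window = [record for record in logs if record[2] >= cutoff]
```

Here cutoff evaluates to 130, and the three records with timestamps 160, 170, and 190 remain.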

**Output Format**

Return a nested frequency map:

{
    user_id: {
        event_type: count
    }
}


**Expected Output:**

{
    101: {
        "purchase": 1,
        "click": 1
    },
    102: {
        "view": 1
    }
}


**Constraints**

Stream size can be millions of records

Input may NOT be sorted by timestamp

Memory is limited: you cannot store the entire stream

❌ No Counter

❌ No defaultdict

✅ Use plain dictionaries

✅ Single pass logic expected

Clean-up of expired events is required

In [0]:
logs = [
    (101, "click", 100),
    (102, "view", 105),
    (101, "click", 110),
    (101, "purchase", 160),
    (102, "view", 170),
    (101, "click", 190)
]
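
One possible single-pass approach is sketched below; it is an assumption about how the constraints could be met, not a reference solution. It tracks the maximum timestamp seen so far, drops late records that already fall outside the window, and keeps a min-heap (from the standard `heapq` module) of the records currently inside the window so that expired ones can be evicted and their counts decremented. The function name `windowed_event_counts` and all variable names are hypothetical.

```python
import heapq


def windowed_event_counts(stream, window_size):
    """Nested counts {user_id: {event_type: count}} for events in the trailing window."""
    counts = {}      # user_id -> {event_type: count}
    in_window = []   # min-heap of (timestamp, user_id, event_type) still counted
    max_ts = None    # maximum timestamp seen so far

    for user_id, event_type, ts in stream:
        if max_ts is None or ts > max_ts:
            max_ts = ts
        cutoff = max_ts - window_size

        # A late-arriving record older than the window is never counted.
        if ts < cutoff:
            continue

        user_counts = counts.setdefault(user_id, {})
        user_counts[event_type] = user_counts.get(event_type, 0) + 1
        heapq.heappush(in_window, (ts, user_id, event_type))

        # Clean-up: evict records that the advancing window has left behind.
        while in_window and in_window[0][0] < cutoff:
            _, old_user, old_event = heapq.heappop(in_window)
            old_counts = counts[old_user]
            old_counts[old_event] -= 1
            if old_counts[old_event] == 0:
                del old_counts[old_event]
            if not old_counts:
                del counts[old_user]

    return counts
```

Calling windowed_event_counts(logs, 60) on the example yields the expected nested map. Memory use in this sketch is bounded by the number of records inside the window rather than by the length of the stream, and each record is pushed and popped at most once.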

