**⭐ 1. What This Pattern Solves**

Efficiently compute cumulative sums over sequences.

Optimize range-sum queries in ETL or analytics pipelines.

Avoid re-summing the same elements for every query on a large dataset: one precomputation pass replaces repeated sum() calls.

Useful in metrics, transaction aggregations, or windowed calculations.

**⭐ 2. SQL Equivalent**

In [0]:
%sql
-- Compute cumulative sum over a column, ordered by id
SELECT id, value,
       SUM(value) OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prefix_sum
FROM table_name;
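
For comparison, the same running total can be sketched in plain Python with `itertools.accumulate`. The `rows` list is hypothetical sample data standing in for `table_name`, and the `(id, value)` tuple layout is an assumption made for illustration.

```python
from itertools import accumulate

# Hypothetical rows standing in for table_name: (id, value)
rows = [(3, 4), (1, 3), (2, 1)]

# ORDER BY id: a running total is only meaningful on ordered rows
ordered = sorted(rows, key=lambda row: row[0])

# accumulate yields the running sums, like SUM(value) OVER (ORDER BY id ...)
running = list(accumulate(value for _, value in ordered))

# Attach each running total to its row: (id, value, prefix_sum)
result = [(row_id, value, total) for (row_id, value), total in zip(ordered, running)]
print(result)  # [(1, 3, 3), (2, 1, 4), (3, 4, 8)]
```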

**⭐ 3. Core Idea**

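Precompute cumulative sums so any range sum can be answered in O(1) with a single subtraction.
With a leading zero (prefix[0] = 0 and prefix[k] = sum of the first k elements), the sum of elements l through r inclusive is prefix[r + 1] - prefix[l]. This works because everything before index l appears in both terms and cancels out.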

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
# Compute prefix sums
arr = [3, 1, 4, 1, 5]
prefix = [0] * (len(arr) + 1)  # extra 0 at start for easier indexing

for i in range(len(arr)):
    prefix[i + 1] = prefix[i] + arr[i]

# Range sum from l to r (inclusive)
l, r = 1, 3
range_sum = prefix[r + 1] - prefix[l]
print(range_sum)  # 6  (1 + 4 + 1)
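
The template can also be wrapped in two small helpers. This is a minimal sketch: the names `build_prefix` and `range_sum` are illustrative, not part of any library, and the `initial=0` argument of `itertools.accumulate` assumes Python 3.8 or later.

```python
from itertools import accumulate

def build_prefix(values):
    """Return prefix sums with a leading 0, so the result has len(values) + 1 entries."""
    return list(accumulate(values, initial=0))

def range_sum(prefix, l, r):
    """Sum of the original values at indices l..r inclusive (0-based)."""
    if not 0 <= l <= r < len(prefix) - 1:
        raise IndexError(f"invalid range [{l}, {r}]")
    return prefix[r + 1] - prefix[l]

prefix = build_prefix([3, 1, 4, 1, 5])
print(prefix)                   # [0, 3, 4, 8, 9, 14]
print(range_sum(prefix, 1, 3))  # 6
print(range_sum(prefix, 0, 4))  # 14
```

The explicit bounds check matters in Python because a negative index would otherwise wrap around silently instead of raising an error.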

**⭐ 5. Detailed Example**

In [0]:
arr = [2, 7, 1, 8, 2]

prefix = [0] * (len(arr) + 1)
for i in range(len(arr)):
    prefix[i + 1] = prefix[i] + arr[i]

# sum from index 1 to 3
l, r = 1, 3
range_sum = prefix[r + 1] - prefix[l]
print(range_sum)  # 16  (7 + 1 + 8)

**⭐ 6. Mini Practice Problems**

Given a list of daily sales, compute the total sales between day 5 and day 12 efficiently.

Find the maximum sum of any contiguous subarray using prefix sums.

Preprocess an array so that queries for sum of elements between any two indices are O(1).
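
One possible approach to the second problem, sketched under the assumption that the subarray must be non-empty: the best subarray ending at a given position equals the current prefix sum minus the smallest prefix sum seen before it. The function name `max_subarray_sum` is illustrative.

```python
def max_subarray_sum(values):
    """Largest sum of a non-empty contiguous subarray, using running prefix sums."""
    if not values:
        raise ValueError("values must be non-empty")
    running = 0       # prefix sum up to and including the current element
    min_prefix = 0    # smallest prefix sum seen before the current element
    best = values[0]
    for v in values:
        running += v
        # Best subarray ending here = current prefix - smallest earlier prefix
        best = max(best, running - min_prefix)
        min_prefix = min(min_prefix, running)
    return best

print(max_subarray_sum([2, -7, 1, 8, -2, 3]))  # 10  (1 + 8 - 2 + 3)
```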

**⭐ 7. Full Data Engineering Scenario**

Problem Statement:
You have a streaming transaction log of amounts per customer per day. You want to compute cumulative spending for each customer and support queries like “total spend between day X and Y”.

Expected Output:

Customer cumulative spend array

Quick range-sum queries

In [0]:
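# transactions: list of tuples (customer_id, day, amount)
# Sample data for illustration only; replace with the real transaction log.
from collections import defaultdict

transactions = [("A", 1, 50), ("B", 1, 20), ("A", 3, 30), ("A", 2, 10), ("B", 4, 5)]

# Each customer's list starts with 0 so that prefix[r + 1] - prefix[l] also works for l = 0
prefix_sums = defaultdict(lambda: [0])

# Step 1: Sort by (customer, day), then extend each customer's running total
for customer, day, amount in sorted(transactions, key=lambda x: (x[0], x[1])):
    prefix_sums[customer].append(prefix_sums[customer][-1] + amount)

# Step 2: Answer range queries over a customer's l-th to r-th transaction
# (0-based positions in day order, inclusive; these are positions, not day numbers)
customer, l, r = "A", 0, 1
range_sum = prefix_sums[customer][r + 1] - prefix_sums[customer][l]
print(range_sum)  # 60  (50 + 10)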
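
Step 2 above indexes by position in the customer's sorted list, while the problem statement asks for "total spend between day X and Y". One way to bridge the two, sketched here, is to keep each customer's sorted days next to the prefix sums and binary-search them with `bisect`. The sample log and the helper name `spend_between` are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical sample log: (customer_id, day, amount)
transactions = [("A", 1, 50), ("B", 1, 20), ("A", 3, 30),
                ("A", 2, 10), ("A", 3, 5), ("B", 4, 5)]

days = defaultdict(list)           # sorted transaction days per customer
prefix = defaultdict(lambda: [0])  # prefix sums with a leading 0 per customer

for customer, day, amount in sorted(transactions, key=lambda t: (t[0], t[1])):
    days[customer].append(day)
    prefix[customer].append(prefix[customer][-1] + amount)

def spend_between(customer, day_x, day_y):
    """Total spend for customer on days day_x..day_y inclusive."""
    if customer not in days or day_x > day_y:
        return 0
    lo = bisect_left(days[customer], day_x)   # first transaction on or after day_x
    hi = bisect_right(days[customer], day_y)  # one past the last transaction on or before day_y
    return prefix[customer][hi] - prefix[customer][lo]

print(spend_between("A", 2, 3))  # 45  (10 + 30 + 5)
print(spend_between("B", 2, 3))  # 0   (no transactions in that window)
```

With this layout each query costs O(log n) for the two binary searches rather than O(1). A dense per-day array would restore O(1) lookups at the cost of storing days that have no transactions.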

**⭐ 8. Time & Space Complexity**

Time Complexity:

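O(n) to build prefix sums from data that is already ordered; add O(n log n) if the input must be sorted first, as in the scenario above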

O(1) per range query

Space Complexity: O(n) for the prefix array

**⭐ 9. Common Pitfalls & Mistakes**

❌ Off-by-one errors in indexing (prefix[i+1] vs prefix[i])
❌ Forgetting the extra initial 0, which turns queries starting at index 0 into a special case
❌ Recomputing prefix sums for every query instead of precomputing
✔ Use an array of length n+1 (leading 0) so the inclusive range [l, r] is always prefix[r+1] - prefix[l]
✔ Ensure sorting when prefix sums depend on order (like dates or IDs)
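
The first two pitfalls are easy to reproduce. In the sketch below (variable names are illustrative; `initial=0` assumes Python 3.8 or later), a prefix array without the leading 0 gives a silently wrong answer for a range starting at index 0. Python's negative indexing wraps around to the last element instead of raising an error.

```python
from itertools import accumulate

arr = [3, 1, 4, 1, 5]

# Without a leading 0: no_pad[i] is the sum of arr[0..i]
no_pad = list(accumulate(arr))             # [3, 4, 8, 9, 14]
# With a leading 0: padded[k] is the sum of the first k elements
padded = list(accumulate(arr, initial=0))  # [0, 3, 4, 8, 9, 14]

l, r = 0, 2
expected = sum(arr[l:r + 1])               # 8

buggy = no_pad[r] - no_pad[l - 1]          # no_pad[-1] is the LAST element: 8 - 14
correct = padded[r + 1] - padded[l]        # 8 - 0

print(expected, buggy, correct)            # 8 -6 8
```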