**⭐ 1. What This Pattern Solves**

Splits large datasets into smaller, manageable pieces for processing.

Useful in ETL pipelines where memory or API limits exist.

Enables batch processing, streaming, or parallelization of data.

Avoids holding the entire dataset in memory at once, provided the source is read lazily (for example, a file, cursor, or generator).

**⭐ 2. SQL Equivalent**

In [0]:
%sql
-- Process rows in batches of N using OFFSET + LIMIT
SELECT *
FROM transactions
ORDER BY transaction_id
LIMIT 1000 OFFSET 0;

SELECT *
FROM transactions
ORDER BY transaction_id
LIMIT 1000 OFFSET 1000;
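The two queries above differ only in their OFFSET, so in practice the offset is usually advanced in a loop until a query returns no rows. A minimal sketch in Python, assuming an in-memory SQLite table stands in for `transactions` (the `amount` column, the 2,500-row sample, and the helper name `fetch_batches` are all hypothetical):

```python
import sqlite3


def fetch_batches(conn, batch_size=1000):
    """Yield lists of rows, advancing OFFSET by batch_size until no rows remain."""
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT transaction_id, amount FROM transactions "
            "ORDER BY transaction_id LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += batch_size


# Hypothetical sample data: 2,500 rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 2501)],
)

batch_sizes = [len(rows) for rows in fetch_batches(conn)]
print(batch_sizes)  # [1000, 1000, 500]
```

The `ORDER BY` is what keeps the batches stable: without a deterministic order, consecutive LIMIT/OFFSET queries are not guaranteed to partition the table cleanly.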

**⭐ 3. Core Idea**

Break a large iterable into fixed-size pieces; process each piece independently to save memory and improve efficiency.

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
def chunked(iterable, size):
    """Yield successive chunks of at most `size` items from a sequence (needs len() and slicing)."""
    for i in range(0, len(iterable), size):
        yield iterable[i:i + size]

# Usage
for chunk in chunked(data, 1000):
    process(chunk)
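The template above relies on `len()` and slicing, so it only accepts sequences such as lists, tuples, or strings. For generators, file objects, or database cursors, a variant built on `itertools.islice` is a common alternative. A minimal sketch (the name `chunked_iter` is hypothetical, chosen so it does not clash with `chunked`):

```python
from itertools import islice


def chunked_iter(iterable, size):
    """Yield lists of up to `size` items from any iterable, consuming it lazily."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items; an empty list means the source is exhausted.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Works on a generator, which has no len() and cannot be sliced.
squares = (n * n for n in range(7))
result = list(chunked_iter(squares, 3))
print(result)  # [[0, 1, 4], [9, 16, 25], [36]]
```

On Python 3.12 and later, `itertools.batched` offers similar behavior out of the box, yielding tuples instead of lists.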

**⭐ 5. Detailed Example**

In [0]:
data = list(range(1, 11))  # [1,2,3,...,10]
for chunk in chunked(data, 3):
    print(chunk)

Output:
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
[10]

**⭐ 6. Mini Practice Problems**

Split a list of 50 log entries into chunks of 7.

Process a CSV file in chunks of 500 rows using a generator.

Chunk a string into pieces of 4 characters each.

**⭐ 7. Full Data Engineering Scenario**

Problem: A payments API allows only 100 transactions per request. You need to send 2,350 transactions.

Expected Output: 24 API calls (23 full batches of 100, plus 1 partial batch of 50).

In [0]:
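def send_batches(transactions, batch_size=100):
    """Send transactions in batches of at most batch_size per request."""
    # `api` is assumed to be an already-configured client object exposing send().
    for batch in chunked(transactions, batch_size):
        api.send(batch)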
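The expected call count can be checked without touching a real service. The sketch below swaps the real client for a hypothetical stub, `FakePaymentsAPI`, that only records the size of each batch, and passes the client in as a parameter (an assumption made here so the example is self-contained):

```python
import math


def chunked(iterable, size):
    """Yield successive slices of a sequence, each at most `size` items long."""
    for i in range(0, len(iterable), size):
        yield iterable[i:i + size]


class FakePaymentsAPI:
    """Hypothetical stand-in for the payments client; records each batch size."""

    def __init__(self):
        self.calls = []

    def send(self, batch):
        self.calls.append(len(batch))


def send_batches(transactions, client, batch_size=100):
    for batch in chunked(transactions, batch_size):
        client.send(batch)


transactions = list(range(2350))  # placeholder payloads
client = FakePaymentsAPI()
send_batches(transactions, client)

# ceil(2350 / 100) = 24 requests in total.
expected_calls = math.ceil(len(transactions) / 100)
print(len(client.calls), expected_calls)  # 24 24
print(client.calls.count(100), client.calls[-1])  # 23 50
```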

**⭐ 8. Time & Space Complexity**

Time Complexity: O(n) – every element is visited once.

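Space Complexity: O(k) extra – only one chunk of size k is held at a time, on top of whatever the source itself occupies (a list passed to `chunked` is already fully in memory).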

**⭐ 9. Common Pitfalls & Mistakes**

❌ Loading the entire dataset in memory before chunking.
❌ Mutating items inside a chunk when the original dataset must remain intact (list slices are shallow copies, so the items are still shared with the original).
✔ Use generators to avoid high memory usage.
✔ Handle the last chunk correctly: it may be smaller than the chunk size.
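The in-place modification pitfall is easiest to see with a list of dictionaries. A small sketch, assuming made-up `records` data: slicing a list produces a new list, but the dictionaries inside it are the same objects as in the original.

```python
import copy


def chunked(iterable, size):
    """Yield successive slices of a sequence, each at most `size` items long."""
    for i in range(0, len(iterable), size):
        yield iterable[i:i + size]


# Hypothetical sample records.
records = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]

first_chunk = next(chunked(records, 2))

# Replacing a slot in the chunk leaves the original list untouched...
first_chunk[0] = {"id": 99, "amount": 0}
print(records[0]["id"])  # 1

# ...but mutating a shared dictionary is visible in the original.
first_chunk[1]["amount"] = -1
print(records[1]["amount"])  # -1

# Deep-copying a chunk before editing it keeps the original intact.
safe_chunk = copy.deepcopy(records[:2])
safe_chunk[0]["amount"] = 0
print(records[0]["amount"])  # 10
```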