**⭐ 1. What This Pattern Solves**

Used for:

- Fact ↔ Dimension enrichment
- Lookup tables
- Mapping IDs to objects
- Associating metadata
- Key-based merging in Python
- Simulating SQL INNER JOIN logic
- ETL enrichment in Bronze → Silver steps
- Foreign key lookups in memory
- Log enrichment with user/device/session info

This is one of the most important Python patterns for data engineering (DE) interviews.

**⭐ 2. SQL Equivalent**

In [0]:
%sql
SELECT f.*, d.*
FROM fact f
JOIN dim d
ON f.id = d.id;

**⭐ 3. Core Idea**

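Build a dictionary index from the dimension table, keyed on the join column, so that each fact row finds its match in O(1) time on average instead of scanning the whole dimension table.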

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
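# key is the name of the join column, e.g. "id"
dim_index = {d[key]: d for d in dim}

out = []
for f in fact:
    if f[key] in dim_index:
        # on shared column names, the dim value overwrites the fact value
        out.append({**f, **dim_index[f[key]]})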
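The template can also be wrapped in a small reusable function. The sketch below is one possible shape, not a canonical one: the name `inner_join` and its parameters are hypothetical, and it assumes each key value appears at most once in `dim` (with a repeated key, only the last dimension row would be kept in the index).

```python
def inner_join(fact, dim, key):
    """Return fact rows enriched with the matching dim row (hypothetical helper)."""
    # Index the dimension rows by key; assumes keys are unique within dim.
    dim_index = {d[key]: d for d in dim}
    enriched = []
    for f in fact:
        match = dim_index.get(f[key])
        if match is not None:
            # Dim values win when both rows share a column name.
            enriched.append({**f, **match})
    return enriched


rows = inner_join(
    fact=[{"id": 1, "amt": 100}, {"id": 3, "amt": 300}],
    dim=[{"id": 1, "name": "Alice"}],
    key="id",
)
print(rows)  # [{'id': 1, 'amt': 100, 'name': 'Alice'}]
```

The fact row with id 3 has no match in `dim`, so it is dropped, which is the INNER JOIN behavior.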

**⭐ 5. Detailed Example**

In [0]:
## Dimension table:
dim = [
  {"id":1, "name":"Alice"},
  {"id":2, "name":"Bob"}
]

## Fact table:
fact = [
  {"id":1, "amt":100},
  {"id":2, "amt":200},
  {"id":3, "amt":300}
]

## Apply pattern:
dim_index = {d["id"]: d for d in dim}

out = []
for f in fact:
    if f["id"] in dim_index:
        out.append({**f, **dim_index[f["id"]]})

In [0]:
## Output
[
  {"id":1,"amt":100,"name":"Alice"},
  {"id":2,"amt":200,"name":"Bob"}
]
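One way to sanity-check the output above is to run the same data through a real SQL engine and compare. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table layout and column types are assumptions made for illustration, not part of the original example.

```python
import sqlite3

dim = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
fact = [{"id": 1, "amt": 100}, {"id": 2, "amt": 200}, {"id": 3, "amt": 300}]

# Python side: the dictionary-index join from the example.
dim_index = {d["id"]: d for d in dim}
py_rows = [{**f, **dim_index[f["id"]]} for f in fact if f["id"] in dim_index]

# SQL side: load the same rows into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE fact (id INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO dim VALUES (:id, :name)", dim)
conn.executemany("INSERT INTO fact VALUES (:id, :amt)", fact)
sql_rows = conn.execute(
    "SELECT f.id, f.amt, d.name FROM fact f JOIN dim d ON f.id = d.id ORDER BY f.id"
).fetchall()
conn.close()

# Compare as tuples so the row container type does not matter.
py_tuples = [(r["id"], r["amt"], r["name"]) for r in py_rows]
print(py_tuples == sql_rows)  # True
```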


**⭐ 6. Mini Practice Problems**

Problem 1

Enrich orders with product info.

Problem 2

Enrich logins with user roles.

Problem 3

Match error codes to their descriptions.
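As a possible starting point for Problem 3, the sketch below applies the same pattern to made-up data. The field names (`code`, `description`, `path`) and all values are hypothetical. Problems 1 and 2 follow the same shape with a different key.

```python
# Hypothetical data for Problem 3; the field names are illustrative only.
error_dim = [
    {"code": 404, "description": "Not Found"},
    {"code": 500, "description": "Internal Server Error"},
]
error_log = [
    {"code": 500, "path": "/checkout"},
    {"code": 404, "path": "/missing"},
    {"code": 418, "path": "/teapot"},  # no matching description, so it is dropped
]

# Index the lookup table by error code, then enrich each log row.
code_index = {d["code"]: d for d in error_dim}
matched = [{**e, **code_index[e["code"]]} for e in error_log if e["code"] in code_index]
print(matched)
```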

**⭐ 7. Full Data Engineering Problem**

In [0]:
## users.json, held in memory as users:
users = [ {"uid":1,"country":"US"}, {"uid":2,"country":"CA"} ]
## events.json, held in memory as events:
events = [ {"uid":1,"ev":"open"}, {"uid":3,"ev":"click"}, {"uid":2,"ev":"open"} ]


In [0]:
## Task: Perform an INNER JOIN on uid.
user_index = {u["uid"]: u for u in users}

out = []
for e in events:
    if e["uid"] in user_index:
        out.append({**e, **user_index[e["uid"]]})

**⭐ 8. Time & Space Complexity**

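Let n be the number of dimension rows and m the number of fact rows.

Build index: O(n) time, O(n) extra space

Lookup: O(1) average per fact row

Overall join: O(n + m) time

This is the same idea as a Spark broadcast hash join, which hashes the small table and streams the large one past it, only here it runs in a single Python process.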
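To make the difference concrete, the sketch below counts the work done by a nested-loop join and by the indexed join on the same synthetic data. Both function names and the generated rows are hypothetical. The counters measure comparisons and lookups rather than wall-clock time.

```python
def nested_loop_join(fact, dim, key):
    """Compare every fact row with every dim row: O(n * m) comparisons."""
    out, comparisons = [], 0
    for f in fact:
        for d in dim:
            comparisons += 1
            if f[key] == d[key]:
                out.append({**f, **d})
    return out, comparisons


def indexed_join(fact, dim, key):
    """Build the index once, then do one lookup per fact row: O(n + m) steps."""
    dim_index = {d[key]: d for d in dim}  # n insertions
    out, lookups = [], 0
    for f in fact:
        lookups += 1  # m lookups, each O(1) on average
        if f[key] in dim_index:
            out.append({**f, **dim_index[f[key]]})
    return out, len(dim) + lookups


dim = [{"id": i, "name": f"user{i}"} for i in range(1000)]
fact = [{"id": i % 1500, "amt": i} for i in range(2000)]

slow_rows, slow_steps = nested_loop_join(fact, dim, "id")
fast_rows, fast_steps = indexed_join(fact, dim, "id")
print(slow_rows == fast_rows)  # True
print(slow_steps, fast_steps)  # 2000000 3000
```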

**⭐ 9. Common Pitfalls**

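❌ Using nested loops → O(n × m), which is O(n²) when both tables are the same size
❌ Not building a hash index first
❌ Assuming all keys exist → KeyError on dim_index[f[key]]
❌ Forgetting to merge both dictionaries, so the output row lacks the dimension columns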

✔ Always build dim_index first.
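The KeyError pitfall is easy to reproduce. The sketch below, using made-up rows, first shows the unguarded lookup failing on an unmatched key and then one way to guard it with `dict.get`. A membership test with `in`, as used in the template, works equally well.

```python
dim_index = {1: {"id": 1, "name": "Alice"}}
fact = [{"id": 1, "amt": 100}, {"id": 3, "amt": 300}]

# Unguarded lookup: the row with id 3 has no match and raises KeyError.
try:
    unguarded = [{**f, **dim_index[f["id"]]} for f in fact]
except KeyError as exc:
    print("missing key:", exc)  # missing key: 3

# Guarded lookup: .get() returns None for unmatched rows, which are skipped.
guarded = []
for f in fact:
    match = dim_index.get(f["id"])
    if match is not None:
        guarded.append({**f, **match})
print(guarded)  # [{'id': 1, 'amt': 100, 'name': 'Alice'}]
```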