**⭐ 1. What This Pattern Solves**

Unpivoting (also called melt) transforms columns into rows, the opposite of pivot. This is useful for tidying wide tables or preparing data for analytics.

Use cases:

Transform monthly sales columns (Jan, Feb, Mar) into (month, amount) rows.

Convert survey responses stored as multiple columns into long format.

Prepare wide tables for aggregation, joins, or ML features.

**⭐ 2. SQL Equivalent**

In [0]:
%sql
SELECT name, 'Jan' AS month, Jan AS amount FROM sales
UNION ALL
SELECT name, 'Feb' AS month, Feb AS amount FROM sales
UNION ALL
SELECT name, 'Mar' AS month, Mar AS amount FROM sales;


In [0]:
%sql
SELECT name, month, amount
FROM sales
LATERAL VIEW stack(3, 'Jan', Jan, 'Feb', Feb, 'Mar', Mar) unpivoted AS month, amount

**⭐ 3. Core Idea**

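Convert multiple columns into key-value rows with the stack() generator function, called either directly in Spark SQL or through selectExpr() in PySpark. Each input row becomes one output row per unpivoted column, which simplifies aggregation and visualization.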

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
# Using selectExpr with stack
df.selectExpr(
    "id",
    "stack(3, 'Jan', Jan, 'Feb', Feb, 'Mar', Mar) as (month, amount)"
)
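When the list of columns is long or changes over time, the stack() expression can be generated instead of typed by hand, which keeps the leading count in sync with the number of label/value pairs. The helper below is a minimal sketch; `build_stack_expr` and its parameter names are hypothetical, and it assumes column names contain no single quotes or backticks.

```python
def build_stack_expr(value_cols, key_name="month", value_name="amount"):
    """Return a stack() expression string that unpivots value_cols."""
    if not value_cols:
        raise ValueError("value_cols must contain at least one column")
    # Each column contributes one (label, value) pair: 'Jan', `Jan`
    pairs = ", ".join(f"'{c}', `{c}`" for c in value_cols)
    # The leading count is derived from the list, so it cannot drift.
    return f"stack({len(value_cols)}, {pairs}) as ({key_name}, {value_name})"


expr = build_stack_expr(["Jan", "Feb", "Mar"])
print(expr)
```

The resulting string can be passed straight to the template, for example `df.selectExpr("id", expr)`.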

**⭐ 5. Detailed Example**

In [0]:
data = [("Alice", 100, 150), ("Bob", 200, 50)]
df = spark.createDataFrame(data, ["name", "Jan", "Feb"])
df.show(truncate=False)  # truncate=False left-aligns values, matching the output below

In [0]:
+-----+---+---+
|name |Jan|Feb|
+-----+---+---+
|Alice|100|150|
|Bob  |200|50 |
+-----+---+---+

In [0]:
df_unpivot = df.selectExpr(
    "name",
    "stack(2, 'Jan', Jan, 'Feb', Feb) as (month, amount)"
)
df_unpivot.show(truncate=False)  # truncate=False left-aligns values, matching the output below

In [0]:
+-----+-----+------+
|name |month|amount|
+-----+-----+------+
|Alice|Jan  |100   |
|Alice|Feb  |150   |
|Bob  |Jan  |200   |
|Bob  |Feb  |50    |
+-----+-----+------+
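On PySpark 3.4 or later, the same reshaping should also be possible without a SQL string, using the built-in `DataFrame.unpivot()` method (also available under the alias `melt()`). The sketch below assumes the `df` from the example above; the wrapper name `unpivot_months` is hypothetical and only exists to keep the call reusable.

```python
def unpivot_months(df):
    """Unpivot Jan/Feb into (month, amount) rows using DataFrame.unpivot.

    Assumes PySpark 3.4+, where DataFrame.unpivot is available.
    """
    return df.unpivot(
        ids=["name"],                 # columns kept as-is on every output row
        values=["Jan", "Feb"],        # columns turned into rows
        variableColumnName="month",   # holds the original column name
        valueColumnName="amount",     # holds the original cell value
    )
```

Calling `unpivot_months(df).show(truncate=False)` is expected to produce the same four rows as the stack() version; on older Spark versions, stay with selectExpr() + stack().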


**⭐ 6. Mini Practice Problems**

Unpivot columns Q1, Q2, Q3 into (quarter, value) for each employee.

Melt product_sales columns (online, offline) into (channel, sales).

Unpivot wide survey data with multiple categorical columns into long format.

**⭐ 7. Full Data Engineering Problem**

You have a sales report table with columns store_id, Jan, Feb, Mar.

Task: Convert it to a long table (store_id, month, sales) for aggregation, forecasting, or ML features.

Pattern: selectExpr() + stack() → write to Delta Silver table for further processing.
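A minimal sketch of that pattern follows. The function name `sales_to_long`, the target table name `silver.sales_long`, and the `overwrite` write mode are assumptions made for illustration; adjust them to your own catalog and load strategy.

```python
def sales_to_long(df, target_table="silver.sales_long"):
    """Unpivot Jan/Feb/Mar into (store_id, month, sales) and save as Delta.

    target_table is a hypothetical name; "overwrite" is an assumed write mode.
    """
    long_df = df.selectExpr(
        "store_id",
        "stack(3, 'Jan', Jan, 'Feb', Feb, 'Mar', Mar) as (month, sales)",
    )
    # Persist the long table so downstream jobs can aggregate or forecast on it.
    long_df.write.format("delta").mode("overwrite").saveAsTable(target_table)
    return long_df
```

Returning the long DataFrame makes it easy to chain a follow-up step, such as `sales_to_long(df).groupBy("month").sum("sales")`.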

**⭐ 8. Time & Space Complexity**

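Time: O(n * m) → n rows, m columns being unpivoted; each input row is expanded once per unpivoted column.

Space: O(n * m) → the output has n * m rows, and each row repeats the id columns.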
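The row-count arithmetic can be checked with a plain-Python illustration that mirrors what stack() does, without needing a Spark session. The function `melt_rows` is a hypothetical stand-in, not a Spark API, and the data is the same two-row table used in the detailed example.

```python
def melt_rows(rows, id_col, value_cols):
    """Emit one output row per (input row, value column) pair."""
    return [
        {id_col: row[id_col], "month": col, "amount": row[col]}
        for row in rows
        for col in value_cols
    ]


rows = [
    {"name": "Alice", "Jan": 100, "Feb": 150},
    {"name": "Bob", "Jan": 200, "Feb": 50},
]
long_rows = melt_rows(rows, "name", ["Jan", "Feb"])
print(len(long_rows))  # n = 2 rows, m = 2 columns → 4 output rows
```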

**⭐ 9. Common Pitfalls**

Passing a row count to stack() that does not match the number of label/value pairs → the output shape changes and the query typically fails with an analysis error.

Not aliasing key/value columns → Spark falls back to generic names (col0, col1), which are hard to use downstream.

Unpivoting many columns without first selecting only the columns and rows you need → the output grows to n * m rows, which is memory-heavy.

Confusing pivot vs unpivot → pivot turns rows into columns (long to wide), unpivot turns columns into rows (wide to long); check your analytics goal.