In [None]:
## What is Pivoting?

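Pivoting turns the distinct values of one column into separate columns, much like building a pivot table in Excel. Each cell of the result holds an aggregate (for example, a sum) of the rows that share the same row key and column value.

It is useful when a categorical column has a small set of values that you want to compare side by side.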

## Example Data

| **department** | **year** | **revenue** |
| -------------- | -------- | ----------- |
| Sales          | 2023     | 1000        |
| Sales          | 2024     | 1200        |
| HR             | 2023     | 700         |
| HR             | 2024     | 750         |
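
To make the row-to-column idea concrete before turning to Spark, here is a minimal plain-Python sketch over the table above. The helper name `pivot_sum` is hypothetical and only illustrates the concept; it is not how Spark implements pivoting.

```python
from collections import defaultdict

# Same rows as the example table: (department, year, revenue)
rows = [
    ("Sales", 2023, 1000),
    ("Sales", 2024, 1200),
    ("HR", 2023, 700),
    ("HR", 2024, 750),
]


def pivot_sum(rows):
    """Hypothetical helper: spread `year` into columns, summing `revenue` per cell."""
    table = defaultdict(lambda: defaultdict(int))
    for department, year, revenue in rows:
        # Rows sharing the same (department, year) are combined by addition.
        table[department][year] += revenue
    return {dept: dict(cells) for dept, cells in table.items()}


pivoted = pivot_sum(rows)
print(pivoted)
```

Each department becomes one entry, and each year becomes a key inside it. If two input rows share the same department and year, their revenues are added together, which is why a pivot always needs an aggregate.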


In [1]:
from pyspark.sql import SparkSession
from pyspark.sql.functions import sum  # Spark's column aggregate; shadows Python's built-in sum in this notebook

spark = SparkSession.builder.appName("PivotExample").getOrCreate()



In [2]:
data = [
    ("Sales", 2023, 1000),
    ("Sales", 2024, 1200),
    ("HR", 2023, 700),
    ("HR", 2024, 750)
]

df = spark.createDataFrame(data, ["department", "year", "revenue"])
df.show()

+----------+----+-------+
|department|year|revenue|
+----------+----+-------+
|     Sales|2023|   1000|
|     Sales|2024|   1200|
|        HR|2023|    700|
|        HR|2024|    750|
+----------+----+-------+



In [3]:
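# One output row per department, one output column per distinct year;
# each cell holds the summed revenue for that (department, year) pair.
pivoted_df = (
    df.groupBy("department")
      .pivot("year")
      .agg(sum("revenue"))
)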

pivoted_df.show()

+----------+----+----+
|department|2023|2024|
+----------+----+----+
|     Sales|1000|1200|
|        HR| 700| 750|
+----------+----+----+

