🟦 1. Import Libraries

In [2]:
import pandas as pd

🟦 2. Sample Long-Format DataFrame

In [3]:
df = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Vancouver", "Vancouver", "Montreal", "Montreal"],
    "Month": ["Jan", "Feb", "Jan", "Feb", "Jan", "Feb"],
    "Sales": [120, 150, 90, 100, 80, 110]
})

df

,City,Month,Sales
0,Toronto,Jan,120
1,Toronto,Feb,150
2,Vancouver,Jan,90
3,Vancouver,Feb,100
4,Montreal,Jan,80
5,Montreal,Feb,110


🟦 3. Basic pivot(): Long → Wide

In [4]:
df_wide = df.pivot(index="City", columns="Month", values="Sales")
df_wide

Month,Feb,Jan
City,,
Montreal,110,80
Toronto,150,120
Vancouver,100,90
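
Note that the output lists Feb before Jan: pivot() sorts the new column labels, so calendar order is lost. Below is a minimal sketch of one way to restore it. The `month_order` list is a hypothetical helper written by hand for this example, not something pandas provides.

```python
import pandas as pd

df = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Vancouver", "Vancouver", "Montreal", "Montreal"],
    "Month": ["Jan", "Feb", "Jan", "Feb", "Jan", "Feb"],
    "Sales": [120, 150, 90, 100, 80, 110],
})

df_wide = df.pivot(index="City", columns="Month", values="Sales")

# Hypothetical helper list: the column order we actually want to see.
month_order = ["Jan", "Feb"]

# Selecting with a list of labels returns the columns in that order.
df_wide_ordered = df_wide[month_order]
print(df_wide_ordered)
```

For a full year of data, the same idea applies with a twelve-item list; any month missing from the data would raise a KeyError, which may or may not be the behaviour you want.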


🟦 4. Handling Duplicates (Why pivot() Can Fail)

In [6]:
df_dup = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Toronto"],
    "Month": ["Jan", "Jan", "Feb"],
    "Sales": [120, 125, 150]
})

# ❌ This raises a ValueError: the (City, Month) pair ("Toronto", "Jan") appears twice
df_dup.pivot(index="City", columns="Month", values="Sales")

ValueError: Index contains duplicate entries, cannot reshape

In [8]:
# ✔ To handle duplicates, use pivot_table(), which aggregates them (here: the mean of 120 and 125)
pd.pivot_table(df_dup, index="City", columns="Month", values="Sales", aggfunc="mean")


Month,Feb,Jan
City,,
Toronto,150.0,122.5
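
Before reaching for pivot_table(), it can help to see which rows are actually duplicated, since averaging is not always the right fix. The sketch below flags the offending rows with `DataFrame.duplicated` and then picks an aggregation explicitly. Using `"sum"` here is an assumption for illustration; the right aggregation depends on what the duplicate rows mean in your data.

```python
import pandas as pd

df_dup = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Toronto"],
    "Month": ["Jan", "Jan", "Feb"],
    "Sales": [120, 125, 150],
})

# Flag every row whose (City, Month) pair occurs more than once.
# keep=False marks all members of a duplicate group, not just the later ones.
dup_mask = df_dup.duplicated(subset=["City", "Month"], keep=False)
print(df_dup[dup_mask])

if dup_mask.any():
    # Duplicates present: aggregate explicitly (sum chosen only as an example).
    wide = df_dup.pivot_table(index="City", columns="Month", values="Sales", aggfunc="sum")
else:
    # No duplicates: plain pivot() is enough and performs no aggregation.
    wide = df_dup.pivot(index="City", columns="Month", values="Sales")

print(wide)
```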


🟦 5. Multiple Value Columns

In [9]:
df2 = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Vancouver", "Vancouver"],
    "Month": ["Jan", "Feb", "Jan", "Feb"],
    "Sales": [120, 150, 90, 100],
    "Units": [10, 12, 8, 9]
})

df2.pivot(index="City", columns="Month")

,Sales,Sales,Units,Units
Month,Feb,Jan,Feb,Jan
City,,,,
Toronto,150,120,12,10
Vancouver,100,90,9,8
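
When `values` is omitted, the result has two-level (MultiIndex) columns: the value name on top and the month underneath. A short sketch of how such a frame can be queried, using the same data as above (the variable names `sales_only` and `jan_only` are chosen for this example):

```python
import pandas as pd

df2 = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Vancouver", "Vancouver"],
    "Month": ["Jan", "Feb", "Jan", "Feb"],
    "Sales": [120, 150, 90, 100],
    "Units": [10, 12, 8, 9],
})

wide = df2.pivot(index="City", columns="Month")

# Top-level label: returns a regular DataFrame with one column per month.
sales_only = wide["Sales"]

# Full (value, month) tuple: returns a single Series.
units_jan = wide[("Units", "Jan")]

# Cross-section on the lower level: all value columns for January.
jan_only = wide.xs("Jan", axis=1, level="Month")

print(sales_only)
print(units_jan)
print(jan_only)
```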


🟦 6. Resetting Column Names

In [10]:
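# Pivot without `values` so Sales and Units both become column groups (MultiIndex columns)
df_flat = df2.pivot(index="City", columns="Month")
# Join each (value, month) tuple into one label, e.g. ("Sales", "Feb") -> "Sales_Feb"
df_flat.columns = ['_'.join(col) for col in df_flat.columns]
# Move City from the index back into a regular column
df_flat = df_flat.reset_index()
df_flat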

,City,Sales_Feb,Sales_Jan,Units_Feb,Units_Jan
0,Toronto,150,120,12,10
1,Vancouver,100,90,9,8
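
The join above only works when every column label is a string; `'_'.join` raises a TypeError on labels such as integers. The sketch below wraps the same idea in a small reusable function. `flatten_columns` is a hypothetical helper name, not a pandas function, and it returns a new frame instead of modifying its input.

```python
import pandas as pd


def flatten_columns(frame: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Collapse MultiIndex columns into single strings and reset the index."""
    out = frame.copy()
    # Each column of a MultiIndex is a tuple; str() guards against non-string labels.
    out.columns = [sep.join(str(level) for level in col) for col in out.columns]
    return out.reset_index()


df2 = pd.DataFrame({
    "City": ["Toronto", "Toronto", "Vancouver", "Vancouver"],
    "Month": ["Jan", "Feb", "Jan", "Feb"],
    "Sales": [120, 150, 90, 100],
    "Units": [10, 12, 8, 9],
})

flat = flatten_columns(df2.pivot(index="City", columns="Month"))
print(flat)
```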


In [11]:
ridership = pd.DataFrame({
    "Route": [10, 10, 20, 20],
    "Day": ["Weekday", "Weekend", "Weekday", "Weekend"],
    "Ridership": [5000, 3000, 4500, 2500]
})

ridership_wide = ridership.pivot(index="Route", columns="Day", values="Ridership")
ridership_wide

Day,Weekday,Weekend
Route,,
10,5000,3000
20,4500,2500


🟦 7. Flattening Columns After Pivot

In [12]:
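# Drop the "Day" label that pivot() leaves on the column axis
ridership_wide.columns.name = None
# Move Route from the index back into a regular column
ridership_wide.reset_index(inplace=True)
ridership_wide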

,Route,Weekday,Weekend
0,10,5000,3000
1,20,4500,2500
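
The same cleanup can also be written as one method chain, which avoids mutating an intermediate frame. This is a sketch of an alternative style, not a replacement for the cell above; `rename_axis(None, axis="columns")` clears the column-axis name just as assigning `columns.name = None` does.

```python
import pandas as pd

ridership = pd.DataFrame({
    "Route": [10, 10, 20, 20],
    "Day": ["Weekday", "Weekend", "Weekday", "Weekend"],
    "Ridership": [5000, 3000, 4500, 2500],
})

ridership_flat = (
    ridership
    .pivot(index="Route", columns="Day", values="Ridership")
    .rename_axis(None, axis="columns")  # remove the "Day" label from the column axis
    .reset_index()                      # turn Route back into an ordinary column
)
print(ridership_flat)
```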


## 🟦 Summary

You learned how to:

✔ Convert long → wide format using pivot()

✔ Understand when to use pivot() vs pivot_table()

✔ Handle duplicates safely using pivot_table()

✔ Pivot multiple value columns and flatten the result