In [2]:
import pandas as pd

# Reshaping Data

In [3]:
# Sample DataFrame
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara"],
    "Math": [85, 90, 88],
    "Science": [78, 92, 91],
    "English": [80, 86, 89]
})

df

| | Name | Math | Science | English |
|---|---|---|---|---|
| 0 | Onkar | 85 | 78 | 80 |
| 1 | Amit | 90 | 92 | 86 |
| 2 | Sara | 88 | 91 | 89 |


## 1. `.melt()` - Wide -> Long format

`melt()` unpivots a wide DataFrame: the selected columns are turned into rows, giving one row per (identifier, column) pair.

In [10]:
pd.melt(
    df,
    id_vars=["Name"],
    var_name="Subject",
    value_name="Marks"
)

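| | Name | Subject | Marks |
|---|---|---|---|
| 0 | Onkar | Math | 85 |
| 1 | Amit | Math | 90 |
| 2 | Sara | Math | 88 |
| 3 | Onkar | Science | 78 |
| 4 | Amit | Science | 92 |
| 5 | Sara | Science | 91 |
| 6 | Onkar | English | 80 |
| 7 | Amit | English | 86 |
| 8 | Sara | English | 89 |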


**Parameters:**
1. `frame` -> The DataFrame to melt.
2. `id_vars` -> Columns to keep as they are (identifier columns).
3. `value_vars` -> Columns to "unpivot", i.e. turn into rows. If not specified, all columns except `id_vars` are used.
4. `var_name` -> Name of the new column that holds the old column names.
5. `value_name` -> Name of the new column that holds the values from the melted columns.
6. `ignore_index` -> Decides whether to keep the original row labels or start over (`True`: fresh index (default), `False`: original row index).
7. `col_level` -> Only used when the DataFrame has MultiIndex columns; selects which level to melt.
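
The example above does not use `value_vars` or `ignore_index`. As a minimal sketch, the snippet below rebuilds the same sample `df`, melts only two of the subjects, and keeps the original row labels. The variable name `partial` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara"],
    "Math": [85, 90, 88],
    "Science": [78, 92, 91],
    "English": [80, 86, 89],
})

# Melt only Math and Science; English does not appear in the result.
# ignore_index=False keeps the original row labels (0, 1, 2),
# so each label repeats once per melted column.
partial = pd.melt(
    df,
    id_vars=["Name"],
    value_vars=["Math", "Science"],
    var_name="Subject",
    value_name="Marks",
    ignore_index=False,
)

print(partial)
```

With `ignore_index=False` the index is no longer unique, which can be handy for tracing a long row back to its original wide row.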

## 2. Why `melt()` is important

Many libraries, such as Seaborn and most statistics and machine-learning tools, expect data in long format. With long data it is easy to:  
- Plot subject-wise marks
- Group by subject
- Train ML models

In [11]:
pd.melt(
    df,
    id_vars=["Name"],
    var_name="Subject",
    value_name="Marks"
).groupby("Subject")["Marks"].mean()

Subject
English    85.000000
Math       87.666667
Science    87.000000
Name: Marks, dtype: float64
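
Once the data is long, several statistics per subject can be computed in one call. The following is a small sketch under the same sample data; the names `long_df` and `summary` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara"],
    "Math": [85, 90, 88],
    "Science": [78, 92, 91],
    "English": [80, 86, 89],
})

# Wide -> long, same call as in the cells above
long_df = pd.melt(df, id_vars=["Name"], var_name="Subject", value_name="Marks")

# One row per subject, one column per statistic
summary = long_df.groupby("Subject")["Marks"].agg(["mean", "min", "max"])

print(summary)
```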

## 3. `stack()` - Columns -> Rows (Index-based)

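`stack()` moves the column labels into the innermost level of the row index, so the identifier column is set as the index first. For a DataFrame with single-level columns, the result is a Series with a MultiIndex.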

In [25]:
indexed = df.set_index("Name")
indexed

| Name | Math | Science | English |
|---|---|---|---|
| Onkar | 85 | 78 | 80 |
| Amit | 90 | 92 | 86 |
| Sara | 88 | 91 | 89 |


In [22]:
indexed.stack()

Name          
Onkar  Math       85
       Science    78
       English    80
Amit   Math       90
       Science    92
       English    86
Sara   Math       88
       Science    91
       English    89
dtype: int64

In [23]:
indexed.stack().reset_index(name="Marks")

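| | Name | level_1 | Marks |
|---|---|---|---|
| 0 | Onkar | Math | 85 |
| 1 | Onkar | Science | 78 |
| 2 | Onkar | English | 80 |
| 3 | Amit | Math | 90 |
| 4 | Amit | Science | 92 |
| 5 | Amit | English | 86 |
| 6 | Sara | Math | 88 |
| 7 | Sara | Science | 91 |
| 8 | Sara | English | 89 |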
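
The `level_1` column in the output above appears because the stacked index level has no name. One possible way to get a readable column name, sketched here under the same sample data, is to name the column axis with `rename_axis()` before stacking. The name `"Subject"` and the variable `long_df` are choices made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara"],
    "Math": [85, 90, 88],
    "Science": [78, 92, 91],
    "English": [80, 86, 89],
})

indexed = df.set_index("Name")

long_df = (
    indexed
    .rename_axis(columns="Subject")  # name the column axis before stacking
    .stack()                         # Series indexed by (Name, Subject)
    .reset_index(name="Marks")       # back to a flat DataFrame
)

print(long_df)
```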


## 4. `unstack()` - Rows -> Columns (reverse of stack)

`unstack()` moves the innermost index level back into columns, which restores the original DataFrame here.

In [24]:
stacked = indexed.stack()
stacked.unstack()

| Name | Math | Science | English |
|---|---|---|---|
| Onkar | 85 | 78 | 80 |
| Amit | 90 | 92 | 86 |
| Sara | 88 | 91 | 89 |
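
By default `unstack()` pivots the innermost index level, but its `level` argument can select another one. As a hedged sketch on the same sample data, unstacking level 0 (the names) puts subjects on the rows and students on the columns. The variable names `roundtrip` and `by_subject` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara"],
    "Math": [85, 90, 88],
    "Science": [78, 92, 91],
    "English": [80, 86, 89],
})

stacked = df.set_index("Name").stack()

# Default: the innermost level (subjects) goes back to columns
roundtrip = stacked.unstack()

# level=0: the outer level (names) becomes the columns instead
by_subject = stacked.unstack(level=0)

print(by_subject)
```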


# Summary

1. `melt()` -> Converts columns into rows (wide -> long).  
   id_vars - Columns to keep as they are.  
   value_vars - Columns to unpivot. If not specified, all columns not listed in id_vars.  
   var_name - Name of the new column that holds the old column names.  
   value_name - Name of the new column that holds the values.  
   ignore_index - True: fresh index (default), False: original index.  
2. `stack()` -> Leaves the index as it is and moves all columns into rows, as a new innermost index level.  
   For single-level columns it returns a Series, which we have to convert back to a DataFrame with `reset_index()`.  
3. `unstack()` -> Converts the stacked Series back into the original DataFrame.  