# Reshaping Data using `melt()` and `pivot()` in Pandas

Once your data is clean, the next step is often to **reshape or reformat** it for analysis or visualization.  
Pandas provides two key methods for this: **`melt()`** and **`pivot()`**.

---

## 🔄 `melt()`: Wide to Long

The **`melt()`** method unpivots a DataFrame from **wide format to long format**.  
In other words, it converts multiple columns into **key-value pairs**.

### 🧠 When to Use `melt()`

Use `melt()` when:
- Each **row** holds one entity (for example, one student).
- Several **columns** hold the same kind of measurement (for example, one score column per subject).
- You want to **reshape** the data into a longer, "tidy" format for analysis or plotting.

---

### 🧩 Syntax

```python
df.melt(
    id_vars=None,
    value_vars=None,
    var_name=None,
    value_name="value",
    col_level=None
)
```

### 📘 Parameters

| Parameter | Description |
|------------|--------------|
| `id_vars` | Columns to keep fixed (identifiers). |
| `value_vars` | Columns to unpivot (to be melted). |
| `var_name` | Name for the new variable column. Default = `'variable'`. |
| `value_name` | Name for the new value column. Default = `'value'`. |
| `col_level` | For `MultiIndex` columns, the level to melt. |
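
To see the defaults from the table in action, here is a minimal sketch. The `scores` frame and its values are made up for illustration. With only `id_vars` given, every other column is melted and the new columns get the default names.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration
scores = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Math": [85, 78],
    "Science": [90, 82],
})

# Only id_vars is given: every remaining column is melted,
# and the new columns get the default names.
long_default = scores.melt(id_vars="Name")
print(long_default.columns.tolist())  # ['Name', 'variable', 'value']

# value_vars restricts the melt to the listed columns; the others are left out.
math_only = scores.melt(id_vars="Name", value_vars=["Math"])
print(len(math_only))  # 2
```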

---

### 📊 Example: Using `melt()`

```python
import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 78, 92],
    'Science': [90, 82, 89],
    'English': [88, 85, 94]
}

df = pd.DataFrame(data)
print(df)
```

**Output (Wide Format):**
```
      Name  Math  Science  English
0    Alice    85       90       88
1      Bob    78       82       85
2  Charlie    92       89       94
```

Now, use `melt()` to reshape this into a **long format**:

```python
df_long = df.melt(
    id_vars=["Name"],
    value_vars=["Math", "Science", "English"],
    var_name="Subject",
    value_name="Score"
)
print(df_long)
```

**Output (Long Format):**
```
      Name  Subject  Score
0    Alice     Math     85
1      Bob     Math     78
2  Charlie     Math     92
3    Alice  Science     90
4      Bob  Science     82
5  Charlie  Science     89
6    Alice  English     88
7      Bob  English     85
8  Charlie  English     94
```

### ✅ Explanation

- `id_vars=["Name"]` → Keep "Name" fixed as the identifier.
- `value_vars=["Math", "Science", "English"]` → Columns to melt.
- `var_name="Subject"` → New column for variable names.
- `value_name="Score"` → New column for values.

### 💡 Why Use `melt()`

- **Data normalization**: prepares data for modeling or visualization.
- **Visualization-friendly**: plotting libraries (like Seaborn) prefer long-form data.
- **Tidy data principle**: each variable forms a column, each observation a row.
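
As a small illustration of why long-form data is convenient, the sketch below computes the average score per subject with a single `groupby`. It reuses the sample scores from the example above; the names `wide`, `long`, and `mean_by_subject` are chosen here for illustration.

```python
import pandas as pd

wide = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Math": [85, 78, 92],
    "Science": [90, 82, 89],
    "English": [88, 85, 94],
})
long = wide.melt(id_vars="Name", var_name="Subject", value_name="Score")

# In long form, one groupby covers every subject,
# no matter how many subject columns the wide frame had.
mean_by_subject = long.groupby("Subject")["Score"].mean()
print(mean_by_subject)

# Filtering to one subject is a plain row filter.
math_rows = long[long["Subject"] == "Math"]
print(math_rows)
```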

---

## 🔁 `pivot()`: Long to Wide

The **`pivot()`** method does the **reverse** of `melt()`: it converts long-format data **back to wide-format**.

### 🧩 Syntax

```python
df.pivot(index=None, columns=None, values=None)
```

### 📘 Parameters

| Parameter | Description |
|------------|--------------|
| `index` | Column whose unique values become rows. |
| `columns` | Column whose unique values become columns. |
| `values` | Column whose values fill the new DataFrame. |

---

### 📊 Example: Using `pivot()`

Suppose we have this **long-format DataFrame**:

```python
data = {
    "Name": ["Alice", "Alice", "Alice", "Bob", "Bob", "Bob", "Charlie", "Charlie", "Charlie"],
    "Subject": ["Math", "Science", "English", "Math", "Science", "English", "Math", "Science", "English"],
    "Score": [85, 90, 88, 78, 82, 85, 92, 89, 94]
}
df_long = pd.DataFrame(data)
print(df_long)
```

**Output (Long Format):**
```
      Name  Subject  Score
0    Alice     Math     85
1    Alice  Science     90
2    Alice  English     88
3      Bob     Math     78
4      Bob  Science     82
5      Bob  English     85
6  Charlie     Math     92
7  Charlie  Science     89
8  Charlie  English     94
```

Now convert it to **wide format** using `pivot()`:

```python
df_wide = df_long.pivot(index="Name", columns="Subject", values="Score")
print(df_wide)
```

**Output (Wide Format):**
```
Subject   English  Math  Science
Name                           
Alice         88    85       90
Bob           85    78       82
Charlie       94    92       89
```

### ✅ Explanation

- `index="Name"` → Becomes the new row labels.
- `columns="Subject"` → Unique subjects become column headers.
- `values="Score"` → Fill values into the corresponding cells.
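
The structure of the pivoted result can be inspected directly. The following is a minimal sketch on a shortened version of the long-format data (two students only); `df_ordered` is a name introduced here for illustration.

```python
import pandas as pd

df_long = pd.DataFrame({
    "Name": ["Alice", "Alice", "Alice", "Bob", "Bob", "Bob"],
    "Subject": ["Math", "Science", "English"] * 2,
    "Score": [85, 90, 88, 78, 82, 85],
})
df_wide = df_long.pivot(index="Name", columns="Subject", values="Score")

# The former column names survive as axis labels.
print(df_wide.index.name)    # Name
print(df_wide.columns.name)  # Subject

# Cells are addressed by row label and column label.
print(df_wide.loc["Bob", "Science"])  # 82

# pivot() returns the new columns in sorted order;
# select them explicitly to get the order you want.
df_ordered = df_wide[["Math", "Science", "English"]]
print(df_ordered.columns.tolist())  # ['Math', 'Science', 'English']
```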

---

## ⚙️ Handling Duplicates with `pivot_table()`

If your data has **duplicate entries** for the same combination of `index` and `columns`,  
`pivot()` raises a `ValueError` because it cannot decide which value belongs in the cell.  
In that case, use **`pivot_table()`** with an aggregation function.
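
Before switching methods, it can help to see the failure and locate the offending rows. This is a minimal sketch on a made-up frame (`dupes` and its values are illustrative); it uses `DataFrame.duplicated()` to flag repeated `(Name, Subject)` pairs and catches the error that `pivot()` raises.

```python
import pandas as pd

# Hypothetical data: Alice has two Math scores
dupes = pd.DataFrame({
    "Name": ["Alice", "Alice", "Bob"],
    "Subject": ["Math", "Math", "Math"],
    "Score": [85, 80, 78],
})

# Flag every row whose (Name, Subject) pair appears more than once.
is_dup = dupes.duplicated(subset=["Name", "Subject"], keep=False)
print(dupes[is_dup])

message = None
try:
    dupes.pivot(index="Name", columns="Subject", values="Score")
except ValueError as err:
    message = str(err)
    print("pivot() failed:", message)
```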

### Example:

```python
data = {
    "Name": ["Alice", "Alice", "Alice", "Bob", "Bob"],
    "Subject": ["Math", "Math", "Science", "Math", "Science"],
    "Score": [85, 80, 90, 78, 82]
}

df = pd.DataFrame(data)
df_table = df.pivot_table(index="Name", columns="Subject", values="Score", aggfunc="mean")
print(df_table)
```

**Output:**
```
Subject   Math  Science
Name                    
Alice     82.5     90.0
Bob       78.0     82.0
```

Here, Alice's two Math scores are averaged: (85 + 80) / 2 = 82.5.
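
The mean is only one choice of aggregation. As a sketch on the same data, `aggfunc="max"` keeps the higher of the duplicate scores and `aggfunc="count"` shows how many rows fed each cell; the names `best` and `counts` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Alice", "Alice", "Bob", "Bob"],
    "Subject": ["Math", "Math", "Science", "Math", "Science"],
    "Score": [85, 80, 90, 78, 82],
})

# Keep the highest score when a (Name, Subject) pair repeats.
best = df.pivot_table(index="Name", columns="Subject", values="Score", aggfunc="max")
print(best)

# Count how many rows were combined into each cell.
counts = df.pivot_table(index="Name", columns="Subject", values="Score", aggfunc="count")
print(counts)
```

Checking the counts first is a quick way to confirm which cells were actually aggregated.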

---

## ✅ Summary

| Method | Direction | Description |
|---------|-------------|-------------|
| `melt()` | Wide → Long | Converts columns into rows |
| `pivot()` | Long → Wide | Converts rows into columns |
| `pivot_table()` | Long → Wide | Handles duplicates by aggregating values |

---

### 💡 Key Takeaways

- Use **`melt()`** to go *long* → normalize or tidy your data.
- Use **`pivot()`** to go *wide* → reformat for reports or visualization.
- Use **`pivot_table()`** when you have duplicates that need aggregation (like mean or sum).

---


```python
import pandas as pd
import numpy as np

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 78, 92],
    'Science': [90, 82, 89],
    'English': [88, 85, 94]
}

df = pd.DataFrame(data)

# Display the DataFrame
print(df)
```

```
      Name  Math  Science  English
0    Alice    85       90       88
1      Bob    78       82       85
2  Charlie    92       89       94
```


```python
# With value_vars omitted, every column except "Name" is melted
melt_df = pd.melt(df, id_vars="Name", var_name="Subject", value_name="Marks")
melt_df
```

```
      Name  Subject  Marks
0    Alice     Math     85
1      Bob     Math     78
2  Charlie     Math     92
3    Alice  Science     90
4      Bob  Science     82
5  Charlie  Science     89
6    Alice  English     88
7      Bob  English     85
8  Charlie  English     94
```


```python
df2 = df.melt(
    id_vars='Name',      # column that stays fixed
    var_name='Sub',      # name of the column that holds the melted column labels
    value_name='Marks',  # name of the column that holds the values
)
df2
```

```
      Name      Sub  Marks
0    Alice     Math     85
1      Bob     Math     78
2  Charlie     Math     92
3    Alice  Science     90
4      Bob  Science     82
5  Charlie  Science     89
6    Alice  English     88
7      Bob  English     85
8  Charlie  English     94
```

### `pivot()`

```python
df2.pivot(index='Name', columns='Sub', values='Marks')
```

```
Sub      English  Math  Science
Name
Alice         88    85       90
Bob           85    78       82
Charlie       94    92       89
```


Note: `pivot()` raises a `ValueError` when it finds duplicate `index`/`columns` combinations.

### Practice

```python
data = {
    "Name": ["John", "Sara", "Mike"],
    "Physics": [70, 85, 90],
    "Chemistry": [75, 88, 80],
    "Biology": [72, 91, 78]
}

df3 = pd.DataFrame(data)
df3
```

```
   Name  Physics  Chemistry  Biology
0  John       70         75       72
1  Sara       85         88       91
2  Mike       90         80       78
```


```python
df4 = df3.melt(id_vars='Name', var_name='Sub', value_name='Marks')
df4.pivot(index='Name', columns='Sub', values='Marks')
```

```
Sub   Biology  Chemistry  Physics
Name
John       72         75       70
Mike       78         80       90
Sara       91         88       85
```


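### For `melt()`
- `id_vars` = the column(s) that stay fixed
- `var_name` = name of the new column that holds the melted column labels
- `value_name` = name of the new column that holds the values

`df.melt(id_vars='Name', var_name='Sub', value_name='Marks')`

### For `pivot()`
- `index` = the column that stays fixed (the `id_vars` used in `melt()`)
- `columns` = the column created by `var_name`
- `values` = the column created by `value_name`

`df.pivot(index='Name', columns='Sub', values='Marks')`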
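
The mapping between the two sets of arguments can be checked with a round trip. The sketch below melts the sample frame and pivots it back; the `restored` name and the clean-up steps (`reset_index`, `rename_axis`, `reindex`) are one possible way to get back to the original layout, not the only one.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Math": [85, 78, 92],
    "Science": [90, 82, 89],
    "English": [88, 85, 94],
})

# Wide -> long: id_vars / var_name / value_name
long = df.melt(id_vars="Name", var_name="Sub", value_name="Marks")

# Long -> wide: the same three names go into index / columns / values
restored = (
    long.pivot(index="Name", columns="Sub", values="Marks")
    .reset_index()                # turn the "Name" index back into a column
    .rename_axis(None, axis=1)    # drop the "Sub" label left on the column axis
    .reindex(columns=df.columns)  # put the columns back in their original order
)
print(restored)
```

Note that `pivot()` sorts the row labels, so the round trip only reproduces the original row order when the identifier column was already sorted, as it is here.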

#### Practice

```python
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Math": [85, 78, 92],
    "Science": [90, 82, 89],
    "English": [88, 85, 94]
}

df2 = pd.DataFrame(data)
df2
```

```
      Name  Math  Science  English
0    Alice    85       90       88
1      Bob    78       82       85
2  Charlie    92       89       94
```


```python
df3 = df2.melt(id_vars='Name', var_name='Sub', value_name='Marks')
df3.pivot_table(index='Name', columns='Sub', values='Marks', aggfunc='mean')
```

```
Sub      English  Math  Science
Name
Alice       88.0  85.0     90.0
Bob         85.0  78.0     82.0
Charlie     94.0  92.0     89.0
```


```python
# Q4. Using pivot_table(), calculate the maximum score per subject and display subjects as rows.

df3.pivot_table(index="Sub", values="Marks", aggfunc="max")
```

```
         Marks
Sub
English     94
Math        92
Science     90
```
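
When `pivot_table()` is called without `columns`, it behaves much like a grouped aggregation. As a sketch, the same per-subject maximum can be written with `groupby`; the frame below rebuilds the melted `df3` by hand so the block runs on its own, and `via_pivot` / `via_groupby` are names introduced for illustration.

```python
import pandas as pd

# Hand-built copy of the melted frame used in Q4
df3 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"] * 3,
    "Sub": ["Math"] * 3 + ["Science"] * 3 + ["English"] * 3,
    "Marks": [85, 78, 92, 90, 82, 89, 88, 85, 94],
})

via_pivot = df3.pivot_table(index="Sub", values="Marks", aggfunc="max")
via_groupby = df3.groupby("Sub")[["Marks"]].max()

print(via_pivot)
print(via_groupby)
```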


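```python
# Q5. After melting the data in Q1, sort the result by Score in descending order.
# kind='stable' keeps tied rows (the two 85s) in their original order.
df4 = df3.sort_values('Marks', ascending=False, kind='stable')
df4
```

```
      Name      Sub  Marks
8  Charlie  English     94
2  Charlie     Math     92
3    Alice  Science     90
5  Charlie  Science     89
6    Alice  English     88
0    Alice     Math     85
7      Bob  English     85
4      Bob  Science     82
1      Bob     Math     78
```

Calling `df4.sort_index()` afterwards restores the original row order, so display `df4` itself to see the sorted result.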
