# RESHAPING DATA USING MELT AND PIVOT

## Melt and Pivot
## melt() — Wide to Long
The melt() method in Pandas is used to unpivot a DataFrame from wide format to long format. In other words, it takes columns that represent different variables and combines them into key-value pairs (i.e., long-form data).

## When to Use melt():
- When you have a DataFrame where each row is an observation, and each column represents a different variable or measurement, and you want to reshape the data into a longer format for easier analysis or visualization.

In [3]:
import pandas as pd 
data = {
    'Name':['anshika','sanya','manshi','avani','aliza'],
    'Math':[99,99,89,98,95],
    'Science':[89,78,45,77,88],
    'English':[67,66,56,43,45]
}
df = pd.DataFrame(data)
print(df)

      Name  Math  Science  English
0  anshika    99       89       67
1    sanya    99       78       66
2   manshi    89       45       56
3    avani    98       77       43
4    aliza    95       88       45


## Parameters:
- id_vars: The columns that you want to keep fixed (these columns will remain as identifiers).
- value_vars: The columns you want to unpivot (the ones you want to "melt" into a single column).
- var_name: The name to use for the new column that will contain the names of the melted columns (default is 'variable').
- value_name: The name to use for the new column that will contain the values from the melted columns (default is 'value').
- col_level: If the columns are a MultiIndex, the level to use for melting.
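The effect of the defaults can be sketched on a small frame. The two-student `scores` frame below is a hypothetical cut-down version of the data above, used only to keep the output short.

```python
import pandas as pd

# Hypothetical two-student frame, used only to illustrate the parameters.
scores = pd.DataFrame({
    "Name": ["anshika", "sanya"],
    "Math": [99, 99],
    "Science": [89, 78],
})

# Only id_vars is given: every other column is melted, and the new
# columns get the default names 'variable' and 'value'.
long_default = scores.melt(id_vars=["Name"])
print(long_default)

# value_vars restricts which columns are melted (Science is left out here),
# while var_name and value_name rename the two new columns.
math_only = scores.melt(
    id_vars=["Name"],
    value_vars=["Math"],
    var_name="Subject",
    value_name="Score",
)
print(math_only)
```

Leaving out value_vars is convenient when every non-identifier column should be melted. Listing the columns explicitly is safer when the frame may gain unrelated columns later.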

## Using melt():
If we want to "melt" the DataFrame so that each row represents a student-subject pair, we can do:

In [6]:
# var_name must be a single string, not a list, for ordinary (non-MultiIndex) columns
df2 = df.melt(id_vars=['Name'], value_vars=['Math', 'Science', 'English'], var_name='Subjects', value_name='Score')
df2

       Name Subjects  Score
0   anshika     Math     99
1     sanya     Math     99
2    manshi     Math     89
3     avani     Math     98
4     aliza     Math     95
5   anshika  Science     89
6     sanya  Science     78
7    manshi  Science     45
8     avani  Science     77
9     aliza  Science     88
10  anshika  English     67
11    sanya  English     66
12   manshi  English     56
13    avani  English     43
14    aliza  English     45


### Explanation:
- id_vars=["Name"]: We keep the "Name" column as it is because it's the identifier.
- value_vars=["Math", "Science", "English"]: These are the columns we want to melt.
- var_name="Subjects": The new column containing the names of the subjects.
- value_name="Score": The new column containing the scores.

## Why Use melt()?
- Data normalization: Long format stores one observation per row, which helps when preparing data for statistical modeling and visualization.
- Tool compatibility: Many plotting functions and statistical models work better with long-format data.

In short, melt() converts columns into rows, which suits plotting and tidy data formats.
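As a rough illustration of why long format is convenient, the sketch below rebuilds the student frame from earlier and summarizes it with a single groupby. The names `long_df`, `subject_means`, and `low_scores` are introduced here for illustration only.

```python
import pandas as pd

# Same student data as in the example above.
df = pd.DataFrame({
    "Name": ["anshika", "sanya", "manshi", "avani", "aliza"],
    "Math": [99, 99, 89, 98, 95],
    "Science": [89, 78, 45, 77, 88],
    "English": [67, 66, 56, 43, 45],
})
long_df = df.melt(id_vars=["Name"], var_name="Subjects", value_name="Score")

# One groupby covers every subject, however many subject columns
# the wide table had.
subject_means = long_df.groupby("Subjects")["Score"].mean()
print(subject_means)

# Filtering across all subjects is an ordinary row selection.
low_scores = long_df[long_df["Score"] < 50]
print(low_scores)
```

In wide format, the same filter would need one condition per subject column.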

## pivot() — Long to Wide
The pivot() function in Pandas is used to reshape data, specifically to turn long-format data into wide-format data. This is the reverse operation of melt().

## How it works:
- pivot() takes a long-format DataFrame and turns it into a wide-format DataFrame by specifying which columns will become the new columns, the rows, and the values.

### Parameters:
- index: The column whose unique values will become the rows of the new DataFrame.
- columns: The column whose unique values will become the columns of the new DataFrame.
- values: The column whose values will fill the new DataFrame. These will become the actual data (values in the table).
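The three parameters can be sketched on a tiny long-format frame. `long_scores` below is a hypothetical example, not part of the notebook's data flow.

```python
import pandas as pd

# Hypothetical long-format frame, used only for illustration.
long_scores = pd.DataFrame({
    "Name": ["anshika", "anshika", "sanya", "sanya"],
    "Subjects": ["Math", "Science", "Math", "Science"],
    "Score": [99, 89, 99, 78],
})

# index -> row labels, columns -> new column headers, values -> cell contents.
wide = long_scores.pivot(index="Name", columns="Subjects", values="Score")
print(wide)

# Without `values`, every remaining column is pivoted and the result
# has two-level (MultiIndex) columns such as ('Score', 'Math').
wide_all = long_scores.pivot(index="Name", columns="Subjects")
print(wide_all.columns.tolist())
```

Passing values explicitly keeps the column labels flat, which is usually easier to work with when only one value column is of interest.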

### Using pivot() to reshape it into wide format:

In [12]:
print(df2.columns)

Index(['Name', 'Subjects', 'Score'], dtype='object')


In [13]:
# pivot() returns a new DataFrame; df2 itself is left unchanged
df_wide = df2.pivot(index="Name", columns="Subjects", values="Score")
df_wide

Subjects  English  Math  Science
Name
aliza          45    95       88
anshika        67    99       89
avani          43    98       77
manshi         56    89       45
sanya          66    99       78


In [14]:
df2.pivot_table(index="Name",columns="Subjects",values="Score",aggfunc="mean")


Subjects  English  Math  Science
Name
aliza          45    95       88
anshika        67    99       89
avani          43    98       77
manshi         56    89       45
sanya          66    99       78


In [15]:
type(df2)

pandas.core.frame.DataFrame

### Explanation:
- index="Name": The unique values in the "Name" column will become the rows in the new DataFrame.
- columns="Subjects": The unique values in the "Subjects" column will become the columns in the new DataFrame.
- values="Score": The values from the "Score" column will populate the table.
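Because pivot() reverses melt(), a round trip should recover the original table. The sketch below assumes the same student data as above. The names `flat`, `restored`, and `original` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["anshika", "sanya", "manshi", "avani", "aliza"],
    "Math": [99, 99, 89, 98, 95],
    "Science": [89, 78, 45, 77, 88],
    "English": [67, 66, 56, 43, 45],
})

# Wide -> long -> wide again.
long_df = df.melt(id_vars=["Name"], var_name="Subjects", value_name="Score")
wide = long_df.pivot(index="Name", columns="Subjects", values="Score")

# pivot() leaves Name as the index and labels the column axis "Subjects".
# Move Name back to a regular column and drop the axis label.
flat = wide.reset_index()
flat.columns.name = None

# Rows and columns come back sorted alphabetically, so restore the
# original column order and sort both frames by Name before comparing.
restored = flat[list(df.columns)].sort_values("Name").reset_index(drop=True)
original = df.sort_values("Name").reset_index(drop=True)
print(restored)
```

The reset_index() step matters in practice: without it, "Name" is an index rather than a column, and a later melt(id_vars=["Name"]) would not find it.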
## Why use pivot()?
- Better data structure: It makes data easier to analyze when you have categories that you want to split into multiple columns.
- Easier visualization: Often, you want to represent data in a format where categories are split across columns (for example, when creating pivot tables for reporting).
- Aggregating data: pivot() itself does not aggregate. To combine values (sum, mean, etc.) while reshaping, use pivot_table() with aggfunc.

#### Important Notes:
- Duplicate Entries: If you have multiple rows with the same combination of index and columns, pivot() will raise an error. In such cases, you should use pivot_table() (which can handle duplicate entries by aggregating them).

For example, if a student named Alice had two Math scores, 85 and 80, pivot_table() with aggfunc="mean" would report (85 + 80) / 2 = 82.5 in that cell. An empty cell (NaN) means there was no value for that combination of index and columns.
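The duplicate-entry behavior can be sketched with a small hypothetical frame in which Alice has two Math scores (85 and 80) and Bob has no Science score. The frame `dupes` and its values are assumptions made for illustration, not data from the notebook above.

```python
import pandas as pd

# Hypothetical long data with a duplicate (Alice, Math) pair
# and a missing (Bob, Science) pair.
dupes = pd.DataFrame({
    "Name": ["Alice", "Alice", "Alice", "Bob"],
    "Subject": ["Math", "Math", "Science", "Math"],
    "Score": [85, 80, 90, 70],
})

# pivot() cannot decide which of the two Math scores to keep,
# so it raises a ValueError.
try:
    dupes.pivot(index="Name", columns="Subject", values="Score")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print("pivot() failed:", exc)

# pivot_table() aggregates the duplicates instead; missing
# combinations are left as NaN.
table = dupes.pivot_table(
    index="Name", columns="Subject", values="Score", aggfunc="mean"
)
print(table)
```

Other aggregations such as "sum", "max", or "first" can be passed as aggfunc. Which one is appropriate depends on what the duplicate rows mean in the data.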

### Summary:
- Use melt() to go long, pivot() to go wide
- pivot() is used to turn long-format data into wide-format by spreading unique column values into separate columns.
- If there are duplicate values for a given combination of index and columns, you should use pivot_table() with an aggregation function to handle the duplicates.