### DataFrame.pivot_table()
- The **DataFrame.pivot_table()** method in pandas builds a spreadsheet-style pivot table from a DataFrame.
- A pivot table summarizes data: it groups rows by one set of keys, spreads a second set of keys across the columns, and aggregates each group with a statistic such as sum, mean, or count.


- Reference: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot_table.html
- **Syntax**
  ```python
  DataFrame.pivot_table(
      values=None,
      index=None,
      columns=None,
      aggfunc='mean',
      fill_value=None,
      margins=False,
      dropna=True,
      margins_name='All'
  )
  ```


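- **Key Parameters:**
    - values: The column(s) to aggregate.
    - index: The column(s) whose unique values become the row labels (the row grouping keys).
    - columns: The column(s) whose unique values become the column labels (the column grouping keys).
    - aggfunc: The aggregation function, or a list or dict of functions (e.g., 'mean', 'sum', 'count'). The default is 'mean'.
    - fill_value: The value used to replace missing cells (NaN) in the resulting table.
    - margins: If True, adds a row and a column holding the aggregate over all rows and all columns.
    - dropna: If True (the default), columns whose entries are all NaN are left out.
    - margins_name: Name for the margins row/column when margins=True (default is 'All').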
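
As a minimal sketch of the defaults listed above, the following uses a small made-up DataFrame (the `demo` data and its column names are assumptions for illustration, not part of the dataset used in the rest of this section). With only `values` and `index` given, `aggfunc` falls back to the mean.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the default aggregation
demo = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B'],
    'Score': [10, 20, 30, 50],
})

# No aggfunc is passed, so the default ('mean') is applied per Team
default_pivot = demo.pivot_table(values='Score', index='Team')
print(default_pivot)
```

Here team A averages 15.0 and team B averages 40.0, which is why an explicit `aggfunc='sum'` is needed whenever totals are wanted.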

In [5]:
import pandas as pd

# Sample DataFrame
data = {
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Region': ['North', 'South', 'East', 'North', 'North', 'South'],
    'Sales': [200, 150, 300, 100, 250, 400]
}

df = pd.DataFrame(data)
df

|   | Salesperson | Region | Sales |
|---|---|---|---|
| 0 | Alice | North | 200 |
| 1 | Bob | South | 150 |
| 2 | Charlie | East | 300 |
| 3 | Alice | North | 100 |
| 4 | Bob | North | 250 |
| 5 | Charlie | South | 400 |


### groupby(): grouped by Salesperson

In [33]:
# Equivalent: df.groupby('Salesperson')[['Sales']].sum()
groupby_sp = df.groupby(by='Salesperson')[['Sales']].sum()
groupby_sp

| Salesperson | Sales |
|---|---|
| Alice | 300 |
| Bob | 400 |
| Charlie | 700 |


### groupby(): grouped by Region

In [9]:
groupby_region = df.groupby('Region')[['Sales']].sum()
groupby_region

| Region | Sales |
|---|---|
| East | 300 |
| North | 550 |
| South | 550 |


### pivot_table(): 1. Basic Operation
- Rows grouped by salesperson
- Columns grouped by region

In [11]:
df

|   | Salesperson | Region | Sales |
|---|---|---|---|
| 0 | Alice | North | 200 |
| 1 | Bob | South | 150 |
| 2 | Charlie | East | 300 |
| 3 | Alice | North | 100 |
| 4 | Bob | North | 250 |
| 5 | Charlie | South | 400 |


In [12]:

# Creating a pivot table to show total sales per salesperson in each region
pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region', aggfunc='sum')

pivot

| Salesperson | East | North | South |
|---|---|---|---|
| Alice | NaN | 300.0 | NaN |
| Bob | NaN | 250.0 | 150.0 |
| Charlie | 300.0 | NaN | 400.0 |


- **Explanation:**

	- values='Sales': We are summarizing the Sales column.
	- index='Salesperson': **Rows are grouped by Salesperson.**
	- columns='Region': **Columns are grouped by Region.**
	- aggfunc='sum': We are summing the sales data for each combination of Salesperson and Region.
	- The result shows the total sales for each Salesperson in each Region. Missing values (NaN) indicate that a salesperson didn’t make sales in that region.
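
To connect this with the groupby() cells earlier, the sketch below rebuilds the same table by grouping on both keys and then moving Region into the columns with `unstack()`. It repeats the sample data so that it runs on its own; the name `by_groupby` is chosen here for illustration.

```python
import pandas as pd

# Same sample data as the DataFrame defined earlier in this section
df = pd.DataFrame({
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Region': ['North', 'South', 'East', 'North', 'North', 'South'],
    'Sales': [200, 150, 300, 100, 250, 400],
})

pivot = df.pivot_table(values='Sales', index='Salesperson',
                       columns='Region', aggfunc='sum')

# Group by both keys, sum, then move Region from the row index to the columns
by_groupby = df.groupby(['Salesperson', 'Region'])['Sales'].sum().unstack('Region')

print(pivot)
print(by_groupby)
```

Both printed tables should show the same cells, including NaN wherever a salesperson has no rows for a region. In that sense pivot_table() can be read as a convenient wrapper around group, aggregate, and reshape.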

### pivot_table(): 2. Handling Missing Values

In [15]:
pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region', aggfunc='sum', fill_value=0)
pivot

| Salesperson | East | North | South |
|---|---|---|---|
| Alice | 0 | 300 | 0 |
| Bob | 0 | 250 | 150 |
| Charlie | 300 | 0 | 400 |


### pivot_table(): 3. Adding Margins (Total Values)

In [17]:

# Adding a total row and a total column with margins=True
pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region', aggfunc='sum', margins=True)

pivot

| Salesperson | East | North | South | All |
|---|---|---|---|---|
| Alice | NaN | 300.0 | NaN | 300 |
| Bob | NaN | 250.0 | 150.0 | 400 |
| Charlie | 300.0 | NaN | 400.0 | 700 |
| All | 300.0 | 550.0 | 550.0 | 1400 |
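
The margins table above still contains NaN cells and uses the default label 'All'. A short sketch of combining `margins=True` with `fill_value` and `margins_name` follows; the label 'Total' and the variable names are choices made for this example.

```python
import pandas as pd

# Same sample data as the DataFrame defined earlier in this section
df = pd.DataFrame({
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Region': ['North', 'South', 'East', 'North', 'North', 'South'],
    'Sales': [200, 150, 300, 100, 250, 400],
})

pivot_total = df.pivot_table(
    values='Sales', index='Salesperson', columns='Region',
    aggfunc='sum',
    fill_value=0,          # show 0 instead of NaN for empty combinations
    margins=True,          # add the total row and total column
    margins_name='Total',  # label used for both margins
)
print(pivot_total)

# The bottom-right cell holds the grand total of all sales
grand_total = pivot_total.loc['Total', 'Total']
print(grand_total)
```

The margins are computed from the underlying data with the same `aggfunc`, so the 'Total' row and column here are sums of sales rather than sums of the displayed cells.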


### pivot_table(): 4. Using Multiple Aggregation Functions

In [39]:
# Using both sum and mean as aggregation functions
pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region', aggfunc=['sum', 'mean'])

pivot

| Salesperson | (sum, East) | (sum, North) | (sum, South) | (mean, East) | (mean, North) | (mean, South) |
|---|---|---|---|---|---|---|
| Alice | NaN | 300.0 | NaN | NaN | 150.0 | NaN |
| Bob | NaN | 250.0 | 150.0 | NaN | 250.0 | 150.0 |
| Charlie | 300.0 | NaN | 400.0 | 300.0 | NaN | 400.0 |


In [41]:
pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region', aggfunc=['sum', 'mean'], fill_value=0, margins=True)
pivot

| Salesperson | (sum, East) | (sum, North) | (sum, South) | (sum, All) | (mean, East) | (mean, North) | (mean, South) | (mean, All) |
|---|---|---|---|---|---|---|---|---|
| Alice | 0 | 300 | 0 | 300 | 0.0 | 150.0 | 0.0 | 150.0 |
| Bob | 0 | 250 | 150 | 400 | 0.0 | 250.0 | 150.0 | 200.0 |
| Charlie | 300 | 0 | 400 | 700 | 300.0 | 0.0 | 400.0 | 350.0 |
| All | 300 | 550 | 550 | 1400 | 300.0 | 183.333333 | 275.0 | 233.333333 |
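
Passing a list to `aggfunc` produces two-level column labels: the function name on top and the Region underneath. The sketch below shows one way to pull pieces out of such a table; the variable names `sums` and `north_mean` are chosen for this example.

```python
import pandas as pd

# Same sample data as the DataFrame defined earlier in this section
df = pd.DataFrame({
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Region': ['North', 'South', 'East', 'North', 'North', 'South'],
    'Sales': [200, 150, 300, 100, 250, 400],
})

pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region',
                       aggfunc=['sum', 'mean'], fill_value=0)

# Selecting the top-level label returns the sub-table for that function
sums = pivot['sum']

# A (function, region) tuple selects a single column as a Series
north_mean = pivot[('mean', 'North')]

print(sums)
print(north_mean)
```

Selecting by the top-level label is usually enough when only one of the statistics is needed downstream.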


## <span style="color:orangered"> Practice </span>
Using the given dataset, create pivot tables that:
1. Show the **total sales** for each combination of Region and Product.
2. Show the **total sales** for each combination of Salesperson and Region.
3. Show the **average Commission** for each combination of Region and Product.

In [22]:
import pandas as pd

# Creating the DataFrame
data = {
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Region': ['North', 'South', 'East', 'North', 'South', 'East', 'North', 'South', 'East'],
    'Product': ['Books', 'Toys', 'Clothes', 'Toys', 'Clothes', 'Books', 'Clothes', 'Books', 'Toys'],
    'Sales': [200, 150, 300, 100, 250, 400, 150, 200, 100],
    'Commission': [20, 15, 30, 10, 25, 40, 15, 20, 10]
}

df = pd.DataFrame(data)
df

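|   | Salesperson | Region | Product | Sales | Commission |
|---|---|---|---|---|---|
| 0 | Alice | North | Books | 200 | 20 |
| 1 | Bob | South | Toys | 150 | 15 |
| 2 | Charlie | East | Clothes | 300 | 30 |
| 3 | Alice | North | Toys | 100 | 10 |
| 4 | Bob | South | Clothes | 250 | 25 |
| 5 | Charlie | East | Books | 400 | 40 |
| 6 | Alice | North | Clothes | 150 | 15 |
| 7 | Bob | South | Books | 200 | 20 |
| 8 | Charlie | East | Toys | 100 | 10 |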


In [43]:
# Show the total sales for each combination of Region and Product.
pivot = df.pivot_table(values='Sales', index='Product', columns='Region', aggfunc=['sum'], fill_value=0, margins=False)
pivot

| Product | (sum, East) | (sum, North) | (sum, South) |
|---|---|---|---|
| Books | 400 | 200 | 200 |
| Clothes | 300 | 150 | 250 |
| Toys | 100 | 100 | 150 |


In [45]:
# Show the total sales for each combination of Salesperson and Region.
pivot = df.pivot_table(values='Sales', index='Salesperson', columns='Region', aggfunc=['sum'], fill_value=0)
pivot

| Salesperson | (sum, East) | (sum, North) | (sum, South) |
|---|---|---|---|
| Alice | 0 | 450 | 0 |
| Bob | 0 | 0 | 600 |
| Charlie | 800 | 0 | 0 |


In [53]:
# Show the average Commission for each combination of Region and Product.
pivot = df.pivot_table(values='Commission', index='Product', columns='Region', aggfunc=['mean'], fill_value=0)
pivot

| Product | (mean, East) | (mean, North) | (mean, South) |
|---|---|---|---|
| Books | 40.0 | 20.0 | 20.0 |
| Clothes | 30.0 | 15.0 | 25.0 |
| Toys | 10.0 | 10.0 | 15.0 |
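
The first and third tasks can also be answered in a single call by passing a dict to `aggfunc`, so that each value column gets its own function. This is a sketch of one possible approach, not the only solution; the variable name `combined` is chosen for this example.

```python
import pandas as pd

# Same practice dataset as above, repeated so the example runs on its own
df = pd.DataFrame({
    'Salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Region': ['North', 'South', 'East', 'North', 'South', 'East', 'North', 'South', 'East'],
    'Product': ['Books', 'Toys', 'Clothes', 'Toys', 'Clothes', 'Books', 'Clothes', 'Books', 'Toys'],
    'Sales': [200, 150, 300, 100, 250, 400, 150, 200, 100],
    'Commission': [20, 15, 30, 10, 25, 40, 15, 20, 10],
})

# Sum the Sales column and average the Commission column in one table
combined = df.pivot_table(
    values=['Sales', 'Commission'],
    index='Product',
    columns='Region',
    aggfunc={'Sales': 'sum', 'Commission': 'mean'},
)
print(combined)

# Columns are (value column, Region) pairs
print(combined[('Sales', 'East')])
```

In this dataset every Product and Region pair occurs exactly once, so each sum or mean is simply that single row's value. The two functions would give different results only if a pair had several rows.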
