

---

### What is a Pivot Table?

A **pivot table** is a way to **summarize and reorganize data** so you can easily see patterns or comparisons.

Imagine you have a big table of data with many rows, like sales records that include the product, region, and amount sold. A pivot table lets you:

* Group the data by one or more categories (like grouping by product and region).
* Show the results in a new table where rows and columns represent those groups.
* Calculate summary statistics (like sums, averages, counts) for the grouped data.
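
The three operations above can be sketched directly with pandas `groupby` and `unstack`. This is a minimal illustration, not the only way to build such a table; the `records` DataFrame and its `product`, `region`, and `amount` columns are hypothetical data invented for the example.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration
records = pd.DataFrame({
    "product": ["pen", "pen", "book", "book", "pen"],
    "region": ["north", "south", "north", "north", "north"],
    "amount": [10, 20, 30, 40, 5],
})

# Group by both categories and calculate a summary statistic (here, the sum)
grouped = records.groupby(["product", "region"])["amount"].sum()

# Move the "region" level into the columns, so rows are products and
# columns are regions; combinations with no records are filled with 0
summary = grouped.unstack("region", fill_value=0)
print(summary)
```

A pivot table bundles these steps (grouping, reshaping, summarizing) into a single call.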

---

### Why Use a Pivot Table?

* To **quickly analyze** how data is distributed across different categories.
* To **compare values** across two or more factors easily.
* To **turn long-format data into a more readable, tabular layout**.
* To **see summaries** of large datasets without writing complex code.

---

### How It Works (Conceptually)

1. You decide which columns you want as **row labels** (e.g., products).
2. You decide which columns you want as **column labels** (e.g., regions).
3. You pick the values to summarize and the kind of summary you want (e.g., total sales, average sales).
4. The pivot table then arranges your data into a grid, with row categories down the side, column categories across the top, and the summarized values in the middle.
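
As a rough sketch, the steps above map onto the arguments of `pd.pivot_table` as follows. The `sales` DataFrame and its values are assumptions made up for this example, and the average is chosen as the summary only to show that it need not be a sum.

```python
import pandas as pd

# Hypothetical data, invented for illustration
sales = pd.DataFrame({
    "product": ["pen", "pen", "book", "book", "pen", "book"],
    "region": ["north", "north", "north", "south", "south", "south"],
    "sales": [10, 30, 50, 20, 40, 60],
})

grid = pd.pivot_table(
    sales,
    index="product",   # step 1: row labels
    columns="region",  # step 2: column labels
    values="sales",    # the column whose values are summarized
    aggfunc="mean",    # step 3: the kind of summary (average sales)
)

# Step 4: products run down the side, regions across the top,
# and the average sales fill the middle of the grid
print(grid)
```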

---

### Example in Simple Terms

If you have sales data for different products in different cities, a pivot table can show:

* Rows as product names
* Columns as cities
* Values as the total sales amount for each product in each city

So you can quickly see which product sells best where.
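
The product-by-city case might look like the sketch below. The product names, cities, and amounts are hypothetical, and using `idxmax` to read off the best seller is just one possible way to answer "which product sells best where".

```python
import pandas as pd

# Hypothetical sales data, invented for illustration
city_sales = pd.DataFrame({
    "product": ["laptop", "laptop", "phone", "phone", "laptop", "phone"],
    "city": ["Delhi", "Mumbai", "Delhi", "Mumbai", "Delhi", "Mumbai"],
    "amount": [300, 100, 150, 400, 200, 50],
})

# Rows are products, columns are cities, values are total sales
totals = pd.pivot_table(
    city_sales,
    values="amount",
    index="product",
    columns="city",
    aggfunc="sum",
    fill_value=0,
)
print(totals)

# For each city (column), the product with the highest total
best_product_per_city = totals.idxmax()
# For each product (row), the city with the highest total
best_city_per_product = totals.idxmax(axis=1)
print(best_product_per_city)
print(best_city_per_product)
```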




In [1]:
# Pivot table: summarize and reorganize data
import pandas as pd

# Each record is one sale: who made it, in which region, and for how much
data = {
    "sales_person": ["a", "b", "a", "a", "b", "c"],
    "region": ["north", "north", "south", "east", "east", "south"],
    "sales": [1000, 2000, 5000, 800, 1500, 250],
}

df = pd.DataFrame(data)

# Total sales for each sales_person (rows) in each region (columns).
# fill_value=0 replaces NaN with 0 for combinations that have no sales.
pivot_table = pd.pivot_table(
    df,
    values="sales",
    index="sales_person",
    columns="region",
    aggfunc="sum",
    fill_value=0,
)
print("Pivot Table:\n\n", pivot_table)

Pivot Table:

 region        east  north  south
sales_person                    
a              800   1000   5000
b             1500   2000      0
c                0      0    250
