## How can you pivot data using the pivot and pivot_table functions?

## Why use pandas?

Pandas is a popular **Python library** for **data manipulation and analysis**.

**Data Handling:** Pandas provides an easy and efficient way to handle structured data, including reading and writing data from sources such as CSV files, Excel workbooks, SQL databases, and more.

**Data Exploration:** Pandas offers a wide range of tools for data exploration and analysis. You can quickly examine the data using functions like head(), tail(), describe(), and info().
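As a quick illustration of these exploration functions, here is a minimal sketch. The `sample` DataFrame is a small hypothetical stand-in for a real dataset, shaped like the tomato price data used in this notebook.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the calls.
sample = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15", "2012-04-15", "2012-07-15", "2012-07-15"],
    "market": ["Chennai", "Dindigul", "Thiruchirapalli", "Chennai", "Dindigul"],
    "price": [18.78, 21.61, 21.17, 25.77, 24.18],
})

first_rows = sample.head(2)   # first two rows
last_rows = sample.tail(2)    # last two rows
stats = sample.describe()     # summary statistics for the numeric columns

print(first_rows)
print(last_rows)
print(stats)
sample.info()                 # prints column dtypes and non-null counts directly
```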

**Data Cleaning and Preprocessing:** Cleaning and preprocessing data is a crucial step in any data analysis project. Pandas provides methods for handling missing data, duplicates, and outliers.
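The sketch below shows two of these cleaning steps, removing duplicates and handling missing values, on a hypothetical `raw` DataFrame. Filling with the column mean is just one possible strategy, chosen here for illustration; outlier handling is not covered.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one repeated row and one missing price.
raw = pd.DataFrame({
    "market": ["Chennai", "Chennai", "Dindigul", "Vellore"],
    "price": [18.78, 18.78, np.nan, 15.08],
})

deduped = raw.drop_duplicates()                # drops the repeated Chennai row
n_missing = deduped["price"].isna().sum()      # number of missing prices

# One possible strategy (an assumption, not a recommendation): fill with the mean.
filled = deduped.fillna({"price": deduped["price"].mean()})

print(deduped)
print(n_missing)
print(filled)
```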

**Data Analysis and Manipulation:** Pandas excels at data manipulation tasks. You can filter, sort, group, and aggregate data using functions like groupby(), agg(), pivot_table(), and more.
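A minimal sketch of filtering, sorting, grouping, and aggregating follows. The `prices` DataFrame and the threshold of 20 are hypothetical, chosen only to keep the example small.

```python
import pandas as pd

# Hypothetical price readings for two markets.
prices = pd.DataFrame({
    "market": ["Chennai", "Chennai", "Dindigul", "Dindigul"],
    "price": [18.78, 25.77, 21.61, 24.18],
})

# Filter rows with a boolean mask, then sort from highest to lowest price.
high = prices[prices["price"] > 20].sort_values("price", ascending=False)

# Group by market and compute several aggregates at once.
summary = prices.groupby("market")["price"].agg(["mean", "max"])

print(high)
print(summary)
```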

**Integration with Other Libraries:** Pandas seamlessly integrates with other data science and machine learning libraries like NumPy, Matplotlib, Seaborn, and scikit-learn.

## Pivoting

Pivoting data in Pandas is a common operation used to **reshape data from a "long" format to a "wide" format**. The reverse direction, wide to long, is usually done with melt.
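To make the long and wide formats concrete, here is a small sketch using a hypothetical `long_df`. It pivots long data into a wide table and then uses `melt` to go back to the long form.

```python
import pandas as pd

# Long format: one row per (date, market) observation. Hypothetical sample.
long_df = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15", "2012-07-15", "2012-07-15"],
    "market": ["Chennai", "Dindigul", "Chennai", "Dindigul"],
    "price": [18.78, 21.61, 25.77, 24.18],
})

# Long -> wide: one row per date, one column per market.
wide_df = long_df.pivot(index="date", columns="market", values="price")

# Wide -> long: melt turns the market columns back into rows.
back_to_long = wide_df.reset_index().melt(
    id_vars="date", var_name="market", value_name="price"
)

print(wide_df)
print(back_to_long)
```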

## Using pivot

The pivot function is used when you have a DataFrame with columns that you want to use as row indices, column headers, and values to fill the resulting table.
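The roles of the three arguments can be seen on a tiny hypothetical DataFrame (`records` below). Note that a combination missing from the input, here Dindigul on 2012-07-15, shows up as NaN in the result.

```python
import pandas as pd

# Hypothetical records; Dindigul has no reading for 2012-07-15.
records = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15", "2012-07-15"],
    "market": ["Chennai", "Dindigul", "Chennai"],
    "price": [18.78, 21.61, 25.77],
})

table = records.pivot(index="date", columns="market", values="price")

row_labels = table.index.tolist()       # unique values of 'date'
col_labels = table.columns.tolist()     # unique values of 'market'
n_missing = int(table.isna().sum().sum())  # combinations absent from the input

print(table)
print(row_labels, col_labels, n_missing)
```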

## Importing pandas library

In [1]:
import pandas as pd

## Reading my file

In [2]:
df = pd.read_csv('tn_tomato_price.csv')
df.head()

| | date | admin1 | admin2 | market | price | usdprice |
|---|---|---|---|---|---|---|
| 0 | 2012-04-15 | Tamil Nadu | Chennai | Chennai | 18.78 | 0.36 |
| 1 | 2012-04-15 | Tamil Nadu | Dindigul | Dindigul | 21.61 | 0.42 |
| 2 | 2012-04-15 | Tamil Nadu | Tiruchchirappalli | Thiruchirapalli | 21.17 | 0.41 |
| 3 | 2012-07-15 | Tamil Nadu | Chennai | Chennai | 25.77 | 0.47 |
| 4 | 2012-07-15 | Tamil Nadu | Dindigul | Dindigul | 24.18 | 0.44 |
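An optional variation, sketched under assumptions: passing `parse_dates` to `read_csv` loads the date column as real datetimes rather than strings, which can help with sorting and resampling later. The inline `csv_text` is a hypothetical stand-in for the file, built from a few of the rows shown above.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file, using a few rows from the output above.
csv_text = """date,admin1,admin2,market,price,usdprice
2012-04-15,Tamil Nadu,Chennai,Chennai,18.78,0.36
2012-04-15,Tamil Nadu,Dindigul,Dindigul,21.61,0.42
2012-07-15,Tamil Nadu,Chennai,Chennai,25.77,0.47
"""

# parse_dates converts the named column to a datetime dtype while reading.
sample_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(sample_df.dtypes)
```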


## Pivot the dataframe

Here I want to see the tomato prices in different markets across different dates. To achieve this, I used the pandas pivot function, as shown below.

In [3]:
pivot_df = df.pivot(index='date', columns='market', values='price')

In [4]:
pivot_df

| date | Chennai | Coimbatore | Cuddalore | Dharmapuri | Dindigul | Ramanathapuram | Thiruchirapalli | Tirunelveli | Vellore |
|---|---|---|---|---|---|---|---|---|---|
| 2012-04-15 | 18.78 | NaN | NaN | NaN | 21.61 | NaN | 21.17 | NaN | NaN |
| 2012-07-15 | 25.77 | NaN | NaN | NaN | 24.18 | NaN | 24.55 | NaN | NaN |
| 2012-08-15 | 15.94 | NaN | NaN | NaN | 12.50 | NaN | 11.89 | NaN | NaN |
| 2012-09-15 | 14.70 | NaN | NaN | NaN | 11.80 | NaN | 12.10 | NaN | NaN |
| 2012-10-15 | 12.50 | NaN | NaN | NaN | 10.30 | NaN | 7.95 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-03-15 | 21.84 | 16.93 | 15.00 | 10.71 | 17.57 | 24.25 | 14.94 | 15.07 | 15.08 |
| 2021-04-15 | 18.46 | 10.45 | 12.88 | 9.03 | 12.41 | 17.42 | 12.07 | 11.35 | 10.00 |
| 2021-05-15 | 14.23 | 12.74 | 13.00 | 9.41 | 10.65 | NaN | 12.65 | 14.09 | 9.69 |
| 2021-06-15 | 11.75 | 12.91 | 18.00 | 16.29 | 16.25 | 19.53 | 16.38 | 17.29 | 12.37 |


In this example, **the pivot function** takes the **'date'** column as row indices, the **'market'** column as column headers, and the **'price'** column as the values to fill the table.
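One limitation worth knowing: pivot expects each index/column pair to appear only once, and it raises a `ValueError` otherwise. The sketch below demonstrates this with a hypothetical `dupes` DataFrame in which the second Chennai reading is invented for illustration.

```python
import pandas as pd

# Two rows share the same (date, market) pair; the second price is hypothetical.
dupes = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15"],
    "market": ["Chennai", "Chennai"],
    "price": [18.78, 19.50],
})

try:
    dupes.pivot(index="date", columns="market", values="price")
    pivot_failed = False
except ValueError as err:
    # pivot cannot decide which of the two prices belongs in the cell.
    pivot_failed = True
    print(f"pivot raised ValueError: {err}")
```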

## Using pivot_table

The pivot_table function is more versatile and can handle situations where you have duplicate entries for the same combination of row and column values. It also allows you to specify how to aggregate the values.
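A small sketch of that behaviour, using a hypothetical `readings` DataFrame with two Chennai prices on the same date: `aggfunc` decides how the duplicates are combined into a single cell.

```python
import pandas as pd

# Hypothetical readings: Chennai appears twice on the same date.
readings = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15", "2012-04-15"],
    "market": ["Chennai", "Chennai", "Dindigul"],
    "price": [18.0, 20.0, 21.61],
})

# Duplicates are averaged into one cell.
mean_table = readings.pivot_table(
    index="date", columns="market", values="price", aggfunc="mean"
)

# The same duplicates, combined by taking the maximum instead.
max_table = readings.pivot_table(
    index="date", columns="market", values="price", aggfunc="max"
)

print(mean_table)
print(max_table)
```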

In [5]:
pivot_table_df = df.pivot_table(index='date', columns='market', values='price', aggfunc='mean')

In [6]:
pivot_table_df

| date | Chennai | Coimbatore | Cuddalore | Dharmapuri | Dindigul | Ramanathapuram | Thiruchirapalli | Tirunelveli | Vellore |
|---|---|---|---|---|---|---|---|---|---|
| 2012-04-15 | 18.78 | NaN | NaN | NaN | 21.61 | NaN | 21.17 | NaN | NaN |
| 2012-07-15 | 25.77 | NaN | NaN | NaN | 24.18 | NaN | 24.55 | NaN | NaN |
| 2012-08-15 | 15.94 | NaN | NaN | NaN | 12.50 | NaN | 11.89 | NaN | NaN |
| 2012-09-15 | 14.70 | NaN | NaN | NaN | 11.80 | NaN | 12.10 | NaN | NaN |
| 2012-10-15 | 12.50 | NaN | NaN | NaN | 10.30 | NaN | 7.95 | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-03-15 | 21.84 | 16.93 | 15.00 | 10.71 | 17.57 | 24.25 | 14.94 | 15.07 | 15.08 |
| 2021-04-15 | 18.46 | 10.45 | 12.88 | 9.03 | 12.41 | 17.42 | 12.07 | 11.35 | 10.00 |
| 2021-05-15 | 14.23 | 12.74 | 13.00 | 9.41 | 10.65 | NaN | 12.65 | 14.09 | 9.69 |
| 2021-06-15 | 11.75 | 12.91 | 18.00 | 16.29 | 16.25 | 19.53 | 16.38 | 17.29 | 12.37 |


In this example, we use pivot_table to handle cases where there are **duplicate entries** for the same **'date' and 'market'** combination. 

We also specify the aggregation function **(aggfunc) as 'mean'**, which calculates the average **price** for each combination of **date and market**.

Since this dataset has no duplicate **date and market** combinations, pivot_table returns the same table as pivot, as the matching number of rows and columns suggests.
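Rather than comparing the two outputs by eye, this can be checked in code. The sketch below is one way to do it, shown on a small `sample` built from the first rows of the data; the same calls should apply to the full `df`.

```python
import numpy as np
import pandas as pd

# Small sample taken from the first rows of the dataset shown earlier.
sample = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15", "2012-04-15", "2012-07-15", "2012-07-15"],
    "market": ["Chennai", "Dindigul", "Thiruchirapalli", "Chennai", "Dindigul"],
    "price": [18.78, 21.61, 21.17, 25.77, 24.18],
})

# True if any (date, market) pair occurs more than once.
has_duplicates = bool(sample.duplicated(subset=["date", "market"]).any())

via_pivot = sample.pivot(index="date", columns="market", values="price")
via_pivot_table = sample.pivot_table(
    index="date", columns="market", values="price", aggfunc="mean"
)

# Compare shapes first, then values (treating NaN in the same cell as equal).
same_shape = via_pivot.shape == via_pivot_table.shape
same_values = same_shape and bool(
    np.allclose(via_pivot.to_numpy(), via_pivot_table.to_numpy(), equal_nan=True)
)

print(has_duplicates, same_shape, same_values)
```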

## Why pivot with pandas instead of Excel?

Using Pandas for pivoting data instead of Excel's pivot tables offers several advantages, especially when dealing with **larger datasets** or when you need to automate data manipulation tasks in a programmatic way. These include **scalability, reproducibility, automation, customization**, and more.
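As a rough illustration of the automation and reproducibility points, the pivot step can be wrapped in a function and rerun with different settings. The helper name `pivot_prices` and the `monthly` sample are hypothetical, and the in-memory buffer stands in for a real output file.

```python
import io
import pandas as pd

def pivot_prices(frame, aggfunc="mean"):
    """Hypothetical helper: pivot a long price table into date x market form."""
    return frame.pivot_table(
        index="date", columns="market", values="price", aggfunc=aggfunc
    )

# Hypothetical sample data in the same shape as the tomato price dataset.
monthly = pd.DataFrame({
    "date": ["2012-04-15", "2012-04-15", "2012-07-15", "2012-07-15"],
    "market": ["Chennai", "Dindigul", "Chennai", "Dindigul"],
    "price": [18.78, 21.61, 25.77, 24.18],
})

# The same step, repeated for several aggregations without manual clicks.
reports = {name: pivot_prices(monthly, name) for name in ("mean", "min", "max")}

# Write one report as CSV text; a file path could be used instead of the buffer.
buffer = io.StringIO()
reports["mean"].to_csv(buffer)
print(buffer.getvalue())
```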