# Changing the Shape of a Dataset

Much like a pivot table in Excel, pandas can reshape a dataset by moving the values of a column into column headers, or by folding column headers back into rows.

This is useful when you need a table laid out in a specific way before you analyse it.

## The Dataset

The data used in these examples is dummy data.

It was built from a combination of Wikipedia pages and randomly generated numbers.

Wikipedia pages used:

#### Products

- https://en.wikipedia.org/wiki/List_of_culinary_fruits
- https://en.wikipedia.org/wiki/List_of_vegetables

#### Store Names (random suburbs in Sydney)

- https://en.wikipedia.org/wiki/List_of_Sydney_suburbs

In [1]:
# Import the dependencies

import pandas as pd
import numpy as np

import warnings
warnings.simplefilter('ignore')

In [2]:
# Import the dataset

sales_data = pd.read_excel(r'../Data/SalesDataset.xlsx')

# Preview the first five rows with head()
sales_data.head()

,Date,Campaign_ID,Customer_Group,Store_ID,Store_Name,Product_Category,Product_Group,Product,Product_ID,Units,Gross_Sales,Discount
0,31/12/2019,1000000.0,A Market That's Super,2001,Berowra Creek,Fruit,Tropical Fruit,Hydnora abyssinica,1100057,990,29.7,0.5
1,30/04/2020,,Super Super Market,1012,Bardia,Fruit,Tropical Fruit,Salak,1100094,630,0.0,0.498927
2,31/07/2020,,Market,3000,Blackett,Fruit,Tropical Fruit,Kola nut,1100062,671,1241.35,0.494303
3,31/10/2020,,A Market That's Super,2011,Bilgola Beach,Fruit,Tropical Fruit,Jackfruit,1100060,611,1283.1,0.493447
4,31/10/2020,,A Market That's Super,2006,Beverly Hills,Fruit,Tropical Fruit,Terap,1100107,684,1026.0,0.492293


In [3]:
# Keep only the columns needed for the example

condensed_sales_data = sales_data[["Customer_Group", "Product_Category", "Gross_Sales"]]

condensed_sales_data.head()


,Customer_Group,Product_Category,Gross_Sales
0,A Market That's Super,Fruit,29.7
1,Super Super Market,Fruit,0.0
2,Market,Fruit,1241.35
3,A Market That's Super,Fruit,1283.1
4,A Market That's Super,Fruit,1026.0


In [4]:
# Pivot so that each Product_Category becomes a column of summed Gross_Sales

PG_to_columns = condensed_sales_data.pivot_table(values="Gross_Sales",
                                                 index="Customer_Group",
                                                 columns="Product_Category",
                                                 aggfunc='sum')
PG_to_columns.head()

Product_Category,Fruit,Vegetables
Customer_Group,,
A Market That's Super,4558983.03,1728720.96
Market,1432134.26,551704.1
Not So Super Market,3092657.94,1114197.61
Super Super Market,5649445.74,2043396.19
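
The pivot above can also be read as a group-and-sum followed by an unstack. The sketch below illustrates this on a small made-up frame; the rows and the names `toy_sales`, `pivoted` and `grouped` are assumptions for illustration and are not taken from the sales dataset. With complete data like this, both routes should give the same table.

```python
import pandas as pd

# Hypothetical toy rows, not taken from SalesDataset.xlsx
toy_sales = pd.DataFrame({
    "Customer_Group": ["Market", "Market", "Market",
                       "Super Super Market", "Super Super Market", "Super Super Market"],
    "Product_Category": ["Fruit", "Fruit", "Vegetables",
                         "Fruit", "Vegetables", "Vegetables"],
    "Gross_Sales": [10.0, 20.0, 5.0, 7.5, 2.5, 4.0],
})

# Route 1: pivot_table sums Gross_Sales for each group/category pair
pivoted = toy_sales.pivot_table(values="Gross_Sales",
                                index="Customer_Group",
                                columns="Product_Category",
                                aggfunc="sum")

# Route 2: group by both keys, sum, then move Product_Category into the columns
grouped = (toy_sales
           .groupby(["Customer_Group", "Product_Category"])["Gross_Sales"]
           .sum()
           .unstack("Product_Category"))

print(pivoted)
print(grouped)
```

`pivot_table` is the more compact spelling; the `groupby` route can be easier to extend when you need several aggregation steps.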


In [5]:
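# Unstack the pivoted table into long form (one row per category/group pair),
# then use Customer_Group alone as the index
unstacking = PG_to_columns.unstack().reset_index().set_index("Customer_Group")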

unstacking

,Product_Category,0
Customer_Group,,
A Market That's Super,Fruit,4558983.03
Market,Fruit,1432134.26
Not So Super Market,Fruit,3092657.94
Super Super Market,Fruit,5649445.74
A Market That's Super,Vegetables,1728720.96
Market,Vegetables,551704.1
Not So Super Market,Vegetables,1114197.61
Super Super Market,Vegetables,2043396.19
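
Calling `unstack()` on a table whose index has a single level returns a Series indexed by (column label, row label), which is why `reset_index()` leaves the values in a column named `0`. As a minimal sketch, the wide table is rebuilt here from the totals shown above, and `reset_index(name=...)` gives the value column a readable name. The names `wide`, `long_series` and `long_frame` are illustrative only.

```python
import pandas as pd

# Rebuild the pivoted table from the totals shown in the output above
wide = pd.DataFrame(
    {"Fruit": [4558983.03, 1432134.26, 3092657.94, 5649445.74],
     "Vegetables": [1728720.96, 551704.10, 1114197.61, 2043396.19]},
    index=pd.Index(["A Market That's Super", "Market",
                    "Not So Super Market", "Super Super Market"],
                   name="Customer_Group"),
)
wide.columns.name = "Product_Category"

# unstack() returns a Series with a (Product_Category, Customer_Group) index
long_series = wide.unstack()

# Name the value column while turning the index levels back into columns
long_frame = long_series.reset_index(name="Gross_Sales")

print(long_frame)
```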


In [6]:
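# Transpose: each Customer_Group label becomes a column header
# (repeated, because every group appears once per product category)
unstacking.T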

Customer_Group,A Market That's Super,Market,Not So Super Market,Super Super Market,A Market That's Super,Market,Not So Super Market,Super Super Market
Product_Category,Fruit,Fruit,Fruit,Fruit,Vegetables,Vegetables,Vegetables,Vegetables
0,4.55898e+06,1.43213e+06,3.09266e+06,5.64945e+06,1.72872e+06,551704,1.1142e+06,2.0434e+06
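
With only `Customer_Group` as the index, each label appears twice (once per product category), so the transposed table ends up with repeated column headers. Below is a small sketch of how to check for this, rebuilding the table from the totals shown earlier. The names `wide`, `long_frame`, `single_key` and `double_key` are illustrative, and `Gross_Sales` is an assumed name for the value column.

```python
import pandas as pd

# Rebuild the pivoted table from the totals shown earlier in the notebook
wide = pd.DataFrame(
    {"Fruit": [4558983.03, 1432134.26, 3092657.94, 5649445.74],
     "Vegetables": [1728720.96, 551704.10, 1114197.61, 2043396.19]},
    index=pd.Index(["A Market That's Super", "Market",
                    "Not So Super Market", "Super Super Market"],
                   name="Customer_Group"),
)
wide.columns.name = "Product_Category"

long_frame = wide.unstack().reset_index(name="Gross_Sales")

single_key = long_frame.set_index("Customer_Group")
double_key = long_frame.set_index(["Customer_Group", "Product_Category"])

print(single_key.index.is_unique)  # False: every group appears twice
print(double_key.index.is_unique)  # True: each (group, category) pair appears once

# Selecting a label from the non-unique index returns every matching row
print(single_key.loc["Market"])
```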


In [7]:
# Use both keys as the index so that every row label is unique
unstacking = PG_to_columns.unstack().reset_index().set_index(['Customer_Group', 'Product_Category'])
unstacking

,,0
Customer_Group,Product_Category,
A Market That's Super,Fruit,4558983.03
Market,Fruit,1432134.26
Not So Super Market,Fruit,3092657.94
Super Super Market,Fruit,5649445.74
A Market That's Super,Vegetables,1728720.96
Market,Vegetables,551704.1
Not So Super Market,Vegetables,1114197.61
Super Super Market,Vegetables,2043396.19


In [8]:
# Transpose: the columns now carry a two-level (Customer_Group, Product_Category) header
unstacking.T

Customer_Group,A Market That's Super,Market,Not So Super Market,Super Super Market,A Market That's Super,Market,Not So Super Market,Super Super Market
Product_Category,Fruit,Fruit,Fruit,Fruit,Vegetables,Vegetables,Vegetables,Vegetables
0,4558983.03,1432134.26,3092657.94,5649445.74,1728720.96,551704.1,1114197.61,2043396.19
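
Because the two-level index identifies each value uniquely, the long table can be turned back into the wide layout. This is a hedged sketch that rebuilds the data from the totals shown earlier. The `Gross_Sales` column name and the variable names are assumptions for illustration; in the notebook itself the value column is called `0`.

```python
import pandas as pd

# Rebuild the pivoted table from the totals shown earlier in the notebook
wide = pd.DataFrame(
    {"Fruit": [4558983.03, 1432134.26, 3092657.94, 5649445.74],
     "Vegetables": [1728720.96, 551704.10, 1114197.61, 2043396.19]},
    index=pd.Index(["A Market That's Super", "Market",
                    "Not So Super Market", "Super Super Market"],
                   name="Customer_Group"),
)
wide.columns.name = "Product_Category"

# Long form, indexed by both keys
long_frame = (wide.unstack()
                  .reset_index(name="Gross_Sales")
                  .set_index(["Customer_Group", "Product_Category"]))

# Move Product_Category from the index back into the columns
round_trip = long_frame["Gross_Sales"].unstack("Product_Category")

print(round_trip)
```

The same call on the single-key version would not work as a round trip, since `Product_Category` is then an ordinary column and not an index level.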
