### 1. Load

Load clothes.csv and assign it to a DataFrame variable.

| Column Name | Description                   |
|-------------|-------------------------------|
| Date        | Date of transaction           |
| Region      | The region of the transaction |
| Type        | The type of clothing sold     |
| Units       | The number of units sold      |
| Sales       | The cost of the sale          |

In [47]:
import pandas as pd
# parse_dates is required for Date to load as datetime64 rather than object
df = pd.read_csv('clothes.csv', parse_dates=['Date'])
df.dtypes

Unnamed: 0             int64
Date          datetime64[ns]
Region                object
Type                  object
Units                float64
Sales                  int64
dtype: object
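
The `Unnamed: 0` column in the output above is typically a saved index that was written to the CSV. A minimal sketch of reading it back as the index, assuming a made-up three-row stand-in for clothes.csv (the rows below are hypothetical, not the real data):

```python
import io
import pandas as pd

# Hypothetical stand-in for clothes.csv: made-up rows with the same columns.
csv_text = """,Date,Region,Type,Units,Sales
0,2020-01-01,East,Men's Clothing,10,234
1,2020-01-02,East,Women's Clothing,12,374
2,2020-02-11,North,Children's Clothing,,204
"""

# index_col=0 reads the unnamed first column as the index instead of as data;
# parse_dates converts the Date strings to datetime64.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["Date"])
print(sample.dtypes)
```

Units comes back as float64 here because one value is missing, which matches the float64 dtype shown for the real file.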

### 2. Create
Create a sales_by_region pivot table with Region as the index and Sales and Units as the values.


In [33]:

#Expected output:

#              Sales      Units
# Region                       
# East    408.182482  19.732360
# North   438.924051  19.202643
# South   432.956204  20.423358
# West    452.029412  19.294118

pivot = pd.pivot_table(
    data=df,
    index='Region',
    values =["Sales", "Units"]
)
print(pivot)

             Sales      Units
Region                       
East    408.182482  19.732360
North   438.924051  19.202643
South   432.956204  20.423358
West    452.029412  19.294118
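
The values above are means because `pivot_table` uses `aggfunc='mean'` when none is given. A small sketch on made-up rows (not taken from clothes.csv) showing that this matches a plain `groupby(...).mean()`:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Units": [10.0, 20.0, 5.0, 15.0],
    "Sales": [100, 300, 200, 400],
})

# No aggfunc given, so the mean is used.
by_pivot = pd.pivot_table(toy, index="Region", values=["Sales", "Units"])

# The same numbers via groupby.
by_group = toy.groupby("Region")[["Sales", "Units"]].mean()

print(by_pivot)
print(by_group)
```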


### 3. Specifying the Aggregation Method in a Pandas Pivot Table

Produce the sum of sales and units for each region.


In [34]:
# Specifying the Aggregation Function
pivot = pd.pivot_table(
    data=df,
    index='Region',
    values=['Sales', 'Units'],  # restrict to these columns so Unnamed: 0 is not summed
    aggfunc='sum'
)

print(pivot)

# Expected:
#          Sales   Units
# Region                
# East    167763  8110.0
# North   138700  4359.0
# South    59315  2798.0
# West     61476  2624.0

         Sales   Units
Region                
East    167763  8110.0
North   138700  4359.0
South    59315  2798.0
West     61476  2624.0
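
When `values` is omitted, `pivot_table` aggregates every remaining column it can, which is how a leftover index column such as `Unnamed: 0` can end up in the result. A sketch on made-up rows (hypothetical data) comparing the two calls:

```python
import pandas as pd

# Made-up rows, including a leftover index column like the one in clothes.csv.
toy = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "Region": ["East", "East", "West", "West"],
    "Units": [10.0, 20.0, 5.0, 15.0],
    "Sales": [100, 300, 200, 400],
})

# Without values: every non-index column is summed, including "Unnamed: 0".
all_cols = pd.pivot_table(toy, index="Region", aggfunc="sum")

# With values: only the requested columns are summed.
only_wanted = pd.pivot_table(
    toy, index="Region", values=["Sales", "Units"], aggfunc="sum"
)

print(all_cols)
print(only_wanted)
```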


### 4. Multiple Aggregation Methods in a Pandas Pivot Table

Let’s produce aggregations for both the mean and the sum:


In [35]:
pivot = pd.pivot_table(
    data=df,
    index='Region',
    values=['Sales', 'Units'],  # restrict to these columns so Unnamed: 0 is not aggregated
    aggfunc=['mean', 'sum']
)

print(pivot)

# Expected:
#               mean                sum        
#              Sales      Units   Sales   Units
# Region                                       
# East    408.182482  19.732360  167763  8110.0
# North   438.924051  19.202643  138700  4359.0
# South   432.956204  20.423358   59315  2798.0
# West    452.029412  19.294118   61476  2624.0

              mean                sum        
             Sales      Units   Sales   Units
Region                                       
East    408.182482  19.732360  167763  8110.0
North   438.924051  19.202643  138700  4359.0
South   432.956204  20.423358   59315  2798.0
West    452.029412  19.294118   61476  2624.0
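
Passing a list to `aggfunc` gives the result two-level columns, with the aggregation name on top and the original column name below. A short sketch on made-up rows showing how either level can be selected:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Units": [10.0, 20.0, 5.0, 15.0],
    "Sales": [100, 300, 200, 400],
})

both = pd.pivot_table(
    toy, index="Region", values=["Sales", "Units"], aggfunc=["mean", "sum"]
)

# Selecting the top level returns a regular DataFrame of sums.
sums = both["sum"]

# A (top, bottom) tuple addresses a single column.
east_mean_sales = both.loc["East", ("mean", "Sales")]

print(both.columns.tolist())
print(sums)
print(east_mean_sales)
```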


### 5. Different Aggregations Per Column

Calculate the sum of units and the mean of sales:

In [36]:
pivot = pd.pivot_table(
    data=df,
    index='Region',
    aggfunc={'Sales': 'mean', 'Units': 'sum'}
)
pivot.columns = ["Sales(mean)", "Units(sum)"]

print(pivot)

# Expected:
#        Sales(mean)  Units(sum)
# Region                    
# East    408.182482      8110.0
# North   438.924051      4359.0
# South   432.956204      2798.0
# West    452.029412      2624.0

        Sales(mean)  Units(sum)
Region                         
East     408.182482      8110.0
North    438.924051      4359.0
South    432.956204      2798.0
West     452.029412      2624.0
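
Assigning a list to `pivot.columns` relabels columns by position, so it depends on the order in which pandas returns them. One alternative, sketched on made-up rows, is to rename by the existing column name so the labels cannot be swapped:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Units": [10.0, 20.0, 5.0, 15.0],
    "Sales": [100, 300, 200, 400],
})

per_col = pd.pivot_table(
    toy, index="Region", aggfunc={"Sales": "mean", "Units": "sum"}
)

# rename maps old name -> new name, independent of column order.
labelled = per_col.rename(
    columns={"Sales": "Sales(mean)", "Units": "Units(sum)"}
)

print(labelled)
```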


Split the data by the Type column.

In [40]:
pivot = pd.pivot_table(
    data=df,
    index='Region',
    columns='Type',
    values='Sales'
)

print(pivot)

# Expected:
# Type    Children's Clothing  Men's Clothing  Women's Clothing
# Region                                                       
# East             405.743363      423.647541        399.028409
# North            438.894118      449.157303        432.528169
# South            412.666667      475.435897        418.924528
# West             480.523810      465.292683        419.188679

Type    Children's Clothing  Men's Clothing  Women's Clothing
Region                                                       
East             405.743363      423.647541        399.028409
North            438.894118      449.157303        432.528169
South            412.666667      475.435897        418.924528
West             480.523810      465.292683        419.188679
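
The `columns` argument spreads one grouping key across the column axis. A sketch on made-up rows showing that this gives the same numbers as grouping by both keys and unstacking the second one:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Region": ["East", "East", "East", "West", "West", "West"],
    "Type": ["Men's Clothing", "Men's Clothing", "Women's Clothing",
             "Men's Clothing", "Women's Clothing", "Women's Clothing"],
    "Sales": [100, 300, 200, 400, 100, 500],
})

# Region down the rows, Type across the columns, mean of Sales in the cells.
wide = pd.pivot_table(toy, index="Region", columns="Type", values="Sales")

# Equivalent: group by both keys, then move Type to the columns.
same = toy.groupby(["Region", "Type"])["Sales"].mean().unstack("Type")

print(wide)
print(same)
```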


Add a date dimension to the pivot table as part of a multi-level index.

In [45]:
pivot = pd.pivot_table(
    data=df,
    index=['Region','Date'],
    columns='Type',
    values='Sales'
)

print(pivot.head())

# Returns:
# Type               Children's Clothing  Men's Clothing  Women's Clothing
# Region Date                                                             
# East   2020-01-01                  NaN           234.0             322.0
#        2020-01-02                204.0             NaN             374.0
#        2020-01-04                  NaN           330.0               NaN
#        2020-01-07                  NaN           320.0               NaN
#        2020-01-09                  NaN           352.0               NaN

Type               Children's Clothing  Men's Clothing  Women's Clothing
Region Date                                                             
East   2020-01-01                  NaN           234.0             322.0
       2020-01-02                204.0             NaN             374.0
       2020-01-04                  NaN           330.0               NaN
       2020-01-07                  NaN           320.0               NaN
       2020-01-09                  NaN           352.0               NaN
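
The NaN cells above mark Region/Date/Type combinations that have no rows in the data. If a placeholder is preferred, `pivot_table` accepts a `fill_value`. A sketch on made-up rows:

```python
import pandas as pd

# Made-up rows for illustration only; two Region/Date/Type combinations are absent.
toy = pd.DataFrame({
    "Region": ["East", "East", "East", "West"],
    "Date": pd.to_datetime(
        ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-01"]
    ),
    "Type": ["Men's Clothing", "Women's Clothing",
             "Men's Clothing", "Women's Clothing"],
    "Sales": [234, 322, 330, 410],
})

# Missing combinations appear as NaN.
sparse = pd.pivot_table(
    toy, index=["Region", "Date"], columns="Type", values="Sales"
)

# fill_value replaces those NaN cells in the result.
filled = pd.pivot_table(
    toy, index=["Region", "Date"], columns="Type", values="Sales", fill_value=0
)

print(sparse)
print(filled)
```

Whether 0 is the right placeholder is a judgment call: a filled 0 reads as "no sales", while NaN reads as "no data".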


Create a multi-level index.

Alongside the regions, break the sales down by quarter of the year.

In [54]:
pivot = pd.pivot_table(
    data=df,
    index=['Region',df['Date'].dt.quarter],
    columns='Type',
    values='Sales'
)

print(pivot.head())

# Expected:
# Type         Children's Clothing  Men's Clothing  Women's Clothing
# Region Date                                                       
# East   1              423.241379      369.250000        428.948718
#        2              274.800000      445.425000        456.816327
#        3              425.382353      506.421053        342.386364
#        4              453.866667      405.666667        364.795455
# North  1              394.727273      450.869565        489.944444

Type         Children's Clothing  Men's Clothing  Women's Clothing
Region Date                                                       
East   1              423.241379      369.250000        428.948718
       2              274.800000      445.425000        456.816327
       3              425.382353      506.421053        342.386364
       4              453.866667      405.666667        364.795455
North  1              394.727273      450.869565        489.944444
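
Because the quarter is derived from the Date column, the second index level above is still labelled `Date`. One alternative, sketched on made-up rows with a hypothetical `Quarter` column name, is to add the quarter as its own column first so the level gets a descriptive name:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Region": ["East", "East", "East", "West"],
    "Date": pd.to_datetime(
        ["2020-01-15", "2020-02-10", "2020-05-01", "2020-11-20"]
    ),
    "Type": ["Men's Clothing"] * 4,
    "Sales": [100, 300, 200, 400],
})

# Derive the quarter (1 to 4) as a named column, then pivot on it.
by_quarter = pd.pivot_table(
    toy.assign(Quarter=toy["Date"].dt.quarter),
    index=["Region", "Quarter"],
    columns="Type",
    values="Sales",
)

print(by_quarter)
```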


List only the intersection of East Region, Quarter 2, and Men’s clothing

In [56]:
print(pivot.loc[('East', 2), "Men's Clothing"])

445.425
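
With a two-level row index, `.loc` takes a tuple with one entry per level, followed by the column label. A sketch on made-up rows (the `Quarter` column name is an assumption for illustration) showing single-cell, single-region and single-quarter selection:

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Region": ["East", "East", "East", "West", "West"],
    "Quarter": [1, 2, 2, 2, 2],
    "Type": ["Men's Clothing", "Men's Clothing", "Women's Clothing",
             "Men's Clothing", "Women's Clothing"],
    "Sales": [100, 250, 150, 400, 300],
})

by_q = pd.pivot_table(
    toy, index=["Region", "Quarter"], columns="Type", values="Sales"
)

# One cell: (Region, Quarter) tuple for the row, then the column label.
cell = by_q.loc[("East", 2), "Men's Clothing"]

# All quarters for one region: index by the first level only.
east_rows = by_q.loc["East"]

# One quarter across all regions: cross-section on the second level.
q2_rows = by_q.xs(2, level="Quarter")

print(cell)
print(east_rows)
print(q2_rows)
```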


Add totals to the pivot table. The total column and the total row should both be named Total.

In [58]:
pivot = pd.pivot_table(
    data=df,
    index='Region',
    columns='Type',
    values='Sales',
    margins=True,
    margins_name="Total"
)

pivot

Type    Children's Clothing  Men's Clothing  Women's Clothing       Total
Region                                                                   
East             405.743363      423.647541        399.028409  408.182482
North            438.894118      449.157303        432.528169  438.924051
South            412.666667      475.435897        418.924528  432.956204
West             480.523810      465.292683        419.188679  452.029412
Total            427.743860      444.257732        415.254717  427.254000
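
With the default mean aggregation, the Total margins are computed from the underlying rows, not by averaging the cells already in the table. A sketch on made-up rows with unequal group sizes, where the two differ:

```python
import pandas as pd

# Made-up rows for illustration only: East has three rows, West has two.
toy = pd.DataFrame({
    "Region": ["East", "East", "East", "West", "West"],
    "Type": ["Men's Clothing", "Men's Clothing", "Women's Clothing",
             "Men's Clothing", "Women's Clothing"],
    "Sales": [100, 300, 200, 400, 600],
})

with_totals = pd.pivot_table(
    toy,
    index="Region",
    columns="Type",
    values="Sales",
    margins=True,
    margins_name="Total",
)

# Grand total cell: the mean over all five rows.
grand = with_totals.loc["Total", "Total"]

# Averaging the per-region totals gives a different number.
mean_of_region_means = with_totals.loc[["East", "West"], "Total"].mean()

print(with_totals)
print(grand, mean_of_region_means)
```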
