### Prepping Data Challenge: Bike Accessory Sales (week 3)

This challenge introduces two main ways of reshaping data: pivoting and aggregation. It also touches on unions, where data sets with similar structures are stacked on top of each other.

#### Requirements:
 1. Input the data source by pulling together all the tables
 2. Pivot 'New' columns and 'Existing' columns
 3. Split the former column headers to form: 
    - Customer Type
    - Product
 4. Rename the measure created by the Pivot as 'Products Sold'
 5. Create a Store column from the data
 6. Remove any unnecessary data fields
 7. Turn Date into Quarter 
 8. Aggregate to form two separate outputs of the number of products sold by: 
    - Product, Quarter
    - Store, Customer Type, Product
 9. Output each data set as a csv file 

### 1 & 5. Input the data source by pulling together all the tables & Create a Store Column from the data

In [1]:
#import libraries
import pandas as pd

In [2]:
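xlsx = pd.ExcelFile('WK3-Bike Accessory Sales.xlsx')

# Read every sheet, tag its rows with the sheet (store) name, then union them in one concat
frames = []
for sheet_name in xlsx.sheet_names:
    sheet_df = xlsx.parse(sheet_name)
    sheet_df['Store'] = sheet_name
    frames.append(sheet_df)

df = pd.concat(frames)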

In [3]:
df.head(10)

| | Date | New - Saddles | New - Mudguards | New - Wheels | New - Bags | Existing - Saddles | Existing - Mudguards | Existing - Wheels | Existing - Bags | Store |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-01-21 | 13.0 | 42.0 | 19.0 | 38.0 | 17.0 | 48.0 | 19.0 | 13.0 | Manchester |
| 1 | 2021-02-21 | 1.0 | 9.0 | 14.0 | 6.0 | 2.0 | 4.0 | 19.0 | 24.0 | Manchester |
| 2 | 2021-03-21 | 8.0 | 22.0 | 6.0 | 35.0 | 0.0 | 48.0 | 17.0 | 16.0 | Manchester |
| 3 | 2021-04-21 | 3.0 | 9.0 | 8.0 | 16.0 | 18.0 | 50.0 | 18.0 | 25.0 | Manchester |
| 4 | 2021-05-21 | 2.0 | 8.0 | 5.0 | 34.0 | 17.0 | 3.0 | 12.0 | 19.0 | Manchester |
| 5 | 2021-06-21 | 11.0 | 2.0 | 6.0 | 8.0 | 2.0 | 8.0 | 3.0 | 1.0 | Manchester |
| 6 | 2021-07-21 | 16.0 | 5.0 | 15.0 | 37.0 | 19.0 | 1.0 | 7.0 | 28.0 | Manchester |
| 7 | 2021-08-21 | 10.0 | 7.0 | 18.0 | 27.0 | 10.0 | 4.0 | 8.0 | 9.0 | Manchester |
| 8 | 2021-09-21 | 15.0 | 25.0 | 1.0 | 38.0 | 18.0 | 9.0 | 0.0 | 23.0 | Manchester |
| 9 | 2021-10-21 | 9.0 | 11.0 | 11.0 | 0.0 | 18.0 | 10.0 | 17.0 | 7.0 | Manchester |
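
The union step can be tried without the workbook. The sketch below is only an illustration: the two small frames stand in for worksheets, and their values are invented. It tags each frame with its store name and stacks them, using `ignore_index=True` so the combined frame gets a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical stand-ins for two worksheets that share the same columns
# (the numbers are invented for illustration)
sheets = {
    "Manchester": pd.DataFrame({"Date": ["2021-01-21", "2021-02-21"], "New - Saddles": [13, 1]}),
    "Birmingham": pd.DataFrame({"Date": ["2021-01-21", "2021-02-21"], "New - Saddles": [7, 4]}),
}

frames = []
for store, sheet_df in sheets.items():
    tagged = sheet_df.copy()
    tagged["Store"] = store  # record which sheet each row came from
    frames.append(tagged)

# Stack the frames on top of each other; ignore_index renumbers the rows
union = pd.concat(frames, ignore_index=True)
print(union)
```

When reading from a real workbook, `pd.read_excel(path, sheet_name=None)` returns a similar dictionary of sheet name to DataFrame, so the same loop would apply.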


### 2 & 4. Pivot 'New' columns and 'Existing' columns & Rename the measure created as 'Product Sold'

__pandas.melt()__ unpivots a DataFrame from wide format to long format. The _melt()_ function is useful for massaging a DataFrame into a format where one or more columns are identifier variables (`id_vars`), while all other columns, treated as measured variables, are unpivoted to the row axis. This leaves just two non-identifier columns, named `variable` and `value` by default; here they are renamed with `var_name` and `value_name`.
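
To see the reshape in isolation, here is a minimal sketch on a two-row, two-measure slice of the data. The frame is built by hand for illustration; only the column names and the call mirror the notebook.

```python
import pandas as pd

# A tiny wide frame: one row per date, one column per customer type / product
wide = pd.DataFrame({
    "Date": ["2021-01-21", "2021-02-21"],
    "Store": ["Manchester", "Manchester"],
    "New - Saddles": [13, 1],
    "Existing - Saddles": [17, 2],
})

# Date and Store stay as identifiers; every other column becomes a
# (Customer, Product Sold) pair on its own row
long = pd.melt(wide, id_vars=["Date", "Store"],
               var_name="Customer", value_name="Product Sold")
print(long)
```

Two rows and two measure columns become four rows, one per date and former column header.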

In [4]:
df = pd.melt(df, id_vars=['Date','Store'], var_name='Customer', value_name='Product Sold')
df

| | Date | Store | Customer | Product Sold |
|---|---|---|---|---|
| 0 | 2021-01-21 | Manchester | New - Saddles | 13.0 |
| 1 | 2021-02-21 | Manchester | New - Saddles | 1.0 |
| 2 | 2021-03-21 | Manchester | New - Saddles | 8.0 |
| 3 | 2021-04-21 | Manchester | New - Saddles | 3.0 |
| 4 | 2021-05-21 | Manchester | New - Saddles | 2.0 |
| ... | ... | ... | ... | ... |
| 475 | 2021-08-21 | Birmingham | Existing - Bags | 11.0 |
| 476 | 2021-09-21 | Birmingham | Existing - Bags | 24.0 |
| 477 | 2021-10-21 | Birmingham | Existing - Bags | 16.0 |
| 478 | 2021-11-21 | Birmingham | Existing - Bags | 12.0 |


###  3. Split the former column headers to form: 
   - Customer Type
   - Product

In [5]:
# Split on the full ' - ' separator so the new columns carry no stray spaces
df[['Customer Type','Product']] = df['Customer'].str.split(' - ', expand=True)
#df
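
The headers being split look like `New - Saddles`, with a space on each side of the hyphen, so the choice of separator matters. A small sketch on two of those headers, comparing a bare `'-'` with `' - '`:

```python
import pandas as pd

headers = pd.Series(["New - Saddles", "Existing - Bags"])

# Splitting on the hyphen alone keeps the surrounding spaces in each piece
loose = headers.str.split("-", expand=True)

# Splitting on the full separator returns clean values
tight = headers.str.split(" - ", expand=True)

print(loose[0].tolist())  # ['New ', 'Existing ']
print(tight[0].tolist())  # ['New', 'Existing']
```

Stray spaces are easy to miss in a printed DataFrame but would produce group labels such as `'New '` in later aggregations; calling `.str.strip()` on each new column is an alternative fix.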

### 6. Remove any unnecessary data fields

In [6]:
df.drop('Customer', axis = 'columns', inplace=True)

### 7. Turn Date into Quarter 

In [7]:
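# Make sure Date is a datetime column; the format string only matters if the dates
# arrive as text such as 21/01/2021 (dates already parsed from Excel pass through unchanged)
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
# Extract the calendar quarter (1-4) from each date
df['Quarter'] = df['Date'].dt.quarter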
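
As a quick check of the quarter logic, the sketch below parses four dates, one per quarter. It assumes the dates are ISO-style text (`YYYY-MM-DD`), which may not match how the workbook stores them; the `format` string has to match whatever layout the text actually uses.

```python
import pandas as pd

# Assumed ISO-style date strings, one per calendar quarter
dates = pd.Series(["2021-01-21", "2021-04-21", "2021-09-21", "2021-12-21"])

# Parse to datetime, then read the quarter number through the .dt accessor
parsed = pd.to_datetime(dates, format="%Y-%m-%d")
quarters = parsed.dt.quarter

print(quarters.tolist())  # [1, 2, 3, 4]
```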

###  8. Aggregate to form two separate outputs of the number of products sold by: 
   - Product, Quarter
   - Store, Customer Type, Product

In [8]:
Prod_Qt = df.groupby(['Product', 'Quarter'])['Product Sold'].sum().reset_index()
Cust_prod_Qt = df.groupby(['Store', 'Customer Type','Product'])['Product Sold'].sum().reset_index()
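
The aggregation pattern can be sketched on a handful of invented rows. The frame below is hypothetical; only the column names follow the notebook.

```python
import pandas as pd

# Invented long-format rows, shaped like the melted data
sales = pd.DataFrame({
    "Store": ["Manchester", "Manchester", "Birmingham", "Birmingham"],
    "Product": ["Saddles", "Saddles", "Saddles", "Bags"],
    "Quarter": [1, 1, 1, 2],
    "Product Sold": [13, 1, 5, 11],
})

# Sum the measure within each Product / Quarter group, then turn the
# group keys back into ordinary columns with reset_index()
by_product_quarter = (
    sales.groupby(["Product", "Quarter"])["Product Sold"].sum().reset_index()
)
print(by_product_quarter)
```

The three Saddles rows in quarter 1 collapse into a single row with a total of 19, and the groups come back sorted by key (Bags before Saddles).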

### 9. Output each data set as a csv file 

In [9]:
Prod_Qt.to_csv('WK3-Product Quarter Output.csv', index=False)
Cust_prod_Qt.to_csv('WK3-Customer Product Quarter output.csv', index=False)
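
A short sketch of what `index=False` changes, using an invented two-row summary. Calling `to_csv` without a path returns the CSV text instead of writing a file, which makes the difference easy to inspect.

```python
import pandas as pd

# Invented summary frame, shaped like one of the aggregated outputs
summary = pd.DataFrame({
    "Product": ["Bags", "Saddles"],
    "Quarter": [2, 1],
    "Product Sold": [11, 19],
})

# With no path argument, to_csv returns the CSV content as a string
with_index = summary.to_csv()
without_index = summary.to_csv(index=False)

print(with_index.splitlines()[0])     # header starts with an empty index column
print(without_index.splitlines()[0])  # header holds only the data columns
```

Leaving the index out keeps the row numbers from appearing as an unnamed first column when the file is read back.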

In [10]:
df.head()

| | Date | Store | Product Sold | Customer Type | Product | Quarter |
|---|---|---|---|---|---|---|
| 0 | 2021-01-21 | Manchester | 13.0 | New | Saddles | 1 |
| 1 | 2021-02-21 | Manchester | 1.0 | New | Saddles | 1 |
| 2 | 2021-03-21 | Manchester | 8.0 | New | Saddles | 1 |
| 3 | 2021-04-21 | Manchester | 3.0 | New | Saddles | 2 |
| 4 | 2021-05-21 | Manchester | 2.0 | New | Saddles | 2 |
