### Prepping Data Challenge: Multi Sheets of Madness (Week 21)
There are 12 sheets from different shops, each reporting the Key Metrics that we are interested in. The Additional Metrics in a table below them are not of interest to us for this challenge.
 
### Requirements
- Connect to the data
- Bring together the Key Metrics tables from each Shop
- You'll notice that we have fields which report the quarter in addition to the monthly values. We only wish to keep the monthly values
- Reshape the data so that we have a Date field
- For Orders and Returns, we are only interested in reporting % values, whilst for Complaints we are only interested in the # Received
- We wish to update the Breakdown field to include the Department to make the Measure Name easier to interpret
- We wish to have a field for each of the measures rather than a row per measure
- We wish to have the targets for each measure as a field that we can compare each measure to
- Output the data

In [1]:
import pandas as pd
import numpy as np

In [2]:
#Read every sheet (the header is on the fourth row) and tag each row with its shop name
with pd.ExcelFile("WK21-Input.xlsx") as xl:
    frames = [pd.read_excel(xl, sheet_name=s, header=3).assign(Shop=s) for s in xl.sheet_names]
df = pd.concat(frames)

In [3]:
#Keep only the Key Metrics rows by dropping the HR, Additional Metrics and repeated header rows
df = df[~df['Department'].isin(['HR', 'Additonal Metrics', 'Department'])]

In [4]:
#Department and Target are blank on some rows, so carry the last seen value down
df['Department'] = df['Department'].ffill()
df['Target'] = df['Target'].ffill()

In [5]:
#We only wish to keep the monthly values
df = df.drop(columns=['FY22 Q1 ','FY22 Q2','FY22 Q3','FY22 Q4','Comments'])

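The hard-coded list above depends on the exact header text, including the trailing space in 'FY22 Q1 '. As a possible alternative, the sketch below finds the quarter columns by their text prefix. The toy frame, its values, and the names `frame`, `quarter_cols` and `monthly` are made up for illustration, and it assumes the quarter headers are strings that start with "FY".

```python
import pandas as pd

# Toy frame standing in for one sheet; all values are invented.
frame = pd.DataFrame({
    "Breakdown": ["# Received"],
    pd.Timestamp("2021-07-01"): [25.0],
    "FY22 Q1 ": [41.0],
    pd.Timestamp("2021-10-01"): [28.0],
    "FY22 Q2": [52.0],
    "Comments": [None],
})

# Assumption: quarter headers are text beginning with "FY"; month headers are dates.
quarter_cols = [
    c for c in frame.columns
    if isinstance(c, str) and c.strip().startswith("FY")
]

# Drop the quarter totals and the free-text Comments column, keeping the months.
monthly = frame.drop(columns=quarter_cols + ["Comments"])
```

This keeps working if a header gains or loses stray whitespace, at the cost of relying on the "FY" naming pattern.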
In [6]:
#For Orders and Returns, we are only interested in reporting % values, 
#whilst for Complaints we are only interested in the # Received 
df = df[(df['Breakdown'].isin(['% Shipped in 3 days','% Shipped in 5 days','% Processed in 3 days',
                               '% Processed in 5 days','# Received']))]

In [7]:
#Reshape the data so that we have a Date field
df_pivot = pd.melt(df, id_vars=['Department','Target','Breakdown','Shop'], var_name='Date', value_name='values')

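To show what the reshape does, here is a minimal sketch of a wide-to-long melt on a toy frame with the same column layout as `df`. The values are invented, and the names `wide_df`, `long_df` and the `Value` column are assumptions for illustration only.

```python
import pandas as pd

# Toy wide frame: one row per measure, one column per month (values invented).
wide_df = pd.DataFrame({
    "Department": ["Orders", "Complaints"],
    "Target": [">95%", 0],
    "Breakdown": ["% Shipped in 3 days", "# Received"],
    pd.Timestamp("2021-07-01"): [0.91, 25.0],
    pd.Timestamp("2021-08-01"): [0.88, 6.0],
    "Shop": ["Bath", "Bath"],
})

# Every column that is not an id column is treated as a month and stacked into rows.
long_df = wide_df.melt(
    id_vars=["Department", "Target", "Breakdown", "Shop"],
    var_name="Date",
    value_name="Value",
)

# The former column headers are stored as generic objects, so convert them to datetimes.
long_df["Date"] = pd.to_datetime(long_df["Date"])
```

Each measure now has one row per month, which is the shape the later steps need.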
In [8]:
df.head(7)

Unnamed: 0,Department,Target,Breakdown,2021-07-01 00:00:00,2021-08-01 00:00:00,2021-09-01 00:00:00,2021-10-01 00:00:00,2021-11-01 00:00:00,2021-12-01 00:00:00,2022-01-01 00:00:00,2022-02-01 00:00:00,2022-03-01 00:00:00,2022-04-01 00:00:00,2022-05-01 00:00:00,2022-06-01 00:00:00,Shop
1,Orders,>95%,% Shipped in 3 days,0.91,0.88,0.85,0.87,0.86,0.86,0.88,0.8,0.92,0.94,NaT,NaT,Bath
3,Orders,>99%,% Shipped in 5 days,0.99,0.94,0.95,0.89,0.91,0.86,0.96,0.8,0.94,1.0,NaT,NaT,Bath
5,Returns,>80%,% Processed in 3 days,0.88,0.83,0.89,0.75,0.77,0.84,0.67,0.67,0.99,0.97,NaT,NaT,Bath
7,Returns,>95%,% Processed in 5 days,0.91,0.85,0.91,0.85,0.79,0.91,0.7,0.76,1.0,1.0,NaT,NaT,Bath
8,Complaints,0,# Received,25.0,6.0,10.0,28.0,11.0,13.0,9.0,3.0,14.0,33.0,NaT,NaT,Bath
1,Orders,>95%,% Shipped in 3 days,0.84,0.84,0.96,0.84,0.89,0.97,0.9,0.83,0.8,0.8,NaT,NaT,Torquay
3,Orders,>99%,% Shipped in 5 days,0.92,0.89,0.98,0.85,0.93,1.0,0.92,0.93,0.9,0.8,NaT,NaT,Torquay


In [None]:
#We wish to update the Breakdown field to include the Department to make the Measure Name easier to interpret

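One way this step could look is sketched below on a toy frame. The " - " separator and the name `frame` are assumptions, since the requirement does not say how the two fields should be joined.

```python
import pandas as pd

# Toy rows using the Department and Breakdown values seen in the notebook output.
frame = pd.DataFrame({
    "Department": ["Orders", "Returns", "Complaints"],
    "Breakdown": ["% Shipped in 3 days", "% Processed in 3 days", "# Received"],
})

# Prefix each Breakdown with its Department; the separator is an assumption.
frame["Breakdown"] = frame["Department"].str.cat(frame["Breakdown"], sep=" - ")
```

With the department in front, "% Shipped in 3 days" and "% Processed in 3 days" can no longer be confused once they become column names.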

In [None]:
#We wish to have a field for each of the measures rather than a row per measure

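A hedged sketch of this step, assuming the data has already been melted to one row per shop, date and measure. The frame, its values, and the column names `Measure` and `Value` are made up for illustration.

```python
import pandas as pd

# Toy long frame: one row per shop, month and measure (values invented).
long_df = pd.DataFrame({
    "Shop": ["Bath"] * 4,
    "Date": pd.to_datetime(["2021-07-01", "2021-07-01", "2021-08-01", "2021-08-01"]),
    "Measure": ["Orders - % Shipped in 3 days", "Complaints - # Received"] * 2,
    "Value": [0.91, 25.0, 0.88, 6.0],
})

# Spread the measures into columns. Each shop/date/measure combination is
# expected to be unique, so "first" simply picks that single value.
wide_df = (
    long_df.pivot_table(index=["Shop", "Date"], columns="Measure",
                        values="Value", aggfunc="first")
    .reset_index()
)
wide_df.columns.name = None  # drop the leftover "Measure" label on the column axis
```

The result has one row per shop and month, with a column for each measure.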

In [None]:
#We wish to have the targets for each measure as field that we can compare each measure to

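The targets in the sheet are text such as ">95%" alongside plain numbers such as 0, so they need parsing before they can be compared with the measures. The sketch below is one possible approach, not a definitive one: `parse_target`, the "Target - " prefix, and the toy frame are all hypothetical, and the comparison direction (">") is discarded.

```python
import pandas as pd

def parse_target(raw):
    """Turn '>95%' into 0.95 and leave plain numbers such as 0 as floats."""
    text = str(raw).strip().lstrip("<>=")
    if text.endswith("%"):
        return float(text.rstrip("%")) / 100
    return float(text)

# Toy frame: one target per shop and measure (values taken from the output above).
long_df = pd.DataFrame({
    "Shop": ["Bath", "Bath"],
    "Measure": ["Orders - % Shipped in 3 days", "Complaints - # Received"],
    "Target": [">95%", 0],
})
long_df["Target"] = long_df["Target"].map(parse_target)

# One target column per measure, prefixed so it cannot clash with the measure itself.
targets = (
    long_df.pivot_table(index="Shop", columns="Measure",
                        values="Target", aggfunc="first")
    .add_prefix("Target - ")
    .reset_index()
)
targets.columns.name = None
```

A frame like `targets` could then be merged onto the per-measure columns on `Shop`, giving each measure a matching target field.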

In [None]:
output.head(10)

In [None]:
#output the data 
output.to_excel('wk21-output.xlsx', index=False)