### Prepping Data Challenge: Excelling with lookups (Week 34)

This week's scenario looks at Employee Sales at Allchains for each month of the year so far. We want to compare these sales with each employee's Monthly Target, which is stored on another sheet.

### Requirements
- Input data
- Calculate the Average Monthly Sales for each employee
- In the Targets sheet the Store Name needs cleaning up
- Filter the data so that only employees who are below 90% of their target on average remain
- For these employees, we also want to know the % of months that they met/exceeded their target
- Output the data

In [1]:
import pandas as pd
import numpy as np

In [2]:
#Input the data: both sheets live in the same workbook
with pd.ExcelFile('Wk34-Input.xlsx') as xl:
    sales = pd.read_excel(xl, 'Employee Sales')
    target = pd.read_excel(xl, 'Employee Targets')

In [3]:
sales = sales.melt(id_vars = ['Store', 'Employee'], var_name='Month', value_name='Sales')
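
The reshape above turns one column per month into one row per employee per month. A minimal sketch of the same `melt` call on a made-up two-month table (the values and the name `wide` are illustrative, not taken from the workbook):

```python
import pandas as pd

# Made-up wide table: one row per employee, one column per month.
wide = pd.DataFrame({
    "Store": ["Stratford", "Stratford"],
    "Employee": ["Julie", "Pete"],
    "2021-01-01": [100, 200],
    "2021-02-01": [110, 210],
})

# melt keeps the id columns and stacks every other column into
# (Month, Sales) pairs, giving one row per employee per month.
long = wide.melt(id_vars=["Store", "Employee"], var_name="Month", value_name="Sales")
print(long)
```

Two employees and two month columns give four rows, with the old column headers now stored in `Month`.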

In [4]:
#Calculate the Average Monthly Sales for each employee
sales['Avg monthly Sales'] = sales.groupby(['Store','Employee'])['Sales'].transform('mean').round()
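
`transform('mean')` writes the group mean back onto every row rather than collapsing to one row per group. A small sketch on made-up numbers (the frame `toy` is hypothetical) that contrasts it with a plain aggregation:

```python
import pandas as pd

# Made-up sales: two months for each of two employees.
toy = pd.DataFrame({
    "Employee": ["Julie", "Julie", "Pete", "Pete"],
    "Sales": [100, 200, 300, 500],
})

# transform returns a Series aligned to the original rows,
# so every row carries its own employee's mean.
toy["Avg"] = toy.groupby("Employee")["Sales"].transform("mean")

# A plain aggregation returns one value per employee instead.
per_employee = toy.groupby("Employee")["Sales"].mean()

print(toy)
print(per_employee)
```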

In [5]:
sales.head()

| | Store | Employee | Month | Sales | Avg monthly Sales |
|---|---|---|---|---|---|
| 0 | Stratford | Julie | 2021-01-01 | 3302 | 5005.0 |
| 1 | Stratford | Pete | 2021-01-01 | 4052 | 5485.0 |
| 2 | Stratford | Jose | 2021-01-01 | 5226 | 4073.0 |
| 3 | Stratford | Andre | 2021-01-01 | 9369 | 5908.0 |
| 4 | Stratford | Edward | 2021-01-01 | 7854 | 6055.0 |


In [6]:
target['Store'].unique()

array(['Stratfod', 'Stratford', 'Stratfodd', 'Statford', 'Straford',
       'Wimbledan', 'Wimbledon', 'Vimbledon', 'Wimbledone', 'Bristoll',
       'Bristol', 'Bristal', 'Bristole', 'York', 'Yor', 'Yorkk', 'Yark'],
      dtype=object)

In [7]:
#In the Targets sheet the Store Name needs cleaning up
#Map a regex for each store's misspellings to the correct name.
#Every variant seen above is identifiable from its first letters.
spellcheck = {'^S.*':'Stratford', '^[WV]?im.*':'Wimbledon', '^B.*':'Bristol', '^Y.*':'York'}

target['Store'] = target['Store'].replace(spellcheck, regex=True)

In [8]:
target['Store'].unique()

array(['Stratford', 'Wimbledon', 'Bristol', 'York'], dtype=object)
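
The regex patterns above depend on each store starting with a distinct letter. As an alternative (not the approach used in this notebook), a fuzzy match against the known store names could handle misspellings without hand-written patterns. This is only a sketch: it uses `difflib.get_close_matches` from the standard library, the helper `closest_store` is a hypothetical name, and the sample is a subset of the variants listed earlier.

```python
import difflib
import pandas as pd

# The four correct store names.
canonical = ["Stratford", "Wimbledon", "Bristol", "York"]

# A subset of the misspellings seen in the Targets sheet.
messy = pd.Series(["Stratfod", "Statford", "Vimbledon", "Wimbledone",
                   "Bristal", "Yor", "Yark"])

def closest_store(name):
    """Return the most similar canonical name, or the input if none is close enough."""
    matches = difflib.get_close_matches(name, canonical, n=1, cutoff=0.6)
    return matches[0] if matches else name

cleaned = messy.map(closest_store)
print(cleaned.unique())
```

The trade-off is that a fuzzy match can pick the wrong store when names are similar to each other, so checking `unique()` afterwards is still worthwhile.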

In [9]:
target.head()

| | Store | Employee | Monthly Target |
|---|---|---|---|
| 0 | Stratford | Julie | 5000 |
| 1 | Stratford | Pete | 5000 |
| 2 | Stratford | Jose | 5000 |
| 3 | Stratford | Andre | 6000 |
| 4 | Stratford | Edward | 6000 |


In [10]:
#Look up each employee's target, then flag the months where sales met or exceeded it
df = sales.merge(target, on=['Store','Employee'], how='left')
df['met target'] = np.where(df['Sales'] >= df['Monthly Target'], 1, 0)
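
With a left join, any sales row that has no matching target gets a missing `Monthly Target`, and the comparison above would then quietly score that month as 0. One way to guard against this, sketched on made-up data (the frames `sales_toy` and `target_toy` are hypothetical), is to use the `validate` and `indicator` options of `merge`:

```python
import pandas as pd

# Made-up long-format sales; Bob has no row in the targets table.
sales_toy = pd.DataFrame({
    "Store": ["York", "York", "Bristol"],
    "Employee": ["Ann", "Ann", "Bob"],
    "Sales": [900, 1100, 700],
})
target_toy = pd.DataFrame({
    "Store": ["York"],
    "Employee": ["Ann"],
    "Monthly Target": [1000],
})

# validate raises if a (Store, Employee) pair appears more than once in
# the targets; indicator adds a _merge column recording the match status.
checked = sales_toy.merge(
    target_toy,
    on=["Store", "Employee"],
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Rows found only on the left side have no target to compare against.
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched)
```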

In [11]:
#Summarise per employee: average monthly sales, share of months the target was met/exceeded, and the monthly target
df = df.groupby(['Store','Employee']).agg(a_m_s = ('Sales','mean'),
                                         p_m_t = ('met target','mean'),
                                         m_t = ('Monthly Target','mean'))

In [12]:
#Filter the data so that only employees who are below 90% of their target on average remain
df = df.loc[df['a_m_s'] < df['m_t']*0.9]
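
The aggregation and the 90% filter can be checked by hand on a small case. The sketch below uses made-up figures for two hypothetical employees, with column names chosen for readability rather than matching the notebook's:

```python
import pandas as pd

# Made-up data: four months each, both with a target of 1000.
toy = pd.DataFrame({
    "Employee": ["Ann"] * 4 + ["Bob"] * 4,
    "Sales": [800, 900, 1100, 700, 950, 1050, 1000, 980],
    "Monthly Target": [1000] * 8,
})
toy["met target"] = (toy["Sales"] >= toy["Monthly Target"]).astype(int)

# The mean of a 0/1 flag is the share of months in which the target was met.
summary = toy.groupby("Employee").agg(
    avg_sales=("Sales", "mean"),
    share_met=("met target", "mean"),
    target=("Monthly Target", "mean"),
)

# Keep only employees whose average is below 90% of their target.
below = summary[summary["avg_sales"] < 0.9 * summary["target"]]
print(below)
```

Ann averages 875, which is under the 900 threshold, so she remains, with the target met in 1 of 4 months. Bob averages 995 and is filtered out.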

In [13]:
df['p_m_t'] = (df['p_m_t']*100).round(0)
df['a_m_s'] = df['a_m_s'].round(0)

In [14]:
df.rename(columns={'a_m_s':'Avg monthly Sales','p_m_t':'% of months target met','m_t':'Monthly Target'}, inplace=True)

In [15]:
df.head()

| Store | Employee | Avg monthly Sales | % of months target met | Monthly Target |
|---|---|---|---|---|
| Stratford | Jose | 4073.0 | 57.0 | 5000.0 |
| Wimbledon | Edward | 4391.0 | 29.0 | 5000.0 |
| Wimbledon | Francis | 4447.0 | 43.0 | 5000.0 |
| Wimbledon | Quentin | 3387.0 | 43.0 | 4000.0 |


In [16]:
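#Output the data (Store and Employee are index levels after the groupby, so move them back into columns first)
df.reset_index().to_csv('wk34-output.csv', index=False)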
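
Because the grouped frame holds Store and Employee in its index, it is worth confirming that both reach the file. A minimal sketch with a made-up one-row result, writing to in-memory buffers instead of disk:

```python
import io
import pandas as pd

# Made-up result with the same index structure as the grouped frame.
result = pd.DataFrame(
    {"Avg monthly Sales": [875.0]},
    index=pd.MultiIndex.from_tuples([("York", "Ann")], names=["Store", "Employee"]),
)

# index=False on its own discards the index levels entirely.
without_index = io.StringIO()
result.to_csv(without_index, index=False)

# reset_index first turns the levels into ordinary columns.
with_columns = io.StringIO()
result.reset_index().to_csv(with_columns, index=False)

print(without_index.getvalue())
print(with_columns.getvalue())
```

The first buffer contains only the value column, while the second also includes `Store` and `Employee`.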