### Preppin' Data Challenge: C&BSCo Preppin' Parameters (Week 31)
 
### Requirements
- Input the data
- Split the Product Name field into Product Type and Size
- Only keep the Liquid products
- Total up the sales for each Product Size and Scent for each Store
- Rank each of the Product Size and Scent combinations for each Store
- Only leave the top 10 based on total sales value calculated above
- Round the Sales Values to the nearest 10 (e.g. 1913 becomes 1910)
- Create a parameter to select the store
- Ensure the output only contains the chosen store
- Output the data and include the Store Name in the file name

In [1]:
import pandas as pd

In [2]:
#Input the data
df = pd.read_csv('wk27-input.csv', parse_dates=['Sale Date'], dayfirst=True)

In [3]:
df.head()

Unnamed: 0,Sale Date,Order ID,Sale Value,Product Name,Store Name,Region,Scent Name
0,2022-12-12,937,109.84,Liquid - 25ml,Lewisham,East,Rose
1,2022-10-14,427,207.61,Liquid - 25ml,Lewisham,East,Rose
2,2022-09-09,135,111.96,Liquid - 25ml,Lewisham,East,Rose
3,2022-12-11,791,170.68,Liquid - 25ml,Wimbledon,West,Rose
4,2022-09-08,270,214.12,Liquid - 25ml,Wimbledon,West,Rose


In [4]:
#Split the Product Name field into Product Type and Size
df[['Product Type','Size']] = df['Product Name'].str.split(' - ', expand=True)
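
As a quick illustration of the split above, here is a minimal sketch on made-up product names (the sample values are hypothetical, not taken from the input file). Passing `n=1` is an added assumption: it limits the split to the first separator, so a name containing a second ` - ` would still yield exactly two columns.

```python
import pandas as pd

# Hypothetical sample values for illustration only
sample = pd.DataFrame({'Product Name': ['Liquid - 25ml', 'Bar - 100g', 'Liquid - 1L']})

# n=1 splits on the first ' - ' only, so exactly two columns come back
sample[['Product Type', 'Size']] = sample['Product Name'].str.split(' - ', n=1, expand=True)

print(sample)
```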

In [5]:
#Only keep the Liquid products
df = df[df['Product Type'] == 'Liquid'].copy()  #.copy() avoids a SettingWithCopyWarning when columns are added later

In [6]:
#Total up the sales for each Product Size and Scent for each Store
df['Total Sales'] = df.groupby(['Size','Scent Name','Store Name'])['Sale Value'].transform('sum')

In [7]:
#Rank each of the Product Size and Scent combinations for each Store
df["Rank of Product Size & Scent by Store"] = (
    df.groupby(['Store Name'])['Total Sales']
      .rank(method='dense', ascending=False)
      .astype(int)
)
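
The cells above keep one row per sale and attach the totals with `transform`, which is why duplicates have to be dropped later. An alternative, sketched here on hypothetical data, is to aggregate to one row per Store / Size / Scent first and then rank. The column name `Rank` and the sample rows are assumptions made for the illustration only.

```python
import pandas as pd

# Hypothetical sales rows for illustration only
sales = pd.DataFrame({
    'Store Name': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Size':       ['25ml', '25ml', '50ml', '1L', '25ml', '50ml'],
    'Scent Name': ['Rose', 'Rose', 'Mint', 'Rose', 'Rose', 'Mint'],
    'Sale Value': [100.0, 50.0, 120.0, 200.0, 80.0, 90.0],
})

# One row per Store / Size / Scent combination with its total sales
totals = (
    sales.groupby(['Store Name', 'Size', 'Scent Name'], as_index=False)['Sale Value']
         .sum()
)

# Rank the combinations within each store, highest total first
totals['Rank'] = (
    totals.groupby('Store Name')['Sale Value']
          .rank(method='dense', ascending=False)
          .astype(int)
)

print(totals.sort_values(['Store Name', 'Rank']))
```

With this shape there is nothing to de-duplicate afterwards, at the cost of losing the row-level columns such as `Sale Date` and `Order ID`.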

In [8]:
#Only leave the top 10 based on total sales value calculated above
df = df[df["Rank of Product Size & Scent by Store"] <= 10].copy()  #.copy() so the rounding below writes to an independent frame

In [9]:
#Round the Sales Values to the nearest 10 (e.g. 1913 becomes 1910)
df['Total Sales'] = df['Total Sales'].round(-1)
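
A small sketch of what `round(-1)` does on a float Series, using made-up values. One detail worth knowing: values exactly halfway between two multiples of 10 are rounded to the even multiple (NumPy's round-half-to-even behaviour), so 1915 and 1925 both become 1920.

```python
import pandas as pd

# Hypothetical totals for illustration only
values = pd.Series([1913.0, 1915.0, 1925.0, 1926.4])

# A negative number of decimals rounds to the left of the decimal point
rounded = values.round(-1)

print(rounded.tolist())
```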

In [10]:
#Keep only the columns needed for the output
df = df[["Store Name","Rank of Product Size & Scent by Store",'Scent Name','Size','Total Sales']]

In [11]:
#Rename the rounded total and keep one row per Store / Scent / Size combination
df = df.rename(columns={'Total Sales': 'Sale Value'})
df = df.drop_duplicates(subset=['Store Name', 'Scent Name', 'Size'], keep='first')

In [12]:
#Create a parameter to select the store
#Ensure the output only contains the chosen store

select_store = input('Enter a Store Name: ').strip()  #strip stray spaces so the match below does not fail silently

# Filter the data to the selected store and sort by rank
store_name = df[df['Store Name'] == select_store].sort_values(by="Rank of Product Size & Scent by Store")

store_name.head(10)

Enter a Store Name: Notting Hill


Unnamed: 0,Store Name,Rank of Product Size & Scent by Store,Scent Name,Size,Sale Value
1473,Notting Hill,1,Lavender,250ml,1850.0
391,Notting Hill,2,Rose,50ml,1830.0
91,Notting Hill,3,Apricot,25ml,1790.0
1981,Notting Hill,4,Mint,500ml,1770.0
2705,Notting Hill,5,Strawberry,1L,1530.0
2654,Notting Hill,6,Lemongrass,1L,1530.0
1717,Notting Hill,7,Watermelon,500ml,1500.0
2257,Notting Hill,8,Lemongrass,750ml,1380.0
554,Notting Hill,9,Rosemary,50ml,1350.0
2620,Notting Hill,10,Rosemary,1L,1330.0
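
If the store name is mistyped, the filter above returns an empty frame without any warning. One possible guard, sketched here with a hypothetical helper `filter_store` and a simplified `Rank` column (neither is part of the notebook), is to check the parameter against the stores present in the data before filtering.

```python
import pandas as pd

def filter_store(ranked: pd.DataFrame, store: str) -> pd.DataFrame:
    """Return the rows for one store, sorted by rank (hypothetical helper)."""
    stores = sorted(ranked['Store Name'].unique())
    if store not in stores:
        raise ValueError(f"Unknown store {store!r}; choose one of {stores}")
    return ranked[ranked['Store Name'] == store].sort_values('Rank')

# Hypothetical ranked data for illustration only
ranked = pd.DataFrame({
    'Store Name': ['Notting Hill', 'Notting Hill', 'Lewisham'],
    'Rank': [2, 1, 1],
    'Sale Value': [1830.0, 1850.0, 1700.0],
})

result = filter_store(ranked, 'Notting Hill')
print(result)
```

Wrapping the filter in a function also makes it straightforward to loop over every store and write one file each, should that be needed.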


In [13]:
#Output the data and include the Store Name in the file name
store_name.to_csv(f'wk31 Top 10 Products for {select_store}.csv', index=False)