## 2021: Week 11 - Cocktail Profit Margins

This week's challenge was put together by Vivien to test many of the fundamental prep skills you have built up so far this year.

Vivien's challenge looks at cocktail pricing (a recurring theme over the years at Preppin' HQ) and asks whether you can determine how much profit certain cocktails make. After all, who doesn't talk about data preparation when they are in a bar?

### Input
- Cocktails: names, prices and their recipe with measurements 
- Sourcing: ingredient prices, quantity per bottle, currency of price 
- Conversion rates: currencies and their conversion rates (e.g. 1.14 euros = 1 pound)

1. Cocktails
![img](https://lh3.googleusercontent.com/0XsxMQh14ynnZzFEU9YLVXSMl3eyvPaN_J7qrZHbdw8iZ0x1QLCpZkcGte83T9Rs1qXaFTnCZAb0kBrum_gN8wJBfQYu6unaLGuLGIFoak1g44oXwUKNyNQlSdD9NwEjB5cWz1Gx)

2. Sourcing
![img](https://lh5.googleusercontent.com/Xnlcgu-L5AcUTvcQy97d7a0UAdjS4ht22qg17dJ9KnClbCv6BysNQ1EZgu5ajFi-ZikVpKMWWLrdKHoJc8p0-tnKmpipuCpAnIUnpHnkl9STU7uQNxjV6BorPKUU_kJRZJKRn19X)

3. Conversion rates
![img](https://lh6.googleusercontent.com/qFyoGWgUT19MhsMKhrvE2pOgX3V3hamS8sB6_r-6fvJqKz4Q20ZIgWCXL-RMy-6L91Unm8Nfz7xZQUgZibon8AFKhm6eW6InbOmEeRhwsg-h44L4h6DaX3PRjMSDWaaXvJGJq1Vb)

### Requirements

- Input the dataset 
- Split out the recipes into the different ingredients and their measurements
- Calculate the price in pounds for the required measurement of each ingredient
- Join the ingredient costs to their respective cocktails
- Find the total cost of each cocktail
- Include a calculated field for the profit margin, i.e. the difference between each cocktail's price and its overall cost
- Round all numeric fields to 2 decimal places 
- Output the data
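
The cost calculation in the steps above can be sketched for a single ingredient. This is a minimal illustration rather than the challenge's official solution: the figures are the Citroen Vodka values from the inputs, and `ingredient_cost_gbp` is a hypothetical helper name introduced here.

```python
# Worked example for one ingredient, using figures from the input tables.
bottle_price_eur = 19.25   # Citroen Vodka, priced in euros
eur_per_gbp = 1.14         # conversion rate: 1.14 euros = 1 pound
ml_per_bottle = 500.0
ml_required = 45           # amount used in a Raspberry Lemon Drop


def ingredient_cost_gbp(price, rate_per_gbp, bottle_ml, required_ml):
    """Cost in pounds of the required measurement (hypothetical helper)."""
    price_gbp = price / rate_per_gbp       # foreign currency -> pounds
    cost_per_ml = price_gbp / bottle_ml    # pounds per millilitre
    return cost_per_ml * required_ml


cost = ingredient_cost_gbp(bottle_price_eur, eur_per_gbp, ml_per_bottle, ml_required)
print(round(cost, 2))  # 1.52
```

Summing this figure over every ingredient in a recipe gives the cocktail's cost, and subtracting that from the menu price gives the margin.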

### Output
4 fields: 
- Cocktail 
- Price
- Cost
- Margin 

5 rows (6 including the header)

![img](https://lh4.googleusercontent.com/CRZx72Sa1QvjEtgSSR_LY51NAZ_LH6EJSnMr-Z1Mtbjz6W5YYS0Y9UUezi4CufTfBZEbKHoL4qtyXoFZRAQcEWN-tioQLOvOS7_nWF2Rkb8ZEJd2tUTfS-_3PDEt5WeMTKnspIBh)

In [509]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Input the dataset 

In [510]:
data = pd.read_excel("./data/Cocktails Dataset.xlsx", sheet_name=["Cocktails", "Sourcing", "Conversion Rates"])

In [511]:
cocktail = data["Cocktails"].copy()
sourcing = data["Sourcing"].copy()
conversion_rate = data["Conversion Rates"].copy()

### Split out the recipes into the different ingredients and their measurements

In [512]:
cocktail

Unnamed: 0,Cocktail,Price (£),Recipe (ml)
0,Raspberry Lemon Drop,8.5,Citroen Vodka:45ml; Chambord:20ml; Triple Sec:...
1,Bay Breeze,7.2,Plain Vodka:60ml; Cranberry Juice:90ml; Pineap...
2,Alabama Slammer,8.25,Southern Comfort:15ml; Sloe Gin:15ml; Amaretto...
3,Watermelon Man,7.0,Plain Vodka:60ml; Watermelon Schapps:30ml; Coi...
4,Orange Blossom,8.7,London Dry Gin:30ml; Cointreau:10ml; Orange Ju...


In [513]:
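import re
# Split each recipe on ";" so every ingredient:amount pair gets its own column
ingre_amount = cocktail["Recipe (ml)"].map(lambda x: x.split(";")).apply(pd.Series)
ingre_amount = ingre_amount.fillna('0')  # placeholder so .split() below works on every cell
result = pd.DataFrame()
for i in ingre_amount.columns:
    # Split each pair on ":" and stack the pieces into one long table
    tmp = ingre_amount[i].map(lambda x: x.split(":")).apply(pd.Series)
    result = pd.concat([result, tmp], axis=0)
result = result.dropna()  # drops the placeholder rows, which have no amount
result[1] = result[1].map(lambda x: re.sub(r"[^0-9]", "", x))  # keep the digits only
result.columns = ["Ingredient", "ml"]
result = result.reset_index(drop=True)
result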

Unnamed: 0,Ingredient,ml
0,Citroen Vodka,45
1,Plain Vodka,60
2,Southern Comfort,15
3,Plain Vodka,60
4,London Dry Gin,30
5,Chambord,20
6,Cranberry Juice,90
7,Sloe Gin,15
8,Watermelon Schapps,30
9,Cointreau,10
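
As an alternative to the loop above, the same split can be sketched with `str.split` and `DataFrame.explode`, which keeps the cocktail name and price attached to every ingredient row. The two sample rows below mirror the Cocktails sheet; the names `sample` and `tidy` are assumptions made for this illustration.

```python
import pandas as pd

# Two sample rows in the same shape as the Cocktails sheet.
sample = pd.DataFrame({
    "Cocktail": ["Raspberry Lemon Drop", "Bay Breeze"],
    "Price (£)": [8.5, 7.2],
    "Recipe (ml)": [
        "Citroen Vodka:45ml; Chambord:20ml; Triple Sec:20ml",
        "Plain Vodka:60ml; Cranberry Juice:90ml; Pineapple Juice:30ml",
    ],
})

# One row per ingredient:amount pair.
tidy = (
    sample.assign(item=sample["Recipe (ml)"].str.split(";"))
    .explode("item")
    .reset_index(drop=True)
)

# Separate the ingredient name from its amount and make the amount numeric.
parts = tidy["item"].str.strip().str.split(":", expand=True)
tidy["Ingredient"] = parts[0].str.strip()
tidy["Amount"] = parts[1].str.replace("ml", "", regex=False).astype(int)
tidy = tidy.drop(columns=["Recipe (ml)", "item"])
print(tidy)
```

Because no padding is needed for recipes with fewer ingredients, this version avoids the placeholder `'0'` values and the later `dropna()`.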


In [514]:
sourcing = sourcing.merge(conversion_rate, how="left", on="Currency")
sourcing

Unnamed: 0,Ingredient,Price,ml per Bottle,Currency,Conversion Rate £
0,Citroen Vodka,19.25,500.0,Euro,1.14
1,Chambord,22.85,450.0,Euro,1.14
2,Triple Sec,12.0,400.0,Dollar,1.38
3,Plain Vodka,15.24,500.0,Euro,1.14
4,Cranberry Juice,1.33,1000.0,Pound,1.0
5,Pineapple Juice,1.8,1000.0,Pound,1.0
6,Southern Comfort,20.99,750.0,Dollar,1.38
7,Sloe Gin,22.99,500.0,Euro,1.14
8,Amaretto,16.6,500.0,Euro,1.14
9,Orange Juice,1.42,1000.0,Pound,1.0
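
A left join like the one above silently duplicates rows if a currency appears twice in the rates table, and leaves `NaN` where a currency has no rate. The sketch below shows one way to guard against both using the `validate` argument of `merge`. The `_demo` frames are small stand-ins built from values shown in this notebook, not the full sheets.

```python
import pandas as pd

sourcing_demo = pd.DataFrame({
    "Ingredient": ["Citroen Vodka", "Triple Sec", "Cranberry Juice"],
    "Price": [19.25, 12.0, 1.33],
    "ml per Bottle": [500.0, 400.0, 1000.0],
    "Currency": ["Euro", "Dollar", "Pound"],
})
rates_demo = pd.DataFrame({
    "Currency": ["Euro", "Dollar", "Pound"],
    "Conversion Rate £": [1.14, 1.38, 1.0],
})

# validate="many_to_one" raises pandas.errors.MergeError if a currency
# is listed more than once in the rates table.
merged = sourcing_demo.merge(
    rates_demo, how="left", on="Currency", validate="many_to_one"
)

# Any ingredient whose currency had no matching rate shows up as NaN here.
missing = merged["Conversion Rate £"].isna()
print(merged[missing])
```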


In [515]:
# Rates are quoted as foreign units per pound (e.g. 1.14 euros = 1 pound), so divide to get pounds
sourcing["Price in pound"] = sourcing["Price"] / sourcing["Conversion Rate £"]
sourcing

Unnamed: 0,Ingredient,Price,ml per Bottle,Currency,Conversion Rate £,Price in pound
0,Citroen Vodka,19.25,500.0,Euro,1.14,16.885965
1,Chambord,22.85,450.0,Euro,1.14,20.04386
2,Triple Sec,12.0,400.0,Dollar,1.38,8.695652
3,Plain Vodka,15.24,500.0,Euro,1.14,13.368421
4,Cranberry Juice,1.33,1000.0,Pound,1.0,1.33
5,Pineapple Juice,1.8,1000.0,Pound,1.0,1.8
6,Southern Comfort,20.99,750.0,Dollar,1.38,15.210145
7,Sloe Gin,22.99,500.0,Euro,1.14,20.166667
8,Amaretto,16.6,500.0,Euro,1.14,14.561404
9,Orange Juice,1.42,1000.0,Pound,1.0,1.42


In [516]:
# Cost of one millilitre of each ingredient, in pounds
sourcing["Cost per ml"] = sourcing["Price in pound"] / sourcing["ml per Bottle"]
# Keep only the columns needed for the join to the recipes
sourcing = sourcing[["Ingredient", "Cost per ml"]]
sourcing

Unnamed: 0,Ingredient,Cost per ml
0,Citroen Vodka,0.033772
1,Chambord,0.044542
2,Triple Sec,0.021739
3,Plain Vodka,0.026737
4,Cranberry Juice,0.00133
5,Pineapple Juice,0.0018
6,Southern Comfort,0.02028
7,Sloe Gin,0.040333
8,Amaretto,0.029123
9,Orange Juice,0.00142


In [517]:
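# Repeat the split, this time placing the pairs side by side (axis=1)
# so that each row stays aligned with its cocktail
ingre_amount = cocktail["Recipe (ml)"].map(lambda x: x.split(";")).apply(pd.Series)
ingre_amount = ingre_amount.fillna('0')  # placeholder for recipes with fewer ingredients
result = pd.DataFrame()
for i in ingre_amount.columns:
    tmp = ingre_amount[i].map(lambda x: x.split(":")).apply(pd.Series)
    result = pd.concat([result, tmp], axis=1)
# Attach the ingredient/amount columns to the cocktail names and prices
cocktail = pd.concat([cocktail, result], axis=1)
cocktail = cocktail.drop(["Recipe (ml)"], axis=1)
cocktail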

Unnamed: 0,Cocktail,Price (£),0,1,0.1,1.1,0.2,1.2,0.3,1.3
0,Raspberry Lemon Drop,8.5,Citroen Vodka,45ml,Chambord,20ml,Triple Sec,20ml,0,
1,Bay Breeze,7.2,Plain Vodka,60ml,Cranberry Juice,90ml,Pineapple Juice,30ml,0,
2,Alabama Slammer,8.25,Southern Comfort,15ml,Sloe Gin,15ml,Amaretto,15ml,Orange Juice,120ml
3,Watermelon Man,7.0,Plain Vodka,60ml,Watermelon Schapps,30ml,Cointreau,30ml,Lime Soda,200ml
4,Orange Blossom,8.7,London Dry Gin,30ml,Cointreau,10ml,Orange Juice,30ml,0,


In [518]:
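# Columns labelled 0 hold ingredient names and columns labelled 1 hold amounts;
# melting each label stacks them into a single long column
ingredient = cocktail.melt(id_vars=["Cocktail", "Price (£)"], value_vars=0, value_name="Ingredient")[["Cocktail", "Price (£)", "Ingredient"]]
amount = cocktail.melt(id_vars=["Cocktail", "Price (£)"], value_vars=1, value_name="Amount")["Amount"]
# Placeholder rows have no amount, so dropna() removes them
cocktail = pd.concat([ingredient, amount], axis=1).dropna()
cocktail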

Unnamed: 0,Cocktail,Price (£),Ingredient,Amount
0,Raspberry Lemon Drop,8.5,Citroen Vodka,45ml
1,Bay Breeze,7.2,Plain Vodka,60ml
2,Alabama Slammer,8.25,Southern Comfort,15ml
3,Watermelon Man,7.0,Plain Vodka,60ml
4,Orange Blossom,8.7,London Dry Gin,30ml
5,Raspberry Lemon Drop,8.5,Chambord,20ml
6,Bay Breeze,7.2,Cranberry Juice,90ml
7,Alabama Slammer,8.25,Sloe Gin,15ml
8,Watermelon Man,7.0,Watermelon Schapps,30ml
9,Orange Blossom,8.7,Cointreau,10ml


In [519]:
cocktail["Amount"] = cocktail["Amount"].str.replace("ml", "")
cocktail["Amount"] = cocktail["Amount"].astype(int)

In [520]:
cocktail["Ingredient"] = cocktail["Ingredient"].str.strip().str.upper()
cocktail

Unnamed: 0,Cocktail,Price (£),Ingredient,Amount
0,Raspberry Lemon Drop,8.5,CITROEN VODKA,45
1,Bay Breeze,7.2,PLAIN VODKA,60
2,Alabama Slammer,8.25,SOUTHERN COMFORT,15
3,Watermelon Man,7.0,PLAIN VODKA,60
4,Orange Blossom,8.7,LONDON DRY GIN,30
5,Raspberry Lemon Drop,8.5,CHAMBORD,20
6,Bay Breeze,7.2,CRANBERRY JUICE,90
7,Alabama Slammer,8.25,SLOE GIN,15
8,Watermelon Man,7.0,WATERMELON SCHAPPS,30
9,Orange Blossom,8.7,COINTREAU,10


In [521]:
sourcing["Ingredient"] = sourcing["Ingredient"].str.strip().str.upper()
sourcing

Unnamed: 0,Ingredient,Cost per ml
0,CITROEN VODKA,0.033772
1,CHAMBORD,0.044542
2,TRIPLE SEC,0.021739
3,PLAIN VODKA,0.026737
4,CRANBERRY JUICE,0.00133
5,PINEAPPLE JUICE,0.0018
6,SOUTHERN COMFORT,0.02028
7,SLOE GIN,0.040333
8,AMARETTO,0.029123
9,ORANGE JUICE,0.00142


In [522]:
cocktail = cocktail.merge(sourcing, how="left", on="Ingredient")
cocktail["Total cost"] = (cocktail["Cost per ml"] * cocktail["Amount"]).round(3)

In [523]:
cocktail

Unnamed: 0,Cocktail,Price (£),Ingredient,Amount,Cost per ml,Total cost
0,Raspberry Lemon Drop,8.5,CITROEN VODKA,45,0.033772,1.52
1,Bay Breeze,7.2,PLAIN VODKA,60,0.026737,1.604
2,Alabama Slammer,8.25,SOUTHERN COMFORT,15,0.02028,0.304
3,Watermelon Man,7.0,PLAIN VODKA,60,0.026737,1.604
4,Orange Blossom,8.7,LONDON DRY GIN,30,0.020387,0.612
5,Raspberry Lemon Drop,8.5,CHAMBORD,20,0.044542,0.891
6,Bay Breeze,7.2,CRANBERRY JUICE,90,0.00133,0.12
7,Alabama Slammer,8.25,SLOE GIN,15,0.040333,0.605
8,Watermelon Man,7.0,WATERMELON SCHAPPS,30,0.034719,1.042
9,Orange Blossom,8.7,COINTREAU,10,0.022594,0.226
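
With `how="left"`, any recipe ingredient that has no match in the price list keeps a `NaN` cost, and `sum()` then skips it without warning. A minimal sketch of checking for such rows with `indicator=True` follows. The demo price list is deliberately incomplete for illustration and says nothing about the real Sourcing sheet.

```python
import pandas as pd

recipe_demo = pd.DataFrame({
    "Cocktail": ["Bay Breeze", "Bay Breeze", "Watermelon Man"],
    "Ingredient": ["PLAIN VODKA", "CRANBERRY JUICE", "LIME SODA"],
    "Amount": [60, 90, 200],
})
# Deliberately incomplete price list so that one row fails to match.
price_demo = pd.DataFrame({
    "Ingredient": ["PLAIN VODKA", "CRANBERRY JUICE"],
    "Cost per ml": [0.026737, 0.00133],
})

# indicator=True adds a "_merge" column saying where each row came from.
checked = recipe_demo.merge(price_demo, how="left", on="Ingredient", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Ingredient"].tolist()
print(unmatched)  # ['LIME SODA']
```

An empty list here means every ingredient was priced, so the per-cocktail totals include all components.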


In [529]:
# Sum the ingredient costs for each cocktail
grouped = cocktail.groupby(["Cocktail", "Price (£)"])["Total cost"].sum().reset_index()
# Margin is the menu price minus the total ingredient cost
grouped["Margin"] = (grouped["Price (£)"] - grouped["Total cost"]).round(2)
grouped

Unnamed: 0,Cocktail,Price (£),Total cost,Margin
0,Alabama Slammer,8.25,1.516,6.73
1,Bay Breeze,7.2,1.778,5.42
2,Orange Blossom,8.7,0.881,7.82
3,Raspberry Lemon Drop,8.5,2.846,5.65
4,Watermelon Man,7.0,3.585,3.42
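
The required output names its fields Cocktail, Price, Cost and Margin, with all numeric fields rounded to 2 decimal places. One possible final tidy-up is sketched below on a few of the rows shown above. `grouped_demo` and `output` are names assumed for this example.

```python
import pandas as pd

# A few rows copied from the grouped result above.
grouped_demo = pd.DataFrame({
    "Cocktail": ["Alabama Slammer", "Bay Breeze", "Orange Blossom", "Raspberry Lemon Drop"],
    "Price (£)": [8.25, 7.2, 8.7, 8.5],
    "Total cost": [1.516, 1.778, 0.881, 2.846],
    "Margin": [6.73, 5.42, 7.82, 5.65],
})

# Rename to the required field names; DataFrame.round only touches numeric columns.
output = (
    grouped_demo.rename(columns={"Price (£)": "Price", "Total cost": "Cost"})
    .round(2)
)
output = output[["Cocktail", "Price", "Cost", "Margin"]]
print(output)
```

From here the frame could be written out with `output.to_csv(..., index=False)`; the file name is left to the reader.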
