# McDonald's and Starbucks: which offers better meals for teens?
<b>2. Starbucks (SB) breakfasts & dinners menu-maker: a combinatorial approach</b>

The main idea is to build "meaningful" combinations from each restaurant's menu: no repeated items, and no combination made up of food only or drinks only. The second step is to select the combinations that meet given criteria.

<b>Advantages:</b> 
- Answers not only the question "which restaurant is better?" but also yields the most diverse menu, regardless of the restaurant selected.
- Criteria (both statistical and individual indicators) can be set and changed flexibly, and results can be quickly saved and compared.
- The approach can be treated as a sketch, a proof of concept for further development of the solution.

DONE:
0. Insert the cells with the official norm values and their processing from the <b>data_processing</b> notebook.
1. Normalize the values of all menu items by dividing them by the corresponding values in the <b>meal_norm</b> dict.
2. Restrict the whole dataset to <b>SB items only</b>.
3. Select the item groups for breakfast / dinner (in the format [1st food item, 2nd food item, 3rd drink item]).
4. Build a row-applied <b>meal-maker</b> function (args: item groups) that returns a dataset of combinations of items from the item groups passed as arguments.
5. Calculate the <b>total values</b> (calories, fat, etc.): the sum of each value over the items in a meal combination, for all generated meal combinations.
6. Calculate <b>MEAN</b> and <b>STDEV</b> across the totals (calories, fat, etc.) as metrics of caloric and nutritional balance for each meal combination. If <b>MEAN</b> ≈ 1 and <b>STDEV</b> ≈ 0, the meal combination fits the norms well.
7. Append each portion of results (obtained by selecting other item groups in step 3 and repeating the processing) to the final <b>'results_all'</b> df.
8. Write <b>'results_all'</b> to .csv.

In [1]:
# required imports
import pandas as pd
from itertools import product, combinations
import matplotlib
%matplotlib inline

### Official norms

In [2]:
# Dict with actual norms (from official document-1)
norms = {}

# the context manager closes the file once reading is done
with open('actual_norms.txt') as f:
    for line in f:
        line = line.strip().split(',')
        norms[line[0]] = line[1]

# drop the header row
del norms['Item']
norms

{'Energy(kcal)': '2900',
 'Fat(g)': '97',
 'Carbohydrates(g)': '421',
 'Fiber(g)': '20',
 'Protein(g)': '87',
 'Sodium(g)': '1.3'}
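
The norms above are stored as strings and converted with `float()` in later cells. A minimal sketch of converting them at read time instead, assuming the file is a two-column CSV whose header row starts with `Item`. The second header name (`Value`) is a guess, and `io.StringIO` stands in for `actual_norms.txt` so the example runs on its own:

```python
import csv
import io

# Stand-in for 'actual_norms.txt'. The header name 'Value' is an assumption;
# the six data rows repeat the values printed above.
sample = io.StringIO(
    "Item,Value\n"
    "Energy(kcal),2900\n"
    "Fat(g),97\n"
    "Carbohydrates(g),421\n"
    "Fiber(g),20\n"
    "Protein(g),87\n"
    "Sodium(g),1.3\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row instead of deleting it afterwards
norms_float = {name: float(value) for name, value in reader}
print(norms_float)
```

With numeric values from the start, the later `float(value)` calls become unnecessary, though they remain harmless.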

In [3]:
# (From official document-2): 
# -breakfast + dinner = 20-25% + 30-35% daily energy value respectively --> 
# 25% and 35% (max due to sports competitions)
# -breakfast + dinner = 55-60% total daily nutrients value --> 
# 25% and 35% (max due to sports competitions)
# (only breakfast and dinner mentioned in the task)
# Let's assume that the breakfast and dinner shares are equal --> 
# meal_norm (30% each)

# calculating weighted norms:

meal_norm = {}

for key, value in norms.items():
    meal_norm[key] = float(value)*0.3
    
# sum_norm dict for data filtering (to drop items whose values exceed it):
sum_norm = {key:float(value)*0.6 for key, value in norms.items()}
    
print (meal_norm)
print (sum_norm)

{'Energy(kcal)': 870.0, 'Fat(g)': 29.099999999999998, 'Carbohydrates(g)': 126.3, 'Fiber(g)': 6.0, 'Protein(g)': 26.099999999999998, 'Sodium(g)': 0.39}
{'Energy(kcal)': 1740.0, 'Fat(g)': 58.199999999999996, 'Carbohydrates(g)': 252.6, 'Fiber(g)': 12.0, 'Protein(g)': 52.199999999999996, 'Sodium(g)': 0.78}


In [4]:
# Earlier, in the 'data_processing' notebook, we raised the sodium norm for the sum
# (breakfast + dinner) from 0.78 to 1.0, i.e. by 0.22 g.
# In turn, this raises the sodium meal_norm by 0.11 g. The dicts should be updated: 
meal_norm['Sodium(g)'] += 0.11
sum_norm['Sodium(g)'] += 0.22
print (meal_norm)
print (sum_norm)

{'Energy(kcal)': 870.0, 'Fat(g)': 29.099999999999998, 'Carbohydrates(g)': 126.3, 'Fiber(g)': 6.0, 'Protein(g)': 26.099999999999998, 'Sodium(g)': 0.5}
{'Energy(kcal)': 1740.0, 'Fat(g)': 58.199999999999996, 'Carbohydrates(g)': 252.6, 'Fiber(g)': 12.0, 'Protein(g)': 52.199999999999996, 'Sodium(g)': 1.0}


In [5]:
# df for final results joining
results_all = pd.DataFrame()

### McD & SB menus

In [6]:
# SB and McD combined and processed menu: 
menu_df = pd.read_csv ('combined_processed.csv', sep=',', encoding = 'koi8-r')
menu_df.head()

Unnamed: 0,McD/SB,Category,Kind,Item,Energy(kcal),Fat(g),Carbohydrates(g),Fiber(g),Protein(g),Sodium(g)
0,SB,food,Bakery,Chonga Bagel,300,5.0,50,3.0,12,0.53
1,SB,food,Bakery,8-Grain Roll,380,6.0,70,7.0,10,0.43
2,SB,food,Bakery,Almond Croissant,410,22.0,45,3.0,10,0.39
3,SB,food,Bakery,Banana Nut Bread,420,22.0,52,2.0,6,0.32
4,SB,food,Bakery,Birthday Cake Pop,170,9.0,23,0.0,1,0.11


In [7]:
# normalizing the whole dataset by meal_norm --> new 'n_'-prefixed features:
for key, norm in meal_norm.items():
    menu_df['n_' + key] = menu_df[key] / norm

menu_df.head()
# menu_df.to_csv('menu_normalized.csv')

Unnamed: 0,McD/SB,Category,Kind,Item,Energy(kcal),Fat(g),Carbohydrates(g),Fiber(g),Protein(g),Sodium(g),n_Energy(kcal),n_Fat(g),n_Carbohydrates(g),n_Fiber(g),n_Protein(g),n_Sodium(g)
0,SB,food,Bakery,Chonga Bagel,300,5.0,50,3.0,12,0.53,0.344828,0.171821,0.395883,0.5,0.45977,1.06
1,SB,food,Bakery,8-Grain Roll,380,6.0,70,7.0,10,0.43,0.436782,0.206186,0.554236,1.166667,0.383142,0.86
2,SB,food,Bakery,Almond Croissant,410,22.0,45,3.0,10,0.39,0.471264,0.756014,0.356295,0.5,0.383142,0.78
3,SB,food,Bakery,Banana Nut Bread,420,22.0,52,2.0,6,0.32,0.482759,0.756014,0.411718,0.333333,0.229885,0.64
4,SB,food,Bakery,Birthday Cake Pop,170,9.0,23,0.0,1,0.11,0.195402,0.309278,0.182106,0.0,0.038314,0.22


### Preserve SB menu

In [8]:
# selecting the SB part of the whole df by index (.copy() gives an independent
# frame, so the in-place edits below do not raise SettingWithCopyWarning):
SB_menu = menu_df.iloc[0:217].copy()

#  drop cols with 'old' (original, non-normalized) values:
SB_menu.drop(['McD/SB', 'Energy(kcal)', 
               'Fat(g)', 'Carbohydrates(g)', 
               'Fiber(g)', 'Protein(g)', 'Sodium(g)'], axis=1, inplace=True)

SB_menu.reset_index(drop=True, inplace=True)
SB_menu.head ()


Unnamed: 0,Category,Kind,Item,n_Energy(kcal),n_Fat(g),n_Carbohydrates(g),n_Fiber(g),n_Protein(g),n_Sodium(g)
0,food,Bakery,Chonga Bagel,0.344828,0.171821,0.395883,0.5,0.45977,1.06
1,food,Bakery,8-Grain Roll,0.436782,0.206186,0.554236,1.166667,0.383142,0.86
2,food,Bakery,Almond Croissant,0.471264,0.756014,0.356295,0.5,0.383142,0.78
3,food,Bakery,Banana Nut Bread,0.482759,0.756014,0.411718,0.333333,0.229885,0.64
4,food,Bakery,Birthday Cake Pop,0.195402,0.309278,0.182106,0.0,0.038314,0.22
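
The positional slice `iloc[0:217]` depends on the SB rows coming first and on their exact count. A hedged alternative is to select by the restaurant label in the `McD/SB` column. The sketch below uses a toy frame; the `'McD'` label and the two "Hypothetical" items are assumptions made for illustration only:

```python
import pandas as pd

# Toy frame. The 'McD' label and the hypothetical items are assumptions.
toy_menu = pd.DataFrame({
    'McD/SB': ['SB', 'SB', 'McD'],
    'Kind': ['Bakery', 'Hot Drinks', 'Breakfast'],
    'Item': ['Chonga Bagel', 'Hypothetical Tea', 'Hypothetical Muffin'],
    'n_Energy(kcal)': [0.344828, 0.0, 0.4],
})

# Boolean mask on the restaurant label; .copy() detaches the result from toy_menu
sb_only = toy_menu[toy_menu['McD/SB'] == 'SB'].copy()

# The label column is constant after filtering, so it can be dropped
sb_only = sb_only.drop(columns=['McD/SB']).reset_index(drop=True)
print(sb_only)
```

The mask keeps working if rows are reordered or items are added to the combined file.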


In [9]:
# all SB item groups
SB_menu['Kind'].unique()

array(['Bakery', 'Salads', 'Cold Sandwiches', 'Protein Boxes & Bowls',
       'Warm Sandwiches', 'Yogurt & Custard', 'Soups', 'Hot Breakfast',
       'Biscotti & Cookies', 'Chocolates & Candy', 'Fruit & Nuts',
       'Popcorn & Chips', 'Meat & Cheese', 'Snack Bars', 'Cold Drinks',
       'Hot Drinks'], dtype=object)

In [10]:
# For the SB menu, let's assume the following groupings of kinds. The groups can be
# combined to obtain 2- and 3-item meals. Lists of item groups:
SB_first = ['Soups', 'Salads', 'Protein Boxes & Bowls', 'Hot Breakfast']
SB_second = ['Bakery', 'Cold Sandwiches', 'Warm Sandwiches']
SB_desserts = ['Chocolates & Candy', 'Fruit & Nuts', 'Biscotti & Cookies', 'Yogurt & Custard']
SB_snacks = ['Meat & Cheese', 'Snack Bars', 'Fruit & Nuts', 'Popcorn & Chips']
SB_drinks = ['Cold Drinks', 'Hot Drinks']

# 3-item 'solid breakfast':
SB_b = [SB_first, SB_second, SB_drinks] 

# 2-item 'light breakfast':
SB_light_b = [SB_first, SB_drinks]

# 3-item 'solid dinner':
SB_d = [SB_second, SB_desserts, SB_drinks]

# 2-item 'light dinner':
SB_light_d = [SB_second, SB_drinks]

# 2-item 'snack breakfast/dinner':
SB_snack = [SB_snacks, SB_drinks]

# all combinations of kinds (one kind from each group), via itertools.product:
SB_current = list(product(*SB_snack))

In [11]:
# see which (and how many) kind combinations the chosen meal format has:
SB_current
len (SB_current)

8
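
The count of 8 follows from 4 snack kinds times 2 drink kinds. As a quick check, the sketch below counts the kind combinations for every meal format defined above; the `formats` dict is introduced here only for illustration:

```python
from itertools import product

SB_first = ['Soups', 'Salads', 'Protein Boxes & Bowls', 'Hot Breakfast']
SB_second = ['Bakery', 'Cold Sandwiches', 'Warm Sandwiches']
SB_desserts = ['Chocolates & Candy', 'Fruit & Nuts', 'Biscotti & Cookies', 'Yogurt & Custard']
SB_snacks = ['Meat & Cheese', 'Snack Bars', 'Fruit & Nuts', 'Popcorn & Chips']
SB_drinks = ['Cold Drinks', 'Hot Drinks']

# Hypothetical helper dict: meal format name -> list of kind groups
formats = {
    'SB_b': [SB_first, SB_second, SB_drinks],
    'SB_light_b': [SB_first, SB_drinks],
    'SB_d': [SB_second, SB_desserts, SB_drinks],
    'SB_light_d': [SB_second, SB_drinks],
    'SB_snack': [SB_snacks, SB_drinks],
}

# product(*groups) yields one tuple per choice of a single kind from each group
counts = {name: len(list(product(*groups))) for name, groups in formats.items()}
print(counts)
# {'SB_b': 24, 'SB_light_b': 8, 'SB_d': 24, 'SB_light_d': 6, 'SB_snack': 8}
```

These are combinations of kinds, not of items; each kind tuple is expanded into item combinations in the next step.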

The name of the current list (<b>'SB_snack'</b> in this example) should be passed to the <b>'SB_current'</b> variable for unpacking and processing:

In [12]:
# Full set of item combinations for the 2-item / 3-item meal formats from the categories above
# I know... Not fully automated, and not great code... 

# 3-item or 2-item meals (manual switch):

# k1, k2, k3 = [], [], []
k1, k2 = [], []

def meal_maker (row, args=SB_current[0]): # index iterated manually from 0 to len(SB_current)-1
    
    '''Collect the items that belong to each item group of the current 'SB_current'
    entry (3-item or 2-item format) --> 3 (or 2) lists of items'''
    
    # Well, not so elegant code... But slices cause kernel death

    if args[0] in row[1]:
        k1.append (list ([row[2], row[3], row[4], row[5], row[6], row[7], row[8]]))
    if args[1] in row[1]:
        k2.append (list ([row[2], row[3], row[4], row[5], row[6], row[7], row[8]]))
#     if args[2] in row[1]:
#         k3.append (list ([row[2], row[3], row[4], row[5], row[6], row[7], row[8]]))
    
    return k1,k2  #,k3

SB_menu.apply (meal_maker, axis=1)

# make all combinations of items (one item from each group)
# and create a df with the results
arrs = [k1,k2] # ,k3
combinator = list(product(*arrs))
df_combi = pd.DataFrame (combinator, columns = ['first', 'second']) # , 'third'
df_combi.head()

Unnamed: 0,first,second
0,[Creminelli Sopressata Monterey Jack Snack Tra...,"[Cool Lime Starbucks Refreshers Beverage, 0.05..."
1,[Creminelli Sopressata Monterey Jack Snack Tra...,[Strawberry Acai Starbucks Refreshers Beverage...
2,[Creminelli Sopressata Monterey Jack Snack Tra...,"[Vegan Superberry Aeae, 0.39080459770114945, 0..."
3,[Creminelli Sopressata Monterey Jack Snack Tra...,[Very Berry Hibiscus Starbucks Refreshers Beve...
4,[Creminelli Sopressata Monterey Jack Snack Tra...,"[Evolution FreshOrganic Ginger Limeade, 0.1264..."
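
The manual switch between 2-item and 3-item meals comes from the module-level lists `k1`, `k2`, `k3`. One possible way to avoid it, sketched below under stated assumptions, is a function that filters the menu once per kind and works for any number of groups up to three. The name `item_combinations`, the toy menu and its values are hypothetical, and the sketch matches kinds by equality rather than by the substring test used in `meal_maker`:

```python
from itertools import product
import pandas as pd

def item_combinations(menu, kinds, value_cols):
    """Return one row per combination, taking one item from each kind in `kinds`."""
    groups = []
    for kind in kinds:
        rows = menu.loc[menu['Kind'] == kind, ['Item'] + value_cols]
        # each item becomes a list: [name, value_1, value_2, ...]
        groups.append(rows.values.tolist())
    columns = ['first', 'second', 'third'][:len(kinds)]
    return pd.DataFrame(list(product(*groups)), columns=columns)

# Toy menu with made-up items and values, for illustration only
toy_menu = pd.DataFrame({
    'Kind': ['Meat & Cheese', 'Cold Drinks', 'Cold Drinks', 'Bakery'],
    'Item': ['Hypothetical Snack Tray', 'Hypothetical Iced Tea',
             'Hypothetical Limeade', 'Hypothetical Roll'],
    'n_Energy(kcal)': [0.25, 0.05, 0.13, 0.40],
    'n_Fat(g)': [0.58, 0.0, 0.0, 0.20],
})

combos = item_combinations(toy_menu, ('Meat & Cheese', 'Cold Drinks'),
                           ['n_Energy(kcal)', 'n_Fat(g)'])
print(combos)
```

Because nothing is accumulated outside the function, calling it again for another kind tuple cannot mix in items left over from a previous run.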


In [14]:
# Add to all these combinations the metrics of total caloric and nutrition values 
# (Yes, it could be designed with more pythonic and elegant code...)

# Row-applied functions: each one sums a single value (calories, fat, etc.) over
# the items of a meal combination, for all generated meal combinations.

# ENERGY total of certain meal combination:
def e_sum (row):
    e_sum = row[0][1] + row[1][1] # + row[2][1]
    return e_sum

# FAT total of certain meal combination:
def f_sum (row):   
    f_sum = row[0][2] + row[1][2] # + row[2][2]
    return f_sum

# CARBOHYDRATES total of certain meal combination:
def c_sum (row):       
    c_sum = row[0][3] + row[1][3] # + row[2][3]        
    return c_sum

# FIBER total of certain meal combination:
def fi_sum (row):
    fi_sum = row[0][4] + row[1][4] # + row[2][4]
    return fi_sum

# PROTEIN total of certain meal combination:
def p_sum (row): 
    p_sum = row[0][5] + row[1][5] # + row[2][5]
    return p_sum

# SODIUM total of certain meal combination:
def s_sum (row):
    s_sum = row[0][6] + row[1][6] # + row[2][6]
    return s_sum

df_combi['Energy'] = df_combi.apply (e_sum, axis=1)
df_combi['Fat'] = df_combi.apply (f_sum, axis=1)
df_combi['Carbohydrates'] = df_combi.apply (c_sum, axis=1)
df_combi['Fiber'] = df_combi.apply (fi_sum, axis=1)
df_combi['Protein'] = df_combi.apply (p_sum, axis=1)
df_combi['Sodium'] = df_combi.apply (s_sum, axis=1)

# full item combinations in terms of total normalized caloric and nutrition values.
# Add cols for whole group name and group combination names:
df_combi.insert(0, 'McD/SB', 'SB_snack')
df_combi.insert(1, 'Kind', str(SB_current[0]))
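
The six functions above differ only in the position they read. A minimal sketch of one generic helper that sums all positions at once and works for 2-item and 3-item combinations alike; `combo_totals` is a hypothetical name, and the two items are rows from the normalized table shown earlier, paired only to demonstrate the arithmetic:

```python
def combo_totals(combo):
    """Sum the numeric values position by position across the items of one combination.

    Each item is a list [name, value_1, ..., value_n], as built by meal_maker.
    """
    value_rows = (item[1:] for item in combo)
    # zip(*...) groups the first values together, then the second values, and so on
    return [sum(values) for values in zip(*value_rows)]

# Normalized values: Energy, Fat, Carbohydrates, Fiber, Protein, Sodium
bagel = ['Chonga Bagel', 0.344828, 0.171821, 0.395883, 0.5, 0.45977, 1.06]
roll = ['8-Grain Roll', 0.436782, 0.206186, 0.554236, 1.166667, 0.383142, 0.86]

totals = combo_totals((bagel, roll))
print(totals)
```

Applied row-wise to `df_combi[['first', 'second']]`, such a helper would presumably give the same six totals as the separate functions.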

In [15]:
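# Finally, mean and std over all totals as metrics of whether a given item combination
# is well-balanced: MEAN ≈ 1 --> the average across the 6 values ≈ norm, 
# STDEV ≈ 0 (together with MEAN ≈ 1) --> each of the 6 values ≈ norm 
totals = ['Energy', 'Fat', 'Carbohydrates', 'Fiber', 'Protein', 'Sodium']
# explicit columns keep the text columns and MEAN itself out of both statistics
df_combi['MEAN'] = df_combi[totals].mean(axis=1)
df_combi['STDEV'] = df_combi[totals].std(axis=1)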
df_combi

Unnamed: 0,McD/SB,Kind,first,second,Energy,Fat,Carbohydrates,Fiber,Protein,Sodium,MEAN,STDEV
0,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Cool Lime Starbucks Refreshers Beverage, 0.05...",0.304598,0.584192,0.087094,0.000000,0.574713,1.46,0.501766,0.527811
1,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,[Strawberry Acai Starbucks Refreshers Beverage...,0.344828,0.584192,0.142518,0.166667,0.574713,1.46,0.545486,0.486793
2,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Vegan Superberry Aeae, 0.39080459770114945, 0...",0.643678,0.996564,0.435471,1.333333,0.804598,1.72,0.988941,0.471726
3,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,[Very Berry Hibiscus Starbucks Refreshers Beve...,0.321839,0.584192,0.110847,0.166667,0.574713,1.46,0.536376,0.494090
4,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Evolution FreshOrganic Ginger Limeade, 0.1264...",0.379310,0.584192,0.221694,0.000000,0.574713,1.45,0.534985,0.499969
5,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Iced Espresso Classics - Vanilla Latte, 0.149...",0.402299,0.670103,0.166271,0.000000,0.766284,1.57,0.595826,0.558671
6,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Iced Espresso Classics - Caffe Mocha, 0.16091...",0.413793,0.670103,0.182106,0.000000,0.766284,1.62,0.608714,0.573030
7,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Iced Espresso Classics - Caramel Macchiato, 0...",0.402299,0.670103,0.166271,0.000000,0.766284,1.57,0.595826,0.558671
8,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Shaken Sweet Tea, 0.09195402298850575, 0.0, 0...",0.344828,0.584192,0.150435,0.000000,0.574713,1.46,0.519028,0.515436
9,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Tazo Bottled Berry Blossom White, 0.0689655...",0.321839,0.584192,0.118765,0.000000,0.574713,1.46,0.509918,0.521682
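
As a sanity check, the two metrics for row 0 can be recomputed by hand from its six totals. The sketch below uses the standard library only; the inputs are the rounded values printed above, so the last digits may differ slightly from the table:

```python
from statistics import mean, stdev

# Totals of row 0 above: Energy, Fat, Carbohydrates, Fiber, Protein, Sodium
row0 = [0.304598, 0.584192, 0.087094, 0.0, 0.574713, 1.46]

m = mean(row0)
s = stdev(row0)  # sample standard deviation (n - 1), which is also the default of pandas .std()
print(round(m, 4), round(s, 4))
# prints 0.5018 0.5278
```

Note that MEAN alone can mislead: row 2 has MEAN ≈ 0.99, yet its STDEV of about 0.47 shows that a high sodium total (1.72) is offset by low carbohydrates (0.44) rather than every value sitting near the norm.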


In [46]:
# results df building
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
results_all = pd.concat([results_all, df_combi])
results_all.head()

Unnamed: 0,McD/SB,Kind,first,second,Energy,Fat,Carbohydrates,Fiber,Protein,Sodium,MEAN,STDEV
0,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Cool Lime Starbucks Refreshers Beverage, 0.05...",0.304598,0.584192,0.087094,0.0,0.574713,1.46,0.501766,0.527811
1,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,[Strawberry Acai Starbucks Refreshers Beverage...,0.344828,0.584192,0.142518,0.166667,0.574713,1.46,0.545486,0.486793
2,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Vegan Superberry Aeae, 0.39080459770114945, 0...",0.643678,0.996564,0.435471,1.333333,0.804598,1.72,0.988941,0.471726
3,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,[Very Berry Hibiscus Starbucks Refreshers Beve...,0.321839,0.584192,0.110847,0.166667,0.574713,1.46,0.536376,0.49409
4,SB_snack,"('Meat & Cheese', 'Cold Drinks')",[Creminelli Sopressata Monterey Jack Snack Tra...,"[Evolution FreshOrganic Ginger Limeade, 0.1264...",0.37931,0.584192,0.221694,0.0,0.574713,1.45,0.534985,0.499969


In [45]:
# # writing results df (when all combinations processed):
# results_all.reset_index(inplace=True, drop=True)
# results_all.to_csv('SB_sncak_results.csv', index=False)
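
Once all combinations are collected, the selection step can be a simple filter on the two metrics. A minimal sketch, assuming thresholds that are purely illustrative (`mean_tol` and `stdev_max` are not taken from the task). The three rows are copied from the results table above (rows 0, 2 and 5), with the `second` column reduced to the drink name for brevity:

```python
import pandas as pd

# Rows 0, 2 and 5 of the results table; 'second' is shortened to the item name
results = pd.DataFrame({
    'second': ['Cool Lime Starbucks Refreshers Beverage',
               'Vegan Superberry Aeae',
               'Iced Espresso Classics - Vanilla Latte'],
    'MEAN': [0.501766, 0.988941, 0.595826],
    'STDEV': [0.527811, 0.471726, 0.558671],
})

# Hypothetical thresholds; tune them to the criteria actually required
mean_tol, stdev_max = 0.1, 0.5

near_norm = (results['MEAN'] - 1).abs() <= mean_tol
even_spread = results['STDEV'] <= stdev_max
balanced = results[near_norm & even_spread]

# Most evenly balanced combinations first
balanced = balanced.sort_values('STDEV').reset_index(drop=True)
print(balanced)
```

With these thresholds only the 'Vegan Superberry Aeae' row passes; tightening `stdev_max` would leave nothing, which suggests that snack-format meals are hard to balance on all six values at once.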