## Objective
Eats4Life would like to update its menu to include wine suggestions with each of its main entrees (defined by the meat selection). The owner would like to take a data analytics approach and explore data he collected over the past several years on main courses (meats) and the wines that were ordered with them. Eats4Life is open to listing more than one wine for each main entree, but only if the data supports it. The scope of services requested includes:

- Summary information on the main entrees (meat)
- Wine suggestion(s) for **each** main entree, along with supporting information explaining why the wine(s) are suggested for that entree (if there is no suggested wine for a given entree, explain why)
- Any other information of interest in terms of customer order habits

## Data Provided
The dataset `orderData.csv` has three columns:

- `orderNo` – identifies each table/party that sat at the restaurant
- `seatNo` – indicates which seat at the table ordered each meal
- `item` – the item that was ordered

The data has been cleaned, so that each order contains three items per individual: a meat, a side, and a wine.

In [27]:
import pandas as pd
from mlxtend.frequent_patterns import association_rules
from mlxtend.frequent_patterns import apriori

In [28]:
df = pd.read_csv('https://raw.githubusercontent.com/sjsimmo2/DataMining-Fall/refs/heads/master/orderData.csv')

In [29]:
print(df.shape)
df.head()

(228699, 3)


Unnamed: 0,orderNo,seatNo,item
0,122314,1,Salmon
1,122314,1,Oyster Bay Sauvignon Blanc
2,122314,1,Bean Trio
3,122314,2,Pork Chop
4,122314,2,Three Rivers Red


In [30]:
df['item'].value_counts()

item
Seasonal Veg                          14574
Filet Mignon                          13407
Sea Bass                              12302
Duckhorn Chardonnay                   11723
Bean Trio                             11696
Roasted Root Veg                      11323
Pork Tenderloin                       11138
Pork Chop                             10976
Warm Goat Salad                       10605
Adelsheim Pinot Noir                  10308
Roasted Potatoes                       9847
Salmon                                 9336
Caesar Salad                           9168
Mashed Potatoes                        9020
Blackstone Merlot                      8485
Total Recall Chardonnay                8012
Duck Breast                            7915
Single Vineyard Malbec                 7791
Swordfish                              7439
Innocent Bystander Sauvignon Blanc     6397
Oyster Bay Sauvignon Blanc             4815
Echeverria Gran Syrah                  4600
Brancott Pinot Grigio      

In [31]:
meats = [
    "Filet Mignon",
    "Sea Bass",
    "Pork Tenderloin",
    "Pork Chop",
    "Salmon",
    "Duck Breast",
    "Swordfish",
    "Roast Chicken"
]

sides = [
    "Seasonal Veg",
    "Bean Trio",
    "Roasted Root Veg",
    "Warm Goat Salad",
    "Roasted Potatoes",
    "Caesar Salad",
    "Mashed Potatoes"
]

wines = [
    "Duckhorn Chardonnay",
    "Adelsheim Pinot Noir",
    "Blackstone Merlot",
    "Total Recall Chardonnay",
    "Single Vineyard Malbec",
    "Innocent Bystander Sauvignon Blanc",
    "Oyster Bay Sauvignon Blanc",
    "Echeverria Gran Syrah",
    "Brancott Pinot Grigio",
    "Cantina Pinot Bianco",
    "Louis Rouge",
    "Helben Blanc",
    "Three Rivers Red"
]
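
The data description says each seat has exactly one meat, one side, and one wine. A quick way to check that claim against the category lists is sketched below in plain Python. The helper name `find_malformed_seats`, the `sample_*` sets, and the final seat (order 999999) are made up for illustration; the other rows are taken from output shown in this notebook.

```python
from collections import defaultdict

# Small subsets of the category lists, enough for the sample rows below
sample_meats = {"Salmon", "Pork Chop"}
sample_sides = {"Bean Trio", "Caesar Salad"}
sample_wines = {"Oyster Bay Sauvignon Blanc", "Three Rivers Red"}

def find_malformed_seats(rows, meats, sides, wines):
    """Return (orderNo, seatNo) keys that are not exactly one meat, one side, one wine."""
    items_by_seat = defaultdict(list)
    for order_no, seat_no, item in rows:
        items_by_seat[(order_no, seat_no)].append(item)

    malformed = []
    for seat, items in items_by_seat.items():
        counts = (
            sum(item in meats for item in items),
            sum(item in sides for item in items),
            sum(item in wines for item in items),
        )
        if len(items) != 3 or counts != (1, 1, 1):
            malformed.append(seat)
    return malformed

rows = [
    (122314, 1, "Salmon"),
    (122314, 1, "Oyster Bay Sauvignon Blanc"),
    (122314, 1, "Bean Trio"),
    (122314, 2, "Pork Chop"),
    (122314, 2, "Three Rivers Red"),
    (122314, 2, "Caesar Salad"),
    # hypothetical malformed seat: two meats and no wine
    (999999, 1, "Salmon"),
    (999999, 1, "Pork Chop"),
    (999999, 1, "Bean Trio"),
]

malformed = find_malformed_seats(rows, sample_meats, sample_sides, sample_wines)
print(malformed)  # [(999999, 1)]
```

On the full data, the same check could be run over `df[['orderNo', 'seatNo', 'item']].itertuples(index=False)` with the complete `meats`, `sides`, and `wines` lists.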

## Data Processing

### Meat + Wine

In [32]:
#keep only the meat and wine rows
df_meat_wine = df[df['item'].isin(meats + wines)]

#create a 0/1 dummy variable for each item
df_1 = pd.get_dummies(df_meat_wine["item"])*1

#add the original order number to the new df
df_1["orderNo"] = df_meat_wine["orderNo"]
#add the original seat number to the new df
df_1['seatNo'] = df_meat_wine['seatNo']

#group by orderNo and seatNo and take the max of each column,
#so each seat becomes one row flagging the items it ordered
df_1 = df_1.groupby(['orderNo', 'seatNo']).max()

#convert the 0/1 flags to booleans (DataFrame.map requires pandas >= 2.1)
preprocessed_df = df_1.map(bool)

preprocessed_df.head()

orderNo,seatNo,Adelsheim Pinot Noir,Blackstone Merlot,Brancott Pinot Grigio,Cantina Pinot Bianco,Duck Breast,Duckhorn Chardonnay,Echeverria Gran Syrah,Filet Mignon,Helben Blanc,Innocent Bystander Sauvignon Blanc,...,Oyster Bay Sauvignon Blanc,Pork Chop,Pork Tenderloin,Roast Chicken,Salmon,Sea Bass,Single Vineyard Malbec,Swordfish,Three Rivers Red,Total Recall Chardonnay
122314,1,False,False,False,False,False,False,False,False,False,False,...,True,False,False,False,True,False,False,False,False,False
122314,2,False,False,False,False,False,False,False,False,False,False,...,False,True,False,False,False,False,False,False,True,False
122314,3,False,False,False,False,False,False,False,False,False,False,...,True,False,False,False,False,True,False,False,False,False
122314,4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,True,False,False,False,True
122314,5,False,False,False,False,True,False,False,False,False,True,...,False,False,False,False,False,False,False,False,False,False


### Meat + Side

In [33]:
#keep only the meat and side rows
df_meat_side = df[df['item'].isin(meats + sides)]

#create a 0/1 dummy variable for each item
df_2 = pd.get_dummies(df_meat_side["item"])*1

#add the original order number to the new df
df_2["orderNo"] = df_meat_side["orderNo"]
#add the original seat number to the new df
df_2['seatNo'] = df_meat_side['seatNo']

#group by orderNo and seatNo and take the max of each column,
#so each seat becomes one row flagging the items it ordered
df_2 = df_2.groupby(['orderNo', 'seatNo']).max()

#convert the 0/1 flags to booleans (DataFrame.map requires pandas >= 2.1)
preprocessed_sides_df = df_2.map(bool)

preprocessed_sides_df.head()

orderNo,seatNo,Bean Trio,Caesar Salad,Duck Breast,Filet Mignon,Mashed Potatoes,Pork Chop,Pork Tenderloin,Roast Chicken,Roasted Potatoes,Roasted Root Veg,Salmon,Sea Bass,Seasonal Veg,Swordfish,Warm Goat Salad
122314,1,True,False,False,False,False,False,False,False,False,False,True,False,False,False,False
122314,2,False,True,False,False,False,True,False,False,False,False,False,False,False,False,False
122314,3,False,True,False,False,False,False,False,False,False,False,False,True,False,False,False
122314,4,True,False,False,False,False,False,False,False,False,False,False,True,False,False,False
122314,5,True,False,True,False,False,False,False,False,False,False,False,False,False,False,False


### Meat + Side + Wine

In [34]:
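#create a 0/1 dummy variable for each item (meats, sides, and wines)
df_1 = pd.get_dummies(df["item"])*1

#add the original order number to the new df
df_1["orderNo"] = df["orderNo"]
#add the original seat number to the new df
df_1['seatNo'] = df['seatNo']

#group by orderNo and seatNo and take the max of each column,
#so each seat becomes one row flagging the items it ordered
df_1 = df_1.groupby(['orderNo', 'seatNo']).max()

#convert the 0/1 flags to booleans (DataFrame.map requires pandas >= 2.1)
preprocessed_df_all = df_1.map(bool)

#note: this previews preprocessed_df (meat + wine), so the output below has no side columns;
#preprocessed_df_all.head() would preview the combined table
preprocessed_df.head()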

orderNo,seatNo,Adelsheim Pinot Noir,Blackstone Merlot,Brancott Pinot Grigio,Cantina Pinot Bianco,Duck Breast,Duckhorn Chardonnay,Echeverria Gran Syrah,Filet Mignon,Helben Blanc,Innocent Bystander Sauvignon Blanc,...,Oyster Bay Sauvignon Blanc,Pork Chop,Pork Tenderloin,Roast Chicken,Salmon,Sea Bass,Single Vineyard Malbec,Swordfish,Three Rivers Red,Total Recall Chardonnay
122314,1,False,False,False,False,False,False,False,False,False,False,...,True,False,False,False,True,False,False,False,False,False
122314,2,False,False,False,False,False,False,False,False,False,False,...,False,True,False,False,False,False,False,False,True,False
122314,3,False,False,False,False,False,False,False,False,False,False,...,True,False,False,False,False,True,False,False,False,False
122314,4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,True,False,False,False,True
122314,5,False,False,False,False,True,False,False,False,False,True,...,False,False,False,False,False,False,False,False,False,False


## Apriori Algorithm

### Support
- **Definition**: Support represents how frequently the combination of antecedent (meat) and consequent (wine) appears together in the dataset.
- **Interpretation**: It tells you the proportion of transactions where both the specified meat and wine were ordered together. In this analysis a transaction is one seat (an `orderNo` and `seatNo` pair), not a whole table.

### Confidence
- **Definition**: Confidence measures how often the wine (consequent) is purchased when the meat (antecedent) is purchased.
- **Interpretation**: It indicates the likelihood that if a customer orders a specific meat, they will also order the specific wine. High confidence means that the wine is frequently bought with the meat.

### Lift
- **Definition**: Lift measures how much more likely a customer is to buy a particular wine when they've bought a particular meat, compared to buying that wine independently of the meat.
- **Interpretation**: Lift indicates whether a meat and a wine are ordered together more or less often than their individual popularity would predict. A lift greater than 1 means the wine is ordered with the meat more often than expected if the two choices were independent. A lift less than 1 means they are ordered together less often than expected, and a lift close to 1 suggests no association.
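
As a worked check of these three definitions, the sketch below computes them from raw counts. The function name `rule_metrics` is illustrative. The transaction count assumes one transaction per seat (228,699 rows at three rows per seat), and the joint count of 4,782 is back-calculated from the support reported for Filet Mignon and Blackstone Merlot in the results further down, so treat it as approximate.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    support = n_both / n_total                    # share of transactions with both items
    confidence = n_both / n_antecedent            # share of antecedent transactions that also have the consequent
    lift = confidence / (n_consequent / n_total)  # confidence relative to the consequent's overall share
    return support, confidence, lift

n_seats = 228699 // 3  # three rows (meat, side, wine) per seat

support, confidence, lift = rule_metrics(
    n_total=n_seats,
    n_antecedent=13407,  # Filet Mignon, from value_counts
    n_consequent=8485,   # Blackstone Merlot, from value_counts
    n_both=4782,         # assumption: back-calculated from the reported support
)
print(f"support={support:.4f} confidence={confidence:.4f} lift={lift:.2f}")
```

The values come out close to the support, confidence, and lift that `association_rules` reports for this pair, which is a useful sanity check that the seat is the unit of analysis.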

### Meat and Wine Pairings Only

In [35]:
#apriori algorithm
food_wine_assoc = apriori(preprocessed_df, min_support = 0.001, use_colnames = True)

#association rules
out_rules = association_rules(food_wine_assoc,metric = "confidence", min_threshold = 0.1)

#select columns from the output
out_rules2=out_rules[['antecedents','consequents','support','confidence','lift']]

out_rules2

Unnamed: 0,antecedents,consequents,support,confidence,lift
0,(Duck Breast),(Adelsheim Pinot Noir),0.016476,0.158686,1.173565
1,(Adelsheim Pinot Noir),(Duck Breast),0.016476,0.121847,1.173565
2,(Filet Mignon),(Adelsheim Pinot Noir),0.049650,0.282315,2.087867
3,(Adelsheim Pinot Noir),(Filet Mignon),0.049650,0.367191,2.087867
4,(Adelsheim Pinot Noir),(Pork Chop),0.013826,0.102251,0.710175
...,...,...,...,...,...
80,(Three Rivers Red),(Sea Bass),0.004854,0.260931,1.616936
81,(Total Recall Chardonnay),(Sea Bass),0.015990,0.152147,0.942823
82,(Three Rivers Red),(Swordfish),0.001928,0.103667,1.062355
83,(Total Recall Chardonnay),(Swordfish),0.017447,0.166001,1.701136


Let's look only at rules with a meat as the antecedent; in this meat + wine table, the consequent is then always a wine.

In [36]:
#filtering for antecedents to be meats only
meats_antecedents = out_rules2[out_rules2['antecedents'].apply(lambda x: list(x)[0] in meats)]
#sort by lift
meats_antecedents.sort_values(by='lift', ascending=False)


Unnamed: 0,antecedents,consequents,support,confidence,lift
58,(Pork Chop),(Louis Rouge),0.016555,0.114978,3.397336
9,(Filet Mignon),(Blackstone Merlot),0.062729,0.356679,3.204565
62,(Roast Chicken),(Oyster Bay Sauvignon Blanc),0.009471,0.194086,3.072847
12,(Roast Chicken),(Brancott Pinot Grigio),0.008448,0.173118,2.947147
7,(Duck Breast),(Blackstone Merlot),0.032886,0.31674,2.845736
53,(Sea Bass),(Innocent Bystander Sauvignon Blanc),0.036651,0.227118,2.706558
18,(Swordfish),(Brancott Pinot Grigio),0.014705,0.150692,2.56537
63,(Salmon),(Oyster Bay Sauvignon Blanc),0.018365,0.149957,2.374181
43,(Filet Mignon),(Echeverria Gran Syrah),0.024648,0.140151,2.322632
21,(Salmon),(Cantina Pinot Bianco),0.013328,0.108826,2.30833


Let's see which meat and wine pairings have the highest lift.  
I'm using lift because it controls for the individual popularity of both items. Even if a wine is popular overall, a high lift means it is ordered with that meat more often than its general popularity alone would predict.
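
To make the popularity point concrete, the sketch below takes a few confidence and lift values from the rule table above and recovers each wine's overall share of seats as confidence divided by lift, which follows directly from the definition of lift. The helper name `baseline_share` is illustrative.

```python
# (meat, wine, confidence, lift) copied from the rule table above
rules = [
    ("Pork Chop", "Louis Rouge", 0.114978, 3.397336),
    ("Filet Mignon", "Blackstone Merlot", 0.356679, 3.204565),
    ("Duck Breast", "Blackstone Merlot", 0.316740, 2.845736),
    ("Sea Bass", "Innocent Bystander Sauvignon Blanc", 0.227118, 2.706558),
]

def baseline_share(confidence, lift):
    """Overall share of seats ordering the wine, since lift = confidence / share."""
    return confidence / lift

by_confidence = sorted(rules, key=lambda r: r[2], reverse=True)
by_lift = sorted(rules, key=lambda r: r[3], reverse=True)

print("Top by confidence:", by_confidence[0][:2])
print("Top by lift:      ", by_lift[0][:2])
for meat, wine, confidence, lift in rules:
    print(f"{wine}: baseline share {baseline_share(confidence, lift):.3f}")
```

Filet Mignon with Blackstone Merlot ranks first by confidence, while Pork Chop with Louis Rouge ranks first by lift, because Louis Rouge has a much smaller baseline share (about 3% of seats versus about 11% for Blackstone Merlot).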

In [37]:
#finding the highest lift meat and wine pairing
meats_antecedents.loc[meats_antecedents.groupby('antecedents')['lift'].idxmax()].sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
58,(Pork Chop),(Louis Rouge),0.016555,0.114978,3.397336
9,(Filet Mignon),(Blackstone Merlot),0.062729,0.356679,3.204565
62,(Roast Chicken),(Oyster Bay Sauvignon Blanc),0.009471,0.194086,3.072847
7,(Duck Breast),(Blackstone Merlot),0.032886,0.31674,2.845736
53,(Sea Bass),(Innocent Bystander Sauvignon Blanc),0.036651,0.227118,2.706558
18,(Swordfish),(Brancott Pinot Grigio),0.014705,0.150692,2.56537
63,(Salmon),(Oyster Bay Sauvignon Blanc),0.018365,0.149957,2.374181
5,(Pork Tenderloin),(Adelsheim Pinot Noir),0.043734,0.299336,2.213742


Let's take a look at the top 3 wines (by lift) for each type of meat.

In [54]:
top3 = meats_antecedents.groupby('antecedents', group_keys=False).apply(lambda x: x.nlargest(3, 'lift')).sort_values(by=['antecedents', 'lift'], ascending=[True, False])

top3.to_csv('Top3_Wines_By_Meat')

top3


Unnamed: 0,antecedents,consequents,support,confidence,lift
7,(Duck Breast),(Blackstone Merlot),0.032886,0.31674,2.845736
0,(Duck Breast),(Adelsheim Pinot Noir),0.016476,0.158686,1.173565
29,(Duck Breast),(Single Vineyard Malbec),0.010533,0.101453,0.992692
9,(Filet Mignon),(Blackstone Merlot),0.062729,0.356679,3.204565
43,(Filet Mignon),(Echeverria Gran Syrah),0.024648,0.140151,2.322632
2,(Filet Mignon),(Adelsheim Pinot Noir),0.04965,0.282315,2.087867
5,(Pork Tenderloin),(Adelsheim Pinot Noir),0.043734,0.299336,2.213742
72,(Pork Tenderloin),(Single Vineyard Malbec),0.026117,0.178757,1.749097
74,(Pork Tenderloin),(Total Recall Chardonnay),0.026052,0.178308,1.696579
62,(Roast Chicken),(Oyster Bay Sauvignon Blanc),0.009471,0.194086,3.072847
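
Since Eats4Life only wants additional wines listed when the data supports it, one possible screen is to keep a pairing only when its lift is above 1. The sketch below applies that screen to the Duck Breast and Filet Mignon rows of the table above. The helper `suggest_wines` and the `min_lift` cutoff are assumptions for illustration, not a rule used elsewhere in this analysis.

```python
from collections import defaultdict

# (meat, wine, lift) rows copied from the top-3 table above
top3_rows = [
    ("Duck Breast", "Blackstone Merlot", 2.845736),
    ("Duck Breast", "Adelsheim Pinot Noir", 1.173565),
    ("Duck Breast", "Single Vineyard Malbec", 0.992692),
    ("Filet Mignon", "Blackstone Merlot", 3.204565),
    ("Filet Mignon", "Echeverria Gran Syrah", 2.322632),
    ("Filet Mignon", "Adelsheim Pinot Noir", 2.087867),
]

def suggest_wines(rows, min_lift=1.0):
    """Group wines by meat (highest lift first), keeping only lift > min_lift."""
    suggestions = defaultdict(list)
    for meat, wine, lift in sorted(rows, key=lambda r: r[2], reverse=True):
        if lift > min_lift:
            suggestions[meat].append(wine)
    return dict(suggestions)

suggestions = suggest_wines(top3_rows)
for meat, wine_list in suggestions.items():
    print(meat, "->", wine_list)
```

With the default cutoff, Single Vineyard Malbec is dropped for Duck Breast because its lift is just under 1. A stricter cutoff such as `min_lift=1.5` would leave Duck Breast with Blackstone Merlot as its only suggestion.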


### Meat and Side Pairings Only

In [39]:
#apriori algorithm
food_assoc = apriori(preprocessed_sides_df, min_support = 0.001, use_colnames = True)

#association rules
out_rules_side = association_rules(food_assoc,metric = "confidence", min_threshold = 0.1)

#select columns from the output
out_rules2_side=out_rules_side[['antecedents','consequents','support','confidence','lift']]

out_rules2_side

Unnamed: 0,antecedents,consequents,support,confidence,lift
0,(Duck Breast),(Bean Trio),0.018129,0.174605,1.138054
1,(Bean Trio),(Duck Breast),0.018129,0.118160,1.138054
2,(Filet Mignon),(Bean Trio),0.030079,0.171030,1.114752
3,(Bean Trio),(Filet Mignon),0.030079,0.196050,1.114752
4,(Pork Chop),(Bean Trio),0.024543,0.170463,1.111054
...,...,...,...,...,...
84,(Warm Goat Salad),(Sea Bass),0.027718,0.199246,1.234685
85,(Sea Bass),(Warm Goat Salad),0.027718,0.171761,1.234685
86,(Swordfish),(Seasonal Veg),0.017512,0.179460,0.938709
87,(Warm Goat Salad),(Swordfish),0.016712,0.120132,1.231083


In [40]:
#filtering for antecedents to be meats only
meats_antecedents_2 = out_rules2_side[out_rules2_side['antecedents'].apply(lambda x: list(x)[0] in meats)]
#sort by lift
meats_antecedents_2.sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
18,(Pork Tenderloin),(Caesar Salad),0.032401,0.221763,1.843988
65,(Roast Chicken),(Roasted Potatoes),0.009576,0.196237,1.519214
45,(Roast Chicken),(Mashed Potatoes),0.008487,0.173925,1.469934
61,(Pork Tenderloin),(Seasonal Veg),0.036402,0.249147,1.303227
7,(Salmon),(Bean Trio),0.023717,0.193659,1.262244
85,(Sea Bass),(Warm Goat Salad),0.027718,0.171761,1.234685
88,(Swordfish),(Warm Goat Salad),0.016712,0.17126,1.231083
72,(Salmon),(Roasted Root Veg),0.021893,0.17877,1.203586
59,(Pork Tenderloin),(Roasted Potatoes),0.022208,0.152002,1.176762
46,(Salmon),(Mashed Potatoes),0.017027,0.139032,1.175034


In [41]:
#finding the highest lift side for each meat
meats_antecedents_2.loc[meats_antecedents_2.groupby('antecedents')['lift'].idxmax()].sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
18,(Pork Tenderloin),(Caesar Salad),0.032401,0.221763,1.843988
65,(Roast Chicken),(Roasted Potatoes),0.009576,0.196237,1.519214
7,(Salmon),(Bean Trio),0.023717,0.193659,1.262244
85,(Sea Bass),(Warm Goat Salad),0.027718,0.171761,1.234685
88,(Swordfish),(Warm Goat Salad),0.016712,0.17126,1.231083
0,(Duck Breast),(Bean Trio),0.018129,0.174605,1.138054
2,(Filet Mignon),(Bean Trio),0.030079,0.17103,1.114752
4,(Pork Chop),(Bean Trio),0.024543,0.170463,1.111054


### Meat + Side + Wine

In [42]:
#apriori algorithm
food_side_wine_assoc = apriori(preprocessed_df_all, min_support = 0.001, use_colnames = True)

#association rules
out_rules_all = association_rules(food_side_wine_assoc,metric = "confidence", min_threshold = 0.1)

#select columns from the output
out_rules2_all = out_rules_all[['antecedents','consequents','support','confidence','lift']]

out_rules2_all

Unnamed: 0,antecedents,consequents,support,confidence,lift
0,(Bean Trio),(Adelsheim Pinot Noir),0.017919,0.116792,0.863738
1,(Adelsheim Pinot Noir),(Bean Trio),0.017919,0.132518,0.863738
2,(Caesar Salad),(Adelsheim Pinot Noir),0.020516,0.170593,1.261626
3,(Adelsheim Pinot Noir),(Caesar Salad),0.020516,0.151727,1.261626
4,(Duck Breast),(Adelsheim Pinot Noir),0.016476,0.158686,1.173565
...,...,...,...,...,...
1014,"(Swordfish, Total Recall Chardonnay)",(Seasonal Veg),0.002873,0.164662,0.861305
1015,"(Swordfish, Seasonal Veg)",(Total Recall Chardonnay),0.002873,0.164045,1.560863
1016,"(Warm Goat Salad, Total Recall Chardonnay)",(Swordfish),0.003043,0.196610,2.014812
1017,"(Warm Goat Salad, Swordfish)",(Total Recall Chardonnay),0.003043,0.182104,1.732689


Let's see which pairings have the highest lift values.

In [43]:
out_rules2_all.sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
892,"(Roasted Potatoes, Oyster Bay Sauvignon Blanc)",(Roast Chicken),0.001863,0.249561,5.114182
524,"(Roasted Potatoes, Brancott Pinot Grigio)",(Roast Chicken),0.001718,0.223932,4.588973
861,"(Mashed Potatoes, Oyster Bay Sauvignon Blanc)",(Roast Chicken),0.001732,0.208531,4.273368
509,"(Mashed Potatoes, Brancott Pinot Grigio)",(Roast Chicken),0.001364,0.192237,3.939455
851,"(Louis Rouge, Roasted Root Veg)",(Pork Chop),0.002873,0.515294,3.578937
...,...,...,...,...,...
339,"(Mashed Potatoes, Adelsheim Pinot Noir)",(Pork Chop),0.001600,0.105172,0.730467
613,"(Caesar Salad, Total Recall Chardonnay)",(Sea Bass),0.001456,0.116109,0.719503
10,(Adelsheim Pinot Noir),(Pork Chop),0.013826,0.102251,0.710175
344,"(Roasted Potatoes, Adelsheim Pinot Noir)",(Pork Chop),0.001810,0.100364,0.697068


#### consequents = meats

In [44]:
#filtering for consequents to be meats only
meats_consequents_all = out_rules2_all[out_rules2_all['consequents'].apply(lambda x: list(x)[0] in meats)]
#sort by lift
meats_consequents_all.sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
892,"(Roasted Potatoes, Oyster Bay Sauvignon Blanc)",(Roast Chicken),0.001863,0.249561,5.114182
524,"(Roasted Potatoes, Brancott Pinot Grigio)",(Roast Chicken),0.001718,0.223932,4.588973
861,"(Mashed Potatoes, Oyster Bay Sauvignon Blanc)",(Roast Chicken),0.001732,0.208531,4.273368
509,"(Mashed Potatoes, Brancott Pinot Grigio)",(Roast Chicken),0.001364,0.192237,3.939455
851,"(Louis Rouge, Roasted Root Veg)",(Pork Chop),0.002873,0.515294,3.578937
...,...,...,...,...,...
339,"(Mashed Potatoes, Adelsheim Pinot Noir)",(Pork Chop),0.001600,0.105172,0.730467
613,"(Caesar Salad, Total Recall Chardonnay)",(Sea Bass),0.001456,0.116109,0.719503
10,(Adelsheim Pinot Noir),(Pork Chop),0.013826,0.102251,0.710175
344,"(Roasted Potatoes, Adelsheim Pinot Noir)",(Pork Chop),0.001810,0.100364,0.697068


In [45]:
meats_consequents_all.loc[meats_consequents_all.groupby('consequents')['lift'].idxmax()].sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
892,"(Roasted Potatoes, Oyster Bay Sauvignon Blanc)",(Roast Chicken),0.001863,0.249561,5.114182
851,"(Louis Rouge, Roasted Root Veg)",(Pork Chop),0.002873,0.515294,3.578937
470,"(Caesar Salad, Blackstone Merlot)",(Filet Mignon),0.008159,0.622623,3.540269
308,"(Caesar Salad, Adelsheim Pinot Noir)",(Pork Tenderloin),0.009733,0.474425,3.247155
589,"(Caesar Salad, Innocent Bystander Sauvignon Bl...",(Sea Bass),0.004237,0.521809,3.233547
499,(Blackstone Merlot),"(Filet Mignon, Seasonal Veg)",0.011557,0.10383,3.156019
485,"(Warm Goat Salad, Blackstone Merlot)",(Duck Breast),0.004631,0.323853,3.119179
623,"(Mashed Potatoes, Cantina Pinot Bianco)",(Salmon),0.002046,0.362791,2.962363
560,"(Warm Goat Salad, Brancott Pinot Grigio)",(Swordfish),0.002505,0.287651,2.947771


#### consequents = sides

In [46]:
#filtering for consequents to be sides only
sides_consequents_all = out_rules2_all[out_rules2_all['consequents'].apply(lambda x: list(x)[0] in sides)]
#sort by lift
sides_consequents_all.sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
570,"(Pork Tenderloin, Duckhorn Chardonnay)",(Caesar Salad),0.008172,0.230144,1.913675
606,"(Pork Tenderloin, Single Vineyard Malbec)",(Caesar Salad),0.005929,0.227022,1.887711
307,"(Pork Tenderloin, Adelsheim Pinot Noir)",(Caesar Salad),0.009733,0.222555,1.850575
86,(Pork Tenderloin),(Caesar Salad),0.032401,0.221763,1.843988
609,"(Pork Tenderloin, Total Recall Chardonnay)",(Caesar Salad),0.005614,0.215509,1.791979
...,...,...,...,...,...
160,(Filet Mignon),(Warm Goat Salad),0.020241,0.115089,0.827307
979,"(Roast Chicken, Total Recall Chardonnay)",(Roasted Root Veg),0.001207,0.121693,0.819309
764,"(Echeverria Gran Syrah, Filet Mignon)",(Warm Goat Salad),0.002755,0.111762,0.803387
338,"(Filet Mignon, Adelsheim Pinot Noir)",(Warm Goat Salad),0.005300,0.106737,0.767269


In [47]:
sides_consequents_all.loc[sides_consequents_all.groupby('consequents')['lift'].idxmax()].sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
570,"(Pork Tenderloin, Duckhorn Chardonnay)",(Caesar Salad),0.008172,0.230144,1.913675
977,"(Roast Chicken, Total Recall Chardonnay)",(Roasted Potatoes),0.00202,0.203704,1.577023
862,"(Oyster Bay Sauvignon Blanc, Roast Chicken)",(Mashed Potatoes),0.001732,0.182825,1.545159
1009,"(Sea Bass, Three Rivers Red)",(Warm Goat Salad),0.00101,0.208108,1.495965
439,"(Salmon, Oyster Bay Sauvignon Blanc)",(Bean Trio),0.003948,0.215,1.401342
769,"(Pork Tenderloin, Echeverria Gran Syrah)",(Seasonal Veg),0.002965,0.257697,1.347948
539,"(Salmon, Brancott Pinot Grigio)",(Roasted Root Veg),0.002204,0.199525,1.343318


#### consequents = wines

In [48]:
#filtering for consequents to be wines only
wines_consequents_all = out_rules2_all[out_rules2_all['consequents'].apply(lambda x: list(x)[0] in wines)]
#sort by lift
wines_consequents_all.sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
853,"(Roasted Root Veg, Pork Chop)",(Louis Rouge),0.002873,0.120928,3.573131
856,"(Pork Chop, Seasonal Veg)",(Louis Rouge),0.003070,0.120247,3.553009
848,"(Roasted Potatoes, Pork Chop)",(Louis Rouge),0.002230,0.118467,3.500421
858,"(Warm Goat Salad, Pork Chop)",(Louis Rouge),0.002164,0.115708,3.418910
183,(Pork Chop),(Louis Rouge),0.016555,0.114978,3.397336
...,...,...,...,...,...
19,(Warm Goat Salad),(Adelsheim Pinot Noir),0.016869,0.121264,0.896807
0,(Bean Trio),(Adelsheim Pinot Noir),0.017919,0.116792,0.863738
304,"(Caesar Salad, Pork Chop)",(Adelsheim Pinot Noir),0.001692,0.108953,0.805762
126,(Roast Chicken),(Duckhorn Chardonnay),0.005982,0.122581,0.797124


In [49]:
wines_consequents_all.loc[wines_consequents_all.groupby('consequents')['lift'].idxmax()].sort_values(by='lift', ascending=False)

Unnamed: 0,antecedents,consequents,support,confidence,lift
853,"(Roasted Root Veg, Pork Chop)",(Louis Rouge),0.002873,0.120928,3.573131
490,"(Roasted Potatoes, Filet Mignon)",(Blackstone Merlot),0.008238,0.364481,3.274655
860,"(Mashed Potatoes, Roast Chicken)",(Oyster Bay Sauvignon Blanc),0.001732,0.204019,3.230103
526,"(Roast Chicken, Roasted Root Veg)",(Brancott Pinot Grigio),0.001246,0.188492,3.208869
588,"(Caesar Salad, Sea Bass)",(Innocent Bystander Sauvignon Blanc),0.004237,0.251166,2.993148
621,"(Salmon, Mashed Potatoes)",(Cantina Pinot Bianco),0.002046,0.120185,2.549264
579,"(Caesar Salad, Filet Mignon)",(Echeverria Gran Syrah),0.003345,0.146216,2.423142
354,"(Pork Tenderloin, Roasted Root Veg)",(Adelsheim Pinot Noir),0.003502,0.312646,2.312182
785,"(Warm Goat Salad, Filet Mignon)",(Single Vineyard Malbec),0.004696,0.232016,2.270215
980,"(Roast Chicken, Seasonal Veg)",(Total Recall Chardonnay),0.001797,0.218501,2.079003
