# Intro to Recommender Systems Lab

Complete the exercises below to solidify your knowledge and understanding of recommender systems.

For this lab, we are going to be putting together a user similarity based recommender system in a step-by-step fashion. Our data set contains customer grocery purchases, and we will use similar purchase behavior to inform our recommender system. Our recommender system will generate 5 recommendations for each customer based on the purchases they have made.

In [1]:
import pandas as pd
from scipy.spatial.distance import pdist, squareform

In [5]:
data = pd.read_csv('/Users/paksivatkinavaleria/Desktop/bootcamp_ironhack/Modulo 3/lab_intro_recommender_systems-main/data/customer_product_sales.csv')

In [6]:
data.head()

Unnamed: 0,CustomerID,FirstName,LastName,SalesID,ProductID,ProductName,Quantity
0,61288,Rosa,Andersen,134196,229,Bread - Hot Dog Buns,16
1,77352,Myron,Murray,6167892,229,Bread - Hot Dog Buns,20
2,40094,Susan,Stevenson,5970885,229,Bread - Hot Dog Buns,11
3,23548,Tricia,Vincent,6426954,229,Bread - Hot Dog Buns,6
4,78981,Scott,Burch,819094,229,Bread - Hot Dog Buns,20


## Step 1: Create a data frame that contains the total quantity of each product purchased by each customer.

You will need to group by CustomerID and ProductName and then sum the Quantity field.

In [8]:
# Group by CustomerID and ProductName, then sum the Quantity
grouped_df = data.groupby(['CustomerID', 'ProductName']).agg({'Quantity': 'sum'}).reset_index()

grouped_df

Unnamed: 0,CustomerID,ProductName,Quantity
0,33,Apricots - Dried,1
1,33,Assorted Desserts,1
2,33,Bandage - Flexible Neon,1
3,33,"Bar Mix - Pina Colada, 355 Ml",1
4,33,"Beans - Kidney, Canned",1
...,...,...,...
63623,98200,Vol Au Vents,50
63624,98200,Wasabi Powder,25
63625,98200,Wine - Fume Blanc Fetzer,25
63626,98200,Wine - Hardys Bankside Shiraz,25


## Step 2: Use the `pivot_table` method to create a product by customer matrix.

The rows of the matrix should represent the products, the columns should represent the customers, and the values should be the quantities of each product purchased by each customer. You will also need to replace nulls with zeros, which you can do using the `fillna` method.

In [9]:
# Create a product by customer matrix using pivot_table
product_customer_matrix = grouped_df.pivot_table(index='ProductName', columns='CustomerID', values='Quantity', fill_value=0)

product_customer_matrix

CustomerID,33,200,264,356,412,464,477,639,649,669,...,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
ProductName,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
Anchovy Paste - 56 G Tube,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Appetizer - Mini Egg Roll, Shrimp",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Appetizer - Mushroom Tart,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0
Appetizer - Sausage Rolls,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,25.0,25.0,25.0,0.0,25.0,0.0
Apricots - Dried,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Yeast Dry - Fermipan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0
Yoghurt Tubes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0
"Yogurt - Blueberry, 175 Gr",0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,25.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0
Yogurt - French Vanilla,1.0,0.0,0.0,1.0,0.0,0.0,2.0,0.0,0.0,1.0,...,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0


## Step 3: Create a customer similarity matrix using `squareform` and `pdist`. For the distance metric, choose "euclidean."

In [10]:
# Create the customer similarity matrix using Euclidean distance
distance_matrix = pdist(product_customer_matrix.T, metric='euclidean')
customer_similarity_matrix = squareform(distance_matrix)

# Create a DataFrame for better readability
customer_similarity_df = pd.DataFrame(customer_similarity_matrix, index=product_customer_matrix.columns, columns=product_customer_matrix.columns)

customer_similarity_df

CustomerID,33,200,264,356,412,464,477,639,649,669,...,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
CustomerID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
33,0.000000,11.916375,10.488088,11.224972,11.401754,11.090537,12.409674,11.045361,11.269428,11.489125,...,206.871941,213.180675,225.656819,198.232187,230.913404,220.501701,217.188858,228.628520,239.000000,229.773802
200,11.916375,0.000000,11.747340,12.083046,12.569805,12.288206,12.165525,12.083046,11.874342,12.000000,...,206.310446,212.635839,224.697575,197.139544,230.952376,220.202180,215.728997,228.010965,239.037654,229.704158
264,10.488088,11.747340,0.000000,11.489125,11.224972,11.445523,12.000000,11.401754,11.180340,11.747340,...,206.387984,212.946003,225.435135,197.600607,230.371439,219.136943,216.612557,228.081126,238.266657,229.773802
356,11.224972,12.083046,11.489125,0.000000,12.083046,11.789826,12.328828,11.135529,11.958261,12.165525,...,206.649462,213.082144,225.452878,197.494304,231.038958,219.952268,217.437347,228.098663,238.493186,229.464594
412,11.401754,12.569805,11.224972,12.083046,0.000000,11.704700,12.328828,11.135529,11.789826,11.747340,...,206.900942,211.679002,225.572605,197.630969,230.614397,219.733930,217.446545,227.997807,238.396728,228.927936
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
97928,220.501701,220.202180,219.136943,219.952268,219.733930,219.599636,219.538152,219.924987,219.827205,220.070443,...,283.945417,283.945417,302.076149,272.717803,278.388218,0.000000,273.861279,291.547595,306.186218,307.205143
98069,217.188858,215.728997,216.612557,217.437347,217.446545,217.425849,216.903204,217.294731,217.080630,216.751009,...,283.945417,283.945417,295.803989,283.945417,285.043856,273.861279,0.000000,287.228132,297.909382,294.745653
98159,228.628520,228.010965,228.081126,228.098663,227.997807,228.197283,228.028507,228.181945,227.868383,228.103047,...,283.945417,279.508497,300.000000,290.473751,300.000000,291.547595,287.228132,0.000000,304.138127,305.163890
98185,239.000000,239.037654,238.266657,238.493186,238.396728,239.006276,238.949786,238.468027,238.692271,239.334494,...,301.039864,315.238005,306.186218,292.617498,314.245127,306.186218,297.909382,304.138127,0.000000,303.108891


## Step 4: Check your results by generating a list of the top 5 most similar customers for a specific CustomerID.

In [11]:
# Select a specific CustomerID, for example 61288
specific_customer_id = 61288

# Get the distances for the specific customer
distances = customer_similarity_df[specific_customer_id]

# Sort distances to find the most similar customers (excluding itself)
most_similar_customers = distances.sort_values().iloc[1:6]

most_similar_customers

CustomerID
15166    125.411323
5968     127.310644
8539     127.561750
5224     127.687118
5104     128.031246
Name: 61288, dtype: float64

## Step 5: From the data frame you created in Step 1, select the records for the list of similar CustomerIDs you obtained in Step 4.

In [12]:
# Select the records for these similar CustomerIDs from the grouped DataFrame
similar_customer_ids = most_similar_customers.index
similar_customers_records = grouped_df[grouped_df['CustomerID'].isin(similar_customer_ids)]

print(similar_customers_records)

       CustomerID                        ProductName  Quantity
3570         5104                  Apricots - Halves         2
3571         5104         Bacardi Breezer - Tropical         2
3572         5104                           Bay Leaf         2
3573         5104             Beans - Kidney, Canned         2
3574         5104                        Beans - Wax         2
...           ...                                ...       ...
10120       15166     Wine - Magnotta, Merlot Sr Vqa         4
10121       15166    Wine - Red, Harrow Estates, Cab         4
10122       15166            Wine - Redchard Merritt         4
10123       15166             Wine - Ruffino Chianti         4
10124       15166  Wine - Vineland Estate Semi - Dry         4

[316 rows x 3 columns]


## Step 6: Aggregate those customer purchase records by ProductName, sum the Quantity field, and then rank them in descending order by quantity.

This will give you the total number of each product purchased by the 5 most similar customers to the customer you selected in order from most purchased to least.

In [13]:
# Aggregate by ProductName and sum the Quantity field
aggregated_products = similar_customers_records.groupby('ProductName').agg({'Quantity': 'sum'}).reset_index()

# Rank them in descending order by quantity
ranked_products = aggregated_products.sort_values(by='Quantity', ascending=False)

print(ranked_products)

                    ProductName  Quantity
19           Beef - Rib Eye Aaa        12
100                Jagermeister        12
54          Chicken - Soup Base        12
113        Lime Cordial - Roses        10
108        Lamb - Pieces, Diced        10
..                          ...       ...
161            Salsify, Organic         2
160   Salmon Steak - Cohoe 8 Oz         2
159        Salmon - Sockeye Raw         2
158  Salmon - Atlantic, Skin On         2
46           Cheese - Camembert         2

[231 rows x 2 columns]


## Step 7: Filter the list for products that the chosen customer has not yet purchased and then recommend the top 5 products with the highest quantities that are left.

- Merge the ranked products data frame with the customer product matrix on the ProductName field.
- Filter for records where the chosen customer has not purchased the product.
- Show the top 5 results.

In [14]:
# Filter for products the chosen customer has not yet purchased
chosen_customer_purchases = product_customer_matrix[specific_customer_id]
unpurchased_products = ranked_products[~ranked_products['ProductName'].isin(chosen_customer_purchases[chosen_customer_purchases > 0].index)]

# Recommend the top 5 products
recommendations = unpurchased_products.head(5)

print(recommendations)

                   ProductName  Quantity
100               Jagermeister        12
54         Chicken - Soup Base        12
113       Lime Cordial - Roses        10
115  Macaroons - Two Bite Choc         9
80         Flavouring - Orange         8


## Step 8: Now that we have generated product recommendations for a single user, put the pieces together and iterate over a list of all CustomerIDs.

- Create an empty dictionary that will hold the recommendations for all customers.
- Create a list of unique CustomerIDs to iterate over.
- Iterate over the customer list performing steps 4 through 7 for each and appending the results of each iteration to the dictionary you created.

In [18]:
# Step 8: Generate recommendations for all customers
recommendations_dict = {}
customer_list = product_customer_matrix.columns

for specific_customer_id in customer_list:
    # Step 4: Get the top 5 most similar customers for a specific CustomerID
    distances = customer_similarity_df[specific_customer_id]
    most_similar_customers = distances.sort_values().iloc[1:6]
    
    # Step 5: Select the records for these similar CustomerIDs from the grouped DataFrame
    similar_customer_ids = most_similar_customers.index
    similar_customers_records = grouped_df[grouped_df['CustomerID'].isin(similar_customer_ids)]
    
    # Step 6: Aggregate by ProductName and sum the Quantity field
    aggregated_products = similar_customers_records.groupby('ProductName').agg({'Quantity': 'sum'}).reset_index()
    
    # Rank them in descending order by quantity
    ranked_products = aggregated_products.sort_values(by='Quantity', ascending=False)
    
    # Step 7: Filter for products the chosen customer has not yet purchased
    chosen_customer_purchases = product_customer_matrix[specific_customer_id]
    unpurchased_products = ranked_products[~ranked_products['ProductName'].isin(chosen_customer_purchases[chosen_customer_purchases > 0].index)]
    
    # Recommend the top 5 products
    recommendations = unpurchased_products.head(5)['ProductName'].tolist()
    
    # Ensure exactly 5 recommendations (pad with None if necessary)
    while len(recommendations) < 5:
        recommendations.append(None)
    
    # Append the recommendations to the dictionary
    recommendations_dict[specific_customer_id] = recommendations

##  Step 9: Store the results in a Pandas data frame. The data frame should a column for Customer ID and then a column for each of the 5 product recommendations for each customer.

In [19]:
# Step 9: Store the results in a DataFrame
recommendations_df = pd.DataFrame.from_dict(recommendations_dict, orient='index', columns=[f'Recommendation_{i+1}' for i in range(5)])
recommendations_df.reset_index(inplace=True)
recommendations_df.rename(columns={'index': 'CustomerID'}, inplace=True)

print(recommendations_df)

     CustomerID              Recommendation_1  \
0            33             Butter - Unsalted   
1           200  Soup - Campbells Bean Medley   
2           264       Soupfoamcont12oz 112con   
3           356         Veal - Inside, Choice   
4           412       Olive - Spread Tapenade   
..          ...                           ...   
995       97928      Soup - Campbells, Lentil   
996       98069               Skirt - 29 Foot   
997       98159           Lamb - Whole, Fresh   
998       98185               Crackers - Trio   
999       98200                   Beans - Wax   

                    Recommendation_2                 Recommendation_3  \
0      Wine - Ej Gallo Sierra Valley     Soup - Campbells Bean Medley   
1    Muffin - Carrot Individual Wrap                         Bay Leaf   
2         Wine - Two Oceans Cabernet  Bread - Italian Roll With Herbs   
3                      Lamb - Ground                          Pomello   
4        Sprouts - Baby Pea Tendrils    Wine -

## Step 10: Change the distance metric used in Step 3 to something other than euclidean (correlation, cityblock, consine, jaccard, etc.). Regenerate the recommendations for all customers and note the differences.

In [21]:
# Choose a different metric
metric = 'correlation' 

# Create the customer similarity matrix using the chosen metric
distance_matrix = pdist(product_customer_matrix.T, metric=metric)
customer_similarity_matrix = squareform(distance_matrix)

# Create a DataFrame for better readability
customer_similarity_df2 = pd.DataFrame(customer_similarity_matrix, index=product_customer_matrix.columns, columns=product_customer_matrix.columns)

customer_similarity_df2

CustomerID,33,200,264,356,412,464,477,639,649,669,...,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
CustomerID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
33,0.000000,1.025211,0.887943,0.959400,0.972579,0.971542,1.051663,1.008305,1.003146,0.990052,...,1.026814,1.064510,1.026400,1.083910,1.000520,1.084359,1.027589,1.055680,1.035068,1.032223
200,1.025211,0.000000,1.001067,1.005715,1.073948,1.075028,0.926265,1.080738,1.003482,0.978399,...,0.981300,1.015439,0.930933,0.964814,1.030125,1.063924,0.870938,0.998990,1.064954,1.046097
264,0.887943,1.001067,0.000000,1.009730,0.947144,1.039546,0.987110,1.079281,0.991922,1.039807,...,0.965398,1.037471,1.001002,1.003357,0.931461,0.903754,0.953715,0.986098,0.938976,1.037179
356,0.959400,1.005715,1.009730,0.000000,1.040222,1.041743,0.992298,0.969268,1.071740,1.056114,...,1.003586,1.054821,1.005365,0.991127,1.023312,1.015781,1.065232,0.991223,0.974719,0.997480
412,0.972579,1.073948,0.947144,1.040222,0.000000,1.009738,0.979762,0.951334,1.024509,0.969367,...,1.048723,0.888568,1.032385,1.019126,0.981172,0.999745,1.077885,0.990375,0.976296,0.941690
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
97928,1.084359,1.063924,0.903754,1.015781,0.999745,0.968645,0.981220,1.002805,0.998928,1.029966,...,1.021932,0.981406,1.050846,0.960573,0.870999,0.000000,0.902651,0.964935,1.032316,1.070734
98069,1.027589,0.870938,0.953715,1.065232,1.077885,1.065183,1.012509,1.039542,1.018858,0.976137,...,1.043515,1.001031,1.027032,1.063468,0.930410,0.902651,0.000000,0.954242,0.995756,1.004420
98159,1.055680,0.998990,0.986098,0.991223,0.990375,1.003224,0.999040,0.991565,0.959505,0.991348,...,0.979906,0.913048,0.997792,1.044248,0.974952,0.964935,0.954242,0.000000,0.982682,1.017836
98185,1.035068,1.064954,0.938976,0.974719,0.976296,1.042204,1.049346,0.956425,0.999623,1.083135,...,1.067277,1.125484,1.009437,1.020435,1.040386,1.032316,0.995756,0.982682,0.000000,0.976622


In [23]:
# Select a specific CustomerID, for example 61288
specific_customer_id = 61288

# Get the distances for the specific customer
distances = customer_similarity_df2[specific_customer_id]

# Sort distances to find the most similar customers (excluding itself)
most_similar_customers2 = distances.sort_values().iloc[1:6]

most_similar_customers2

CustomerID
15166    0.822959
38405    0.827130
42611    0.830939
25624    0.843641
90069    0.850675
Name: 61288, dtype: float64

In [25]:
# Select the records for these similar CustomerIDs from the grouped DataFrame
similar_customer_ids = most_similar_customers2.index
similar_customers_records2 = grouped_df[grouped_df['CustomerID'].isin(similar_customer_ids)]

print(similar_customers_records2)

       CustomerID                    ProductName  Quantity
10063       15166            Arizona - Green Tea         4
10064       15166  Bar Mix - Pina Colada, 355 Ml         4
10065       15166        Beans - Kidney, Red Dry         4
10066       15166             Beef - Rib Eye Aaa         4
10067       15166  Beef - Tenderlion, Center Cut         4
...           ...                            ...       ...
58260       90069     Wine - Two Oceans Cabernet        46
58261       90069  Wine - Vidal Icewine Magnotta        23
58262       90069       Wine - White Cab Sauv.on        23
58263       90069                Wonton Wrappers        23
58264       90069        Yogurt - French Vanilla        46

[328 rows x 3 columns]


In [26]:
# Aggregate by ProductName and sum the Quantity field
aggregated_products2 = similar_customers_records2.groupby('ProductName').agg({'Quantity': 'sum'}).reset_index()

# Rank them in descending order by quantity
ranked_products2 = aggregated_products2.sort_values(by='Quantity', ascending=False)

print(ranked_products2)

                          ProductName  Quantity
113  Longos - Grilled Salmon With Bbq        92
98                       Jagermeister        57
230        Wine - Two Oceans Cabernet        57
220          Wine - Fume Blanc Fetzer        54
205                       Tofu - Firm        51
..                                ...       ...
86                     Fond - Neutral         4
204                 Tilapia - Fillets         4
83                Flavouring - Orange         4
70                 Crush - Cream Soda         4
199                   Tea - Earl Grey         4

[238 rows x 2 columns]


In [27]:
# Filter for products the chosen customer has not yet purchased
chosen_customer_purchases2 = product_customer_matrix[specific_customer_id]
unpurchased_products = ranked_products[~ranked_products['ProductName'].isin(chosen_customer_purchases[chosen_customer_purchases > 0].index)]

# Recommend the top 5 products
recommendations2 = unpurchased_products.head(5)

print(recommendations2)

                       ProductName  Quantity
10                     Beans - Wax        27
177                   Sherry - Dry        26
153  Pork - Hock And Feet Attached        22
143                   Pears - Bosc        22
182     Soup - Campbells, Cream Of        22


In [28]:
recommendations_dict2 = {}
customer_list2 = product_customer_matrix.columns

for specific_customer_id in customer_list2:
    # Step 4: Get the top 5 most similar customers for a specific CustomerID
    distances = customer_similarity_df2[specific_customer_id]
    most_similar_customers2 = distances.sort_values().iloc[1:6]
    
    # Step 5: Select the records for these similar CustomerIDs from the grouped DataFrame
    similar_customer_ids = most_similar_customers2.index
    similar_customers_records2 = grouped_df[grouped_df['CustomerID'].isin(similar_customer_ids)]
    
    # Step 6: Aggregate by ProductName and sum the Quantity field
    aggregated_products2 = similar_customers_records2.groupby('ProductName').agg({'Quantity': 'sum'}).reset_index()
    
    # Rank them in descending order by quantity
    ranked_products2 = aggregated_products2.sort_values(by='Quantity', ascending=False)
    
    # Step 7: Filter for products the chosen customer has not yet purchased
    chosen_customer_purchases = product_customer_matrix[specific_customer_id]
    unpurchased_products = ranked_products[~ranked_products['ProductName'].isin(chosen_customer_purchases[chosen_customer_purchases > 0].index)]
    
    # Recommend the top 5 products
    recommendations2 = unpurchased_products.head(5)['ProductName'].tolist()
    
    # Ensure exactly 5 recommendations (pad with None if necessary)
    while len(recommendations2) < 5:
        recommendations2.append(None)
    
    # Append the recommendations to the dictionary
    recommendations_dict2[specific_customer_id] = recommendations2

In [29]:
recommendations_df2 = pd.DataFrame.from_dict(recommendations_dict2, orient='index', columns=[f'Recommendation_{i+1}' for i in range(5)])
recommendations_df2.reset_index(inplace=True)
recommendations_df2.rename(columns={'index': 'CustomerID'}, inplace=True)

print(recommendations_df2)

     CustomerID  Recommendation_1               Recommendation_2  \
0            33  Sea Bass - Whole                    Beans - Wax   
1           200  Sea Bass - Whole                    Beans - Wax   
2           264  Sea Bass - Whole                    Beans - Wax   
3           356  Sea Bass - Whole                    Beans - Wax   
4           412  Sea Bass - Whole              Apricots - Halves   
..          ...               ...                            ...   
995       97928  Sea Bass - Whole                    Beans - Wax   
996       98069  Sea Bass - Whole                   Sherry - Dry   
997       98159  Sea Bass - Whole                    Beans - Wax   
998       98185      Sherry - Dry  Pork - Hock And Feet Attached   
999       98200       Beans - Wax                   Sherry - Dry   

                  Recommendation_3               Recommendation_4  \
0                Apricots - Halves                   Sherry - Dry   
1                Apricots - Halves  Pork - Ho