# Final Model: Model Building, Implementation, Testing and Feedback

In [None]:
# Import the necessary libraries
import warnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Filter and ignore warnings
warnings.filterwarnings("ignore")

In [None]:
# Load the pre-processed and cleaned data
model_data = pd.read_csv('model_data.csv')

In [None]:
model_data

Unnamed: 0,food_id,food_name,food_description,ingredients,veg_or_non_veg,allergies,seasonal_preference,dietary_restrictions
0,137739,arriba baked winter squash mexican style,autumn is my favorite time of year to cook! th...,"winter squash , mexican seasoning , mixed spic...",veg,"honey , milk , lactose , olive , squash , dairy",autumn,"low fat, high carb, high caffeine, low lactose"
1,31490,a bit different breakfast pizza,this recipe calls for the crust to be prebaked...,"pizza crust , sausage patty , egg , milk , sa...",non-veg,"gluten , milk , lactose , poultry , dairy",none,"high fat, low carb, high caffeine, high lactose"
2,112140,all in the kitchen chili,this modified version of 'mom's' chili was a h...,"ground beef , yellow onion , tomato , tomato ...",non-veg,"ltp , milk , lactose , nightshade , dairy","autumn, winter","high sugar, high fat, high carb, high caffeine..."
3,59389,alouette potatoes,"this is a super easy, great tasting, make ahea...","cheese garlic and herb , potato , shallot ,...",veg,"hypersensitivity , milk , lactose , olive , da...",none,"medium fat, medium carb, high caffeine"
4,44061,amish tomato ketchup for canning,my dh's amish mother raised him on this recipe...,"tomato juice , apple cider vinegar , sugar , s...",veg,"nightshade , oral , sugar",none,"low fat, high carb, high caffeine"
...,...,...,...,...,...,...,...,...
238758,700250,Til Pitha,,"Glutinous rice , black sesame seed , gur",veg,,none,"low sugar, low fat, low carb, low caffeine, lo..."
238759,700251,Bebinca,,"Coconut milk , egg yolk , clarified butter , a...",veg,"milk , lactose , nut , dairy",none,"low sugar, low fat, low carb, low caffeine, lo..."
238760,700252,Shufta,,"Cottage cheese , date , rose petal , pistach...",veg,"milk , lactose , nut , dairy",none,"low sugar, low fat, low carb, low caffeine, lo..."
238761,700253,Mawa Bati,,"Milk powder , fruit , arrowroot powder , all ...",veg,"milk , lactose , dairy",none,"low sugar, low fat, low carb, low caffeine, lo..."


## Model Features

**Tolerant Filtering:** Instead of filtering out dishes that do not exactly match the user's dietary restrictions, allergies, or seasonal preference, the model ranks them lower. The user therefore still gets recommendations even when no dish matches every preference. <br>

The model ranks food items by a score, and each user preference contributes to that score with its own weight. The weights are currently pre-defined according to how much each preference matters. Once we collect feedback, we can ask users to rank the importance of each preference and map the weights to those user-defined rankings.
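As a minimal sketch of the difference between a hard filter and tolerant, score-based ranking, consider the toy example below. The three dishes, the `WEIGHTS` values, and the `soft_score` helper are illustrative assumptions, not part of the dataset or of the model code.

```python
# Hypothetical weights: bonuses for matches, a penalty for allergens.
WEIGHTS = {"veg_or_non_veg": 2.0, "allergy_penalty": -3.0, "seasonal_preference": 0.5}

# Toy dishes (made up for illustration only).
dishes = [
    {"name": "dish A", "veg_or_non_veg": "veg", "allergies": "milk , dairy", "seasonal_preference": "winter"},
    {"name": "dish B", "veg_or_non_veg": "veg", "allergies": "", "seasonal_preference": "none"},
    {"name": "dish C", "veg_or_non_veg": "non-veg", "allergies": "nut", "seasonal_preference": "winter"},
]

prefs = {"veg_or_non_veg": ["veg"], "allergies": ["dairy"], "seasonal_preference": ["winter"]}


def soft_score(dish, prefs):
    """Add a bonus for each matched preference and a penalty for each allergen."""
    score = 0.0
    if dish["veg_or_non_veg"] in prefs["veg_or_non_veg"]:
        score += WEIGHTS["veg_or_non_veg"]
    for allergen in prefs["allergies"]:
        if allergen in dish["allergies"]:
            score += WEIGHTS["allergy_penalty"]
    if any(season in dish["seasonal_preference"] for season in prefs["seasonal_preference"]):
        score += WEIGHTS["seasonal_preference"]
    return score


# Hard filter: keep only dishes that satisfy every preference.
strict = [
    d for d in dishes
    if d["veg_or_non_veg"] in prefs["veg_or_non_veg"]
    and not any(a in d["allergies"] for a in prefs["allergies"])
    and any(s in d["seasonal_preference"] for s in prefs["seasonal_preference"])
]

# Tolerant ranking: keep every dish, ordered by score (best first).
ranked_names = [d["name"] for d in sorted(dishes, key=lambda d: soft_score(d, prefs), reverse=True)]

print(len(strict), ranked_names)
```

In this toy case the hard filter returns nothing, while the tolerant ranking still orders all three dishes and pushes the one containing the allergen to the bottom.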

**Partial Matching:** Instead of requiring an exact match between the user's preferred ingredients and the ingredients in a dish, the model looks for partial matches. For example, if the user prefers dishes with potatoes, the model recommends a dish that contains potatoes even if it also contains other ingredients that the user did not specify.
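One way to picture partial matching is as the fraction of the user's preferred ingredients that appear anywhere in a dish's ingredient string. The helper `ingredient_match_fraction` below is a hypothetical illustration; the model itself relies on TF-IDF similarity rather than on this function.

```python
def ingredient_match_fraction(preferred, dish_ingredients):
    """Return the fraction of preferred ingredients found in a comma-separated ingredient string."""
    dish_items = [item.strip().lower() for item in dish_ingredients.split(",")]
    wanted = [p.strip().lower() for p in preferred if p.strip()]
    if not wanted:
        return 0.0
    # A preferred ingredient counts as a hit if it is a substring of any dish item,
    # so "potato" also matches "sweet potato".
    hits = sum(any(w in item for item in dish_items) for w in wanted)
    return hits / len(wanted)


frac = ingredient_match_fraction(["potato", "rice"], "cheese garlic and herb , potato , shallot")
print(frac)
```

Substring matching is deliberately loose. It can also produce false hits (for example, "rice" is a substring of "licorice"), so a stricter token-level comparison may be preferable in practice.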

**Multiple Preferences:** The user can specify multiple preferences for each category. For example, the user can say that they like both "summer" and "winter" dishes, or that they accept both "veg" and "non-veg" dishes. Accordingly, the model recommends dishes that match any of the user's preferences.
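The "match any" behaviour can be sketched as follows, assuming that each dish stores its values as a comma-separated string (as the `seasonal_preference` column appears to). The helper name `matches_any` is hypothetical.

```python
def matches_any(user_values, dish_value):
    """True if at least one user value equals one of the dish's comma-separated values."""
    dish_values = {v.strip().lower() for v in dish_value.split(",")}
    return any(u.strip().lower() in dish_values for u in user_values)


print(matches_any(["summer", "winter"], "autumn, winter"))  # the dish is a winter dish
print(matches_any(["summer"], "autumn, winter"))            # no overlap
```

Comparing whole values rather than substrings matters for the veg/non-veg category, because "veg" is a substring of "non-veg".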

**Adjustable Importance of Each Category:** Instead of treating each category (e.g., dietary restrictions, allergies, seasonal preference, veg/non-veg) equally, the model assigns each category a weight based on the effect of that preference. Dietary restrictions and allergies get higher importance, because ignoring them may have adverse effects, while seasonal preference gets low importance and mainly separates dishes that already meet the other conditions. Accordingly, dietary restrictions and allergies carry more weight than seasonal preference when ranking dishes. <br>

Based on user feedback, the app could add a feature that asks the user to rate the importance of each preference; those ratings would then set the weights of the corresponding preferences and of the similarity score.
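A possible way to turn user ratings into weights is sketched below. The 1 to 5 rating scale, the linear mapping, and the names `DEFAULT_WEIGHTS` and `weights_from_ratings` are assumptions made for illustration; the default values mirror the magnitudes described above (high for allergies and dietary restrictions, low for season).

```python
# Assumed default weight magnitudes per category (the allergy weight is applied as a penalty).
DEFAULT_WEIGHTS = {
    "veg_or_non_veg": 2.0,
    "allergies": 3.0,
    "dietary_restrictions": 3.0,
    "seasonal_preference": 0.5,
}


def weights_from_ratings(ratings, max_weight=3.0, scale_max=5):
    """Map ratings on a 1..scale_max scale linearly to weights in (0, max_weight].

    Categories the user did not rate keep their default weight.
    """
    weights = dict(DEFAULT_WEIGHTS)
    for category, rating in ratings.items():
        if category not in weights:
            raise KeyError(f"unknown category: {category}")
        if not 1 <= rating <= scale_max:
            raise ValueError(f"rating for {category} must be between 1 and {scale_max}")
        weights[category] = max_weight * rating / scale_max
    return weights


# Example: this user cares a lot about season and little about veg/non-veg.
custom = weights_from_ratings({"seasonal_preference": 5, "veg_or_non_veg": 1})
print(custom)
```

For safety-related categories such as allergies, it may be sensible to enforce a minimum weight regardless of the rating.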

**Alternative Similarity Measure:** Instead of computing cosine similarity on 'ingredients' alone, the similarity measure also includes 'dietary_restrictions' and 'seasonal_preference'. For example, two dishes are considered similar if they share ingredients and also meet similar dietary restrictions.
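The sketch below shows the idea on three made-up "combined" strings: ingredients, dietary labels, and season are concatenated into one text field, vectorized with TF-IDF, and compared with the user's combined preference. The example strings are assumptions. The snippet also checks that `linear_kernel` and `cosine_similarity` agree here, which holds because `TfidfVectorizer` L2-normalizes its rows by default.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Toy combined fields: ingredients + dietary labels + season (made up for illustration).
combined_docs = [
    "rice, onion, low fat, low carb, winter",
    "beef, tomato, high fat, high carb, summer",
    "rice, milk, high sugar, none",
]
user_combined = "rice, low fat, winter"

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(combined_docs)
user_vector = vectorizer.transform([user_combined])

cos = cosine_similarity(user_vector, doc_matrix).flatten()
dot = linear_kernel(user_vector, doc_matrix).flatten()

# Index of the dish most similar to the user's combined preference.
best = int(np.argmax(cos))
print(best, cos.round(3))
```

One caveat of word-level tokens: "low fat" and "high fat" both contribute the token "fat", so dishes with opposite labels still share some similarity. Treating each label as a single token would avoid this.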

## Define the Food Recommendation Function

In [None]:
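def convert_to_string(value):
    # Columns loaded from CSV hold comma-separated strings, not lists.
    # Joining a string would split it into single characters, so only
    # join real lists and pass strings through unchanged.
    if isinstance(value, (list, tuple)):
        return ', '.join(map(str, value))
    return '' if pd.isna(value) else str(value)

model_data['combined'] = model_data['ingredients'].fillna('') + ', ' + \
                         model_data['dietary_restrictions'].apply(convert_to_string) + ', ' + \
                         model_data['seasonal_preference'].apply(convert_to_string)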

def recommend_food(user_input, top_n=5):
    # Create a score for each dish
    scores = np.zeros(len(model_data))

    def matches(column, term):
        # Case-insensitive literal substring match; missing values never match
        return model_data[column].fillna('').str.contains(
            term.strip(), case=False, regex=False).to_numpy()

    # Give score to dishes that match veg_or_non_veg preference (high importance).
    # Compared exactly, because 'veg' is a substring of 'non-veg'.
    for pref in user_input['veg_or_non_veg']:
        scores[(model_data['veg_or_non_veg'] == pref.strip()).to_numpy()] += 2

    # Deduct score for dishes that contain ingredients user is allergic to (high importance)
    for allergen in user_input['allergies']:
        if allergen.strip():
            scores[matches('allergies', allergen)] -= 3

    # Give score to dishes that match dietary restrictions (high importance).
    # A substring match is needed because each cell lists several restrictions.
    for diet in user_input['dietary_restrictions']:
        if diet.strip():
            scores[matches('dietary_restrictions', diet)] += 3

    # Give score to dishes that match seasonal preference (low importance)
    for season in user_input['seasonal_preference']:
        if season.strip():
            scores[matches('seasonal_preference', season)] += 0.5

    # Use TF-IDF to vectorize the combined field
    tfidf = TfidfVectorizer(stop_words='english')
    tfidf_matrix = tfidf.fit_transform(model_data['combined'])

    # Vectorize user's combined preference
    user_pref_vector = tfidf.transform([user_input['combined']])

    # TF-IDF rows are L2-normalized by default, so the linear kernel
    # (dot product) equals cosine similarity
    cosine_similarities = linear_kernel(user_pref_vector, tfidf_matrix).flatten()

    scores += cosine_similarities

    # Get the top_n highest scoring dishes, best first
    top_indices = scores.argsort()[::-1][:top_n]

    # Get corresponding cosine similarity scores (not the combined scores)
    top_scores = cosine_similarities[top_indices]

    # Return names of the dishes and their cosine similarities
    return model_data.iloc[top_indices]['food_name'], top_scores


## Model Recommendations Based on the User Inputs

In [None]:
# User inputs
def parse_list(text):
    # Split on commas and strip whitespace, so "a,b" and "a, b" both work;
    # empty entries are dropped
    return [item.strip() for item in text.split(",") if item.strip()]

veg_or_non_veg = parse_list(input("Please enter your preference (veg/non-veg) (separated by comma): "))
seasonal_preference = parse_list(input("Please enter your seasonal preference (separated by comma): "))
allergies = parse_list(input("Please enter your allergies (separated by comma): "))
dietary_restrictions = parse_list(input("Please enter your dietary restrictions (separated by comma): "))
ingredients = input("Please enter your preferred ingredients (separated by comma): ")

user_input = {
    'veg_or_non_veg': veg_or_non_veg,
    'seasonal_preference': seasonal_preference,
    'allergies': allergies,
    'dietary_restrictions': dietary_restrictions,
    'ingredients': ingredients,
    'combined': ingredients + ', ' + ', '.join(dietary_restrictions) + ', ' + ', '.join(seasonal_preference)
}

recommendations, scores = recommend_food(user_input)
print("\nTop five food recommendations for you:")
# Iterate over what was returned, so fewer than five results cannot raise an IndexError
for i, (name, score) in enumerate(zip(recommendations, scores), start=1):
    print(f"{i}. {name} with a score of {score}")


Please enter your preference (veg/non-veg) (separated by comma): veg
Please enter your seasonal preference (separated by comma): winter
Please enter your allergies (separated by comma): dairy
Please enter your dietary restrictions (separated by comma): nut
Please enter your preferred ingredients (separated by comma): rice

Top five food recommendations for you:
1. baked winter squash  au gratin with a score of 0.37962830263718395
2. dr  oz s 2 week rapid weight loss plan  vegetable broth with a score of 0.36635345206234265
3. foolproof  brown rice with a score of 0.20608058894263126
4. bud s spicy nuts with a score of 0.16302320212970264
5. ginger apple dessert with a score of 0.1553962101532634


In [None]:
user_input

{'veg_or_non_veg': ['veg'],
 'seasonal_preference': ['winter'],
 'allergies': ['dairy'],
 'dietary_restrictions': ['nut'],
 'ingredients': 'rice',
 'combined': 'rice, nut, winter'}

### Feedback Loop

After recommending dishes to the user, we ask them for feedback on the recommendations. Did they like the dishes? Why or why not? We can use this feedback to improve future recommendations for the user.

In [None]:
user_feedback = {}

for i in range(5):
    dish_name = recommendations.iloc[i]
    score = scores[i]

    print(f"{i+1}. {dish_name} with a cosine similarity of {score}")

    like = input(f"Did you like the dish {dish_name}? (yes/no): ")
    reason = input("Why did you or did you not like the dish?: ")

    user_feedback[dish_name] = {'like': like, 'reason': reason}


1. moroccan pork casserole with a cosine similarity of 0.15535451071260567
Did you like the dish moroccan pork casserole? (yes/no): no
Why did you or did you not like the dish?: did not match my dietary preference
2. chicken or turkey breast lunchmeat with a cosine similarity of 0.0792157622143917
Did you like the dish chicken or turkey breast lunchmeat? (yes/no): yes
Why did you or did you not like the dish?: matched my dietary preference
3. cashew chicken stir fry with a cosine similarity of 0.07010531292979737
Did you like the dish cashew chicken stir fry? (yes/no): yes
Why did you or did you not like the dish?: matched my dietary preference
4. lemony pad thai with a cosine similarity of 0.06780757208881762
Did you like the dish lemony pad thai? (yes/no): no
Why did you or did you not like the dish?: did not match my veg_nonveg preference
5. easy seafood stew with a cosine similarity of 0.06495643442193061
Did you like the dish easy seafood stew? (yes/no): yes
Why did you or did you not like the dish?: matched my food preference

In [None]:
# Convert to DataFrame
user_feedback_df = pd.DataFrame(user_feedback)

user_feedback_df.T

Unnamed: 0,like,reason
moroccan pork casserole,no,did not match my dietary preference
chicken or turkey breast lunchmeat,yes,matched my dietary preference
cashew chicken stir fry,yes,matched my dietary preference
lemony pad thai,no,did not match my veg_nonveg preference
easy seafood stew,yes,matched my food preference
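One simple way such feedback could flow back into the model is to raise the weight of a category whenever a disliked dish's reason mentions it. The sketch below is only an assumption about how this might be done: the keyword table `REASON_KEYWORDS`, the starting weights, the step size, and the helper `adjust_weights` are all hypothetical, and free-text reasons would need more robust parsing in practice.

```python
# Feedback as collected above (dish -> like / reason).
feedback = {
    "moroccan pork casserole": {"like": "no", "reason": "did not match my dietary preference"},
    "chicken or turkey breast lunchmeat": {"like": "yes", "reason": "matched my dietary preference"},
    "cashew chicken stir fry": {"like": "yes", "reason": "matched my dietary preference"},
    "lemony pad thai": {"like": "no", "reason": "did not match my veg_nonveg preference"},
    "easy seafood stew": {"like": "yes", "reason": "matched my food preference"},
}

# Hypothetical mapping from words in the reason text to preference categories.
REASON_KEYWORDS = {
    "dietary": "dietary_restrictions",
    "veg_nonveg": "veg_or_non_veg",
    "season": "seasonal_preference",
    "allerg": "allergies",
}

# Assumed starting weights per category.
weights = {"veg_or_non_veg": 2.0, "allergies": 3.0, "dietary_restrictions": 3.0, "seasonal_preference": 0.5}


def adjust_weights(weights, feedback, step=0.5):
    """Increase a category's weight for each disliked dish whose reason mentions that category."""
    updated = dict(weights)
    for entry in feedback.values():
        if entry["like"].strip().lower() != "no":
            continue
        reason = entry["reason"].lower()
        for keyword, category in REASON_KEYWORDS.items():
            if keyword in reason:
                updated[category] += step
    return updated


new_weights = adjust_weights(weights, feedback)
print(new_weights)
```

With the feedback above, the two disliked dishes raise the dietary and veg/non-veg weights by one step each, while the other weights stay unchanged.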


### Model Performance Evaluation

In [None]:
# Convert feedback to DataFrame
user_feedback_df = pd.DataFrame({
    'like': ['no', 'yes', 'yes', 'no', 'yes'],
    'reason': ['did not match my dietary preference', 'matched my dietary preference', 'matched my dietary preference', 'did not match my veg_nonveg preference', 'matched my food preference'],
    'recommendation': ['moroccan pork casserole', 'chicken or turkey breast lunchmeat', 'cashew chicken stir fry', 'lemony pad thai', 'easy seafood stew']
})

# Define true positives and false positives
TP = sum(user_feedback_df['like'] == 'yes')  # user liked the recommended dishes
FP = sum(user_feedback_df['like'] == 'no')   # user didn't like the recommended dishes

# Calculate precision
precision = TP / (TP + FP)

print('Model Precision: ', precision)

Model Precision:  0.6
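The figure above is the precision over the five recommended dishes, with a "yes" counted as a true positive and a "no" as a false positive. A small reusable version, often called precision at k, might look like the sketch below; the function name `precision_at_k` and its handling of empty input are assumptions.

```python
import math


def precision_at_k(likes, k=None):
    """Share of the first k recommendations the user liked ('yes'); uses all of them if k is None."""
    labels = [str(value).strip().lower() == "yes" for value in likes]
    if k is not None:
        labels = labels[:k]
    if not labels:
        return math.nan  # no recommendations, so precision is undefined
    return sum(labels) / len(labels)


likes = ["no", "yes", "yes", "no", "yes"]
p_all = precision_at_k(likes)      # 3 liked out of 5
p_top3 = precision_at_k(likes, 3)  # 2 liked out of the first 3
print(p_all, p_top3)
```

A value computed from one user and five dishes is only a rough indication; averaging over many users and sessions would give a more reliable estimate.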
