## Memory-Based Collaborative Filtering of Recipes & Ingredients and Their Ratings
In this type of recommendation system there are two approaches:

1) Item-based: recommends items similar to those the user has already rated (item-item similarity)
2) User-based: recommends items rated by users with similar tastes (user-user similarity)
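To make the difference between the two approaches concrete, here is a minimal sketch on a toy rating matrix. The users `u1`..`u3`, the items `item_a`..`item_c`, and the ratings are hypothetical and are not taken from the datasets used in this notebook.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical user-item matrix (rows = users, columns = items); 0 means "not rated"
toy_ratings = pd.DataFrame(
    [[5, 4, 0],
     [4, 5, 0],
     [0, 0, 3]],
    index=["u1", "u2", "u3"],
    columns=["item_a", "item_b", "item_c"],
)

# User-based: compare the rows (users) with each other
user_sim = pd.DataFrame(
    cosine_similarity(toy_ratings),
    index=toy_ratings.index,
    columns=toy_ratings.index,
)

# Item-based: transpose first, so that the items become the rows being compared
item_sim = pd.DataFrame(
    cosine_similarity(toy_ratings.T),
    index=toy_ratings.columns,
    columns=toy_ratings.columns,
)

print(user_sim.round(3))
print(item_sim.round(3))
```

In this toy case `u1` and `u2` rate the same two items in a similar way, so their similarity is high, while `u3` shares no rated item with them and gets a similarity of 0. The same holds for `item_a` and `item_b` versus `item_c`.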

## Loading Required Libraries

In [1]:
# Importing all the required libraries
import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler

## Loading the datasets 

In [2]:
# Loading the ingredients dataset
ingredients_data = pd.read_csv('../data/ingredients.csv')

# Loading the ratings dataset
ratings_data = pd.read_csv("../data/ratings.csv")

## Data preprocessing 

In [3]:
# Dropping rows with missing values, then converting 'User_ID' and 'Food_ID' from float to int
ratings_data.dropna(subset=['User_ID', 'Food_ID', 'Rating'], inplace=True) 
ratings_data['User_ID'] = ratings_data['User_ID'].astype(int)
ratings_data['Food_ID'] = ratings_data['Food_ID'].astype(int)

In [4]:
# Checking for the duplicate entries in both datasets and clearing them
ratings_data = ratings_data.drop_duplicates()
ingredients_data = ingredients_data.drop_duplicates()

In [5]:
# Ensuring that all Food_IDs in ratings_data are present in ingredients_data
# It's important that every food item rated has information in the ingredients dataset
food_ids_in_ratings = set(ratings_data['Food_ID'])
food_ids_in_ingredients = set(ingredients_data['Food_ID'])
missing_food_ids = food_ids_in_ratings - food_ids_in_ingredients

if missing_food_ids:
    print("Warning: There are Food_IDs in ratings that are missing in the ingredients dataset:", missing_food_ids)
else:
    print("All Food_IDs in ratings are consistent with the ingredient dataset.")

# Checking summary and information of the cleaned datasets
print(ratings_data.info())
print(ingredients_data.info())

All Food_IDs in ratings are consistent with the ingredient dataset.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 511 entries, 0 to 510
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   User_ID  511 non-null    int64
 1   Food_ID  511 non-null    int64
 2   Rating   511 non-null    int64
dtypes: int64(3)
memory usage: 12.1 KB
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   Food_ID   400 non-null    int64 
 1   Name      400 non-null    object
 2   C_Type    400 non-null    object
 3   Veg_Non   400 non-null    object
 4   Describe  400 non-null    object
dtypes: int64(1), object(4)
memory usage: 15.8+ KB
None


## Normalizing the data

In [6]:
# Scaling the ratings to the range [0, 1]; note that MinMaxScaler scales each column (Food_ID) independently
scaler = MinMaxScaler(feature_range=(0, 1))

# Ensuring the user_item_matrix is prepared by pivoting ratings_data
user_item_matrix = ratings_data.pivot_table(index='User_ID', columns='Food_ID', values='Rating', fill_value=0)

# Applying Min-Max scaling
user_item_matrix_scaled = scaler.fit_transform(user_item_matrix)

# Converting the scaled array back to a DataFrame, maintaining original indices and columns
user_item_matrix_scaled = pd.DataFrame(user_item_matrix_scaled, index=user_item_matrix.index, columns=user_item_matrix.columns)
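
One detail of this step is worth illustrating: `MinMaxScaler` works column by column, so each food item is scaled by its own minimum and maximum rather than by a global rating range. The sketch below uses a hypothetical two-item matrix and assumes a 1-10 rating scale purely for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical pivoted ratings; 0 marks "not rated", as produced by fill_value=0
toy_matrix = pd.DataFrame(
    {"item_a": [0, 5, 10], "item_b": [0, 2, 4]},
    index=["u1", "u2", "u3"],
)

# Each column is scaled with its own min and max
toy_scaled = pd.DataFrame(
    MinMaxScaler(feature_range=(0, 1)).fit_transform(toy_matrix),
    index=toy_matrix.index,
    columns=toy_matrix.columns,
)

print(toy_scaled)
```

Here a raw rating of 4 on `item_b` and a raw rating of 10 on `item_a` both map to 1.0, because each is the highest rating in its column. Whether this per-item scaling is desirable depends on the use case; scaling by the global rating range would be an alternative.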

## Computing Item-Item Similarity Matrix (item based) using the cosine similarity

In [7]:
# Transposing the matrix so that items become rows and users become columns
item_matrix_scaled = user_item_matrix_scaled.T

# Calculating the cosine similarity between items
item_similarity_matrix = cosine_similarity(item_matrix_scaled)
item_similarity_df = pd.DataFrame(item_similarity_matrix, index=item_matrix_scaled.index, columns=item_matrix_scaled.index)

# Displaying the item similarity matrix
print(item_similarity_df.head())

Food_ID  1         2    3    4         5         6         7    8    9    \
Food_ID                                                                    
1        1.0  0.000000  0.0  0.0  0.000000  0.204006  0.452548  0.0  0.0   
2        0.0  1.000000  0.0  0.0  0.489855  0.000000  0.000000  0.0  0.0   
3        0.0  0.000000  1.0  0.0  0.000000  0.064512  0.000000  0.0  0.0   
4        0.0  0.000000  0.0  1.0  0.000000  0.000000  0.000000  0.0  0.0   
5        0.0  0.489855  0.0  0.0  1.000000  0.000000  0.000000  0.0  0.0   

Food_ID       10   ...  300  301  302  303  304  305  306       307  308  309  
Food_ID            ...                                                         
1        0.000000  ...  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.894427  0.0  0.0  
2        0.000000  ...  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.000000  0.0  0.0  
3        0.000000  ...  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.000000  0.0  0.0  
4        0.000000  ...  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.000000

## Function for the item based recommendation system  

In [8]:
def item_based_recommendation_with_names(user_id, user_item_matrix, item_similarity_df, ingredients_data, top_n=5):
    # Checking if the user_id exists in the user_item_matrix
    if user_id not in user_item_matrix.index:
        return f"User ID {user_id} does not exist.", None

    # Getting the user's ratings
    user_ratings = user_item_matrix.loc[user_id]
    
    # Checking if the user has rated any items
    if user_ratings.max() == 0:
        return "This user has not rated any items.", None

    # Calculating a score for each item using a weighted sum of the item similarities and the user's ratings
    item_scores = item_similarity_df.dot(user_ratings).div(item_similarity_df.sum(axis=1))
    
    # Sorting the items based on the scores and filtering out items the user has already rated
    sorted_items = item_scores.sort_values(ascending=False)
    rated_items = user_ratings[user_ratings > 0].index

    # Getting the names of the items the user has already rated
    rated_item_names = ingredients_data.set_index('Food_ID').loc[rated_items]['Name']

    # Getting the top recommended items, excluding the ones the user has already rated
    recommended_item_ids = sorted_items.drop(rated_items).head(top_n).index
    if recommended_item_ids.empty:
        return rated_item_names, "No new items to recommend."

    recommended_item_names = ingredients_data.set_index('Food_ID').loc[recommended_item_ids]['Name']

    return rated_item_names, recommended_item_names

user_id = 3
rated_items, recommended_items = item_based_recommendation_with_names(user_id, user_item_matrix_scaled, item_similarity_df, ingredients_data, top_n=5)
if isinstance(rated_items, str):
    print(rated_items)  # Printing error message if the function returns an error
else:
    print("Items already rated by User:", rated_items)
    print("Recommended items for User:", recommended_items)

Items already rated by User: Food_ID
46                    steam bunny chicken bao
65             almond  white chocolate gujiya
73                              hot chocolate
110              chicken and mushroom lasagna
168                 egg and garlic fried rice
201    mix fruit laccha rabri tortilla crunch
209                      camel milk cake tart
292                             chicken tikka
299                             kolim / jawla
Name: Name, dtype: object
Recommended items for User: Food_ID
152                  prawn fried rice
93     buldak (hot and spicy chicken)
261              chicken shaami kebab
101               crispy herb chicken
138                malabar fish curry
Name: Name, dtype: object


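Our item-based recommender system suggests food items to a user based on their past ratings: it looks for items whose rating patterns are similar to those of the items the user has already rated. Because the similarity is computed from ratings only, common ingredients or flavor profiles are captured only indirectly, to the extent that they show up in how users rate.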

In this case, the user has already rated a mix of unique and non-vegetarian dishes such as steam bunny chicken bao, almond white chocolate gujiya, and hot chocolate. Based on these ratings, the system suggests other meals that match their preferences, such as prawn fried rice, buldak (hot and spicy chicken), and malabar fish curry. This approach makes it simple to propose new choices based on what the user has liked in the past. However, the small amount of rating data makes it hard to validate the model accurately. The more feedback and ratings we gather from users, the better we can improve the precision of our recommendations and fine-tune the results.
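
The scoring step inside the item-based function can be traced by hand on a small example. The sketch below reuses the same expression (`dot` followed by `div`) on a hypothetical 3x3 similarity matrix and a hypothetical user who has rated only `item_a`; none of these values come from the real data.

```python
import pandas as pd

items = ["item_a", "item_b", "item_c"]

# Hypothetical item-item similarity matrix (symmetric, ones on the diagonal)
toy_item_sim = pd.DataFrame(
    [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.2],
     [0.0, 0.2, 1.0]],
    index=items,
    columns=items,
)

# Hypothetical scaled ratings of one user; 0 means "not rated"
toy_user_ratings = pd.Series([1.0, 0.0, 0.0], index=items)

# Similarity-weighted sum of the user's ratings, divided by each item's total similarity
toy_scores = toy_item_sim.dot(toy_user_ratings).div(toy_item_sim.sum(axis=1))

# Keep only the items the user has not rated yet, best score first
unrated = toy_user_ratings[toy_user_ratings == 0].index
toy_ranking = toy_scores.loc[unrated].sort_values(ascending=False)
print(toy_ranking)
```

`item_b` is ranked first because it is the only unrated item with a non-zero similarity to `item_a`, the item this user rated. `item_c` gets a score of 0.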

## Computing User-User Similarity Matrix (user based) using the cosine similarity

In [9]:
# Calculating cosine similarity matrix for users from the normalized user-item matrix
user_similarity_matrix = cosine_similarity(user_item_matrix_scaled)
user_similarity_df = pd.DataFrame(user_similarity_matrix, index=user_item_matrix_scaled.index, columns=user_item_matrix_scaled.index)

# Displaying the user similarity matrix
print(user_similarity_df.head())

User_ID       1    2         3    4    5    6         7         8    9    10   \
User_ID                                                                         
1        1.000000  0.0  0.022332  0.0  0.0  0.0  0.000000  0.000000  0.0  0.0   
2        0.000000  1.0  0.000000  0.0  0.0  0.0  0.000000  0.000000  0.0  0.0   
3        0.022332  0.0  1.000000  0.0  0.0  0.0  0.000000  0.055643  0.0  0.0   
4        0.000000  0.0  0.000000  1.0  0.0  0.0  0.016682  0.118858  0.0  0.0   
5        0.000000  0.0  0.000000  0.0  1.0  0.0  0.000000  0.000000  0.0  0.0   

User_ID  ...  91        92   93   94   95   96        97   98        99   \
User_ID  ...                                                               
1        ...  0.0  0.000000  0.0  0.0  0.0  0.0  0.098360  0.0  0.000000   
2        ...  0.0  0.000000  0.0  0.0  0.0  0.0  0.000000  0.0  0.008540   
3        ...  0.0  0.000000  0.0  0.0  0.0  0.0  0.030071  0.0  0.078699   
4        ...  0.0  0.000000  0.0  0.0  0.0  0.0  0.0

## Function for the user based recommendation system  

In [10]:
def user_based_recommendation_with_history(user_id, user_similarity_df, user_item_matrix_scaled, ingredients_data, top_n=5):
    # Checking if the user_id exists in the user_item_matrix
    if user_id not in user_item_matrix_scaled.index:
        return f"User ID {user_id} does not exist in the dataset.", None

    # Getting rated items by the user
    user_ratings = user_item_matrix_scaled.loc[user_id]
    
    # Checking if the user has rated any items
    if not user_ratings[user_ratings > 0].any():
        return "This user has not rated any items.", None

    rated_items = user_ratings[user_ratings > 0].index
    rated_item_names = ingredients_data.set_index('Food_ID').loc[rated_items]['Name'] if not rated_items.empty else "No rated items."

    # Similarity scores for the target user with all other users
    similarity_scores = user_similarity_df.loc[user_id]

    # Multiplying the similarity scores by the user-item matrix and summing the results to get a weighted sum of ratings
    weighted_ratings = user_item_matrix_scaled.mul(similarity_scores, axis=0).sum(axis=0)

    # Normalizing by the sum of similarity scores to get an average
    recommendation_scores = weighted_ratings / similarity_scores.sum()

    # Removing items already rated by the target user
    recommendation_scores = recommendation_scores[~recommendation_scores.index.isin(rated_items)]

    # Getting the top N items with the highest recommendation scores
    top_items = recommendation_scores.nlargest(top_n).index

    if top_items.empty:
        return rated_item_names, "No new items to recommend."

    # Mapping Food_ID to names from ingredients_data for recommendations
    recommended_item_names = ingredients_data.set_index('Food_ID').loc[top_items]['Name']

    return rated_item_names, recommended_item_names

rated_items, recommended_items = user_based_recommendation_with_history(3, user_similarity_df, user_item_matrix_scaled, ingredients_data)
if isinstance(rated_items, str):
    print(rated_items)  # Error message
else:
    print("Items already rated by User 3:", rated_items)
    print("Recommended items for User 3:", recommended_items)

Items already rated by User 3: Food_ID
46                    steam bunny chicken bao
65             almond  white chocolate gujiya
73                              hot chocolate
110              chicken and mushroom lasagna
168                 egg and garlic fried rice
201    mix fruit laccha rabri tortilla crunch
209                      camel milk cake tart
292                             chicken tikka
299                             kolim / jawla
Name: Name, dtype: object
Recommended items for User 3: Food_ID
163         red rice vermicelli kheer
69       banana and maple ice lollies
93     buldak (hot and spicy chicken)
152                  prawn fried rice
25                 cashew nut cookies
Name: Name, dtype: object


Our user-based recommender system also starts from the user's past ratings, because they help in understanding the user's taste preferences, and uses them to find other users with similar tastes.

In this case, the user has already rated steam bunny chicken bao, almond white chocolate gujiya, and hot chocolate, among others. The food items recommended to them, which they haven’t rated yet, include red rice vermicelli kheer, banana and maple ice lollies, and buldak (hot and spicy chicken). We think the user might really like them based on what similar users have enjoyed. These recommendations come from analyzing the ratings of users who have a lot in common with this user.
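
The weighted average behind these user-based scores can be checked on a small example. The sketch below applies the same steps as the function above (`mul`, `sum`, division by the total similarity) to a hypothetical matrix of three users and three items; the similarity values are made up for illustration.

```python
import pandas as pd

# Hypothetical scaled user-item matrix; 0 means "not rated"
toy_matrix = pd.DataFrame(
    [[1.0, 0.0, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]],
    index=["u1", "u2", "u3"],
    columns=["item_a", "item_b", "item_c"],
)

# Hypothetical similarities of the target user u1 to every user (including u1 itself)
toy_similarity = pd.Series([1.0, 0.6, 0.0], index=["u1", "u2", "u3"])

# Weight every user's ratings by their similarity to u1, then sum per item
weighted = toy_matrix.mul(toy_similarity, axis=0).sum(axis=0)

# Divide by the total similarity to get a weighted average
toy_scores = weighted / toy_similarity.sum()

# Remove the items u1 has already rated
already_rated = toy_matrix.loc["u1"][toy_matrix.loc["u1"] > 0].index
toy_scores = toy_scores[~toy_scores.index.isin(already_rated)]

print(toy_scores.nlargest(2))
```

`item_b` scores 0.6 / 1.6 = 0.375 because the similar user `u2` rated it highly, while `item_c` scores 0 because only `u3`, who has no similarity to `u1`, rated it.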