# Recipe Recommendation System 
~ TASTY DISHES ~

- Group 3
- Group Members:
    - Cindy Tumaini
    - Margeret Namunyak
    - Faith Wafula
    - Martin Waweru
    - Matthew Karani


## Table Of Contents

- Business Understanding
- Data Understanding
- Data Preparation
- Modelling 
- Evaluation 
  

## Business Understanding

### Business Description 
Tasty Dishes is a web-based culinary platform dedicated to sharing authentic African recipes with the world. Our mission is to enhance the cooking experience of home chefs by providing them with a diverse collection of recipes rooted in African culinary traditions, while also incorporating global influences. Whether you're an experienced cook or just starting, Tasty Dishes offers a wide variety of recipes that empower users to create delicious, flavorful meals from the comfort of their homes.


## Business Goal 
### Objective
The main objective of this project is to develop an item-based recipe recommendation system that suggests recipes to users based on the ingredients they have available. By analyzing the ingredients present in various recipes, the system aims to provide relevant and appealing recommendations that encourage users to explore and cook diverse dishes rooted in African culinary traditions, while also incorporating global flavors.

### Scope

1. Ingredient-Based Recommendations: Develop an algorithm that analyzes user-provided ingredients and recommends recipes based on ingredient similarity, using a combined dataset with `Title`, `Ingredients`, and `Instructions` columns for authentic African and global dishes.

2. User-Friendly Interface: Design an intuitive web interface that enables users to input their available ingredients and view tailored recipe recommendations, along with detailed cooking instructions and a feedback mechanism to enhance recommendation accuracy.
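
The scope above does not say which similarity measure the system uses. As a minimal sketch, assuming Jaccard similarity over ingredient sets, ingredient-based ranking could look like the following. The helper names `jaccard` and `recommend` are hypothetical, and the three recipes are a toy subset taken from the combined data shown later in this notebook.

```python
def jaccard(a, b):
    """Jaccard similarity between two ingredient collections (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(user_ingredients, recipes, top_n=3):
    """Rank recipe titles by Jaccard similarity to the user's ingredients."""
    scored = [
        (title, jaccard(user_ingredients, ingredients))
        for title, ingredients in recipes.items()
    ]
    # Highest similarity first; ties are broken alphabetically by title
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Toy subset of recipes (hypothetical dictionary layout)
recipes = {
    "White Chapati": ["wheat flour", "water", "sugar", "salt", "cooking oil"],
    "Finger Millet Flour Ugali": ["finger millet", "water"],
    "Groundnut Sauce": ["groundnut", "salt", "sour milk", "water"],
}

ranking = recommend(["wheat flour", "water", "salt"], recipes)
print(ranking)
```

With these inputs, White Chapati ranks first because it shares three of the five ingredients in the union of the two sets (similarity 0.6).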


### Success Criteria
1. Accuracy:
Achieve at least 80% accuracy in recommending relevant recipes based on user-provided ingredients.

2. Precision:
Ensure that at least 75% of recommended recipes correspond to the user’s input ingredients.

3. Recall:
Aim for a recall rate of at least 70%, indicating the system identifies a significant portion of relevant recipes.

4. F1 Score:
Target an F1 score of 0.75 or higher, balancing precision and recall for comprehensive recommendations.
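
The criteria above do not state how relevance is judged. As a hedged sketch, assuming each test query comes with a hand-labelled set of relevant recipes, precision, recall, and F1 for one query could be computed as follows. The function name `precision_recall_f1` and the example labels are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and F1 for one query's recommendations."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    # Precision: share of recommended recipes that are relevant
    precision = hits / len(recommended) if recommended else 0.0
    # Recall: share of relevant recipes that were recommended
    recall = hits / len(relevant) if relevant else 0.0
    # F1: harmonic mean of precision and recall
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels for a single query
p, r, f1 = precision_recall_f1(
    recommended=["White Chapati", "Brown Chapati", "Groundnut Sauce", "Drop Scones"],
    relevant=["White Chapati", "Brown Chapati", "Roti (Indian Chapati)"],
)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy query two of the four recommendations are relevant, so precision is 0.5, recall is 2/3, and F1 is 4/7 (about 0.57). Averaging these values over many queries would give figures comparable to the targets above.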






## Data Understanding

### Data Source:




In [100]:
# Necessary Imports
import pandas as pd

### DataFrame One

In [101]:
# Load the dataframe

df = pd.read_csv("Food Ingredients and Recipe Dataset with Image Name Mapping.csv", index_col=0)


# Display the first 10 rows
display(df.head(10))

#show the shape
print(df.shape)

Unnamed: 0,Title,Ingredients,Instructions,Image_Name,Cleaned_Ingredients
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...","Pat chicken dry with paper towels, season all ...",miso-butter-roast-chicken-acorn-squash-panzanella,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher..."
1,Crispy Salt and Pepper Potatoes,"['2 large egg whites', '1 pound new potatoes (...",Preheat oven to 400°F and line a rimmed baking...,crispy-salt-and-pepper-potatoes-dan-kluger,"['2 large egg whites', '1 pound new potatoes (..."
2,Thanksgiving Mac and Cheese,"['1 cup evaporated milk', '1 cup whole milk', ...",Place a rack in middle of oven; preheat to 400...,thanksgiving-mac-and-cheese-erick-williams,"['1 cup evaporated milk', '1 cup whole milk', ..."
3,Italian Sausage and Bread Stuffing,"['1 (¾- to 1-pound) round Italian loaf, cut in...",Preheat oven to 350°F with rack in middle. Gen...,italian-sausage-and-bread-stuffing-240559,"['1 (¾- to 1-pound) round Italian loaf, cut in..."
4,Newton's Law,"['1 teaspoon dark brown sugar', '1 teaspoon ho...",Stir together brown sugar and hot water in a c...,newtons-law-apple-bourbon-cocktail,"['1 teaspoon dark brown sugar', '1 teaspoon ho..."
5,Warm Comfort,"['2 chamomile tea bags', '1½ oz. reposado tequ...",Place 2 chamomile tea bags in a heatsafe vesse...,warm-comfort-tequila-chamomile-toddy,"['2 chamomile tea bags', '1½ oz. reposado tequ..."
6,Apples and Oranges,"['3 oz. Grand Marnier', '1 oz. Amaro Averna', ...","Add 3 oz. Grand Marnier, 1 oz. Amaro Averna, a...",apples-and-oranges-spiked-cider,"['3 oz. Grand Marnier', '1 oz. Amaro Averna', ..."
7,Turmeric Hot Toddy,"['¼ cup granulated sugar', '¾ tsp. ground turm...","For the turmeric syrup, combine ½ cup hot wate...",turmeric-hot-toddy-claire-sprouse,"['¼ cup granulated sugar', '¾ tsp. ground turm..."
8,Instant Pot Lamb Haleem,"['¾ cup assorted dals (such as chana dal, moon...","Combine dals, rice, and barley in a medium bow...",instant-pot-lamb-haleem,"['¾ cup assorted dals (such as chana dal, moon..."
9,Spiced Lentil and Caramelized Onion Baked Eggs,"['1 (14.5-ounce) can basic lentil soup, like A...","Place an oven rack in the center of the oven, ...",spiced-lentil-and-caramelized-onion-baked-eggs,"['1 (14.5-ounce) can basic lentil soup, like A..."


(13501, 5)


- Check for duplicates


In [102]:
print(f'Number of duplicates: {df.duplicated().sum()}')

Number of duplicates: 0


In [103]:
#drop duplicates
df.drop_duplicates(inplace=True)
print(f'Number of duplicates after dropping: {df.duplicated().sum()}')

Number of duplicates after dropping: 0
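
`df.duplicated()` only flags rows that match on every column, so two copies of a recipe that differ in one field (for example `Image_Name`) are not counted. As a minimal sketch on invented toy rows, passing `subset` restricts the comparison to the columns that define a recipe:

```python
import pandas as pd

# Toy rows (invented for illustration): same recipe, different image name
toy = pd.DataFrame({
    "Title": ["White Chapati", "White Chapati", "Drop Scones"],
    "Ingredients": ["wheat flour, water", "wheat flour, water", "wheat flour, eggs"],
    "Image_Name": ["chapati-1", "chapati-2", "drop-scones"],
})

# Full-row comparison: no duplicates, because Image_Name differs
full_row_dupes = int(toy.duplicated().sum())

# Comparison on the recipe-defining columns only
recipe_dupes = int(toy.duplicated(subset=["Title", "Ingredients"]).sum())

print(full_row_dupes, recipe_dupes)
```

Whether such near-duplicates exist in the real data is not shown in this notebook; the sketch only illustrates how to check.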


- Check for missing values

In [104]:
df.isnull().sum().sort_values(ascending=False)

Instructions           8
Title                  5
Ingredients            0
Image_Name             0
Cleaned_Ingredients    0
dtype: int64

In [105]:
#drop rows with missing values
df.dropna(inplace=True)
print(f'Number of missing values after dropping: {df.isnull().sum().sum()}')

Number of missing values after dropping: 0


- The dataset has both an `Ingredients` and a `Cleaned_Ingredients` column. Check whether the two differ.

In [106]:
df['Ingredients'][5]

"['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"

In [107]:
df['Cleaned_Ingredients'][5]

"['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"

- In the row inspected, `Ingredients` and `Cleaned_Ingredients` are identical. We therefore drop the original `Ingredients` column (together with `Image_Name`, which is not needed for recommendations) and rename `Cleaned_Ingredients` to `Ingredients`.
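
Comparing a single row cannot show that the two columns agree everywhere. As a hedged sketch, the comparison can be run across all rows at once. The frame `toy` below is an invented stand-in for `df`, and its second row is made up to show a mismatch.

```python
import pandas as pd

# Invented stand-in for df; the second row differs on purpose
toy = pd.DataFrame({
    "Ingredients": ["['2 chamomile tea bags']", "['1 cup whole milk']"],
    "Cleaned_Ingredients": ["['2 chamomile tea bags']", "['1 cup milk']"],
})

# Boolean mask: True where the two columns disagree
differs = toy["Ingredients"] != toy["Cleaned_Ingredients"]

print(f"Rows that differ: {differs.sum()} of {len(toy)}")
print(toy.loc[differs])
```

Running the same mask on the real `df` before dropping `Ingredients` would show how many rows, if any, actually differ.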

In [108]:
#reorder the columns so that Cleaned_Ingredients comes right after Title
df = df[['Title', 'Cleaned_Ingredients', 'Ingredients', 'Instructions', 'Image_Name']]

#drop the original Ingredients column and the Image_Name column
df = df.drop(columns=['Ingredients','Image_Name'])


In [109]:
# rename the cleaned ingredients column
df = df.rename(columns={'Cleaned_Ingredients':'Ingredients'})
df.head()


Unnamed: 0,Title,Ingredients,Instructions
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...","Pat chicken dry with paper towels, season all ..."
1,Crispy Salt and Pepper Potatoes,"['2 large egg whites', '1 pound new potatoes (...",Preheat oven to 400°F and line a rimmed baking...
2,Thanksgiving Mac and Cheese,"['1 cup evaporated milk', '1 cup whole milk', ...",Place a rack in middle of oven; preheat to 400...
3,Italian Sausage and Bread Stuffing,"['1 (¾- to 1-pound) round Italian loaf, cut in...",Preheat oven to 350°F with rack in middle. Gen...
4,Newton's Law,"['1 teaspoon dark brown sugar', '1 teaspoon ho...",Stir together brown sugar and hot water in a c...


In [110]:
df['Ingredients'][5]

"['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"

### DataFrame Two

In [111]:
#explore the RecipesImp.csv file
df2 = pd.read_csv("RecipesImp.csv")
display(df2.head())

#display the shape
print(df2.shape)

Unnamed: 0,title,index,page,about,ingridients,preparation,nutrition per 100g of recipe,energy(kcal),fat(g),carbohydrates(g),proteins(g),fibre(g),vitamin A(mcg),iron(mg),zinc(mg),F_factor_est
0,Kaimati(Fried Dumplings),15003,24,Kaimatis get their unique flavour from the sty...,"wheat flour, refined\nwater, vanilla essence,...",Put yeast in a small container.\n Add 50ml of...,"Energy 1,795 kJ/ 429 kcal | Fat 21.8 g | Carbo...",429.0,21.8,52.8,4.6,1.6,30,2.1,0.45,0.4
1,Mahamri\n(Swahili Doughnut),15004,26,This is a typical traditional recipe among the...,"wheat flour,\ncoconut milk\nwhite sugar\ndry y...","Break the coconut shell, drain the water and...","Energy 1,728 kJ/ 413 kcal | Fat 22.1 g | Carbo...",413.0,22.1,46.6,6.0,2.1,41,2.8,0.56,0.4
2,"Enriched Mandazi \n(East African Doughnuts, En...",15124,28,A popular snack among urban dwellers across th...,self-raising wheat flour\neggs\nmargarine\nRin...,"? Put flour, salt, sugar and lemon rind into ...","Energy 1,590 kJ/ 379 kcal | Fat 16.1 g | Carbo...",379.0,16.1,49.9,7.6,2.2,90,3.3,0.66,0.4
3,"Basic Mandazi \n(East African Doughnuts, Basic)",15125,30,You will find this recipe in any home across K...,all-purpose wheat flour\nbaking powder\nsugar\...,"? Put the wheat flour into a bowl, add baking...","Energy 1,430kJ/ 340 kcal | Fat 12.9 g | Carboh...",340.0,12.9,48.7,6.4,2.1,48,3.5,0.52,0.4
4,Meat Samosa\n(Sambusa ya Nyama),15025,32,Nothing more delicious like the Kenyan meaty s...,"minced beef\ncoriander, fresh\nleek\ngarlic\nc...",? Put the meat in a pan over a fire. Stir con...,"Energy 1,854 kJ/ 443 kcal | Fat 22.2 g | Carbo...",443.0,22.2,40.5,18.8,3.1,66,11.5,2.99,0.4


(142, 16)


- Since we only need a few columns to recommend recipes, we drop the rest.

In [112]:
df2.columns


Index(['title', 'index', 'page', 'about', 'ingridients', 'preparation',
       'nutrition per 100g of recipe', 'energy(kcal)', 'fat(g)',
       'carbohydrates(g)', 'proteins(g)', 'fibre(g)', 'vitamin A(mcg)',
       'iron(mg)', 'zinc(mg)', 'F_factor_est'],
      dtype='object')

In [113]:
columns_to_keep = ['title','ingridients','preparation']

#take a copy so that later column assignments do not raise SettingWithCopyWarning
df2 = df2[columns_to_keep].copy()
df2.head()

Unnamed: 0,title,ingridients,preparation
0,Kaimati(Fried Dumplings),"wheat flour, refined\nwater, vanilla essence,...",Put yeast in a small container.\n Add 50ml of...
1,Mahamri\n(Swahili Doughnut),"wheat flour,\ncoconut milk\nwhite sugar\ndry y...","Break the coconut shell, drain the water and..."
2,"Enriched Mandazi \n(East African Doughnuts, En...",self-raising wheat flour\neggs\nmargarine\nRin...,"? Put flour, salt, sugar and lemon rind into ..."
3,"Basic Mandazi \n(East African Doughnuts, Basic)",all-purpose wheat flour\nbaking powder\nsugar\...,"? Put the wheat flour into a bowl, add baking..."
4,Meat Samosa\n(Sambusa ya Nyama),"minced beef\ncoriander, fresh\nleek\ngarlic\nc...",? Put the meat in a pan over a fire. Stir con...


In [114]:
#clean the column names
#fix the misspelled 'ingridients' column and rename 'preparation' to 'instructions'
df2.rename(columns={'ingridients':'ingredients','preparation':'instructions'}, inplace=True)

#capitalize the column names
df2.columns = df2.columns.str.capitalize()

#split each row's ingredients on commas and store the pieces as a list
df2['Ingredients'] = df2['Ingredients'].apply(lambda x: x.split(','))
#replace the remaining \n separators inside each piece with ', '
df2['Ingredients'] = df2['Ingredients'].apply(lambda x: [i.replace('\n',', ') for i in x])
df2.head(10)



Unnamed: 0,Title,Ingredients,Instructions
0,Kaimati(Fried Dumplings),"[ wheat flour, refined, water, vanilla essen...",Put yeast in a small container.\n Add 50ml of...
1,Mahamri\n(Swahili Doughnut),"[wheat flour, , coconut milk, white sugar, dry...","Break the coconut shell, drain the water and..."
2,"Enriched Mandazi \n(East African Doughnuts, En...","[self-raising wheat flour, eggs, margarine, Ri...","? Put flour, salt, sugar and lemon rind into ..."
3,"Basic Mandazi \n(East African Doughnuts, Basic)","[all-purpose wheat flour, baking powder, sugar...","? Put the wheat flour into a bowl, add baking..."
4,Meat Samosa\n(Sambusa ya Nyama),"[minced beef, coriander, fresh, leek, garlic,...",? Put the meat in a pan over a fire. Stir con...
5,Vegetable Samosa\n(Sambusa ya Mboga),"[cabbage, garlic, onions, carrots, ginger, gar...",Preparation 1 hour | Cooking 1 hour 30 minutes...
6,Pancakes\n(Chapati za Maji),"[wheat flour, eggs, cow milk, sugar, water, co...",Preparation 20 minutes | Cooking 30 minutes |\...
7,Drop Scones,"[wheat flour, eggs, cow milk, sugar, white, m...",Preparation 20 minutes | Cooking 30 minutes |\...
8,Qita\n(Maize & Wheat Flour Pancake),"[wheat flour, maize flour, yeast, water, salt,...",Preparation 2 hours 30 minutes | Cooking 40 mi...
9,Mkate Kuta(Ngumu),"[wheat flour, wheat flour, sugar, cooking oil,...",Preparation 20 minutes | Cooking 1 hour | Serv...
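
The raw `ingridients` strings mix commas and newlines as separators, and the cell above splits on commas only before rewriting the newlines. A minimal alternative sketch, assuming every comma and every newline marks an ingredient boundary, splits on both in one pass. The helper name `split_ingredients` is hypothetical.

```python
import re


def split_ingredients(raw):
    """Split a raw ingredient string on commas and newlines, dropping blanks."""
    parts = re.split(r"[,\n]", raw)
    return [part.strip() for part in parts if part.strip()]


# Start of the Mahamri ingredient string shown in the output above
raw = "wheat flour,\ncoconut milk\nwhite sugar"
print(split_ingredients(raw))
```

This removes the empty entry that otherwise appears after `wheat flour`. The assumption has a cost: entries such as `coriander, fresh` use the comma as a qualifier rather than a separator, so they are still split into two items.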


- Check for duplicates

In [115]:
# join each list back into one comma-separated string so that both dataframes store Ingredients as strings
df2['Ingredients'] = df2['Ingredients'].apply(lambda x: ', '.join([ingredient.strip() for ingredient in x]))

print(f'number of duplicates: {df2.duplicated().sum()}')

number of duplicates: 0


In [116]:
df2.head()

Unnamed: 0,Title,Ingredients,Instructions
0,Kaimati(Fried Dumplings),"wheat flour, refined, water, vanilla essence, ...",Put yeast in a small container.\n Add 50ml of...
1,Mahamri\n(Swahili Doughnut),"wheat flour, , coconut milk, white sugar, dry ...","Break the coconut shell, drain the water and..."
2,"Enriched Mandazi \n(East African Doughnuts, En...","self-raising wheat flour, eggs, margarine, Rin...","? Put flour, salt, sugar and lemon rind into ..."
3,"Basic Mandazi \n(East African Doughnuts, Basic)","all-purpose wheat flour, baking powder, sugar,...","? Put the wheat flour into a bowl, add baking..."
4,Meat Samosa\n(Sambusa ya Nyama),"minced beef, coriander, fresh, leek, garlic, c...",? Put the meat in a pan over a fire. Stir con...


- Check for missing values

In [117]:
print(f'number of missing values: {df2.isnull().sum().sum()}')

number of missing values: 0


- Since neither dataframe has missing values or duplicates, we can now concatenate them.

In [118]:
#concatenate the two dataframes row-wise
combined_df = pd.concat([df,df2])
#check the shapes of the three dfs
print(f'Dataframe 1 has a shape of: {df.shape}')
print(f'Dataframe 2 has a shape of: {df2.shape}')
print(f'Combined dataframe has a shape of: {combined_df.shape}')

#reset the index
combined_df = combined_df.reset_index(drop=True)



Dataframe 1 has a shape of: (13493, 3)
Dataframe 2 has a shape of: (142, 3)
Combined dataframe has a shape of: (13635, 3)
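
The two sources store `Ingredients` in different formats, so it can help to record where each row came from when stacking them. As a minimal sketch on one-row toy frames (the `Source` column and its labels are hypothetical additions, and the ingredient strings are shortened), `pd.concat` with `ignore_index=True` also renumbers the index without a separate `reset_index` call:

```python
import pandas as pd

# One-row toy stand-ins for df and df2
a = pd.DataFrame({
    "Title": ["Newton's Law"],
    "Ingredients": ["['1 teaspoon dark brown sugar']"],
})
b = pd.DataFrame({
    "Title": ["White Chapati"],
    "Ingredients": ["wheat flour, water, sugar, salt, cooking oil"],
})

# Tag each row with its origin, then stack the frames with a fresh 0..n-1 index
combined = pd.concat(
    [a.assign(Source="df"), b.assign(Source="df2")],
    ignore_index=True,
)

print(combined)
print(combined.shape)
```

A source label like this makes it possible to apply format-specific cleaning to each subset later.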


### Clean the Combined DataFrame

In [119]:
combined_df.head()

Unnamed: 0,Title,Ingredients,Instructions
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...","Pat chicken dry with paper towels, season all ..."
1,Crispy Salt and Pepper Potatoes,"['2 large egg whites', '1 pound new potatoes (...",Preheat oven to 400°F and line a rimmed baking...
2,Thanksgiving Mac and Cheese,"['1 cup evaporated milk', '1 cup whole milk', ...",Place a rack in middle of oven; preheat to 400...
3,Italian Sausage and Bread Stuffing,"['1 (¾- to 1-pound) round Italian loaf, cut in...",Preheat oven to 350°F with rack in middle. Gen...
4,Newton's Law,"['1 teaspoon dark brown sugar', '1 teaspoon ho...",Stir together brown sugar and hot water in a c...


In [120]:
combined_df[-10:]

Unnamed: 0,Title,Ingredients,Instructions
13625,Vinolo\n(Banana and Maize Flour Ugali),"banana green, maize flour, water",Preparation 5 minutes | Cooking 40 minutes | \...
13626,Finger Millet \nFlour Ugali,"finger millet, water",Preparation time 5 minutes | Cooking time 15 m...
13627,White Chapati,"wheat flour, water, sugar, salt, cooking oil",Preparation 30 minutes | Cooking 30 minutes | ...
13628,Brown Chapati,"wheat flour, water, sugar, , salt, cooking oil",Preparation 30 minutes | Cooking 30 minutes | ...
13629,Roti \n(Indian Chapati),"wheat flour, salt, water, cooking oil, cow ghee",Preparation 3 hours | Cooking 21 minutes | Ser...
13630,Bhature\n (Fried Indian Bread),"wheat flour, salt, sugar, ghee, cooking oil, f...",Preparation 1 hour 15 minutes | Cooking 30 min...
13631,Vimumunya vya \nSukari\n (Sweetened Pumpkin & ...,"pumpkin, cardamon, sugar, coconut milk, water",Preparation 5 minutes | Cooking 45 minutes | \...
13632,Siro\n (Semolina & Nuts),"semolina flour, cow ghee, cow milk, sugar, pis...",Preparation 15 minutes | Cooking 30 minutes | ...
13633,Chaas\n(Diluted Yoghurt),"natural yoghurt, water, salt,",Preparation 5 minutes | Serves 2\n?Add natural...
13634,Groundnut Sauce,"groundnut, salt, sour milk, water",Preparation 5 minutes | Cooking 1 hour 40 minu...


In [121]:
#check for missing values
print(combined_df.isnull().sum().sort_values(ascending=False))


Title           0
Ingredients     0
Instructions    0
dtype: int64


In [122]:
#write a function to clean the columns
def clean_combined_df(df):
    """
    Clean the combined DataFrame in place and return it.

    1. Title: remove newlines and surrounding spaces.
    2. Ingredients: split each string on commas into a list, strip extra spaces,
       and drop empty strings and duplicates (the original order is not preserved).
    3. Instructions: remove newlines, replace '?' with '.' and '|' with ',',
       and strip surrounding spaces.
    """
    #title column
    df['Title'] = df['Title'].str.replace('\n', '', regex=False).str.strip()
    #Ingredients column
    # Split each ingredients string on commas. Rows from the first dataframe are
    # stringified lists, so their brackets and quotes stay attached to the items.
    df['Ingredients'] = df['Ingredients'].apply(lambda x: [ingredient.strip() for ingredient in x.split(',')])

    # Remove empty ingredients
    df['Ingredients'] = df['Ingredients'].apply(lambda x: [ingredient for ingredient in x if ingredient])

    # Remove duplicate ingredients within each list (set() does not keep the order)
    df['Ingredients'] = df['Ingredients'].apply(lambda x: list(set(x)))

    # Clean the Instructions column; regex=False treats '?' and '|' as literal characters
    df['Instructions'] = df['Instructions'].str.replace('\n', '', regex=False) \
                                       .str.replace('?', '.', regex=False) \
                                       .str.replace('|', ',', regex=False) \
                                       .str.strip()
    return df

# Use the function on your DataFrame
combined_cleaned = clean_combined_df(combined_df)
combined_cleaned

Unnamed: 0,Title,Ingredients,Instructions
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['Freshly ground black pepper', '2 medium appl...","Pat chicken dry with paper towels, season all ..."
1,Crispy Salt and Pepper Potatoes,"['¾ teaspoon finely ground black pepper', '2 t...",Preheat oven to 400°F and line a rimmed baking...
2,Thanksgiving Mac and Cheese,"['1 tsp. onion powder', '1 lb. elbow macaroni'...",Place a rack in middle of oven; preheat to 400...
3,Italian Sausage and Bread Stuffing,"[lightly beaten', '2 tablespoons olive oil, cu...",Preheat oven to 350°F with rack in middle. Gen...
4,Newton's Law,"['1 ½ oz. bourbon', ['1 teaspoon dark brown su...",Stir together brown sugar and hot water in a c...
...,...,...,...
13630,Bhature (Fried Indian Bread),"[natural yoghurt, coriander, fenugreek leaves,...","Preparation 1 hour 15 minutes , Cooking 30 min..."
13631,Vimumunya vya Sukari (Sweetened Pumpkin & Coco...,"[coconut milk, pumpkin, sugar, cardamon, water]","Preparation 5 minutes , Cooking 45 minutes , S..."
13632,Siro (Semolina & Nuts),"[semolina flour, cow milk, cow ghee, pistachio...","Preparation 15 minutes , Cooking 30 minutes , ..."
13633,Chaas(Diluted Yoghurt),"[natural yoghurt, salt, water]","Preparation 5 minutes , Serves 2.Add natural y..."
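
Rows from the first dataframe store each ingredient list as a stringified Python list, so splitting on commas leaves brackets and quotes attached to items (for example `[lightly beaten'` in the output above) and breaks ingredients that contain commas. As a hedged sketch, not the method used in this notebook, `ast.literal_eval` can parse those rows while plain comma-separated rows are split as before. The helper name `parse_ingredients` is hypothetical.

```python
import ast


def parse_ingredients(value):
    """Return a clean, order-preserving list of ingredients.

    Handles a stringified Python list ("['a', 'b']") as well as a plain
    comma-separated string ("a, b").
    """
    text = value.strip()
    if text.startswith("[") and text.endswith("]"):
        try:
            items = ast.literal_eval(text)
        except (ValueError, SyntaxError):
            # Best-effort fallback for malformed list strings
            items = text.strip("[]").split(",")
    else:
        items = text.split(",")
    cleaned = [str(item).strip() for item in items]
    # dict.fromkeys drops duplicates while keeping first-seen order
    return list(dict.fromkeys(item for item in cleaned if item))


first = parse_ingredients(
    "['2 chamomile tea bags', '1½ oz. reposado tequila', "
    "'¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"
)
second = parse_ingredients("wheat flour, water, sugar, , salt, cooking oil")
print(first)
print(second)
```

Applied with `combined_df['Ingredients'].apply(parse_ingredients)` on the uncleaned strings, this would keep each quoted ingredient intact, including any commas inside it, and preserve the original ingredient order.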


In [123]:
combined_cleaned['Instructions'][13634]

'Preparation 5 minutes , Cooking 1 hour 40 minutes , Serves 4.Place a saucepan over fire and let it preheat..Add the groundnuts, salt and 1/2 a cup of water oras desired..Cook until the water evaporates as you stir gradu\x02ally. When ready, the nuts produce a pop sound..Once they pop, turn down the heat and contin\x02ue stirring until the groundnuts are dry (about 13minutes)..Remove from heat and allow it to cool down..Using a blender, blend the nuts into a paste. Apestle and mortar can be used in the absence of ablender..Put the groundnut paste into a bowl, add sour milkand stir into thick paste. Water or fresh milk can beused in place of the sour milk..Once ready, put another pan on the heat, add thepeanut paste and stir..Stir until it is smooth but not too thick..Serve hot with green leafy vegetables of yourchoice, fish, sweet potatoes, green bananas, ugali,etc'
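
The cleaned text above still contains control characters (`\x02`) and words that run together, such as `oras desired`, which is consistent with newlines being deleted rather than replaced by a space. A minimal sketch of an alternative, assuming the raw text had a newline wherever words are now joined, strips control characters and collapses whitespace instead. The helper name `clean_instructions` and the sample string are hypothetical.

```python
import re


def clean_instructions(text):
    """Strip control characters and collapse whitespace in instruction text."""
    # Drop ASCII control characters other than whitespace (e.g. \x02)
    text = re.sub(r"[\x00-\x08\x0e-\x1f\x7f]", "", text)
    # Replace each run of whitespace, including newlines, with one space
    text = re.sub(r"\s+", " ", text)
    return text.strip()


# Hypothetical raw fragment modelled on the output above
sample = "Cook until the water evaporates as you stir gradu\x02ally.\n  Remove from heat."
print(clean_instructions(sample))
```

If a control character marks a word that was hyphenated across a line break in the raw file, this approach would leave a space inside that word, so the raw data should be inspected before choosing between the two strategies.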

## Exploratory Data Analysis

### Data Cleaning

## Feature Engineering