# Recipe Recommendation System 
~ TASTY DISHES ~

- Group 3
- Group Members:
    - Cindy Tumaini
    - Margret Namunyak
    - Faith Wafula
    - Martin Waweru
    - Matthew Karani


## Table Of Contents

- Business Understanding
- Data Understanding
- Data Preparation
- Modelling 
- Evaluation 
  

## Business Understanding

### Business Description 
Tasty Dishes is a web-based culinary platform dedicated to sharing authentic African recipes with the world. Our mission is to enhance the cooking experience of home chefs by providing them with a diverse collection of recipes rooted in African culinary traditions, while also incorporating global influences. Whether you're an experienced cook or just starting, Tasty Dishes offers a wide variety of recipes that empower users to create delicious, flavorful meals from the comfort of their homes.


## Business Goal 
### Objective
The main objective of this project is to develop an item-based recipe recommendation system that suggests recipes to users based on the ingredients they have available. By analyzing the ingredients present in various recipes, the system aims to provide relevant and appealing recommendations that encourage users to explore and cook diverse dishes rooted in African culinary traditions, while also incorporating global flavors.

### Scope

1. Ingredient-Based Recommendations: Develop an algorithm that analyzes user-provided ingredients and recommends recipes based on ingredient similarity, using a diverse dataset that includes the Title, Ingredients, and Instructions of authentic African and global dishes.

2. User-Friendly Interface: Design an intuitive web interface that enables users to input their available ingredients and view tailored recipe recommendations, along with detailed cooking instructions and a feedback mechanism to enhance recommendation accuracy.


### Success Criteria
1. Accuracy:
Achieve at least 80% accuracy in recommending relevant recipes based on user-provided ingredients.

2. Precision:
Ensure that at least 75% of recommended recipes correspond to the user’s input ingredients.

3. Recall:
Aim for a recall rate of at least 70%, indicating the system identifies a significant portion of relevant recipes.

4. F1 Score:
Target an F1 score of 0.75 or higher, balancing precision and recall for comprehensive recommendations.
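
To make the criteria above concrete, here is a minimal sketch of how precision, recall and F1 could be computed for a single query. The function name and the two recipe lists are illustrative assumptions; the project does not state how the set of relevant recipes is labelled.

```python
def precision_recall_f1(recommended, relevant):
    """Compare one list of recommended titles against the titles judged relevant."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical query result and hypothetical relevance labels
recommended = ["white chapati", "brown chapati", "groundnut sauce", "roti"]
relevant = ["white chapati", "brown chapati", "roti", "bhature", "mahamri"]

precision, recall, f1 = precision_recall_f1(recommended, relevant)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.75 recall=0.60 f1=0.67
```

In this made-up query the precision target of 75% is met, while recall and F1 fall short of the stated thresholds.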






## Data Understanding

### Data Source:




In [173]:
# Necessary Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import string
import re
import nltk


### Data Frame One

In [174]:
# Load the dataframe

df = pd.read_csv("Food Ingredients and Recipe Dataset with Image Name Mapping.csv", index_col=0)


# Display the first 10 rows
display(df.head(10))

#show the shape
print(df.shape)

Unnamed: 0,Title,Ingredients,Instructions,Image_Name,Cleaned_Ingredients
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...","Pat chicken dry with paper towels, season all ...",miso-butter-roast-chicken-acorn-squash-panzanella,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher..."
1,Crispy Salt and Pepper Potatoes,"['2 large egg whites', '1 pound new potatoes (...",Preheat oven to 400°F and line a rimmed baking...,crispy-salt-and-pepper-potatoes-dan-kluger,"['2 large egg whites', '1 pound new potatoes (..."
2,Thanksgiving Mac and Cheese,"['1 cup evaporated milk', '1 cup whole milk', ...",Place a rack in middle of oven; preheat to 400...,thanksgiving-mac-and-cheese-erick-williams,"['1 cup evaporated milk', '1 cup whole milk', ..."
3,Italian Sausage and Bread Stuffing,"['1 (¾- to 1-pound) round Italian loaf, cut in...",Preheat oven to 350°F with rack in middle. Gen...,italian-sausage-and-bread-stuffing-240559,"['1 (¾- to 1-pound) round Italian loaf, cut in..."
4,Newton's Law,"['1 teaspoon dark brown sugar', '1 teaspoon ho...",Stir together brown sugar and hot water in a c...,newtons-law-apple-bourbon-cocktail,"['1 teaspoon dark brown sugar', '1 teaspoon ho..."
5,Warm Comfort,"['2 chamomile tea bags', '1½ oz. reposado tequ...",Place 2 chamomile tea bags in a heatsafe vesse...,warm-comfort-tequila-chamomile-toddy,"['2 chamomile tea bags', '1½ oz. reposado tequ..."
6,Apples and Oranges,"['3 oz. Grand Marnier', '1 oz. Amaro Averna', ...","Add 3 oz. Grand Marnier, 1 oz. Amaro Averna, a...",apples-and-oranges-spiked-cider,"['3 oz. Grand Marnier', '1 oz. Amaro Averna', ..."
7,Turmeric Hot Toddy,"['¼ cup granulated sugar', '¾ tsp. ground turm...","For the turmeric syrup, combine ½ cup hot wate...",turmeric-hot-toddy-claire-sprouse,"['¼ cup granulated sugar', '¾ tsp. ground turm..."
8,Instant Pot Lamb Haleem,"['¾ cup assorted dals (such as chana dal, moon...","Combine dals, rice, and barley in a medium bow...",instant-pot-lamb-haleem,"['¾ cup assorted dals (such as chana dal, moon..."
9,Spiced Lentil and Caramelized Onion Baked Eggs,"['1 (14.5-ounce) can basic lentil soup, like A...","Place an oven rack in the center of the oven, ...",spiced-lentil-and-caramelized-onion-baked-eggs,"['1 (14.5-ounce) can basic lentil soup, like A..."


(13501, 5)


- Check for duplicates


In [175]:
print(f'Number of duplicates: {df.duplicated().sum()}')

Number of duplicates: 0


In [176]:
#drop duplicates
df.drop_duplicates(inplace=True)
print(f'Number of duplicates after dropping: {df.duplicated().sum()}')

Number of duplicates after dropping: 0


- Check for missing values

In [177]:
df.isnull().sum().sort_values(ascending=False)

Instructions           8
Title                  5
Ingredients            0
Image_Name             0
Cleaned_Ingredients    0
dtype: int64

In [178]:
#drop rows with missing values
df.dropna(inplace=True)
print(f'Number of missing values after dropping: {df.isnull().sum().sum()}')

Number of missing values after dropping: 0


- The dataset has both an Ingredients and a Cleaned_Ingredients column. Check whether the two differ.

In [179]:
df['Ingredients'][5]

"['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"

In [180]:
df['Cleaned_Ingredients'][5]

"['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"

- The sampled row shows no difference between Ingredients and Cleaned_Ingredients, so we drop the original Ingredients column and rename Cleaned_Ingredients to Ingredients.
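
The check above inspects a single row. A whole-column comparison would give stronger evidence; the following is a minimal sketch that uses a small hypothetical frame in place of df (the three rows are made up for illustration, and only the two column names come from the dataset).

```python
import pandas as pd

# Hypothetical stand-in for df with the two columns being compared
toy = pd.DataFrame({
    "Ingredients": ["['2 chamomile tea bags']", "['1 cup whole milk']", "['¼ cup sugar']"],
    "Cleaned_Ingredients": ["['2 chamomile tea bags']", "['1 cup whole milk']", "['¼ cup granulated sugar']"],
})

# Boolean mask that is True wherever the two columns disagree
differs = toy["Ingredients"] != toy["Cleaned_Ingredients"]

print(f"Rows that differ: {differs.sum()} of {len(toy)}")
print(toy[differs])
```

Running the same comparison on the real columns would show how many of the rows actually differ before one of the columns is dropped.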

In [181]:
# Reorder the columns so that Cleaned_Ingredients comes second
df = df[['Title', 'Cleaned_Ingredients', 'Ingredients', 'Instructions', 'Image_Name']]

# Drop the original Ingredients column and the unused Image_Name column
df = df.drop(columns=['Ingredients','Image_Name'])


In [182]:
# rename the cleaned ingredients column
df = df.rename(columns={'Cleaned_Ingredients':'Ingredients'})
df.head()


Unnamed: 0,Title,Ingredients,Instructions
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...","Pat chicken dry with paper towels, season all ..."
1,Crispy Salt and Pepper Potatoes,"['2 large egg whites', '1 pound new potatoes (...",Preheat oven to 400°F and line a rimmed baking...
2,Thanksgiving Mac and Cheese,"['1 cup evaporated milk', '1 cup whole milk', ...",Place a rack in middle of oven; preheat to 400...
3,Italian Sausage and Bread Stuffing,"['1 (¾- to 1-pound) round Italian loaf, cut in...",Preheat oven to 350°F with rack in middle. Gen...
4,Newton's Law,"['1 teaspoon dark brown sugar', '1 teaspoon ho...",Stir together brown sugar and hot water in a c...


In [183]:
df['Ingredients'][5]

"['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"

### DataFrame two 

In [184]:
# Explore the RecipesImp.csv file
df2 = pd.read_csv("RecipesImp.csv")
display(df2.head())

#display the shape
print(df2.shape)

Unnamed: 0,title,index,page,about,ingridients,preparation,nutrition per 100g of recipe,energy(kcal),fat(g),carbohydrates(g),proteins(g),fibre(g),vitamin A(mcg),iron(mg),zinc(mg),F_factor_est
0,Kaimati(Fried Dumplings),15003,24,Kaimatis get their unique flavour from the sty...,"wheat flour, refined\r\nwater, vanilla essenc...",Put yeast in a small container.\r\n Add 50ml ...,"Energy 1,795 kJ/ 429 kcal | Fat 21.8 g | Carbo...",429.0,21.8,52.8,4.6,1.6,30,2.1,0.45,0.4
1,Mahamri\r\n(Swahili Doughnut),15004,26,This is a typical traditional recipe among the...,"wheat flour,\r\ncoconut milk\r\nwhite sugar\r\...","Break the coconut shell, drain the water and...","Energy 1,728 kJ/ 413 kcal | Fat 22.1 g | Carbo...",413.0,22.1,46.6,6.0,2.1,41,2.8,0.56,0.4
2,"Enriched Mandazi \r\n(East African Doughnuts, ...",15124,28,A popular snack among urban dwellers across th...,self-raising wheat flour\r\neggs\r\nmargarine\...,"? Put flour, salt, sugar and lemon rind into ...","Energy 1,590 kJ/ 379 kcal | Fat 16.1 g | Carbo...",379.0,16.1,49.9,7.6,2.2,90,3.3,0.66,0.4
3,"Basic Mandazi \r\n(East African Doughnuts, Basic)",15125,30,You will find this recipe in any home across K...,all-purpose wheat flour\r\nbaking powder\r\nsu...,"? Put the wheat flour into a bowl, add baking...","Energy 1,430kJ/ 340 kcal | Fat 12.9 g | Carboh...",340.0,12.9,48.7,6.4,2.1,48,3.5,0.52,0.4
4,Meat Samosa\r\n(Sambusa ya Nyama),15025,32,Nothing more delicious like the Kenyan meaty s...,"minced beef\r\ncoriander, fresh\r\nleek\r\ngar...",? Put the meat in a pan over a fire. Stir con...,"Energy 1,854 kJ/ 443 kcal | Fat 22.2 g | Carbo...",443.0,22.2,40.5,18.8,3.1,66,11.5,2.99,0.4


(142, 16)


- Since we want only a few columns to recommend the possible recipes, we need to drop some columns.

In [185]:
df2.columns


Index(['title', 'index', 'page', 'about', 'ingridients', 'preparation',
       'nutrition per 100g of recipe', 'energy(kcal)', 'fat(g)',
       'carbohydrates(g)', 'proteins(g)', 'fibre(g)', 'vitamin A(mcg)',
       'iron(mg)', 'zinc(mg)', 'F_factor_est'],
      dtype='object')

In [186]:
columns_to_keep = ['title','ingridients','preparation']

df2 = df2[columns_to_keep]
df2.head()

Unnamed: 0,title,ingridients,preparation
0,Kaimati(Fried Dumplings),"wheat flour, refined\r\nwater, vanilla essenc...",Put yeast in a small container.\r\n Add 50ml ...
1,Mahamri\r\n(Swahili Doughnut),"wheat flour,\r\ncoconut milk\r\nwhite sugar\r\...","Break the coconut shell, drain the water and..."
2,"Enriched Mandazi \r\n(East African Doughnuts, ...",self-raising wheat flour\r\neggs\r\nmargarine\...,"? Put flour, salt, sugar and lemon rind into ..."
3,"Basic Mandazi \r\n(East African Doughnuts, Basic)",all-purpose wheat flour\r\nbaking powder\r\nsu...,"? Put the wheat flour into a bowl, add baking..."
4,Meat Samosa\r\n(Sambusa ya Nyama),"minced beef\r\ncoriander, fresh\r\nleek\r\ngar...",? Put the meat in a pan over a fire. Stir con...


In [187]:
# Clean the column names:
# fix the misspelled 'ingridients' column and rename 'preparation' to 'instructions'
df2 = df2.rename(columns={'ingridients':'ingredients','preparation':'instructions'})

# Capitalize the column names so they match those of df
df2.columns = df2.columns.str.capitalize()

# Function to process the Ingredients column of df2
def process_ingredients(ingredients):
    # Remove brackets and quotes, treat line breaks as separators, then split on commas
    return [ingredient.strip() for ingredient in ingredients.replace('[','').replace(']','').replace("'", "").replace('\n', ',').split(',')]

# Apply the processing to each DataFrame.
# Note: df['Ingredients'] holds stringified lists, so a plain comma split leaves
# brackets and quotes inside the items; these are stripped in a later cleaning step.
df['Ingredients'] = df['Ingredients'].apply(lambda x: [ingredient.strip() for ingredient in x.split(',')])
df2['Ingredients'] = df2['Ingredients'].apply(process_ingredients)
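
The two sources store ingredients differently: the first as a stringified Python list, the second as line-separated text. As an alternative to splitting on commas, the list strings could be parsed with `ast.literal_eval` from the standard library. This is a sketch, not the notebook's method; the helper name `parse_ingredient_list` and the second sample string are assumptions made for illustration.

```python
import ast

def parse_ingredient_list(raw):
    """Parse "['a', 'b']" into ['a', 'b']; fall back to splitting plain text."""
    try:
        parsed = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        parsed = None
    if not isinstance(parsed, (list, tuple)):
        # Plain text: treat line breaks as separators, then split on commas
        parsed = raw.replace("\r", "").replace("\n", ",").split(",")
    # Strip whitespace and drop empty items
    return [str(item).strip() for item in parsed if str(item).strip()]

# Stringified list, as stored in the first dataset
raw_df1 = "['2 chamomile tea bags', '1½ oz. reposado tequila', '¾ oz. fresh lemon juice', '1 Tbsp. agave nectar']"
# Line-separated text, in the style of the second dataset (illustrative value)
raw_df2 = "natural yoghurt\r\nwater\r\nsalt\r\n"

print(parse_ingredient_list(raw_df1))
print(parse_ingredient_list(raw_df2))
```

Parsing the list literal keeps each ingredient intact even when it contains a comma of its own, which a plain comma split cannot do.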




In [188]:
# Check the processed DataFrames
df[['Title', 'Ingredients']]


Unnamed: 0,Title,Ingredients
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"[['1 (3½–4-lb.) whole chicken', '2¾ tsp. koshe..."
1,Crispy Salt and Pepper Potatoes,"[['2 large egg whites', '1 pound new potatoes ..."
2,Thanksgiving Mac and Cheese,"[['1 cup evaporated milk', '1 cup whole milk',..."
3,Italian Sausage and Bread Stuffing,"[['1 (¾- to 1-pound) round Italian loaf, cut i..."
4,Newton's Law,"[['1 teaspoon dark brown sugar', '1 teaspoon h..."
...,...,...
13496,Brownie Pudding Cake,"[['1 cup all-purpose flour', '2/3 cup unsweete..."
13497,Israeli Couscous with Roasted Butternut Squash...,"[['1 preserved lemon', '1 1/2 pound butternut ..."
13498,Rice with Soy-Glazed Bonito Flakes and Sesame ...,[['Leftover katsuo bushi (dried bonito flakes)...
13499,Spanakopita,[['1 stick (1/2 cup) plus 1 tablespoon unsalte...


In [189]:
df2[['Title', 'Ingredients']]

Unnamed: 0,Title,Ingredients
0,Kaimati(Fried Dumplings),"[wheat flour, refined, water, vanilla essence,..."
1,Mahamri\r\n(Swahili Doughnut),"[wheat flour, , coconut milk, white sugar, dry..."
2,"Enriched Mandazi \r\n(East African Doughnuts, ...","[self-raising wheat flour, eggs, margarine, Ri..."
3,"Basic Mandazi \r\n(East African Doughnuts, Basic)","[all-purpose wheat flour, baking powder, sugar..."
4,Meat Samosa\r\n(Sambusa ya Nyama),"[minced beef, coriander, fresh, leek, garlic, ..."
...,...,...
137,Bhature\r\n (Fried Indian Bread),"[wheat flour, salt, sugar, ghee, cooking oil, ..."
138,Vimumunya vya \r\nSukari\r\n (Sweetened Pumpki...,"[pumpkin, cardamon, sugar, coconut milk, water]"
139,Siro\r\n (Semolina & Nuts),"[semolina flour, cow ghee, cow milk, sugar, pi..."
140,Chaas\r\n(Diluted Yoghurt),"[natural yoghurt, water, salt, ]"


- Check for missing values

In [190]:
print(f'number of missing values: {df2.isnull().sum().sum()}')

number of missing values: 0


- Since neither dataframe has missing values or duplicates, we can now combine them.

In [191]:
# Combine the two dataframes by stacking their rows
combined_df = pd.concat([df,df2])
# Check the shapes of the three dataframes
print(f'Dataframe 1 has a shape of: {df.shape}')
print(f'Dataframe 2 has a shape of: {df2.shape}')
print(f'Combined dataframe has a shape of: {combined_df.shape}')

#reset the index
combined_df = combined_df.reset_index(drop=True)



Dataframe 1 has a shape of: (13493, 3)
Dataframe 2 has a shape of: (142, 3)
Combined dataframe has a shape of: (13635, 3)


### Clean the Combined DataFrame

In [192]:
combined_df.head()

Unnamed: 0,Title,Ingredients,Instructions
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"[['1 (3½–4-lb.) whole chicken', '2¾ tsp. koshe...","Pat chicken dry with paper towels, season all ..."
1,Crispy Salt and Pepper Potatoes,"[['2 large egg whites', '1 pound new potatoes ...",Preheat oven to 400°F and line a rimmed baking...
2,Thanksgiving Mac and Cheese,"[['1 cup evaporated milk', '1 cup whole milk',...",Place a rack in middle of oven; preheat to 400...
3,Italian Sausage and Bread Stuffing,"[['1 (¾- to 1-pound) round Italian loaf, cut i...",Preheat oven to 350°F with rack in middle. Gen...
4,Newton's Law,"[['1 teaspoon dark brown sugar', '1 teaspoon h...",Stir together brown sugar and hot water in a c...


In [193]:
combined_df[-10:]

Unnamed: 0,Title,Ingredients,Instructions
13625,Vinolo\r\n(Banana and Maize Flour Ugali),"[banana green, maize flour, water]",Preparation 5 minutes | Cooking 40 minutes | \...
13626,Finger Millet \r\nFlour Ugali,"[finger millet, water]",Preparation time 5 minutes | Cooking time 15 m...
13627,White Chapati,"[wheat flour, water, sugar, salt, cooking oil]",Preparation 30 minutes | Cooking 30 minutes | ...
13628,Brown Chapati,"[wheat flour, water, sugar, , salt, cooking oil]",Preparation 30 minutes | Cooking 30 minutes | ...
13629,Roti \r\n(Indian Chapati),"[wheat flour, salt, water, cooking oil, cow ghee]",Preparation 3 hours | Cooking 21 minutes | Ser...
13630,Bhature\r\n (Fried Indian Bread),"[wheat flour, salt, sugar, ghee, cooking oil, ...",Preparation 1 hour 15 minutes | Cooking 30 min...
13631,Vimumunya vya \r\nSukari\r\n (Sweetened Pumpki...,"[pumpkin, cardamon, sugar, coconut milk, water]",Preparation 5 minutes | Cooking 45 minutes | \...
13632,Siro\r\n (Semolina & Nuts),"[semolina flour, cow ghee, cow milk, sugar, pi...",Preparation 15 minutes | Cooking 30 minutes | ...
13633,Chaas\r\n(Diluted Yoghurt),"[natural yoghurt, water, salt, ]",Preparation 5 minutes | Serves 2\r\n?Add natur...
13634,Groundnut Sauce,"[groundnut, salt, sour milk, water]",Preparation 5 minutes | Cooking 1 hour 40 minu...


In [194]:
#check for missing values
print(combined_df.isnull().sum().sort_values(ascending=False))


Title           0
Ingredients     0
Instructions    0
dtype: int64


In [195]:
# Function to clean the columns of the combined DataFrame
def clean_combined_df(df):
    """
    Clean the combined DataFrame in place and return it.
    1. Title: remove newline characters and surrounding spaces.
    2. Ingredients: drop empty strings and duplicates within each list
       (set() does not preserve the original ingredient order).
    3. Instructions: remove newline characters, replace the '?' bullet markers
       with '.', replace '|' with ',', and strip surrounding spaces.
    """
    # Title column (only '\n' is removed here, so '\r' characters remain)
    df['Title'] = df['Title'].str.replace('\n', '').str.strip()

    # Ingredients column: remove empty ingredients
    df['Ingredients'] = df['Ingredients'].apply(lambda x: [ingredient for ingredient in x if ingredient])

    # Remove duplicate ingredients within each list
    df['Ingredients'] = df['Ingredients'].apply(lambda x: list(set(x)))

    # Instructions column; regex=False treats '?' and '|' as literal characters
    df['Instructions'] = df['Instructions'].str.replace('\n', '', regex=False) \
                                       .str.replace('?', '.', regex=False) \
                                       .str.replace('|', ',', regex=False) \
                                       .str.strip()
    return df

# Use the function on the DataFrame
# (combined_cleaned refers to the same object as combined_df)
combined_cleaned = clean_combined_df(combined_df)
combined_cleaned

Unnamed: 0,Title,Ingredients,Instructions
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"[['1 (3½–4-lb.) whole chicken', '¼ cup dry whi...","Pat chicken dry with paper towels, season all ..."
1,Crispy Salt and Pepper Potatoes,"['¾ teaspoon finely ground black pepper', ['2 ...",Preheat oven to 400°F and line a rimmed baking...
2,Thanksgiving Mac and Cheese,"['1 tsp. garlic powder', '1 tsp. smoked paprik...",Place a rack in middle of oven; preheat to 400...
3,Italian Sausage and Bread Stuffing,"['2 tablespoons olive oil, '2 pounds sweet Ita...",Preheat oven to 350°F with rack in middle. Gen...
4,Newton's Law,"[['1 teaspoon dark brown sugar', '1 teaspoon h...",Stir together brown sugar and hot water in a c...
...,...,...,...
13630,Bhature\r (Fried Indian Bread),"[fenugreek leaves, sugar, wheat flour, coriand...","Preparation 1 hour 15 minutes , Cooking 30 min..."
13631,Vimumunya vya \rSukari\r (Sweetened Pumpkin & ...,"[coconut milk, sugar, pumpkin, water, cardamon]","Preparation 5 minutes , Cooking 45 minutes , \..."
13632,Siro\r (Semolina & Nuts),"[sugar, pistachio nut, cow milk, cow ghee, alm...","Preparation 15 minutes , Cooking 30 minutes , ..."
13633,Chaas\r(Diluted Yoghurt),"[natural yoghurt, water, salt]","Preparation 5 minutes , Serves 2\r.Add natural..."


In [196]:
combined_cleaned['Instructions'][13634]

'Preparation 5 minutes , Cooking 1 hour 40 minutes \r, Serves 4\r.Place a saucepan over fire and let it preheat.\r.Add the groundnuts, salt and 1/2 a cup of water or\ras desired.\r.Cook until the water evaporates as you stir gradu\x02ally. When ready, the nuts produce a pop sound.\r.Once they pop, turn down the heat and contin\x02ue stirring until the groundnuts are dry (about 13\rminutes).\r.Remove from heat and allow it to cool down.\r.Using a blender, blend the nuts into a paste. A\rpestle and mortar can be used in the absence of a\rblender.\r.Put the groundnut paste into a bowl, add sour milk\rand stir into thick paste. Water or fresh milk can be\rused in place of the sour milk.\r.Once ready, put another pan on the heat, add the\rpeanut paste and stir.\r.Stir until it is smooth but not too thick.\r.Serve hot with green leafy vegetables of your\rchoice, fish, sweet potatoes, green bananas, ugali,\retc'
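
The string above still contains `\r` line breaks and `\x02` control characters that split words such as "gradu\x02ally". One possible way to tidy such text is sketched below; the helper name `tidy_text` and the shortened sample are assumptions, and the notebook itself does not apply this step.

```python
import re

def tidy_text(text):
    # Re-join words split by the \x02 control character (and any line break after it)
    text = re.sub(r"\x02\s*", "", text)
    # Collapse carriage returns, newlines and repeated spaces into a single space
    text = re.sub(r"\s+", " ", text)
    return text.strip()

# Shortened sample in the style of the instructions shown above
sample = "stir gradu\x02ally. When ready,\rthe nuts pop.\r\nServe hot"
print(tidy_text(sample))
# stir gradually. When ready, the nuts pop. Serve hot
```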

In [197]:

def clean_ingredients(ingredients_list):
    # Remove any extra single quotes and fix formatting for each ingredient
    cleaned_list = [re.sub(r"['\"]", "", ingredient) for ingredient in ingredients_list]  # Remove quotes
    cleaned_list = [re.sub(r'\s+', ' ', ingredient) for ingredient in cleaned_list]  # Normalize spaces
    return cleaned_list

# Apply the cleaning function (re is already imported at the top of the notebook)
combined_df['Ingredients'] = combined_df['Ingredients'].apply(clean_ingredients)

# Function to tokenize and normalize ingredients
def tokenize_and_normalize(ingredients_list):
    tokens = []
    for ingredient in ingredients_list:
        # Split ingredient string by commas and strip whitespace
        split_ingredients = [i.strip().lower() for i in ingredient.split(',')]
        
        # Further clean each token: remove unwanted characters
        split_ingredients = [re.sub(r'[^\w\s]', '', i) for i in split_ingredients]  # Remove punctuation
        split_ingredients = [re.sub(r'\s+', ' ', i) for i in split_ingredients]  # Normalize whitespace
        
        # Extend the tokens list with cleaned ingredients
        tokens.extend(split_ingredients)
    
    return tokens


# Apply the function to the Ingredients column
combined_df['Ingredients'] = combined_df['Ingredients'].apply(tokenize_and_normalize)


In [198]:
# Check the output
combined_df[['Title', 'Ingredients']]

Unnamed: 0,Title,Ingredients
0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"[1 3½4lb whole chicken, ¼ cup dry white wine, ..."
1,Crispy Salt and Pepper Potatoes,"[¾ teaspoon finely ground black pepper, 2 larg..."
2,Thanksgiving Mac and Cheese,"[1 tsp garlic powder, 1 tsp smoked paprika, 4 ..."
3,Italian Sausage and Bread Stuffing,"[2 tablespoons olive oil, 2 pounds sweet itali..."
4,Newton's Law,"[1 teaspoon dark brown sugar, 1 teaspoon hot w..."
...,...,...
13630,Bhature\r (Fried Indian Bread),"[fenugreek leaves, sugar, wheat flour, coriand..."
13631,Vimumunya vya \rSukari\r (Sweetened Pumpkin & ...,"[coconut milk, sugar, pumpkin, water, cardamon]"
13632,Siro\r (Semolina & Nuts),"[sugar, pistachio nut, cow milk, cow ghee, alm..."
13633,Chaas\r(Diluted Yoghurt),"[natural yoghurt, water, salt]"
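
Many tokens in the first dataset still carry quantities and units (for example "1 tsp garlic powder"), so a user who types "garlic powder" would not match them exactly. A possible normalization step is sketched below; the helper `strip_quantity` and the short `UNITS` set are hypothetical and would need extending for real data.

```python
import re

# Hypothetical, deliberately short list of unit words
UNITS = {"cup", "cups", "tsp", "tbsp", "teaspoon", "teaspoons",
         "tablespoon", "tablespoons", "oz", "pound", "pounds", "lb"}

def strip_quantity(token):
    words = token.split()
    # Drop leading words that contain a digit or vulgar fraction, or that are known units
    while words and (re.search(r"[\d¼½¾⅓⅔]", words[0]) or words[0] in UNITS):
        words.pop(0)
    return " ".join(words)

print(strip_quantity("1 tsp garlic powder"))                    # garlic powder
print(strip_quantity("¾ teaspoon finely ground black pepper"))  # finely ground black pepper
print(strip_quantity("wheat flour"))                            # wheat flour
```

Tokens from the second dataset, which already hold bare ingredient names, pass through unchanged.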


In [199]:
def recommend_recipes(input_ingredients, combined_df):
    # Normalize user input by stripping whitespace and converting to lower case
    input_ingredients = [ingredient.strip().lower() for ingredient in input_ingredients.split(',')]
    
    # Find matching recipes
    matched_recipes = combined_df[combined_df['Ingredients'].apply(lambda x: any(ingredient in x for ingredient in input_ingredients))]
    
    # Check if any recipes were found
    if matched_recipes.empty:
        return "No recipe found. Try again."
    
    return matched_recipes[['Title', 'Instructions']]
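
The function above returns every recipe that shares at least one ingredient with the input, in no particular order. One way to rank the matches is by Jaccard similarity between the user's ingredients and each recipe's ingredients. The sketch below uses a plain dictionary with three recipes taken from the tail of the combined data; the function name `rank_by_overlap` and its `top_n` parameter are assumptions, not part of the notebook.

```python
def rank_by_overlap(user_ingredients, recipes, top_n=3):
    """Rank recipes by Jaccard similarity with a comma-separated ingredient string."""
    wanted = {item.strip().lower() for item in user_ingredients.split(",") if item.strip()}
    scored = []
    for title, ingredients in recipes.items():
        have = set(ingredients)
        union = wanted | have
        # Jaccard similarity: shared ingredients divided by all distinct ingredients
        score = len(wanted & have) / len(union) if union else 0.0
        if score > 0:
            scored.append((title, round(score, 2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

recipes = {
    "white chapati": ["wheat flour", "water", "sugar", "salt", "cooking oil"],
    "finger millet flour ugali": ["finger millet", "water"],
    "groundnut sauce": ["groundnut", "salt", "sour milk", "water"],
}

print(rank_by_overlap("water, salt", recipes))
# [('groundnut sauce', 0.5), ('white chapati', 0.4), ('finger millet flour ugali', 0.33)]
```

Ranking this way favours recipes the user can nearly complete with what they have, rather than any recipe that happens to contain water or salt.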


In [200]:
def clean_title(title):
    # Normalize titles by removing punctuation, converting to lowercase, and stripping extra spaces
    title = re.sub(r'[^\w\s]', '', title)
    title = re.sub(r'\s+', ' ', title).strip().lower()
    return title

# Apply title cleaning to the Title column
combined_df['Title'] = combined_df['Title'].apply(clean_title)

In [201]:
def get_ingredients_by_title(title_input, combined_df):
    # Normalize the title input
    title_input = title_input.strip().lower()
    # Check if the input appears in any recipe title (regex=False matches it as literal text)
    matched_title = combined_df[combined_df['Title'].str.lower().str.contains(title_input, regex=False)]
    # If there's a match by title, return the recipe ingredients
    if not matched_title.empty:
        return matched_title[['Title', 'Ingredients']]
    
    return "No recipe found with that title. Try again."


In [217]:
# Example user input: a keyword to look up in recipe titles
user_input = "porridge"

# Get the recipes whose title contains the keyword
matched_recipes = get_ingredients_by_title(user_input, combined_df)

# Display the matches
matched_recipes


Unnamed: 0,Title,Ingredients
452,supersimple overnight porridge,"[white parts kept whole, 2 green onions, green..."
1399,arroz caldo chicken rice porridge,"[2 spring onions scallions, peeled and cut int..."
1705,barley porridge with honeyed plums,"[1 cup almond milk, 1 pound plums, plus more f..."
2418,altgrain porridge with kimchi and jammy eggs,"[4 large eggs, wheat berries, 1 tablespoon veg..."
2426,altgrain porridge with sausages and grapes,"[1 tablespoon unsalted butter, 1 tablespoon re..."
4135,5grain porridge with bee pollen apples and coc...,"[cut into 14 pieces, 1 large sweettart apple s..."
4160,almondbarley porridge with fruit,"[3 ounces white chocolate, 34 cup dried cherri..."
4655,brown rice porridge with hazelnuts and jam,"[14 cup sugar, split lengthwise, or peach, 12 ..."
4853,trail mix porridge,"[34 cup packed light brown sugar, 6 tablespoon..."
5342,chia seed porridge with orange yogurt,"[12 cup 4 fl oz125ml blood orange juice, 1 tbs..."


## Exploratory Data Analysis

### Data Cleaning

## Feature Engineering