##                                          **Preprocessing Recipe Dataset**



### ***1. Importing the libraries and reading the dataset***

In [62]:
import pandas as pd
import numpy as np



This line of code reads the CSV file into a Pandas DataFrame.

*   data_url: This is the URL of the CSV file you want to read.
*   header=0: This indicates that the first row of the CSV file contains the column names.
*   delimiter=',': This specifies that the columns in the CSV file are separated by commas.
*   encoding='utf-8': This specifies the character encoding of the CSV file.

In [63]:
data_url="https://raw.githubusercontent.com/Chavi02/Recipe_Recommendation/main/IndianFoodDatasetCSV.csv"
df = pd.read_csv(data_url, header=0, delimiter=',', encoding='utf-8')
df.head()

Unnamed: 0,Srno,RecipeName,TranslatedRecipeName,Ingredients,TranslatedIngredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions,TranslatedInstructions,URL
0,1,Masala Karela Recipe,Masala Karela Recipe,"6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...","6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...",15,30,45,6,Indian,Side Dish,Diabetic Friendly,"To begin making the Masala Karela Recipe,de-se...","To begin making the Masala Karela Recipe,de-se...",https://www.archanaskitchen.com/masala-karela-...
1,2,टमाटर पुलियोगरे रेसिपी - Spicy Tomato Rice (Re...,Spicy Tomato Rice (Recipe),"2-1/2 कप चावल - पका ले,3 टमाटर,3 छोटा चमच्च बी...","2-1 / 2 cups rice - cooked, 3 tomatoes, 3 teas...",5,10,15,3,South Indian Recipes,Main Course,Vegetarian,टमाटर पुलियोगरे बनाने के लिए सबसे पहले टमाटर क...,"To make tomato puliogere, first cut the tomato...",http://www.archanaskitchen.com/spicy-tomato-ri...
2,3,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,"1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...","1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...",20,30,50,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To begin making the Ragi Vermicelli Recipe, fi...","To begin making the Ragi Vermicelli Recipe, fi...",http://www.archanaskitchen.com/ragi-vermicelli...
3,4,Gongura Chicken Curry Recipe - Andhra Style Go...,Gongura Chicken Curry Recipe - Andhra Style Go...,"500 grams Chicken,2 Onion - chopped,1 Tomato -...","500 grams Chicken,2 Onion - chopped,1 Tomato -...",15,30,45,4,Andhra,Lunch,Non Vegeterian,To begin making Gongura Chicken Curry Recipe f...,To begin making Gongura Chicken Curry Recipe f...,http://www.archanaskitchen.com/gongura-chicken...
4,5,आंध्रा स्टाइल आलम पचड़ी रेसिपी - Adrak Chutney ...,Andhra Style Alam Pachadi Recipe - Adrak Chutn...,"1 बड़ा चमच्च चना दाल,1 बड़ा चमच्च सफ़ेद उरद दाल,2...","1 tablespoon chana dal, 1 tablespoon white ura...",10,20,30,4,Andhra,South Indian Breakfast,Vegetarian,आंध्रा स्टाइल आलम पचड़ी बनाने के लिए सबसे पहले ...,"To make Andhra Style Alam Pachadi, first heat ...",https://www.archanaskitchen.com/andhra-style-a...


### ***2. Checking null values***

Counting the number of null values in each column of the DataFrame  using the isnull() method and then sums up the counts for each column using sum().


In [64]:
nullvalues = df.isnull().sum()
nullvalues

Srno                      0
RecipeName                0
TranslatedRecipeName      0
Ingredients               6
TranslatedIngredients     6
PrepTimeInMins            0
CookTimeInMins            0
TotalTimeInMins           0
Servings                  0
Cuisine                   0
Course                    0
Diet                      0
Instructions              0
TranslatedInstructions    0
URL                       0
dtype: int64



The dropna() function is used to remove rows with any null (NaN) values from the DataFrame df.

In [65]:
df = df.dropna()

In [66]:
nullvalues = df.isnull().sum()
nullvalues

Srno                      0
RecipeName                0
TranslatedRecipeName      0
Ingredients               0
TranslatedIngredients     0
PrepTimeInMins            0
CookTimeInMins            0
TotalTimeInMins           0
Servings                  0
Cuisine                   0
Course                    0
Diet                      0
Instructions              0
TranslatedInstructions    0
URL                       0
dtype: int64

### ***3. Dropping the non-contributing columns***

In the folllowing two line of codes,
*   removing the redundant columns - RecipieName, Ingredients, Instructions and uninformative column- URl.  

*   translated columns are renamed for better readability



In [67]:
df = df.drop(columns=['RecipeName','Ingredients','Instructions', 'URL'])


In [68]:
df = df.rename(columns={'TranslatedRecipeName': 'RecipeName', 'TranslatedIngredients': 'Ingredients', 'TranslatedInstructions':'Instructions'})

### ***4. Counting Infinite Values in DataFrame***


Using a lambda function with the applymap method on a DataFrame (df) to check if each element in the DataFrame is either an integer or a float and if it is also infinity using NumPy's isinf function. Then, you are summing the occurrences of infinity for each column.

In [69]:
is_infinity = df.applymap(lambda x: isinstance(x, (int, float)) and np.isinf(x))
num_infinity_values = is_infinity.sum()
num_infinity_values

Srno               0
RecipeName         0
Ingredients        0
PrepTimeInMins     0
CookTimeInMins     0
TotalTimeInMins    0
Servings           0
Cuisine            0
Course             0
Diet               0
Instructions       0
dtype: int64

### ***5. Extracting Column Names and Unique Values from DataFrame***

In [70]:
column_names = df.columns.tolist()
column_names

['Srno',
 'RecipeName',
 'Ingredients',
 'PrepTimeInMins',
 'CookTimeInMins',
 'TotalTimeInMins',
 'Servings',
 'Cuisine',
 'Course',
 'Diet',
 'Instructions']

In [71]:
unique_values = df['Cuisine'].unique()
unique_values

array(['Indian', 'South Indian Recipes', 'Andhra', 'Udupi', 'Mexican',
       'Fusion', 'Continental', 'Bengali Recipes', 'Punjabi', 'Chettinad',
       'Tamil Nadu', 'Maharashtrian Recipes', 'North Indian Recipes',
       'Italian Recipes', 'Sindhi', 'Thai', 'Chinese', 'Kerala Recipes',
       'Gujarati Recipes\ufeff', 'Coorg', 'Rajasthani', 'Asian',
       'Middle Eastern', 'Coastal Karnataka', 'European', 'Kashmiri',
       'Karnataka', 'Lucknowi', 'Hyderabadi', 'Side Dish', 'Goan Recipes',
       'Arab', 'Assamese', 'Bihari', 'Malabar', 'Himachal', 'Awadhi',
       'Cantonese', 'North East India Recipes', 'Sichuan', 'Mughlai',
       'Japanese', 'Mangalorean', 'Vietnamese', 'British',
       'North Karnataka', 'Parsi Recipes', 'Greek', 'Nepalese',
       'Oriya Recipes', 'French', 'Indo Chinese', 'Konkan',
       'Mediterranean', 'Sri Lankan', 'Haryana', 'Uttar Pradesh',
       'Malvani', 'Indonesian', 'African', 'Shandong', 'Korean',
       'American', 'Kongunadu', 'Pakistani', 'C

In [72]:
unique_values = df['Course'].unique()
unique_values

array(['Side Dish', 'Main Course', 'South Indian Breakfast', 'Lunch',
       'Snack', 'High Protein Vegetarian', 'Dinner', 'Appetizer',
       'Indian Breakfast', 'Dessert', 'North Indian Breakfast',
       'One Pot Dish', 'World Breakfast', 'Non Vegeterian', 'Vegetarian',
       'Eggetarian', 'No Onion No Garlic (Sattvic)', 'Brunch', 'Vegan',
       'Sugar Free Diet'], dtype=object)

In [73]:
unique_values = df['Diet'].unique()
unique_values


array(['Diabetic Friendly', 'Vegetarian', 'High Protein Vegetarian',
       'Non Vegeterian', 'High Protein Non Vegetarian', 'Eggetarian',
       'Vegan', 'No Onion No Garlic (Sattvic)', 'Gluten Free',
       'Sugar Free Diet'], dtype=object)

### ***6. Cleaning the columns***



*   Define a mapping dictionary for value replacement
*   Replace the values in diet column according to  mapping dictionary for better readability



In [74]:
value_mapping = {
    'Diabetic Friendly':'Sugar Free',
    'Sugar Free Diet' : 'Sugar Free',
    'No Onion No Garlic (Sattvic)' : 'Sattvic',
}
df['Diet'] = df['Diet'].replace(value_mapping)

In [75]:
num_unique_values = df['RecipeName'].nunique()
num_unique_values

6838

In [76]:
df.head()

Unnamed: 0,Srno,RecipeName,Ingredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions
0,1,Masala Karela Recipe,"6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...",15,30,45,6,Indian,Side Dish,Sugar Free,"To begin making the Masala Karela Recipe,de-se..."
1,2,Spicy Tomato Rice (Recipe),"2-1 / 2 cups rice - cooked, 3 tomatoes, 3 teas...",5,10,15,3,South Indian Recipes,Main Course,Vegetarian,"To make tomato puliogere, first cut the tomato..."
2,3,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,"1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...",20,30,50,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To begin making the Ragi Vermicelli Recipe, fi..."
3,4,Gongura Chicken Curry Recipe - Andhra Style Go...,"500 grams Chicken,2 Onion - chopped,1 Tomato -...",15,30,45,4,Andhra,Lunch,Non Vegeterian,To begin making Gongura Chicken Curry Recipe f...
4,5,Andhra Style Alam Pachadi Recipe - Adrak Chutn...,"1 tablespoon chana dal, 1 tablespoon white ura...",10,20,30,4,Andhra,South Indian Breakfast,Vegetarian,"To make Andhra Style Alam Pachadi, first heat ..."




*   List of course values redundant with diet values  to filter
*   Create the Courserecipes DataFrame by filtering based on 'Course' values
*   Extracting  'Recipe' , 'Diet', 'Ingredients' and 'Course' columns from the filtered DataFrame



In [77]:
course_values = ['Eggetarian', 'High Protein Vegetarian', 'Non Vegeterian', 'Vegetarian', 'No Onion No Garlic (Sattvic)', 'Vegan', 'Sugar Free Diet']
Courserecipes = df[df['Course'].isin(course_values)]
recipe_and_course = Courserecipes[['RecipeName', 'Course', 'Diet', 'Ingredients']]


In [78]:
recipe_and_course.head(30)
#print(recipe_and_course.head(30))

Unnamed: 0,RecipeName,Course,Diet,Ingredients
10,Homemade Baked Beans Recipe (Wholesome & Healthy),High Protein Vegetarian,Vegetarian,250 grams Dry beans - (such as cannellini or s...
167,Andaman Style Steamed Garlic Prawns Recipe,Non Vegeterian,Vegetarian,"10 Prawns,1 tablespoon Soy sauce,1 teaspoon Se..."
234,Palak Paneer Bhurji Recipe - Palak Paneer Bhur...,Vegetarian,Vegetarian,"200 grams cheese - pinch, 50 grams spinach - f..."
264,Pomegranate Frozen Margarita Recipe,Vegetarian,Vegetarian,"300 ml Pomegranate juice - freshly squeezed,1/..."
424,How To Boil Eggs At Home - Boiled Eggs Recipe,Eggetarian,Vegetarian,"2 Whole Eggs,Water - (2 inches above eggs in t..."
580,Achari Masala Powder Recipe,Vegetarian,Vegetarian,"2 बड़े चम्मच धनिये के बीज,1 छोटा चम्मच मेथी के ..."
614,Cucumber Honey Limeade Recipe,Vegetarian,High Protein Vegetarian,"1 Cucumber - juiced and strained,2 tablespoons..."
764,Shandong Style Sweet Potato In Spicy Caramel...,No Onion No Garlic (Sattvic),Vegetarian,"1 Sweet Potatoes - cubed,4 tablespoons Sugar,1..."
940,Thalipeeth Bhajani Recipe -Typical Maharashtri...,Vegetarian,Vegetarian,"250 grams Jowar Seeds,250 grams Ragi Seeds,250..."
1060,Tomato Basil Pasta And Pizza Sauce Recipe,Vegetarian,High Protein Non Vegetarian,"1 kg Blanched tomatoes,2 tablespoons Extra Vir..."


Filtering the recipe_and_course DataFrame to include only rows where the 'Course' column is not equal to the 'Diet' column.

In [79]:
result = recipe_and_course[recipe_and_course['Course'] != recipe_and_course['Diet']]

In [80]:
result

Unnamed: 0,RecipeName,Course,Diet,Ingredients
10,Homemade Baked Beans Recipe (Wholesome & Healthy),High Protein Vegetarian,Vegetarian,250 grams Dry beans - (such as cannellini or s...
167,Andaman Style Steamed Garlic Prawns Recipe,Non Vegeterian,Vegetarian,"10 Prawns,1 tablespoon Soy sauce,1 teaspoon Se..."
424,How To Boil Eggs At Home - Boiled Eggs Recipe,Eggetarian,Vegetarian,"2 Whole Eggs,Water - (2 inches above eggs in t..."
614,Cucumber Honey Limeade Recipe,Vegetarian,High Protein Vegetarian,"1 Cucumber - juiced and strained,2 tablespoons..."
764,Shandong Style Sweet Potato In Spicy Caramel...,No Onion No Garlic (Sattvic),Vegetarian,"1 Sweet Potatoes - cubed,4 tablespoons Sugar,1..."
1060,Tomato Basil Pasta And Pizza Sauce Recipe,Vegetarian,High Protein Non Vegetarian,"1 kg Blanched tomatoes,2 tablespoons Extra Vir..."
1236,Pepper Tea Recipe - Kali Mirch Chai,Vegetarian,High Protein Non Vegetarian,"2-1/2 cups Water,1 teaspoon Whole Black Pepper..."
1699,Strawberry Compote Recipe (Coulis),Vegetarian,Non Vegeterian,"500 grams Strawberries - fresh,1/2 cup Sugar,..."
1705,"Ginger, Lemon And Honey Kadha Recipe",Vegetarian,High Protein Non Vegetarian,8 inch Ginger - grated (you will need 5 tables...
2266,Irish Cream Bundt Cake Recipe,Vegetarian,High Protein Non Vegetarian,"2-1/2 cups All Purpose Flour (Maida),3/4 cup B..."


The below function extracts and prints ingredients based on a given recipe name.

Function Name: extract_ingredients

Parameters:
recipe_name (str): The name of the recipe to extract ingredients for.

df (pandas.DataFrame): The DataFrame containing recipe information.

Function Logic:
It filters the DataFrame to get the rows corresponding to the provided recipe_name.

If a matching recipe is found, it prints the ingredients; otherwise, it prints a message indicating that no recipe was found.

In [81]:
def extract_ingredients(recipe_name, df):
    selected_recipe = df[df['RecipeName'] == recipe_name]

    if not selected_recipe.empty:
        ingredients = selected_recipe['Ingredients'].values[0]
        print(f"Ingredients for {recipe_name}: {ingredients}")
    else:
        print(f"No recipe found with the name {recipe_name}")


In [82]:
extract_ingredients('Savory Oatmeal Bowl with Cabbage and Green Peas Stir Fry Recipe', recipe_and_course)

Ingredients for Savory Oatmeal Bowl with Cabbage and Green Peas Stir Fry Recipe: 1/4 cup Rolled Oats Or Instant Oats - 40 grams,1 1/2 cup Water,1/2 teaspoon Cumin seeds (Jeera),1 Mace (Javitri),1 cup Cabbage (Patta Gobi/ Muttaikose) - shredded,1/2 cup Green peas (Matar),1 Green Chilli - chopped,1 Onion - sliced,1 teaspoon Turmeric powder (Haldi),1 teaspoon Black pepper powder,1 teaspoon Red Chilli powder,Salt - to taste


In [83]:
result.shape

(30, 4)

In [84]:
result.index

Int64Index([  10,  167,  424,  614,  764, 1060, 1236, 1699, 1705, 2266, 2413,
            2468, 2469, 2677, 2684, 3118, 3479, 4078, 4194, 4325, 4780, 4968,
            5108, 5261, 5518, 5565, 5727, 5880, 6477, 6585],
           dtype='int64')

Updating 'Diet' and 'Course' Values for Specific Indices based on the correct value we found using extract_ingredients

In [85]:
index_to_change_for_diet=[  10,  167,  424,  614,  1060, 1236, 1699, 1705, 2413, 2677, 2684, 3479, 4194, 4325, 4780, 4968, 5261, 5518, 5565, 5727,  6477, 6585]
df.loc[index_to_change_for_diet, 'Diet'] = df.loc[index_to_change_for_diet, 'Course']
index_to_change_for_course=[2469, 5108, 5880, 764, 2468, 2469, 3118]
df.loc[index_to_change_for_course, 'Course'] = df.loc[index_to_change_for_course, 'Diet']

In [86]:
df.loc[2266, 'Diet'] = 'Non-vegetarian'
df.loc[2266, 'Course'] = 'Non-vegetarian'

In [87]:
df.loc[2266]

Srno                                                            2624
RecipeName                             Irish Cream Bundt Cake Recipe
Ingredients        2-1/2 cups All Purpose Flour (Maida),3/4 cup B...
PrepTimeInMins                                                    15
CookTimeInMins                                                    60
TotalTimeInMins                                                   75
Servings                                                           4
Cuisine                                                      Dessert
Course                                                Non-vegetarian
Diet                                                  Non-vegetarian
Instructions       To begin making the Irish cream bundt cake fir...
Name: 2266, dtype: object

In [88]:
df.loc[5880]

Srno                                                            9125
RecipeName                       Apple, Kiwi, Pineapple Juice Recipe
Ingredients        1 Apple - chopped,1 Kiwi - skin peeled and cho...
PrepTimeInMins                                                    15
CookTimeInMins                                                     0
TotalTimeInMins                                                   15
Servings                                                           4
Cuisine                                              World Breakfast
Course                                                    Vegetarian
Diet                                                      Vegetarian
Instructions       To begin making the Apple, Kiwi, Pineapple Jui...
Name: 5880, dtype: object

In [89]:
df.shape

(6865, 11)

In [90]:
df['RecipeName'].nunique()

6838

*   Create a DataFrame containing duplicate entries based on 'RecipeName'
*   Sorting and Examining Duplicate Recipes






In [91]:
duplicates=df[df.duplicated(subset=['RecipeName'],keep=False)]

In [92]:
duplicates=duplicates.sort_values('RecipeName')
duplicates.shape

(54, 11)

In [93]:
duplicates

Unnamed: 0,Srno,RecipeName,Ingredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions
6777,12966,Carrot and Beans Poriyal (Recipe In Hindi),"200 ग्राम हरा बीन्स - काट ले,200 ग्राम गाजर - ...",10,30,40,4,South Indian Recipes,Lunch,Vegetarian,"To make the carrot and beans porridge, first o..."
1211,1351,Carrot and Beans Poriyal (Recipe In Hindi),"200 ग्राम हरा बीन्स - काट ले,200 ग्राम गाजर - ...",10,30,40,4,South Indian Recipes,Lunch,Vegetarian,"To make carrot beans porridge, first cut the c..."
235,251,Badam Halwa Recipe,"1 cup almonds, 1/2 cup ghee, 3/4 cup milk, 1 t...",5,30,35,4,Awadhi,Dessert,Vegetarian,"To make almond pudding, first soak almonds in ..."
237,253,Badam Halwa Recipe,"1 cup Badam (Almond),1/2 cup Ghee,3/4 cup Milk...",5,30,35,4,Awadhi,Dessert,Vegetarian,"To begin making the Badam Halwa Recipe, combin..."
5988,9413,Beetroot Carrot Raita Recipe,"1/2 cup carrots - grated, 1 spoon - boil, 2 cu...",15,10,25,4,Indian,Side Dish,Vegetarian,"To make chakundar and carrot raita, first peel..."
1750,1999,Beetroot Carrot Raita Recipe,"1/2 cup Carrot (Gajjar) - grated,1 Beetroot - ...",15,10,25,4,Indian,Side Dish,Vegetarian,To begin making the Beetroot Carrot Raita Reci...
5450,8051,Bread Halwa Recipe,"10 Whole Wheat Brown Bread,1 cup Sugar - adjus...",10,20,30,2,Mughlai,Dessert,Vegetarian,"To begin making the Bread Halwa Recipe, soak b..."
1634,1853,Bread Halwa Recipe,"10 breads, 1 cup sugar, 1-1 / 2 cup milk, 4 ta...",10,20,30,2,Mughlai,Dessert,Vegetarian,"To make bread pudding, first place the bread s..."
1896,2177,Cabbage Besan Fry Recipe,"1 कप पत्ता गोभी,3 बड़े चमच्च बेसन,1/2 छोटा चमच्...",10,20,30,2,Indian,Dinner,Vegetarian,कैबेज बेसन फ्राई बनाने के लिए सबसे पहले पत्ता ...
1312,1470,Cabbage Besan Fry Recipe,"1 cup Cabbage (Patta Gobi/ Muttaikose),3 table...",10,20,30,2,Indian,Dinner,Vegetarian,"To begin with Cabbage Besan Fry, firstly chop ..."


Identifying rows in the DataFrame where the 'Ingredients' column contains Hindi characters.

*    Function has_hindi_characters to check if a string contains Hindi characters by iterating through its characters.

*    Applies the has_hindi_characters function to each element in the 'Ingredients' column using apply.

*    Creates a new DataFrame (rows_with_hindi_characters) containing rows where the 'Ingredients' column has Hindi characters.

*    Prints the rows with Hindi characters in the 'Ingredients' column.

In [94]:
def has_hindi_characters(s):
    for char in s:
        if '\u0900' <= char <= '\u097F':
            return True
    return False
rows_with_hindi_characters = df[df['Ingredients'].apply(has_hindi_characters)]
print("Rows with Hindi characters in 'Ingredients':")
print(rows_with_hindi_characters)

Rows with Hindi characters in 'Ingredients':
       Srno                                       RecipeName  \
576     624   South Indian Coconut Chutney (Recipe In Hindi)   
580     628                      Achari Masala Powder Recipe   
591     640                           Moong Dal Tikki Recipe   
624     675              Horsegram Chutney (Recipe In Hindi)   
637     690               Noodle Soup Recipe with Vegetables   
...     ...                                              ...   
6855  13928                        Spicy Cabbage Rice Recipe   
6859  13994                      Tadkewali Masoor Dal Recipe   
6866  14073                      Goan Mushroom Xacuti Recipe   
6867  14107      Sweet Potato & Methi Stuffed Paratha Recipe   
6870  14211                               Navrang Dal Recipe   

                                            Ingredients  PrepTimeInMins  \
576   1 कप नारियल - कस ले,1 बड़ा चमच्च रोस्टेड चना दा...              10   
580   2 बड़े चम्मच धनिये के बीज,1 छोट

Filtering out rows from the DataFrame where the 'Ingredients' column contains Hindi characters.
*   has_hindi_characters checks if a string contains Hindi characters by iterating through its characters.
*   Applies the has_hindi_characters function to each element in the 'Ingredients' column using apply.
*  Creates a boolean mask (mask) that is True for rows without Hindi characters and False for rows with Hindi characters.
*  Updates the DataFrame (df) by applying the boolean mask to exclude rows with Hindi characters.
*  Prints the updated DataFrame without rows containing Hindi characters.



In [95]:
def has_hindi_characters(s):
    for char in s:
        if '\u0900' <= char <= '\u097F':
            return True
    return False
mask = ~df['Ingredients'].apply(has_hindi_characters)
df = df[mask]
print("Updated DataFrame without rows with Hindi characters:")
df.head(30)

Updated DataFrame without rows with Hindi characters:


Unnamed: 0,Srno,RecipeName,Ingredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions
0,1,Masala Karela Recipe,"6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...",15,30,45,6,Indian,Side Dish,Sugar Free,"To begin making the Masala Karela Recipe,de-se..."
1,2,Spicy Tomato Rice (Recipe),"2-1 / 2 cups rice - cooked, 3 tomatoes, 3 teas...",5,10,15,3,South Indian Recipes,Main Course,Vegetarian,"To make tomato puliogere, first cut the tomato..."
2,3,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,"1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...",20,30,50,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To begin making the Ragi Vermicelli Recipe, fi..."
3,4,Gongura Chicken Curry Recipe - Andhra Style Go...,"500 grams Chicken,2 Onion - chopped,1 Tomato -...",15,30,45,4,Andhra,Lunch,Non Vegeterian,To begin making Gongura Chicken Curry Recipe f...
4,5,Andhra Style Alam Pachadi Recipe - Adrak Chutn...,"1 tablespoon chana dal, 1 tablespoon white ura...",10,20,30,4,Andhra,South Indian Breakfast,Vegetarian,"To make Andhra Style Alam Pachadi, first heat ..."
5,6,Pudina Khara Pongal Recipe (Rice and Lentils C...,"1 cup Rice - soaked for 20 minutes,1/2 cup Yel...",10,20,30,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To begin making Pudina Khara Pongal Recipe, wa..."
6,7,Udupi Style Ash Gourd Coconut Curry Recipe,500 grams Vellai Poosanikai (Ash gourd/White P...,10,30,40,4,Udupi,Lunch,Vegetarian,To begin making the Udupi Style Ash Gourd Coco...
7,8,Mexican Style Black Bean Burrito Recipe,"4 Tortillas,1/4 cup Black beans - soaked overn...",10,30,40,4,Mexican,Lunch,Vegetarian,"To begin making the Black Bean Burrito recipe,..."
8,9,Spicy Crunchy Masala Idli Recipe,"10 Idli - cut into strips,1 cup Green Bell Pep...",10,20,30,3,South Indian Recipes,Snack,Vegetarian,"To prepare Spicy Crunchy Masala Idli Recipe, H..."
9,10,Cauliflower Leaves Chutney (Recipe in Hindi),"1 cup cabbage leaves, 3/4 cup tomatoes, 18 gra...",5,20,25,3,South Indian Recipes,Side Dish,Vegetarian,"To make cauliflower leaf chutney, first of all..."


In [96]:
df['RecipeName'].nunique()

6282

In [97]:
duplicates=df[df.duplicated(subset=['RecipeName'],keep=False)]
duplicates=duplicates.sort_values('RecipeName')
duplicates.shape


(30, 11)

In [98]:
duplicates

Unnamed: 0,Srno,RecipeName,Ingredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions
235,251,Badam Halwa Recipe,"1 cup almonds, 1/2 cup ghee, 3/4 cup milk, 1 t...",5,30,35,4,Awadhi,Dessert,Vegetarian,"To make almond pudding, first soak almonds in ..."
237,253,Badam Halwa Recipe,"1 cup Badam (Almond),1/2 cup Ghee,3/4 cup Milk...",5,30,35,4,Awadhi,Dessert,Vegetarian,"To begin making the Badam Halwa Recipe, combin..."
5988,9413,Beetroot Carrot Raita Recipe,"1/2 cup carrots - grated, 1 spoon - boil, 2 cu...",15,10,25,4,Indian,Side Dish,Vegetarian,"To make chakundar and carrot raita, first peel..."
1750,1999,Beetroot Carrot Raita Recipe,"1/2 cup Carrot (Gajjar) - grated,1 Beetroot - ...",15,10,25,4,Indian,Side Dish,Vegetarian,To begin making the Beetroot Carrot Raita Reci...
5450,8051,Bread Halwa Recipe,"10 Whole Wheat Brown Bread,1 cup Sugar - adjus...",10,20,30,2,Mughlai,Dessert,Vegetarian,"To begin making the Bread Halwa Recipe, soak b..."
1634,1853,Bread Halwa Recipe,"10 breads, 1 cup sugar, 1-1 / 2 cup milk, 4 ta...",10,20,30,2,Mughlai,Dessert,Vegetarian,"To make bread pudding, first place the bread s..."
6083,9662,Cabbage Pachadi Recipe,"2 tablespoons til oil, 1 tablespoon white urad...",10,25,35,4,South Indian Recipes,Side Dish,Vegetarian,"To make the cabbage cabbage, first heat the oi..."
5653,8546,Cabbage Pachadi Recipe,"2 tablespoons Sesame (Gingelly) Oil,1 tablespo...",10,25,35,2,South Indian Recipes,Side Dish,Vegetarian,"To prepare cabbage pachadi recipe, heat oil in..."
632,685,Chicken Kathi Roll Recipe,"1 cup Whole Wheat Flour,Sunflower Oil - for kn...",15,25,40,4,Indian,Main Course,Non Vegeterian,To begin making the Chicken Tikka Kathi Roll R...
6130,9868,Chicken Kathi Roll Recipe,"1 cup wheat flour, oil - to knead, salt - as p...",15,25,40,4,Indian,Main Course,Non Vegeterian,चिकन काठी रोल रेसिपी बनाने के लिए सबसे पहले हम...


In [99]:
df.drop_duplicates(subset=['RecipeName'], keep='first', inplace=True)
df

Unnamed: 0,Srno,RecipeName,Ingredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions
0,1,Masala Karela Recipe,"6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...",15,30,45,6,Indian,Side Dish,Sugar Free,"To begin making the Masala Karela Recipe,de-se..."
1,2,Spicy Tomato Rice (Recipe),"2-1 / 2 cups rice - cooked, 3 tomatoes, 3 teas...",5,10,15,3,South Indian Recipes,Main Course,Vegetarian,"To make tomato puliogere, first cut the tomato..."
2,3,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,"1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...",20,30,50,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To begin making the Ragi Vermicelli Recipe, fi..."
3,4,Gongura Chicken Curry Recipe - Andhra Style Go...,"500 grams Chicken,2 Onion - chopped,1 Tomato -...",15,30,45,4,Andhra,Lunch,Non Vegeterian,To begin making Gongura Chicken Curry Recipe f...
4,5,Andhra Style Alam Pachadi Recipe - Adrak Chutn...,"1 tablespoon chana dal, 1 tablespoon white ura...",10,20,30,4,Andhra,South Indian Breakfast,Vegetarian,"To make Andhra Style Alam Pachadi, first heat ..."
...,...,...,...,...,...,...,...,...,...,...,...
6863,14030,Saffron Paneer Peda Recipe,2 cups Paneer (Homemade Cottage Cheese) - crum...,5,15,20,4,Indian,Dessert,Vegetarian,To begin making the Saffron Paneer Peda recipe...
6864,14038,Italian Arancini Rice Balls Recipe With Delici...,1-1/2 cup Risotto - cooked risotto (recipe bel...,10,90,100,3,Italian Recipes,Dinner,Vegetarian,To begin making the Italian Arancini Rice Ball...
6865,14045,Quinoa Phirnee Recipe (Quinoa Milk Pudding),"1 cup Quinoa,3/4 cup Sugar,1 teaspoon Cardamom...",10,25,35,2,Indian,Dessert,Vegetarian,"To begin making Quinoa Phirnee Recipe, place a..."
6868,14165,Ullikadala Pulusu Recipe | Spring Onion Curry,150 grams Spring Onion (Bulb & Greens) - chopp...,5,10,15,2,Andhra,Side Dish,Vegetarian,To begin making Ullikadala Pulusu Recipe | Spr...


*  List of course values to filter
*  Create the Courserecipes DataFrame by filtering based on 'Course' values
*  Extract 'RecipeName','Course', 'Diet', 'Ingredients' columns from the filtered DataFrame

In [100]:
course_values = ['Eggetarian', 'High Protein Vegetarian', 'Non Vegeterian', 'Vegetarian', 'No Onion No Garlic (Sattvic)', 'Vegan', 'Sugar Free Diet']
Courserecipes = df[df['Course'].isin(course_values)]
recipe_and_course = Courserecipes[['RecipeName', 'Course', 'Diet', 'Ingredients']]
recipe_and_course

Unnamed: 0,RecipeName,Course,Diet,Ingredients
10,Homemade Baked Beans Recipe (Wholesome & Healthy),High Protein Vegetarian,High Protein Vegetarian,250 grams Dry beans - (such as cannellini or s...
167,Andaman Style Steamed Garlic Prawns Recipe,Non Vegeterian,Non Vegeterian,"10 Prawns,1 tablespoon Soy sauce,1 teaspoon Se..."
234,Palak Paneer Bhurji Recipe - Palak Paneer Bhur...,Vegetarian,Vegetarian,"200 grams cheese - pinch, 50 grams spinach - f..."
264,Pomegranate Frozen Margarita Recipe,Vegetarian,Vegetarian,"300 ml Pomegranate juice - freshly squeezed,1/..."
424,How To Boil Eggs At Home - Boiled Eggs Recipe,Eggetarian,Eggetarian,"2 Whole Eggs,Water - (2 inches above eggs in t..."
614,Cucumber Honey Limeade Recipe,Vegetarian,Vegetarian,"1 Cucumber - juiced and strained,2 tablespoons..."
764,Shandong Style Sweet Potato In Spicy Caramel...,Vegetarian,Vegetarian,"1 Sweet Potatoes - cubed,4 tablespoons Sugar,1..."
940,Thalipeeth Bhajani Recipe -Typical Maharashtri...,Vegetarian,Vegetarian,"250 grams Jowar Seeds,250 grams Ragi Seeds,250..."
1060,Tomato Basil Pasta And Pizza Sauce Recipe,Vegetarian,Vegetarian,"1 kg Blanched tomatoes,2 tablespoons Extra Vir..."
1236,Pepper Tea Recipe - Kali Mirch Chai,Vegetarian,Vegetarian,"2-1/2 cups Water,1 teaspoon Whole Black Pepper..."


In [101]:
recipe_and_course.shape

(57, 4)

In [102]:
all_indexes = np.array(recipe_and_course.index)
all_indexes

array([  10,  167,  234,  264,  424,  614,  764,  940, 1060, 1236, 1255,
       1689, 1699, 1705, 1943, 2164, 2411, 2413, 2468, 2469, 2677, 2684,
       2746, 2811, 2845, 3266, 3306, 3307, 3311, 3337, 3479, 3795, 4078,
       4194, 4325, 4747, 4777, 4780, 4840, 4968, 5108, 5125, 5261, 5288,
       5341, 5444, 5482, 5518, 5565, 5727, 5810, 5880, 6116, 6437, 6477,
       6524, 6585])

In [103]:
all_indexes.size

57

In [104]:
recipe_and_course.head(5)

Unnamed: 0,RecipeName,Course,Diet,Ingredients
10,Homemade Baked Beans Recipe (Wholesome & Healthy),High Protein Vegetarian,High Protein Vegetarian,250 grams Dry beans - (such as cannellini or s...
167,Andaman Style Steamed Garlic Prawns Recipe,Non Vegeterian,Non Vegeterian,"10 Prawns,1 tablespoon Soy sauce,1 teaspoon Se..."
234,Palak Paneer Bhurji Recipe - Palak Paneer Bhur...,Vegetarian,Vegetarian,"200 grams cheese - pinch, 50 grams spinach - f..."
264,Pomegranate Frozen Margarita Recipe,Vegetarian,Vegetarian,"300 ml Pomegranate juice - freshly squeezed,1/..."
424,How To Boil Eggs At Home - Boiled Eggs Recipe,Eggetarian,Eggetarian,"2 Whole Eggs,Water - (2 inches above eggs in t..."


 Update the 'Course' values in  DataFrame for the  indices which have diet values like 'Eggetarian', 'High Protein Vegetarian', 'Non Vegeterian', 'Vegetarian', 'No Onion No Garlic (Sattvic)', 'Vegan', 'Sugar Free Diet' to their respective
 courses like 'Main Course', 'Side Dish', 'Breakfast', 'Snack'.

In [105]:
x = np.array([10, 167, 234])
df.loc[x, 'Course'] = 'Main Course'

In [106]:
x = np.array([264, 424])
df.loc[x, 'Course'] = 'Side Dish'

In [107]:
recipe_and_course

Unnamed: 0,RecipeName,Course,Diet,Ingredients
10,Homemade Baked Beans Recipe (Wholesome & Healthy),High Protein Vegetarian,High Protein Vegetarian,250 grams Dry beans - (such as cannellini or s...
167,Andaman Style Steamed Garlic Prawns Recipe,Non Vegeterian,Non Vegeterian,"10 Prawns,1 tablespoon Soy sauce,1 teaspoon Se..."
234,Palak Paneer Bhurji Recipe - Palak Paneer Bhur...,Vegetarian,Vegetarian,"200 grams cheese - pinch, 50 grams spinach - f..."
264,Pomegranate Frozen Margarita Recipe,Vegetarian,Vegetarian,"300 ml Pomegranate juice - freshly squeezed,1/..."
424,How To Boil Eggs At Home - Boiled Eggs Recipe,Eggetarian,Eggetarian,"2 Whole Eggs,Water - (2 inches above eggs in t..."
614,Cucumber Honey Limeade Recipe,Vegetarian,Vegetarian,"1 Cucumber - juiced and strained,2 tablespoons..."
764,Shandong Style Sweet Potato In Spicy Caramel...,Vegetarian,Vegetarian,"1 Sweet Potatoes - cubed,4 tablespoons Sugar,1..."
940,Thalipeeth Bhajani Recipe -Typical Maharashtri...,Vegetarian,Vegetarian,"250 grams Jowar Seeds,250 grams Ragi Seeds,250..."
1060,Tomato Basil Pasta And Pizza Sauce Recipe,Vegetarian,Vegetarian,"1 kg Blanched tomatoes,2 tablespoons Extra Vir..."
1236,Pepper Tea Recipe - Kali Mirch Chai,Vegetarian,Vegetarian,"2-1/2 cups Water,1 teaspoon Whole Black Pepper..."


In [108]:
x = np.array([614, 764, 940, 1236])
df.loc[x, 'Course'] = 'Side Dish'
df.loc[1060, 'Course'] = 'Main Course'

In [109]:
x = np.array([1255, 1705, 1699])
df.loc[x, 'Course'] = 'Side Dish'
df.loc[1689, 'Course'] = 'Breakfast'
df.loc[1943, 'Course'] = 'Main Course'

In [110]:
x = np.array([2413, 2468, 2469])
df.loc[x, 'Course'] = 'Main Course'
df.loc[2411, 'Course'] = 'Breakfast'
df.loc[2164, 'Course'] = 'Side Dish"'

In [111]:
x = np.array([2677, 2684, 2746, 2811, 2845, 3266, 3306, 3307, 3311, 3337, 3479, 3795, 4078])
df.loc[x, 'Course'] = 'Side Dish'

In [112]:
x = np.array([4325, 4747, 4777, 4780, 4840, 4968])
y = np.array([4194, 5108])
df.loc[x, 'Course'] = 'Side Dish'
df.loc[y, 'Course'] = 'Breakfast'

In [113]:
x = np.array([5125, 5518, 5810, 6437])
y = np.array([5261, 5288, 5341, 5444, 5482,5565, 5727, 5880, 6116, 6477, 6585])
z = np.array([6524, 3118])
df.loc[x, 'Course'] = 'Main Course'
df.loc[y, 'Course'] = 'Side Dish'
df.loc[z, 'Course'] = 'Snack'

In [114]:
unique_values = df['Course'].unique()
unique_values

array(['Side Dish', 'Main Course', 'South Indian Breakfast', 'Lunch',
       'Snack', 'Dinner', 'Appetizer', 'Indian Breakfast', 'Dessert',
       'North Indian Breakfast', 'One Pot Dish', 'World Breakfast',
       'Brunch', 'Breakfast', 'Side Dish"', 'Non-vegetarian'],
      dtype=object)

 Renaming 'Course' Values like 'North Indian Breakfast', 'South Indian Breakfast', 'Indian Breakfast', 'World Breakfast' in DataFrame to common name 'Breakfast'.

In [115]:
value_to_find = ['North Indian Breakfast', 'South Indian Breakfast', 'Indian Breakfast', 'World Breakfast']

new_names = 'Breakfast'

df['Course'].replace(to_replace=value_to_find, value=new_names, inplace=True)

In [116]:
unique_values = df['Course'].unique()
unique_values

array(['Side Dish', 'Main Course', 'Breakfast', 'Lunch', 'Snack',
       'Dinner', 'Appetizer', 'Dessert', 'One Pot Dish', 'Brunch',
       'Side Dish"', 'Non-vegetarian'], dtype=object)

In [117]:
unique_values = df['Cuisine'].unique()
unique_values

array(['Indian', 'South Indian Recipes', 'Andhra', 'Udupi', 'Mexican',
       'Fusion', 'Continental', 'Bengali Recipes', 'Punjabi', 'Chettinad',
       'Tamil Nadu', 'Maharashtrian Recipes', 'North Indian Recipes',
       'Italian Recipes', 'Sindhi', 'Thai', 'Chinese', 'Kerala Recipes',
       'Gujarati Recipes\ufeff', 'Coorg', 'Rajasthani', 'Asian',
       'Middle Eastern', 'Coastal Karnataka', 'European', 'Kashmiri',
       'Karnataka', 'Lucknowi', 'Hyderabadi', 'Side Dish', 'Goan Recipes',
       'Arab', 'Assamese', 'Bihari', 'Malabar', 'Himachal', 'Awadhi',
       'Cantonese', 'North East India Recipes', 'Sichuan', 'Mughlai',
       'Japanese', 'Mangalorean', 'Vietnamese', 'British',
       'North Karnataka', 'Parsi Recipes', 'Greek', 'Nepalese',
       'Oriya Recipes', 'French', 'Indo Chinese', 'Konkan',
       'Mediterranean', 'Sri Lankan', 'Uttar Pradesh', 'Malvani',
       'Indonesian', 'African', 'Shandong', 'Korean', 'American',
       'Kongunadu', 'Pakistani', 'Caribbean', 

*  List the cuisine values whihc have reduntant values with 'Course' columns.
*  Create the Cuisinerecipes DataFrame by filtering based on 'Course' values.
*  Extract both 'RecipeName', 'Course', 'Diet' , 'Ingredients', 'Cuisine' columns from the filtered DataFrame

In [118]:

cuisine_values = ['Side Dish', 'World Breakfast', 'Appetizer', 'Dessert', 'Lunch', 'Snack' , 'Dinner', 'Brunch']
cuisinerecipes = df[df['Cuisine'].isin(cuisine_values)]
rnc = cuisinerecipes[['RecipeName', 'Course', 'Diet', 'Ingredients', 'Cuisine']]
rnc

Unnamed: 0,RecipeName,Course,Diet,Ingredients,Cuisine
167,Andaman Style Steamed Garlic Prawns Recipe,Main Course,Non Vegeterian,"10 Prawns,1 tablespoon Soy sauce,1 teaspoon Se...",Side Dish
1578,Sweet Potato Balls Recipe Stuffed With Cheesy ...,Breakfast,Eggetarian,"3 Sweet Potatoes - large one,4 sprigs Coriande...",Appetizer
1689,Savory Oatmeal Bowl with Sautéed Spinach and C...,Breakfast,Vegetarian,1/4 cup Rolled Oats Or Instant Oats - 40 grams...,World Breakfast
2266,Irish Cream Bundt Cake Recipe,Non-vegetarian,Non-vegetarian,"2-1/2 cups All Purpose Flour (Maida),3/4 cup B...",Dessert
2411,Savory Oatmeal Bowl with Chettinad Mushroom an...,Breakfast,Vegetarian,1/4 cup Rolled Oats Or Instant Oats - 40 grams...,World Breakfast
2413,Moroccan Baked Fish Recipe,Main Course,Non Vegeterian,500 grams Basa fish - salmon or any white firm...,Dinner
2746,Carrot Ginger Juice Recipe,Side Dish,Vegetarian,"4 Carrots (Gajjar) - roughly diced,1 inch Ging...",World Breakfast
3118,Sweet Potato & Rosemary Crisps/ Chips,Snack,Sugar Free,"1 Sweet Potato,2 tablespoons Extra Virgin Oliv...",Snack
4194,Savory Oatmeal Bowl with Cabbage and Green Pea...,Breakfast,Vegetarian,1/4 cup Rolled Oats Or Instant Oats - 40 grams...,World Breakfast
4392,Red Wine Sangria Cocktail Recipe,Dinner,Vegetarian,"750 ml Red wine,1/2 cup Strawberry Vodka,1/2 c...",Brunch


Update the 'Cuisine' values in  DataFrame for the  indices which have course values like 'Side Dish', 'World Breakfast', 'Appetizer', 'Dessert', 'Lunch', 'Snack' , 'Dinner', 'Brunch' to
 their respecitive cuisine.

In [119]:
y = np.array([1578, 1689, 2266, 2411, 2746, 3118, 4194, 5108, 5880, 6524])
df.loc[167, 'Cuisine'] = 'Indian'
df.loc[y, 'Cuisine'] = 'Continental'
df.loc[5810, 'Cuisine'] = 'Nepali'
df.loc[6466, 'Cuisine'] = 'Chinese'

In [120]:
unique_values = df['Cuisine'].unique()
unique_values

array(['Indian', 'South Indian Recipes', 'Andhra', 'Udupi', 'Mexican',
       'Fusion', 'Continental', 'Bengali Recipes', 'Punjabi', 'Chettinad',
       'Tamil Nadu', 'Maharashtrian Recipes', 'North Indian Recipes',
       'Italian Recipes', 'Sindhi', 'Thai', 'Chinese', 'Kerala Recipes',
       'Gujarati Recipes\ufeff', 'Coorg', 'Rajasthani', 'Asian',
       'Middle Eastern', 'Coastal Karnataka', 'European', 'Kashmiri',
       'Karnataka', 'Lucknowi', 'Hyderabadi', 'Goan Recipes', 'Arab',
       'Assamese', 'Bihari', 'Malabar', 'Himachal', 'Awadhi', 'Cantonese',
       'North East India Recipes', 'Sichuan', 'Mughlai', 'Japanese',
       'Mangalorean', 'Vietnamese', 'British', 'North Karnataka',
       'Parsi Recipes', 'Greek', 'Nepalese', 'Oriya Recipes', 'French',
       'Indo Chinese', 'Konkan', 'Mediterranean', 'Sri Lankan',
       'Uttar Pradesh', 'Malvani', 'Indonesian', 'African', 'Shandong',
       'Korean', 'American', 'Kongunadu', 'Pakistani', 'Caribbean',
       'South Karnat

In [121]:
df.loc[2413, 'Cuisine'] = 'Spanish'
df.loc[4392, 'Cuisine'] = 'Middle Eastern'

### ***6. Converting dataframe to csv file and downloading it.***

In [122]:
df.to_csv('Cleaned_IndianFoodDataset.csv', index=False)
from google.colab import files

files.download('Cleaned_IndianFoodDataset.csv')

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>