## Data Analysis Mathematics, Algorithms and Modeling

# AI-Powered Recipe Recommendation System

### Team : Group 3
| Student No  | First Name                  | Last Name     |
|-------------|-----------------------------|---------------|
| 9041129     | Nidhi                       | Ahir          |
| 9016986     | Keerthi                     | Gonuguntla    |
| 9027375     | Khushbu                     | Lad           |

#### Introduction
In a world where people are increasingly health-conscious and busy with work, it is often difficult to prepare meals that fit the available time, the ingredients on hand, and dietary preferences. Although millions of recipes are available on the internet, finding one that fits the available resources takes effort.

The objective of this project is to use advanced technologies, backed by a large dataset, to generate customized meal suggestions based on user input.

### Rectangular dataset : RAW_interactions.csv

In [5]:
import pandas as pd 

df = pd.read_csv('./Dataset/RAW_interactions.csv')
df

|         | user_id | recipe_id | date       | rating | review |
|---------|---------|-----------|------------|--------|--------|
| 0       | 38094   | 40893     | 2003-02-17 | 4      | Great with a salad. Cooked on top of stove for... |
| 1       | 1293707 | 40893     | 2011-12-21 | 5      | So simple, so delicious! Great for chilly fall... |
| 2       | 8937    | 44394     | 2002-12-01 | 4      | This worked very well and is EASY. I used not... |
| 3       | 126440  | 85009     | 2010-02-27 | 5      | I made the Mexican topping and took it to bunk... |
| 4       | 57222   | 85009     | 2011-10-01 | 5      | Made the cheddar bacon topping, adding a sprin... |
| ...     | ...     | ...       | ...        | ...    | ... |
| 1132362 | 116593  | 72730     | 2003-12-09 | 0      | Another approach is to start making sauce with... |
| 1132363 | 583662  | 386618    | 2009-09-29 | 5      | These were so delicious! My husband and I tru... |
| 1132364 | 157126  | 78003     | 2008-06-23 | 5      | WOW! Sometimes I don't take the time to rate ... |
| 1132365 | 53932   | 78003     | 2009-01-11 | 4      | Very good! I used regular port as well. The ... |


### Representing the new data set in classes and methods

In [11]:
import pandas as pd

class RawRecipe:
    def __init__(self):
        self.file_path = './Dataset/RAW_interactions.csv'
        self.data = None

    # Loads the data from a CSV file.
    def load_data(self):
        self.data = pd.read_csv(self.file_path)
        print("---> STEP 1 : Loads the data from a CSV file. \n")
        print("RAW_interactions.csv : Data loaded successfully.")
        print(f"Total Records : {self.data.shape[0]} \n")
        return self.data

    # Data quality : count the null values in each column.
    def check_null_values(self):
        print("---> STEP 2 : Null Check for data \n")
        if self.data is None:
            print("Data not loaded.")
            return None
        nulls = self.data.isnull().sum()
        print(nulls)
        return nulls

    # Data quality : list every recipe_id that appears in more than one row.
    def check_duplicate_values(self):
        print("\n---> STEP 3 : Duplicate data Check for name \n")
        if self.data is None:
            print("Data not loaded.")
            return None
        counts = self.data["recipe_id"].value_counts()
        dupl = counts[counts > 1].reset_index()
        dupl.columns = ["recipe_id", "Count"]
        print(dupl)
        return dupl

if __name__ == "__main__":

    # Create an instance of the RawRecipe class
    recipe_data = RawRecipe()

    # Load data
    recipe_data.load_data()

    # Check for missing values
    recipe_data.check_null_values()

    # Check for recipe_id values that occur more than once
    recipe_data.check_duplicate_values()

---> STEP 1 : Loads the data from a CSV file. 

RAW_interactions.csv : Data loaded successfully.
Total Records : 1132367 

---> STEP 2 : Null Check for data 

user_id        0
recipe_id      0
date           0
rating         0
review       169
dtype: int64

---> STEP 3 : Duplicate data Check for name 

        recipe_id  Count
0            2886   1613
1           27208   1601
2           89204   1579
3           39087   1448
4           67256   1322
...           ...    ...
139679     205412      2
139680     207778      2
139681     219118      2
139682     414099      2
139683     253419      2

[139684 rows x 2 columns]
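
A repeated `recipe_id` is expected in an interactions table, since one recipe can be reviewed by many users, so the Step 3 counts are better read as reviews per recipe than as data errors. A stricter duplicate check could look for repeated `(user_id, recipe_id)` pairs and report the missing reviews alongside them. The following is a minimal sketch under that assumption; `summarize_quality` and `sample` are hypothetical names, and the toy rows are illustrative only, not results from `RAW_interactions.csv`.

```python
import pandas as pd


def summarize_quality(df: pd.DataFrame) -> dict:
    """Count missing reviews and rows that share a (user_id, recipe_id) pair."""
    missing_reviews = int(df["review"].isnull().sum())
    # keep=False flags every row of a repeated pair, not only the later copies.
    repeated_pairs = int(
        df.duplicated(subset=["user_id", "recipe_id"], keep=False).sum()
    )
    return {"missing_reviews": missing_reviews, "repeated_pairs": repeated_pairs}


# Toy frame (made up for illustration): one missing review, one repeated pair.
sample = pd.DataFrame(
    {
        "user_id": [38094, 1293707, 8937, 8937],
        "recipe_id": [40893, 40893, 44394, 44394],
        "rating": [4, 5, 4, 4],
        "review": ["Great with a salad.", None, "Worked well.", "Worked well."],
    }
)

print(summarize_quality(sample))
```

Whether rows with a missing `review` should be dropped or kept depends on the later modeling step: the rating is still present, so a ratings-only recommender could keep them.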


This dataset appears to contain reviews and ratings for various recipes. Here's a breakdown of each column:

**user_id:** Unique identifier for the user who provided the rating/review.

**recipe_id:** Unique identifier for the recipe being rated/reviewed.

**date:** Date when the rating and review were provided.

**rating:** Numerical rating (on a scale of 0 to 5) given to the recipe.

**review:** User's textual review providing additional feedback or modifications to the recipe.
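
The column descriptions above suggest some light type preparation before analysis. The sketch below is one possible approach, not the project's confirmed pipeline: it parses `date` into a datetime column, flags ratings outside the 0 to 5 scale, and summarizes ratings per recipe. `prepare_interactions` and `rating_valid` are hypothetical names, and the sample rows reuse the first five rows shown earlier (without the review text).

```python
import pandas as pd

# First five interactions from the dataset preview, review text omitted.
sample = pd.DataFrame(
    {
        "user_id": [38094, 1293707, 8937, 126440, 57222],
        "recipe_id": [40893, 40893, 44394, 85009, 85009],
        "date": ["2003-02-17", "2011-12-21", "2002-12-01", "2010-02-27", "2011-10-01"],
        "rating": [4, 5, 4, 5, 5],
    }
)


def prepare_interactions(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a parsed date column and a rating-range flag."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"], format="%Y-%m-%d")
    # Assumption: any rating outside the documented 0-5 scale is invalid.
    out["rating_valid"] = out["rating"].between(0, 5)
    return out


prepared = prepare_interactions(sample)

# Number of ratings and mean rating for each recipe.
per_recipe = prepared.groupby("recipe_id")["rating"].agg(["count", "mean"])
print(per_recipe)
```

Because the scale starts at 0, it may be worth checking whether a rating of 0 means "lowest score" or "no rating given" before averaging.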