## This notebook walks through the processed CSV and performs exploratory data analysis to find any issues that need to be fixed before model creation

In [1]:
#import libraries
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from IPython.display import display
import sweetviz as sv

## Data Dictionary
- FoodID : Unique Identifier for the food (numerical)
- FoodDescription : Name and contents of the food (text)
- FoodGroup : 23 Groups of food (categorical)
- PROTValue : Value of Protein Nutrient in the food in g/100g
- FATValue : Value of Total Fat in the food in g/100g
- CARBValue : Value of Total Carbohydrate in the food in g/100g
- STARValue : Value of Starch in Carbohydrates in g/100g
- TSUGValue : Value of Sugar in Carbohydrates in g/100g
- TDFValue : Value of Dietary Fibre in Carbohydrates in g/100g
- TSATValue : Value of Saturated Fat in Fats in g/100g
- MUFAValue : Value of Monounsaturated Fat in Fats in g/100g
- PUFAValue : Value of Polyunsaturated Fat in Fats in g/100g
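
The dictionary above suggests a few consistency rules that can be checked during the analysis. The sketch below assumes that every `*Value` column should lie between 0 and 100 (since the unit is g/100g) and that the listed sub-components should not add up to more than their parent total. The helper name `find_inconsistent_rows` and the tolerance `tol` are hypothetical and not part of the original notebook. The first sample row is taken from the dataset preview; the second is a deliberately invalid, made-up row used only to exercise the check.

```python
import pandas as pd

# Hypothetical helper (an assumption, not part of the original notebook).
def find_inconsistent_rows(df, tol=0.5):
    """Flag rows that break basic sanity rules implied by the data dictionary."""
    value_cols = [c for c in df.columns if c.endswith("Value")]
    flags = pd.DataFrame(index=df.index)
    # Every nutrient is reported in g/100g, so it should lie in [0, 100].
    flags["out_of_range"] = ((df[value_cols] < 0) | (df[value_cols] > 100)).any(axis=1)
    # Assumption: starch + sugar + fibre should not exceed total carbohydrate.
    carb_parts = df[["STARValue", "TSUGValue", "TDFValue"]].sum(axis=1)
    flags["carb_parts_exceed_total"] = carb_parts > df["CARBValue"] + tol
    # Assumption: saturated + mono + poly fat should not exceed total fat.
    fat_parts = df[["TSATValue", "MUFAValue", "PUFAValue"]].sum(axis=1)
    flags["fat_parts_exceed_total"] = fat_parts > df["FATValue"] + tol
    return flags

sample = pd.DataFrame({
    "FoodID": [2, -1],
    "FoodDescription": ["Cheese souffle", "SYNTHETIC invalid row"],
    "PROTValue": [9.54, 10.0],
    "FATValue": [15.7, 5.0],
    "CARBValue": [5.91, 20.0],
    "STARValue": [0.0, 15.0],
    "TSUGValue": [2.66, 10.0],
    "TDFValue": [0.1, 2.0],
    "TSATValue": [5.742, 1.0],
    "MUFAValue": [5.82, 1.0],
    "PUFAValue": [2.77, 1.0],
})

flags = find_inconsistent_rows(sample)
print(flags)
```

Rows where any flag is `True` are candidates for manual inspection rather than automatic removal, because rounding in the source data can cause small overshoots.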

## Important Information
- Get the user's age, height in cm, weight in kg, and gender to build the user profile
- Use the Harris-Benedict equation (Needs.pdf) for Basal Energy Expenditure (BEE) to find the calorie intake required per day
- Protein requirement in grams = body weight in kg
- Protein calories = 4 * protein in grams
- Protein percentage = protein calories / total calories
- 60% of calories should come from carbohydrates, with more TDF and STAR and less TSUG
- Carbohydrates in grams = carbohydrate calories / 4
- The leftover calories should come from fat, with more MUFA and PUFA and less TSAT
- Fat in grams = fat calories / 9
- Input will be the carbohydrate, protein, and fat requirements calculated above
- Output should be foods that meet those needs, with carbohydrates favoring TDF and STAR and fats favoring MUFA and PUFA
- TSAT should not be more than 10% of fat intake
- TSUG should not be more than 10% of carbohydrate intake
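
The steps above can be sketched as a small calculation. Needs.pdf is not reproduced here, so this sketch assumes the commonly cited original Harris-Benedict coefficients and applies no activity or stress factor; the document may use different constants. The function names `harris_benedict_bee` and `macro_targets` are hypothetical.

```python
def harris_benedict_bee(weight_kg, height_cm, age_years, gender):
    """Basal Energy Expenditure in kcal/day.

    Assumption: original Harris-Benedict coefficients; Needs.pdf may differ.
    """
    if gender == "male":
        return 66.5 + 13.75 * weight_kg + 5.003 * height_cm - 6.755 * age_years
    if gender == "female":
        return 655.1 + 9.563 * weight_kg + 1.850 * height_cm - 4.676 * age_years
    raise ValueError("gender must be 'male' or 'female'")


def macro_targets(weight_kg, height_cm, age_years, gender):
    """Daily macronutrient targets following the rules listed above."""
    total_kcal = harris_benedict_bee(weight_kg, height_cm, age_years, gender)

    protein_g = weight_kg                 # 1 g of protein per kg of body weight
    protein_kcal = 4 * protein_g          # 4 kcal per gram of protein

    carb_kcal = 0.60 * total_kcal         # 60% of calories from carbohydrates
    carb_g = carb_kcal / 4                # 4 kcal per gram of carbohydrate

    fat_kcal = total_kcal - protein_kcal - carb_kcal  # leftover calories
    fat_g = fat_kcal / 9                  # 9 kcal per gram of fat

    return {
        "total_kcal": total_kcal,
        "protein_g": protein_g,
        "protein_pct": protein_kcal / total_kcal,
        "carb_g": carb_g,
        "fat_g": fat_g,
        "tsug_max_g": 0.10 * carb_g,      # sugar cap: 10% of carbohydrate intake
        "tsat_max_g": 0.10 * fat_g,       # saturated fat cap: 10% of fat intake
    }


targets = macro_targets(weight_kg=70, height_cm=175, age_years=30, gender="male")
for name, value in targets.items():
    print(f"{name}: {value:.2f}")
```

Because fat is computed as the remainder, a heavy user with a low BEE could end up with a very small or negative fat target; a production version would likely need to guard against that case.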

In [2]:
food_data_df = pd.read_csv('FoodNutritionData.csv')
display(food_data_df.head(5))

Unnamed: 0,FoodID,FoodDescription,FoodGroup,PROTValue,FATValue,CARBValue,STARValue,TSUGValue,TDFValue,TSATValue,MUFAValue,PUFAValue
0,2,Cheese souffle,Mixed Dishes,9.54,15.7,5.91,0.0,2.66,0.1,5.742,5.82,2.77
1,4,"Chop suey, with meat, canned",Mixed Dishes,4.07,2.8,5.29,0.0,3.4,1.1,0.364,1.54,0.75
2,5,"Chinese dish, chow mein, chicken",Mixed Dishes,6.76,2.8,8.29,3.99,1.74,1.0,0.49,0.613,1.226
3,6,Corn fritter,Baked Products,8.55,21.24,38.62,0.0,2.85,2.0,5.455,8.543,5.564
4,7,"Beef pot roast, with browned potatoes, peas an...",Mixed Dishes,21.29,5.25,10.72,0.0,1.44,1.6,1.872,2.552,0.709
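
The `Unnamed: 0` column in the preview is typically what pandas produces when a CSV was written together with its index and then read back without `index_col`. Assuming that is what happened here, the sketch below shows two ways to avoid carrying the extra column into the analysis. It uses a small in-memory CSV built from two rows of the preview rather than the real file.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV that was saved with its index column.
csv_text = (
    ",FoodID,FoodDescription,FoodGroup\n"
    "0,2,Cheese souffle,Mixed Dishes\n"
    '1,4,"Chop suey, with meat, canned",Mixed Dishes\n'
)

# Reading without index_col reproduces the stray "Unnamed: 0" column.
raw_df = pd.read_csv(io.StringIO(csv_text))

# Option 1: treat the first column as the index while reading.
clean_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Option 2: drop the column after reading.
dropped_df = raw_df.drop(columns="Unnamed: 0")

print(list(raw_df.columns))
print(list(clean_df.columns))
```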


In [3]:
sv_analyzer = sv.analyze(food_data_df)
sv_analyzer.show_html()



Creating Associations graph... DONE!
Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: no browser will pop up, the report is saved in your notebook/colab files.
