## Data Collection and Preprocessing

- Collect and import the necessary datasets: FoodNutrition.csv, which lists the glycaemic index (GI), glycaemic load (GL), and macronutrient values per food, and food.csv, which lists detailed nutrient, mineral, and vitamin content per food.
- Explore the datasets to understand the data, including the features, data types, and missing values.
- Clean and preprocess the datasets by handling missing values, removing irrelevant features, and converting categorical variables into numerical representations if needed.
- Perform data visualization and analysis to gain insights into the data and identify patterns or trends related to food nutrition and health conditions.


In [4]:
import pandas as pd

# Load the FoodNutrition.csv and food.csv datasets
food_nutrition_df = pd.read_csv('../datasets/FoodNutrition.csv')
food_df = pd.read_csv('../datasets/food.csv')


In [5]:
food_nutrition_df.head()

Unnamed: 0,FoodName,GI,Grams,Protein,Carb,DF,TF,GL,CAL,FF
0,Bajra,28.0,100.0,11.6,67.0,1.0,5.0,19.0,359.4,1.87
1,Jowar,62.0,100.0,9.97,24.4,10.22,1.9,15.0,154.58,3.0
2,Dry Maize,56.0,100.0,11.1,21.0,2.4,3.6,12.0,160.8,2.37
3,Ragi,84.0,100.0,7.3,72.0,3.6,1.3,60.0,328.9,1.73
4,Rice,73.0,100.0,6.6,79.0,1.4,0.6,58.0,347.8,1.64


In [6]:
food_df.head()

Unnamed: 0,Category,Description,Nutrient Data Bank Number,Data.Alpha Carotene,Data.Ash,Data.Beta Carotene,Data.Beta Cryptoxanthin,Data.Carbohydrate,Data.Cholesterol,Data.Choline,...,Data.Major Minerals.Potassium,Data.Major Minerals.Sodium,Data.Major Minerals.Zinc,Data.Vitamins.Vitamin A - IU,Data.Vitamins.Vitamin A - RAE,Data.Vitamins.Vitamin B12,Data.Vitamins.Vitamin B6,Data.Vitamins.Vitamin C,Data.Vitamins.Vitamin E,Data.Vitamins.Vitamin K
0,BUTTER,"BUTTER,WITH SALT",1001,0,2.11,158,0,0.06,215,19,...,24,576,0.09,2499,684,0.17,0.003,0.0,2.32,7.0
1,BUTTER,"BUTTER,WHIPPED,WITH SALT",1002,0,2.11,158,0,0.06,219,19,...,26,827,0.05,2499,684,0.13,0.003,0.0,2.32,7.0
2,BUTTER OIL,"BUTTER OIL,ANHYDROUS",1003,0,0.0,193,0,0.0,256,22,...,5,2,0.01,3069,840,0.01,0.001,0.0,2.8,8.6
3,CHEESE,"CHEESE,BLUE",1004,0,5.11,74,0,2.34,75,15,...,256,1395,2.66,763,198,1.22,0.166,0.0,0.25,2.4
4,CHEESE,"CHEESE,BRICK",1005,0,3.18,76,0,2.79,94,15,...,136,560,2.6,1080,292,1.26,0.065,0.0,0.26,2.5


In [7]:

# Perform data preprocessing tasks
# e.g., handling missing values, data integration, etc.
# ...

# Explore and analyze the datasets
# e.g., data visualization, summary statistics, etc.
# ...
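
The placeholder steps above could start with something like the following minimal sketch. It assumes a small in-memory frame (`sample_df`) in place of the real CSV, and the helper name `basic_clean` and the median-fill strategy are illustrative choices, not the project's confirmed method. The missing GI value is introduced on purpose to show the handling.

```python
import pandas as pd

# Toy stand-in for FoodNutrition.csv. The missing GI value is deliberate,
# so the sketch has something to clean.
sample_df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "FoodName": ["Bajra", "Jowar", "Dry Maize", "Ragi"],
    "GI": [28.0, 62.0, None, 84.0],
    "Carb": [67.0, 24.4, 21.0, 72.0],
})


def basic_clean(df):
    """Drop the exported index column and fill numeric gaps with the column median."""
    cleaned = df.drop(columns=["Unnamed: 0"], errors="ignore").copy()
    numeric_cols = cleaned.select_dtypes(include="number").columns
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].median())
    return cleaned


print(sample_df.dtypes)        # data type of each column
print(sample_df.isna().sum())  # number of missing values per column

clean_df = basic_clean(sample_df)
print(clean_df)
```

Median imputation is only one option; dropping incomplete rows may be more appropriate when few values are missing.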


## Feature Engineering

- Identify the relevant features for predicting Glycaemic Index (GI) value, sodium, potassium, and iron content of different foods based on the problem statement and objectives.
- Extract and transform the features to create suitable input features for the machine learning models, such as converting units, normalizing data, or creating new features if needed.
- Split the datasets into training and testing sets to prepare for model training and evaluation.
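
As a rough illustration of the normalizing and splitting steps, the sketch below reuses three columns from the five FoodNutrition.csv rows shown earlier. The choice of columns and of `StandardScaler` is an assumption made for the example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Five rows copied from the food_nutrition_df.head() output (subset of columns)
df = pd.DataFrame({
    "Protein": [11.6, 9.97, 11.1, 7.3, 6.6],
    "Carb": [67.0, 24.4, 21.0, 72.0, 79.0],
    "DF": [1.0, 10.22, 2.4, 3.6, 1.4],
    "GI": [28.0, 62.0, 56.0, 84.0, 73.0],
})

X = df[["Protein", "Carb", "DF"]]
y = df["GI"]

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, so test rows do not influence it
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

Scaling matters most for distance-based models such as KNN, where a feature with a large numeric range would otherwise dominate the distance.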

In [8]:
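# Identify relevant features for predicting GI value, sodium, potassium, and iron content
# Illustrative choice of nutrient columns from FoodNutrition.csv. GL is left out
# because glycaemic load is computed from GI and would leak the target.
selected_features = ['Protein', 'Carb', 'DF', 'TF', 'CAL']


In [9]:
# Perform feature engineering tasks
# e.g., feature scaling, encoding categorical variables, etc.
# ...

# Prepare the input features (X) and target variable (y) for the models
X = food_nutrition_df[selected_features]
y_gi = food_nutrition_df['GI']

# Sodium, potassium, and iron are not columns of FoodNutrition.csv. The mineral
# columns live in food.csv, so these targets need a feature matrix built from food_df.
y_sodium = food_df['Data.Major Minerals.Sodium']
y_potassium = food_df['Data.Major Minerals.Potassium']
# Assumed column name: the head() output truncates the columns between Data.Choline
# and Data.Major Minerals.Potassium. .get() returns None if the column is absent.
y_iron = food_df.get('Data.Major Minerals.Iron')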

## Model Development

- Implement linear regression to predict the GI value, sodium, potassium, and iron content of different foods based on the input features. Train the model using the training dataset and tune hyperparameters if necessary.
- Implement the K-nearest neighbours (KNN) algorithm to identify the best suitable food samples with low GI values for obese and diabetic patients, low sodium value foods for cardiovascular patients, low potassium food for kidney patients, and high iron food for anaemic patients. Train the model using the training dataset and tune hyperparameters if necessary.
- Implement Case-Based Reasoning (CBR) to store the food items suggested by the system if the patients agree to it, for future recommendations. This may involve creating a database or memory to store and retrieve past cases and their corresponding recommendations.


In [11]:
# Linear Regression

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_gi, test_size=0.2, random_state=42)

# Initialize the linear regression model
linear_regression_model = LinearRegression()

# Train the linear regression model
linear_regression_model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = linear_regression_model.predict(X_test)

# Evaluate the performance of the linear regression model
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)



In [None]:
# KNN

from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_gi, test_size=0.2, random_state=42)

# Initialize the KNN regression model
knn_model = KNeighborsRegressor(n_neighbors=5)

# Train the KNN regression model
knn_model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = knn_model.predict(X_test)

# Evaluate the performance of the KNN model
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)


In [None]:
from sklearn.neighbors import NearestNeighbors

# Perform retrieval based on similarity measures
# e.g., cosine similarity, Euclidean distance, etc.
# ...

# Fit a nearest neighbors model on the feature vectors
# e.g., using cosine similarity
retrieval_model = NearestNeighbors(metric='cosine')
retrieval_model.fit(X)

# Given a query input, find the k-nearest neighbors in the dataset.
# kneighbors expects a 2D input of shape (n_queries, n_features).
# Placeholder query: the first row of X stands in for a real patient query.
query_feature_vector = X.iloc[[0]]
k = 5  # Number of neighbors to retrieve (placeholder value)
distances, indices = retrieval_model.kneighbors(query_feature_vector, n_neighbors=k)

# Retrieve the k-nearest neighbors from the dataset
retrieved_cases = food_nutrition_df.iloc[indices[0]]


In [None]:
# Reuse retrieved cases by applying the learned models
# e.g., using the trained linear regression and KNN models
# ...

# Perform prediction using the retrieved cases
retrieved_X = retrieved_cases[selected_features]
retrieved_y_gi = retrieved_cases['GI']
# Sodium, potassium, and iron are not columns of FoodNutrition.csv, so those
# targets cannot be read from retrieved_cases; they need cases retrieved from food_df.

# Use the trained linear regression model for prediction
linear_regression_pred = linear_regression_model.predict(retrieved_X)

# Use the trained KNN model for prediction
knn_pred = knn_model.predict(retrieved_X)

# Perform other computations or post-processing on the retrieved cases and their predictions
# ...

# Select the most suitable retrieved case and its prediction based on similarity and relevance
# e.g., based on cosine similarity, Euclidean distance, etc.
# ...



In [None]:
# Revise the solution or prediction based on user feedback or other criteria
# e.g., incorporating user preferences, domain-specific rules, etc.
# ...



In [None]:
# Retain the revised solution or prediction for future use
# e.g., storing the updated prediction or solution in a database, file, etc.
# ...
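
The revise and retain steps are left open above. One possible shape for the retain step is a small in-memory case base, sketched below. The `Case` and `CaseBase` names and their fields are hypothetical, and a real system would persist cases in a database rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    condition: str   # e.g. "diabetic"
    food_name: str   # food suggested by the system
    accepted: bool   # whether the patient agreed to the suggestion


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    def retain(self, case):
        """Store a case only when the patient accepted the suggestion."""
        if case.accepted:
            self.cases.append(case)

    def retrieve(self, condition):
        """Return food names previously accepted for the same condition."""
        return [c.food_name for c in self.cases if c.condition == condition]


case_base = CaseBase()
case_base.retain(Case("diabetic", "Bajra", accepted=True))
case_base.retain(Case("diabetic", "Rice", accepted=False))  # rejected, not stored

print(case_base.retrieve("diabetic"))  # ['Bajra']
```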



## Model Evaluation and Validation

- Evaluate the performance of the linear regression and KNN models on the testing dataset. Both models predict continuous values, so use regression metrics such as RMSE (Root Mean Squared Error), MAE (Mean Absolute Error), or R². Accuracy, precision, recall, and F1 score apply only if the task is reframed as classification (for example, low-GI versus high-GI foods).
- Validate the models by applying cross-validation techniques to assess their robustness and generalization performance.
- Interpret the results and analyze the model's performance to identify any limitations or areas for improvement.
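
Cross-validation for the two model types might look like the sketch below. It runs on synthetic data from `make_regression` so that it is self-contained; the sample size, fold count, and `n_neighbors=5` are placeholder assumptions, and the real feature matrix and GI target would replace `X_demo` and `y_demo`.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic regression data, used only so the sketch runs on its own
X_demo, y_demo = make_regression(
    n_samples=100, n_features=4, noise=10.0, random_state=42
)

models = {
    "linear_regression": LinearRegression(),
    "knn": KNeighborsRegressor(n_neighbors=5),
}

rmse_by_model = {}
for name, model in models.items():
    # scikit-learn reports negated errors so that higher scores are always better
    scores = cross_val_score(
        model, X_demo, y_demo, cv=5, scoring="neg_root_mean_squared_error"
    )
    rmse_by_model[name] = float(-scores.mean())
    print(f"{name}: mean RMSE over 5 folds = {rmse_by_model[name]:.2f}")
```

Comparing the mean RMSE across folds gives a steadier picture than a single train/test split, which is especially useful when the dataset is small.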


## Model Deployment and Application Development

- Once the models are trained, evaluated, and validated, integrate them into the food recommendation application for obese people with a user-friendly interface.
- Develop a user input module where patients can input their health conditions and receive personalized food recommendations based on the trained models.
- Implement the Case-Based Reasoning (CBR) component to store and retrieve past cases and recommendations, if the patients agree to it.
- Test the application thoroughly to ensure its functionality, usability, and security.
- Deploy the application to a suitable environment for real-world usage, considering factors such as scalability, performance, and security.
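
The user input module could map each health condition to a nutrient criterion, following the pairings listed under Model Development. The sketch below is one hedged way to do that: `CONDITION_RULES`, `recommend`, and the `Sodium`, `Potassium`, and `Iron` column names are hypothetical, and the ranking is a plain sort rather than the trained models.

```python
import pandas as pd

# Hypothetical mapping: condition -> (column to rank by, True if lower is better)
CONDITION_RULES = {
    "obese": ("GI", True),
    "diabetic": ("GI", True),
    "cardiovascular": ("Sodium", True),
    "kidney": ("Potassium", True),
    "anaemic": ("Iron", False),
}


def recommend(foods, condition, top_n=3):
    """Return the top_n food names ranked by the criterion for this condition."""
    if condition not in CONDITION_RULES:
        raise ValueError(f"Unknown condition: {condition}")
    column, prefer_low = CONDITION_RULES[condition]
    ranked = foods.sort_values(column, ascending=prefer_low)
    return ranked["FoodName"].head(top_n).tolist()


# GI values copied from the food_nutrition_df.head() output
foods = pd.DataFrame({
    "FoodName": ["Bajra", "Jowar", "Dry Maize", "Ragi", "Rice"],
    "GI": [28.0, 62.0, 56.0, 84.0, 73.0],
})

print(recommend(foods, "diabetic"))  # ['Bajra', 'Dry Maize', 'Jowar']
```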


In [None]:
%pip install streamlit
%pip install mysql-connector-python
# Install other required libraries


In [None]:
import streamlit as st
import mysql.connector


In [None]:
# Connect to MySQL database
def create_db_connection():
    conn = mysql.connector.connect(
        host='localhost',
        user='root',
        password='your_password',  # Replace with your MySQL password
        database='food_recommendation'  # Replace with your database name
    )
    return conn

conn = create_db_connection()
cursor = conn.cursor()


In [None]:
# Create Streamlit web application
def main():
    st.title('Food Recommendation Application')

    # Get user feedback and preferences
    user_feedback = st.text_area('Enter your feedback:')
    user_preferences = st.text_input('Enter your preferences (e.g., low GI, low sodium):')

    # Store user feedback and preferences in MySQL database
    if st.button('Submit'):
        try:
            # Insert user feedback and preferences into MySQL database
            cursor.execute("INSERT INTO user_feedback (feedback, preferences) VALUES (%s, %s)",
                           (user_feedback, user_preferences))
            conn.commit()
            st.success('User feedback and preferences submitted successfully!')
        except mysql.connector.Error as e:
            st.error(f'Error inserting data into MySQL database: {e}')

    # Display user feedback and preferences from MySQL database
    cursor.execute("SELECT feedback, preferences FROM user_feedback")
    user_feedbacks = cursor.fetchall()

    st.subheader('User Feedback and Preferences')
    for feedback, preferences in user_feedbacks:
        st.write(f'Feedback: {feedback}')
        st.write(f'Preferences: {preferences}')
        st.write('---')


if __name__ == '__main__':
    main()
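
The INSERT and SELECT statements above assume that a `user_feedback` table with `feedback` and `preferences` columns already exists. The sketch below shows a possible schema and the same insert and read flow. It swaps in Python's built-in `sqlite3` with an in-memory database so that it runs without a MySQL server; the schema itself is an assumption.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch)
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()

# Hypothetical schema matching the columns used by the INSERT statement
demo_cursor.execute(
    "CREATE TABLE IF NOT EXISTS user_feedback ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "feedback TEXT NOT NULL, "
    "preferences TEXT)"
)

# Parameterized insert: sqlite3 uses ? placeholders where mysql.connector uses %s
demo_cursor.execute(
    "INSERT INTO user_feedback (feedback, preferences) VALUES (?, ?)",
    ("Helpful suggestions", "low GI"),
)
demo_conn.commit()

demo_cursor.execute("SELECT feedback, preferences FROM user_feedback")
rows = demo_cursor.fetchall()
print(rows)
```

In MySQL the key column would typically be declared as `id INT AUTO_INCREMENT PRIMARY KEY`. Passing values as parameters, as both versions do, keeps user input out of the SQL string.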


In [None]:
# Save the Streamlit code above as app.py, then launch it from the notebook
!streamlit run app.py


## Ethical Considerations

- Ensure that all ethical considerations, such as data privacy, security, and informed consent, are taken into account throughout the project.
- Implement necessary measures to protect patients' personal information and comply with relevant regulations and guidelines.
- Document and report any ethical considerations and decisions made during the project.
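
One common measure for protecting personal information is to store a keyed hash in place of the raw patient identifier. The sketch below illustrates the idea with the standard library; the function name `pseudonymize`, the identifier format, and the key handling are assumptions, and it does not by itself ensure regulatory compliance.

```python
import hashlib
import hmac


def pseudonymize(patient_id, secret_key):
    """Return a keyed SHA-256 digest to store instead of the raw identifier."""
    return hmac.new(
        secret_key, patient_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Placeholder key: in practice, load it from a secrets store, not from source code
secret_key = b"replace-with-a-secret-key"

token = pseudonymize("patient-001", secret_key)
print(token)
```

The same identifier always maps to the same token, so stored cases can still be linked to a returning patient without keeping the identifier itself.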


## Project Conclusion and Documentation

- Summarize the findings, outcomes, and contributions of the project.
- Provide recommendations for future improvements or extensions to the food recommendation application.
- Document the entire project, including the datasets, preprocessing steps, feature engineering, model development, evaluation results, application development, and ethical considerations.
- Prepare a final report.