<a href="https://colab.research.google.com/github/CaloCare/MachineLearning/blob/main/Nutritional_Evaluation_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## CaloCare

### Importing Packages/Libraries

In [1]:
# Import required libraries for data manipulation and machine learning
import pandas as pd  # For handling datasets
from sklearn.model_selection import train_test_split  # For splitting datasets
import tensorflow as tf  # For building and training neural networks
import numpy as np  # For numerical computations
import os  # For interacting with the operating system

In [2]:
# Install the Keras Tuner for hyperparameter optimization
!pip install keras-tuner --upgrade
# Install tensorflow
!pip install --upgrade tensorflow



In [3]:
# Import required modules for building and tuning neural network models
from tensorflow.keras.models import Sequential  # For creating sequential neural network models
from tensorflow.keras.layers import Dense  # For adding dense layers to the model
from keras_tuner import RandomSearch  # For performing random search hyperparameter tuning (the old `kerastuner` module name is deprecated)


**Insight**

These cells set up the environment for building and tuning a neural network with TensorFlow and Keras. They import Pandas for data manipulation, Scikit-learn's `train_test_split` for splitting the dataset, and TensorFlow for constructing and training the model. Keras Tuner is installed for hyperparameter optimization, and its `RandomSearch` tuner is imported to sample candidate model configurations.

### Data Wrangling

In [4]:
# Install Kaggle CLI for downloading datasets
!pip install kaggle



In [5]:
# Place the Kaggle API token where the CLI expects it and restrict its permissions
# (-p avoids an error when the directory already exists)
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json


In [6]:
# Download the Indonesian food and drink nutrition dataset from Kaggle
!kaggle datasets download -d anasfikrihanif/indonesian-food-and-drink-nutrition-dataset

Dataset URL: https://www.kaggle.com/datasets/anasfikrihanif/indonesian-food-and-drink-nutrition-dataset
License(s): CC0-1.0
indonesian-food-and-drink-nutrition-dataset.zip: Skipping, found more recently modified local copy (use --force to force download)


In [7]:
# Unzip the downloaded dataset
import zipfile
with zipfile.ZipFile('indonesian-food-and-drink-nutrition-dataset.zip', 'r') as zip_ref:
    zip_ref.extractall('indonesian-food-and-drink-nutrition-dataset')  # Extract all files to the specified folder

# Load the dataset into a Pandas DataFrame
data = pd.read_csv('/content/indonesian-food-and-drink-nutrition-dataset/nutrition.csv')  # Read the CSV file

**Insight**

These cells download and extract the Indonesian food and drink nutrition dataset from Kaggle using the Kaggle CLI. After the CLI is installed and the API key (`kaggle.json`) is placed in `~/.kaggle`, the dataset archive is downloaded and unzipped. The file `nutrition.csv` is then loaded into a Pandas DataFrame, ready for the labeling and modeling steps that follow.

#### Splitting Data

In [8]:
# Calculate the total nutrition by summing calories, proteins, fat, and carbohydrates
data['total_nutrition'] = data['calories'] + data['proteins'] + data['fat'] + data['carbohydrate']

# Create nutrition categories based on the total nutrition value
# Define bins and labels for categorization
bins = [0, 200, 400, 600, 800, np.inf]  # Define the nutrition range intervals
labels = [1, 2, 3, 4, 5]  # Assign labels for each bin
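# include_lowest=True keeps an item whose total is exactly 0 in the first bin instead of turning it into NaN
data['evaluation'] = pd.cut(data['total_nutrition'], bins=bins, labels=labels, include_lowest=True)  # Categorize based on total nutrition

# Convert the categorical labels 1-5 to zero-based integer codes 0-4,
# which is the form the sparse categorical cross-entropy loss expects
data['evaluation'] = data['evaluation'].cat.codes  # Convert categorical labels to integer codes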

# Define the feature set (X) and target variable (y)
# In this case, the target is 'evaluation' and the features are nutritional components
X = data[['calories', 'proteins', 'fat', 'carbohydrate']]  # Features (independent variables)
y = data['evaluation']  # Target variable (dependent variable)

# Split the dataset into training and validation sets
# 80% for training and 20% for validation, ensuring random splitting with a fixed seed (random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)  # Split the data

**Insight**

This code computes a total nutrition value for each food item as the sum of its calories, proteins, fat, and carbohydrates. The total is binned into five categories with upper edges at 200, 400, 600, and 800 (anything above 800 falls in the fifth), and the resulting labels 1 to 5 are converted to the integer codes 0 to 4 for modeling. The four nutritional components serve as the features, and the 'evaluation' column is the target. Finally, the dataset is split into training and validation sets (80% for training, 20% for validation) with a fixed random seed.
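
The binning rule described above can be written out by hand to check boundary cases. The following is a minimal sketch in plain Python rather than the notebook's own code: `nutrition_category` is a hypothetical helper, and it assumes right-closed bins (the default in `pd.cut`), so a total of exactly 200 falls in category 1.

```python
from bisect import bisect_left

# Upper edges of the first four bins; anything above 800 falls in category 5.
UPPER_EDGES = [200, 400, 600, 800]

def nutrition_category(calories, proteins, fat, carbohydrate):
    """Return the 1-based category for the summed nutrition values (hypothetical helper)."""
    total = calories + proteins + fat + carbohydrate
    # bisect_left counts the edges strictly below `total`, which matches
    # right-closed intervals such as (200, 400].
    return bisect_left(UPPER_EDGES, total) + 1

print(nutrition_category(150, 20, 20, 10))      # total 200 -> category 1
print(nutrition_category(513, 23.7, 37, 21.3))  # total of about 595 -> category 3
print(nutrition_category(900, 0, 0, 0))         # total 900 -> category 5
```

Subtracting 1 from the returned value gives the zero-based code (0 to 4) that the notebook stores in the 'evaluation' column. The second call uses the sample values entered in the Model Evaluation section: the rule places them in category 3, while the model predicts 4, which is plausible for a total that sits only about 5 below the 600 boundary.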

#### Hyper-Parameter Tuning

In [9]:
# Function to build the model with hyperparameter tuning using Keras Tuner
def build_model(hp):
    # Initialize a Sequential model
    model = Sequential()

    # Add the first dense layer with a tunable number of units
    # The number of units is selected from a range (32 to 512) with a step size of 32
    model.add(Dense(units=hp.Int('units', min_value=32, max_value=512, step=32),
                    activation='relu',  # Use ReLU activation function
                    input_shape=(X_train.shape[1],)))  # Define the input shape based on training data

    # Add the output layer with 5 units, corresponding to the 5 classes in the target variable
    model.add(Dense(5, activation='softmax'))  # Softmax activation for multi-class classification

    # Compile the model with the Adam optimizer and sparse categorical cross-entropy loss
    # Sparse categorical cross-entropy is used as labels are integers, not one-hot encoded
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])  # Use accuracy as the evaluation metric

    # Return the compiled model
    return model

**Insight**

This function defines the model that Keras Tuner builds for each trial. It creates a Sequential model whose hidden dense layer has a tunable number of units, chosen from 32 to 512 in steps of 32. The hidden layer uses the ReLU activation, and the 5-unit output layer uses softmax, which suits multi-class classification. The model is compiled with the Adam optimizer and sparse categorical cross-entropy loss, because the target labels are integers rather than one-hot vectors.
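
To make the output layer and loss concrete, here is a small sketch in plain Python of what softmax and sparse categorical cross-entropy compute for a single sample. It illustrates the math only and is not Keras's implementation; the scores in `logits` are made up.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(true_class, probs):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[true_class])

logits = [0.5, 1.0, 3.0, 1.5, 0.2]  # made-up scores for the 5 classes
probs = softmax(logits)
loss = sparse_categorical_crossentropy(2, probs)  # integer label 2, no one-hot vector needed
print([round(p, 3) for p in probs], round(loss, 3))
```

The integer label indexes directly into the probability vector, which is why the 'evaluation' codes 0 to 4 can be used as targets without one-hot encoding.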

In [10]:
# Initialize the RandomSearch tuner from Keras Tuner to optimize the hyperparameters
tuner = RandomSearch(
    build_model,  # The function used to build the model
    objective='val_accuracy',  # The objective to optimize is validation accuracy
    max_trials=10,  # Set the maximum number of trials (experiments)
    executions_per_trial=3,  # Number of executions per trial to average out results
    directory='my_dir',  # Directory to store the tuner logs and results
    project_name='hyperparam_tuning'  # Name of the project (for organization purposes)
)

# Display the summary of the search space (hyperparameters) to be tuned
tuner.search_space_summary()

# Start the hyperparameter search with the training data (X_train, y_train)
# The search will run for 50 epochs and use the validation data (X_val, y_val) for evaluation
tuner.search(X_train, y_train, epochs=50, validation_data=(X_val, y_val))

# Display the summary of the tuner results after the search
tuner.results_summary()

Trial 10 Complete [00h 00m 29s]
val_accuracy: 0.9012345671653748

Best val_accuracy So Far: 0.9012345671653748
Total elapsed time: 00h 06m 07s
Results summary
Results in my_dir/hyperparam_tuning
Showing 10 best trials
Objective(name="val_accuracy", direction="max")

Trial 09 summary
Hyperparameters:
units: 416
Score: 0.9012345671653748

Trial 05 summary
Hyperparameters:
units: 480
Score: 0.895061731338501

Trial 03 summary
Hyperparameters:
units: 224
Score: 0.89012344678243

Trial 00 summary
Hyperparameters:
units: 256
Score: 0.8888888955116272

Trial 07 summary
Hyperparameters:
units: 288
Score: 0.8876543243726095

Trial 06 summary
Hyperparameters:
units: 384
Score: 0.8864197532335917

Trial 08 summary
Hyperparameters:
units: 448
Score: 0.8802469174067179

Trial 04 summary
Hyperparameters:
units: 96
Score: 0.8753086527188619

Trial 01 summary
Hyperparameters:
units: 192
Score: 0.8728395104408264

Trial 02 summary
Hyperparameters:
units: 32
Score: 0.814814825852712


In [11]:
# Retrieve the best model from the tuner after hyperparameter search
# Get the best model based on the highest validation accuracy
best_model = tuner.get_best_models(num_models=1)[0]  # Select the best model (top 1)

# Evaluate the best model on the validation set
# The evaluate method returns the loss and accuracy metrics
val_loss, val_acc = best_model.evaluate(X_val, y_val)

# Print the evaluation results (loss and accuracy) on the validation set
print(f'Validation Loss: {val_loss}')  # Display the validation loss
print(f'Validation Accuracy: {val_acc}')  # Display the validation accuracy



9/9 ━━━━━━━━━━━━━━━━━━━━ 1s 20ms/step - accuracy: 0.9328 - loss: 0.3083
Validation Loss: 0.34088724851608276
Validation Accuracy: 0.9222221970558167


**Insight**

This code uses Keras Tuner's `RandomSearch` to tune the number of units in the hidden layer. The objective is validation accuracy, and the search runs 10 trials with 3 executions per trial, so each trial's score is an average that is less sensitive to random weight initialization. Every execution trains for 50 epochs and is scored on the validation data. The best trial used 416 units, with an averaged validation accuracy of about 0.901. The best model retrieved from the tuner is then evaluated on the same validation set, where it reaches a loss of about 0.341 and an accuracy of about 0.922.
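
The search procedure can be illustrated without training anything. The sketch below is plain Python and does not reproduce Keras Tuner's internals: `random_search` and `toy_score` are hypothetical names, `toy_score` stands in for "train the model and return its validation accuracy", and sampling without replacement is an assumption made for simplicity.

```python
import random
import statistics

def random_search(score_fn, candidates, max_trials, executions_per_trial, seed=0):
    """Sample configurations at random and keep the one with the best mean score."""
    rng = random.Random(seed)
    # Sample without replacement so no configuration is tried twice (assumption)
    sampled = rng.sample(candidates, k=min(max_trials, len(candidates)))
    results = []
    for units in sampled:
        # Repeat each configuration and average, mirroring executions_per_trial
        scores = [score_fn(units, rng) for _ in range(executions_per_trial)]
        results.append((statistics.mean(scores), units))
    return max(results), results

def toy_score(units, rng):
    """Hypothetical stand-in for a real training run; returns a made-up accuracy."""
    return 0.80 + 0.10 * (units / 512) + rng.uniform(-0.01, 0.01)

candidates = list(range(32, 513, 32))  # 32, 64, ..., 512: the 16 values the tuner can pick
(best_score, best_units), results = random_search(
    toy_score, candidates, max_trials=10, executions_per_trial=3
)
print(len(candidates), best_units, round(best_score, 4))
```

With 16 candidate values and 10 trials, a random search covers most but not all of this search space, so the reported best configuration is the best among those sampled.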

### Model Evaluation


In [12]:
# Function to predict the evaluation category based on nutritional input
def predict_evaluation(calories, proteins, fat, carbohydrate):
    # Format the user input as an array that can be passed to the model for prediction
    input_data = np.array([[calories, proteins, fat, carbohydrate]])

    # Make a prediction using the trained model
    prediction = best_model.predict(input_data)

    # Convert the prediction to the corresponding evaluation category
    # The model output is a probability distribution, so we select the class with the highest probability
    predicted_category = np.argmax(prediction, axis=1)[0] + 1  # Add 1 because categories start from 1

    return predicted_category  # Return the predicted evaluation category

**Insight**

This function formats the four nutritional values (calories, proteins, fat, and carbohydrates) as a single-row array and passes it to the trained model. The model returns a probability for each of the five classes, and `np.argmax` selects the most probable one. Because the model's classes are the zero-based codes 0 to 4, the function adds 1 so that the returned category runs from 1 to 5.
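
The step from probabilities to a category can be shown on its own. This is a minimal sketch in plain Python with made-up probabilities; `to_category` is a hypothetical helper that mirrors the argmax-plus-one logic described above.

```python
def to_category(probabilities):
    """Map a list of 5 class probabilities to a 1-based category."""
    if len(probabilities) != 5:
        raise ValueError("expected one probability per class (5 values)")
    # The index of the largest probability is the zero-based class code
    class_code = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_code + 1

print(to_category([0.02, 0.08, 0.25, 0.60, 0.05]))  # largest value is at index 3 -> category 4
```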

In [13]:
# Function to take user input for nutritional values and predict the evaluation category
def user_input():
    # Prompt (Indonesian): "Enter the nutritional values to get an evaluation:"
    print("Masukkan nilai-nilai nutrisi untuk mendapatkan evaluasi:")

    # Ask the user for calories (Kalori), proteins (Protein), fat (Lemak), and carbohydrates (Karbohidrat)
    calories = float(input("Kalori (cal): "))  # Input calories
    proteins = float(input("Protein (g): "))  # Input protein value
    fat = float(input("Lemak (g): "))  # Input fat value
    carbohydrate = float(input("Karbohidrat (g): "))  # Input carbohydrate value

    # Predict the evaluation category based on the provided nutritional values
    evaluation = predict_evaluation(calories, proteins, fat, carbohydrate)

    # Display the predicted category; the message reads "The evaluation score for the given nutrition is: ..."
    print(f'Nilai evaluasi untuk nutrisi yang diberikan adalah: {evaluation}')  # Output the evaluation result

# Call the user_input function to execute the prediction based on user input
user_input()

Masukkan nilai-nilai nutrisi untuk mendapatkan evaluasi:
Kalori (cal): 513
Protein (g): 23.7
Lemak (g): 37
Karbohidrat (g): 21.3
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 171ms/step
Nilai evaluasi untuk nutrisi yang diberikan adalah: 4


In [14]:
# Function to take user input for nutritional values and predict the evaluation category with a description
def user_input():
    print("Masukkan nilai-nilai nutrisi untuk mendapatkan evaluasi:")  # Prompt the user to input nutritional values

    # Ask the user to input values for calories, proteins, fat, and carbohydrates
    calories = float(input("Kalori (cal): "))  # Input calories
    proteins = float(input("Protein (g): "))  # Input protein value
    fat = float(input("Lemak (g): "))  # Input fat value
    carbohydrate = float(input("Karbohidrat (g): "))  # Input carbohydrate value

    # Predict the evaluation category based on the provided nutritional values
    evaluation = predict_evaluation(calories, proteins, fat, carbohydrate)

    # Dictionary containing an Indonesian description for each evaluation category
    descriptions = {
        1: "buruk untuk Anda",  # Category 1: "bad for you"
        2: "tidak terlalu baik untuk Anda",  # Category 2: "not very good for you"
        3: "cukup baik untuk Anda",  # Category 3: "fairly good for you"
        4: "baik untuk Anda",  # Category 4: "good for you"
        5: "sangat baik untuk Anda"  # Category 5: "very good for you"
    }

    # Display the predicted evaluation category with its description
    print(f'Nilai evaluasi untuk nutrisi yang diberikan adalah: {evaluation} ({descriptions[evaluation]})')

# Call the user_input function to execute the prediction based on user input
user_input()

Masukkan nilai-nilai nutrisi untuk mendapatkan evaluasi:
Kalori (cal): 513
Protein (g): 23.7
Lemak (g): 37
Karbohidrat (g): 21.23
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
Nilai evaluasi untuk nutrisi yang diberikan adalah: 4 (baik untuk Anda)
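
The `user_input` cells above pass whatever is typed straight to `float`, so a typo or a negative number either raises a bare error or reaches the model unchecked. One possible way to guard the input is sketched below. `parse_nutrient` is a hypothetical helper that is not part of the notebook, and accepting a decimal comma is an assumption aimed at Indonesian-style number entry.

```python
import math

def parse_nutrient(text, name):
    """Convert user text to a finite, non-negative float, or raise a readable error."""
    try:
        value = float(text.strip().replace(",", "."))  # accept "21,3" as well as "21.3"
    except ValueError:
        raise ValueError(f"{name}: '{text}' is not a number") from None
    if not math.isfinite(value) or value < 0:
        raise ValueError(f"{name}: value must be a finite, non-negative number")
    return value

print(parse_nutrient("513", "calories"))
print(parse_nutrient(" 21,3 ", "carbohydrate"))
```

In `user_input`, each `float(input(...))` call could be wrapped this way, for example `parse_nutrient(input("Kalori (cal): "), "calories")`.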


### Saving the Model to TFLite

In [20]:
# Export the best model in SavedModel format to the 'saved_model/1' directory
export_dir = 'saved_model/1'
tf.saved_model.save(best_model, export_dir)

converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# List float32 as the supported type for the converted model
converter.target_spec.supported_types = [tf.float32]

# Representative dataset for calibration, used only if the converter applies quantization
def representative_data_gen():
    # Cast the features to float32 up front; tf.Tensor objects have no .astype method by default
    features = X_train.to_numpy().astype(np.float32)
    for input_value in tf.data.Dataset.from_tensor_slices(features).batch(1).take(100):
        yield [input_value]

converter.representative_dataset = representative_data_gen

# Convert to TFLite
tflite_model = converter.convert()

# Save the converted TFLite model
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

print("Model successfully converted to TFLite.")

Model successfully converted to TFLite.


### Conclusion

The notebook builds a system that predicts a nutritional evaluation category for food from user input. It collects values for calories, proteins, fat, and carbohydrates, predicts a category from 1 to 5 with the tuned model, and attaches a short description of that category. The best model reaches a validation loss of about 0.34 and a validation accuracy of about 92%. Because the same validation set was used to select the hyperparameters, these figures are likely somewhat optimistic, and a separate held-out test set would give a cleaner estimate.