# Linear Regression Model for Food Weight Prediction

## 1. Model Objective
The **Linear Regression** model predicts the weight of food (in grams) from the output of a **Computer Vision (YOLO)** model, which provides a bounding box (the food's dimensions in the image) and a class label (the type of food).

## 2. Logic and Steps for Building the ML Model

### a. Problem to Solve:
We want to predict the weight of food from the bounding box size reported by the CV model. Because the CV model reports the width, height, and food label of each detection, we can use these values as features for a regression model that predicts the food's weight.

### b. Input from CV (Computer Vision) Model:
The **YOLO** model will provide the following output:
- **Bounding Box**:
  - `width`: The width of the bounding box around the food object, in pixels.
  - `height`: The height of the bounding box around the food object, in pixels.
- **Class Label**: The food type (e.g., rice, fried chicken, vegetables).
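
As a minimal sketch of how one detection could be turned into these inputs, the example below assumes the detector reports a box as pixel corner coordinates. The `Detection` class and `to_features` helper are hypothetical names used only for illustration; they are not part of any YOLO library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one detection (assumed corner format, in pixels)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    class_label: str


def to_features(det: Detection) -> dict:
    """Convert corner coordinates into the width/height/class_label inputs."""
    return {
        "width": det.x_max - det.x_min,
        "height": det.y_max - det.y_min,
        "class_label": det.class_label,
    }


example = Detection(x_min=40, y_min=60, x_max=220, y_max=180, class_label="nasi")
features = to_features(example)
print(features)  # {'width': 180, 'height': 120, 'class_label': 'nasi'}
```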

### c. Variables Used in the Model:
The linear regression model will use the following variables:
- **Bounding box width** (`width`)
- **Bounding box height** (`height`)
- **Food class** (`class_label`), which is one-hot encoded so that it can be used as numerical input to the regression model.
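
The one-hot encoding step can be sketched with `pandas.get_dummies`. The three rows below are made-up values for illustration, and `dtype=int` is an optional choice here so that the indicator columns show as 0/1 rather than booleans.

```python
import pandas as pd

# Three made-up detections, one per food class
detections = pd.DataFrame({
    "width": [120, 200, 90],
    "height": [80, 150, 60],
    "class_label": ["nasi", "ayam_goreng", "sayuran"],
})

# Replace the string column with one 0/1 indicator column per class
encoded = pd.get_dummies(detections, columns=["class_label"], dtype=int)
print(encoded.columns.tolist())
# ['width', 'height', 'class_label_ayam_goreng', 'class_label_nasi', 'class_label_sayuran']
```

Because the indicator columns always sum to 1, they are collinear with the model's intercept. `LinearRegression` still fits and predicts in that case, but the individual class coefficients are not uniquely determined; passing `drop_first=True` to `get_dummies` is one way to avoid this if the coefficients need to be interpreted.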

### d. Target to Predict:
The target variable or output is the **weight** of the food (in grams).

### e. Logic Behind Linear Regression:
Linear regression fits a weighted sum of the input variables (features) to approximate the target (food weight). For example:
- The larger the `width` and `height`, the heavier the food is expected to be.
- The model learns these patterns from the data and predicts the weight from the bounding box dimensions and food type.
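
Written out, the relationship described above can be sketched as the equation below. The symbols are notation introduced here for illustration: $\beta_0$ is the intercept, $\beta_w$ and $\beta_h$ are the learned weights for `width` and `height`, and $\gamma_c$ is the learned offset for food class $c$.

$$
\widehat{\text{weight}} = \beta_0 + \beta_w \cdot \text{width} + \beta_h \cdot \text{height} + \sum_{c} \gamma_c \cdot \mathbb{1}[\text{class} = c]
$$

The indicator $\mathbb{1}[\text{class} = c]$ is 1 for the detected food class and 0 for every other class, which is exactly what the one-hot encoded columns provide.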

## 3. Dummy Data for Linear Regression Model

### a. Input Features (Data from CV Model):
We will create dummy data for the input provided by the CV model, including:
- **width**: Bounding box width (e.g., between 50 and 300 pixels).
- **height**: Bounding box height (e.g., between 50 and 300 pixels).
- **class label**: Food type label (e.g., rice, fried chicken, vegetables).

### b. Target Output (Food Weight in Grams):
We will also create dummy data for the food weight based on the bounding box dimensions and food label. Example:
- **weight**: Food weight in grams (e.g., between 50g and 500g).

## 4. Model Building Logic Explanation

### Creating Dummy Data:
- **width** and **height**: These represent the bounding box dimensions of the food detected by the CV model.
- **class_label**: The food label predicted by the CV model, which is one-hot encoded for use in the regression model.
- **weight**: The target variable (food weight), calculated as a linear function of `width` and `height` plus a fixed offset per `class_label` and some random noise. For example, for the same bounding box, "rice" is lighter than "fried chicken."

### Linear Regression Model:
- The model takes input in the form of bounding box dimensions (`width`, `height`) and the one-hot encoded food class label.
- The target output is the food weight.

### Training Process:
- The model is trained on the training data (`X_train` and `y_train`), from which it learns the linear relationship between bounding box size, food type, and food weight.
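
The training step can be sketched end to end as follows. This is an illustrative example rather than the notebook's exact cell: it reuses the dummy-data rule from this document (0.5 per pixel of width, 0.3 per pixel of height, and class offsets of 50, 100, and 20), but the sample size, the seeded generator, and the error metrics are assumptions added to show what the model learns.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)  # seeded so the sketch is repeatable
n = 500  # assumed sample size for this sketch

df = pd.DataFrame({
    "width": rng.integers(50, 300, n),
    "height": rng.integers(50, 300, n),
    "class_label": rng.choice(["nasi", "ayam_goreng", "sayuran"], n),
})

# Same dummy rule as in this document: linear in width/height plus a class offset and noise
offsets = {"nasi": 50, "ayam_goreng": 100, "sayuran": 20}
y = (0.5 * df["width"] + 0.3 * df["height"]
     + df["class_label"].map(offsets) + rng.normal(0, 10, n))

X = pd.get_dummies(df, columns=["class_label"], dtype=int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)

# The width and height coefficients should land close to 0.5 and 0.3
coefficients = dict(zip(X.columns, model.coef_))
print({name: round(value, 3) for name, value in coefficients.items()})

# Check the fit on the held-out test set
predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"MAE: {mae:.1f} g, R^2: {r2:.3f}")
```

Comparing the learned coefficients with the constants used to generate the data is a quick sanity check that is only possible with dummy data; with real measurements, the held-out error is the more meaningful number.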

### Prediction:
- Once trained, the model can be used to predict the weight of food from new data.
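
One practical detail when predicting from new data is that a single detection only produces the indicator column for its own class, so its columns have to be aligned with the ones used in training. A minimal sketch, assuming the training columns listed in `train_columns` below (the order that `get_dummies` produces for the three classes in this document):

```python
import pandas as pd

# Assumed feature columns of the training data, in training order
train_columns = [
    "width", "height",
    "class_label_ayam_goreng", "class_label_nasi", "class_label_sayuran",
]

# One new detection from the CV model (made-up values)
new = pd.DataFrame([{"width": 180, "height": 120, "class_label": "ayam_goreng"}])

# Encoding a single row yields only 'class_label_ayam_goreng';
# reindex adds the missing class columns filled with 0 and fixes the order
X_new = pd.get_dummies(new, columns=["class_label"], dtype=int)
X_new = X_new.reindex(columns=train_columns, fill_value=0)
print(X_new.iloc[0].tolist())  # [180, 120, 1, 0, 0]
```

The aligned `X_new` can then be passed to `model.predict(X_new)` on a model trained with the same columns.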


In [4]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Create dummy data for the bounding boxes and food classes.
# No random seed is set, so the generated values differ between runs.
data = {
    'width': np.random.randint(50, 300, 100),  # bounding box width
    'height': np.random.randint(50, 300, 100),  # bounding box height
    # food labels: nasi = rice, ayam_goreng = fried chicken, sayuran = vegetables
    'class_label': np.random.choice(['nasi', 'ayam_goreng', 'sayuran'], 100)
}

# One-hot encode class_label (string labels cannot be fed directly into a linear model)
df = pd.DataFrame(data)
df = pd.get_dummies(df, columns=['class_label'])

# Add the target column (dummy weight): the food weight depends on the bounding box
# width and height, with a fixed offset per food class
df['weight'] = (0.5 * df['width']) + (0.3 * df['height']) + \
    np.where(df['class_label_nasi'] == 1, 50, 0) + \
    np.where(df['class_label_ayam_goreng'] == 1, 100, 0) + \
    np.where(df['class_label_sayuran'] == 1, 20, 0) + \
    np.random.normal(0, 10, 100)  # added noise to make the data more realistic

# Separate the data into inputs (X) and target (y)
X = df.drop(columns=['weight'])  # model input: width, height, and one-hot encoded class labels
y = df['weight']  # model target: food weight

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the linear regression model
model = LinearRegression()

# Train the model
model.fit(X_train, y_train)

# Make predictions on the test set
predictions = model.predict(X_test)

# Show the first five predictions
print("Predicted Food Weight: ", predictions[:5])


Predicted Food Weight:  [168.82956855 135.70762741 148.650115   195.80224435 115.64966022]
