# Bangladeshi Local Food Classification & Nutrition Suggestion System
## University Final Year Thesis Project

**Objective:** Classify Bangladeshi local food images and provide nutritional information using Deep Learning

**Dataset Structure:** Food images organized in folders by food type (Alu Vorta, Bakorkhani, Bhapa, Burger, Chicken, etc.)

---

## Project Workflow:
1. **Data Import & Exploration**
2. **Data Preprocessing & Cleaning**
3. **Exploratory Data Analysis (EDA)**
4. **Data Augmentation**
5. **Model Building (Transfer Learning)**
6. **Model Training & Evaluation**
7. **Model Comparison & Selection**
8. **App Development (Gradio/Streamlit)**

## Step 1: Install Required Libraries

**Note:** This notebook is configured for local execution (not Google Colab)

In [4]:
# Install required libraries (uncomment if needed)
# !pip install tensorflow keras pillow matplotlib seaborn scikit-learn pandas numpy
# !pip install gradio

print("✓ Make sure you have installed: tensorflow, keras, pillow, matplotlib, seaborn, scikit-learn, pandas, numpy, gradio")
print("✓ You can install them using: pip install tensorflow keras pillow matplotlib seaborn scikit-learn pandas numpy gradio")

✓ Make sure you have installed: tensorflow, keras, pillow, matplotlib, seaborn, scikit-learn, pandas, numpy, gradio
✓ You can install them using: pip install tensorflow keras pillow matplotlib seaborn scikit-learn pandas numpy gradio


In [4]:
# Install packages using pip
%pip install --upgrade tensorflow pillow matplotlib seaborn scikit-learn pandas numpy gradio

print("\n" + "="*70)
print("⚠️⚠️⚠️  CRITICAL: RESTART THE KERNEL NOW!  ⚠️⚠️⚠️")
print("="*70)
print("\n1. Click the circular arrow 🔄 button in the toolbar")
print("2. Wait for kernel to restart")
print("3. Then run Step 2 (Import Libraries cell)")
print("\n" + "="*70)

Note: you may need to restart the kernel to use updated packages.
Collecting tensorflow
  Using cached tensorflow-2.20.0-cp312-cp312-win_amd64.whl.metadata (4.6 kB)
Collecting pillow
  Downloading pillow-12.0.0-cp312-cp312-win_amd64.whl.metadata (9.0 kB)
Collecting matplotlib
  Downloading matplotlib-3.10.7-cp312-cp312-win_amd64.whl.metadata (11 kB)
Collecting scikit-learn
  Downloading scikit_learn-1.7.2-cp312-cp312-win_amd64.whl.metadata (11 kB)
Collecting pandas
  Downloading pandas-2.3.3-cp312-cp312-win_amd64.whl.metadata (19 kB)
Collecting numpy
  Downloading numpy-2.3.5-cp312-cp312-win_amd64.whl.metadata (60 kB)
     ---------------------------------------- 0.0/60.9 kB ? eta -:--:--
     ---------------------------------------- 60.9/60.9 kB 1.6 MB/s eta 0:00:00
Collecting threadpoolctl>=3.1.0 (from scikit-learn)
  Downloading threadpoolctl-3.6.0-py3-none-any.whl.metadata (13 kB)
Using cached tensorflow-2.20.0-cp312-cp312-win_amd64.whl (331.9 MB)
Downloading pillow-12.0.0-cp312-cp3

  You can safely remove it manually.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
streamlit 1.32.0 requires pillow<11,>=7.1.0, but you have pillow 12.0.0 which is incompatible.
streamlit 1.32.0 requires protobuf<5,>=3.20, but you have protobuf 6.33.1 which is incompatible.
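
The conflict reported above comes from `streamlit 1.32.0` pinning `pillow<11` and `protobuf<5` while the upgrade pulled in newer versions. A minimal sketch for listing what is actually installed in the active kernel, using only the standard library (`report_versions` is a hypothetical helper, not part of the notebook):

```python
from importlib import metadata


def report_versions(packages):
    """Return {package: installed version or None} for the active environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this kernel
    return versions


# Packages named in the pip output above
for name, version in report_versions(["pillow", "protobuf", "streamlit", "tensorflow"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Installing into a fresh virtual environment is one way to avoid inheriting such pins from unrelated packages.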


## Step 2: Import Libraries

In [1]:
# Standard Libraries
import os
import zipfile
import shutil
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Image Processing
from PIL import Image

# Deep Learning Libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
from tensorflow.keras.applications import (
    VGG16, VGG19, ResNet50, InceptionV3, MobileNetV2, EfficientNetB0
)
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical

# Sklearn
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

print(f"✅ TensorFlow Version: {tf.__version__}")
print(f"✅ GPU Available: {len(tf.config.list_physical_devices('GPU'))} GPU(s)")
print("✅ All libraries imported successfully!")

ImportError: Traceback (most recent call last):
  File "c:\Users\user\anaconda3\Lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 73, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.


Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors for some common causes and solutions.
If you need help, create an issue at https://github.com/tensorflow/tensorflow/issues and include the entire stack trace above this error message.
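
The traceback above means the TensorFlow package was found but its native runtime could not be loaded; the linked install-errors page lists common causes. A minimal sketch, using only the standard library, that turns such a failure into a short message before the rest of the notebook runs (`try_import` is a hypothetical helper):

```python
import importlib


def try_import(module_name):
    """Import a module by name; return (module, None) or (None, error message)."""
    try:
        return importlib.import_module(module_name), None
    except ImportError as exc:  # also covers ModuleNotFoundError and DLL load failures
        return None, f"{module_name} could not be imported: {exc}"


tf_module, error = try_import("tensorflow")
if error:
    print(error)
    print("See https://www.tensorflow.org/install/errors before continuing.")
```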

## Step 3: Load Dataset from Local Computer

**Instructions:**
1. Extract `local_food_pre.zip` so that its class folders sit inside `G:\Food\Code\food_dataset`
2. Or update the `extract_path` variable below to point to wherever the extracted folder is located
3. The code checks that the folder exists and lists each food category with its file count

In [None]:
import os
import sys

# ========================================
# 📝 DATASET FOLDER PATH:
# ========================================
# The ZIP file has been extracted locally to this folder
extract_path = r'G:\Food\Code\food_dataset'

print(f"🔍 Python executable: {sys.executable}")
print(f"📂 Dataset folder: {extract_path}\n")

# Check if folder exists
if not os.path.exists(extract_path):
    print(f"❌ ERROR: Dataset folder not found at: {extract_path}")
    print(f"\n💡 The folder doesn't exist. Please make sure:")
    print(f"   1. Extract 'local_food_pre.zip' to 'G:\\Food\\Code\\food_dataset'")
    print(f"   2. Or update the 'extract_path' variable above")
    print(f"\n⚠️ NOTE: If the folder exists on your computer but the notebook can't see it,")
    print(f"         you may be using a REMOTE kernel. Click the kernel selector in the")
    print(f"         top-right and choose a LOCAL Python environment.")
else:
    try:
        # List folders
        items = os.listdir(extract_path)
        print(f"✅ Dataset folder found! ({len(items)} items detected)\n")
        print("🍽️ Food categories detected:")
        
        categories = []
        for item in items:
            item_path = os.path.join(extract_path, item)
            if os.path.isdir(item_path):
                num_files = len([f for f in os.listdir(item_path) 
                                if os.path.isfile(os.path.join(item_path, f))])
                categories.append((item, num_files))
                print(f"  • {item}: {num_files} images")
        
        print(f"\n✅ Total categories: {len(categories)}")
        if len(categories) == 0:
            print("\n⚠️ WARNING: Folder exists but no categories found!")
            print(f"   Either the folder is empty or the notebook is using a REMOTE kernel")
            print(f"   that can't access your local G:\\ drive.")
            print(f"\n💡 SOLUTION: Extract the dataset into this folder, or switch to a LOCAL Python kernel:")
            print(f"   1. Click the kernel selector in top-right corner")
            print(f"   2. Select 'Python Environments...'")
            print(f"   3. Choose your local Python installation")
        else:
            print(f"✅ Dataset is ready for training!")
    except OSError as e:
        # Permission problems and unreachable drives both end up here
        print(f"❌ ERROR accessing folder: {e}")
        print(f"\n💡 Check the folder permissions, or whether a REMOTE kernel is in use that can't access local files.")

📂 Dataset folder: G:\Food\Code\food_dataset

✅ Dataset folder found!

🍽️ Food categories detected:

✅ Total categories: 0
✅ Dataset is ready for training!
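
The cell above expects an already extracted folder. A minimal sketch of the extraction step with the standard `zipfile` module; the demo builds a tiny archive in a temporary directory so it runs anywhere, and `extract_dataset` is a hypothetical helper:

```python
import os
import tempfile
import zipfile


def extract_dataset(zip_path, extract_path):
    """Extract zip_path into extract_path and return the sorted class folder names."""
    os.makedirs(extract_path, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_path)
    return sorted(
        d for d in os.listdir(extract_path)
        if os.path.isdir(os.path.join(extract_path, d))
    )


# Demo with a throwaway archive (placeholder bytes, not real images)
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("Alu Vorta/img1.jpg", b"placeholder")
        archive.writestr("Burger/img1.jpg", b"placeholder")
    demo_classes = extract_dataset(demo_zip, os.path.join(tmp, "food_dataset"))
    print(demo_classes)
```

For the real dataset the call would take the path to `local_food_pre.zip` and `extract_path`. If the archive wraps everything in a single top-level folder, `extract_path` would need to point at that inner folder instead.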


## Step 4: Exploratory Data Analysis (EDA)

In [None]:
# Function to analyze dataset structure
def analyze_dataset(dataset_path):
    """Analyze the dataset structure and image statistics"""
    
    # Get all class folders
    classes = [d for d in os.listdir(dataset_path) if os.path.isdir(os.path.join(dataset_path, d))]
    classes.sort()
    
    print(f"{'='*60}")
    print("DATASET ANALYSIS")
    print(f"{'='*60}")
    print(f"\n📊 Total Food Classes: {len(classes)}\n")
    
    # Fail early with a clear message instead of dividing by zero below
    if not classes:
        raise ValueError(f"No class folders found in: {dataset_path}")
    
    # Count images per class
    class_counts = {}
    image_extensions = ['.jpg', '.jpeg', '.png', '.bmp', '.gif']
    
    for class_name in classes:
        class_path = os.path.join(dataset_path, class_name)
        images = [f for f in os.listdir(class_path) 
                 if os.path.splitext(f)[1].lower() in image_extensions]
        class_counts[class_name] = len(images)
    
    # Create DataFrame for better visualization
    df = pd.DataFrame(list(class_counts.items()), columns=['Food Class', 'Image Count'])
    df = df.sort_values('Image Count', ascending=False)
    
    print("\n📋 Images per Food Class:")
    print(df.to_string(index=False))
    
    total_images = df['Image Count'].sum()
    print(f"\n{'='*60}")
    print(f"Total Images: {total_images}")
    print(f"Average Images per Class: {total_images/len(classes):.2f}")
    print(f"Min Images: {df['Image Count'].min()}")
    print(f"Max Images: {df['Image Count'].max()}")
    print(f"{'='*60}\n")
    
    return df, classes

# Analyze the dataset
dataset_df, food_classes = analyze_dataset(extract_path)

# Store for later use
num_classes = len(food_classes)
print(f"\n✓ Found {num_classes} food categories")

In [None]:
# Visualize class distribution
plt.figure(figsize=(14, 6))

# Bar plot
plt.subplot(1, 2, 1)
plt.barh(dataset_df['Food Class'], dataset_df['Image Count'], color='steelblue')
plt.xlabel('Number of Images', fontsize=12)
plt.ylabel('Food Class', fontsize=12)
plt.title('Image Distribution Across Food Classes', fontsize=14, fontweight='bold')
plt.tight_layout()

# Pie chart (top 10 classes)
plt.subplot(1, 2, 2)
top_10 = dataset_df.head(10)
plt.pie(top_10['Image Count'], labels=top_10['Food Class'], autopct='%1.1f%%', startangle=90)
plt.title('Top 10 Food Classes Distribution', fontsize=14, fontweight='bold')

plt.tight_layout()
plt.show()

# Check for class imbalance
min_count = dataset_df['Image Count'].min()
if min_count == 0:
    print("\n⚠️ At least one class has no images, so the imbalance ratio is undefined.")
else:
    imbalance_ratio = dataset_df['Image Count'].max() / min_count
    print(f"\n⚠️ Class Imbalance Ratio: {imbalance_ratio:.2f}")
    if imbalance_ratio > 3:
        print("  → Dataset is imbalanced. Consider using class weights or data augmentation.")
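
One way to act on the class-weight suggestion is the "balanced" heuristic, weight = total / (n_classes * count), which is the formula scikit-learn documents for `compute_class_weight('balanced', ...)`. A minimal sketch with made-up counts (the numbers are illustrative, not taken from this dataset, and `balanced_class_weights` is a hypothetical helper):

```python
def balanced_class_weights(class_counts):
    """Map class index -> weight using total / (n_classes * count)."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    return {
        index: total / (n_classes * count)
        for index, count in enumerate(class_counts)
        if count > 0  # classes without images get no weight
    }


# Hypothetical image counts for three classes
example_weights = balanced_class_weights([100, 50, 25])
print(example_weights)
```

Rare classes receive larger weights. A dictionary of this shape can typically be passed as `class_weight=` to `model.fit`, provided its indices follow `train_generator.class_indices`.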

In [None]:
# Analyze image dimensions and formats
def analyze_images(dataset_path, sample_size=100):
    """Analyze image dimensions, formats, and quality"""
    
    widths, heights, formats, sizes = [], [], [], []
    
    classes = [d for d in os.listdir(dataset_path) if os.path.isdir(os.path.join(dataset_path, d))]
    image_extensions = ['.jpg', '.jpeg', '.png', '.bmp', '.gif']
    
    sample_count = 0
    for class_name in classes:
        class_path = os.path.join(dataset_path, class_name)
        images = [f for f in os.listdir(class_path) 
                 if os.path.splitext(f)[1].lower() in image_extensions]
        
        for img_name in images[:10]:  # Sample 10 from each class
            try:
                img_path = os.path.join(class_path, img_name)
                # The context manager closes the file handle once the header is read
                with Image.open(img_path) as img:
                    widths.append(img.width)
                    heights.append(img.height)
                    formats.append(img.format)
                sizes.append(os.path.getsize(img_path) / 1024)  # KB
                sample_count += 1
                
                if sample_count >= sample_size:
                    break
            except Exception:
                # Skip unreadable files; corrupted images are handled in Step 5
                continue
        
        if sample_count >= sample_size:
            break
    
    print(f"\n{'='*60}")
    print("IMAGE CHARACTERISTICS ANALYSIS")
    print(f"{'='*60}\n")
    print(f"Samples Analyzed: {len(widths)}")
    if not widths:
        # min() and max() below would fail on empty lists
        print("No readable images found.")
        return widths, heights
    print(f"\n📐 Image Dimensions:")
    print(f"  Width  - Min: {min(widths)}px, Max: {max(widths)}px, Avg: {np.mean(widths):.0f}px")
    print(f"  Height - Min: {min(heights)}px, Max: {max(heights)}px, Avg: {np.mean(heights):.0f}px")
    print(f"\n📊 Image Formats: {set(formats)}")
    print(f"\n💾 File Sizes:")
    print(f"  Min: {min(sizes):.2f} KB")
    print(f"  Max: {max(sizes):.2f} KB")
    print(f"  Avg: {np.mean(sizes):.2f} KB")
    
    return widths, heights

widths, heights = analyze_images(extract_path)

# Visualize dimensions
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].hist(widths, bins=20, color='skyblue', edgecolor='black')
axes[0].set_xlabel('Width (pixels)')
axes[0].set_ylabel('Frequency')
axes[0].set_title('Image Width Distribution')

axes[1].hist(heights, bins=20, color='lightcoral', edgecolor='black')
axes[1].set_xlabel('Height (pixels)')
axes[1].set_ylabel('Frequency')
axes[1].set_title('Image Height Distribution')

plt.tight_layout()
plt.show()

In [None]:
# Display sample images from each class
def display_sample_images(dataset_path, classes, samples_per_class=3):
    """Display sample images from each food class"""
    
    n_classes = len(classes)
    # squeeze=False always returns a 2D array of axes, even for a single row or column
    fig, axes = plt.subplots(n_classes, samples_per_class, 
                             figsize=(15, 3 * n_classes), squeeze=False)
    
    image_extensions = ['.jpg', '.jpeg', '.png', '.bmp', '.gif']
    
    for i, class_name in enumerate(classes):
        class_path = os.path.join(dataset_path, class_name)
        images = [f for f in os.listdir(class_path) 
                 if os.path.splitext(f)[1].lower() in image_extensions]
        
        for j in range(samples_per_class):
            ax = axes[i, j]
            ax.axis('off')
            if j >= len(images):
                continue  # Leave the cell blank when a class has too few images
            
            img_path = os.path.join(class_path, images[j])
            img = load_img(img_path)
            ax.imshow(img)
            if j == 0:
                ax.set_title(f"{class_name}", fontsize=12, fontweight='bold')
    
    plt.tight_layout()
    plt.suptitle('Sample Images from Each Food Class', 
                 fontsize=16, fontweight='bold', y=1.001)
    plt.show()

# Display samples (showing first 8 classes to keep it manageable)
print("\n📸 Displaying sample images from each food class...\n")
display_sample_images(extract_path, food_classes[:8], samples_per_class=3)

## Step 5: Data Cleaning & Preprocessing

In [None]:
# Function to check and remove corrupted images
def clean_dataset(dataset_path):
    """Remove corrupted or unreadable images"""
    
    print("🔍 Checking for corrupted images...\n")
    
    classes = [d for d in os.listdir(dataset_path) if os.path.isdir(os.path.join(dataset_path, d))]
    image_extensions = ['.jpg', '.jpeg', '.png', '.bmp', '.gif']
    
    corrupted_count = 0
    total_checked = 0
    
    for class_name in classes:
        class_path = os.path.join(dataset_path, class_name)
        images = [f for f in os.listdir(class_path) 
                 if os.path.splitext(f)[1].lower() in image_extensions]
        
        for img_name in images:
            img_path = os.path.join(class_path, img_name)
            total_checked += 1
            
            try:
                # verify() checks file integrity but leaves the image object unusable
                with Image.open(img_path) as img:
                    img.verify()
                
                # Reopen and fully decode to catch truncated image data
                with Image.open(img_path) as img:
                    img.load()
                
            except Exception:
                # The context managers above have closed the file, so removal
                # also works on Windows
                print(f"  ❌ Corrupted: {class_name}/{img_name}")
                os.remove(img_path)
                corrupted_count += 1
    
    print(f"\n{'='*60}")
    print(f"Total Images Checked: {total_checked}")
    print(f"Corrupted Images Removed: {corrupted_count}")
    print(f"Clean Images: {total_checked - corrupted_count}")
    print(f"{'='*60}\n")
    
    return total_checked - corrupted_count

clean_image_count = clean_dataset(extract_path)
print("✓ Data cleaning completed!")
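
`clean_dataset` deletes rejected files permanently. A more cautious variant moves them into a quarantine folder so they can be inspected later. A minimal sketch of that idea: `quarantine_files` and its `is_valid` callback are hypothetical, and the demo uses a trivial non-empty check as a stand-in for the Pillow verification above:

```python
import os
import shutil
import tempfile


def quarantine_files(folder, quarantine_dir, is_valid):
    """Move files that fail is_valid(path) into quarantine_dir; return their names."""
    os.makedirs(quarantine_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path) and not is_valid(path):
            shutil.move(path, os.path.join(quarantine_dir, name))
            moved.append(name)
    return moved


# Demo: treat empty files as invalid (stand-in for a real image check)
with tempfile.TemporaryDirectory() as tmp:
    class_dir = os.path.join(tmp, "Burger")
    os.makedirs(class_dir)
    with open(os.path.join(class_dir, "good.jpg"), "wb") as f:
        f.write(b"placeholder")
    open(os.path.join(class_dir, "empty.jpg"), "wb").close()

    moved_files = quarantine_files(
        class_dir,
        os.path.join(tmp, "quarantine"),
        is_valid=lambda p: os.path.getsize(p) > 0,
    )
    remaining_files = sorted(os.listdir(class_dir))
    print(moved_files, remaining_files)
```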

In [None]:
# Create train/validation/test split
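def create_data_splits(dataset_path, output_path, train_ratio=0.7, val_ratio=0.15, test_ratio=0.15):
    """Split dataset into train, validation, and test sets"""
    
    # The test split receives whatever remains after train and val,
    # so the three ratios must add up to 1
    if abs(train_ratio + val_ratio + test_ratio - 1.0) > 1e-6:
        raise ValueError("train_ratio, val_ratio and test_ratio must sum to 1")
    
    print(f"📂 Creating data splits (Train: {train_ratio:.0%}, Val: {val_ratio:.0%}, Test: {test_ratio:.0%})...\n")
    
    # Start from empty split folders: copies left over from an earlier run
    # with a different shuffle would leak images across splits
    for split in ['train', 'val', 'test']:
        split_dir = os.path.join(output_path, split)
        if os.path.isdir(split_dir):
            shutil.rmtree(split_dir)
        os.makedirs(split_dir)
    
    classes = [d for d in os.listdir(dataset_path) if os.path.isdir(os.path.join(dataset_path, d))]
    image_extensions = ['.jpg', '.jpeg', '.png', '.bmp', '.gif']
    
    split_info = {'train': 0, 'val': 0, 'test': 0}
    
    for class_name in classes:
        class_path = os.path.join(dataset_path, class_name)
        # Sort first so the shuffle does not depend on the OS listing order
        images = sorted(f for f in os.listdir(class_path) 
                        if os.path.splitext(f)[1].lower() in image_extensions)
        
        # Shuffle images
        np.random.shuffle(images)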
        
        # Calculate split points
        total = len(images)
        train_end = int(total * train_ratio)
        val_end = train_end + int(total * val_ratio)
        
        # Split images
        train_images = images[:train_end]
        val_images = images[train_end:val_end]
        test_images = images[val_end:]
        
        # Copy images to respective directories
        for split, image_list in [('train', train_images), ('val', val_images), ('test', test_images)]:
            split_class_path = os.path.join(output_path, split, class_name)
            os.makedirs(split_class_path, exist_ok=True)
            
            for img_name in image_list:
                src = os.path.join(class_path, img_name)
                dst = os.path.join(split_class_path, img_name)
                shutil.copy2(src, dst)
                split_info[split] += 1
    
    print(f"{'='*60}")
    print(f"Train Images: {split_info['train']}")
    print(f"Validation Images: {split_info['val']}")
    print(f"Test Images: {split_info['test']}")
    print(f"{'='*60}\n")
    
    return output_path

# Create splits
split_data_path = r'G:\Food\Code\food_dataset_split'
split_data_path = create_data_splits(extract_path, split_data_path)

print("✓ Data split completed!")
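
Because the splits are built by copying files, a quick check that no image ended up in more than one split is cheap insurance against leakage. A minimal sketch, assuming the `train`/`val`/`test` layout created above (`find_split_overlap` is a hypothetical helper, and the demo runs on a temporary folder with placeholder files):

```python
import os
import tempfile


def find_split_overlap(split_root, splits=("train", "val", "test")):
    """Return class/filename entries that appear in more than one split."""
    seen = {}        # "class/filename" -> first split it was found in
    duplicates = []
    for split in splits:
        split_dir = os.path.join(split_root, split)
        if not os.path.isdir(split_dir):
            continue
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if not os.path.isdir(class_dir):
                continue
            for file_name in sorted(os.listdir(class_dir)):
                key = f"{class_name}/{file_name}"
                if key in seen and seen[key] != split:
                    duplicates.append(key)
                seen.setdefault(key, split)
    return duplicates


# Demo: "a.jpg" is deliberately placed in both train and test
with tempfile.TemporaryDirectory() as tmp:
    for split, file_name in [("train", "a.jpg"), ("val", "b.jpg"), ("test", "a.jpg")]:
        class_dir = os.path.join(tmp, split, "Burger")
        os.makedirs(class_dir)
        open(os.path.join(class_dir, file_name), "wb").close()
    overlap = find_split_overlap(tmp)
    print(overlap)
```

On the real data, `find_split_overlap(split_data_path)` should return an empty list.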

## Step 6: Data Augmentation & Generators

In [None]:
# Image parameters
IMG_SIZE = 224  # Standard size for most pre-trained models
BATCH_SIZE = 32

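# Data Augmentation for Training Set
# NOTE: rescale=1./255 feeds every model pixels in [0, 1], but the Keras
# application models expect different input ranges (EfficientNetB0 takes raw
# [0, 255] pixels; ResNet50 and VGG16 ship their own preprocess_input).
# Consider per-model preprocessing before comparing them in Step 9.
train_datagen = ImageDataGenerator(
    rescale=1./255,              # Normalize pixel values to [0, 1]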
    rotation_range=30,            # Random rotation
    width_shift_range=0.2,        # Horizontal shift
    height_shift_range=0.2,       # Vertical shift
    shear_range=0.2,              # Shear transformation
    zoom_range=0.2,               # Random zoom
    horizontal_flip=True,         # Horizontal flip
    brightness_range=[0.8, 1.2],  # Brightness adjustment
    fill_mode='nearest'           # Fill mode for new pixels
)

# Validation and Test Set (only rescaling, no augmentation)
val_test_datagen = ImageDataGenerator(rescale=1./255)

# Create generators
train_generator = train_datagen.flow_from_directory(
    os.path.join(split_data_path, 'train'),
    target_size=(IMG_SIZE, IMG_SIZE),
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=True
)

validation_generator = val_test_datagen.flow_from_directory(
    os.path.join(split_data_path, 'val'),
    target_size=(IMG_SIZE, IMG_SIZE),
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=False
)

test_generator = val_test_datagen.flow_from_directory(
    os.path.join(split_data_path, 'test'),
    target_size=(IMG_SIZE, IMG_SIZE),
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    shuffle=False
)

# Store class indices for later use
class_indices = train_generator.class_indices
class_names = {v: k for k, v in class_indices.items()}

print(f"\n✓ Data generators created successfully!")
print(f"  Training samples: {train_generator.samples}")
print(f"  Validation samples: {validation_generator.samples}")
print(f"  Test samples: {test_generator.samples}")
print(f"  Number of classes: {len(class_indices)}")

In [None]:
# Visualize augmented images
def show_augmented_images(generator, num_images=9):
    """Display augmented images from the generator"""
    
    # Get a batch of images
    images, labels = next(generator)
    
    fig, axes = plt.subplots(3, 3, figsize=(12, 12))
    axes = axes.ravel()
    
    for i in range(min(num_images, len(images))):
        axes[i].imshow(images[i])
        class_idx = np.argmax(labels[i])
        axes[i].set_title(f"{class_names[class_idx]}")
        axes[i].axis('off')
    
    plt.suptitle('Sample Augmented Training Images', fontsize=16, fontweight='bold')
    plt.tight_layout()
    plt.show()

print("\n📸 Visualizing augmented images...\n")
show_augmented_images(train_generator)

## Step 7: Model Building with Transfer Learning

The builder below supports five pre-trained backbones. Step 9 trains the first three listed here and compares them to find the best performer:
- **MobileNetV2** (lightweight, fast)
- **EfficientNetB0** (efficient, good accuracy)
- **ResNet50** (deep, robust)
- **VGG16** (classic, reliable)
- **InceptionV3** (inception modules)

In [None]:
# Function to build transfer learning model
def build_transfer_model(base_model_name='MobileNetV2', num_classes=num_classes, 
                         img_size=IMG_SIZE, trainable_layers=0):
    """
    Build a transfer learning model
    
    Args:
        base_model_name: Name of the pre-trained model
        num_classes: Number of output classes
        img_size: Input image size
        trainable_layers: Number of layers to unfreeze (0 = freeze all base layers)
    """
    
    # Select base model
    if base_model_name == 'MobileNetV2':
        base_model = MobileNetV2(weights='imagenet', include_top=False, 
                                 input_shape=(img_size, img_size, 3))
    elif base_model_name == 'EfficientNetB0':
        base_model = EfficientNetB0(weights='imagenet', include_top=False, 
                                    input_shape=(img_size, img_size, 3))
    elif base_model_name == 'ResNet50':
        base_model = ResNet50(weights='imagenet', include_top=False, 
                              input_shape=(img_size, img_size, 3))
    elif base_model_name == 'VGG16':
        base_model = VGG16(weights='imagenet', include_top=False, 
                           input_shape=(img_size, img_size, 3))
    elif base_model_name == 'InceptionV3':
        base_model = InceptionV3(weights='imagenet', include_top=False, 
                                 input_shape=(img_size, img_size, 3))
    else:
        raise ValueError(f"Unknown model: {base_model_name}")
    
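    # Freeze the whole base model by default
    base_model.trainable = False
    
    # Optionally unfreeze the last N layers. The base model itself must be
    # trainable, otherwise Keras reports no trainable weights for its sub-layers,
    # so re-enable it and freeze everything except the last N layers.
    if trainable_layers > 0:
        base_model.trainable = True
        for layer in base_model.layers[:-trainable_layers]:
            layer.trainable = False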
    
    # Build the model
    model = models.Sequential([
        base_model,
        layers.GlobalAveragePooling2D(),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(512, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(256, activation='relu'),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation='softmax')
    ])
    
    # Compile model
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=0.001),
        loss='categorical_crossentropy',
        metrics=['accuracy', keras.metrics.TopKCategoricalAccuracy(k=3, name='top_3_accuracy')]
    )
    
    return model

print("✓ Model building function created!")

## Step 8: Training Callbacks & Configuration

In [None]:
# Training configuration
EPOCHS = 30

# Create callbacks
def get_callbacks(model_name):
    """Create training callbacks"""
    
    # Model checkpoint - save best model
    checkpoint_path = rf'G:\Food\Code\models\best_model_{model_name}.keras'
    os.makedirs(r'G:\Food\Code\models', exist_ok=True)
    checkpoint = ModelCheckpoint(
        checkpoint_path,
        monitor='val_accuracy',
        save_best_only=True,
        mode='max',
        verbose=1
    )
    
    # Early stopping - stop if no improvement
    early_stop = EarlyStopping(
        monitor='val_accuracy',
        patience=5,
        restore_best_weights=True,
        verbose=1
    )
    
    # Reduce learning rate on plateau
    reduce_lr = ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.2,
        patience=3,
        min_lr=1e-7,
        verbose=1
    )
    
    return [checkpoint, early_stop, reduce_lr]

print("✓ Training callbacks configured!")
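
To make the `patience=5` setting concrete, here is a simplified pure-Python trace of patience-based early stopping on a "higher is better" metric such as `val_accuracy`. It illustrates the idea only and is not Keras's implementation; `simulate_early_stopping` is a hypothetical helper and the scores are made up:

```python
def simulate_early_stopping(val_scores, patience=5):
    """Return (stop_epoch, best_epoch), both 0-based, for a 'higher is better' metric."""
    best_score = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch  # training would stop here

    return len(val_scores) - 1, best_epoch  # ran through every epoch


# Made-up validation accuracies, one per epoch
example_scores = [0.50, 0.60, 0.70, 0.69, 0.68, 0.70, 0.69, 0.66, 0.65, 0.64]
stop_epoch, best_epoch = simulate_early_stopping(example_scores, patience=5)
print(f"stopped after epoch {stop_epoch}, best weights from epoch {best_epoch}")
```

With `restore_best_weights=True`, the weights kept after stopping correspond to the best epoch rather than the last one.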

## Step 9: Train Multiple Models (Experiment)

In [None]:
# Train and evaluate multiple models
models_to_test = ['MobileNetV2', 'EfficientNetB0', 'ResNet50']
results = {}

for model_name in models_to_test:
    print(f"\n{'='*80}")
    print(f"🚀 Training {model_name}")
    print(f"{'='*80}\n")
    
    # Build model
    model = build_transfer_model(base_model_name=model_name)
    
    # Display model summary
    print(f"\n📊 Model Architecture:")
    model.summary()
    
    # Get callbacks
    callbacks = get_callbacks(model_name)
    
    # Train model
    history = model.fit(
        train_generator,
        epochs=EPOCHS,
        validation_data=validation_generator,
        callbacks=callbacks,
        verbose=1
    )
    
    # Evaluate on test set
    test_loss, test_acc, test_top3 = model.evaluate(test_generator, verbose=0)
    
    # Store results
    results[model_name] = {
        'history': history.history,
        'test_loss': test_loss,
        'test_accuracy': test_acc,
        'test_top3_accuracy': test_top3,
        'model': model
    }
    
    print(f"\n✅ {model_name} Results:")
    print(f"  Test Accuracy: {test_acc*100:.2f}%")
    print(f"  Test Top-3 Accuracy: {test_top3*100:.2f}%")
    print(f"  Test Loss: {test_loss:.4f}")
    
    # Save history
    history_df = pd.DataFrame(history.history)
    history_df.to_csv(rf'G:\Food\Code\{model_name}_history.csv', index=False)

print("\n" + "="*80)
print("✓ All models trained successfully!")
print("="*80)

## Step 10: Model Comparison & Visualization

In [None]:
# Compare model performances
comparison_df = pd.DataFrame({
    'Model': list(results.keys()),
    'Test Accuracy (%)': [results[m]['test_accuracy']*100 for m in results.keys()],
    'Top-3 Accuracy (%)': [results[m]['test_top3_accuracy']*100 for m in results.keys()],
    'Test Loss': [results[m]['test_loss'] for m in results.keys()]
}).sort_values('Test Accuracy (%)', ascending=False)

print("\n" + "="*80)
print("📊 MODEL COMPARISON")
print("="*80 + "\n")
print(comparison_df.to_string(index=False))
print("\n" + "="*80)

# Visualize comparison
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Bar plot - Test Accuracy
axes[0].bar(comparison_df['Model'], comparison_df['Test Accuracy (%)'], 
            color=['#2ecc71', '#3498db', '#e74c3c'][:len(comparison_df)])
axes[0].set_ylabel('Accuracy (%)', fontsize=12)
axes[0].set_title('Model Comparison - Test Accuracy', fontsize=14, fontweight='bold')
axes[0].set_ylim([0, 100])
for i, v in enumerate(comparison_df['Test Accuracy (%)']):
    axes[0].text(i, v + 2, f'{v:.2f}%', ha='center', fontweight='bold')

# Bar plot - Top-3 Accuracy
axes[1].bar(comparison_df['Model'], comparison_df['Top-3 Accuracy (%)'], 
            color=['#2ecc71', '#3498db', '#e74c3c'][:len(comparison_df)])
axes[1].set_ylabel('Accuracy (%)', fontsize=12)
axes[1].set_title('Model Comparison - Top-3 Accuracy', fontsize=14, fontweight='bold')
axes[1].set_ylim([0, 100])
for i, v in enumerate(comparison_df['Top-3 Accuracy (%)']):
    axes[1].text(i, v + 2, f'{v:.2f}%', ha='center', fontweight='bold')

plt.tight_layout()
plt.show()

# Find best model
best_model_name = comparison_df.iloc[0]['Model']
print(f"\n🏆 Best Model: {best_model_name}")
print(f"   Accuracy: {comparison_df.iloc[0]['Test Accuracy (%)']:.2f}%")

In [None]:
# Plot training history for best model
def plot_training_history(history, model_name):
    """Plot training and validation accuracy/loss"""
    
    fig, axes = plt.subplots(1, 2, figsize=(15, 5))
    
    # Accuracy plot
    axes[0].plot(history['accuracy'], label='Train Accuracy', linewidth=2)
    axes[0].plot(history['val_accuracy'], label='Val Accuracy', linewidth=2)
    axes[0].set_xlabel('Epoch', fontsize=12)
    axes[0].set_ylabel('Accuracy', fontsize=12)
    axes[0].set_title(f'{model_name} - Training History (Accuracy)', 
                      fontsize=14, fontweight='bold')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # Loss plot
    axes[1].plot(history['loss'], label='Train Loss', linewidth=2)
    axes[1].plot(history['val_loss'], label='Val Loss', linewidth=2)
    axes[1].set_xlabel('Epoch', fontsize=12)
    axes[1].set_ylabel('Loss', fontsize=12)
    axes[1].set_title(f'{model_name} - Training History (Loss)', 
                      fontsize=14, fontweight='bold')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

# Plot for best model
print(f"\n📈 Training History for Best Model: {best_model_name}\n")
plot_training_history(results[best_model_name]['history'], best_model_name)

## Step 11: Detailed Evaluation (Best Model)

In [None]:
# Get best model
best_model = results[best_model_name]['model']

# Generate predictions on test set
print("🔮 Generating predictions on test set...\n")
test_generator.reset()
predictions = best_model.predict(test_generator, verbose=1)
predicted_classes = np.argmax(predictions, axis=1)

# True labels
true_classes = test_generator.classes

# Classification report
print("\n" + "="*80)
print("📊 CLASSIFICATION REPORT")
print("="*80 + "\n")
# Pass labels explicitly so the report still works if a class has no test samples
print(classification_report(true_classes, predicted_classes, 
                          labels=list(class_names.keys()),
                          target_names=list(class_names.values()),
                          digits=4, zero_division=0))

# Overall metrics
accuracy = accuracy_score(true_classes, predicted_classes)
print(f"\n✅ Overall Test Accuracy: {accuracy*100:.2f}%")

In [None]:
# Confusion Matrix (labels fixed so row/column i always matches class index i)
cm = confusion_matrix(true_classes, predicted_classes, labels=list(class_names.keys()))

plt.figure(figsize=(20, 16))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
            xticklabels=list(class_names.values()),
            yticklabels=list(class_names.values()),
            cbar_kws={'label': 'Count'})
plt.xlabel('Predicted Label', fontsize=14)
plt.ylabel('True Label', fontsize=14)
plt.title(f'Confusion Matrix - {best_model_name}', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

# Per-class accuracy
print("\n" + "="*80)
print("📈 PER-CLASS ACCURACY")
print("="*80 + "\n")
for i, class_name in class_names.items():
    class_correct = cm[i, i]
    class_total = cm[i, :].sum()
    class_accuracy = (class_correct / class_total * 100) if class_total > 0 else 0
    print(f"{class_name:25s} : {class_accuracy:6.2f}% ({class_correct}/{class_total})")
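
The per-class figure printed above divides the diagonal entry by the row total, which is the recall of each class. Precision uses the column total instead. A minimal sketch that computes both from a confusion matrix given as nested lists, with rows as true labels as in scikit-learn (`per_class_precision_recall` is a hypothetical helper and the matrix is made up):

```python
def per_class_precision_recall(cm):
    """Return [(precision, recall), ...] for a square confusion matrix (rows = true labels)."""
    n = len(cm)
    stats = []
    for i in range(n):
        true_positive = cm[i][i]
        row_total = sum(cm[i])                       # samples whose true label is i
        col_total = sum(cm[r][i] for r in range(n))  # samples predicted as i
        recall = true_positive / row_total if row_total else 0.0
        precision = true_positive / col_total if col_total else 0.0
        stats.append((precision, recall))
    return stats


# Made-up 2-class matrix: 10 true samples per class
example_cm = [[8, 2],
              [1, 9]]
example_stats = per_class_precision_recall(example_cm)
for index, (precision, recall) in enumerate(example_stats):
    print(f"class {index}: precision={precision:.3f}, recall={recall:.3f}")
```

A class with high recall but low precision is one the model predicts too often, which the row-based figure alone would not reveal.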

## Step 12: Create Nutrition Database for Bangladeshi Foods

In [None]:
# Create a comprehensive nutrition database
# Note: These are approximate values per 100g serving. Adjust based on your research.

nutrition_database = {
    'Alu Vorta': {
        'description': 'Mashed potato dish with mustard oil, onions, and green chilies',
        'calories': 130,
        'protein': 2.5,
        'carbs': 22,
        'fat': 4,
        'fiber': 2.5,
        'benefits': 'Good source of vitamin C, potassium, and dietary fiber'
    },
    'Bakorkhani': {
        'description': 'Traditional thick, spiced flat-bread',
        'calories': 310,
        'protein': 8,
        'carbs': 52,
        'fat': 8,
        'fiber': 2,
        'benefits': 'Energy-rich, contains carbohydrates and some protein'
    },
    'Bhapa': {
        'description': 'Steamed dish, usually fish or vegetables',
        'calories': 150,
        'protein': 18,
        'carbs': 5,
        'fat': 6,
        'fiber': 1,
        'benefits': 'High in protein, omega-3 fatty acids (if fish)'
    },
    'Burger': {
        'description': 'Fast food sandwich with patty and vegetables',
        'calories': 295,
        'protein': 15,
        'carbs': 28,
        'fat': 14,
        'fiber': 2,
        'benefits': 'Provides protein and energy, moderate in fat'
    },
    'Chicken': {
        'description': 'Chicken curry or preparation',
        'calories': 220,
        'protein': 27,
        'carbs': 3,
        'fat': 11,
        'fiber': 0.5,
        'benefits': 'Excellent source of lean protein, B vitamins'
    },
    'Chicken Roast': {
        'description': 'Roasted or grilled chicken with spices',
        'calories': 240,
        'protein': 28,
        'carbs': 2,
        'fat': 13,
        'fiber': 0.3,
        'benefits': 'High protein, lower carb option'
    },
    'Chingri Vuna': {
        'description': 'Fried prawn dish with spices',
        'calories': 180,
        'protein': 20,
        'carbs': 4,
        'fat': 9,
        'fiber': 0.5,
        'benefits': 'Rich in protein, selenium, and omega-3'
    },
    'Chomchom': {
        'description': 'Traditional sweet made from milk',
        'calories': 350,
        'protein': 7,
        'carbs': 50,
        'fat': 14,
        'fiber': 0,
        'benefits': 'Calcium from milk, quick energy source'
    },
    'Chowmein': {
        'description': 'Stir-fried noodles with vegetables',
        'calories': 190,
        'protein': 6,
        'carbs': 28,
        'fat': 6,
        'fiber': 3,
        'benefits': 'Provides carbohydrates, vegetables add vitamins'
    },
    'Dal': {
        'description': 'Lentil soup/curry',
        'calories': 115,
        'protein': 9,
        'carbs': 20,
        'fat': 0.5,
        'fiber': 8,
        'benefits': 'Excellent protein source, high in fiber and iron'
    },
    'Egg Curry': {
        'description': 'Boiled eggs in spicy curry',
        'calories': 210,
        'protein': 14,
        'carbs': 6,
        'fat': 15,
        'fiber': 1,
        'benefits': 'High quality protein, vitamins A, D, E, B12'
    },
    'French Fries': {
        'description': 'Deep-fried potato strips',
        'calories': 312,
        'protein': 3.4,
        'carbs': 41,
        'fat': 15,
        'fiber': 3.8,
        'benefits': 'Energy from carbs, some potassium'
    },
    'Fried Chicken': {
        'description': 'Deep-fried chicken pieces',
        'calories': 320,
        'protein': 24,
        'carbs': 12,
        'fat': 20,
        'fiber': 0.5,
        'benefits': 'High protein, but higher in fat due to frying'
    },
    'Fuchka': {
        'description': 'Crispy hollow puri with spicy water',
        'calories': 140,
        'protein': 4,
        'carbs': 22,
        'fat': 4,
        'fiber': 2,
        'benefits': 'Low calorie snack, provides quick energy'
    },
    'Jalebi': {
        'description': 'Sweet deep-fried dessert',
        'calories': 415,
        'protein': 5,
        'carbs': 65,
        'fat': 16,
        'fiber': 0,
        'benefits': 'Quick energy from sugar, occasional treat'
    },
    'Jhalmuri': {
        'description': 'Puffed rice snack with spices',
        'calories': 325,
        'protein': 7,
        'carbs': 68,
        'fat': 3,
        'fiber': 2,
        'benefits': 'Light snack, low in fat, provides quick energy'
    },
    'Kotkoti': {
        'description': 'Flaky sweet pastry',
        'calories': 450,
        'protein': 6,
        'carbs': 55,
        'fat': 23,
        'fiber': 1,
        'benefits': 'Energy-dense, occasional treat'
    },
    'Morog Polao': {
        'description': 'Chicken pilaf rice dish',
        'calories': 280,
        'protein': 18,
        'carbs': 38,
        'fat': 6,
        'fiber': 1.5,
        'benefits': 'Balanced meal with protein, carbs, and minimal fat'
    },
    'Mutton Leg Roast': {
        'description': 'Roasted mutton leg with spices',
        'calories': 290,
        'protein': 26,
        'carbs': 2,
        'fat': 20,
        'fiber': 0.3,
        'benefits': 'Rich in protein, iron, and B vitamins'
    },
    'Paratha': {
        'description': 'Layered flatbread',
        'calories': 320,
        'protein': 6,
        'carbs': 42,
        'fat': 14,
        'fiber': 2,
        'benefits': 'Energy from carbs, some protein'
    },
    'Pera Sondesh': {
        'description': 'Milk-based sweet',
        'calories': 380,
        'protein': 8,
        'carbs': 52,
        'fat': 16,
        'fiber': 0,
        'benefits': 'Calcium from milk, protein'
    }
}

# Save to JSON for later use
import json
with open(r'G:\Food\Code\nutrition_database.json', 'w') as f:
    json.dump(nutrition_database, f, indent=2)

print("✅ Nutrition database created!")
print(f"   Total food items: {len(nutrition_database)}")

# Display sample
sample_food = list(nutrition_database.keys())[0]
print(f"\n📊 Sample Entry: {sample_food}")
print(json.dumps(nutrition_database[sample_food], indent=2))
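
Because the values above are hand-entered approximations, a quick sanity check can flag entries whose stated calories disagree with their macronutrients. The sketch below uses the general Atwater factors (about 4 kcal/g for protein and carbohydrate, 9 kcal/g for fat). The helper names, the 15% tolerance, and the choice to ignore fiber are illustrative assumptions. The two sample entries are copied from the database above.

```python
def estimate_calories(entry):
    """Estimate kcal per 100 g from macros using general Atwater factors (4/4/9)."""
    return 4 * entry['protein'] + 4 * entry['carbs'] + 9 * entry['fat']

def flag_inconsistent(database, tolerance=0.15):
    """Return names whose stated calories differ from the estimate by more than `tolerance`."""
    flagged = []
    for name, entry in database.items():
        estimate = estimate_calories(entry)
        if abs(entry['calories'] - estimate) > tolerance * estimate:
            flagged.append(name)
    return flagged

sample = {
    'Alu Vorta': {'calories': 130, 'protein': 2.5, 'carbs': 22, 'fat': 4},
    'Dal': {'calories': 115, 'protein': 9, 'carbs': 20, 'fat': 0.5},
}
print(estimate_calories(sample['Alu Vorta']))
print(flag_inconsistent(sample))
```

Passing the full `nutrition_database` instead of `sample` would list any entries worth double-checking against a published food composition table.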

## Step 13: Build Gradio Web Application

Now let's create an interactive web app where users can upload food images and get:
- Food name prediction
- Confidence score
- Nutritional information
- Health benefits

In [None]:
import gradio as gr
from PIL import Image
import numpy as np

# Prediction function
def predict_food(image):
    """
    Predict food class and return nutritional information
    """
    try:
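        # Guard against an empty submission (Gradio passes None when no image is uploaded)
        if image is None:
            return "Please upload an image first."

        # Preprocess image; convert('RGB') also normalizes grayscale or RGBA inputs to 3 channels
        img = Image.fromarray(image.astype('uint8')).convert('RGB')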
        img = img.resize((IMG_SIZE, IMG_SIZE))
        img_array = img_to_array(img)
        img_array = img_array / 255.0
        img_array = np.expand_dims(img_array, axis=0)
        
        # Make prediction
        predictions = best_model.predict(img_array, verbose=0)
        predicted_class_idx = np.argmax(predictions[0])
        confidence = predictions[0][predicted_class_idx] * 100
        
        # Get class name
        predicted_food = class_names[predicted_class_idx]
        
        # Get top 3 predictions
        top_3_idx = np.argsort(predictions[0])[-3:][::-1]
        top_3_predictions = []
        for idx in top_3_idx:
            food_name = class_names[idx]
            conf = predictions[0][idx] * 100
            top_3_predictions.append(f"{food_name}: {conf:.2f}%")
        
        # Get nutrition info
        if predicted_food in nutrition_database:
            nutrition = nutrition_database[predicted_food]
            
            # Create formatted output
            result = f"""
🍽️ **Detected Food: {predicted_food}**
✅ **Confidence: {confidence:.2f}%**

📝 **Description:**
{nutrition['description']}

📊 **Nutritional Information (per 100g):**
• Calories: {nutrition['calories']} kcal
• Protein: {nutrition['protein']} g
• Carbohydrates: {nutrition['carbs']} g
• Fat: {nutrition['fat']} g
• Fiber: {nutrition['fiber']} g

💪 **Health Benefits:**
{nutrition['benefits']}

🎯 **Top 3 Predictions:**
{chr(10).join([f"{i+1}. {pred}" for i, pred in enumerate(top_3_predictions)])}
"""
        else:
            result = f"""
🍽️ **Detected Food: {predicted_food}**
✅ **Confidence: {confidence:.2f}%**

⚠️ Nutrition information not available for this food item.

🎯 **Top 3 Predictions:**
{chr(10).join([f"{i+1}. {pred}" for i, pred in enumerate(top_3_predictions)])}
"""
        
        return result
    
    except Exception as e:
        return f"❌ Error: {str(e)}"

# Create Gradio interface
demo = gr.Interface(
    fn=predict_food,
    inputs=gr.Image(label="Upload Food Image"),
    outputs=gr.Textbox(label="Prediction & Nutrition Info", lines=20),
    title="🇧🇩 Bangladeshi Food Classification & Nutrition System",
    description="""
    Upload an image of Bangladeshi food, and the AI will identify it and provide nutritional information!
    
    **Supported Foods:** Alu Vorta, Bakorkhani, Bhapa, Chicken, Dal, Paratha, Fuchka, Jalebi, and many more!
    """,
    examples=None,  # You can add example images here
    theme="soft",
    allow_flagging="never"  # Newer Gradio releases (5.x) rename this parameter to flagging_mode
)

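# Launch the app
print("\n" + "="*80)
print("🚀 Launching Gradio App...")
print("="*80 + "\n")

# Note: debug=True keeps this cell running until it is interrupted,
# so stop the cell before moving on to Step 14.
demo.launch(share=True, debug=True)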
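
The `else` branch in `predict_food` is reached only when a predicted class has no nutrition entry. Checking coverage before launching the app catches spelling mismatches between dataset folder names and database keys. The sketch below uses small stand-in dictionaries; in the notebook, `class_names` and `nutrition_database` would be passed instead. `find_missing_nutrition` is a hypothetical helper name.

```python
def find_missing_nutrition(class_names, nutrition_database):
    """Return class labels that have no matching key in the nutrition database."""
    return sorted(name for name in class_names.values() if name not in nutrition_database)

# Stand-in data: 'Fuchka ' has a trailing space, so it does not match the key 'Fuchka'
demo_classes = {0: 'Alu Vorta', 1: 'Dal', 2: 'Fuchka '}
demo_nutrition = {'Alu Vorta': {}, 'Dal': {}, 'Fuchka': {}}

missing = find_missing_nutrition(demo_classes, demo_nutrition)
print(f"Classes without nutrition data: {missing}")
```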

## Step 14: Save Best Model & Assets for Deployment

In [None]:
# Create a models directory to save everything
models_dir = r'G:\Food\Code\models'
os.makedirs(models_dir, exist_ok=True)

# Save the best model
final_model_path = os.path.join(models_dir, 'bangladeshi_food_classifier_final.keras')
best_model.save(final_model_path)
print(f"✅ Best model saved: {final_model_path}")

# Save class names mapping
class_names_path = os.path.join(models_dir, 'class_names.json')
with open(class_names_path, 'w') as f:
    json.dump(class_names, f, indent=2)
print(f"✅ Class names saved: {class_names_path}")

# Save nutrition database
nutrition_db_path = os.path.join(models_dir, 'nutrition_database.json')
with open(nutrition_db_path, 'w') as f:
    json.dump(nutrition_database, f, indent=2)
print(f"✅ Nutrition database saved: {nutrition_db_path}")

# Save results summary
summary = {
    'best_model': best_model_name,
    'test_accuracy': float(results[best_model_name]['test_accuracy']),
    'test_top3_accuracy': float(results[best_model_name]['test_top3_accuracy']),
    'test_loss': float(results[best_model_name]['test_loss']),
    'num_classes': num_classes,
    'total_training_samples': train_generator.samples,
    'total_validation_samples': validation_generator.samples,
    'total_test_samples': test_generator.samples,
    'img_size': IMG_SIZE,
    'all_models_comparison': {
        model: {
            'test_accuracy': float(results[model]['test_accuracy']),
            'test_top3_accuracy': float(results[model]['test_top3_accuracy']),
            'test_loss': float(results[model]['test_loss'])
        }
        for model in results.keys()
    }
}

summary_path = os.path.join(models_dir, 'model_summary.json')
with open(summary_path, 'w') as f:
    json.dump(summary, f, indent=2)
print(f"✅ Model summary saved: {summary_path}")

print("\n" + "="*80)
print("🎉 PROJECT COMPLETED SUCCESSFULLY!")
print("="*80)
print(f"\n📁 All files saved to: {models_dir}")
print("\nGenerated files:")
print(f"  - {os.path.basename(final_model_path)}")
print(f"  - {os.path.basename(class_names_path)}")
print(f"  - {os.path.basename(nutrition_db_path)}")
print(f"  - {os.path.basename(summary_path)}")
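
When these files are loaded again in a deployment script, note that JSON object keys are always strings, so the integer keys of `class_names` come back as `'0'`, `'1'`, and so on. Below is a minimal sketch of the round trip, using a temporary directory and a two-class stand-in mapping rather than the real files. `load_class_names` is a hypothetical helper name.

```python
import json
import os
import tempfile

def load_class_names(path):
    """Load the index-to-label mapping and restore the integer keys lost in JSON."""
    with open(path, 'r') as f:
        raw = json.load(f)
    return {int(index): label for index, label in raw.items()}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'class_names.json')
    # json.dump silently converts the integer keys to strings
    with open(path, 'w') as f:
        json.dump({0: 'Alu Vorta', 1: 'Bakorkhani'}, f, indent=2)
    restored = load_class_names(path)

print(restored)
```

Without the `int(...)` conversion, a lookup such as `class_names[predicted_class_idx]` would raise a `KeyError` in the deployed app.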

---

## 📋 Next Steps & Recommendations

### For Your Thesis:

1. **Data Collection Improvements:**
   - Collect more images per class (aim for 500+ per class)
   - Include variations: different lighting, angles, plates, backgrounds
   - Add more Bangladeshi food varieties

2. **Model Improvements:**
   - Try fine-tuning (unfreeze last layers of base model)
   - Experiment with ensemble methods
   - Try newer architectures: EfficientNetV2, Vision Transformers

3. **Advanced Features:**
   - Add serving size estimation
   - Implement multi-food detection (if multiple items on plate)
   - Add regional food variations

4. **Deployment Options:**
   - **Gradio** (current) - Easy, shareable link
   - **Streamlit** - More customizable UI
   - **Mobile App** - Using TensorFlow Lite
   - **Web API** - Using FastAPI/Flask

5. **Thesis Documentation:**
   - Literature review on food classification
   - Methodology section (data collection, preprocessing, models)
   - Results & Discussion (compare models, confusion matrix analysis)
   - Conclusion & Future work
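
Item 3 above lists serving size estimation. Once a portion weight is known (entered by the user or estimated some other way), scaling the per-100 g values is simple arithmetic. In the minimal sketch below, `scale_nutrition` and the 200 g portion are illustrative assumptions, and the Dal values are copied from the Step 12 database.

```python
NUMERIC_FIELDS = ('calories', 'protein', 'carbs', 'fat', 'fiber')

def scale_nutrition(entry, grams):
    """Scale per-100 g nutrition values to a portion weighing `grams` grams."""
    if grams < 0:
        raise ValueError("grams must be non-negative")
    factor = grams / 100.0
    return {field: round(entry[field] * factor, 1) for field in NUMERIC_FIELDS}

dal_per_100g = {'calories': 115, 'protein': 9, 'carbs': 20, 'fat': 0.5, 'fiber': 8}
portion = scale_nutrition(dal_per_100g, 200)
print(portion)
```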

### To Run This Notebook:

1. **Place** `local_food_pre.zip` on your local machine (this notebook is configured for local execution, not Google Colab)
2. **Update** the ZIP path in Step 3
3. **Run all cells** sequentially
4. **Wait** for training (may take 1-2 hours)
5. **Test** the Gradio app with your food images!

### Files Generated:
- `bangladeshi_food_classifier_final.keras` - Best trained model
- `class_names.json` - Class label mappings
- `nutrition_database.json` - Nutrition information
- `model_summary.json` - Training results summary
- Training history CSV files for each model

---

**Good luck with your thesis! 🎓🇧🇩**