# Indonesian Food Classification Project

## Imports & dependencies
The script begins by importing its main libraries: PyTorch (model, loss, optimizer, DataLoader), numpy, metrics from sklearn, time, os, and RegNetX-400MF from torchvision. It also imports the custom module MakananIndo (the dataset) and check_set_gpu() (a utility that selects the device). Together these provide every building block used in the rest of the script.

## Building the label encoder
This function iterates over the entire dataset (calling dataset[i]) to collect the string labels, then builds a sorted list of unique classes. It returns three objects: label_to_idx (label→index mapping), idx_to_label (index→label), and unique_labels. Note: reading labels through __getitem__ over the whole dataset is fine for small datasets; for large datasets it is more efficient if the dataset exposes a classes or labels attribute directly.
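
The steps described here can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `create_label_encoder` and the assumption that `dataset[i]` returns a tuple with the string label at index 1 are hypothetical, and the class names in the usage note are placeholders.

```python
def create_label_encoder(dataset):
    """Build label<->index mappings from a dataset.

    Assumption (hypothetical): dataset[i] returns a tuple whose second
    element is the string label, e.g. (image, label).
    """
    # Collect every label by indexing the dataset item by item.
    labels = [dataset[i][1] for i in range(len(dataset))]

    # Sorting makes the index assignment deterministic across runs.
    unique_labels = sorted(set(labels))

    label_to_idx = {label: idx for idx, label in enumerate(unique_labels)}
    idx_to_label = {idx: label for label, idx in label_to_idx.items()}
    return label_to_idx, idx_to_label, unique_labels
```

Because the classes are sorted, a toy dataset such as `[(None, "class_b"), (None, "class_a"), (None, "class_b")]` maps `class_a` to 0 and `class_b` to 1 regardless of the order in which samples appear.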

## Initializing the RegNet model
Loads RegNetX-400MF with ImageNet-pretrained weights. When freeze_backbone=True, all pretrained parameters are frozen so that only the final layer is trained. The final layer (classifier) is replaced with Linear(in_features, num_classes) and (in the original code) wrapped with Softmax(dim=1). Important: when using nn.CrossEntropyLoss(), do not put Softmax at the end, because CrossEntropyLoss expects logits (pre-softmax values) and applies log-softmax internally. Keep Softmax for inference only, when probabilities are needed.
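
To illustrate why the extra Softmax hurts training, the sketch below reimplements the per-sample cross-entropy in plain Python (the helper names `softmax` and `cross_entropy` are illustrative, not PyTorch APIs) and compares the loss on raw logits with the loss on already-softmaxed outputs for a confident, correct 5-class prediction.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(inputs, target):
    """Per-sample cross-entropy as nn.CrossEntropyLoss computes it:
    log-softmax applied to the inputs, then the negative log-likelihood."""
    return -math.log(softmax(inputs)[target])


# A confident, correct prediction for class 0 out of 5 classes.
logits = [10.0, 0.0, 0.0, 0.0, 0.0]

# Intended usage: the loss receives raw logits and is close to zero.
loss_on_logits = cross_entropy(logits, target=0)

# With Softmax in the model, the loss receives probabilities, so softmax
# is effectively applied twice and the loss stays above 0.9.
loss_on_probs = cross_entropy(softmax(logits), target=0)
```

Since probabilities lie in [0, 1], the second softmax can never produce a sharply peaked distribution, so the loss has a floor well above zero and the gradients are correspondingly weakened.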

# Model Testing and Prediction Generation

This notebook covers:
1. Loading the best trained RegNet model
2. Processing the test images
3. Generating predictions
4. Saving the results to jawaban.csv

## 1. Import Libraries and Load Model

In [2]:
import os
import torch
import torch.nn as nn
from torchvision.models import regnet_x_400mf, RegNet_X_400MF_Weights
from torchvision import transforms
import pandas as pd
from PIL import Image

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Define the same model architecture
def create_regnet_model(num_classes):
    weights = RegNet_X_400MF_Weights.IMAGENET1K_V1
    model = regnet_x_400mf(weights=weights)
    
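    # Replace the final classification layer.
    # The Sequential(Linear, Softmax) wrapper is kept so the state_dict keys
    # (fc.0.weight, fc.0.bias) match the checkpoint saved during training.
    in_features = model.fc.in_features
    model.fc = nn.Sequential(
        nn.Linear(in_features, num_classes),
        nn.Softmax(dim=1)
    )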
    return model

# Create model with 5 classes (same as training)
model = create_regnet_model(num_classes=5)

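# Load the best model weights.
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine;
# weights_only=True restricts unpickling to tensors and other plain data.
model.load_state_dict(
    torch.load('best_regnet_model.pth', map_location=device, weights_only=True)
)
model = model.to(device)
model.eval()

print("Model loaded successfully!")

Using device: cuda

Model loaded successfully!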


## 2. Load and Preprocess Test Data

In [3]:
# Define the same transforms as used in training
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Read test.csv to get file list in correct order
test_df = pd.read_csv('test.csv')
print(f"Loaded {len(test_df)} test images from test.csv")

# Verify the format
print("\nVerifying test.csv format:")
print("Columns:", list(test_df.columns))
print("\nFirst few entries:")
print(test_df.head())

Loaded 277 test images from test.csv

Verifying test.csv format:
Columns: ['filename', 'label']

First few entries:
   filename  label
0  0001.jpg    NaN
1  0002.jpg    NaN
2  0003.jpg    NaN
3  0004.jpg    NaN
4  0005.jpg    NaN


## 3. Make Predictions with Best Model

In [4]:
# Function to get just the prediction label
def get_prediction(image_path):
    # Load and preprocess image
    image = Image.open(image_path).convert('RGB')
    image_tensor = transform(image).unsqueeze(0).to(device)
    
    # Make prediction
    with torch.no_grad():
        outputs = model(image_tensor)
        _, predicted = torch.max(outputs, 1)
    
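    # idx_to_label is not defined in this notebook; it must be the same
    # index→label mapping that was built during training.
    return idx_to_label[predicted.item()]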

# Make predictions for all test images
predictions = []

print("Making predictions...")
for idx, row in test_df.iterrows():
    image_path = os.path.join('test', row['filename'])
    pred_label = get_prediction(image_path)
    predictions.append(pred_label)
    
    if (idx + 1) % 10 == 0:
        print(f"Processed {idx + 1}/{len(test_df)} images")

Making predictions...


FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Lenovo\\Documents\\Semester 7\\DeepLearn\\test\\0001.jpg'
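
The `FileNotFoundError` above indicates that the `test` folder, resolved relative to the notebook's working directory, does not contain `0001.jpg`. One way to surface this before running inference is a pre-flight check over the filenames from test.csv. The sketch below is a hypothetical helper (`find_missing_files` is not part of the original notebook) and assumes the images sit directly inside a single directory.

```python
from pathlib import Path


def find_missing_files(filenames, image_dir):
    """Return the filenames that do not exist as files inside image_dir."""
    image_dir = Path(image_dir)
    return [name for name in filenames if not (image_dir / name).is_file()]


# Example usage with the notebook's names (kept as comments because it
# depends on test_df and the local folder layout):
# missing = find_missing_files(test_df['filename'], 'test')
# if missing:
#     raise FileNotFoundError(
#         f"{len(missing)} test images not found, e.g. {missing[0]}"
#     )
```

Running this check first reports every missing image at once instead of failing on the first one partway through the loop.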

## 4. Create and Save Results

In [5]:
# Create DataFrame with predictions, maintaining exact order from test.csv
results_df = pd.DataFrame({
    'filename': test_df['filename'],
    'label': predictions
})

# Save to CSV without index
results_df.to_csv('jawaban.csv', index=False)

print("\nPredictions saved to jawaban.csv")
print("\nVerifying jawaban.csv format:")
print("First few lines of jawaban.csv:")
# Read back and display to verify format
verification = pd.read_csv('jawaban.csv')
print(verification.head())
print("\nColumns in jawaban.csv:", list(verification.columns))

ValueError: array length 0 does not match index length 277
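
This `ValueError` follows from the earlier failure: the prediction loop stopped before appending anything, so `predictions` has length 0 while test.csv lists 277 files. A small guard before building the DataFrame makes that cause explicit. The sketch below is a hypothetical helper (`check_predictions_complete` is not in the original notebook) written in plain Python.

```python
def check_predictions_complete(filenames, predictions):
    """Return (filename, prediction) pairs, or raise if any are missing."""
    if len(predictions) != len(filenames):
        raise RuntimeError(
            f"Expected {len(filenames)} predictions but got {len(predictions)}; "
            "re-run the prediction cell after fixing its error."
        )
    # Pairing keeps each prediction attached to its filename, in order.
    return list(zip(filenames, predictions))
```

In the notebook this could be called as `check_predictions_complete(list(test_df['filename']), predictions)` immediately before `results_df` is created.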