# 🌱 Plant Disease Detection - Complete Pipeline

A complete notebook that covers:
1. Setup & Load Data
2. Build & Train Model
3. Evaluate Model
4. Predict New Images

---

## 📦 PART 1: SETUP & INSTALLATION

In [None]:
# Install dependencies (uncomment if they are not installed yet)
# !pip install tensorflow numpy pandas matplotlib scikit-learn seaborn opencv-python pillow

In [None]:
# Import libraries
import os
import json
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

# Import the utils module (make sure plant_disease_utils.py is in the same folder as this notebook)
from plant_disease_utils import (
    PlantDiseaseDataLoader,
    PlantDiseaseModel,
    Visualizer,
    ModelEvaluator,
    PlantDiseasePredictor,
    save_class_names,
    load_class_names,
    print_model_info
)

# Set random seed
np.random.seed(42)
tf.random.set_seed(42)

print("="*60)
print("🌱 PLANT DISEASE DETECTION SYSTEM")
print("="*60)
print(f"✅ TensorFlow version: {tf.__version__}")
print(f"✅ GPU Available: {len(tf.config.list_physical_devices('GPU')) > 0}")
print("="*60)

## 📁 PART 2: DATASET SETUP & LOADING

In [None]:
# Dataset path configuration
DATASET_PATH = "plantvillage_dataset"  # ⚠️ ADJUST to the location of your dataset
TRAIN_PATH = os.path.join(DATASET_PATH, "train")
VAL_PATH = os.path.join(DATASET_PATH, "val")
TEST_PATH = os.path.join(DATASET_PATH, "test")

# Verify that each dataset split exists
print("📁 Checking dataset paths...")
for path_name, path in [("Train", TRAIN_PATH), ("Validation", VAL_PATH), ("Test", TEST_PATH)]:
    if os.path.isdir(path):
        # Count only subdirectories so stray files (e.g. .DS_Store) are not counted as classes
        class_dirs = [d for d in os.listdir(path) if os.path.isdir(os.path.join(path, d))]
        print(f"✅ {path_name}: {path} ({len(class_dirs)} classes)")
    else:
        print(f"❌ {path_name}: {path} NOT FOUND!")
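
The path check only confirms that each split folder exists. A minimal sketch for counting the images in each class follows, assuming the usual one-subfolder-per-class layout (`train/<class_name>/<image files>`). The helper name `count_images_per_class` and the extension list are illustrative and are not part of `plant_disease_utils`.

```python
import os

# Hypothetical helper (not part of plant_disease_utils): count image files
# in each class subfolder of one dataset split.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumed file extensions


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for every subfolder of split_dir."""
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files such as .DS_Store
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling `count_images_per_class(TRAIN_PATH)` returns a dict that can be filtered for small classes, for example those below the roughly 100 images per class suggested in the tips at the end of this notebook.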

In [None]:
# Training configuration
IMG_SIZE = 224
BATCH_SIZE = 32
EPOCHS = 20  # ⚠️ Adjust as needed (more epochs take longer and may improve accuracy until validation stops improving)
LEARNING_RATE = 0.001

print("\n⚙️ Training Configuration:")
print(f"   Image Size: {IMG_SIZE}x{IMG_SIZE}")
print(f"   Batch Size: {BATCH_SIZE}")
print(f"   Epochs: {EPOCHS}")
print(f"   Learning Rate: {LEARNING_RATE}")
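
With a fixed batch size, the number of batches per epoch follows directly from the sample count, which helps when estimating training time. A small sketch follows; the sample count of 10,000 is made up for illustration, and the real value is only known once the dataset is loaded.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


# Illustrative sample count only; in this notebook the real value is train_gen.samples.
print(steps_per_epoch(10_000, 32))  # prints 313
```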

In [None]:
# Load the dataset with augmentation
print("\n🔄 Loading dataset...")
data_loader = PlantDiseaseDataLoader(img_size=IMG_SIZE, batch_size=BATCH_SIZE)

train_gen, val_gen, test_gen = data_loader.create_data_generators(
    TRAIN_PATH, VAL_PATH, TEST_PATH
)

# Get class names
class_names = data_loader.get_class_names(train_gen)
num_classes = len(class_names)

print("\n✅ Dataset loaded successfully!")
print(f"📊 Number of classes: {num_classes}")
print(f"📊 Training samples: {train_gen.samples}")
print(f"📊 Validation samples: {val_gen.samples}")
print(f"📊 Test samples: {test_gen.samples}")

In [None]:
# Show the class names
print("\n📋 CLASS NAMES:")
print("="*60)
for i, class_name in enumerate(class_names, 1):
    print(f"{i:2d}. {class_name}")
print("="*60)

In [None]:
# Visualize the class distribution
visualizer = Visualizer()
visualizer.plot_class_distribution(train_gen, title='Training Set - Class Distribution')
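
The numbers behind such a plot can also be inspected directly. The sketch below assumes `train_gen` is a Keras directory iterator, which exposes integer labels as `.classes` and a name-to-index mapping as `.class_indices`. The notebook does not show the generator type, so treat that as an assumption. The function name and the class names in the demo are hypothetical.

```python
from collections import Counter


def class_distribution(labels, class_indices):
    """Map class name -> number of samples, given integer labels and a name->index dict."""
    index_to_name = {index: name for name, index in class_indices.items()}
    counts = Counter(int(label) for label in labels)
    return {index_to_name[i]: counts.get(i, 0) for i in sorted(index_to_name)}


# Stand-in data; with a Keras directory iterator one would pass
# train_gen.classes and train_gen.class_indices (assumed generator type).
demo = class_distribution([0, 0, 1, 2, 2, 2], {"healthy": 0, "rust": 1, "scab": 2})
print(demo)  # {'healthy': 2, 'rust': 1, 'scab': 3}
```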

In [None]:
# Show sample images
print("\n📷 Sample images from training set:")
data_loader.show_sample_images(train_gen, num_samples=9)

In [None]:
# Save class names and config
save_class_names(class_names, 'class_names.txt')

config = {
    'num_classes': num_classes,
    'img_size': IMG_SIZE,
    'batch_size': BATCH_SIZE,
    'epochs': EPOCHS,
    'learning_rate': LEARNING_RATE,
    'train_samples': train_gen.samples,
    'val_samples': val_gen.samples,
    'test_samples': test_gen.samples
}

with open('config.json', 'w') as f:
    json.dump(config, f, indent=4)
    
print("\n✅ Configuration saved!")
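
The saved `config.json` can be read back later, for example in an inference script, to recover the image size used during training. A minimal sketch follows; the helper `load_config` and its required-key check are illustrative and not part of `plant_disease_utils`.

```python
import json

# Keys this sketch insists on; both are written to config.json by this notebook.
REQUIRED_KEYS = ("num_classes", "img_size")


def load_config(path="config.json"):
    """Read the training config and fail early if an expected key is missing."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return config
```

A script could then pass `load_config()["img_size"]` to the predictor instead of hard-coding 224.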

## 🧠 PART 3: BUILD MODEL

In [None]:
# Initialize the model builder
disease_model = PlantDiseaseModel(num_classes=num_classes, img_size=IMG_SIZE)

# ⚠️ CHOOSE ONE ARCHITECTURE (uncomment the one you want to use):

# Option A: Transfer learning with MobileNetV2 (RECOMMENDED, usually more accurate)
model = disease_model.build_transfer_learning_model()
MODEL_NAME = "transfer_learning_model"

# Option B: CNN from scratch (lighter)
# model = disease_model.build_cnn_model()
# MODEL_NAME = "cnn_scratch_model"

print(f"\n✅ Model architecture: {MODEL_NAME}")

In [None]:
# Inspect the model architecture
print_model_info(model)

In [None]:
# Compile model
disease_model.compile_model(learning_rate=LEARNING_RATE)

print("\n✅ Model compiled successfully!")
print(f"   Optimizer: Adam (lr={LEARNING_RATE})")
print("   Loss: Categorical Crossentropy")
print("   Metrics: Accuracy, Top-3 Accuracy")
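
Top-3 accuracy counts a prediction as correct when the true class is among the three highest-scoring classes. The plain-Python sketch below illustrates the definition on toy data. It is not the metric implementation used by `compile_model`, which the notebook does not show.

```python
def top_k_accuracy(probabilities, true_labels, k=3):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for scores, label in zip(probabilities, true_labels):
        # Class indices sorted from highest to lowest score
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Toy example: 3 samples, 4 classes.
probs = [
    [0.10, 0.60, 0.20, 0.10],  # true class 1 is ranked first
    [0.40, 0.30, 0.20, 0.10],  # true class 3 is ranked last
    [0.25, 0.05, 0.30, 0.40],  # true class 0 is ranked third
]
labels = [1, 3, 0]
top1 = top_k_accuracy(probs, labels, k=1)  # 1 of 3 samples correct
top3 = top_k_accuracy(probs, labels, k=3)  # 2 of 3 samples correct
```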

## 🚀 PART 4: TRAINING MODEL

⚠️ **NOTE**: Training takes a while!

Rough time estimates (actual times depend on hardware, dataset size, and the chosen architecture):
- **With GPU**: 5-15 minutes per epoch
- **Without GPU**: 30-60 minutes per epoch

During training, the pipeline automatically:
- Saves the best model
- Stops early if there is no improvement
- Reduces the learning rate
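
The three automatic behaviours listed above are typically handled in Keras by the `ModelCheckpoint`, `EarlyStopping`, and `ReduceLROnPlateau` callbacks. Whether `PlantDiseaseModel.train` uses them, and with which settings, is not shown in this notebook. The sketch below is a simplified plain-Python simulation of the patience logic only, with assumed patience values, to show how the three interact on a sequence of validation losses.

```python
def simulate_callbacks(val_losses, lr=0.001, stop_patience=5, lr_patience=3, lr_factor=0.5):
    """Replay validation losses and report what patience-based callbacks would do.

    Returns (best_epoch, last_epoch_run, final_lr); epochs are 1-based.
    All patience values and the factor are assumptions for illustration.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_since_best = 0
    epochs_since_lr_cut = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Improvement: this is where the best model would be saved
            best_loss, best_epoch = loss, epoch
            epochs_since_best = 0
            epochs_since_lr_cut = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_cut += 1
            if epochs_since_lr_cut >= lr_patience:
                lr *= lr_factor  # reduce the learning rate on a plateau
                epochs_since_lr_cut = 0
            if epochs_since_best >= stop_patience:
                return best_epoch, epoch, lr  # early stopping
    return best_epoch, len(val_losses), lr


losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.74, 0.73, 0.9]
print(simulate_callbacks(losses))  # (3, 8, 0.0005)
```

In this run the best loss occurs at epoch 3, the learning rate is halved once at epoch 6, and training stops at epoch 8, so the ninth epoch never runs.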

In [None]:
# Set model save path
MODEL_SAVE_PATH = f"best_{MODEL_NAME}.h5"

print("="*60)
print(f"🚀 Starting training for {EPOCHS} epochs...")
print(f"💾 Model will be saved to: {MODEL_SAVE_PATH}")
print("="*60)
print("\n⏳ Training in progress... Please wait...\n")

# Train model
history = disease_model.train(
    train_gen,
    val_gen,
    epochs=EPOCHS,
    model_save_path=MODEL_SAVE_PATH
)

print("\n" + "="*60)
print("✅ TRAINING COMPLETED!")
print("="*60)

## 📊 PART 5: TRAINING RESULTS VISUALIZATION

In [None]:
# Plot training history (accuracy & loss)
visualizer.plot_training_history(history, save_path='training_history.png')

## 🎯 PART 6: MODEL EVALUATION

In [None]:
# Load the best model for evaluation
best_model = keras.models.load_model(MODEL_SAVE_PATH)
print(f"✅ Loaded best model from: {MODEL_SAVE_PATH}")

In [None]:
# Evaluate the model on the test set
evaluator = ModelEvaluator()
y_true, y_pred_classes, test_acc = evaluator.evaluate_model(
    best_model, test_gen, class_names
)

In [None]:
# Plot confusion matrix
visualizer.plot_confusion_matrix(y_true, y_pred_classes, class_names, 
                                 save_path='confusion_matrix.png')
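
To make the plot easier to read, it helps to recall how a confusion matrix is built: rows are true classes, columns are predicted classes, and the diagonal holds the correct predictions. The sketch below uses plain Python and toy labels. The function names are illustrative and do not reflect how `Visualizer` or `ModelEvaluator` compute their results.

```python
def build_confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] = number of samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Recall per class: correct predictions divided by the samples of that class."""
    recalls = []
    for t, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[t] / total if total else 0.0)
    return recalls


# Toy labels for 3 classes; in this notebook the inputs would be y_true and y_pred_classes.
cm = build_confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(cm)                    # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_recall(cm))  # [0.5, 1.0, 0.5]
```

Classes with low recall are the ones the model most often misses, which is usually the first thing to look for in the plot.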

In [None]:
# Plot sample predictions
visualizer.plot_sample_predictions(best_model, test_gen, class_names, 
                                   num_samples=9, save_path='sample_predictions.png')

## 🔮 PART 7: PREDICT NEW IMAGES

Use the model to predict new plant images!

In [None]:
# Initialize the predictor
predictor = PlantDiseasePredictor(
    model_path=MODEL_SAVE_PATH,
    class_names=class_names,
    img_size=IMG_SIZE
)

In [None]:
# Predict a SINGLE IMAGE
# ⚠️ Replace the path with the image you want to predict
test_image_path = "path/to/your/test_image.jpg"

# Predict and visualize
results = predictor.predict_and_visualize(
    test_image_path,
    save_path='prediction_result.png'
)

In [None]:
# Predict MULTIPLE IMAGES (batch)
# ⚠️ Add the images you want to predict to this list
test_images = [
    "path/to/image1.jpg",
    "path/to/image2.jpg",
    "path/to/image3.jpg"
]

# Batch prediction
batch_results = predictor.predict_batch(test_images, visualize=True)

In [None]:
# Show the batch prediction results as a table
import pandas as pd

# Convert to a DataFrame for a tidier display
results_data = []
for result in batch_results:
    img_name = os.path.basename(result['image'])
    top_pred = result['predictions'][0]
    results_data.append({
        'Image': img_name,
        'Predicted Class': top_pred['class'],
        'Confidence (%)': f"{top_pred['confidence']:.2f}"
    })

df_results = pd.DataFrame(results_data)
print("\n📊 BATCH PREDICTION RESULTS:")
print("="*60)
print(df_results.to_string(index=False))
print("="*60)
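
The table code reads each result as a dict with an `image` path and a `predictions` list sorted from most to least likely, where every entry has a `class` and a `confidence`. A hedged sketch of how such a dict could be built from a model's probability vector follows. The function name `decode_top_k`, the class names in the usage note, and the choice to store confidence as a percentage (suggested by the `Confidence (%)` column) are assumptions, not the actual `PlantDiseasePredictor` implementation.

```python
def decode_top_k(image_path, probabilities, class_names, k=3):
    """Build a result dict holding the k most probable classes, confidence in percent."""
    # Class indices sorted from highest to lowest probability
    ranked = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)
    predictions = [
        {"class": class_names[i], "confidence": probabilities[i] * 100.0}
        for i in ranked[:k]
    ]
    return {"image": image_path, "predictions": predictions}
```

For example, `decode_top_k("leaf.jpg", [0.1, 0.7, 0.2], ["healthy", "rust", "scab"], k=2)` yields a result whose first prediction has class `rust`.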

## 💾 PART 8: SAVE RESULTS & MODEL

In [None]:
# Summary of the saved files
saved_files = [
    MODEL_SAVE_PATH,
    'class_names.txt',
    'config.json',
    'training_history.png',
    'confusion_matrix.png',
    'sample_predictions.png'
]

print("\n💾 SAVED FILES:")
print("="*60)
for i, file in enumerate(saved_files, 1):
    if os.path.exists(file):
        size = os.path.getsize(file) / (1024*1024)  # Convert to MB
        print(f"{i}. ✅ {file} ({size:.2f} MB)")
    else:
        print(f"{i}. ❌ {file} (Not found)")
print("="*60)

## 📝 SUMMARY & CONCLUSION

### Training Results:
- ✅ Model trained successfully
- ✅ Best model saved
- ✅ Training history visualization created
- ✅ Confusion matrix created
- ✅ Model ready for prediction

### How to Use the Model:

#### 1. **For Prediction in This Notebook:**
```python
predictor = PlantDiseasePredictor(
    model_path=MODEL_SAVE_PATH,  # e.g. 'best_transfer_learning_model.h5'
    class_names=class_names,
    img_size=IMG_SIZE
)
results = predictor.predict_and_visualize('path/to/image.jpg')
```

#### 2. **For Prediction in Another Python Script:**
```python
from plant_disease_utils import PlantDiseasePredictor, load_class_names

class_names = load_class_names('class_names.txt')
# Use the file name written during training: best_<MODEL_NAME>.h5
predictor = PlantDiseasePredictor('best_transfer_learning_model.h5', class_names)
results = predictor.predict('image.jpg')
```

### Important Files:
- `best_transfer_learning_model.h5` (or `best_cnn_scratch_model.h5` for Option B) - Trained model (the most important file!)
- `class_names.txt` - List of class names
- `config.json` - Training configuration
- `plant_disease_utils.py` - Utility module

### Tips:
1. Do not delete `class_names.txt`; it is required for prediction
2. Use a GPU for faster training
3. Train for more epochs if validation accuracy is still improving
4. Use transfer learning for the best results
5. Aim for at least 100 images per class

---
**🎉 Congratulations! Your plant disease detection model is ready to use!**