# **SIGMOID Modeling and Evaluation Notebook**

## Objectives

- Answer Business Requirement 2: Develop a Machine Learning model to classify cherry leaves as Healthy or Infected, enabling the prediction of powdery mildew presence.

## Inputs

Dataset
- inputs/mildew_dataset/cherry-leaves/train
- inputs/mildew_dataset/cherry-leaves/validation
- inputs/mildew_dataset/cherry-leaves/test

Precomputed Features (from Data Visualization Notebook)
- Image Shape Standardization → 128x128x3 for consistency across models.
- Class Distribution Analysis → Ensures balanced dataset splits.
- Pixel Intensity Distribution → Confirms brightness variations relevant for classification.

## Outputs

### Data Processing & Visualization
- **Dataset Distribution Plot** → Confirms balanced data split across training, validation, and test sets.  
- **Data Augmentation Visualization** → Showcases applied transformations (rotation, flipping, zooming).  

### Model Training & Evaluation
- **Baseline CNN Model** → Initial implementation for benchmarking.  
- **Hyperparameter-Tuned CNN** → Optimized model through manual adjustments.  
- **Best Model Selection** → Chooses the final model based on test accuracy and generalization ability.  
- **Saved Trained Models** → Final model stored for deployment.  

### Model Performance & Explainability
- **Learning Curves** → Visualize loss and accuracy trends over epochs.  
- **Histograms** → Display predicted probability distributions.  
- **Overfitting & Generalization Check** → Assesses potential overfitting using accuracy and loss gaps.  
- **Confusion Matrices** → Show classification performance for the train and test sets.  
- **Classification Reports** → Provide precision, recall, and F1-score analysis.  
- **ROC Curves** → Evaluates model performance using Receiver Operating Characteristic analysis.  
- **Business Goal Validation** → Confirms if the model meets the required accuracy threshold.  

## Additional Comments

- **Business Impact:** Enables early detection of powdery mildew, reducing manual inspection and improving monitoring.  
- **Data-Driven Improvements:** Model refinements were based on data insights, ensuring balanced class distribution.  
- **Deployment:** The optimized model is ready for Streamlit integration for real-world use.  



---

# Set Data Directory

---

## Import Necessary Packages

In [None]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.image import imread

## Set Working Directory

In [None]:
cwd = os.getcwd()

In [None]:
os.chdir('/workspaces/mildew-detector')
print("You set a new current directory")

#### Confirm the new current directory

In [None]:
work_dir = os.getcwd()
work_dir

## Set Input Directories

In [None]:
# Set input directories
my_data_dir = 'inputs/mildew_dataset/cherry-leaves'
train_path = os.path.join(my_data_dir, 'train')
val_path = os.path.join(my_data_dir, 'validation')
test_path = os.path.join(my_data_dir, 'test')

## Set Output Directory

In [None]:
version = 'draft_sigmoid'
file_path = f'outputs/{version}'

if 'outputs' in os.listdir(work_dir) and version in os.listdir(work_dir + '/outputs'):
    print('Old version is already available; create a new version.')
else:
    os.makedirs(name=file_path)

## Set Label Names

In [None]:
# Set the labels for the images (sorted so the order matches Keras' alphabetical class indices)
labels = sorted(os.listdir(train_path))
print('Labels for the images are', labels)

## Set Image Shape

In [None]:
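# Load the image shape saved by the Data Visualization notebook
import joblib
shape_version = 'v1'  # separate name so the output `version` set earlier is not overwritten
image_shape = joblib.load(filename=f"outputs/{shape_version}/image_shape.pkl")
image_shape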

---

## Number of Images in Train, Test and Validation Data

In [None]:
import pandas as pd

# Create an empty dictionary
data = {
    'Set': [],
    'Label': [],
    'Frequency': []
}

# Define dataset folders
folders = ['train', 'validation', 'test']

# Loop through each dataset split and count images
for folder in folders:
    for label in labels:
        path = os.path.join(my_data_dir, folder, label)
        num_images = len(os.listdir(path)) if os.path.exists(path) else 0  
        data['Set'].append(folder)
        data['Label'].append(label)
        data['Frequency'].append(num_images)
        print(f" {folder}/{label}: {num_images} images")

# Convert dictionary to DataFrame
df_freq = pd.DataFrame(data)

### Bar Chart - Image Distribution

In [None]:
sns.set_style("whitegrid")
plt.figure(figsize=(8, 5))
sns.barplot(data=df_freq, x='Set', y='Frequency', hue='Label')
plt.title("Image Distribution in Dataset")
plt.xlabel("Dataset Split")
plt.ylabel("Number of Images")
plt.savefig(f'{file_path}/labels_distribution.png', bbox_inches='tight', dpi=150)
plt.show()

---

# Implement Data Augmentation

---

### ImageDataGenerator

In [None]:
# Import TensorFlow/Keras ImageDataGenerator
from tensorflow.keras.preprocessing.image import ImageDataGenerator

## Augment Training, Validation, and Test Sets

- Initialize ImageDataGenerator for Data Augmentation

In [None]:
# Define Augmentation for Training Set
augmented_image_data = ImageDataGenerator(rotation_range=20,
                                          width_shift_range=0.10,
                                          height_shift_range=0.10,
                                          shear_range=0.1,
                                          zoom_range=0.1,
                                          horizontal_flip=True,
                                          vertical_flip=True,
                                          fill_mode='nearest',
                                          rescale=1./255
                                          )

- Augment Training Image Dataset

In [None]:
batch_size = 32  # Set batch size
train_set = augmented_image_data.flow_from_directory(train_path,
                                                     target_size=image_shape[:2],
                                                     color_mode='rgb',
                                                     batch_size=batch_size,
                                                     class_mode='binary',
                                                     shuffle=True
                                                     )

train_set.class_indices

- Rescale Validation Image Dataset (no augmentation is applied)

In [None]:
validation_set = ImageDataGenerator(rescale=1./255).flow_from_directory(val_path,
                                                                        target_size=image_shape[:2],
                                                                        color_mode='rgb',
                                                                        batch_size=batch_size,
                                                                        class_mode='binary',
                                                                        shuffle=False
                                                                        )

validation_set.class_indices

- Rescale Test Image Dataset (no augmentation is applied)

In [None]:
test_set = ImageDataGenerator(rescale=1./255).flow_from_directory(test_path,
                                                                  target_size=image_shape[:2],
                                                                  color_mode='rgb',
                                                                  batch_size=batch_size,
                                                                  class_mode='binary',
                                                                  shuffle=False
                                                                  )

test_set.class_indices

---

## Visualization of Augmented Images

### Plot Augmented Training Image

In [None]:
for _ in range(3):
    img, label = next(train_set)
    print(img.shape)  # (batch_size, height, width, 3), e.g. (32, 128, 128, 3)
    plt.imshow(img[0])
    plt.show()

### Plot Validation and Test Images (rescaled only)

In [None]:
for _ in range(3):
    img, label = next(validation_set)
    print(img.shape)  # (batch_size, height, width, 3), e.g. (32, 128, 128, 3)
    plt.imshow(img[0])
    plt.show()

In [None]:
for _ in range(3):
    img, label = next(test_set)
    print(img.shape)  # (batch_size, height, width, 3), e.g. (32, 128, 128, 3)
    plt.imshow(img[0])
    plt.show()

### Save Class Indices

In [None]:
joblib.dump(value=train_set.class_indices,
            filename=f"{file_path}/class_indices.pkl")

### Compare Multiple Augmented Images in a Grid

In [None]:
def plot_augmented_images_grid(data_generator, num_images=10):
    """Displays a grid of augmented images to visualize transformation effects."""
    img_batch, label_batch = next(data_generator)

    fig, axes = plt.subplots(2, num_images // 2, figsize=(15, 6))
    
    for i in range(num_images):
        ax = axes[i // (num_images // 2), i % (num_images // 2)]
        ax.imshow(img_batch[i])
        ax.axis("off")

    plt.suptitle("Augmented Image Variations (Training Set)")
    plt.show()

# Display the augmented image grid
plot_augmented_images_grid(train_set)

---

# Model Creation

---

### Import Libraries

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Conv2D,
    MaxPooling2D,
    Flatten,
    Dense,
    Dropout,
    BatchNormalization,
    Input,
)
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adagrad

## Convolutional Neural Network with Sigmoid

In [None]:
# Define L2 regularization strength
l2_lambda = 0.005  

# Create Sigmoid CNN Model with Fine-Tuned Regularization
model_sigmoid = Sequential(
    [
        Input(shape=(128, 128, 3)),
        Conv2D(16, (3, 3), activation="relu", kernel_regularizer=l2(l2_lambda)),
        MaxPooling2D((2, 2)),
        Conv2D(32, (3, 3), activation="relu", kernel_regularizer=l2(l2_lambda)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation="relu", kernel_regularizer=l2(l2_lambda)),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation="relu", kernel_regularizer=l2(l2_lambda)),
        Dropout(0.2),  
        Dense(1, activation="sigmoid"),  # Sigmoid for binary classification
    ]
)

# Compile Model
model_sigmoid.compile(
    optimizer=Adam(learning_rate=0.001),  
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
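
As a rough illustration of what the `sigmoid` output unit and the `binary_crossentropy` loss compute for a single image, the sketch below re-implements both in plain Python. The function names, the clipping constant `eps`, and the example score of 2.0 are assumptions made for illustration; Keras additionally averages over the batch and adds the L2 penalty terms on top.

```python
import math


def sigmoid(z):
    """Squash a raw score (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one sample: -[y * log(p) + (1 - y) * log(1 - p)]."""
    p = min(max(p, eps), 1.0 - eps)  # keep p away from 0 and 1 to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


# Hypothetical raw score produced by the last Dense layer before the sigmoid
p = sigmoid(2.0)

print(f"probability of class 1: {p:.4f}")
print(f"loss if the true label is 1: {binary_crossentropy(1, p):.4f}")
print(f"loss if the true label is 0: {binary_crossentropy(0, p):.4f}")
```

A confident prediction is cheap when it is right and expensive when it is wrong, which is what pushes the network toward well-separated probabilities.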

### Model Summary 

In [None]:
# Print model summary
model_sigmoid.summary()
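
To sanity-check the numbers in the summary, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used in the model definition (3x3 convolutions with `valid` padding and stride 1, 2x2 max pooling) and a 128x128x3 input; the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


size, channels = 128, 3
total_params = 0

# Three Conv2D + MaxPooling2D stages with 16, 32 and 64 filters
for filters in (16, 32, 64):
    total_params += (3 * 3 * channels + 1) * filters  # kernel weights + one bias per filter
    size = pool_out(conv_out(size))
    channels = filters

flat = size * size * channels        # length of the Flatten output
total_params += (flat + 1) * 128     # Dense(128): weights + biases
total_params += (128 + 1) * 1        # Dense(1): weights + bias

print("final feature map:", size, "x", size, "x", channels)
print("flattened length:", flat)
print("trainable parameters:", total_params)
```

Almost all parameters sit in the first dense layer, so the Flatten size is the first thing to look at if the model needs to shrink.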

In [None]:
# Save model summary to a text file
# (`model` is not defined until the saved model is loaded later, so use `model_sigmoid`)
with open("outputs/draft_sigmoid/model_summary.txt", "w") as f:
    model_sigmoid.summary(print_fn=lambda x: f.write(x + "\n"))

---

## Model Training

### Early Stopping Implementation

In [None]:
# Import required callbacks
from tensorflow.keras.callbacks import EarlyStopping

# Set EarlyStopping callback
early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
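
The patience rule can be illustrated without Keras. The sketch below is a simplified stand-in for the callback's logic, not its actual implementation: it tracks the best validation loss and stops once `patience` consecutive epochs fail to improve on it. The function name and the loss values are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_index, best_index) for a list of per-epoch validation losses."""
    best_loss = float("inf")
    best_index = -1
    wait = 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best epoch: remember it and reset the patience counter
            best_loss, best_index, wait = loss, i, 0
        else:
            wait += 1
            if wait >= patience:
                return i, best_index
    # Patience never ran out: training ends at the last epoch
    return len(val_losses) - 1, best_index


# Hypothetical validation losses, one per epoch
losses = [0.60, 0.45, 0.40, 0.42, 0.41, 0.43, 0.44, 0.45, 0.39]
stop, best = early_stopping_epoch(losses, patience=5)

print(f"training stops after epoch index {stop}")
print(f"weights would be restored from epoch index {best}")
```

In this example the run stops before reaching the final, lower loss, which is the trade-off a small `patience` makes.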

### Fit CNN Model for Training

In [None]:
# Train the base CNN model
history_sigmoid = model_sigmoid.fit(
    train_set,
    epochs=20,
    steps_per_epoch=len(train_set.classes) // batch_size,
    validation_data=validation_set,
    callbacks=[early_stop],
    verbose=1
)
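
The `steps_per_epoch` expression uses floor division, so a final partial batch is not counted. The sketch below compares floor and ceiling division for a hypothetical training set of 1000 images; the function name and the sample count are assumptions, not values taken from this dataset.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_partial=True):
    """Number of batches per epoch, with or without the final partial batch."""
    if drop_partial:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


num_samples = 1000  # hypothetical number of training images
batch_size = 32

floor_steps = steps_per_epoch(num_samples, batch_size)
ceil_steps = steps_per_epoch(num_samples, batch_size, drop_partial=False)
skipped = num_samples - floor_steps * batch_size

print("steps with floor division:", floor_steps)
print("steps with ceiling division:", ceil_steps)
print("images not covered per epoch with floor division:", skipped)
```

With a shuffled generator the uncovered images are unlikely to be the same ones every epoch, so the effect on training is usually small.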

### Save the Best Model

In [None]:
# Save the trained base CNN model
model_sigmoid.save("outputs/draft_sigmoid/mildew_detector_sigmoid.h5")

---

# Model Performance & Evaluation

---

### Import Packages

In [None]:
import sklearn
import sklearn.metrics as metrics
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    f1_score,
    accuracy_score,
)

### Load Saved Model

In [None]:
from tensorflow.keras.models import load_model

model = load_model("outputs/draft_sigmoid/mildew_detector_sigmoid.h5")

## Model Evaluation

In [None]:
# Evaluate on the test set (the generator already yields batches, so batch_size is not passed)
evaluation = model.evaluate(test_set)
print("Model accuracy: {:.2f}%".format(evaluation[1] * 100))
print("Model loss: {:.4f}".format(evaluation[0]))

## Set Accuracy Variables

In [None]:
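# True labels, in the same order as the unshuffled test generator
y_true = test_set.labels

# Obtain model predictions: one sigmoid probability per image, shape (N, 1)
preds = model.predict(test_set)
# Threshold at 0.5; np.argmax over a single column would always return 0
y_pred = (preds.ravel() > 0.5).astype(int)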
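
A single-unit sigmoid head returns one probability per image, so class labels come from comparing that probability with a threshold; taking an argmax across the single column would always give index 0. The plain-Python sketch below shows the difference on made-up probabilities. The `argmax` helper mirrors what `np.argmax(preds, axis=1)` does for each row.

```python
def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical sigmoid outputs: one probability per image, shape (N, 1)
preds = [[0.03], [0.48], [0.51], [0.97]]

by_argmax = [argmax(row) for row in preds]           # only one column, so always 0
by_threshold = [int(row[0] > 0.5) for row in preds]  # 1 when the probability exceeds 0.5

print("argmax labels:   ", by_argmax)
print("threshold labels:", by_threshold)
```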

## Save Training History

In [None]:
df_history_sigmoid = pd.DataFrame(history_sigmoid.history)
df_history_sigmoid.to_csv("outputs/draft_sigmoid/history_sigmoid.csv", index=False)
print("Sigmoid CNN training history saved.")

## Learning Curves

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

output_dir = "outputs/draft_sigmoid"

# Set Seaborn style
sns.set_style("whitegrid")

# Loss Curve
df_history_sigmoid[["loss", "val_loss"]].plot(style=".-")
plt.title("Loss - Sigmoid CNN")
plt.legend(["Training Loss", "Validation Loss"])
plt.grid(True)
plt.tight_layout()
plt.savefig(
    "outputs/draft_sigmoid/model_training_losses.png", bbox_inches="tight", dpi=150
)
plt.show()

# Accuracy Curve
df_history_sigmoid[["accuracy", "val_accuracy"]].plot(style=".-")
plt.title("Accuracy - Sigmoid CNN")
plt.legend(["Training Accuracy", "Validation Accuracy"])
plt.grid(True)
plt.tight_layout()
plt.savefig(
    "outputs/draft_sigmoid/model_training_acc.png", bbox_inches="tight", dpi=150
)
plt.show()

## Histograms

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

output_dir = "outputs/draft_sigmoid"

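# Get predicted probabilities (validation_set is unshuffled, so the order matches .classes)
y_pred_probs = model_sigmoid.predict(validation_set).ravel()
y_true_val = validation_set.classes

# Plot one histogram per true class (assumes class index 0 = Healthy, 1 = Infected)
plt.figure(figsize=(5, 4))
sns.set_style("whitegrid")
sns.histplot(
    y_pred_probs[y_true_val == 0], bins=20, kde=True, color="green", alpha=0.6, label="Healthy"
)
sns.histplot(
    y_pred_probs[y_true_val == 1], bins=20, kde=True, color="blue", alpha=0.6, label="Infected"
)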

plt.axvline(x=0.5, color="red", linestyle="dashed", label="Threshold = 0.5")
plt.title("Prediction Probability Histogram")
plt.xlabel("Predicted Probability")
plt.ylabel("Frequency")
plt.legend()

# Save figure
plt.savefig("outputs/draft_sigmoid/histogram_test.png", bbox_inches="tight", dpi=150)
plt.show()

## Overfitting & Generalization Check

In [None]:
# Extract the last recorded training & validation metrics
train_acc = history_sigmoid.history["accuracy"][-1]
val_acc = history_sigmoid.history["val_accuracy"][-1]
train_loss = history_sigmoid.history["loss"][-1]
val_loss = history_sigmoid.history["val_loss"][-1]

# Compute Generalization Gap
accuracy_gap = train_acc - val_acc
loss_gap = val_loss - train_loss

print("\n### Generalization & Overfitting Check ###")
print(f"Final Train Accuracy: {train_acc:.4f}")
print(f"Final Validation Accuracy: {val_acc:.4f}")
print(f"Accuracy Gap: {accuracy_gap:.4f}")

print(f"Final Train Loss: {train_loss:.4f}")
print(f"Final Validation Loss: {val_loss:.4f}")
print(f"Loss Gap: {loss_gap:.4f}")

# Overfitting Analysis
overfit_accuracy = accuracy_gap > 0.05
overfit_loss = loss_gap > 0.05

if overfit_accuracy:
    print(
        "\nOverfitting detected: The model performs significantly better on training data than validation data."
    )

if overfit_loss:
    print(
        "\nOverfitting detected: Validation loss is significantly higher than training loss."
    )

# Covers every remaining case, including a gap of exactly 0.05
if not (overfit_accuracy or overfit_loss):
    print("\nNo significant overfitting detected. Model generalizes well.")
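
Because early stopping is configured with `restore_best_weights=True`, the saved weights may come from an earlier epoch than the last one recorded in the history. As a hedged alternative, the sketch below computes the same gaps at the epoch with the lowest validation loss. The function name and the history values are hypothetical.

```python
def gaps_at_best_epoch(history):
    """Return (epoch_index, accuracy_gap, loss_gap) at the epoch with the lowest val_loss."""
    val_loss = history["val_loss"]
    best = min(range(len(val_loss)), key=val_loss.__getitem__)
    accuracy_gap = history["accuracy"][best] - history["val_accuracy"][best]
    loss_gap = val_loss[best] - history["loss"][best]
    return best, accuracy_gap, loss_gap


# Hypothetical training history with the same keys Keras records
history = {
    "accuracy": [0.80, 0.92, 0.97, 0.99],
    "val_accuracy": [0.82, 0.93, 0.95, 0.94],
    "loss": [0.60, 0.30, 0.15, 0.08],
    "val_loss": [0.55, 0.28, 0.20, 0.26],
}

best, accuracy_gap, loss_gap = gaps_at_best_epoch(history)
print(f"best epoch index: {best}")
print(f"accuracy gap: {accuracy_gap:.4f}")
print(f"loss gap: {loss_gap:.4f}")
```

With a real run, `history_sigmoid.history` could be passed in place of the example dictionary.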

## Confusion Matrix

In [None]:
from sklearn.metrics import confusion_matrix
import os
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

output_dir = "outputs/draft_sigmoid"

# Get Class Labels
label_map = list(test_set.class_indices.keys())

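# Evaluate Model on Train and Test Sets
# train_set shuffles its batches, so its predictions would not line up with
# train_set.classes; use an unshuffled, rescale-only generator for evaluation.
train_eval_set = ImageDataGenerator(rescale=1./255).flow_from_directory(
    train_path,
    target_size=image_shape[:2],
    color_mode='rgb',
    batch_size=batch_size,
    class_mode='binary',
    shuffle=False,
)
y_true_train = train_eval_set.classes
y_pred_train = (model_sigmoid.predict(train_eval_set) > 0.5).astype(int).ravel()

y_true_test = test_set.classes
y_pred_test = (model_sigmoid.predict(test_set) > 0.5).astype(int).ravel()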

# Generate Confusion Matrices
cm_train = confusion_matrix(y_true_train, y_pred_train)
cm_test = confusion_matrix(y_true_test, y_pred_test)

# Plot Confusion Matrices Side by Side
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

sns.heatmap(
    pd.DataFrame(cm_train, index=label_map, columns=label_map),
    annot=True,
    fmt="d",
    cmap="Blues",
    linewidths=0.5,
    ax=axes[0],
)
axes[0].set_title("Confusion Matrix - Train Set")
axes[0].set_xlabel("Predicted Label")
axes[0].set_ylabel("True Label")

sns.heatmap(
    pd.DataFrame(cm_test, index=label_map, columns=label_map),
    annot=True,
    fmt="d",
    cmap="Blues",
    linewidths=0.5,
    ax=axes[1],
)
axes[1].set_title("Confusion Matrix - Test Set")
axes[1].set_xlabel("Predicted Label")
axes[1].set_ylabel("True Label")

plt.tight_layout()

# Save Figure
save_path = os.path.join(output_dir, "confusion_matrices_train_test.png")
plt.savefig(save_path, dpi=150)
plt.show()

print(f"Confusion Matrices saved at: {save_path}")
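
To make the layout of the heatmaps concrete, the sketch below builds a 2x2 confusion matrix by hand from true labels and sigmoid probabilities, using the same convention as scikit-learn (rows are true classes, columns are predicted classes). The function name and the data are invented for the example.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return [[TN, FP], [FN, TP]]: rows are true classes, columns are predicted classes."""
    matrix = [[0, 0], [0, 0]]
    for truth, prob in zip(y_true, y_prob):
        predicted = int(prob > threshold)
        matrix[truth][predicted] += 1
    return matrix


# Hypothetical labels (0 or 1) and predicted probabilities of class 1
y_true_demo = [0, 0, 0, 1, 1, 1]
y_prob_demo = [0.10, 0.40, 0.70, 0.20, 0.80, 0.90]

cm_demo = confusion_counts(y_true_demo, y_prob_demo)
print("TN, FP:", cm_demo[0])
print("FN, TP:", cm_demo[1])
```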

## Classification Reports

In [None]:
from sklearn.metrics import classification_report
import os


output_dir = "outputs/draft_sigmoid"

# Generate classification reports as text
report_train = classification_report(
    y_true_train, y_pred_train, target_names=label_map, digits=3, zero_division=1
)
report_test = classification_report(
    y_true_test, y_pred_test, target_names=label_map, digits=3, zero_division=1
)
# Print side by side
print("\n### Classification Reports (Train vs Test) ###\n")
train_lines = report_train.split("\n")
test_lines = report_test.split("\n")

# Align Train and Test reports side by side
for train_line, test_line in zip(train_lines, test_lines):
    print(f"{train_line:<40} | {test_line}")

# Save reports as text files
with open(f"{output_dir}/classification_report_train.txt", "w") as f:
    f.write(report_train)

with open(f"{output_dir}/classification_report_test.txt", "w") as f:
    f.write(report_test)

print("\nReports saved to outputs/draft_sigmoid/")
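
The figures in a classification report follow directly from confusion-matrix counts. The sketch below computes precision, recall, and F1 for one class from hypothetical counts. The function name is an assumption, and empty denominators return 0.0 here, unlike the `zero_division=1` setting used in the cell above.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts for the "Infected" class
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")
print(f"f1-score:  {f1:.3f}")
```

For mildew screening, recall on the infected class is typically the figure to watch, since a missed infection costs more than a false alarm.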

## ROC Curves

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
import os

output_dir = "outputs/draft_sigmoid"

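# Generate predictions (probabilities)
# train_set is shuffled, so use an unshuffled, rescale-only generator
# to keep training predictions aligned with their labels.
roc_train_set = ImageDataGenerator(rescale=1./255).flow_from_directory(
    train_path, target_size=image_shape[:2], color_mode='rgb',
    batch_size=batch_size, class_mode='binary', shuffle=False,
)
y_true_train = roc_train_set.classes
y_probs_train = model_sigmoid.predict(roc_train_set).ravel()
y_true_test = test_set.classes
y_probs_test = model_sigmoid.predict(test_set).ravel()

# Compute ROC curves from the single sigmoid probability per image
fpr_train, tpr_train, _ = roc_curve(y_true_train, y_probs_train)
fpr_test, tpr_test, _ = roc_curve(y_true_test, y_probs_test)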

auc_train = auc(fpr_train, tpr_train)
auc_test = auc(fpr_test, tpr_test)

# Plot ROC Curves
plt.figure(figsize=(6, 5))
plt.plot(fpr_train, tpr_train, label=f"Train AUC = {auc_train:.2f}")
plt.plot(fpr_test, tpr_test, label=f"Test AUC = {auc_test:.2f}")
plt.plot([0, 1], [0, 1], "k--", label="Random (AUC = 0.50)")

plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve - Train vs Test")
plt.legend(loc="lower right")

# Save figure
roc_curve_path = os.path.join(output_dir, "roc_curve.png")
plt.savefig(roc_curve_path, dpi=150)
plt.show()

print(f"ROC Curve saved at: {roc_curve_path}")
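
One way to read the AUC value is as the probability that a randomly chosen infected leaf receives a higher score than a randomly chosen healthy one. The sketch below computes AUC through that pairwise definition, which matches the area under the ROC curve. It is quadratic in the number of samples and meant only as an illustration; the function name and data are hypothetical.

```python
def auc_from_scores(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities of class 1
y_true_demo = [0, 0, 0, 1, 1, 1]
y_score_demo = [0.10, 0.40, 0.70, 0.20, 0.80, 0.90]

auc_demo = auc_from_scores(y_true_demo, y_score_demo)
print(f"AUC: {auc_demo:.4f}")
```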

## Save Final Evaluation Results

In [None]:
import joblib
import os
from sklearn.metrics import classification_report, confusion_matrix

output_dir = "outputs/draft_sigmoid"

# Evaluate the model on the test set (the generator already yields batches)
test_loss, test_accuracy = model.evaluate(test_set)
print(f"Test Accuracy: {test_accuracy:.4f}")
print(f"Test Loss: {test_loss:.4f}")

# Get True Labels & Predictions
y_true = test_set.classes
y_pred_probs = model.predict(test_set)
# Threshold the sigmoid output; np.argmax over a single column would always give 0
y_pred = (y_pred_probs.ravel() > 0.5).astype(int)

# Save evaluation results
evaluation_results = {
    "test_loss": test_loss,
    "test_accuracy": test_accuracy,
    "classification_report": classification_report(
        y_true, y_pred, target_names=label_map, output_dict=True
    ),
    "confusion_matrix": confusion_matrix(y_true, y_pred),
}

# Save to pickle file
joblib.dump(evaluation_results, "outputs/draft_sigmoid/evaluation.pkl")
print("Evaluation results saved: outputs/draft_sigmoid/evaluation.pkl")

## Business Goal Validation

In [None]:
# Define minimum required accuracy
accuracy_threshold = 0.90

# Load evaluation results
evaluation_results = joblib.load("outputs/draft_sigmoid/evaluation.pkl")

# Extract final test accuracy
test_accuracy = evaluation_results["test_accuracy"]

# Check requirement
if test_accuracy >= accuracy_threshold:
    print(f"Model meets the business requirement! (Accuracy: {test_accuracy:.2%})")
else:
    print(f"Model does NOT meet the requirement. (Accuracy: {test_accuracy:.2%})")

---

# Predict on New Images

---

## Load the Final Model

In [None]:
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

# Load the Model
model = load_model("outputs/draft_sigmoid/mildew_detector_sigmoid.h5")

### Select and Load a Random Test Image

In [None]:
# Define test image selection parameters
pointer = 60  # Change this number to select a different image
label = labels[1]  # labels[0] or labels[1]; check the printed labels to see which class each is

# Load the image using PIL
label_dir = os.path.join(test_path, label)
img_path = os.path.join(label_dir, os.listdir(label_dir)[pointer])
pil_image = image.load_img(img_path, target_size=image_shape[:2], color_mode="rgb")

# Display image details
print(f"Selected Image Path: {img_path}")
print(f"Image size (width, height): {pil_image.size}, Image mode: {pil_image.mode}")

# Show the image
pil_image

### Convert Image to Array and Prepare for Model Input

In [None]:
my_image = image.img_to_array(pil_image)
my_image = np.expand_dims(my_image, axis=0) / 255.0  # Normalize pixel values
print(my_image.shape)

### Make Prediction & Display Result

In [None]:
# Predict class probabilities
pred_proba = model.predict(my_image)[0, 0]  # Extract single probability score

# Map indices to class labels
target_map = {v: k for k, v in train_set.class_indices.items()}  # Reverse mapping
pred_class = target_map[int(pred_proba > 0.5)]  # Ensure correct label mapping

# Adjust probability if necessary
if pred_class == target_map[0]:
    pred_proba = 1 - pred_proba

# Print prediction results
print(f"Predicted Class: {pred_class}")
print(f"Prediction Probability: {pred_proba:.4f}")

# Save the image to outputs/draft_sigmoid for the PDF report
os.makedirs("outputs/draft_sigmoid", exist_ok=True)
pil_image.save("outputs/draft_sigmoid/selected_test_image.png")

# Save prediction results as a text file
with open("outputs/draft_sigmoid/prediction_result.txt", "w") as f:
    f.write(f"Predicted Class: {pred_class}\n")
    f.write(f"Prediction Probability: {pred_proba:.4f}\n")
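
The mapping from a sigmoid output to a label and a confidence can be wrapped in a small helper. The sketch below is one possible version: the function name is hypothetical, and the folder names in `class_indices` are assumed, so the real mapping should be taken from `train_set.class_indices` or the saved `class_indices.pkl`.

```python
def interpret_sigmoid(prob_positive, class_indices, threshold=0.5):
    """Map a sigmoid output to (class_name, confidence in that class)."""
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = int(prob_positive > threshold)
    # The sigmoid output is the probability of class 1, so class 0 gets the complement
    confidence = prob_positive if predicted_index == 1 else 1.0 - prob_positive
    return index_to_name[predicted_index], confidence


# Assumed folder names; replace with the real class_indices mapping
class_indices = {"healthy": 0, "powdery_mildew": 1}

for prob in (0.92, 0.20):
    name, confidence = interpret_sigmoid(prob, class_indices)
    print(f"sigmoid output {prob:.2f} -> {name} (confidence {confidence:.2f})")
```

Keeping this logic in one function helps the dashboard and the notebook report the same label and probability for the same image.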

---

# Conclusion and Next Steps

---

We successfully developed a deep learning model for image classification using a structured, beginner-friendly approach.  

### **Key Achievements**
- **Baseline & Optimized CNNs** → Established a benchmark model and improved it through manual hyperparameter tuning.  
- **Comprehensive Evaluation** → Assessed performance using accuracy, loss, confusion matrices, and ROC curves.  
- **Model Explainability** → Utilized evaluation metrics to understand predictions and ensure reliability.  
- **Final Model Selection** → Chose the best-performing model for deployment.  

### **Next Steps: Model Deployment**
- **Web App Integration** → Implement a user-friendly Streamlit interface for real-time image classification.  
- **Model Deployment** → Load the trained model and deploy it on a cloud platform for accessibility.  

This deployment will enable efficient real-world usage, making automated classification accessible to users.  