<div style="border-left: 6px solid #7B61FF; color:white; padding:20px; border-radius:10px; font-family:Arial, sans-serif; text-align:center; font-size:28px; font-weight:bold;">
  🧱 02 – Baseline Model
</div>

<div style="border-left: 6px solid #27ae60; color:white; margin-left:40px; padding:10px; border-radius:10px; font-family:Arial, sans-serif; font-size:24px;">
  <h2 style="margin-top: 0; font-size:24px;">📦 Import Libraries and Define Paths</h2>
</div>

<div style="margin-left:60px; padding:10px;"> 
  <p style="font-size:18px;">This is the setup block for the baseline model of the rare species image classification project.</p>

  <p>In this section, we perform the following tasks:</p>

  <ul style="line-height: 1.6;">
    <li>📁 <strong>Import libraries</strong> for data manipulation (<code>pandas</code>, <code>numpy</code>), file paths (<code>pathlib</code>), plotting (<code>matplotlib</code>, <code>seaborn</code>), evaluation (<code>scikit-learn</code>), and model building (<code>tensorflow.keras</code>).</li>
    <li>📂 <strong>Define the main project directories</strong>, including the processed train, validation, and test image folders and the output folders for models, reports, logs, and predictions.</li>
  </ul>

  <p>This setup provides a consistent foundation for loading the dataset and saving every artifact the baseline produces.</p>
</div>


In [None]:
# ========================================== 📦 Import essential libraries ==========================================
import os
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.callbacks import CSVLogger
from pathlib import Path

In [25]:
# ========================================== 📂 Define the main project paths ==========================================
PROJECT_ROOT = Path().resolve().parent

PROCESSED_DIR = PROJECT_ROOT / 'data' / 'processed'
MODELS_DIR = PROJECT_ROOT / 'models'
REPORTS_DIR = PROJECT_ROOT / 'reports'
OUTPUTS_DIR = PROJECT_ROOT / 'output'
LOGS_DIR = OUTPUTS_DIR / 'logs'
PREDICTIONS_DIR = OUTPUTS_DIR / 'predictions'
TRAIN_DIR = PROCESSED_DIR / 'train'
VAL_DIR = PROCESSED_DIR / 'val'
TEST_DIR = PROCESSED_DIR / 'test'
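
<div style="margin-left:60px; padding:10px;">
  <p>The cell above only builds the paths; it does not check that they exist. A minimal sketch of such a check follows, assuming a hypothetical helper named <code>find_missing_dirs</code> (it is not part of the project). The demonstration runs on a temporary folder tree so it does not depend on the real dataset.</p>
</div>

```python
from pathlib import Path
import tempfile


def find_missing_dirs(required_dirs):
    """Return the directories from `required_dirs` that do not exist on disk."""
    return [Path(d) for d in required_dirs if not Path(d).is_dir()]


# Demonstration on a throwaway tree that mimics data/processed/{train,val,test}.
with tempfile.TemporaryDirectory() as tmp:
    processed = Path(tmp) / "data" / "processed"
    for split in ("train", "val"):  # "test" is left out on purpose
        (processed / split).mkdir(parents=True)

    missing = find_missing_dirs(processed / s for s in ("train", "val", "test"))
    missing_names = [p.name for p in missing]
    print("Missing:", missing_names)
```

<div style="margin-left:60px; padding:10px;">
  <p>In the notebook, the same helper could be called with <code>[TRAIN_DIR, VAL_DIR, TEST_DIR]</code>, raising a <code>FileNotFoundError</code> when the returned list is not empty.</p>
</div>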

<div style="border-left: 6px solid #27ae60; color:white; margin-left:40px; padding:10px; border-radius:10px; font-family:Arial, sans-serif; font-size:24px;">
  <h2 style="margin-top: 0; font-size:24px;">📦 Define Parameters</h2>
</div>

<div style="margin-left:60px; padding:10px;"> 
  <p>In this section, we define the core parameters that guide the training process: the input image size, the batch size, and the maximum number of training epochs. The dataset directories were already defined in the previous cell.</p>
  
  <p>Setting these values early ensures consistency across all steps and allows for easier adjustments when experimenting with different model architectures or datasets.</p>
</div>


In [26]:
IMAGE_SIZE = (224, 224)
BATCH_SIZE = 32
EPOCHS = 10

<div style="border-left: 6px solid #27ae60; color:white; margin-left:40px; padding:10px; border-radius:10px; font-family:Arial, sans-serif; font-size:24px;">
  <h2 style="margin-top: 0; font-size:24px;">📦 Simple CNN Model</h2>
</div>

<div style="margin-left:60px; padding:10px;"> 
  <p>This block defines a basic Convolutional Neural Network (CNN) architecture used as a starting point for image classification.</p>

  <p>The model is built from three convolutional layers, each followed by max pooling, then a flatten layer, a dense layer with ReLU activation, dropout for regularization, and a softmax output layer. This structure is intentionally simple and serves as a reference baseline for comparing the performance of more complex models.</p>

  <p>All outputs (the trained model, training logs, accuracy plot, confusion matrices, predictions, and classification report) are saved automatically in their respective folders: <code>/models</code>, <code>/output</code>, <code>/reports</code>, and <code>/reports/figures</code>.</p>
</div>
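
<div style="margin-left:60px; padding:10px;">
  <p>To get a feel for the size of this architecture, the sketch below counts its trainable parameters by hand, without TensorFlow. It assumes what the model definition in this notebook implies: 224×224 RGB inputs, 3×3 convolutions with no padding, 2×2 max pooling, and 202 output classes. The function name <code>baseline_param_counts</code> is illustrative only.</p>
</div>

```python
def baseline_param_counts(image_size=224, channels=3, filters=(32, 64, 128),
                          dense_units=128, num_classes=202):
    """Return (flattened_size, per-layer parameter counts) for the baseline CNN."""
    counts = {}
    size, in_ch = image_size, channels
    for i, out_ch in enumerate(filters, start=1):
        # 3x3 kernel over in_ch channels, plus one bias per filter.
        counts[f"conv{i}"] = (3 * 3 * in_ch + 1) * out_ch
        size = size - 3 + 1   # 3x3 convolution without padding
        size = size // 2      # 2x2 max pooling (rounds down)
        in_ch = out_ch
    flat = size * size * in_ch
    counts["dense"] = (flat + 1) * dense_units
    counts["output"] = (dense_units + 1) * num_classes
    return flat, counts


flat, counts = baseline_param_counts()
total = sum(counts.values())
print("Flattened features:", flat)
print("Parameters per layer:", counts)
print("Total parameters:", total)
print(f"Share held by the first dense layer: {counts['dense'] / total:.1%}")
```

<div style="margin-left:60px; padding:10px;">
  <p>Almost all of the weights sit in the dense layer that follows <code>Flatten</code>, which is worth keeping in mind when comparing training and validation accuracy on a dataset of this size.</p>
</div>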


In [31]:
def run_cnn_pipeline(train_dir, val_dir, test_dir, model_name="cnn_baseline", image_size=IMAGE_SIZE, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Train the baseline CNN, evaluate it, and save the model, logs, figures, and reports.

    Returns a dict with the paths of the saved artifacts and the validation accuracy.
    """
    # Make sure every output folder exists before anything is written.
    models_dir = MODELS_DIR
    logs_dir = LOGS_DIR
    predictions_dir = PREDICTIONS_DIR
    reports_dir = REPORTS_DIR
    figures_dir = REPORTS_DIR / "figures"
    for d in [models_dir, logs_dir, predictions_dir, figures_dir, reports_dir]:
        d.mkdir(parents=True, exist_ok=True)

    # Only rescale pixels to [0, 1]; the baseline uses no augmentation.
    datagen = ImageDataGenerator(rescale=1./255)
    train_generator = datagen.flow_from_directory(train_dir, target_size=image_size, batch_size=batch_size, class_mode='categorical')
    val_generator = datagen.flow_from_directory(val_dir, target_size=image_size, batch_size=batch_size, class_mode='categorical')
    # shuffle=False keeps the test order fixed so predictions line up with test_generator.classes.
    test_generator = datagen.flow_from_directory(test_dir, target_size=image_size, batch_size=batch_size, class_mode='categorical', shuffle=False)

    num_classes = train_generator.num_classes

    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(image_size[0], image_size[1], 3)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Conv2D(128, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dropout(0.3),
        Dense(num_classes, activation='softmax')
    ])

    optimizer = Adam(learning_rate=0.0005)
    model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])

    early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True, verbose=1)
    reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3, verbose=1)
    csv_logger = CSVLogger(logs_dir / f"{model_name}_training_log.csv", append=False)

    history = model.fit(train_generator, validation_data=val_generator, epochs=epochs,
                        callbacks=[csv_logger, early_stop, reduce_lr])

    model_path = models_dir / f"{model_name}.h5"
    model_path_weights = models_dir / f"{model_name}.weights.h5"
    model.save(model_path)
    model.save_weights(model_path_weights)

    val_loss, val_acc = model.evaluate(val_generator)

    acc_fig_path = figures_dir / f"{model_name}_accuracy_plot.png"
    plt.figure(figsize=(8, 5))
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Val Accuracy')
    plt.title('Training vs Validation Accuracy')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.savefig(acc_fig_path)
    plt.close()

    predictions = model.predict(test_generator)
    predicted_classes = predictions.argmax(axis=1)
    true_classes = test_generator.classes
    class_indices = test_generator.class_indices
    inv_class_indices = {v: k for k, v in class_indices.items()}
    predicted_labels = [inv_class_indices[i] for i in predicted_classes]
    true_labels = [inv_class_indices[i] for i in true_classes]

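    # zero_division=0 keeps the default score of 0 for classes that were never predicted, without emitting a warning.
    report = classification_report(true_classes, predicted_classes, target_names=list(class_indices.keys()), zero_division=0, output_dict=True)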
    report_df = pd.DataFrame(report).transpose()
    report_path = reports_dir / f"{model_name}_classification_report.csv"
    report_df.to_csv(report_path)

    heatmap_path = figures_dir / f"{model_name}_classification_report_heatmap_top20.png"
    filtered_df = report_df.drop(["accuracy", "macro avg", "weighted avg"], errors="ignore")
    top_20 = filtered_df.sort_values("support", ascending=False).head(20)

    plt.figure(figsize=(10, 8))
    sns.heatmap(
        top_20[["precision", "recall", "f1-score"]],
        annot=True, fmt=".2f", cmap="YlGnBu",
        linewidths=0.5, annot_kws={"size": 9}
    )
    plt.title("Top 20 Classes – Classification Report", fontsize=14)
    plt.xlabel("Metrics", fontsize=12)
    plt.ylabel("Class", fontsize=12)
    plt.xticks(fontsize=10)
    plt.yticks(fontsize=9)
    plt.tight_layout()
    plt.savefig(heatmap_path)
    plt.close()

    # Restrict the confusion matrix to the 20 classes with the most test samples.
    top_labels = list(top_20.index)
    top_indices = [class_indices[label] for label in top_labels]
    top_index_set = set(top_indices)

    # Keep only the samples whose true class is one of the top 20.
    keep = [i for i, t in enumerate(true_classes) if t in top_index_set]
    filtered_true = [true_classes[i] for i in keep]
    filtered_pred = [predicted_classes[i] for i in keep]

    cm = confusion_matrix(filtered_true, filtered_pred, labels=top_indices)
    cm_labels = top_labels

    fig, ax = plt.subplots(figsize=(12, 10))
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=cm_labels)
    disp.plot(ax=ax, xticks_rotation=45, cmap='Blues', colorbar=True)
    plt.title("Confusion Matrix – Top 20 Classes", fontsize=14)
    plt.tight_layout()

    cm_path = figures_dir / f"{model_name}_confusion_matrix_top20.png"
    plt.savefig(cm_path)
    plt.close()

    cm = confusion_matrix(true_classes, predicted_classes)
    fig, ax = plt.subplots(figsize=(20, 20))
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=list(class_indices.keys()))
    disp.plot(ax=ax, xticks_rotation='vertical', cmap='Blues')
    cm_path = figures_dir / f"{model_name}_confusion_matrix.png"
    plt.savefig(cm_path)
    plt.close()

    filenames = test_generator.filenames
    results_df = pd.DataFrame({
        "filename": filenames,
        "true_label": true_labels,
        "predicted_label": predicted_labels
    })
    pred_path = predictions_dir / f"{model_name}_predictions.csv"
    results_df.to_csv(pred_path, index=False)

    return {
        "model_path": model_path,
        "log_path": logs_dir / f"{model_name}_training_log.csv",
        "report_path": report_path,
        "heatmap_path": heatmap_path,
        "confusion_matrix": cm_path,
        "predictions_path": pred_path,
        "accuracy_plot": acc_fig_path,
        "val_accuracy": val_acc
    }

In [None]:
results = run_cnn_pipeline(
    train_dir=TRAIN_DIR,
    val_dir=VAL_DIR,
    test_dir=TEST_DIR,
    model_name="cnn_baseline"
)

Found 8627 images belonging to 202 classes.
Found 2157 images belonging to 202 classes.
Found 1199 images belonging to 202 classes.
Epoch 1/10
270/270 ━━━━━━━━━━━━━━━━━━━━ 53s 191ms/step - accuracy: 0.0257 - loss: 5.2228 - val_accuracy: 0.0751 - val_loss: 4.8675 - learning_rate: 5.0000e-04
Epoch 2/10
270/270 ━━━━━━━━━━━━━━━━━━━━ 48s 179ms/step - accuracy: 0.0716 - loss: 4.8511 - val_accuracy: 0.1015 - val_loss: 4.6281 - learning_rate: 5.0000e-04
Epoch 3/10
270/270 ━━━━━━━━━━━━━━━━━━━━ 48s 179ms/step - accuracy: 0.1118 - loss: 4.5145 - val_accuracy: 0.1307 - val_loss: 4.4781 - learning_rate: 5.0000e-04
Epoch 4/10
270/270 ━━━━━━━━━━━━━━━━━━━━ 49s 183ms/step - accuracy: 0.1699 - loss: 4.0462 - val_accuracy: 0.1539 - val_loss: 4.3916 - learning_rate: 5.0000e-04
Epoch 5/10
270/270 ━━━━━━━━━━━━━━━━━━━━ 49s 183ms/step - accuracy: 0.2393 - loss: 3.5113 - val_accuracy: 0.1567 - val_loss: 4.4241 - learning_rate: 5.0000e-04
Epoch 6/10
270/270 ━━━━━━━━━━━━━━

68/68 ━━━━━━━━━━━━━━━━━━━━ 4s 55ms/step - accuracy: 0.1559 - loss: 4.3554
38/38 ━━━━━━━━━━━━━━━━━━━━ 26s 668ms/step




📦 Results Summary:

📁 Model saved at:              D:\Repositories\DL_EOLP\models\cnn_baseline.h5
📄 Training log:                D:\Repositories\DL_EOLP\output\logs\cnn_baseline_training_log.csv
📊 Classification report (CSV): D:\Repositories\DL_EOLP\reports\cnn_baseline_classification_report.csv
🧯 Report heatmap (PNG):        D:\Repositories\DL_EOLP\reports\figures\cnn_baseline_classification_report_heatmap_top20.png
📑 Predictions CSV:             D:\Repositories\DL_EOLP\output\predictions\cnn_baseline_predictions.csv
📈 Accuracy plot:               D:\Repositories\DL_EOLP\reports\figures\cnn_baseline_accuracy_plot.png
📉 Confusion matrix:            D:\Repositories\DL_EOLP\reports\figures\cnn_baseline_confusion_matrix.png
✅ Final validation accuracy:   15.39%
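
<div style="margin-left:60px; padding:10px;">
  <p>The code that printed the summary above is not shown in the cell. One way to produce a similar report from the dictionary returned by <code>run_cnn_pipeline</code> is sketched below. The helper <code>format_results_summary</code> and the example dictionary are illustrative assumptions, not the notebook's original code; only three of the returned keys are shown.</p>
</div>

```python
from pathlib import Path


def format_results_summary(results):
    """Build a plain-text summary from a results dict (hypothetical helper)."""
    rows = [
        ("Model saved at", results["model_path"]),
        ("Training log", results["log_path"]),
        ("Predictions CSV", results["predictions_path"]),
    ]
    # Pad every label to the same width so the values line up in a column.
    width = max(len(label) for label, _ in rows) + 1
    lines = [(label + ":").ljust(width) + " " + str(value) for label, value in rows]
    # val_accuracy is a fraction in [0, 1]; show it as a percentage.
    lines.append(f"Final validation accuracy: {results['val_accuracy']:.2%}")
    return "\n".join(lines)


# Example dict with the same keys the pipeline returns (values are placeholders).
example_results = {
    "model_path": Path("models") / "cnn_baseline.h5",
    "log_path": Path("output") / "logs" / "cnn_baseline_training_log.csv",
    "predictions_path": Path("output") / "predictions" / "cnn_baseline_predictions.csv",
    "val_accuracy": 0.1539,
}
summary = format_results_summary(example_results)
print(summary)
```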


<div style="border-left: 6px solid #27ae60; color:white; margin-left:40px; padding:10px; border-radius:10px; font-family:Arial, sans-serif; font-size:24px;">
  <h2 style="margin-top: 0; font-size:24px;">🧪 CNN Baseline – Final Results Summary</h2>
</div>

<div style="margin-left:60px; padding:10px;line-height: 2.0"> 
  <h2 style="margin-top: 0; font-size:24px;">📌 Model Description</h2>
  <p>
    The baseline model consists of a simple Convolutional Neural Network (CNN) with 3 <code>Conv2D + MaxPooling</code> blocks, followed by a fully connected <code>Dense</code> layer and <code>Dropout</code> for regularization. The final layer uses <code>softmax</code> activation for multi-class prediction.
  </p>
  <p>
    The model was trained with <strong>EarlyStopping</strong> and <strong>ReduceLROnPlateau</strong> and ran for <strong>9 epochs</strong> out of the 10 configured.
  </p>
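
  <p>The sketch below illustrates how a patience of 5 can end training at epoch 9. It re-implements the patience rule in plain Python as a simplification (it is not the Keras source). The first five validation losses come from the training log above; the values for epochs 6 to 10 are hypothetical placeholders, because the log is truncated.</p>

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs pass without a
    strictly lower validation loss than the best one seen so far.
    """
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


logged = [4.8675, 4.6281, 4.4781, 4.3916, 4.4241]  # epochs 1-5, from the log
hypothetical_tail = [4.5, 4.6, 4.7, 4.8, 4.9]      # epochs 6-10, HYPOTHETICAL values

stop_epoch, best_epoch = early_stopping_epoch(logged + hypothetical_tail, patience=5)
print(f"Best epoch: {best_epoch}, training stops after epoch {stop_epoch}")
```

  <p>Under these assumptions the best validation loss is at epoch 4 and training stops after epoch 9, which agrees with the 9 epochs reported here.</p>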

  <h2 style="margin-top: 0; font-size:24px;">📊 Performance Overview</h2>
  <ul>
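    <li><strong>Final Training Accuracy:</strong> ~61%</li>
    <li><strong>Final Validation Accuracy:</strong> ~18% during training. The evaluation run after training, printed above, reported 15.39%, which matches the epoch-4 value in the log and is consistent with <code>EarlyStopping</code> restoring the best weights.</li>
    <li><span style="color: orange;"><strong>Observation:</strong> Overfitting detected: the gap of roughly 43 percentage points between training and validation accuracy indicates poor generalization.</span></li>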
  </ul>
  <img src="../reports/figures/cnn_baseline_accuracy_plot.png" alt="Training vs Validation Accuracy" width="600"/>
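
  <p>The gap can also be read directly from the training log. The sketch below parses a short excerpt with the standard <code>csv</code> module, using the rounded values printed for epochs 1 to 5 in the output above. The column layout is an assumption about the <code>CSVLogger</code> file (with a 0-based <code>epoch</code> column), and <code>accuracy_gaps</code> is an illustrative helper.</p>

```python
import csv
import io

# Excerpt rebuilt from the epoch 1-5 values printed during training.
LOG_EXCERPT = """epoch,accuracy,loss,val_accuracy,val_loss
0,0.0257,5.2228,0.0751,4.8675
1,0.0716,4.8511,0.1015,4.6281
2,0.1118,4.5145,0.1307,4.4781
3,0.1699,4.0462,0.1539,4.3916
4,0.2393,3.5113,0.1567,4.4241
"""


def accuracy_gaps(csv_file):
    """Return [(epoch, train_accuracy - val_accuracy)] with 1-based epochs."""
    reader = csv.DictReader(csv_file)
    return [
        (int(row["epoch"]) + 1, float(row["accuracy"]) - float(row["val_accuracy"]))
        for row in reader
    ]


gaps = accuracy_gaps(io.StringIO(LOG_EXCERPT))
# First epoch in which training accuracy overtakes validation accuracy.
first_overfit_epoch = next(epoch for epoch, gap in gaps if gap > 0)

for epoch, gap in gaps:
    print(f"Epoch {epoch}: gap = {gap:+.4f}")
print("Training accuracy first exceeds validation accuracy at epoch", first_overfit_epoch)
```

  <p>In this excerpt the training accuracy overtakes the validation accuracy at epoch 4 and the gap keeps widening afterwards, the same pattern the accuracy plot shows.</p>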

  <h2 style="margin-top: 0; font-size:24px;">📈 Classification Report</h2>
  <p>The model struggles to perform well across most classes. Only a few classes show acceptable precision or recall values.</p>
  <p><strong>Full Report Heatmap:</strong></p>
  <img src="../reports/figures/cnn_baseline_classification_report_heatmap.png" alt="Classification Report" width="700"/>

  <p><strong>Top 20 Classes Heatmap:</strong></p>
  <img src="../reports/figures/cnn_baseline_classification_report_heatmap_top20.png" alt="Top 20 Classification Report" width="600"/>

  <h2 style="margin-top: 0; font-size:24px;">📉 Confusion Matrix</h2>
  <p>With 202 classes, the full matrix is too dense to read. The Top 20 version provides clearer insights.</p>
  <p><strong>Full Confusion Matrix:</strong></p>
  <img src="../reports/figures/cnn_baseline_confusion_matrix.png" alt="Full Confusion Matrix" width="700"/>

  <p><strong>Top 20 Confusion Matrix:</strong></p>
  <img src="../reports/figures/cnn_baseline_confusion_matrix_top20.png" alt="Top 20 Confusion Matrix" width="600"/>

  <h2 style="margin-top: 0; font-size:24px;">💾 Files and Artifacts Saved</h2>
  <ul>
    <li>📁 <code>models/cnn_baseline.h5</code></li>
    <li>📁 <code>models/cnn_baseline.weights.h5</code></li>
    <li>📄 <code>output/logs/cnn_baseline_training_log.csv</code></li>
    <li>📊 <code>reports/cnn_baseline_classification_report.csv</code></li>
    <li>🧾 <code>output/predictions/cnn_baseline_predictions.csv</code></li>
  </ul>
</div>
