This repository contains a Python script for training a Convolutional Neural Network (CNN) to classify images. The model is built using TensorFlow/Keras and utilizes image data augmentation for improved performance.
The code performs the following steps:
-
Data Visualization: Displays a random sample of images from the dataset using matplotlib. This helps understand the data distribution and identify potential issues.
-
Model Definition: Defines a CNN architecture using Keras. The model consists of convolutional layers, max pooling layers, dense layers, dropout, and a sigmoid activation function for binary classification.
-
Data Augmentation: Employs
ImageDataGeneratorfrom Keras to perform data augmentation on the training data. This includes rescaling, rotations, shifts, shearing, zooming, and horizontal flips. Data augmentation helps prevent overfitting and improves the model's ability to generalize to unseen data. -
Data Loading: Loads the image data from a specified directory using
flow_from_directory. This function automatically handles the organization of images into batches and their corresponding labels based on the directory structure. It is important to note that in the original code, bothtrain_generatorandtest_generatoruse thetrain_datagenand point to the same directory (train_path). This should be corrected - thetest_generatorshould use a separate test dataset andtest_datagen(which might have less aggressive augmentation). See "Usage" below for more details. -
Model Compilation: Compiles the model using the Adam optimizer and binary cross-entropy loss. Accuracy is used as the evaluation metric.
-
Model Training: Trains the model using the augmented training data and validates it on the test data. The training process includes a specified number of epochs.
-
Model Evaluation (Not included in original code but recommended): After training, you should evaluate the model's performance on a separate test set to get a realistic estimate of its performance.
- Python 3
- TensorFlow/Keras
- NumPy
- Matplotlib
- Pillow (or other image processing library if needed)
You can install the required libraries using pip:
pip install tensorflow numpy matplotlib
Markdown
# Image Classification Model
This repository contains a Python script for training a Convolutional Neural Network (CNN) to classify images. The model is built using TensorFlow/Keras and utilizes image data augmentation for improved performance.
## Overview
The code performs the following steps:
1. **Data Visualization:** Displays a random sample of images from the dataset using matplotlib. This helps understand the data distribution and identify potential issues.
2. **Model Definition:** Defines a CNN architecture using Keras. The model consists of convolutional layers, max pooling layers, dense layers, dropout, and a sigmoid activation function for binary classification.
3. **Data Augmentation:** Employs `ImageDataGenerator` from Keras to perform data augmentation on the training data. This includes rescaling, rotations, shifts, shearing, zooming, and horizontal flips. Data augmentation helps prevent overfitting and improves the model's ability to generalize to unseen data.
4. **Data Loading:** Loads the image data from a specified directory using `flow_from_directory`. This function automatically handles the organization of images into batches and their corresponding labels based on the directory structure. *It is important to note that in the original code, both `train_generator` and `test_generator` use the `train_datagen` and point to the same directory (`train_path`). This should be corrected - the `test_generator` should use a separate test dataset and `test_datagen` (which might have less aggressive augmentation). See "Usage" below for more details.*
5. **Model Compilation:** Compiles the model using the Adam optimizer and binary cross-entropy loss. Accuracy is used as the evaluation metric.
6. **Model Training:** Trains the model using the augmented training data and validates it on the test data. The training process includes a specified number of epochs.
7. **Model Evaluation (Not included in original code but recommended):** After training, you should evaluate the model's performance on a separate test set to get a realistic estimate of its performance.
## Requirements
- Python 3
- TensorFlow/Keras
- NumPy
- Matplotlib
- Pillow (or other image processing library if needed)
You can install the required libraries using pip:
```bash
pip install tensorflow numpy matplotlib
Usage
Dataset Preparation: Organize your image data into directories, where each subdirectory represents a class. For example:
data/
train/
class1/
image1.jpg
image2.jpg
...
class2/
image1.jpg
...
test/ # Important: Create a separate test directory!
class1/
...
class2/
...
Code Modification:
Correct Test Data: The provided code uses the training directory for both training and validation. This is a major issue. You must create a separate test directory with its own images. Also, create a separate test_datagen. A typical test data generator doesn't include augmentation (or very minimal augmentation). Here's how you would correct the code:
Python
test_datagen = ImageDataGenerator(rescale=1./255) # Less or no augmentation for test
test_generator = test_datagen.flow_from_directory( # Use test_datagen and test_path
test_path, # Path to your test directory
target_size = (224, 224),
batch_size = batch_size,
color_mode = "rgb",
class_mode = "categorical"
)
- **Paths:** Set the `train_path` and `test_path` variables in the script to the correct paths to your training and testing data directories.
Run the script:
Bash
python your_script_name.py
Evaluation: After training, evaluate the model on the test_generator:
Python
loss, accuracy = model.evaluate(test_generator)
print(f"Test Loss: {loss}")
print(f"Test Accuracy: {accuracy}")
Prediction (Not included in original code but recommended): To use the trained model to make predictions on new images:
Python
from tensorflow.keras.preprocessing import image
import numpy as np
img_path = 'path/to/your/new/image.jpg'
img = image.load_img(img_path, target_size=(224, 224))
img_array = image.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0) # Add batch dimension
img_array /= 255. # Rescale
prediction = model.predict(img_array)
print(prediction) # Output will be probabilities for each class
Model Architecture
The model architecture consists of:
Convolutional layers (Conv2D) with ReLU activation
Max pooling layers (MaxPooling2D)
Flatten layer
Dense layers with ReLU activation
Dropout layers for regularization
Output dense layer with sigmoid activation for binary classification
Results
The training results, including loss and accuracy, will be printed to the console during training. The hist variable returned by model.fit contains the training history, which you can use to plot the learning curves.
Further Improvements
More Data: A larger and more diverse dataset will generally lead to better performance.
Hyperparameter Tuning: Experiment with different model architectures, optimizers, learning rates, batch sizes, and data augmentation parameters to optimize the model.
Transfer Learning: Consider using pre-trained models (e.g., ResNet, Inception) as a starting point for your model. Transfer learning can significantly improve performance, especially with limited data.
Regularization: Explore different regularization techniques (e.g., L1/L2 regularization, early stopping) to prevent overfitting.
Evaluation Metrics: Use more comprehensive evaluation metrics (e.g., precision, recall, F1-score, AUC) to get a better understanding of the model's performance.
This README provides a basic guide to using the code. Remember to adapt the paths and parameters to your specific needs.