

# **Automated Diagnosis of Retinal Diseases from OCT Images Using Deep Learning**

## Project Overview

Retinal diseases such as Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), and Drusen are among the primary causes of vision loss and blindness worldwide. Early diagnosis is critical for effective treatment, but manual interpretation of Optical Coherence Tomography (OCT) images can be both time-consuming and subject to variability between clinicians.

This project aims to develop an automated deep learning system that classifies OCT images into four categories:
- **CNV**
- **DME**
- **DRUSEN**
- **NORMAL**

Using transfer learning, the system fine-tunes a pre-trained model (ResNet50) to extract robust features from OCT scans. Training proceeds in two phases: first, the classification head is trained with the pre-trained base frozen; then a subset of the base model's layers is unfrozen and the network is fine-tuned with a lower learning rate. This two-phase strategy lets the model benefit from the general features learned on large-scale datasets while adapting to the specifics of retinal imaging.

## About the Dataset

The dataset used in this project is the [Kaggle “anirudhcv/labeled-optical-coherence-tomography-oct”](https://www.kaggle.com/datasets/anirudhcv/labeled-optical-coherence-tomography-oct) dataset. It contains 109,309 OCT images (the file count reported when the notebook loads the data), sorted into four subfolders, one per category:
- **CNV**
- **DME**
- **DRUSEN**
- **NORMAL**

After extraction from a RAR file (located at `/content/drive/MyDrive/OCT.rar`), the expected directory structure is as follows:

```
/content/OCT_data/OCT/
    ├── CNV/
    ├── DME/
    ├── DRUSEN/
    └── NORMAL/
```
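Pointing the loader one level too high (at `/content/OCT_data` instead of `/content/OCT_data/OCT`) makes it see a single class, so it can help to locate the folder that actually holds the four class subfolders before building the datasets. The sketch below is one way to do that with the standard library; `find_class_root` is a hypothetical helper, not part of the notebook, and the temporary directory only simulates the expected layout.

```python
import os
import tempfile

EXPECTED_CLASSES = {"CNV", "DME", "DRUSEN", "NORMAL"}

def find_class_root(extract_path, expected=EXPECTED_CLASSES):
    """Return the first directory under extract_path whose immediate
    subfolders include every expected class name, or None if there is none."""
    for dirpath, dirnames, _ in os.walk(extract_path):
        if expected <= set(dirnames):
            return dirpath
    return None

# Simulate the layout described above inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_CLASSES:
        os.makedirs(os.path.join(tmp, "OCT", name))
    found = find_class_root(tmp)
    found_name = os.path.basename(found)
    # A folder with no class subfolders yields None.
    missing = find_class_root(os.path.join(tmp, "OCT", "CNV"))

print(found_name)
```

In the notebook, the returned path would be the value to pass to the dataset loader.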

The images are preprocessed by resizing them to a fixed size (224×224 pixels) and normalizing pixel values to the range [0,1]. This ensures consistency across the dataset and facilitates training.

## Model Development and Training

### Data Preprocessing

- **Extraction**: The dataset is extracted from the provided RAR archive into a structured directory.
- **Normalization**: All images are scaled so that pixel values fall between 0 and 1.
- **Train-Validation Split**: A standard 80/20 split is used to separate training and validation datasets.
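The split only stays clean if the training and validation subsets come from the same shuffled order, which is why both loader calls in the notebook use the same seed. The sketch below illustrates the idea with a seeded shuffle in plain Python. It is an assumption-laden illustration (`split_paths` is a hypothetical helper) and does not reproduce the exact ordering Keras uses internally.

```python
import random

def split_paths(paths, validation_split=0.2, seed=123):
    """Shuffle deterministically, then hold out the last fraction for validation."""
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = int(validation_split * len(shuffled))
    cut = len(shuffled) - n_val
    return shuffled[:cut], shuffled[cut:]

paths = [f"img_{i}.jpeg" for i in range(10)]

# Same seed on both calls: identical partition, so the subsets never overlap.
train, val = split_paths(paths, seed=123)
train_again, val_again = split_paths(paths, seed=123)

print(len(train), len(val))
```

If the two subsets were drawn with different seeds, some images could land in both, which would inflate validation scores.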

### Model Architectures

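The experiments in this notebook use one pre-trained architecture:
- **ResNet50**

The model-building function also accepts DenseNet121 and InceptionV3, but only ResNet50 was trained and evaluated here.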



### Two-Phase Training Strategy

1. **Phase 1 (Frozen Base Training)**:  
   The model is first trained with the pre-trained base (ResNet50) frozen. This allows the new classification head to learn using the robust features already extracted by the base model.

2. **Phase 2 (Fine-Tuning)**:  
   After initial training, the top layers of the base model (e.g., the last 30 layers) are unfrozen. The model is then recompiled with a lower learning rate (e.g., 1e-5) and trained further. This fine-tuning helps the model adapt more closely to the OCT images without disrupting the pre-trained weights significantly.
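The two phases differ only in which layers are allowed to update. The framework-free sketch below makes that explicit with a stand-in `Layer` class. Everything here is hypothetical (eight base layers, three unfrozen) and only mirrors the bookkeeping, not the actual Keras model.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    trainable: bool = True

def freeze_all(layers):
    for layer in layers:
        layer.trainable = False

def unfreeze_last(layers, n):
    # Guard n <= 0: layers[-0:] would select every layer.
    if n <= 0:
        return
    for layer in layers[-n:]:
        layer.trainable = True

base = [Layer(f"base_{i}") for i in range(8)]   # stand-in for the pre-trained base
head = [Layer("classifier")]                    # stand-in for the new head

# Phase 1: only the head is trainable.
freeze_all(base)
phase1_trainable = [l.name for l in base + head if l.trainable]

# Phase 2: the last few base layers join the head.
unfreeze_last(base, 3)
phase2_trainable = [l.name for l in base + head if l.trainable]

print(phase1_trainable)
print(phase2_trainable)
```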

### Evaluation

The model is evaluated using several key metrics:
- **Accuracy**
- **Weighted F1 Score**
- **Sensitivity (Recall)**
- **Specificity**

These metrics are computed on the validation dataset to estimate how well the model generalizes. The notebook does not use a separate held-out test set.
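In a multi-class setting, sensitivity and specificity are defined per class by treating that class as "positive" and all others as "negative". The sketch below derives both from a confusion matrix in plain Python. The 3×3 matrix is made-up illustration data, not a result from this project, and `per_class_rates` is a hypothetical helper.

```python
def per_class_rates(cm):
    """Return (sensitivity, specificity) for each class of a square
    confusion matrix whose rows are true labels and columns are predictions."""
    total = sum(sum(row) for row in cm)
    rates = []
    for i in range(len(cm)):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                    # true class i, predicted as something else
        fp = sum(row[i] for row in cm) - tp     # other classes predicted as i
        tn = total - tp - fn - fp               # everything not involving class i
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        rates.append((sensitivity, specificity))
    return rates

# Hypothetical counts for three classes, 10 samples each.
cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
rates = per_class_rates(cm)
for i, (sens, spec) in enumerate(rates):
    print(f"class {i}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Averaging the per-class values (weighted by class size or unweighted) gives a single summary number, at the cost of hiding differences between classes.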

## Deployment & User Interaction

To facilitate real-world usage, a **Gradio interface** is implemented. This interface allows clinicians to easily upload an OCT image and receive an immediate classification result. The interface processes the image (resizing and normalization), passes it through the trained model, and then outputs a human-readable prediction that includes the predicted class (CNV, DME, DRUSEN, or NORMAL) along with the corresponding confidence level.

### How It Works

- **Image Upload**: The clinician uploads an OCT image via the Gradio web interface.
- **Preprocessing**: The image is resized and normalized to match the training input.
- **Prediction**: The image is fed into the trained deep learning model, which outputs class probabilities.
- **Result Display**: The predicted class along with its confidence score is displayed.
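The last two steps amount to picking the highest-probability class and formatting it. A minimal sketch, assuming the model returns one probability per class in the order CNV, DME, DRUSEN, NORMAL (`format_prediction` is a hypothetical helper, and the probabilities are invented for illustration):

```python
CLASS_NAMES = ["CNV", "DME", "DRUSEN", "NORMAL"]

def format_prediction(probs, class_names=CLASS_NAMES):
    """Turn a list of class probabilities into a readable result string."""
    if len(probs) != len(class_names):
        raise ValueError("Expected one probability per class.")
    # Index of the largest probability (argmax).
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return f"Predicted: {class_names[idx]} (Confidence: {probs[idx]:.2f})"

result = format_prediction([0.05, 0.10, 0.05, 0.80])
print(result)
```

The class order matters: it has to match the label order the model was trained with.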

This user-friendly deployment makes it easy for clinicians to incorporate automated diagnostic support into their workflow.

## Conclusion

This project demonstrates how transfer learning can be applied to OCT imaging for automated retinal disease classification. With a two-phase training strategy, the ResNet50-based model reaches 88.56% accuracy on the validation set, and the Gradio interface offers a practical deployment option for clinicians.



## **Implementation**

Gradio is a Python library enabling quick creation of interactive web interfaces for machine learning models. It simplifies deploying and sharing AI applications without frontend coding.

In [None]:
!pip install gradio




Import Important Libraries

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
import cv2
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, applications
from sklearn.metrics import confusion_matrix, f1_score, accuracy_score, recall_score
import gradio as gr  # For deployment interface

Define global variables

In [None]:
# Global variables
DATASET_RAR_PATH = "/content/drive/MyDrive/OCT.rar"  # Classification dataset RAR path on Drive
EXTRACT_PATH = "/content/OCT_data"                   # Where dataset will be extracted
IMAGE_SIZE = (224, 224)                              # Target image size for classification
BATCH_SIZE = 32
INITIAL_EPOCHS = 5                                   # Phase 1: frozen base (adjust as needed)
FINE_TUNE_EPOCHS = 10                                # Phase 2: unfreeze top layers (adjust as needed)
NUM_CLASSES = 4                                      # CNV, DME, DRUSEN, NORMAL

Extract the dataset

In [None]:
# Mount Drive and Extract Dataset (RAR)

from google.colab import drive
drive.mount('/content/drive')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# Install unrar
!apt-get update
!apt-get install -y unrar


In [None]:
# Extract the RAR if not already extracted
if not os.path.exists(EXTRACT_PATH):
    os.makedirs(EXTRACT_PATH, exist_ok=True)
    # Use the "x" option to extract with full path
    !unrar x "{DATASET_RAR_PATH}" "{EXTRACT_PATH}"
    print("Dataset extracted to", EXTRACT_PATH)
else:
    print("Dataset already extracted or folder exists.")

Dataset already extracted or folder exists.


In [None]:
!ls -R /content/OCT_data

Streaming output truncated to the last 5000 lines.
NORMAL-2604008-7.jpeg	NORMAL-4614357-14.jpeg	NORMAL-7056534-6.jpeg	NORMAL-9193451-4.jpeg
NORMAL-2604008-8.jpeg	NORMAL-4614357-15.jpeg	NORMAL-7056534-7.jpeg	NORMAL-9193451-5.jpeg
NORMAL-2604008-9.jpeg	NORMAL-4614357-16.jpeg	NORMAL-7056534-8.jpeg	NORMAL-9193451-6.jpeg
NORMAL-2611009-10.jpeg	NORMAL-4614357-17.jpeg	NORMAL-7056534-9.jpeg	NORMAL-9193451-7.jpeg
NORMAL-2611009-11.jpeg	NORMAL-4614357-18.jpeg	NORMAL-7057023-1.jpeg	NORMAL-9193451-8.jpeg
NORMAL-2611009-1.jpeg	NORMAL-4614357-19.jpeg	NORMAL-7057023-2.jpeg	NORMAL-9193451-9.jpeg
NORMAL-2611009-2.jpeg	NORMAL-4614357-1.jpeg	NORMAL-7057023-3.jpeg	NORMAL-9194489-10.jpeg
NORMAL-2611009-3.jpeg	NORMAL-4614357-2.jpeg	NORMAL-7057023-4.jpeg	NORMAL-9194489-1.jpeg
NORMAL-2611009-4.jpeg	NORMAL-4614357-3.jpeg	NORMAL-7057023-5.jpeg	NORMAL-9194489-2.jpeg
NORMAL-2611009-5.jpeg	NORMAL-4614357-4.jpeg	NORMAL-7057023-6.jpeg	NORMAL-9194489-3.jpeg
NORMAL-2611009-6.jpeg	NORMAL-4614357-5.jpeg	NO

In [None]:
EXTRACT_PATH = "/content/OCT_data"

train_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    EXTRACT_PATH,    # Wrong level: this folder contains only 'OCT', so Keras finds a single class
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE
)

Found 109309 files belonging to 1 classes.
Using 87448 files for training.


In [None]:
DATASET_DIR = "/content/OCT_data/OCT"  # The folder that actually has CNV, DME, DRUSEN, NORMAL

train_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    DATASET_DIR,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE
)

val_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    DATASET_DIR,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE
)

Found 109309 files belonging to 4 classes.
Using 87448 files for training.
Found 109309 files belonging to 4 classes.
Using 21861 files for validation.
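The counts above, and the 2733 steps per epoch shown in the training logs further down, follow from simple arithmetic. The sketch below reproduces them, assuming the validation count is the product of the split fraction and the file total rounded down (which matches the printed numbers).

```python
import math

total_files = 109309
validation_split = 0.2
batch_size = 32

val_files = int(total_files * validation_split)     # rounded down
train_files = total_files - val_files

# The last batch may be partial, so round up.
train_steps = math.ceil(train_files / batch_size)
val_steps = math.ceil(val_files / batch_size)

print(train_files, val_files, train_steps, val_steps)
```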


In [None]:
print(train_dataset.class_names)

['CNV', 'DME', 'DRUSEN', 'NORMAL']


The `Rescaling` layer scales pixel values from their original [0, 255] range to [0, 1]. It is applied to both the training and validation datasets so that the model receives consistent input.

In [None]:
# Apply normalization only
normalization_layer = layers.Rescaling(1./255)
train_dataset = train_dataset.map(lambda x, y: (normalization_layer(x), y))
val_dataset = val_dataset.map(lambda x, y: (normalization_layer(x), y))


Prefetching lets the input pipeline prepare the next batch while the model trains on the current one. Passing `tf.data.AUTOTUNE` lets TensorFlow tune the prefetch buffer size dynamically, which reduces the time spent waiting on I/O.

In [None]:
# Prefetch for performance
train_dataset = train_dataset.prefetch(buffer_size=tf.data.AUTOTUNE)
val_dataset = val_dataset.prefetch(buffer_size=tf.data.AUTOTUNE)
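To make the idea of prefetching concrete, the sketch below overlaps producing and consuming items with a background thread and a bounded queue. It is a conceptual illustration in plain Python, not how `tf.data` is implemented. `prefetch` is a hypothetical helper, and it does not handle exceptions raised by the producer.

```python
import queue
import threading

def prefetch(iterable, buffer_size=2):
    """Yield items from iterable while a background thread prepares
    up to buffer_size items ahead of the consumer."""
    sentinel = object()
    buffer = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            buffer.put(item)      # blocks while the buffer is full
        buffer.put(sentinel)      # signal that the source is exhausted

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is sentinel:
            return
        yield item

# Stand-in for batches; order is preserved.
batches = list(prefetch(range(5)))
print(batches)
```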

We choose ResNet50.

In [None]:
# Choose the architecture: set to 'resnet50', 'densenet121', or 'inceptionv3'
ARCHITECTURE = 'resnet50'

## I chose ResNet50 because it worked well; in future work I will try other architectures as well.

The function below constructs a classification model using the specified architecture (ResNet50, DenseNet121, or InceptionV3) as a frozen feature extractor, followed by global average pooling, dropout, and a dense classification head.
If InceptionV3 is selected, it updates the input size to (299,299) to meet its requirements before building and compiling the model.

In [None]:
# Model Development
# Build a Classification Model with a Choice of Architecture
# The function below builds a model using one of the specified architectures.
# If InceptionV3 is chosen, IMAGE_SIZE is updated to (299,299).

def build_classification_model(architecture):
    arch = architecture.lower()
    global IMAGE_SIZE  # allow updating IMAGE_SIZE for InceptionV3
    if arch == 'resnet50':
        base_model = applications.ResNet50(
            input_shape=IMAGE_SIZE + (3,),
            include_top=False,
            weights='imagenet'
        )
    elif arch == 'densenet121':
        base_model = applications.DenseNet121(
            input_shape=IMAGE_SIZE + (3,),
            include_top=False,
            weights='imagenet'
        )
    elif arch == 'inceptionv3':
        # InceptionV3 generally requires larger inputs
        IMAGE_SIZE = (299, 299)
        base_model = applications.InceptionV3(
            input_shape=IMAGE_SIZE + (3,),
            include_top=False,
            weights='imagenet'
        )
    else:
        raise ValueError("Unsupported architecture. Choose 'resnet50', 'densenet121', or 'inceptionv3'.")

    base_model.trainable = False  # Freeze base

    inputs = keras.Input(shape=IMAGE_SIZE + (3,))
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(NUM_CLASSES, activation='softmax')(x)
    model = keras.Model(inputs, outputs)

    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    return model
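The function above updates the global `IMAGE_SIZE` when InceptionV3 is selected, but in this notebook the datasets are created before the model is built, so they would still yield 224×224 images in that case. One possible alternative, sketched below with hypothetical names (`INPUT_SIZES`, `input_size_for`), is to resolve the input size from the architecture name first and use it for both the datasets and the model.

```python
# Default input sizes used by the corresponding Keras application models.
INPUT_SIZES = {
    "resnet50": (224, 224),
    "densenet121": (224, 224),
    "inceptionv3": (299, 299),
}

def input_size_for(architecture):
    """Look up the (height, width) input size for a supported architecture."""
    arch = architecture.lower()
    if arch not in INPUT_SIZES:
        raise ValueError(
            "Unsupported architecture. Choose 'resnet50', 'densenet121', or 'inceptionv3'."
        )
    return INPUT_SIZES[arch]

image_size = input_size_for("InceptionV3")
input_shape = image_size + (3,)   # add the RGB channel dimension
print(image_size, input_shape)
```

With ResNet50, the only architecture trained here, the size stays at (224, 224), so the notebook's results are unaffected.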


In [None]:
# Build the model based on the chosen architecture "Resnet50"
classification_model = build_classification_model(ARCHITECTURE)
classification_model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


The code below initiates Phase 1 of training where the base model remains frozen so that only the classification head is updated. The model is trained for 5 epochs using the training and validation datasets.

In [None]:
# Two-Phase Training
# Phase 1: Train with the Base Model Frozen

print("Phase 1: Training with frozen base ({} epochs)".format(INITIAL_EPOCHS))
history_1 = classification_model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=INITIAL_EPOCHS
)

Phase 1: Training with frozen base (5 epochs)
Epoch 1/5
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 362s 127ms/step - accuracy: 0.5612 - loss: 1.1001 - val_accuracy: 0.6600 - val_loss: 0.9609
Epoch 2/5
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 313s 114ms/step - accuracy: 0.6472 - loss: 0.9715 - val_accuracy: 0.6508 - val_loss: 0.9387
Epoch 3/5
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 398s 146ms/step - accuracy: 0.6603 - loss: 0.9421 - val_accuracy: 0.6768 - val_loss: 0.9060
Epoch 4/5
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 363s 117ms/step - accuracy: 0.6678 - loss: 0.9220 - val_accuracy: 0.6863 - val_loss: 0.8888
Epoch 5/5
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 315s 115ms/step - accuracy: 0.6729 - loss: 0.9121 - val_accuracy: 0.6883 - val_loss: 0.8793


The next cell unfreezes the last 30 layers of the base model so that they can be fine-tuned to better capture features specific to the OCT dataset. It then recompiles the model with a reduced learning rate (1e-5) and trains it for 10 epochs, allowing the pre-trained weights to adapt gradually.

In [None]:
# Phase 2: Fine-Tune Top Layers
# Unfreeze the last 30 layers of the base model and train with a lower learning rate.
UNFREEZE_LAYERS = 30
# classification_model.layers[0] is the input layer; layers[1] is the pre-trained base model
base_model = classification_model.layers[1]
for layer in base_model.layers[-UNFREEZE_LAYERS:]:
    layer.trainable = True

classification_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print("Phase 2: Fine-tuning top {} layers for {} epochs".format(UNFREEZE_LAYERS, FINE_TUNE_EPOCHS))
history_2 = classification_model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=FINE_TUNE_EPOCHS
)


Phase 2: Fine-tuning top 30 layers for 10 epochs
Epoch 1/10
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 459s 159ms/step - accuracy: 0.6725 - loss: 2.3668 - val_accuracy: 0.7937 - val_loss: 0.5789
Epoch 2/10
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 407s 149ms/step - accuracy: 0.8078 - loss: 0.5530 - val_accuracy: 0.8453 - val_loss: 0.4411
Epoch 3/10
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 423s 155ms/step - accuracy: 0.8369 - loss: 0.4639 - val_accuracy: 0.8512 - val_loss: 0.4340
Epoch 4/10
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 474s 167ms/step - accuracy: 0.8529 - loss: 0.4188 - val_accuracy: 0.8609 - val_loss: 0.3888
Epoch 5/10
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 435s 159ms/step - accuracy: 0.8653 - loss: 0.3845 - val_accuracy: 0.8562 - val_loss: 0.4316
Epoch 6/10
2733/2733 ━━━━━━━━━━━━━━━━━━━━ 411s 148ms/step - accuracy: 0.8732 

The next cell collects true labels and model predictions from the validation dataset and calculates accuracy, weighted F1 score, and weighted sensitivity (recall).
It also computes the confusion matrix, derives a one-vs-rest specificity for each class, and averages these to obtain the overall specificity.

In [None]:
# Final Evaluation
# Compute metrics: accuracy, weighted F1, sensitivity (recall), specificity, etc.

y_true = []
y_pred = []

for images, labels in val_dataset:
    # verbose=0 suppresses the per-batch progress bar
    preds = classification_model.predict(images, verbose=0)
    y_true.extend(labels.numpy())
    y_pred.extend(np.argmax(preds, axis=1))

y_true = np.array(y_true)
y_pred = np.array(y_pred)

acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average='weighted')
sensitivity = recall_score(y_true, y_pred, average='weighted')

cm = confusion_matrix(y_true, y_pred)
specificities = []
for i in range(len(cm)):
    tn = np.sum(np.delete(np.delete(cm, i, axis=0), i, axis=1))
    fp = np.sum(np.delete(cm, i, axis=0)[:, i])
    spec = tn / (tn + fp) if (tn + fp) > 0 else 0
    specificities.append(spec)
specificity_avg = np.mean(specificities)


In [None]:
print("Classification Accuracy: {:.4f}".format(acc))
print("Classification F1 Score: {:.4f}".format(f1))
print("Classification Sensitivity (Recall): {:.4f}".format(sensitivity))
print("Classification Specificity: {:.4f}".format(specificity_avg))

Classification Accuracy: 0.8856
Classification F1 Score: 0.8703
Classification Sensitivity (Recall): 0.8856
Classification Specificity: 0.9551


- **Classification Accuracy (88.56%)**: Nearly 89% of the validation predictions are correct.
- **Classification F1 Score (87.03%)**: The support-weighted average of the per-class F1 scores. It is slightly lower than the accuracy, which suggests that precision and recall are uneven for at least one class.
- **Classification Sensitivity (Recall) (88.56%)**: The support-weighted average of per-class recall. With weighted averaging this value always equals the overall accuracy, so it does not show how well each individual disease is detected. Per-class recall from the confusion matrix is more informative for judging missed detections, which matter most in medical diagnosis.
- **Classification Specificity (95.51%)**: The unweighted mean of the one-vs-rest specificity across the four classes. On average, about 95.5% of the images that do not belong to a given class are correctly not assigned to it, which is important to avoid over-diagnosis and unnecessary treatments.

Overall, these metrics indicate solid performance on the validation set, with specificity the strongest result: the model is comparatively reliable at ruling out a class that an image does not belong to.
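Accuracy and weighted recall are reported above with the same value (0.8856). This is expected rather than a coincidence: weighting each class's recall by its share of the samples cancels the per-class denominators and leaves the overall fraction of correct predictions. The sketch below checks this on a small set of invented labels (the helper names are hypothetical).

```python
from collections import Counter

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights equal to each class's support."""
    n = len(y_true)
    total = 0.0
    for cls, count in Counter(y_true).items():
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        total += (count / n) * (correct / count)
    return total

# Invented labels for four classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 3]
y_pred = [0, 0, 1, 0, 1, 2, 2, 2, 0, 3]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
print(acc, rec)
```

To see how well each disease is detected individually, per-class recall (for example via `recall_score(..., average=None)` in scikit-learn) is the number to look at.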

# **Gradio Interface**
This Gradio block creates a simple web interface for clinicians to obtain a classification prediction for an uploaded OCT image. Here’s what each part does:

- **Function Definition (`process_oct_image`)**:  
  The function accepts an image (a NumPy array in the [0,255] range), resizes it to the model’s expected input dimensions, normalizes it to the [0,1] range, and then expands its dimensions to create a batch of size 1.  
  It passes the preprocessed image to the trained classification model, extracts the predicted probabilities, and determines the class with the highest confidence. Finally, it returns a formatted string with the predicted class and the corresponding confidence score.

- **Gradio Interface (`iface`)**:  
  The interface is configured to take an image as input and display the prediction in a textbox. The title and description help guide the clinician on how to use the interface.  
  When `iface.launch()` is called, a web-based UI is launched, allowing users to simply upload an OCT image and immediately see the classification result.

This setup makes the model accessible in a user-friendly way, enabling clinicians to quickly receive automated diagnostic predictions without needing to run code directly.

In [None]:
# Deployment & User Interaction with Gradio
# Create an interface for clinicians to upload an OCT image and receive a classification prediction.

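def process_oct_image(image_input):
    # image_input is a NumPy array in [0,255] with shape (H, W, 3), or None if nothing was uploaded
    if image_input is None:
        return "Please upload an OCT image."
    # cv2.resize expects (width, height); IMAGE_SIZE is square here, so the order does not matter
    img_resized = cv2.resize(image_input, IMAGE_SIZE)
    img_norm = img_resized.astype("float32") / 255.0
    img_batch = np.expand_dims(img_norm, axis=0)  # batch of size 1

    preds = classification_model.predict(img_batch, verbose=0)
    class_idx = int(np.argmax(preds[0]))
    confidence = preds[0][class_idx]
    # Must match the class order printed earlier by train_dataset.class_names
    class_names = ["CNV", "DME", "DRUSEN", "NORMAL"]
    predicted_label = class_names[class_idx]
    classification_result = f"Predicted: {predicted_label} (Confidence: {confidence:.2f})"
    return classification_result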

iface = gr.Interface(
    fn=process_oct_image,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Textbox(label="Classification Result"),
    title="OCT Diagnostic System",
    description="Upload an OCT image to get a classification result."
)

iface.launch()

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://d4462ca32f6926bb4b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [None]:
!pip freeze > requirements.txt

In [None]:
%%writefile README.md
# OCT Image Classification

## Project Overview

This project implements a deep learning model for classifying Optical Coherence Tomography (OCT) images into four categories: CNV, DME, DRUSEN, and NORMAL.  The model leverages transfer learning with pre-trained architectures (ResNet50, DenseNet121, or InceptionV3) to achieve accurate and efficient classification.  A Gradio interface provides an easy-to-use platform for uploading OCT images and receiving predictions.


## Description

The model undergoes a two-phase training process:

1. **Initial Training:** The base model's weights are frozen, and only the new classification head is trained to adapt to the OCT dataset.
2. **Fine-tuning:** A subset of the top layers of the base model is unfrozen and trained with a lower learning rate to further refine the model's performance.


## Dataset

The dataset used for training and evaluation is an OCT image dataset (OCT.rar).  The dataset is expected to be placed in your Google Drive.  Make sure to adjust the `DATASET_RAR_PATH` variable in the script to reflect the correct path on your drive.  The dataset is organized into folders, each corresponding to a disease class.


## Usage

1. **Setup:** Ensure you have the required libraries installed (see `requirements.txt`).  You can install them using `pip install -r requirements.txt`.
2. **Data Preparation:**  Upload the OCT.rar file to your Google Drive and update the `DATASET_RAR_PATH` variable.
3. **Run the Notebook:** Execute the Jupyter Notebook. This will mount your Google Drive, extract the dataset, train the model, and launch the Gradio interface.
4. **Classification:** Use the Gradio interface to upload an OCT image and receive the model's classification prediction.


## Future Work

* **Data Augmentation:** Implement more advanced data augmentation techniques to improve model robustness.
* **Hyperparameter Tuning:** Perform a more thorough hyperparameter search to optimize model performance.
* **Ensemble Methods:** Explore ensemble methods to combine multiple models for improved accuracy.


## Acknowledgments

This project was inspired by Kaggle datasets and community contributions. Special thanks to those who provided public OCT image datasets.


## Downloading the README

This README.md file can be downloaded directly from the Colab environment.



Overwriting README.md

