<a href="https://colab.research.google.com/github/username/lung-disease-detection/blob/main/lung_disease_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" width="140" height="30"/></a>
</p>
<a href="https://www.python.org/ftp/python/3.11.0/python-3.11.0-amd64.exe" target="_parent"><img src="https://www.python.org/static/community_logos/python-logo-generic.svg" alt="Python Logo" width="145" height="45"/></a>
<br/>
<a href="https://github.com/VVBK24" target="_parent"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/29/GitHub_logo_2013.svg/960px-GitHub_logo_2013.svg.png" alt="GitHub" width="130" height="34"/></a><br/>


# Lung Disease Detection Using Deep Learning

This notebook demonstrates how to train the lung disease detection models in Google Colab, save them, and load them for prediction.

The model can:
1. Classify images into three categories: Normal, Pneumonia, and Lung Cancer
2. Distinguish pneumonia from normal in X-ray images
3. Classify cancer subtypes (Adenocarcinoma, Large Cell Carcinoma, Squamous Cell Carcinoma) when applicable

## Setup

First, install the required dependencies: `tensorflow==2.12.0`, `numpy<2.0.0`, `matplotlib`, `opencv-python`, `h5py`, and `pillow`.

In [None]:
!pip install tensorflow==2.12.0 "numpy<2.0.0" matplotlib opencv-python h5py pillow

## Model Architecture

The lung disease detection model uses four architectures, two for each imaging modality:

- **VGG16**: Specializes in detailed feature extraction for X-ray images
- **MobileNetV2**: Specializes in efficient X-ray image classification
- **ResNet50**: Specializes in deep feature extraction for CT scans
- **EfficientNetB0**: Specializes in efficient CT scan classification

A separate scan-type classifier first decides whether an image is an X-ray or a CT scan, and the two models for that modality then analyze it. One of the training variants below uses EfficientNetV2S in place of EfficientNetB0.
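As a rough sketch, the relationship between the two image types and the four architectures listed above can be written as a small lookup table. The names `MODELS_BY_MODALITY` and `models_for` are hypothetical and are not part of the notebook.

```python
# Hypothetical lookup table: which architectures serve which image type.
MODELS_BY_MODALITY = {
    "xray": ["VGG16", "MobileNetV2"],
    "ct": ["ResNet50", "EfficientNetB0"],
}


def models_for(scan_type: str) -> list:
    """Return the architecture names used for a given scan type."""
    key = scan_type.strip().lower()
    if key not in MODELS_BY_MODALITY:
        raise ValueError(f"Unknown scan type: {scan_type!r}")
    return MODELS_BY_MODALITY[key]


print(models_for("CT"))  # ['ResNet50', 'EfficientNetB0']
```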

## Download the Model

You can either upload the model directly to Colab or download it from a cloud
storage service like Google Drive.

In [None]:
!pip install requests tqdm

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
!unzip /content/drive/MyDrive/dataset.zip
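If you prefer to stay in Python rather than call `unzip`, the standard `zipfile` module can do the same job and also report what the archive contains. This is a minimal sketch; the helper name `extract_dataset` is hypothetical, and the archive layout is whatever your own `dataset.zip` holds.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir="."):
    """Extract the archive and return the sorted top-level names it contains."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        top_level = {Path(name).parts[0] for name in archive.namelist() if name}
    return sorted(top_level)


# Example call in Colab (path taken from the cell above):
# print(extract_dataset('/content/drive/MyDrive/dataset.zip', '/content'))
```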

## Helper Functions

The preprocessing and prediction code relies on the packages below; install any that are missing.

In [None]:
!pip install opencv-python
!pip install pillow
!pip install matplotlib
!pip install keras
!pip install h5py
!pip install tensorflow

## 1. **Typical Dataset Structure for CT and X-ray Lung Disease Classification**

Assuming your goal is to classify:

**CT Scans into**:  ['Adenocarcinoma', 'Large Cell Carcinoma', 'Squamous Cell Carcinoma', 'normal']

**X-ray Images into**:  ['pneumonia', 'normal']



```
# Structure

├── ct/
│   ├── adenocarcinoma/
│   │   ├── img1.png
│   │   ├── img2.png
│   │   └── ...
│   ├── large_cell_carcinoma/
│   │   ├── img1.png
│   │   └── ...
│   ├── squamous_cell_carcinoma/
│   │   ├── img1.png
│   │   └── ...
│   └── normal/
│       ├── img1.png
│       └── ...
│
├── xray/
│   ├── pneumonia/
│   │   ├── img1.png
│   │   └── ...
│   └── normal/
│       ├── img1.png
│       └── ...
```
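Before training, it can help to confirm that the folders really follow this layout and to see how many images each class has. The sketch below walks a `modality/class` tree and counts image files; the function name `count_images` and the set of accepted file suffixes are assumptions for illustration.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed image extensions


def count_images(root):
    """Map 'modality/class' to the number of image files found under root."""
    counts = {}
    for modality_dir in sorted(Path(root).iterdir()):
        if not modality_dir.is_dir():
            continue
        for class_dir in sorted(modality_dir.iterdir()):
            if class_dir.is_dir():
                n = sum(1 for f in class_dir.iterdir()
                        if f.suffix.lower() in IMAGE_SUFFIXES)
                counts[f"{modality_dir.name}/{class_dir.name}"] = n
    return counts


# Example call, where "data_root" is a hypothetical folder holding ct/ and xray/:
# for name, n in count_images("data_root").items():
#     print(f"{name}: {n} images")
```

A very uneven count between classes is worth noticing early, since accuracy alone can then be misleading.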





# **Split Folders** (Optional)

> This is the layout I used.

If you want to include train/val/test splits, structure it like this:
```
dataset/
├── train/
│   ├── ct/
│   │   ├── adenocarcinoma/
│   │   └── ...
│   └── xray/
│       ├── pneumonia/
│       └── normal/
│
├── val/
│   └── ...  # similar to train
└── test/
    └── ...  # similar to train
```
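One possible way to produce such a split from a single class folder is sketched below. The helper `split_class_folder`, the default 70/15/15 ratios, and the fixed seed are all assumptions for illustration, not the procedure used to build the original dataset.

```python
import random
import shutil
from pathlib import Path


def split_class_folder(src_dir, dst_root, ratios=(0.7, 0.15, 0.15), seed=42):
    """Copy files from one class folder into train/val/test subfolders.

    src_dir  : a 'modality/class' folder, e.g. 'xray/pneumonia'
    dst_root : e.g. 'dataset'; files land in dst_root/<split>/<modality>/<class>/
    ratios   : assumed train/val/test fractions; the remainder goes to test
    """
    src = Path(src_dir)
    files = sorted(f for f in src.iterdir() if f.is_file())
    random.Random(seed).shuffle(files)  # reproducible shuffle

    n_train = int(len(files) * ratios[0])
    n_val = int(len(files) * ratios[1])
    splits = {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],
    }
    for split, members in splits.items():
        target = Path(dst_root) / split / src.parent.name / src.name
        target.mkdir(parents=True, exist_ok=True)
        for f in members:
            shutil.copy2(f, target / f.name)
    return {split: len(members) for split, members in splits.items()}


# Example call: split_class_folder("xray/pneumonia", "dataset")
```

Files are copied rather than moved, so the original folder stays intact if you want to try different ratios.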



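**Training for the 1st "Typical Dataset Structure"** (the code reads from `dataset/train/` and `dataset/val/`, each containing the `xray/` and `ct/` folders)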

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0, ResNet50, MobileNetV2, VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint
import os

# Suppress TensorFlow warnings (optional)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # Set to '0' for full logs

# Enable GPU memory growth (safe multi-model training)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print("✅ GPU is available and memory growth is set!")
    except RuntimeError as e:
        print(f"❌ Error setting GPU memory growth: {e}")
else:
    print("⚠️ GPU not available. Training will use CPU.")

# Paths and constants
train_dir = 'dataset/train/'
val_dir = 'dataset/val/'
img_size = 224
batch_size = 32

# Image Generators
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    zoom_range=0.2,
    horizontal_flip=True
)
val_datagen = ImageDataGenerator(rescale=1./255)

# X-ray datasets (binary classification)
train_xray = train_datagen.flow_from_directory(
    os.path.join(train_dir, 'xray'),
    target_size=(img_size, img_size),
    batch_size=batch_size,
    class_mode='binary'
)
val_xray = val_datagen.flow_from_directory(
    os.path.join(val_dir, 'xray'),
    target_size=(img_size, img_size),
    batch_size=batch_size,
    class_mode='binary'
)

# CT scan datasets (4-class classification)
train_ct = train_datagen.flow_from_directory(
    os.path.join(train_dir, 'ct'),
    target_size=(img_size, img_size),
    batch_size=batch_size,
    class_mode='categorical'
)
val_ct = val_datagen.flow_from_directory(
    os.path.join(val_dir, 'ct'),
    target_size=(img_size, img_size),
    batch_size=batch_size,
    class_mode='categorical'
)

# Model builder
def build_model(base_model, input_shape=(224, 224, 3), classes=2):
    base = base_model(weights='imagenet', include_top=False, input_shape=input_shape)
    base.trainable = False
    x = base.output
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(256, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    if classes == 2:
        out = layers.Dense(1, activation='sigmoid')(x)
    else:
        out = layers.Dense(classes, activation='softmax')(x)
    return models.Model(inputs=base.input, outputs=out)

# Build models
xray_mobilenetv2 = build_model(MobileNetV2, classes=2)
xray_vgg16 = build_model(VGG16, classes=2)
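# Note: Keras EfficientNet models rescale inputs internally and expect pixel values
# in [0, 255], whereas the generators above already rescale to [0, 1].
ct_efficientnetb0 = build_model(EfficientNetB0, classes=4)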
ct_resnet50 = build_model(ResNet50, classes=4)

# Compile models
xray_mobilenetv2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
xray_vgg16.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
ct_efficientnetb0.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
ct_resnet50.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Checkpoints
callbacks = {
    'xray_mobilenetv2': ModelCheckpoint('xray_mobilenetv2.h5', save_best_only=True),
    'xray_vgg16': ModelCheckpoint('xray_vgg16.h5', save_best_only=True),
    'ct_efficientnetb0': ModelCheckpoint('ct_efficientnetb0.h5', save_best_only=True),
    'ct_resnet50': ModelCheckpoint('ct_resnet50.h5', save_best_only=True),
}

# Train models
print("\n🔧 Training xray_mobilenetv2...")
xray_mobilenetv2.fit(train_xray, epochs=10, validation_data=val_xray, callbacks=[callbacks['xray_mobilenetv2']])

print("\n🔧 Training xray_vgg16...")
xray_vgg16.fit(train_xray, epochs=10, validation_data=val_xray, callbacks=[callbacks['xray_vgg16']])

print("\n🔧 Training ct_efficientnetb0...")
ct_efficientnetb0.fit(train_ct, epochs=10, validation_data=val_ct, callbacks=[callbacks['ct_efficientnetb0']])

print("\n🔧 Training ct_resnet50...")
ct_resnet50.fit(train_ct, epochs=10, validation_data=val_ct, callbacks=[callbacks['ct_resnet50']])

print("\n✅ All 4 models trained and saved using GPU (if available)!")
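The X-ray models built in this cell end in a single sigmoid unit, while the CT models end in a 4-way softmax, so their outputs have to be decoded differently: a threshold for the sigmoid, an argmax for the softmax. The sketch below shows one way to handle both cases. The helper `decode_prediction` is hypothetical, and the class-name lists assume the lowercase folder names from the dataset structure, in the sorted order that `flow_from_directory` assigns.

```python
def decode_prediction(probs, class_names, threshold=0.5):
    """Turn one model output row into a class name.

    probs       : floats for a single image, e.g. model.predict(x)[0]
    class_names : names ordered by the indices in the generator's class_indices
    """
    probs = [float(p) for p in probs]
    if len(probs) == 1:
        # Single sigmoid unit: the value is the probability of class index 1.
        index = 1 if probs[0] >= threshold else 0
    else:
        # Softmax output: pick the most probable class.
        index = max(range(len(probs)), key=probs.__getitem__)
    return class_names[index]


print(decode_prediction([0.91], ["normal", "pneumonia"]))  # pneumonia
print(decode_prediction(
    [0.1, 0.2, 0.6, 0.1],
    ["adenocarcinoma", "large_cell_carcinoma", "normal", "squamous_cell_carcinoma"],
))  # normal
```

Applying `argmax` directly to a single-unit sigmoid output always yields index 0, which is why the two cases are separated here.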

# **Save the models to Drive**

In [None]:
from google.colab import drive
import os
import shutil

drive.mount('/content/drive')

# Define the source files and the destination directory
source_models = ['xray_mobilenetv2.h5', 'xray_vgg16.h5', 'ct_efficientnetb0.h5', 'ct_resnet50.h5']
destination_dir = '/content/drive/MyDrive/trained_models2'  # Replace with your desired destination

# Create the destination directory if it doesn't exist
os.makedirs(destination_dir, exist_ok=True)

# Copy the model files to Google Drive
for model_file in source_models:
    if os.path.exists(model_file):
        shutil.copy(model_file, destination_dir)
        print(f"✅ '{model_file}' copied to Google Drive.")
    else:
        print(f"⚠️ '{model_file}' not found. Skipping...")



> Training for the 2nd structure (split folders): the X-ray and CT models are trained in separate cells.



In [None]:
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2, VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout, BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
import os

# ✅ Enable GPU memory growth
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print("✅ GPU memory growth enabled!")
    except RuntimeError as e:
        print(f"❌ GPU Error: {e}")
else:
    print("⚠️ GPU not available. Using CPU.")

# 📁 Paths for X-ray data
train_path = 'dataset/train/xray/'
val_path = 'dataset/val/xray/'

# 🔄 Data generators
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=20,
                                   zoom_range=0.2,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

val_datagen = ImageDataGenerator(rescale=1./255)

train_gen = train_datagen.flow_from_directory(train_path,
                                              target_size=(224, 224),
                                              batch_size=32,
                                              class_mode='categorical',
                                              shuffle=True)

val_gen = val_datagen.flow_from_directory(val_path,
                                          target_size=(224, 224),
                                          batch_size=32,
                                          class_mode='categorical',
                                          shuffle=False)

# 🧠 Model builder
def build_model(base_model, num_classes=2):
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = BatchNormalization()(x)
    x = Dropout(0.4)(x)
    predictions = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=base_model.input, outputs=predictions)

# 🔨 MobileNetV2
mobile_base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
mobile_model = build_model(mobile_base, num_classes=train_gen.num_classes)
mobile_model.compile(optimizer=Adam(1e-4), loss='categorical_crossentropy', metrics=['accuracy'])

# 🔨 VGG16
vgg_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
vgg_model = build_model(vgg_base, num_classes=train_gen.num_classes)
vgg_model.compile(optimizer=Adam(1e-4), loss='categorical_crossentropy', metrics=['accuracy'])

# 🚀 Train MobileNetV2
print("\n🚀 Training MobileNetV2...")
mobile_model.fit(train_gen, validation_data=val_gen, epochs=10)

# 🚀 Train VGG16
print("\n🚀 Training VGG16...")
vgg_model.fit(train_gen, validation_data=val_gen, epochs=10)

# 💾 Save models
mobile_model.save('xray_mobilenetv2.h5')
vgg_model.save('xray_vgg16.h5')
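Because the generators assign class indices from the folder names, it can be useful to store that mapping next to the saved models so the prediction code uses the same label order. The sketch below writes the names to a JSON file; `save_class_names`, `load_class_names`, and the file name `xray_classes.json` are hypothetical, while `class_indices` is the attribute exposed by the generator returned from `flow_from_directory`.

```python
import json


def save_class_names(class_indices, path):
    """Write class names to JSON, ordered by their integer index."""
    names = [name for name, _ in sorted(class_indices.items(), key=lambda kv: kv[1])]
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(names, fh)
    return names


def load_class_names(path):
    """Read back the ordered list of class names."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# In the notebook this could be called right after training, for example:
# save_class_names(train_gen.class_indices, "xray_classes.json")
```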

In [None]:
# 📦 Imports
import tensorflow as tf
from tensorflow.keras.applications import ResNet50, EfficientNetV2S
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout, BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
import os

# 🖥️ Suppress warnings & enable GPU memory growth
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print("✅ GPU is available and memory growth is set!")
    except RuntimeError as e:
        print(f"❌ Error setting GPU memory growth: {e}")
else:
    print("⚠️ GPU not available. Training will use CPU.")

# 📁 Data Paths
train_path = 'dataset/train/ct'
val_path = 'dataset/val/ct'

# 🔄 Data Generators
train_datagen = ImageDataGenerator(rescale=1./255, rotation_range=20,
                                   width_shift_range=0.2, height_shift_range=0.2,
                                   shear_range=0.2, zoom_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

val_datagen = ImageDataGenerator(rescale=1./255)

train_gen = train_datagen.flow_from_directory(train_path, target_size=(224, 224),
                                              batch_size=32, class_mode='categorical')

val_gen = val_datagen.flow_from_directory(val_path, target_size=(224, 224),
                                          batch_size=32, class_mode='categorical')

# 🧠 Model Builder
def build_model(base_model, num_classes=4):
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = BatchNormalization()(x)
    x = Dropout(0.4)(x)
    output = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=base_model.input, outputs=output)

# 🔨 ResNet50
resnet_base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
resnet_model = build_model(resnet_base, num_classes=train_gen.num_classes)
resnet_model.compile(optimizer=Adam(1e-4), loss='categorical_crossentropy', metrics=['accuracy'])

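# 🔨 EfficientNetV2S (TensorFlow Built-in)
# Note: by default this model includes its own preprocessing and expects pixel
# values in [0, 255], whereas the generators above already rescale to [0, 1].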
eff_base = EfficientNetV2S(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
eff_model = build_model(eff_base, num_classes=train_gen.num_classes)
eff_model.compile(optimizer=Adam(1e-4), loss='categorical_crossentropy', metrics=['accuracy'])

# 🚀 Train both models
resnet_model.fit(train_gen, validation_data=val_gen, epochs=10)
eff_model.fit(train_gen, validation_data=val_gen, epochs=10)

# 💾 Save models
resnet_model.save('ct_resnet50.h5')
eff_model.save('ct_efficientnetv2s.h5')

# **Save the models to Drive**

In [None]:
from google.colab import drive
import os
import shutil

drive.mount('/content/drive')

# Define the source files and the destination directory
source_models = ['xray_mobilenetv2.h5', 'xray_vgg16.h5', 'ct_efficientnetv2s.h5', 'ct_resnet50.h5']
destination_dir = '/content/drive/MyDrive/trained_models2'  # Replace with your desired destination

# Create the destination directory if it doesn't exist
os.makedirs(destination_dir, exist_ok=True)

# Copy the model files to Google Drive
for model_file in source_models:
    if os.path.exists(model_file):
        shutil.copy(model_file, destination_dir)
        print(f"✅ '{model_file}' copied to Google Drive.")
    else:
        print(f"⚠️ '{model_file}' not found. Skipping...")

# **Checking if GPU is being used**
This cell also prints the installed TensorFlow version.

In [None]:
import tensorflow as tf

# List devices
print("\n🧠 TensorFlow version:", tf.__version__)
print("📦 GPU devices detected:", tf.config.list_physical_devices('GPU'))

# Check if GPU is being used
from tensorflow.python.client import device_lib
print("\n💻 Available devices:")
print(device_lib.list_local_devices())

# **Image Classification**
Before moving on to model testing, we need one separate model that identifies whether an input image is a CT scan or an X-ray.

In [None]:
# ✅ Step 1: Setup
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.optimizers import Adam
import os

print("✅ TensorFlow Version:", tf.__version__)

train_dir = 'dataset/train'
val_dir = 'dataset/val'

# ✅ Step 2: Preprocessing
image_size = (224, 224)
batch_size = 32

datagen_train = ImageDataGenerator(rescale=1./255)
datagen_val = ImageDataGenerator(rescale=1./255)

# ✅ Step 3: Load CT and X-ray folders as classes
train_generator = datagen_train.flow_from_directory(
    train_dir,
    target_size=image_size,
    batch_size=batch_size,
    class_mode='binary',  # 'ct' vs 'xray'
    classes=['ct', 'xray']
)

val_generator = datagen_val.flow_from_directory(
    val_dir,
    target_size=image_size,
    batch_size=batch_size,
    class_mode='binary',
    classes=['ct', 'xray']
)

# ✅ Step 4: Define the model
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dropout(0.3)(x)
output = Dense(1, activation='sigmoid')(x)

model = Model(inputs=base_model.input, outputs=output)

# ✅ Step 5: Compile
model.compile(optimizer=Adam(1e-4), loss='binary_crossentropy', metrics=['accuracy'])

# ✅ Step 6: Train
history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=10
)

# ✅ Step 7: Save the model
model.save("ct_vs_xray_classifier.h5")
print("✅ Model saved as ct_vs_xray_classifier.h5")
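Since the generators were given `classes=['ct', 'xray']`, the folder `ct` gets label 0 and `xray` gets label 1, so the classifier's sigmoid output can be read as the probability that an image is an X-ray. A minimal sketch of turning that number into a scan type plus a confidence follows; the function `scan_type_from_sigmoid` and the 0.5 threshold are assumptions for illustration.

```python
SCAN_CLASSES = ["ct", "xray"]  # same order as classes=[...] passed to the generators


def scan_type_from_sigmoid(p_xray, threshold=0.5):
    """Map the classifier's sigmoid output to a scan type and a confidence."""
    p_xray = float(p_xray)
    if p_xray >= threshold:
        return SCAN_CLASSES[1], p_xray
    return SCAN_CLASSES[0], 1.0 - p_xray


print(scan_type_from_sigmoid(0.92))  # ('xray', 0.92)
print(scan_type_from_sigmoid(0.25))  # ('ct', 0.75)
```

Reporting the confidence alongside the label makes it easier to spot images the classifier is unsure about before they are routed to the disease models.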

## Conclusion

This notebook demonstrated how to train and use the lung disease detection models to analyze X-ray and CT scan images. The models distinguish X-rays from CT scans, detect pneumonia in X-ray images, and identify cancer subtypes in CT scans.

### Next Steps

1. Try with your own medical images
2. Fine-tune the model on your specific dataset
3. Experiment with different visualization techniques
4. Integrate with other medical diagnostic systems

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import load_model
import os
from google.colab import files

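# Must match the Drive folder the models were copied to
# (the save cells above use 'trained_models2').
base_path = '/content/drive/MyDrive/trained_models/'

# ✅ Paths for the models
# Note: the variable is named for EfficientNetV2S but points at the EfficientNetB0
# checkpoint; use 'ct_efficientnetv2s.h5' if you trained the EfficientNetV2S variant.
ct_efficientnetv2s_path = base_path + 'ct_efficientnetb0.h5'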
ct_resnet50_path = base_path + 'ct_resnet50.h5'
# xray_mobilenetv2_path = base_path + 'xray_mobilenetv2.h5'
# xray_vgg16_path = base_path + 'xray_vgg16.h5'


# ✅ Load all models
scan_type_model = load_model('ct_vs_xray_classifier.h5')
ct_efficientnetv2s = load_model(ct_efficientnetv2s_path)
ct_resnet50 = load_model(ct_resnet50_path)
xray_mobilenetv2 = load_model('xray_mobilenetv2.h5')  # file name as saved above (lowercase 'v')
xray_vgg16 = load_model('xray_vgg16.h5')

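# ✅ Class labels
# flow_from_directory assigns indices in sorted folder-name order, so these lists
# must follow that order (check train_gen.class_indices). The order below assumes
# the lowercase folder names shown in the dataset structure.
ct_classes = ['Adenocarcinoma', 'Large Cell Carcinoma', 'normal', 'Squamous Cell Carcinoma']
xray_classes = ['normal', 'pneumonia']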

def predict_scan_and_disease(img_path):
    try:
        # Load & preprocess image
        img = image.load_img(img_path, target_size=(224, 224))
        img_array = image.img_to_array(img) / 255.0
        img_array = np.expand_dims(img_array, axis=0)

        # 🧠 Step 1: Predict scan type
        scan_pred = scan_type_model.predict(img_array)[0][0]
        scan_type = "ct" if scan_pred < 0.5 else "xray"
        print(f"🧪 Predicted Scan Type: {scan_type.upper()}")

        # 🧠 Step 2: Print all model results
        print("\n🧠 All Model Predictions:")

        # For CT scans
        if scan_type == "ct":
            print("📡 Using CT Models:")
            ct_resnet50_pred = ct_resnet50.predict(img_array)
            print(f"ResNet50 Prediction: {ct_classes[np.argmax(ct_resnet50_pred)]} (Raw: {ct_resnet50_pred})")

            ct_efficientnetv2s_pred = ct_efficientnetv2s.predict(img_array)
            print(f"EfficientNetV2s Prediction: {ct_classes[np.argmax(ct_efficientnetv2s_pred)]} (Raw: {ct_efficientnetv2s_pred})")

        # For X-ray scans
        else:
            print("📡 Using X-ray Models:")
            xray_mobilenetv2_pred = xray_mobilenetv2.predict(img_array)
            print(f"MobileNetV2 Prediction: {xray_classes[np.argmax(xray_mobilenetv2_pred)]} (Raw: {xray_mobilenetv2_pred})")

            xray_vgg16_pred = xray_vgg16.predict(img_array)
            print(f"VGG16 Prediction: {xray_classes[np.argmax(xray_vgg16_pred)]} (Raw: {xray_vgg16_pred})")

        # Return the best prediction
        if scan_type == "ct":
            disease_pred = ct_resnet50.predict(img_array)  # Default to ResNet50 for CT
            disease_label = ct_classes[np.argmax(disease_pred)]
        else:
            disease_pred = xray_mobilenetv2.predict(img_array)  # Default to MobileNetV2 for X-ray
            disease_label = xray_classes[np.argmax(disease_pred)]

        print(f"\n💉 Final Predicted Disease: {disease_label}")
        return scan_type, disease_label

    except Exception as e:
        print(f"❌ Error: {e}")
        return None, None  # keep the two-value shape so callers can still unpack

# ✅ Upload image function (using Google Colab file upload)
def upload_image():
    uploaded = files.upload()  # Upload files using Google Colab
    if uploaded:
        img_path = list(uploaded.keys())[0]  # Get the path of the uploaded image
        print(f"Image uploaded: {img_path}")
        return img_path
    else:
        print("No file selected.")
        return None

# Example: Upload an image and predict
img_path = upload_image()
if img_path:
    scan_type, disease_label = predict_scan_and_disease(img_path)
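The prediction function prints both models' outputs for a modality but returns the label from only one of them. One alternative, which is not what the notebook does, is to average the two probability rows and take the most likely class from the average. The sketch below assumes both models output softmax probabilities over the same classes in the same order; `average_predictions` is a hypothetical helper and the numbers are made up.

```python
def average_predictions(*prob_rows):
    """Average several per-class probability rows element-wise."""
    if not prob_rows:
        raise ValueError("Need at least one prediction row")
    length = len(prob_rows[0])
    if any(len(row) != length for row in prob_rows):
        raise ValueError("All prediction rows must have the same length")
    return [sum(float(row[i]) for row in prob_rows) / len(prob_rows)
            for i in range(length)]


resnet_row = [0.10, 0.20, 0.60, 0.10]        # made-up values for illustration
efficientnet_row = [0.20, 0.10, 0.50, 0.20]  # made-up values for illustration
avg = average_predictions(resnet_row, efficientnet_row)
best_index = max(range(len(avg)), key=avg.__getitem__)
print(best_index)  # 2
```

With real models, each row would be the first element of the array returned by `predict`, for example `ct_resnet50.predict(img_array)[0]`.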