<a href="https://colab.research.google.com/github/hermesresearch/CNN_Detect_Melanoma/blob/main/Joseph_hanna_nn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Problem Statement
The objective of this project is to develop a custom convolutional neural network (CNN) model for accurate detection of melanoma, a deadly form of skin cancer. Melanoma accounts for 75% of skin cancer deaths and early detection is crucial for effective treatment. The proposed solution aims to automate the evaluation of skin images and alert dermatologists about the presence of melanoma, thereby reducing the manual effort required for diagnosis.

# Dataset Description
The dataset used for this project can be downloaded from [link to the dataset]. It comprises a collection of 2357 images representing both malignant and benign oncological diseases. The dataset was obtained from the International Skin Imaging Collaboration (ISIC). The images have been categorized based on the classification provided by ISIC, with an approximately equal number of images in most subsets, except for melanomas and moles, which have a slightly larger representation.

The dataset includes the following diseases:
1. Actinic keratosis
2. Basal cell carcinoma
3. Dermatofibroma
4. Melanoma
5. Nevus
6. Pigmented benign keratosis
7. Seborrheic keratosis
8. Squamous cell carcinoma
9. Vascular lesion
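
Keras utilities such as `image_dataset_from_directory` assign integer labels by sorting the class directory names alphanumerically. As a minimal sketch, assuming the directories are named with the lowercase forms of the diseases above (an assumption; check the actual names on disk), the label indices would come out as follows:

```python
# Hypothetical directory names: lowercase versions of the nine diseases listed above.
class_dirs = [
    "nevus", "melanoma", "vascular lesion", "dermatofibroma",
    "actinic keratosis", "basal cell carcinoma", "squamous cell carcinoma",
    "pigmented benign keratosis", "seborrheic keratosis",
]

# Sorting the names gives the order in which integer labels are assigned.
class_to_index = {name: idx for idx, name in enumerate(sorted(class_dirs))}

for name, idx in class_to_index.items():
    print(idx, name)
```

Any hand-written list of class names used to decode predictions has to follow this same order.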



# Project Pipeline


1. Data Reading & Understanding
   - Define paths for train and test images.
2. Dataset Creation
   - Create train and validation datasets.
   - Resize images to 180x180 pixels.
3. Dataset Visualization
   - Visualize one instance of each class in the dataset.
4. Model Building & Training
   - Create a CNN model to detect the nine classes.
   - Rescale images between 0 and 1.
   - Choose optimizer and loss function.
   - Train the model for approximately 20 epochs.
5. Data Augmentation
   - Apply data augmentation techniques to address underfitting/overfitting.
6. Model Building & Training on Augmented Data
   - Build a CNN model on augmented data.
   - Train the model for approximately 20 epochs.
7. Class Distribution Analysis
   - Examine the class distribution in the training dataset.
   - Identify the class with the least number of samples.
   - Determine classes dominating the dataset.
8. Handling Class Imbalances
   - Use Augmentor library to rectify class imbalances.
9. Model Building & Training on Rectified Data
   - Build a CNN model on the rectified data.
   - Train the model for approximately 30 epochs.
10. Findings & Evaluation
    - Evaluate model performance and check for overfitting/underfitting.

In [None]:
# Import necessary libraries
import os
import pathlib
import random
from glob import glob

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import PIL

# TensorFlow / Keras
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout, Activation, Flatten, Conv2D, MaxPool2D
from tensorflow.keras.layers.experimental.preprocessing import Rescaling
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical


In [None]:
# Data Reading & Understanding:



In [None]:
# Mount Google Drive to the Colab notebook
from google.colab import drive
drive.mount('/content/gdrive')

In [None]:
# Specify the paths for the training and testing datasets
## Todo: Update the paths of the training and testing datasets

path_folder = '/content/gdrive/MyDrive/CNN/Skin_Cancer_Data'
train_data = pathlib.Path(path_folder + '/Train')
test_data = pathlib.Path(path_folder + '/Test')

In [None]:
import os

# List all files/directories in the path
print(os.listdir(train_data))
print(os.listdir(test_data))


In [None]:
# Counting the number of images in the training dataset
count_train = len(list(train_data.glob('*/*.jpg')))
print(count_train)
# Counting the number of images in the testing dataset
count_test = len(list(test_data.glob('*/*.jpg')))
print(count_test)

In [None]:
# Set the batch size to 32
# Set the image height and width to 180 pixels
batch_size = 32
img_height = 180
img_width = 180

In [None]:
# Create a TensorFlow dataset for training images

ds_train = tf.keras.preprocessing.image_dataset_from_directory(
  train_data,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

In [None]:
# Create a TensorFlow dataset for validation images
ds_validation = tf.keras.preprocessing.image_dataset_from_directory(
  train_data,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
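
The training and validation calls above share `validation_split=0.2` and `seed=123`; the shared seed is what keeps the two subsets disjoint. The sketch below illustrates the idea with a hypothetical `split_files` helper and made-up file names. It is a conceptual illustration, not Keras's internal implementation.

```python
import random

def split_files(files, validation_split, seed, subset):
    """Shuffle deterministically with `seed`, then return one side of the split."""
    shuffled = sorted(files)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_split)
    if subset == "validation":
        return shuffled[len(shuffled) - n_val:]
    return shuffled[:len(shuffled) - n_val]

files = [f"img_{i:03d}.jpg" for i in range(10)]  # hypothetical file names
train = split_files(files, 0.2, seed=123, subset="training")
val = split_files(files, 0.2, seed=123, subset="validation")
print(len(train), len(val))
```

Using a different seed in the two calls would shuffle the files differently, so the same image could land in both subsets.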

In [None]:
# Create a TensorFlow dataset for the testing images
ds_test = tf.keras.preprocessing.image_dataset_from_directory(
  test_data,
  image_size=(img_height, img_width),
  batch_size=batch_size)

In [None]:
# Get the class names of skin cancer and store them in a list
# The class names can be found in the 'class_names' attribute of the datasets
# The class names correspond to the directory names in alphabetical order

class_names = ds_train.class_names
print(class_names)

In [None]:
# Create a figure with a size of 10x10 inches
# Iterate over the first batch of images and labels in the training dataset
# Iterate over the first 9 images in the batch
# Create a subplot with 3 rows, 3 columns, and the current index
# Display the image as a numpy array of type "uint8"
# Set the title of the subplot as the corresponding class name from the "class_names" list
# Disable the axis lines and labels for better visualization


import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
for images, labels in ds_train.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

In [None]:
# Cache, shuffle, and prefetch the training dataset
# Cache and prefetch the validation dataset

AUTOTUNE = tf.data.experimental.AUTOTUNE
ds_train = ds_train.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
ds_validation = ds_validation.cache().prefetch(buffer_size=AUTOTUNE)

In [None]:
preprocessing_layers = [
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255, input_shape=(180, 180, 3))
]

In [None]:
# Rescale the input images to a range of [0, 1]
# Two convolutional layers with 32 filters, kernel size of (3, 3), and ReLU activation
# One convolutional layer with 64 filters, kernel size of (3, 3), and ReLU activation
# One convolutional layer with 128 filters, kernel size of (3, 3), and ReLU activation
# Flatten the output from previous layers
# Dense (fully connected) layer with 512 units and ReLU activation, followed by dropout
# Output layer with one unit per class and softmax activation for multi-class classification

input_shape = (180, 180, 3)
num_classes = 9  # the dataset has nine lesion classes

model = Sequential()
model.add(tf.keras.layers.experimental.preprocessing.Rescaling(1./255, input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2, 2)))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPool2D(pool_size=(2, 2)))

model.add(Flatten())

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

# Note: nine classes need nine softmax units; a single sigmoid unit would make this
# a binary classifier, not a multi-class classifier

model.summary()

In [None]:
# Compile the model with the specified optimizer, loss function, and metrics
# The datasets yield integer labels, so the sparse categorical loss is used

optimizer = 'adam'
loss_fn = "sparse_categorical_crossentropy"
model.compile(optimizer=optimizer,
              loss=loss_fn,
              metrics=['accuracy'])

In [None]:
# View the summary
model.summary()
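
The shapes reported by `model.summary()` can be checked by hand. The sketch below, with hypothetical helper names `conv_valid` and `pool`, traces the spatial size of a 180x180 input through the 3x3 convolutions (default 'valid' padding) and the 2x2 max-pooling layers defined above, then computes the size of the flattened vector.

```python
def conv_valid(size, kernel=3):
    """Output width/height of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool(size, window=2):
    """Output width/height of a pooling layer whose stride equals its window."""
    return size // window

size = 180
size = conv_valid(size)   # Conv2D(32)  -> 178
size = conv_valid(size)   # Conv2D(32)  -> 176
size = pool(size)         # MaxPool2D   -> 88
size = conv_valid(size)   # Conv2D(64)  -> 86
size = pool(size)         # MaxPool2D   -> 43
size = conv_valid(size)   # Conv2D(128) -> 41
size = pool(size)         # MaxPool2D   -> 20

flattened = size * size * 128          # 20 * 20 * 128 = 51200
dense_params = flattened * 512 + 512   # weights plus biases of Dense(512)
print(size, flattened, dense_params)
```

Most of the model's parameters sit in the first dense layer, since the convolutional layers together hold only on the order of a hundred thousand weights.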

In [None]:
# Train the model on the training dataset while validating on the validation dataset

epochs = 20

# ds_train and ds_validation are already batched, so batch_size is not passed to fit()
history = model.fit(
  ds_train,
  validation_data=ds_validation,
  epochs=epochs
)

In [None]:


# Plotting training and validation accuracy
plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Accuracy')

# Plotting training and validation loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Loss')

plt.tight_layout()
plt.show()


In [None]:
# Evaluate the model on the training dataset
loss, accuracy = model.evaluate(ds_train, verbose=1)

# Evaluate the model on the validation dataset
loss_v, accuracy_v = model.evaluate(ds_validation, verbose=1)

# Print the evaluation results
print("Accuracy: ", accuracy)
print("Validation Accuracy: ", accuracy_v)
print("Loss: ", loss)
print("Validation Loss: ", loss_v)

# Interpreting the results:
# Cross-entropy is non-negative when the output layer and the loss match the labels.
# Large negative losses such as -80029597696.0 and -81044750336.0, together with
# accuracies as low as 0.12077402323484421 and 0.12906096875667572, are the symptom of
# a single sigmoid output unit trained with binary cross-entropy on integer labels 0 to 8:
# the binary loss assumes targets of 0 or 1, so any label above 1 lets it fall below zero.
# If values like these appear, check that the output layer has nine softmax units and
# that the loss is sparse_categorical_crossentropy.
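
A short worked example shows how binary cross-entropy can go negative. The formula assumes targets of 0 or 1; the sketch below (a plain-Python restatement of the formula, not the Keras implementation) evaluates it for an integer class label such as 5.

```python
import math

def binary_cross_entropy(y, p):
    """Per-sample binary cross-entropy for target y and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

valid_target = binary_cross_entropy(1, 0.9)           # proper binary target
class_index_target = binary_cross_entropy(5, 0.999)   # class index used as target
more_saturated = binary_cross_entropy(5, 1 - 1e-12)   # p pushed even closer to 1

print(valid_target, class_index_target, more_saturated)
```

With a target above 1, the loss keeps decreasing as the predicted probability approaches 1, so minimizing it rewards a saturated output rather than correct classification.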

In [None]:
# Create an instance of ImageDataGenerator for data augmentation
datagen = ImageDataGenerator(
        featurewise_center=False,
        samplewise_center=False,
        featurewise_std_normalization=False,
        samplewise_std_normalization=False,
        zca_whitening=False,
        rotation_range=10,
        zoom_range = 0.1,
        width_shift_range=0.1,
        height_shift_range=0.1,
        horizontal_flip=False,
        vertical_flip=False)

# flow_from_directory matches `classes` against the subdirectory names,
# so reuse the names that were read from the training directory earlier
image_class = class_names

train_batches = datagen.flow_from_directory(train_data,
    target_size = (180,180),
    classes = image_class,
    batch_size = 64
 )

valid_batches = datagen.flow_from_directory(test_data,
    target_size = (180,180),
    classes = image_class,
    batch_size = 64
)

In [None]:
import matplotlib.pyplot as plt

# Create a figure with a size of 10x10 inches
plt.figure(figsize=(10, 10))
# Iterate over the first batch of images and labels in the training dataset
for images, labels in ds_train.take(1):
  for i in range(9):
    # Create a subplot with 3 rows, 3 columns, and the current index
    ax = plt.subplot(3, 3, i + 1)
    # Display the image as a numpy array of type "uint8"
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

In [None]:
# Define the model architecture
# Two convolutional layers with 32 filters, kernel size of (3, 3), ReLU activation, and 'same' padding
# Max pooling layer with pool size (2, 2)
# Dropout layer with a rate of 0.25
# Two convolutional layers with 64 filters, kernel size of (3, 3), ReLU activation, and 'same' padding
# Max pooling layer with pool size (2, 2)
# Dropout layer with a rate of 0.4
# Convolutional layer with 128 filters, kernel size of (3, 3), and ReLU activation
# Max pooling layer with pool size (2, 2)
# Dropout layer with a rate of 0.4
# Flatten, then a dense layer with 512 units and ReLU activation, and dropout with a rate of 0.5
# Dense (fully connected) layer with 9 units and softmax activation for multi-class classification
# Print the summary of the model architecture


model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same', input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same'))

model.add(MaxPool2D(pool_size=(2, 2)))

model.add(Dropout(0.25))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.4))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.4))

model.add(Flatten())

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(9, activation='softmax'))

model.summary()

In [None]:
from tensorflow.keras.optimizers import legacy
# Create an instance of the Adam optimizer with custom parameters

optimizer = legacy.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
# Compile the model with categorical cross-entropy loss, the custom optimizer, and accuracy metric

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
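
`categorical_crossentropy` expects one-hot labels, which is what `flow_from_directory` yields with its default `class_mode='categorical'`. Datasets built with `image_dataset_from_directory` yield integer labels by default and pair with `sparse_categorical_crossentropy` instead. The sketch below, using made-up probabilities, shows that the two losses compute the same quantity from the two label encodings.

```python
import math

# Hypothetical softmax output over nine classes (sums to 1)
probs = [0.05, 0.05, 0.10, 0.60, 0.05, 0.05, 0.04, 0.03, 0.03]
label = 3  # integer (sparse) label

# One-hot encoding of the same label
one_hot = [1.0 if i == label else 0.0 for i in range(len(probs))]

# Categorical cross-entropy sums over all classes; only the true class contributes
categorical = -sum(t * math.log(p) for t, p in zip(one_hot, probs))
# Sparse categorical cross-entropy indexes the true class directly
sparse = -math.log(probs[label])

print(categorical, sparse)
```

The choice between the two losses therefore depends only on how the labels are encoded, not on the model.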


In [None]:
# Create an instance of the ReduceLROnPlateau callback
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
    patience=3,
    verbose=1,
    factor=0.5,
    min_lr=0.00001)
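
To see what this callback does over a run, the sketch below simulates a simplified version of the plateau rule with the same `patience`, `factor`, and `min_lr`. The helper `simulate_reduce_lr` and the validation accuracies are made up for illustration, and the sketch ignores the callback's `min_delta` and `cooldown` options.

```python
def simulate_reduce_lr(val_accuracies, lr=1e-3, patience=3, factor=0.5, min_lr=1e-5):
    """Return the learning rate in effect after each epoch (simplified plateau rule)."""
    best = float("-inf")
    wait = 0
    schedule = []
    for acc in val_accuracies:
        if acc > best:
            # Improvement: remember it and reset the patience counter
            best, wait = acc, 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: shrink the rate, but not below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

val_accuracies = [0.30, 0.35, 0.34, 0.33, 0.32, 0.40, 0.39, 0.38, 0.37]  # made-up values
schedule = simulate_reduce_lr(val_accuracies)
print(schedule)
```

In this made-up run the rate is halved at the fifth epoch and again at the ninth, each time after three epochs without a new best validation accuracy.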

In [None]:
# Train the model using the training data batches

epochs = 20
batch_size = 10
history = model.fit(train_batches,
  epochs = epochs, verbose = 1, validation_data=valid_batches , callbacks=[learning_rate_reduction])

In [None]:
# Retrieve the accuracy and loss values from the training history
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

# Print the available keys in the history object
print(history.history.keys())

# Create a range of epochs
epochs_range = range(epochs)
# Create a figure with subplots for accuracy and loss visualization
# Plot the training and validation accuracy over epochs

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

In [None]:
# Create a sequential model
    # Convolutional layer with 32 filters, kernel size of (3, 3), and ReLU activation
    # Max pooling layer with pool size (2, 2)
    # Convolutional layer with 64 filters, kernel size of (3, 3), and ReLU activation
    # Max pooling layer with pool size (2, 2)
    # Flatten the output from the previous layers
    # Dense (fully connected) layer with 512 units and ReLU activation
    # Dense (fully connected) layer with 9 units and softmax activation for multi-class classification

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(180, 180, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(9, activation='softmax')
])


In [None]:
# Remove the last layer from the model
model.pop()

# Add a new dense output layer with one unit per class (nine) and softmax activation
model.add(tf.keras.layers.Dense(len(class_names), activation='softmax'))


In [None]:
# Compile the model with the Adam optimizer, sparse categorical cross-entropy loss, and accuracy metric

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


In [None]:

# Evaluate the model on the training and validation datasets and obtain the loss and accuracy
# Note: this model has not been fitted, so the results reflect its initial random weights
loss, accuracy = model.evaluate(ds_train, verbose=1)
loss_v, accuracy_v = model.evaluate(ds_validation, verbose=1)

print("Accuracy: ", accuracy)
print("Validation Accuracy: ",accuracy_v)
print("Loss: ",loss)
print("Validation Loss", loss_v)


In [None]:
import matplotlib.pyplot as plt

# Count the images per class across every batch of the training dataset
# (counting only the labels avoids holding every image in memory)
counts = np.zeros(len(class_names), dtype=int)
for _, labels in ds_train:
    counts += np.bincount(labels.numpy(), minlength=len(class_names))

# Map each class name to its image count
data = dict(zip(class_names, counts.tolist()))
print(data)

# Create a bar plot to visualize the image count per class
plt.figure(figsize=(10, 5))
plt.bar(range(len(data)), list(data.values()), align='center')
plt.xticks(range(len(data)), list(data.keys()), rotation=45, ha='right')
plt.tight_layout()
plt.show()
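
The class distribution can also be read straight from the directory tree, without iterating the TensorFlow dataset. The sketch below uses two hypothetical helpers, `count_images_per_class` and `least_and_most`, and demonstrates them on a temporary directory with made-up class names and empty placeholder files.

```python
import tempfile
from pathlib import Path

def count_images_per_class(root, pattern="*.jpg"):
    """Return {class_directory_name: number of matching files directly inside it}."""
    root = Path(root)
    return {
        class_dir.name: sum(1 for _ in class_dir.glob(pattern))
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()
    }

def least_and_most(counts):
    """Return the class names with the fewest and the most images."""
    return min(counts, key=counts.get), max(counts, key=counts.get)

# Demonstration on a throwaway directory tree (made-up classes and counts)
with tempfile.TemporaryDirectory() as tmp:
    for name, n in [("class_a", 2), ("class_b", 5), ("class_c", 1)]:
        class_dir = Path(tmp) / name
        class_dir.mkdir()
        for i in range(n):
            (class_dir / f"{i}.jpg").touch()
    demo_counts = count_images_per_class(tmp)

print(demo_counts, least_and_most(demo_counts))
```

On the real data the call would be `count_images_per_class(train_data)`; the default pattern only counts files directly inside each class directory, so Augmentor's `output` subfolders are not included.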


In [None]:

%pip install Augmentor

In [None]:
import Augmentor

path_to_training_dataset = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Train/"

# Iterate over the class names
for i in class_names:
    # Create an Augmentor pipeline for each class using the path to the training dataset
    p = Augmentor.Pipeline(path_to_training_dataset + i)
    # Apply rotation augmentation with a probability of 0.7 and a maximum rotation of 10 degrees in each direction
    p.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
    # Generate and save 500 augmented images for each class
    p.sample(500)

In [None]:
# Count the augmented images that Augmentor wrote to each class's 'output' subdirectory
count_train = len(list(train_data.glob('*/output/*.jpg')))
# Print the total count of augmented images
print(count_train)

In [None]:
import glob
import os

# Create a list of file paths matching the pattern '*/output/*.jpg' in the training dataset directory
path_list = glob.glob(os.path.join(train_data, '*', 'output', '*.jpg'))

# Print the list of file paths
print(path_list)


In [None]:


# Create a list of lesion names by extracting the directory names from the file paths
lesion_list_new = [os.path.basename(os.path.dirname(os.path.dirname(y))) for y in glob.glob(os.path.join(train_data, '*','output', '*.jpg'))]

# Print the list of lesion names
print(lesion_list_new)


In [None]:
# Create a dictionary that maps file paths to corresponding lesion names
dataframe_dict_new = dict(zip(path_list, lesion_list_new))

# Print the dictionary
print(dataframe_dict_new)


In [None]:


# Create a DataFrame from the dictionary with columns 'Path' and 'Label'
df2 = pd.DataFrame(list(dataframe_dict_new.items()), columns=['Path', 'Label'])

# Assign the DataFrame to a new variable
new_df = df2


In [None]:
# Count the occurrences of each unique value in the 'Label' column of the DataFrame
label_counts = new_df['Label'].value_counts()

# Print the value counts
print(label_counts)
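
The label extraction above relies on the `<class>/output/<file>.jpg` layout that Augmentor produces. As a small sketch of the same idea with the standard library only, using made-up paths and a hypothetical `label_from_path` helper:

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical paths following the <class>/output/<file>.jpg layout
paths = [
    "Train/melanoma/output/a.jpg",
    "Train/melanoma/output/b.jpg",
    "Train/nevus/output/c.jpg",
]

def label_from_path(path):
    """The class name is the directory two levels above the file."""
    return PurePosixPath(path).parent.parent.name

counts = Counter(label_from_path(p) for p in paths)
print(counts)
```

`Counter` plays the role of `value_counts()` here; on the real data each class should show 500 augmented images if the sampling step completed.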


In [None]:
# Set the batch size for training and evaluation
batch_size = 32

# Set the image height and width dimensions
img_height = 180
img_width = 180


In [None]:


# Set the directory path to the training dataset
train_data = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Train"

# Create a training dataset using the image_dataset_from_directory function
ds_train = tf.keras.preprocessing.image_dataset_from_directory(
    train_data,
    seed=123,
    validation_split=0.2,
    subset='training',
    image_size=(img_height, img_width),
    batch_size=batch_size
)


In [None]:


# Create a validation dataset using the image_dataset_from_directory function
ds_validation = tf.keras.preprocessing.image_dataset_from_directory(
    train_data,
    seed=123,
    validation_split=0.2,
    subset='validation',
    image_size=(img_height, img_width),
    batch_size=batch_size
)


In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, BatchNormalization, Dropout, Flatten, Dense

# Create a Sequential model
model = Sequential()

# Add the first Conv2D layer with 32 filters, kernel size (3, 3), ReLU activation, padding 'same', and input shape
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same', input_shape=input_shape))
# Add another Conv2D layer with 32 filters and ReLU activation
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='same'))

# Add MaxPooling layer with pool size (2, 2)
model.add(MaxPool2D(pool_size=(2, 2)))
# Add BatchNormalization layer
model.add(BatchNormalization())
# Add Dropout layer with a rate of 0.25
model.add(Dropout(0.25))

# Add another Conv2D layer with 64 filters, kernel size (3, 3), ReLU activation, and padding 'same'
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))
# Add another Conv2D layer with 64 filters and ReLU activation
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))

# Add MaxPooling layer with pool size (2, 2)
model.add(MaxPool2D(pool_size=(2, 2)))
# Add BatchNormalization layer
model.add(BatchNormalization())
# Add Dropout layer with a rate of 0.4
model.add(Dropout(0.4))

# Add another Conv2D layer with 128 filters, kernel size (3, 3), and ReLU activation
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
# Add BatchNormalization layer
model.add(BatchNormalization())
# Add MaxPooling layer with pool size (2, 2)
model.add(MaxPool2D(pool_size=(2, 2)))
# Add Dropout layer with a rate of 0.4
model.add(Dropout(0.4))

# Flatten the previous layer outputs
model.add(Flatten())

# Add a Dense layer with 128 units and ReLU activation
model.add(Dense(128, activation='relu'))
# Add BatchNormalization layer
model.add(BatchNormalization())
# Add Dropout layer with a rate of 0.5
model.add(Dropout(0.5))

# Add the final Dense layer with 9 units (one per class) and softmax activation
model.add(Dense(9, activation='softmax'))

# Print the model summary
model.summary()


In [None]:
# Compile the model with the specified optimizer, loss function, and metrics
# image_dataset_from_directory yields integer labels by default, so the sparse categorical loss is used
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])



In [None]:

import tensorflow as tf

# Set the batch size, image dimensions, and number of epochs
batch_size = 10
img_height = 180
img_width = 180
epochs = 50

# Set the directory path to the training dataset
train_data = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Train"

# Create a training dataset using the image_dataset_from_directory function
ds_train = tf.keras.preprocessing.image_dataset_from_directory(
    train_data,
    seed=123,
    validation_split=0.2,
    subset='training',
    image_size=(img_height, img_width),
    batch_size=batch_size
)

# Create a validation dataset using the image_dataset_from_directory function
ds_validation = tf.keras.preprocessing.image_dataset_from_directory(
    train_data,
    seed=123,
    validation_split=0.2,
    subset='validation',
    image_size=(img_height, img_width),
    batch_size=batch_size
)

# Set the buffer size for dataset prefetching
AUTOTUNE = tf.data.AUTOTUNE

# Cache and prefetch the training dataset
ds_train = ds_train.cache().prefetch(buffer_size=AUTOTUNE)

# Cache and prefetch the validation dataset
ds_validation = ds_validation.cache().prefetch(buffer_size=AUTOTUNE)

# Define a learning rate reduction callback
learning_rate_reduction = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_accuracy',
    patience=3,
    verbose=1,
    factor=0.5,
    min_lr=0.00001
)

# Train the model on the training dataset
history = model.fit(
    ds_train,
    epochs=epochs,
    verbose=1,
    validation_data=ds_validation,
    callbacks=[learning_rate_reduction]
)


In [None]:

# Get the output layer of the model
output_layer = model.layers[-1]
num_output_neurons = output_layer.units
print(f'Number of output neurons: {num_output_neurons}')

# Get the activation function of the output layer
output_activation = output_layer.activation.__name__
print(f'Activation function of the output layer: {output_activation}')

# Check the shape of the labels in the training dataset
sample_batch = next(iter(ds_train))
images, labels = sample_batch
label_shape = labels.shape
print(f'Shape of the labels: {label_shape}')

# Check the shape of the images in the training dataset
image_shape = images.shape
print(f'Shape of the images: {image_shape}')

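# Check the expected input shape of the model
# (stored under a new name so the input_shape tuple used to build the models is not overwritten)
model_input_shape = model.input_shape
print(f'Expected input shape of the model: {model_input_shape}')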



In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
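
The curves can also be summarized numerically. The sketch below is one rough heuristic, not a standard test: a hypothetical `diagnose_fit` helper compares the final training and validation accuracy against two illustrative thresholds, which are assumptions and should be tuned to the problem. The accuracy lists are made up.

```python
def diagnose_fit(train_acc, val_acc, gap_threshold=0.10, low_threshold=0.60):
    """Classify the final epoch using two illustrative thresholds."""
    final_train, final_val = train_acc[-1], val_acc[-1]
    if final_train - final_val > gap_threshold:
        # Training accuracy is well above validation accuracy
        return "possible overfitting"
    if final_train < low_threshold:
        # The model does not even fit the training data well
        return "possible underfitting"
    return "no obvious issue"

overfit_case = diagnose_fit([0.50, 0.80, 0.95], [0.48, 0.60, 0.62])
underfit_case = diagnose_fit([0.20, 0.30, 0.35], [0.22, 0.29, 0.33])
print(overfit_case, underfit_case)
```

With a real run, the inputs would be `history.history['accuracy']` and `history.history['val_accuracy']`.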

In [None]:
import tensorflow as tf

batch_size = 10
img_height = 180
img_width = 180
epochs = 50

train_data = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Train"

ds_train = tf.keras.preprocessing.image_dataset_from_directory(
  train_data,
  seed=123,
  validation_split=0.2,
  subset='training',
  image_size=(img_height, img_width),
  batch_size=batch_size)

ds_validation = tf.keras.preprocessing.image_dataset_from_directory(
  train_data,
  seed=123,
  validation_split=0.2,
  subset='validation',
  image_size=(img_height, img_width),
  batch_size=batch_size)

AUTOTUNE = tf.data.AUTOTUNE
ds_train = ds_train.cache().prefetch(buffer_size=AUTOTUNE)
ds_validation = ds_validation.cache().prefetch(buffer_size=AUTOTUNE)

# Compile with the sparse categorical loss, which matches the integer labels produced by the datasets
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

learning_rate_reduction = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_accuracy',
    patience=3,
    verbose=1,
    factor=0.5,
    min_lr=0.00001)

history = model.fit(ds_train,
  epochs=epochs,
  verbose=1,
  validation_data=ds_validation,
  callbacks=[learning_rate_reduction])


In [None]:
import matplotlib.pyplot as plt

# Get the accuracy and loss values from the history object
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)

# Plot the training and validation accuracy
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

# Plot the training and validation loss
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')

# Show the plot
plt.show()


In [None]:
test_dir = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Test"

ds_test = tf.keras.preprocessing.image_dataset_from_directory(
    test_dir,
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size
)

# Preprocess the test dataset
ds_test = ds_test.cache().prefetch(buffer_size=AUTOTUNE)

# Evaluate the model on the test dataset
test_loss, test_accuracy = model.evaluate(ds_test)

print("Test Loss:", test_loss)
print("Test Accuracy:", test_accuracy)


In [None]:
import os

# List all files/directories in the train directory
print(os.listdir(train_data))

# List all files/directories in the test directory
print(os.listdir(test_data))


In [None]:

import random

from tensorflow.keras.preprocessing import image

# Define the class names in alphabetical order, matching the label indices
# that image_dataset_from_directory assigned during training
class_names = ['actinic keratosis', 'basal cell carcinoma', 'dermatofibroma', 'melanoma', 'nevus', 'pigmented benign keratosis', 'seborrheic keratosis', 'squamous cell carcinoma', 'vascular lesion']

# Path to the directory containing the test images
test_data_dir = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Test/melanoma"

# List all the image filenames in the directory
image_files = os.listdir(test_data_dir)

# Choose a random image from the directory
random_image_file = random.choice(image_files)

# Load the image and preprocess it
img_path = os.path.join(test_data_dir, random_image_file)
img = image.load_img(img_path, target_size=(img_height, img_width))
img_array = image.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)
# No division by 255 here: the training datasets feed raw 0-255 pixel values to the model

# Make predictions using the trained model
predictions = model.predict(img_array)

# Get the predicted class label
predicted_class_index = np.argmax(predictions[0])
predicted_class_label = class_names[predicted_class_index]

# Display the predicted class label
print("Predicted class label:", predicted_class_label)


In [None]:
import random

# Define the class names in alphabetical order, matching the label indices
# that image_dataset_from_directory assigned during training
class_names = ['actinic keratosis', 'basal cell carcinoma', 'dermatofibroma', 'melanoma', 'nevus', 'pigmented benign keratosis', 'seborrheic keratosis', 'squamous cell carcinoma', 'vascular lesion']

# Path to the directory containing the test images
test_data_dir = "/content/gdrive/MyDrive/CNN/Skin_Cancer_Data/Test/melanoma"

# List all the image filenames in the directory
image_files = os.listdir(test_data_dir)

# Choose a random image from the directory
random_image_file = random.choice(image_files)

# Load the image and preprocess it
img_path = os.path.join(test_data_dir, random_image_file)
img = image.load_img(img_path, target_size=(img_height, img_width))

# Display the chosen picture
plt.imshow(img)
plt.axis('off')
plt.title('Chosen Picture')
plt.show()

# Convert the image to a NumPy array with a leading batch dimension
# No division by 255 here: the training datasets feed raw 0-255 pixel values to the model
img_array = image.img_to_array(img)
img_array = np.expand_dims(img_array, axis=0)

# Make predictions using the trained model
predictions = model.predict(img_array)

# Get the predicted class label
predicted_class_index = np.argmax(predictions[0])
predicted_class_label = class_names[predicted_class_index]

# Display the predicted class probabilities
plt.figure(figsize=(10, 5))
plt.bar(class_names, predictions[0])
plt.title('Predicted Class Probabilities')
plt.xlabel('Class')
plt.ylabel('Probability')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

# Print the predicted class label
print("Predicted class label:", predicted_class_label)
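
Looking at more than the single most likely class can help when two lesion types receive similar probabilities. The sketch below uses a hypothetical `top_k` helper and a made-up probability vector standing in for `predictions[0]`; the class order is the alphabetical one assumed throughout.

```python
class_names = [
    "actinic keratosis", "basal cell carcinoma", "dermatofibroma", "melanoma",
    "nevus", "pigmented benign keratosis", "seborrheic keratosis",
    "squamous cell carcinoma", "vascular lesion",
]

def top_k(probabilities, names, k=3):
    """Return the k (name, probability) pairs with the highest probability."""
    ranked = sorted(zip(names, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Made-up softmax output for illustration only
probabilities = [0.02, 0.05, 0.03, 0.55, 0.25, 0.04, 0.03, 0.02, 0.01]

best_three = top_k(probabilities, class_names)
for name, prob in best_three:
    print(f"{name}: {prob:.2f}")
```

With the trained model, `top_k(predictions[0], class_names)` would list the three most likely classes for the chosen image.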