# Skin Cancer Detection using CNN

Cancer is a disease in which cells develop abnormally and divide uncontrollably. These cells can infiltrate and destroy normal body tissue. Skin cancer is by far the most common type of cancer. If detected early, it is highly treatable.

Deep learning has shown exceptional results in image processing. The aim of this project is to detect skin cancer using Convolutional Neural Networks (CNN) and transfer learning. The goal is a model that can tell from a scan whether a lesion is benign or malignant, so that the cancer is detected early and appropriate medical treatment can be provided.

The TensorFlow library is used for deep learning, with the Xception Convolutional Neural Network (CNN) as the pretrained base model.

### Importing the libraries

The required libraries are imported.

In [None]:
# import libraries
import tensorflow as tf
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Lambda, Dropout, Activation, Dense, AveragePooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator 
from tensorflow.keras.preprocessing import image
import pandas as pd
import numpy as np
import plotly.express as px
import os

### Creating the directories for Kaggle

The paths of the train, test and output folders are defined.

The image dimension is also set to 299 pixels, which is the default input size of Xception.

In [None]:
trainFolder = "../input/skin-cancer-malignant-vs-benign/train/"
testFolder = "../input/skin-cancer-malignant-vs-benign/test/"
outputDir = "/kaggle/working/"
imgDim = 299

### Generalized Mean Pooling

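Generalized Mean Pooling computes the generalized mean of each channel of a tensor. For an exponent p (here "gm_exp", 3 by default), each channel is reduced to the p-th root of the mean of |x|^p over all spatial positions. Compared with plain average pooling, this gives more weight to strong activations, which helps focus on the salient features of the image.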

In [None]:
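class GeneralizedMeanPool2D(tf.keras.layers.Layer):
    def __init__(self, gm_exp=3.0, **kwargs):
        super().__init__(**kwargs)
        self.gm_exp = gm_exp  # exponent p of the generalized mean

    def call(self, inputs):
        # Take the absolute value before the power so that negative inputs
        # cannot produce NaN when gm_exp is not an integer.
        pool = tf.reduce_mean(tf.pow(tf.abs(inputs), self.gm_exp), axis=[1, 2], keepdims=False) + 1.e-8
        # p-th root of the per-channel mean over height and width
        pool = tf.pow(pool, 1.0/self.gm_exp)
        return pool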
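
The pooling formula can be checked on a handful of numbers without TensorFlow. The following is a minimal pure-Python sketch for a single channel; the helper name `generalized_mean` and the sample activations are illustrative assumptions and are not part of the notebook.

```python
def generalized_mean(values, p=3.0, eps=1e-8):
    """Generalized (power) mean of the activations of one channel."""
    # Mean of |x|^p, plus a small epsilon for numerical stability,
    # followed by the p-th root. This mirrors the layer's call() method.
    powered = [abs(v) ** p for v in values]
    return (sum(powered) / len(powered) + eps) ** (1.0 / p)


# Hypothetical activations for one channel of a feature map.
channel = [0.1, 0.2, 0.3, 2.0]

avg_like = generalized_mean(channel, p=1.0)     # behaves like average pooling
gem_default = generalized_mean(channel, p=3.0)  # the layer's default exponent
max_like = generalized_mean(channel, p=50.0)    # moves towards max pooling

print(avg_like, gem_default, max_like)
```

With `p = 1` the helper returns the plain average, and with a large exponent the result approaches the largest activation in the channel. The default of 3 sits between the two.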


### Base Model

We have defined a function "getBaseModel" that takes the number of layers to freeze, "freezedLayers", as its argument. That many layers, counted from the input, are frozen in the "Xception" CNN for transfer learning. The property "layer.trainable" is set to "False" for the layers that we want to freeze.

In [None]:
def getBaseModel(freezedLayers):
    model = Xception(input_shape=(imgDim,imgDim,3),weights='imagenet',include_top=False)
    for layer in model.layers:
        layer.trainable = True
    for layer in model.layers[:freezedLayers]:
        layer.trainable = False   
    return model
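
The slicing logic can be tried out without downloading any weights. The sketch below uses a hypothetical `StubLayer` class as a stand-in for Keras layers, and the helper `freeze_first` is likewise an illustrative name; only the `trainable` flag is modelled.

```python
from dataclasses import dataclass


@dataclass
class StubLayer:
    """Hypothetical stand-in for a Keras layer; only the flag matters here."""
    name: str
    trainable: bool = True


def freeze_first(layers, n_frozen):
    """Mark the first n_frozen layers as frozen and the rest as trainable."""
    for layer in layers:
        layer.trainable = True
    for layer in layers[:n_frozen]:
        layer.trainable = False
    return layers


# 132 stand-in layers, the layer count this notebook quotes for Xception.
layers = freeze_first([StubLayer(f"layer_{i}") for i in range(132)], n_frozen=85)
n_trainable = sum(layer.trainable for layer in layers)
print(f"frozen: {len(layers) - n_trainable}, trainable: {n_trainable}")
```

Because the slice `[:n_frozen]` starts at the input side, the layers that stay trainable are the ones closest to the output.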

### Additional Layers

A function "getAttachmentForModel" defines the additional layers that are trained on top of the base model: a generalized mean pooling layer, a dropout layer, a ReLU activation and a dense output layer with 2 units.

In [None]:
def getAttachmentForModel(model):
    X_feat = Input(model.output_shape[1:])
    X = GeneralizedMeanPool2D()(X_feat)
    X = Dropout(0.05)(X)
    X = Activation('relu')(X)
    X = Dense(2, activation='sigmoid')(X)
    return Model(inputs=X_feat, outputs=X)

### Final Model

The "getFinalModel" function takes the base model as input and chains it with the new layers, producing a single model that contains the frozen layers as well as the layers that are trained.

In [None]:
def getFinalModel(baseModel):
    attachment = getAttachmentForModel(baseModel)
    imageInput = Input((imgDim, imgDim, 3))
    finalModel = baseModel(imageInput)
    finalModel = attachment(finalModel)
    finalModel = Model(inputs=imageInput, outputs=finalModel)
    return finalModel

### Save History

The "saveModel" function saves the trained model and writes the training history, including the accuracy for each epoch, to a CSV file. This file can be used later for visualizations and drawing conclusions.

In [None]:

def saveModel(freezedLayers, model, history):
    # Save the trained model under the output directory.
    modelSavePath = f'{outputDir}model'
    os.makedirs(modelSavePath, exist_ok=True)
    model.save(modelSavePath)

    # Write the per-epoch history as CSV, tagged with the number of frozen layers.
    df = pd.DataFrame(history.history)
    df.index.name = "Epoch"
    df["freezed_layer"] = f"{freezedLayers}"
    df.to_csv(f'{outputDir}{freezedLayers}.csv')
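
As a sketch of how the saved history could be used afterwards, the snippet below picks the epoch with the best validation accuracy from a history file. The rows are made up for illustration and are not results from this notebook; the column list is shortened, and the helper name `best_epoch` is an assumption. Keras prefixes validation metrics with `val_`, so the column is assumed to be called `val_binary_accuracy`.

```python
import csv
import io

# Made-up history rows for illustration only; not results from this notebook.
sample_csv = """Epoch,loss,binary_accuracy,val_loss,val_binary_accuracy,freezed_layer
0,0.62,0.70,0.58,0.72,85
1,0.48,0.79,0.51,0.77,85
2,0.41,0.83,0.53,0.76,85
"""


def best_epoch(csv_file, metric="val_binary_accuracy"):
    """Return the row (as a dict) with the highest value of the given metric."""
    rows = list(csv.DictReader(csv_file))
    return max(rows, key=lambda row: float(row[metric]))


best = best_epoch(io.StringIO(sample_csv))
print(f"best epoch: {best['Epoch']}, val_binary_accuracy: {best['val_binary_accuracy']}")
```

To use it on a real run, the `io.StringIO` object would be replaced by the opened history file in the output directory.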


### Image Augmentation

Image augmentation is a technique that is used to expand the dataset artificially. For the train dataset, we have set "rescale" to 1/255, "shear_range" to 0.2, "zoom_range" to 0.2 and "horizontal_flip" to True. The test dataset is only rescaled by 1/255, so that the model is evaluated on unaltered images.

For both the train and test dataset, we pass the corresponding folder and set "target_size" to 299 pixels by 299 pixels, "batch_size" to 48 images and "class_mode" to "categorical".

In [None]:
trainGenerator = ImageDataGenerator(
    rescale = 1./255,                                     
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True)

testGenerator = ImageDataGenerator(rescale = 1./255)

trainData = trainGenerator.flow_from_directory(
    trainFolder,
    target_size = (299, 299),
    batch_size = 48,
    class_mode = 'categorical')

testData = testGenerator.flow_from_directory(
    testFolder, 
    target_size = (299, 299),
    batch_size = 48, 
    class_mode = 'categorical')
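
With "flow_from_directory", the class labels come from the subfolder names, taken in alphanumeric order, and "categorical" mode turns them into one-hot vectors. The sketch below reproduces that mapping in plain Python, assuming the subfolders are named `benign` and `malignant` as the paths used in this notebook suggest. The helper names `class_indices` and `one_hot` are illustrative.

```python
def class_indices(folder_names):
    """Map class folder names to integer indices in sorted (alphanumeric) order."""
    return {name: index for index, name in enumerate(sorted(folder_names))}


def one_hot(index, num_classes):
    """Encode an integer class index as a one-hot list, as 'categorical' mode does."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


# Assumed subfolder names of the train/ and test/ directories.
indices = class_indices(["malignant", "benign"])
print(indices)
print(one_hot(indices["malignant"], len(indices)))
```

In the notebook itself, the actual mapping can be read from `trainData.class_indices` after the generators have been created.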


### Train the Model

Now we train the model. The Xception CNN model has 132 layers. The plan is to start from 130 frozen layers and go down to 85 layers in steps of 5 layers to find the setting with the best results. The loop as written below runs only the configuration with 85 frozen layers; "range(130, 84, -5)" would cover the full sweep.

The loss function used is "binary_crossentropy" and the optimizer is "adam".

In [None]:
for i in range(85,84,-5):
    print(f"Using {i} freezed layers")
    freezedLayers = i
    finalModel = getFinalModel(getBaseModel(freezedLayers))
    finalModel.summary()
    finalModel.compile(
        loss='binary_crossentropy',
        optimizer='adam',
        metrics=[
            'binary_accuracy', 
            'AUC',
            tf.keras.metrics.TruePositives(name='true_positives'),
            tf.keras.metrics.TrueNegatives(name='true_negatives'),
            tf.keras.metrics.FalsePositives(name='false_positives'),
            tf.keras.metrics.FalseNegatives(name='false_negatives'),
        ]
    )
    history = finalModel.fit(
        trainData,
        validation_data=testData,
        epochs=100,
    )
    saveModel(freezedLayers,finalModel,history)
    print("Saved and dusted")
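
The confusion-matrix counts tracked during training can be turned into the measures that matter for screening, such as sensitivity and specificity. The following is a minimal sketch; the helper name `summarize_confusion` is an assumption and the counts are made up for illustration, not taken from a training run. As far as we can tell, with two output units and one-hot labels the Keras counters are accumulated over both output columns, so they count label entries rather than images.

```python
def summarize_confusion(tp, tn, fp, fn):
    """Derive common screening metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of positives that were caught
        "specificity": tn / (tn + fp),  # share of negatives correctly cleared
        "precision": tp / (tp + fp),    # share of positive calls that were right
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Made-up counts for illustration only; not results from this notebook.
summary = summarize_confusion(tp=80, tn=90, fp=10, fn=20)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For a cancer detector, sensitivity is usually the number to watch, since a false negative means a malignant case goes unnoticed.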

In [None]:
!ls /kaggle/working/model

In [None]:
!zip -o /kaggle/working/model.zip -r /kaggle/working/model 

### Prediction

The saved model is loaded again and used to classify a single test image. The image is resized to 299 pixels by 299 pixels and rescaled by 1/255, as during training, and the class with the highest score is printed.


In [None]:
category={
    0:'benign',1:'malignant'
}

model = tf.keras.models.load_model(f'{outputDir}model')
img_ = image.load_img("/kaggle/input/skin-cancer-malignant-vs-benign/test/malignant/1.jpg", target_size=(299, 299))
img_array = image.img_to_array(img_)
img_processed = np.expand_dims(img_array, axis=0) 
img_processed /= 255.  
pred = model.predict(img_processed)
index = np.argmax(pred)
print(category[index])
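
The last two lines of the cell above can be written as a small helper that also reports the score of the chosen class. This is a pure-Python sketch; the name `decode_prediction` and the score values are hypothetical.

```python
category = {0: "benign", 1: "malignant"}


def decode_prediction(scores, labels):
    """Return the label with the highest score together with that score."""
    index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[index], scores[index]


# Hypothetical output row from model.predict for one image.
label, score = decode_prediction([0.12, 0.91], category)
print(f"{label} (score {score:.2f})")
```

Since the output layer uses two independent sigmoid units, the two scores do not have to sum to 1, so the score should be read as a per-class confidence and not as a probability over both classes.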