# Skin Cancer detection 

## downloading tensorflow for the project
##### run the following command in a code sell to download tensorflow



```terminal
!pip install tensorflow
```



### transfer learning is used to make this project 
####  <u>Transfer learning</u> :   Transfer Learning is basically when you train a model on a data and then used that trained model on a diffrent but similer data so that you don't have to train the entire model , which may take a lot of time and GPU power and for a good preformance will also required a large data set, which we know may not be peresent and making our own data set is tedious and expencive task

##### we will be using VGG16 and VGG19

#### you can donload the data set form [here](https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia)

In [5]:
# importing all the necessary libraries
from keras.layers import Input, Lambda, Dense, Flatten
from keras.models import Model
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential 
import numpy as np
import glob as glob
import matplotlib.pyplot as plt

In [2]:
# re-sizing all the images ,as we are delaing with rgb images so it will have 3 dimentions
IMAGE_SIZE = [244, 244]

train_path = 'skin_cancer/train'
valid_path = 'skin_cancer/test'

In [3]:
# using vgg16 model , and we will be using imagenet weights, as they were the one who win the competion and so ther weights 
# are of the trained modle so we won't have to train the whole model , this is transfer learnig 
vgg = VGG16(input_shape=IMAGE_SIZE + [3], weights= 'imagenet', include_top=False)

# [3] is for the dimention of the rgb imgae

# include_top = False means we don't want the exact model as we are going to use a diffrent data set so , we will train 
# the the last layer and the output layer for out dataset so that the model predicts for this particular data set
# in the vgg16 starting layers of the cnn identify the basic pattrens that are persent in all the images like lines, edges etc, 
# so we don't need to change the staring layers for that we have keept them as same and removed the top part that we will be 
# training for out data set

In [4]:
# we are loading a pretrained model so we don't want to change the weights that we have laoded in the avobe code cell 
for layer in vgg.layers:
        layer.trainable = False    # so that we don't train the layers

In [5]:
from glob import glob
folders = glob("data_set/train/*")

# this will tell us which folders are present in Data_set/train , * repersents all , 

In [6]:
folders

['Data_set/train\\NORMAL', 'Data_set/train\\PNEUMONIA']

In [7]:
x = Flatten()(vgg.output)
# here flatten is bsically reducing the dimentionality of the layers , making it in 1D
# vgg.output is the output layer and we are flattening the output layre

In [8]:
prediction = Dense(len(folders), activation= "softmax")(x)
# x is our output layer and we are usign Dense function to five it no. of neurons that there are in the train file
# for this case there are only two , pneumonia and not pneumonia 
# we are using activatin function softmax

model = Model(inputs = vgg.input, outputs = prediction)
# we are giving the model the inputs that are pretrained form vgg and outputs according to out data set

In [9]:
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 244, 244, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 244, 244, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 244, 244, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 122, 122, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 122, 122, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 122, 122, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 61, 61, 128)       0     

In [10]:
# now we will compile the model , i.e. giving it all the things it will need to train well and make good predictions 
model.compile(loss = 'categorical_crossentropy', 
             optimizer = "adam",
             metrics = ["accuracy"])

In [11]:
# using image generator to import the images form the data set
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255,
                                  shear_range=0.2,
                                  zoom_range=0.2, 
                                  horizontal_flip=True)

# here we are configuring the ImageDataGenerator with the parameters like rescale and all so that our model performs well
# it is adviced that if all the data is in same range then its better for the model 
### the basic reason we are doing this is because we want to increase out tring data somehow so that our model doesn't overfit
### this data augmentation is helping us with generarating more data for the training 

# rescale = 1./255 is rescaling the image's pixel value that ranges form 0 to 255 by deviding by 255 we can make it in range of 
# [0,1], this is a convention for better training 

# shear_range=0.2 shearing means keeping one part of the image fixed and move the other part in one direction keeping the 
# image stationary, this can introduce diversity into the training data

# horizontal_flip is used to mirror image this helps in making the model invarient to horizontal flips


test_datagen = ImageDataGenerator(rescale = 1./255)

In [12]:
# training data
# make sure to provide the same target size as initialized before 
training_set = train_datagen.flow_from_directory("data_set/train",
                                                target_size=(244,244),
                                                batch_size=32,
                                                class_mode= "categorical")

# at once 32 images will be given for training and we have to categorize the images 
# target size is (244, 244) this is done to ensure that the size of all the images is same when being fed to the model


Found 5216 images belonging to 2 classes.


In [13]:
# doing the same for test data
test_set = test_datagen.flow_from_directory("data_set/test",
                                                target_size=(244,244),
                                                batch_size=32,
                                                class_mode= "categorical")

Found 624 images belonging to 2 classes.


In [14]:
# now we train our model 
xray = model.fit(training_set, validation_data= test_set,
                          epochs = 5,
                          steps_per_epoch = int(len(training_set)),
                          validation_steps = int(len(test_set))
                          )

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [1]:
from keras.models import load_model
model.save("VGG16_skin_cancer.h5")
# loaded_model = load_model("VGG16_skin_cancer.h5")


# # alternate way to save :- this one will only save model weights nothing else 
# # the the above code save everything like optimizer, loss function, the whole 
# # architechture of the model.

# model.save_weights("model.h5")
# print("Saved model to disk")

NameError: name 'model' is not defined

In [2]:
from keras.models import load_model
loaded_model = load_model("VGG16_skin_cancer.h5")

In [7]:
# image has to be converted into numbers --> array to before to be processed

img = image.load_img( "data_set\\test\\malignant\\13.jpg", target_size = (244,244))
x = image.img_to_array(img)
x = np.expand_dims(img, axis = 0)
img_data = preprocess_input(x)
ans = loaded_model.predict(img_data)



In [12]:
ans

array([[0., 1.]], dtype=float32)

In [13]:
if ans[0][0] == 0:
    print("you dont have Skin Cancer")
else:
    print("Sorry, you have Skin Cancer")

you dont have Pneumonia


In [15]:
img = image.load_img(  "data_set\\test\\benign\\154.jpg", target_size = (244,244))
x = image.img_to_array(img)
x = np.expand_dims(img, axis = 0)
img_data = preprocess_input(x)
ans = loaded_model.predict(img_data)


                



In [16]:
if ans[0][0] == 0:
    print("you dont have Skin Cancer")
else:
    print("Sorry, you have Skin Cancer")

Sorry, you have Pneumonia
