<a href="https://colab.research.google.com/github/gabcastro/Unisinos-TopicosEspeciaisII-AnaliseImagens/blob/master/Melanoma_Detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Melanoma Detection

![Melanoma](https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Melanoma.jpg/300px-Melanoma.jpg)

Melanoma is a type of skin cancer. It is less common than other skin cancers but more dangerous. Images of melanoma found online typically show a dark-black spot on the skin. In this notebook we build a Siamese Convolutional Neural Network with TensorFlow that distinguishes two classes: **Melanoma and Non-Melanoma**.




## 1) Importing the packages
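We import TensorFlow, its bundled Keras API, and NumPy. We also set TensorFlow's log verbosity to `tf.logging.ERROR` so that only errors are printed in the output. Note that `tf.logging` and `tf.VERSION` are TensorFlow 1.x APIs; they are not available under those names in TensorFlow 2.x.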




In [0]:

from tensorflow import keras
import tensorflow as tf
import numpy as np

tf.logging.set_verbosity( tf.logging.ERROR )

print( tf.VERSION )


## 2) Downloading the preprocessed data
Processing and loading the raw data in a notebook can be time-consuming, so we download the already processed data directly from GitHub. The data was collected and processed in these steps:

Images ->

1. Images of melanoma were collected from the internet ( using the [Fatkun Chrome Extension](https://chrome.google.com/webstore/detail/fatkun-batch-download-ima/nnjjahlikiabnchcpehcpkdeckfgnohf) ). Some images of healthy skin were also collected.
2. Because the dataset was small, the images were augmented to increase their number up to 5041. They were also resized to 32 x 32 pixels.
3. The RGB values were normalized ( scaled to the range [ 0 , 1 ] ).
4. The images were converted to a NumPy array of shape ( 5041 , 32 , 32 , 3 ).

Labels ->

1. Pairs of images were generated.
2. If both images in a pair belonged to the same class, the label 1 was assigned; otherwise the label 0 was assigned.
3. The labels were converted to a NumPy array of shape ( 5041 , 1 ).
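
The normalization and pair-labelling steps above can be sketched in NumPy as follows. This is only an illustration and not the script that produced the downloaded arrays: the function name `make_pairs`, the random pairing strategy, the division by 255 ( which assumes 8-bit RGB input ), and the toy data are all assumptions.

```python
import numpy as np

def make_pairs(images, classes, num_pairs, seed=0):
    """Randomly pair images; label 1 if both share a class, else 0.

    `images` is a uint8 array of shape (N, H, W, 3) and `classes` an
    integer array of shape (N,). The name and pairing strategy are
    hypothetical, chosen only for this sketch.
    """
    rng = np.random.default_rng(seed)
    # Normalize RGB values from [0, 255] to [0, 1].
    images = images.astype(np.float32) / 255.0
    # Pick two random image indices for every pair.
    idx1 = rng.integers(0, len(images), size=num_pairs)
    idx2 = rng.integers(0, len(images), size=num_pairs)
    x1, x2 = images[idx1], images[idx2]
    # Label is 1 when both images belong to the same class, 0 otherwise.
    y = (classes[idx1] == classes[idx2]).astype(np.float32).reshape(-1, 1)
    return x1, x2, y

# Toy data: 10 random 32x32 RGB images with class 0 (non-melanoma) or 1 (melanoma).
toy_images = np.random.default_rng(1).integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
toy_classes = np.array([0, 1, 0, 1, 1, 0, 0, 1, 0, 1])

x1, x2, y = make_pairs(toy_images, toy_classes, num_pairs=20)
print(x1.shape, x2.shape, y.shape)
```

Each row of `y` describes the relationship between the two images of a pair, not the class of either image on its own.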



In [0]:

import requests, zipfile, io

r = requests.get( 'https://github.com/shubham0204/Dataset_Archives/blob/master/mel_detect_processed.zip?raw=true' ) 
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall()


## 3) Reshaping the data

We have a total of 5041 image pairs and 5041 labels. We split them 4500 / 541 for training and validation respectively ( the validation arrays are named `test_*` in the code ).

* We flatten the last 3 dimensions of each image array, so the shape changes from ( 5041 , 32 , 32 , 3 ) to ( 5041 , 3072 ).
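
To make the shape change concrete, here is a small sketch on a dummy array. The array `batch` and its size of 5 images are made up for illustration; only the 32 x 32 x 3 layout is taken from the dataset description.

```python
import numpy as np

# Dummy batch of 5 images with the same layout as the dataset: (N, 32, 32, 3).
batch = np.arange(5 * 32 * 32 * 3, dtype=np.float32).reshape(5, 32, 32, 3)

# Flatten height, width and channels into one axis of length 32 * 32 * 3 = 3072.
flat = batch.reshape(batch.shape[0], -1)
print(flat.shape)  # (5, 3072)

# Reshaping back recovers the original images exactly, so no information is lost.
restored = flat.reshape(-1, 32, 32, 3)
print(np.array_equal(batch, restored))  # True
```

The model's first layer performs the reverse reshape, so flattening here only changes how the data is stored.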


In [0]:

DIMEN = 32

X1 = np.load( 'processed_data/x1.npy')
X2 = np.load( 'processed_data/x2.npy')
Y = np.load( 'processed_data/y.npy')

X1 = X1.reshape( ( X1.shape[0]  , DIMEN**2 * 3  ) ).astype( np.float32 )
X2 = X2.reshape( ( X2.shape[0]  , DIMEN**2 * 3  ) ).astype( np.float32 )

train_X1 = X1[ : 4500 ] 
train_X2 = X2[ : 4500 ] 
train_Y = Y[ : 4500 ] 

test_X1 = X1[ 4500 : ]
test_X2 = X2[ 4500 : ] 
test_Y = Y[ 4500 : ]
 
print(  train_X1.shape )
print(  train_X2.shape )
print(  train_Y.shape )

print(test_X1.shape )
print(test_X2.shape )
print(test_Y.shape) 


## 4) Defining the Model

We use Keras to build our Siamese Convolutional Neural Network. The layers will perform operations in this manner :

1. Reshape the input ( from ( None , 3072 ) to ( None , 32 , 32 , 3 ) ).
2. Extract features using `Conv2D` and `MaxPooling2D` layers.
3. Use a `Dense` layer with the `sigmoid` activation function to produce a feature vector for each image.
4. Compute the element-wise absolute difference between the two feature vectors.
5. Produce a similarity score from that difference using a single `Dense` unit with `sigmoid`.

Since the output is binary, we use the `tf.keras.losses.binary_crossentropy` loss function with the `tf.keras.optimizers.Adam` optimizer at a learning rate of 0.0001.

You can read more about Siamese networks [here](https://medium.com/predict/face-recognition-from-scratch-using-siamese-networks-and-tensorflow-df03e32f8cd0).
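
For reference, binary cross-entropy can be written out in a few lines of NumPy. The sketch below is not the Keras implementation; the function name, the clipping constant `eps`, and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0, 0.0])      # 1 = same class, 0 = different class
confident = np.array([0.9, 0.1, 0.8, 0.2])   # predictions close to the labels
unsure = np.array([0.5, 0.5, 0.5, 0.5])      # predictions that carry no information

print(binary_crossentropy(y_true, confident))
print(binary_crossentropy(y_true, unsure))   # equals ln(2)
```

Predictions close to the true labels give a loss near zero, while a constant prediction of 0.5 gives ln(2), about 0.693.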


In [0]:

import tensorflow.keras.backend as K

input_shape = ( (DIMEN**2) * 3 , )
convolution_shape = ( DIMEN , DIMEN , 3 )

kernel_size_1 = ( 8 , 8 )
kernel_size_2 = ( 6 , 6 )
kernel_size_3 = ( 4 , 4 )

pool_size_1 = ( 6 , 6 )
pool_size_2 = ( 4 , 4 )

strides = 1

seq_conv_model = [

    tf.keras.layers.Reshape( input_shape=input_shape , target_shape=convolution_shape),

    tf.keras.layers.Conv2D( 32, kernel_size=kernel_size_1 , strides=strides ,activation='relu' ),
    tf.keras.layers.MaxPooling2D(pool_size=pool_size_1, strides=strides ),

    tf.keras.layers.Conv2D( 64, kernel_size=kernel_size_2 , strides=strides ,activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=pool_size_2 , strides=strides),

    tf.keras.layers.Conv2D( 128, kernel_size=kernel_size_3 , strides=strides ,activation='relu'),

    tf.keras.layers.Flatten(),

    tf.keras.layers.Dense( 3076 , activation=tf.keras.activations.sigmoid )

]

seq_model = tf.keras.Sequential( seq_conv_model )

input_x1 = tf.keras.layers.Input( shape=input_shape )
input_x2 = tf.keras.layers.Input( shape=input_shape )

output_x1 = seq_model( input_x1 )
output_x2 = seq_model( input_x2 )

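# Element-wise absolute ( L1 ) difference between the two feature vectors.
# Despite the variable name, this is not a Euclidean distance.
distance_euclid = tf.keras.layers.Lambda( lambda tensors : K.abs( tensors[0] - tensors[1] ))( [output_x1 , output_x2] )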
outputs = tf.keras.layers.Dense( 1 , activation=tf.keras.activations.sigmoid) ( distance_euclid )
model = tf.keras.models.Model( [ input_x1 , input_x2 ] , outputs )

model.compile( loss=tf.keras.losses.binary_crossentropy , optimizer=tf.keras.optimizers.Adam(lr=0.0001) , metrics=['accuracy'])

model.summary()
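
The spatial sizes reported by `model.summary()` can be checked by hand. The sketch below assumes the Keras default of `padding='valid'` ( the code above does not set `padding` ); the helper `valid_output_size` is a hypothetical name used only here.

```python
def valid_output_size(size, window, stride=1):
    """Output length of a convolution or pooling layer with 'valid' padding."""
    return (size - window) // stride + 1

size = 32  # DIMEN

# (layer, window size) in the order used by the model; the stride is 1 throughout.
layers = [
    ("Conv2D 8x8", 8),
    ("MaxPooling2D 6x6", 6),
    ("Conv2D 6x6", 6),
    ("MaxPooling2D 4x4", 4),
    ("Conv2D 4x4", 4),
]

for name, window in layers:
    size = valid_output_size(size, window)
    print(name, "->", size, "x", size)

# The last Conv2D layer has 128 filters, so Flatten yields size * size * 128 values.
flattened = size * size * 128
print("Flatten ->", flattened)
```

Because the pooling layers use a stride of 1, they shrink the feature maps only slightly, which leaves a large flattened vector feeding the `Dense` layer.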


## 5) Training the model
We train the model for 25 epochs with a batch size of 100 samples.

**Inputs > `train_X1` and `train_X2`**

**Outputs > `train_Y`**


In [0]:

model.fit( [ train_X1 , train_X2 ] , train_Y , epochs=25 , batch_size=100 ) 
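
As a quick sanity check on these settings, the number of weight updates follows directly from the split size, batch size and epoch count. The variable names below are chosen for this sketch only.

```python
import math

num_samples = 4500   # size of the training split
batch_size = 100
epochs = 25

# One weight update per batch; the last batch may be smaller if the
# sample count is not a multiple of the batch size.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 45 1125
```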


## 6) Evaluate the Model

We evaluate the model's loss and accuracy on the `test_X1` and `test_X2` arrays against the labels in `test_Y`.



In [0]:

metrics = model.evaluate( [ test_X1 , test_X2 ] , test_Y ) 
print( 'Loss is {} and accuracy is {} %'.format( metrics[0] , metrics[1] * 100 ) ) 
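
To see what the accuracy figure means for a sigmoid output, the sketch below thresholds similarity scores at 0.5 and compares them with the labels. It is an approximation of how a binary accuracy metric behaves, not the Keras code; the function name and sample scores are made up. In practice the scores would come from `model.predict( [ test_X1 , test_X2 ] )`.

```python
import numpy as np

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label."""
    y_pred = (y_prob > threshold).astype(np.float32)
    return float(np.mean(y_pred == y_true))

# Hypothetical labels and similarity scores for four image pairs.
y_true = np.array([[1.0], [0.0], [1.0], [0.0]])
y_prob = np.array([[0.92], [0.15], [0.40], [0.08]])

print(binary_accuracy(y_true, y_prob))  # 0.75
```

The third pair is scored 0.40, below the threshold, so it counts as a miss even though its label is 1.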


## 7) Converting to TensorFlow Lite model

Since we are going to use this model on Android, we need to convert it to a .tflite file ( a [TensorFlow Lite](https://www.tensorflow.org/lite) model ). This model will be loaded in Android and used for inference.

**Tip: Perform [Post Training Quantization](https://www.tensorflow.org/lite/performance/post_training_quantization) to reduce the model's size.**


In [0]:

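# Save the Keras model, then convert it with the TensorFlow 1.x converter API.
model.save( 'model.h5' ) 
converter = tf.lite.TFLiteConverter.from_keras_model_file( 'model.h5' ) 
# Enable post-training quantization to shrink the converted model.
converter.post_training_quantize = True
tflite_model = converter.convert()
# write() returns the number of bytes written, which is the model size.
with open( 'recog_model.tflite' , 'wb' ) as f:
    size = f.write( tflite_model )
print( 'Model size > {} Megabytes'.format( size / 1024**2))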
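
A rough estimate shows why quantization matters for this model. The sketch below counts only the parameters of the large `Dense( 3076 )` layer and assumes it receives 10368 inputs ( 9 x 9 x 128, which holds if the convolution and pooling layers use the Keras default `'valid'` padding ). It also assumes that quantized weights take about 1 byte instead of 4. The helper name `size_in_mb` is hypothetical, and the real file size will differ because of the other layers and file overhead.

```python
def size_in_mb(num_params, bytes_per_weight):
    """Approximate storage needed for `num_params` weights, in megabytes."""
    return num_params * bytes_per_weight / 1024**2

# Dense layer: 10368 inputs * 3076 units, plus 3076 biases.
dense_params = 10368 * 3076 + 3076

float32_mb = size_in_mb(dense_params, 4)  # 32-bit floating point weights
int8_mb = size_in_mb(dense_params, 1)     # 8-bit quantized weights (assumed)

print(dense_params)
print(round(float32_mb, 1), round(int8_mb, 1))
```

Under these assumptions the quantized weights need about a quarter of the space of the floating point weights.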
