# Melanoma classification
Melanoma is a common type of skin cancer. Searching the internet for it turns up images that show a dark, often black, spot on the skin. In this notebook we'll create a Siamese Convolutional Neural Network with TensorFlow to distinguish two classes: **Melanoma and Non-Melanoma**.
## 1) Import the packages.
We import the TensorFlow package along with NumPy and Keras. We also set TensorFlow's logging verbosity to "ERROR" so that only errors are printed in the output.

In [2]:
from tensorflow import keras
import tensorflow as tf
import numpy as np

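# tf.logging was removed in TensorFlow 2.x, where it raises
# "AttributeError: module 'tensorflow' has no attribute 'logging'".
# The compat.v1 path is available in late 1.x releases and in 2.x.
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
print(tf.__version__)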

## 2) Downloading the preprocessed data
Processing and loading the data in a notebook can be time-consuming, so we download the already processed data directly from GitHub. The data was collected and processed in these steps:
Images =>
    1. Images of melanoma were collected from the Internet (using the [Fatkun Chrome Extension](https://chrome.google.com/webstore/detail/fatkun-batch-download-ima/nnjjahlikiabnchcpehcpkdeckfgnohf)). Some images of healthy skin were also collected.
    2. Because so few images were available, they were augmented to bring the total up to 5041. They were also resized to 32x32 px.
    3. The RGB values were normalized (scaled to the range [0, 1]).
    4. The images were converted to a NumPy array of shape (5041, 32, 32, 3).
Labels =>
    1. Pairs of images were generated.
    2. If both images in a pair belong to the same class, the label 1 was assigned; otherwise 0 was assigned.
    3. The labels were converted to a NumPy array of shape (5041, 1).
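
The normalization and pair-labelling steps can be sketched roughly as follows. This is only an illustration on random toy data: the helper names `normalize_images` and `make_pairs`, the random pairing strategy, and the class encoding (1 = melanoma, 0 = non-melanoma) are assumptions, not the procedure that produced the downloaded arrays.

```python
import numpy as np

def normalize_images(images_uint8):
    """Scale 8-bit RGB values from [0, 255] to [0, 1]."""
    return images_uint8.astype(np.float32) / 255.0

def make_pairs(images, class_ids, rng):
    """Pair every image with a randomly chosen partner (assumed strategy).

    Returns (x1, x2, y), where y[i] is 1 if both images of pair i
    belong to the same class and 0 otherwise.
    """
    partner = rng.permutation(len(images))
    x1 = images
    x2 = images[partner]
    y = (class_ids == class_ids[partner]).astype(np.float32).reshape(-1, 1)
    return x1, x2, y

rng = np.random.default_rng(0)
# Hypothetical toy data: six random 32x32 RGB images
images = rng.integers(0, 256, size=(6, 32, 32, 3), dtype=np.uint8)
class_ids = np.array([1, 1, 1, 0, 0, 0])  # assumed encoding: 1 = melanoma

x1, x2, y = make_pairs(normalize_images(images), class_ids, rng)
print(x1.shape, x2.shape, y.shape)  # (6, 32, 32, 3) (6, 32, 32, 3) (6, 1)
```

The resulting shapes mirror the ones listed above, with 6 in place of 5041.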

In [7]:
import requests, zipfile, io

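# Use the "raw" URL: the "blob" URL returns GitHub's HTML page instead of the
# archive, which is what causes "BadZipFile: File is not a zip file".
r = requests.get('https://github.com/chunche95/IntroMachineLearning/raw/master/contents/Programas/Melanoma-Images/mel_detect_processed.zip')
r.raise_for_status()  # stop early on an HTTP error
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall()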
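
A "BadZipFile" error at this step usually means the downloaded bytes are not a zip archive at all (for example, an HTML page). A minimal sketch of a guard that checks the payload before extracting is shown below; the helper name `extract_zip_bytes` is hypothetical, and the demo builds a tiny in-memory archive instead of touching the network.

```python
import io
import tempfile
import zipfile

def extract_zip_bytes(payload, target_dir):
    """Extract a zip archive held in memory, failing clearly if it is not one."""
    buffer = io.BytesIO(payload)
    if not zipfile.is_zipfile(buffer):
        raise ValueError(f"Not a zip archive; payload starts with {payload[:15]!r}")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(target_dir)
        return archive.namelist()

# Build a tiny archive in memory to stand in for the downloaded file
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("processed_data/readme.txt", "placeholder")

with tempfile.TemporaryDirectory() as tmp:
    names = extract_zip_bytes(demo.getvalue(), tmp)
print(names)  # ['processed_data/readme.txt']

# An HTML payload is rejected with a readable message
try:
    extract_zip_bytes(b"<!DOCTYPE html><html><body>Not a zip</body></html>", ".")
except ValueError as err:
    print("rejected:", err)
```

In the notebook, `r.content` would be passed as `payload`.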

## 3) Reshaping the data
We have a total of 5041 image pairs and 5041 labels. We split them 4500/541 for training and validation respectively.
    * We flatten the last three dimensions of each image array, so the shape changes from (5041, 32, 32, 3) to (5041, 3072).
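
As a small illustration of this flattening, the sketch below uses two dummy images (an assumption for brevity) and shows that reshaping back to (N, 32, 32, 3) recovers the original array, which is what the model's first layer relies on later.

```python
import numpy as np

# Two dummy "images" filled with consecutive numbers
images = np.arange(2 * 32 * 32 * 3, dtype=np.float32).reshape(2, 32, 32, 3)

# Flatten everything except the first (sample) axis
flat = images.reshape(images.shape[0], -1)
print(flat.shape)  # (2, 3072)

# Reshaping back restores the original layout exactly
restored = flat.reshape(-1, 32, 32, 3)
print(np.array_equal(images, restored))  # True
```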

In [9]:
DIMEN = 32
# Assumes the archive from step 2 was extracted; otherwise np.load raises
# FileNotFoundError. NumPy arrays are stored with the .npy extension.
X1 = np.load('processed_data/x1.npy')
X2 = np.load('processed_data/x2.npy')
Y = np.load('processed_data/y.npy')

# Flatten each image: (N, 32, 32, 3) -> (N, 3072)
X1 = X1.reshape((X1.shape[0], DIMEN**2 * 3)).astype(np.float32)
X2 = X2.reshape((X2.shape[0], DIMEN**2 * 3)).astype(np.float32)

# First 4500 pairs for training
train_X1 = X1[:4500]
train_X2 = X2[:4500]
train_Y = Y[:4500]

# Remaining 541 pairs for validation
test_X1 = X1[4500:]
test_X2 = X2[4500:]
test_Y = Y[4500:]

print(train_X1.shape)
print(train_X2.shape)
print(train_Y.shape)

print(test_X1.shape)
print(test_X2.shape)
print(test_Y.shape)

## 4) Defining the Model.
We use Keras to build our Siamese Convolutional Neural Network. The layers perform their operations in this order:
    1. Reshape the input (from (None, 3072) to (None, 32, 32, 3)).
    2. Extract features using "Conv2D" and "MaxPooling2D" layers.
    3. A "Dense" layer with "sigmoid" activation turns the features of each image into an embedding vector.
    4. Take the element-wise absolute difference between the two embeddings.
    5. Produce a similarity score using a "Dense" layer with "sigmoid" activation.
Since our problem is binary in nature, we use the "tf.keras.losses.binary_crossentropy" loss function with the "tf.keras.optimizers.Adam" optimizer and a learning rate of 0.0001.
You can read more about these networks [here](https://medium.com/predict/face-recognition-from-scratch-using-siamese-networks-and-tensorflow-df03e32f8cd0).
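
To make the last stage of the network and the loss concrete, here is a minimal NumPy sketch of how a similarity score and the binary cross-entropy could be computed from two embedding vectors. The embeddings, weights, and bias are made-up values chosen for illustration, not parameters of the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def similarity_score(emb1, emb2, weights, bias):
    """Absolute difference of two embeddings, then a one-unit dense layer with sigmoid."""
    l1 = np.abs(emb1 - emb2)
    return sigmoid(l1 @ weights + bias)

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy, with predictions clipped away from 0 and 1."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(losses))

# Hypothetical 4-dimensional embeddings and dense-layer parameters
emb_a = np.array([[0.9, 0.1, 0.8, 0.2]])
emb_b = np.array([[0.1, 0.9, 0.2, 0.8]])
weights = -np.ones((4, 1))  # negative weights: larger distance, lower score
bias = 2.0

same_score = similarity_score(emb_a, emb_a, weights, bias)
diff_score = similarity_score(emb_a, emb_b, weights, bias)
print(same_score > diff_score)  # [[ True]]

# Loss for a "same class" label (1) given each prediction
print(binary_crossentropy(np.array([[1.0]]), same_score))
print(binary_crossentropy(np.array([[1.0]]), diff_score))
```

With a label of 1, the identical pair yields the smaller loss because its score is closer to 1.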

In [10]:
# DIMEN (= 32) is defined in the previous cell.
input_shape = ((DIMEN**2) * 3,)
convolution_shape = (DIMEN, DIMEN, 3)

kernel_size_1 = (8, 8)
kernel_size_2 = (6, 6)
kernel_size_3 = (4, 4)

pool_size_1 = (6, 6)
pool_size_2 = (4, 4)

strides = 1

# Shared feature extractor, applied to both images of a pair
seq_conv_model = [
    tf.keras.layers.Reshape(input_shape=input_shape, target_shape=convolution_shape),
    tf.keras.layers.Conv2D(32, kernel_size=kernel_size_1, strides=strides, activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=pool_size_1, strides=strides),

    tf.keras.layers.Conv2D(64, kernel_size=kernel_size_2, strides=strides, activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=pool_size_2, strides=strides),

    tf.keras.layers.Conv2D(128, kernel_size=kernel_size_3, strides=strides, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3076, activation=tf.keras.activations.sigmoid),
]

seq_model = tf.keras.Sequential(seq_conv_model)

input_x1 = tf.keras.layers.Input(shape=input_shape)
input_x2 = tf.keras.layers.Input(shape=input_shape)

# The same weights process both inputs
output_x1 = seq_model(input_x1)
output_x2 = seq_model(input_x2)

# Element-wise absolute difference between the two embeddings
distance_euclid = tf.keras.layers.Lambda(lambda tensors: tf.abs(tensors[0] - tensors[1]))([output_x1, output_x2])
outputs = tf.keras.layers.Dense(1, activation=tf.keras.activations.sigmoid)(distance_euclid)
model = tf.keras.models.Model([input_x1, input_x2], outputs)

model.compile(loss=tf.keras.losses.binary_crossentropy, optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), metrics=['accuracy'])
model.summary()
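
To see how the feature map shrinks through the layers of this model, the sketch below applies the output-size rule for "valid" padding (the Keras default for "Conv2D" and "MaxPooling2D") to the kernel and pool sizes used in the cell. The helper name `valid_output_size` is hypothetical; the numbers follow from the layer settings alone, not from a model run.

```python
def valid_output_size(size, window, stride=1):
    """Spatial output size of a convolution or pooling layer with 'valid' padding."""
    return (size - window) // stride + 1

# (layer, window size) in the order used by the model; the stride is 1 throughout
layers = [
    ("Conv2D 8x8", 8),
    ("MaxPooling2D 6x6", 6),
    ("Conv2D 6x6", 6),
    ("MaxPooling2D 4x4", 4),
    ("Conv2D 4x4", 4),
]

size = 32  # input images are 32x32
sizes = [size]
for name, window in layers:
    size = valid_output_size(size, window)
    sizes.append(size)
    print(f"{name}: {size}x{size}")

# The last Conv2D has 128 filters, so Flatten yields size * size * 128 values
flattened = size * size * 128
print(flattened)  # 10368
```

The spatial size goes 32, 25, 20, 15, 12, 9, so the "Dense" layer after "Flatten" receives 10368 inputs per image.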