# Deep Learning Example: Cancer

Example of a neural network used to classify tumors as malignant or benign.

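## How the Dataset Works

According to the UCI Machine Learning Repository, the input vector is built from ten real-valued measurements computed for each cell nucleus. The scikit-learn copy of the dataset stores the mean, standard error, and worst value of each measurement, which gives 30 features per sample. The ten measurements are: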

    a) radius (mean of distances from center to points on the perimeter)
    b) texture (standard deviation of gray-scale values)
    c) perimeter
    d) area
    e) smoothness (local variation in radius lengths)
    f) compactness (perimeter^2 / area - 1.0)
    g) concavity (severity of concave portions of the contour)
    h) concave points (number of concave portions of the contour)
    i) symmetry
    j) fractal dimension ("coastline approximation" - 1)

The output, as encoded by scikit-learn (`target_names` is `['malignant', 'benign']`), is:

0 = malignant tumor
1 = benign tumor
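
The feature layout and the label encoding can be checked directly on the loaded object. The following is a minimal sketch, assuming scikit-learn's bundled copy of the dataset; the variable names `class_names` and `class_counts` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer_dataset = load_breast_cancer()

# (number of samples, number of features)
print(cancer_dataset.data.shape)  # (569, 30)

# Class names in label order: class_names[0] is the name for label 0
class_names = [str(name) for name in cancer_dataset.target_names]
print(class_names)  # ['malignant', 'benign']

# Number of samples per class, indexed by label
class_counts = np.bincount(cancer_dataset.target)
print(class_counts.tolist())  # [212, 357]
```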


In [12]:
# Dataset to use
from sklearn.datasets import load_breast_cancer
# Multi-layer Perceptron classifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
# Preprocessing (feature scaling)
from sklearn.preprocessing import StandardScaler
# Evaluation metrics
from sklearn.metrics import confusion_matrix, classification_report

In [6]:
# Load the dataset
cancer_dataset = load_breast_cancer()

# Print the features and the targets
print("\ndata\n")
print(cancer_dataset.data)
print("\ntargets\n")
print(cancer_dataset.target)


data

[[1.799e+01 1.038e+01 1.228e+02 ... 2.654e-01 4.601e-01 1.189e-01]
 [2.057e+01 1.777e+01 1.329e+02 ... 1.860e-01 2.750e-01 8.902e-02]
 [1.969e+01 2.125e+01 1.300e+02 ... 2.430e-01 3.613e-01 8.758e-02]
 ...
 [1.660e+01 2.808e+01 1.083e+02 ... 1.418e-01 2.218e-01 7.820e-02]
 [2.060e+01 2.933e+01 1.401e+02 ... 2.650e-01 4.087e-01 1.240e-01]
 [7.760e+00 2.454e+01 4.792e+01 ... 0.000e+00 2.871e-01 7.039e-02]]

targets

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1 1 0 1 0 0 1 1 1 1 0 1 0 0
 1 0 1 0 0 1 1 1 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 1
 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 0 1 1 1 1 0 1
 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 1 0 1 1 0 0 0 1 0
 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 1 1 0 0 1 1
 1 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 1 0 1 1 1 1 1 0 1 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 0 0

In [8]:
# Split into training and test sets (an integer test_size holds out that many samples)

X_train, X_test, y_train, y_test = train_test_split(cancer_dataset.data, cancer_dataset.target, test_size=50, random_state=0)
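
Passing an integer as `test_size` holds out that many samples rather than a fraction of the data. The sketch below repeats the same split in a self-contained form and checks the resulting shapes.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

cancer_dataset = load_breast_cancer()

# test_size=50 (an int) keeps exactly 50 samples for testing;
# a float such as 0.1 would be read as a fraction of the dataset instead
X_train, X_test, y_train, y_test = train_test_split(
    cancer_dataset.data, cancer_dataset.target, test_size=50, random_state=0
)

print(X_train.shape, X_test.shape)  # (519, 30) (50, 30)
```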

In [9]:
# Print X_train and y_train

print("\nTraining data\n")
print(X_train)
print("\nTraining targets\n")
print(y_train)


Training data

[[1.917e+01 2.480e+01 1.324e+02 ... 1.767e-01 3.176e-01 1.023e-01]
 [1.486e+01 2.321e+01 1.004e+02 ... 1.727e-01 3.000e-01 8.701e-02]
 [1.845e+01 2.191e+01 1.202e+02 ... 1.379e-01 3.109e-01 7.610e-02]
 ...
 [9.436e+00 1.832e+01 5.982e+01 ... 5.052e-02 2.454e-01 8.136e-02]
 [9.720e+00 1.822e+01 6.073e+01 ... 0.000e+00 1.909e-01 6.559e-02]
 [1.151e+01 2.393e+01 7.452e+01 ... 9.653e-02 2.112e-01 8.732e-02]]

Training targets

[0 0 0 1 1 1 1 1 1 0 0 0 1 1 0 1 0 0 0 1 1 0 1 0 0 1 1 1 1 1 0 0 0 1 0 1 1
 1 0 0 1 0 1 0 1 1 0 1 1 1 1 1 1 1 0 1 0 1 0 0 1 0 0 1 1 1 1 1 1 1 1 1 0 1
 0 1 1 1 1 1 0 1 1 1 1 1 1 0 0 1 1 1 0 1 1 0 1 0 1 1 1 1 1 1 1 0 1 0 1 0 0
 1 1 0 1 0 0 0 1 1 1 1 1 1 0 1 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 0 0 1 1 0 0 1
 1 0 1 1 0 0 0 1 1 1 0 1 1 1 1 1 0 1 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1 0 1
 1 1 0 1 1 0 0 1 0 1 0 1 0 0 0 0 1 0 1 0 1 0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1
 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0 1 0 1 1 0 1 0 0 1 0 0 1 1 0 1 1 1 0 1 1 1 1
 1 0 0 1 1 1 1 0 

In [14]:
# Create the scaler and fit it on the training data only
scaler = StandardScaler()
scaler.fit(X_train)

X_train_scale = scaler.transform(X_train)
X_test_scale = scaler.transform(X_test)
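
What the scaler does to a single feature column can be sketched with the standard library alone: it subtracts the training mean and divides by the training (population) standard deviation, and it reuses those same statistics for the test set. The helper names `fit_standardizer` and `standardize` are hypothetical, and the test values are made up for illustration.

```python
from statistics import mean, pstdev

def fit_standardizer(column):
    """Return the mean and population standard deviation of a training column."""
    return mean(column), pstdev(column)

def standardize(column, mu, sigma):
    """Apply z = (x - mu) / sigma to every value in the column."""
    return [(x - mu) / sigma for x in column]

# First-column values visible in the X_train printout above
train_radius = [19.17, 14.86, 18.45, 9.436, 9.72, 11.51]
# Hypothetical test values, for illustration only
test_radius = [17.0, 10.0]

mu, sigma = fit_standardizer(train_radius)         # statistics come from training data only
train_scaled = standardize(train_radius, mu, sigma)
test_scaled = standardize(test_radius, mu, sigma)  # reuse the training statistics

# After scaling, the training column has mean ~0 and standard deviation ~1
print(round(mean(train_scaled), 6), round(pstdev(train_scaled), 6))
```

Fitting on the training set only keeps information about the test samples out of the preprocessing step.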

In [17]:
# Maximum number of training iterations (epochs)
epochs = 4000
# Hidden layer sizes: three hidden layers with 5 neurons each
hl = [5,5,5]

# Create the Multi-layer Perceptron classifier
mlp = MLPClassifier(hidden_layer_sizes=hl,max_iter=epochs)
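
To get a feel for how small this network is, the number of trainable parameters can be counted by hand. This is a rough sketch: it assumes the 30 feature columns of scikit-learn's copy of the dataset as inputs and a single output unit, as is usual for a binary classifier. `count_parameters` is a hypothetical helper, not part of scikit-learn.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs):
    """Count the weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    # One weight per connection between consecutive layers
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    # One bias per neuron in every layer except the input layer
    biases = sum(sizes[1:])
    return weights, biases

# 30 inputs, hidden layers [5, 5, 5], one output unit (assumed)
weights, biases = count_parameters(30, [5, 5, 5], 1)
print(weights, biases, weights + biases)  # 205 16 221
```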

In [18]:
# Train the network
mlp.fit(X_train_scale,y_train)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=[5, 5, 5], learning_rate='constant',
       learning_rate_init=0.001, max_iter=4000, momentum=0.9,
       n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
       random_state=None, shuffle=True, solver='adam', tol=0.0001,
       validation_fraction=0.1, verbose=False, warm_start=False)
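
The printed parameters show `random_state=None`, so the weight initialization differs between runs and the results may vary slightly. The sketch below shows one way to make a run repeatable and to inspect the fitted network. Fixing `random_state=0` is an assumption added here and is not part of the original notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=50, random_state=0
)
scaler = StandardScaler().fit(X_train)

# random_state=0 is an assumption for repeatability; the original leaves it as None
mlp = MLPClassifier(hidden_layer_sizes=[5, 5, 5], max_iter=4000, random_state=0)
mlp.fit(scaler.transform(X_train), y_train)

# One weight matrix per pair of consecutive layers
weight_shapes = [w.shape for w in mlp.coefs_]
print(weight_shapes)  # [(30, 5), (5, 5), (5, 5), (5, 1)]

# Number of iterations actually run before the solver stopped
print(mlp.n_iter_)
```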

In [19]:
# Predict on the test set and compute the accuracy
predict = mlp.predict(X_test_scale)
mlp.score(X_test_scale,y_test)

1.0

In [20]:
# Show the confusion matrix (rows: true classes, columns: predicted classes)
print(confusion_matrix(y_test,predict))

[[19  0]
 [ 0 31]]
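
In scikit-learn's convention, row `i` and column `j` of the confusion matrix count the samples whose true class is `i` and whose predicted class is `j`. The sketch below rebuilds such a matrix by hand for a two-class problem. The function name and the label lists are hypothetical and serve only as illustration.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Build a 2x2 matrix where matrix[t][p] counts true label t predicted as p."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

# Hypothetical labels, for illustration only
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

matrix = confusion_matrix_2x2(y_true, y_pred)
print(matrix)  # [[2, 1], [1, 2]]

# Accuracy is the diagonal (correct predictions) divided by the total
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)
print(round(accuracy, 3))  # 0.667
```

In the matrix printed above, both off-diagonal entries are zero, so all 50 test samples were classified correctly.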


In [21]:
# Finally, show precision, recall, and F1-score for each class
print(classification_report(y_test,predict))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00        31

   micro avg       1.00      1.00      1.00        50
   macro avg       1.00      1.00      1.00        50
weighted avg       1.00      1.00      1.00        50
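
The per-class figures in the report can be derived from the confusion matrix. The following is a minimal sketch that uses the matrix printed earlier; `per_class_metrics` is a hypothetical helper, not a scikit-learn function.

```python
def per_class_metrics(matrix, label):
    """Compute precision, recall, F1, and support for one class from a confusion matrix."""
    true_positives = matrix[label][label]
    predicted = sum(row[label] for row in matrix)  # column sum: samples predicted as label
    support = sum(matrix[label])                   # row sum: samples whose true class is label
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / support if support else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1, support

# Confusion matrix from the test set (rows: true class, columns: predicted class)
matrix = [[19, 0], [0, 31]]
for label in (0, 1):
    print(label, per_class_metrics(matrix, label))
# 0 (1.0, 1.0, 1.0, 19)
# 1 (1.0, 1.0, 1.0, 31)
```

With only 50 test samples, a perfect score is plausible for a single split, but it is a small sample on which to judge the model.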

