## Hyperparameter optimization

The idea is for the first function to draw a number X of different network configurations.

A second function then trains each of those networks and scores it on the validation data. The results are afterwards sorted by AUC, best first.

In [1]:
from sklearn.model_selection import ParameterSampler
import random as r
def selecaoHyperParametros(d,neuronios,reg):
    # Draw 10 random combinations from the grid d (reproducible via random_state).
    lista_parametros = list(ParameterSampler(d, n_iter=10, random_state=10))
    r.seed(10)
    for var in lista_parametros:
        # One width per hidden layer, drawn with replacement.
        var['topologia'] = r.choices(neuronios,k=var['nrCamadas'])
        # regularizer: 0 = none, 1 = L1, 2 = L2, 3 = L1 and L2.
        aux = var['regularizer']
        if aux == 1:
            var['l1'] = r.choice(reg)
        elif aux == 2:
            var['l2'] = r.choice(reg)
        elif aux == 3:
            var['l1'] = r.choice(reg)
            var['l2'] = r.choice(reg)
    return lista_parametros
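
As a quick illustration of the topology step: `random.choices` draws with replacement, so the same layer width can appear more than once in a network, and seeding the generator makes the draw repeatable. The sketch below uses a local `random.Random` instance instead of the module-level seed; `draw_topology` is a hypothetical helper, not part of the notebook.

```python
import random

neuronios = [2, 3, 4, 5, 8, 10, 16, 32, 64]

def draw_topology(rng, widths, n_layers):
    """Pick one width per hidden layer, with replacement."""
    return rng.choices(widths, k=n_layers)

# Two generators with the same seed produce the same topology.
first = draw_topology(random.Random(10), neuronios, 4)
second = draw_topology(random.Random(10), neuronios, 4)

print(first)
print(first == second)
```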

In [2]:
from keras import models,layers,regularizers
def criaRede(param,inputSize):
    model=models.Sequential()
    # Regularization is disabled for now: kernel_reg stays None and the
    # selection below is commented out (the grid only uses regularizer = 0).
    aux = param['regularizer']
    kernel_reg = None
    # if aux == 1:
    #     kernel_reg = regularizers.l1(param['l1'])
    # elif aux == 2:
    #     kernel_reg = regularizers.l2(param['l2'])
    # elif aux == 3:
    #     kernel_reg = regularizers.l1_l2(l1=param['l1'],l2=param['l2'])

    # First hidden layer, which also fixes the input size.
    model.add(layers.Dense(param['topologia'][0],activation=param['ativacao'],
                           kernel_regularizer=kernel_reg,input_shape=(inputSize,)))
    if param['dropout'] > 0:
        model.add(layers.Dropout(param['dropout']))
    # Remaining hidden layers, each optionally followed by dropout.
    for var in param['topologia'][1:]:
        model.add(layers.Dense(var,activation=param['ativacao'],kernel_regularizer=kernel_reg))
        if param['dropout'] > 0:
            model.add(layers.Dropout(param['dropout']))
    # Single sigmoid output for binary classification.
    model.add(layers.Dense(1,activation='sigmoid'))
    model.compile(optimizer=param['optimizer'],
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

Using TensorFlow backend.


In [3]:
from sklearn.metrics import roc_curve, auc
from keras.callbacks import EarlyStopping
def optimizacaoHyperParametros(d,neuronios,reg,trainX,trainY,valX,valY):
    params = selecaoHyperParametros(d,neuronios,reg)
    for param in params:
        print(param)
        rede = criaRede(param,trainX.shape[1])
        # early_stopping holds the patience; 0 disables early stopping.
        callb = []
        if param['early_stopping'] > 0:
            callb.append(EarlyStopping(monitor='val_loss', patience=param['early_stopping'],
                                       min_delta=0, verbose=True, mode='auto'))
        history = rede.fit(trainX,
                           trainY,
                           epochs=param['epochs'],
                           batch_size=param['batch_size'],
                           validation_data=(valX,valY),
                           callbacks=callb)
        # Score the network by the area under the ROC curve on the validation set.
        pred = rede.predict(valX).ravel()
        false_positive_rate, true_positive_rate, thresholds = roc_curve(valY, pred)
        score = auc(false_positive_rate, true_positive_rate)
        param['score'] = score
    return params
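
For intuition about the value stored in `param['score']`: the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher prediction than a randomly chosen negative one, with ties counted as one half. The pure-Python sketch below computes it that way. `pairwise_auc` is a hypothetical helper and the labels and scores are made up; it is quadratic in the number of samples, so it is only meant for small illustrations, not as a replacement for `roc_curve` and `auc`.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

labels = [0, 0, 1, 1]
print(pairwise_auc(labels, [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(pairwise_auc(labels, [0.5, 0.5, 0.5, 0.5]))   # 0.5
```

A model that outputs the same value for every row scores exactly 0.5 under this definition, which is worth keeping in mind when a configuration reports a score of exactly 0.5.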

## Parameters to optimize

### Dropout, early stopping and regularization still need to be looked at
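
On the early-stopping side, the `patience` value passed to `EarlyStopping` is the number of consecutive epochs without an improvement in `val_loss` that are tolerated before training halts. A simplified sketch of that rule follows, with a hypothetical `stopping_epoch` helper and made-up loss values:

```python
def stopping_epoch(val_losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

made_up_losses = [0.70, 0.71, 0.72, 0.73, 0.74, 0.75]
print(stopping_epoch(made_up_losses, patience=4))        # 5
print(stopping_epoch(made_up_losses, patience=2))        # 3
print(stopping_epoch([0.9, 0.8, 0.7, 0.6], patience=2))  # None
```

Under this rule, a run that stops at epoch `patience + 1` never improved on its first epoch.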

### For the optimizer, we also need to look at the available parameters:
- rmsprop
    - rho: float >= 0.
    - epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
    - decay: float >= 0. Learning rate decay over each update.
- SGD
    - momentum: float >= 0. Parameter that accelerates SGD in the relevant direction and dampens oscillations.
    - decay: float >= 0. Learning rate decay over each update.
    - nesterov: boolean. Whether to apply Nesterov momentum.
- Adam
    - beta_1: float, 0 < beta < 1. Generally close to 1.
    - beta_2: float, 0 < beta < 1. Generally close to 1.
    - epsilon: float >= 0. Fuzz factor. If None, defaults to K.epsilon().
    - decay: float >= 0. Learning rate decay over each update.
    - amsgrad: boolean. Whether to apply the AMSGrad variant of this algorithm from the paper "On the Convergence of Adam and Beyond"
    
##### All of them have a learning rate!
##### There are more optimizers, but I am not sure it is worth looking at all of them!
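
One possible way to bring these parameters into the search is to give each optimizer its own small grid and sample from it only after the optimizer has been chosen, in the same spirit as the conditional `l1`/`l2` draw in `selecaoHyperParametros`. The grids, the `optimizer_space` name and the `sample_optimizer` helper below are illustrative assumptions, not values tested in this notebook.

```python
import random

# Hypothetical per-optimizer grids; every optimizer also gets a learning rate.
optimizer_space = {
    'rmsprop': {'rho': [0.8, 0.9, 0.95]},
    'sgd': {'momentum': [0.0, 0.5, 0.9], 'nesterov': [False, True]},
    'adam': {'beta_1': [0.9, 0.95], 'beta_2': [0.99, 0.999], 'amsgrad': [False, True]},
}
learning_rates = [0.1, 0.01, 0.001]

def sample_optimizer(rng):
    """Draw an optimizer name, a learning rate and that optimizer's own parameters."""
    name = rng.choice(sorted(optimizer_space))
    config = {'optimizer': name, 'lr': rng.choice(learning_rates)}
    for key, values in optimizer_space[name].items():
        config[key] = rng.choice(values)
    return config

rng = random.Random(10)
for _ in range(3):
    print(sample_optimizer(rng))
```

The resulting dictionary would still have to be turned into an optimizer object before `model.compile`, which is not shown here.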

In [4]:
dicionario = {
    'nrCamadas':[1,2,3,4,5,6,7,8],
    'ativacao':['relu',#'exponential',
                'tanh','sigmoid','linear'],
    'epochs':[10,20],
    'batch_size':[64,128,256,512],
    'optimizer':['rmsprop','adam','sgd'],
    'dropout':[0.0,0.1,0.2,0.3,0.4],
    'regularizer':[0],#1,2,3],
    'early_stopping':[0,2,3,4,5]
}
neuronios = [2,3,4,5,8,10,16,32,64]
valores_l1 = [0.1,0.01,0.001] 
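
To put `n_iter=10` in perspective, the number of distinct combinations in `dicionario` can be counted directly. The sketch below repeats the dictionary so that it runs on its own, and it ignores the layer widths drawn from `neuronios`, which multiply the space further.

```python
import math

dicionario = {
    'nrCamadas': [1, 2, 3, 4, 5, 6, 7, 8],
    'ativacao': ['relu', 'tanh', 'sigmoid', 'linear'],
    'epochs': [10, 20],
    'batch_size': [64, 128, 256, 512],
    'optimizer': ['rmsprop', 'adam', 'sgd'],
    'dropout': [0.0, 0.1, 0.2, 0.3, 0.4],
    'regularizer': [0],
    'early_stopping': [0, 2, 3, 4, 5],
}

# Product of the number of options per key.
grid_size = math.prod(len(values) for values in dicionario.values())
print(grid_size)       # 19200
print(10 / grid_size)  # fraction of the grid covered by 10 samples
```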

In [5]:
import pandas as pd
numericos = ['AVProductsInstalled',
'AVProductsEnabled',
'Census_ProcessorCoreCount',
'Census_PrimaryDiskTotalCapacity',
'Census_SystemVolumeTotalCapacity',
'Census_TotalPhysicalRAM',
'Census_InternalPrimaryDiagonalDisplaySizeInInches',
'Census_InternalPrimaryDisplayResolutionHorizontal',
'Census_InternalPrimaryDisplayResolutionVertical',
'Census_InternalBatteryNumberOfCharges']
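# Read only the first few rows to get the column names, then store every
# column that is not in `numericos` as int8 to save memory.
dtype = {}
for df in pd.read_csv('final_sembat.csv',low_memory=False,chunksize=10):
    for var in df.columns:
        if var not in numericos:
            dtype[var] = 'int8'
    break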

In [6]:
import gc
del df
gc.collect()

0

In [7]:
import pandas as pd
# Read the file in chunks of 50,000 rows and concatenate them once at the end,
# which avoids copying the accumulated frame on every iteration.
chunks = []
i = 0
for tp in pd.read_csv('final_sembat.csv',low_memory=False,chunksize=50000,dtype=dtype):
    chunks.append(tp)
    i+=1
    print(i)
auxPred = pd.concat(chunks,ignore_index=True)
del chunks

1
2
3
4
5
6
7
8
9
10
11
12


In [8]:
trainX = auxPred.loc[:499999,auxPred.columns!='HasDetections']
valX = auxPred.loc[500000:549999,auxPred.columns!='HasDetections']
trainY = auxPred.loc[:499999,'HasDetections']
valY = auxPred.loc[500000:549999,'HasDetections']

In [9]:
del auxPred
gc.collect()

14

In [10]:
trainX.shape

(500000, 707)

In [11]:
valX.shape

(50000, 707)

In [12]:
trainY.shape

(500000,)

In [13]:
valY.shape

(50000,)

In [14]:
import gc
import math
from sklearn.feature_selection import VarianceThreshold
def realizaVarThreshold():
    # Fit VarianceThreshold on blocks of 10 columns to keep memory usage low,
    # and collect the positions of the columns whose variance exceeds 0.001.
    indices = []
    col = trainX.columns
    total = len(col)
    chunk = math.floor(total / 10)
    print(chunk)
    # The last block is shorter when total is not a multiple of 10.
    for inicio in range(0, total, 10):
        sel = VarianceThreshold(threshold=0.001)
        try:
            sel.fit(trainX[col[inicio:inicio+10]])
            # get_support returns positions within the block; shift them to global positions.
            indices.extend(inicio + j for j in sel.get_support(indices=True))
        except ValueError:
            # Raised when no column in the block passes the threshold.
            pass
        del sel
        gc.collect()
    return indices
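
For a 0/1 column (assuming that most of the `int8` columns here are binary indicators), the variance is p(1 - p), where p is the share of ones, so a threshold of 0.001 roughly drops columns in which fewer than about 0.1% of the rows differ from the rest. The pure-Python sketch below illustrates that cut-off on made-up columns; `high_variance_columns` is a hypothetical stand-in for the selection step, not the scikit-learn implementation.

```python
from statistics import pvariance

def high_variance_columns(columns, threshold=0.001):
    """Return the positions of the columns whose population variance exceeds threshold."""
    return [i for i, values in enumerate(columns) if pvariance(values) > threshold]

# Made-up binary columns with 10,000 rows each.
rows = 10_000
almost_constant = [1] * 5 + [0] * (rows - 5)      # p = 0.0005, variance about 0.0005
rare_but_kept = [1] * 20 + [0] * (rows - 20)      # p = 0.002, variance about 0.002
balanced = [1] * (rows // 2) + [0] * (rows // 2)  # p = 0.5, variance 0.25

print(high_variance_columns([almost_constant, rare_but_kept, balanced]))  # [1, 2]
```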

In [15]:
indices = realizaVarThreshold()
print(len(indices))
coln = trainX.columns
# Map the selected positions back to column names.
col = [coln[i] for i in indices]

70
300


In [16]:
res = optimizacaoHyperParametros(dicionario,neuronios,valores_l1,
                          trainX[col],trainY,
                          valX[col],valY)

{'regularizer': 0, 'optimizer': 'rmsprop', 'nrCamadas': 4, 'epochs': 10, 'early_stopping': 4, 'dropout': 0.3, 'batch_size': 256, 'ativacao': 'linear', 'topologia': [10, 5, 10, 3]}
Train on 500000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 00005: early stopping
{'regularizer': 0, 'optimizer': 'rmsprop', 'nrCamadas': 8, 'epochs': 20, 'early_stopping': 2, 'dropout': 0.0, 'batch_size': 256, 'ativacao': 'tanh', 'topologia': [32, 32, 10, 3, 8, 4, 4, 64]}
Train on 500000 samples, validate on 50000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 00004: early stopping
{'regularizer': 0, 'optimizer': 'adam', 'nrCamadas': 6, 'epochs': 10, 'early_stopping': 5, 'dropout': 0.3, 'batch_size': 256, 'ativacao'

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
{'regularizer': 0, 'optimizer': 'rmsprop', 'nrCamadas': 1, 'epochs': 10, 'early_stopping': 4, 'dropout': 0.4, 'batch_size': 256, 'ativacao': 'linear', 'topologia': [64]}
Train on 500000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 00006: early stopping
{'regularizer': 0, 'optimizer': 'adam', 'nrCamadas': 5, 'epochs': 10, 'early_stopping': 2, 'dropout': 0.2, 'batch_size': 64, 'ativacao': 'sigmoid', 'topologia': [4, 8, 5, 2, 10]}
Train on 500000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 00003: early stopping
{'regularizer': 0, 'optimizer': 'sgd', 'nrCamadas': 6, 'epochs': 10, 'early_stopping': 4, 'dropout': 0.2, 'batch_size': 128, 'ativacao': 'linear', 'topologia': [32, 3, 4, 5, 

In [17]:
resultado = pd.DataFrame(res)
resultado.sort_values(by=['score'],ascending=False)

| | ativacao | batch_size | dropout | early_stopping | epochs | nrCamadas | optimizer | regularizer | score | topologia |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | tanh | 256 | 0.0 | 2 | 20 | 8 | rmsprop | 0 | 0.529797 | [32, 32, 10, 3, 8, 4, 4, 64] |
| 4 | tanh | 512 | 0.4 | 0 | 10 | 5 | rmsprop | 0 | 0.528062 | [64, 10, 2, 2, 3] |
| 6 | tanh | 64 | 0.0 | 0 | 20 | 2 | sgd | 0 | 0.508834 | [4, 5] |
| 9 | linear | 128 | 0.2 | 4 | 10 | 6 | sgd | 0 | 0.508834 | [32, 3, 4, 5, 2, 8] |
| 3 | tanh | 256 | 0.0 | 2 | 20 | 7 | adam | 0 | 0.508826 | [16, 8, 16, 10, 3, 16, 64] |
| 5 | sigmoid | 64 | 0.2 | 3 | 20 | 1 | adam | 0 | 0.508737 | [64] |
| 0 | linear | 256 | 0.3 | 4 | 10 | 4 | rmsprop | 0 | 0.5 | [10, 5, 10, 3] |
| 7 | linear | 256 | 0.4 | 4 | 10 | 1 | rmsprop | 0 | 0.5 | [64] |
| 8 | sigmoid | 64 | 0.2 | 2 | 10 | 5 | adam | 0 | 0.5 | [4, 8, 5, 2, 10] |
| 2 | linear | 256 | 0.3 | 5 | 10 | 6 | adam | 0 | 0.491166 | [64, 2, 32, 10, 5, 4] |


In [18]:
(trainY.memory_usage() + valY.memory_usage() + 
 trainX.memory_usage().sum() + valX.memory_usage().sum()) / (1000*1000)

427.900328
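
The figure above, in megabytes, can be cross-checked with a back-of-the-envelope estimate. The sketch assumes that the ten columns in `numericos` take 8 bytes per value (the pandas default for 64-bit integers and floats) and that every other column, including `HasDetections`, takes 1 byte as `int8`.

```python
rows = 500_000 + 50_000      # training plus validation rows
wide_columns = 10            # the columns listed in `numericos`
narrow_columns = 707 - wide_columns

# Feature bytes: 1 byte per int8 value, 8 bytes per 64-bit value.
x_bytes = rows * (narrow_columns * 1 + wide_columns * 8)
y_bytes = rows * 1           # HasDetections stored as int8

print((x_bytes + y_bytes) / (1000 * 1000))  # 427.9
```

The estimate comes to 427.9 MB; the remaining 328 bytes in the reported value are presumably index overhead.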