Here is a simple example of how to explore different architectures. The data is a 23k sample from the DrivenData PredictingPoverty competition; the features have already been engineered from survey data, and the output is a true/false label for whether the respondent is poor.

The neural networks both converge fast to their best configuration, so the example isn't very interesting as an architectural challenge; it is meant to demonstrate how to use the gluon_search script to summarise training experiments for different feed-forward neural nets in an automated way.


In [2]:
%matplotlib inline

import os

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns
import io
import time

In [9]:
X = pd.read_csv('train_data.csv', index_col = 0)

y = pd.read_csv('train_output.csv', index_col = 0)

from sklearn.model_selection import train_test_split
X_tr, X_test, Y_tr, Y_test = train_test_split(X, y, test_size = 0.2, random_state = 5)

from gluon_search import *
import time
import numpy as np
import mxnet as mx
from mxnet import nd, autograd, gluon

# Wrap the train and test splits in Gluon DataLoaders (batches of 64, labels cast to int).
mx_X_tr = mx.gluon.data.DataLoader(
    gluon.data.ArrayDataset(np.array(X_tr), np.array(Y_tr).astype(int)),
    batch_size=64, shuffle=True)
mx_X_test = mx.gluon.data.DataLoader(
    gluon.data.ArrayDataset(np.array(X_test), np.array(Y_test).astype(int)),
    batch_size=64, shuffle=True)

In [27]:
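# The architectures have different depths, so the arrays are ragged and need dtype=object
# (recent NumPy versions raise an error for ragged input without it).
num_hidden_arr = np.array([[20, 20], [10, 10, 10]], dtype=object)
activations_arr = np.array([['relu', 'relu'], ['tanh', 'tanh', 'tanh']], dtype=object)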
loss_func = gluon.loss.SoftmaxCrossEntropyLoss()
epochs = 10
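
The two arrays above appear to be matched by position: the first list of hidden sizes goes with the first list of activations, and each hidden layer gets one activation. A minimal sketch of a sanity check under that assumption follows. The helper name `pair_architectures` is hypothetical and is not part of gluon_search.

```python
def pair_architectures(num_hidden_specs, activation_specs):
    """Pair each hidden-layer spec with its activation spec, checking lengths.

    Hypothetical helper: assumes architectures are matched by position and
    that every hidden layer has exactly one activation.
    """
    if len(num_hidden_specs) != len(activation_specs):
        raise ValueError("need one activations list per architecture")
    pairs = []
    for hidden, acts in zip(num_hidden_specs, activation_specs):
        if len(hidden) != len(acts):
            raise ValueError(
                f"{len(hidden)} hidden layers but {len(acts)} activations: "
                f"{hidden} vs {acts}"
            )
        # One (units, activation) tuple per layer.
        pairs.append(list(zip(hidden, acts)))
    return pairs


specs = pair_architectures(
    [[20, 20], [10, 10, 10]],
    [["relu", "relu"], ["tanh", "tanh", "tanh"]],
)
for layers in specs:
    print(layers)
```

Running a check like this before a long search can catch a mismatched specification early, rather than partway through training.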

In [32]:
X_tr.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
19384,0.071429,0.011111,0.0,0.777778,1.0,1.0,0.991304,0.869565,0.0,0.0,0.022857,0.333333,0.278301,7.9e-05,1.0,0.166667,0.857143,0.75,0.996296,0.003425
27117,0.071429,0.022222,0.166667,0.611111,1.0,0.8,0.986957,0.913043,0.166667,0.0,0.048571,0.333333,0.270533,0.0,1.0,0.166667,1.0,1.0,0.988889,0.004668
5249,0.0,0.788889,0.083333,0.833333,0.923913,1.0,0.382609,1.0,0.083333,0.0,0.405714,0.333333,0.775867,0.0003,1.0,0.333333,1.0,1.0,0.474074,0.018507
13229,0.214286,0.0,0.0,0.888889,0.875604,1.0,0.995652,0.826087,0.0,0.0,0.008571,0.0,0.365717,0.003412,1.0,0.333333,0.857143,1.0,1.0,0.013921
26729,0.071429,0.005556,0.166667,0.777778,0.870773,1.0,0.982609,0.73913,0.166667,0.0,0.242857,0.333333,0.640497,0.0,0.5,0.166667,0.714286,0.75,1.0,0.003472


In [33]:
Y_tr.head()

Unnamed: 0,poor
19384,False
27117,False
5249,False
13229,True
26729,False


In [29]:
?search_over_NNs

Signature: search_over_NNs(X_tr, X_test, Y_tr, Y_test, epochs, num_hidden_arr, activations_arr, loss_func, init_std=0.1, learning_rate=0.01, momentum=0.9)
Docstring:
Function for searching over Gluon NN architectures and hyperparameters. 
X_tr: original training data, numpy array
X_test: original test data, numpy array
Y_tr: numpy array of training labels
Y_test: numpy array of test labels
epochs: number of iterations over training data
num_hidden_arr: array of num_hidden lists, providing the different architectures
activations_arr: array of activations lists, giving the activation functions of the layers
loss_func: a loss function from Gluon loss API 
init_std
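
The docstring does not say how the results are collected. As a rough sketch of the bookkeeping such a search implies, and not the actual gluon_search implementation, the loop below calls a caller-supplied `run_epoch` function for every architecture and epoch and gathers one row per epoch. The names `search_sketch`, `run_epoch` and `dummy_epoch`, and the use of a running index as `model_id`, are assumptions made for illustration.

```python
import time


def search_sketch(architectures, epochs, run_epoch):
    """Collect one result row per (architecture, epoch).

    Hypothetical stand-in for a search loop: `run_epoch(arch, epoch)` is
    expected to train for one epoch and return a dict of metrics.
    """
    rows = []
    for model_id, arch in enumerate(architectures):
        for epoch in range(1, epochs + 1):
            start = time.perf_counter()
            metrics = run_epoch(arch, epoch)
            row = {
                "model_id": model_id,
                "epoch": epoch,
                "time": time.perf_counter() - start,
            }
            row.update(metrics)
            rows.append(row)
    return rows


def dummy_epoch(arch, epoch):
    # Placeholder only: no training happens and the metric is not a real result.
    return {"test_accuracy": None}


rows = search_sketch([[20, 20], [10, 10, 10]], epochs=2, run_epoch=dummy_epoch)
print(len(rows))
```

Passing a list of rows like this to `pd.DataFrame` gives a table with one line per epoch and one column per metric.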

In [31]:
search_over_NNs(X_tr, X_test, Y_tr, Y_test,
                epochs = epochs,
                num_hidden_arr = num_hidden_arr,
                activations_arr = activations_arr,
                loss_func = loss_func,
                learning_rate = 0.00001,
                momentum = 0.9)

-----------------------------------------------

Building model with hidden layers : [20, 20]
-----------------------------------------------

Building model with hidden layers : [10, 10, 10]


(   epoch  model_id  test_accuracy  test_losses      time  train_accuracy  \
 0      1  876532.0       0.410162     0.694036  1.050168        0.401630   
 1      2  876532.0       0.684272     0.693271  0.983589        0.518303   
 2      3  876532.0       0.743607     0.693169  1.006583        0.611701   
 3      4  876532.0       0.758482     0.693150  1.029741        0.651400   
 4      5  876532.0       0.761992     0.693148  0.999394        0.673214   
 0      1  655631.0       0.748955     0.692904  1.113874        0.632177   
 1      2  655631.0       0.763831     0.690865  1.135223        0.742165   
 2      3  655631.0       0.763831     0.687962  1.100153        0.764689   
 3      4  655631.0       0.763831     0.685134  1.082881        0.764647   
 4      5  655631.0       0.763831     0.682310  1.081841        0.764647   
 
    train_losses  
 0      0.695812  
 1      0.694529  
 2      0.693946  
 3      0.693713  
 4      0.693607  
 0      0.693189  
 1      0.692059  