# Repsly trial data

In [1]:
from repsly_data import RepslyData

repsly_data = RepslyData()
print('Reading data (this might take a minute or so)...', end='')
repsly_data.read_data('data/trial_users_analysis.csv', mode='FC')
print('done.')

Reading data (this might take a minute or so)...done.



Let's see what the data looks like:

In [2]:
read_batch = repsly_data.read_batch(batch_size=20)

X, y = next(read_batch)
print('X{}: {}'.format(list(X.shape), X))
print('y:', y)

X[20, 241]: [[ 153.    1.    1. ...,    0.    0.    0.]
 [ 224.    0.    0. ...,    0.    0.    0.]
 [  54.    0.    0. ...,    0.    0.    0.]
 ..., 
 [ 185.    0.    0. ...,    0.    0.    0.]
 [  55.    0.    0. ...,    0.    0.    0.]
 [ 131.    0.    0. ...,    1.    5.    0.]]
y: [0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0]


As the output above shows, each input vector in `X` has `1+15*16=241` values, most of which are zeros. The first value is the trial start date, expressed as an offset from `2016-01-01`; the remaining `240` values are `15` usage parameters recorded for each of the following `16` days. Batches returned by `read_batch` are randomly shuffled. The labels are stored in `y` and indicate whether or not the user purchased the Repsly service after the trial.
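To make the `1+15*16` layout concrete, here is a minimal sketch that splits a batch into the start-date column and a per-day usage array. It rests on two assumptions that the notebook does not confirm: that the offset is measured in days, and that the 240 usage values are stored day by day. The helpers `split_features` and `offset_to_date` are hypothetical and are not part of `repsly_data`.

```python
import datetime
import numpy as np

N_PARAMS, N_DAYS = 15, 16  # from the text: 1 + 15*16 = 241

def split_features(X):
    """Split a (batch, 241) array into start-date offsets and usage values."""
    assert X.shape[1] == 1 + N_PARAMS * N_DAYS
    offsets = X[:, 0].astype(int)
    # Assumption: values are ordered day by day. If they are ordered
    # parameter by parameter, reshape to (-1, N_PARAMS, N_DAYS) instead.
    usage = X[:, 1:].reshape(-1, N_DAYS, N_PARAMS)
    return offsets, usage

def offset_to_date(offset, origin=datetime.date(2016, 1, 1)):
    """Convert an offset to a calendar date (assumes the offset is in days)."""
    return origin + datetime.timedelta(days=int(offset))

# Stand-in batch with the same width as the real data.
X_demo = np.zeros((2, 241))
X_demo[0, 0] = 153
offsets, usage = split_features(X_demo)
print(offsets.shape, usage.shape)
print(offset_to_date(offsets[0]))
```

With the real data, `split_features(X)` could be applied directly to a batch returned by `read_batch`, provided the ordering assumption holds.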

# Training

First, we create a network:

In [3]:
from repsly_nn import RepslyFC

repsly_nn = RepslyFC()

arch = [100, 200]
arch_dict = {'keep_prob': 0.7}
repsly_nn.create_net(arch, arch_dict)
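
The internals of `RepslyFC` are not shown here, so the following is only a rough NumPy sketch of what a fully connected network with this configuration might compute. It assumes that `arch = [100, 200]` lists the hidden layer sizes, that `keep_prob` is the dropout keep probability applied after each hidden layer, and that the network ends in two output logits; the ReLU activation and the function `fc_forward` are likewise illustrative choices, not taken from `repsly_nn`.

```python
import numpy as np

def fc_forward(X, weights, biases, keep_prob=1.0, rng=None):
    """Forward pass: ReLU hidden layers with inverted dropout, linear output."""
    h = X
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)  # ReLU (assumed activation)
        if keep_prob < 1.0:
            # Inverted dropout: keep each unit with probability keep_prob
            # and rescale so the expected activation is unchanged.
            mask = rng.random(h.shape) < keep_prob
            h = h * mask / keep_prob
    return h @ weights[-1] + biases[-1]

rng = np.random.default_rng(0)
sizes = [241, 100, 200, 2]  # input, two hidden layers, assumed 2 logits
weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

logits = fc_forward(np.zeros((16, 241)), weights, biases, keep_prob=0.7, rng=rng)
print(logits.shape)
```

At evaluation time dropout would normally be switched off by passing `keep_prob=1.0`.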

Then we train it for one epoch:

In [4]:
batch_size = 16
epochs = 1

for i in range(epochs):
    print('Training for one epoch...')
    repsly_nn.train(data=repsly_data, batch_size=batch_size, epochs=1)

Training for one epoch...
Checkpoint directory is: /Users/davor/PycharmProjects/deep_learning/repsly_challenge/checkpoints/RepslyFC-[100,200]/keep_prob-0.7/lr-0.001/dr-0.999/ds-20
Creating tf.train.Saver()...done
self.checkpoint_path: checkpoints/RepslyFC-[100,200]/keep_prob-0.7/lr-0.001/dr-0.999/ds-20
ckpt: None
[00000/1.7 sec]   train/validation loss = 3.68377/4.30155
[00020/4.0 sec]   train/validation loss = 0.96413/5.53576
[00040/6.5 sec]   train/validation loss = 0.60906/0.01058
[00060/8.8 sec]   train/validation loss = 0.92490/0.70330
[00080/10.8 sec]   train/validation loss = 0.00492/5.42241
[00100/12.9 sec]   train/validation loss = 5.86104/1.03274
[00120/14.3 sec]   train/validation loss = 2.42943/1.71400
[00140/16.1 sec]   train/validation loss = 0.89884/1.02610
[00160/17.6 sec]   train/validation loss = 3.65262/0.27851
[00180/19.0 sec]   train/validation loss = 0.89011/0.42529
[00200/20.5 sec]   train/validation loss = 0.18845/1.49120
[00220/22.0 sec]   train/validation loss