Opening notes

In [1]:
import pandas as pd
import tensorflow as tf
import numpy as np

In [2]:
#The first row of each file holds the column headers, so pandas can read the csv data directly
train_csv = pd.read_csv("au_train.csv")
test_csv = pd.read_csv("au_test.csv")

Taking a look at the data, there are 14 feature columns, plus the "class" column, which is the target.
We want to convert the object (string) columns to discrete numerical codes. We could do this by hand, but an easier way is to use pandas' built-in Categorical functionality.

In [3]:
#Convert object columns to discrete numerical values
#Note: the codes are assigned from each file's own sorted set of values
for col in train_csv:
    if train_csv[col].dtype == np.dtype('object'):
        train_csv[col] = pd.Categorical(train_csv[col]).codes

#Do the same for the test data
for col in test_csv:
    if test_csv[col].dtype == np.dtype('object'):
        test_csv[col] = pd.Categorical(test_csv[col]).codes
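
One caveat with encoding each file on its own: pandas assigns codes from the sorted set of values it finds in that column, so if the test file is missing a value, or contains one the training file lacks, the same string can end up with different codes in the two files. Below is a minimal sketch of one way around this, reusing the training categories for both frames. The frames and the "workclass" column name here are hypothetical stand-ins, not the real data.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the train and test frames.
train_df = pd.DataFrame({"workclass": ["Private", "State-gov", "Private", "Self-emp"]})
test_df = pd.DataFrame({"workclass": ["Self-emp", "Private", "Never-worked"]})

# Learn the category list from the training data only.
categories = pd.Categorical(train_df["workclass"]).categories

# Apply the same category list to both frames so the codes line up.
train_codes = pd.Categorical(train_df["workclass"], categories=categories).codes
test_codes = pd.Categorical(test_df["workclass"], categories=categories).codes

# For comparison: encoding the test frame on its own gives different codes.
independent_codes = pd.Categorical(test_df["workclass"]).codes

print(list(categories))            # ['Private', 'Self-emp', 'State-gov']
print(train_codes.tolist())        # [0, 2, 0, 1]
print(test_codes.tolist())         # [1, 0, -1]
print(independent_codes.tolist())  # [2, 1, 0]
```

Values that never appear in the training data come out as -1 with this approach, so they would still need some handling before being fed to a model.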

Now we can load the data into TensorFlow datasets and start building the model.

In [4]:
#Pop the class columns off each dataset to save as targets for each
train_y = train_csv.pop('class')
test_y = test_csv.pop('class')

#Build a tf.data.Dataset from the training features and targets, then shuffle and batch
train_dataset = tf.data.Dataset.from_tensor_slices((train_csv.values, train_y.values))
train_dataset = train_dataset.shuffle(len(train_csv)).batch(50)

#Build the test dataset as a single batch holding every test row
test_dataset = tf.data.Dataset.from_tensor_slices((test_csv.values, test_y.values))
test_dataset = test_dataset.batch(len(test_csv))
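
The features go into the datasets unscaled, so columns with large raw values sit next to small category codes. Rescaling is a common extra step that may help training here. The following is a minimal sketch of standardizing with statistics taken from the training data only. The arrays are hypothetical examples, not values from the csv files.

```python
import numpy as np

# Hypothetical feature matrices; rows are examples, columns are features.
train_x = np.array([[25.0, 200000.0],
                    [40.0, 100000.0],
                    [55.0, 300000.0]])
test_x = np.array([[40.0, 200000.0]])

# Compute the statistics on the training data only.
mean = train_x.mean(axis=0)
std = train_x.std(axis=0)
std[std == 0] = 1.0  # avoid dividing by zero for constant columns

# Apply the same transformation to both sets.
train_scaled = (train_x - mean) / std
test_scaled = (test_x - mean) / std

print(train_scaled.mean(axis=0))  # close to 0 for every column
print(train_scaled.std(axis=0))   # close to 1 for every column
```

In the notebook, arrays scaled this way could be passed to from_tensor_slices in place of train_csv.values and test_csv.values.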

In [5]:
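def get_compiled_model():
    #Two hidden layers followed by a single output unit
    #The output layer has no activation, so the model returns a raw logit
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(500, activation='relu'),
        tf.keras.layers.Dense(250, activation='relu'),
        tf.keras.layers.Dense(1)
    ])

    #from_logits=True tells the loss to apply the sigmoid itself
    model.compile(optimizer='adam',
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=['accuracy'])
    return model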
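
Because the last Dense layer has no activation and the loss uses from_logits=True, the model's outputs are logits rather than probabilities. Here is a small sketch, using plain NumPy and made-up logit values, of how logits map to class predictions: a probability above 0.5 corresponds to a logit above 0, not above 0.5.

```python
import numpy as np

def sigmoid(z):
    """Map logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up logits standing in for model outputs.
logits = np.array([-2.0, -0.1, 0.0, 0.3, 4.0])
probs = sigmoid(logits)

# Thresholding probabilities at 0.5 is the same as thresholding logits at 0.
pred_from_probs = (probs > 0.5).astype(int)
pred_from_logits = (logits > 0.0).astype(int)

# Thresholding the raw logits at 0.5 is a stricter cut-off.
pred_logits_at_half = (logits > 0.5).astype(int)

print(pred_from_probs.tolist())      # [0, 0, 0, 1, 1]
print(pred_from_logits.tolist())     # [0, 0, 0, 1, 1]
print(pred_logits_at_half.tolist())  # [0, 0, 0, 0, 1]
```

It may be worth checking which cut-off the 'accuracy' metric ends up applying to these raw outputs. Passing tf.keras.metrics.BinaryAccuracy(threshold=0.0) in the metrics list is one way to make the threshold explicit.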

In [6]:
model = get_compiled_model()
model.fit(train_dataset, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x1d4b925cd60>

In [7]:
model.evaluate(test_dataset)



[47.40016555786133, 0.7939316034317017]
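
model.evaluate returns the loss followed by the compiled metrics, so the output above is a test loss of about 47.4 and a test accuracy of about 79.4%. An accuracy figure is easier to judge against a simple baseline, such as always predicting the most common class. The sketch below computes that baseline on a made-up label array. The real comparison would use the test_y values.

```python
import numpy as np

# Made-up binary labels standing in for the test targets.
labels = np.array([0, 0, 0, 1, 0, 1, 0, 0])

# Count how often each class occurs.
counts = np.bincount(labels, minlength=2)

# Accuracy of a model that always predicts the most common class.
baseline_accuracy = counts.max() / counts.sum()

print(counts.tolist())    # [6, 2]
print(baseline_accuracy)  # 0.75
```

If the model's accuracy is only slightly above this baseline, the network has learned little beyond the class balance.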