# A Binary Classifier for Tabular Data

In this example, we train a simple neural network for binary classification on a well-behaved tabular dataset. We also learn how to connect `keras` and `scikit-learn` with an adapter.

In [None]:
import data_science_learning_paths
data_science_learning_paths.setup_plot_style()

In [None]:
import pandas
import seaborn
import tensorflow as tf
from tensorflow import keras

## The Dataset

We are going to use the well-known _iris_ dataset because it poses a simple classification problem: the data points are distributed such that the classes are clearly separable.

In [None]:
iris_data = data_science_learning_paths.datasets.read_iris()

In [None]:
# One-vs-rest relabelling: species 0 becomes class 1, all other species become class 0.
iris_data["species"] = iris_data["species"].apply(lambda label: 1 if label == 0 else 0)
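
The relabelling above turns the three-class problem into a one-vs-rest problem: the species encoded as `0` becomes the positive class `1`, and every other species becomes `0`. A minimal sketch of the same mapping on a hypothetical label array, assuming (as the comparison with `0` suggests) that `read_iris` encodes the species as integers:

```python
import numpy as np

# Hypothetical integer-encoded species labels (not the real dataset).
species = np.array([0, 0, 1, 2, 1, 0, 2])

# One-vs-rest: original class 0 becomes the positive class 1, all others 0.
binary_species = (species == 0).astype(int)

print(binary_species)  # [1 1 0 0 0 1 0]
```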

In [None]:
iris_data.head()

In [None]:
seaborn.pairplot(iris_data, vars=iris_data.columns.difference(["species"]), hue="species")

The dataset is now split into feature and target columns, as well as into a training and a test set:

In [None]:
X = iris_data[iris_data.columns.difference(["species"])].values
X.shape

In [None]:
y = iris_data["species"].values
y.shape

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    iris_data[iris_data.columns.difference(["species"])],
    iris_data["species"],
    train_size=0.5, 
    shuffle=True
)

In [None]:
X_train.shape

In [None]:
y_train.shape
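
Because no seed is fixed, the split above differs from run to run. The following is a minimal sketch of what a shuffled 50/50 split does, written with NumPy on row positions only; the seed value and the sample count of 150 are assumptions for illustration. In `train_test_split` itself, the `random_state` parameter makes the split reproducible, and `stratify` preserves the class proportions in both halves.

```python
import numpy as np

n_samples = 150  # assumed size of the iris dataset
rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for illustration

# Shuffle all row positions, then cut the permutation in half.
shuffled = rng.permutation(n_samples)
n_train = n_samples // 2
train_idx, test_idx = shuffled[:n_train], shuffled[n_train:]

# Every row lands in exactly one of the two sets.
print(len(train_idx), len(test_idx))  # 75 75
```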

## The Network

The network we use for this purpose is quite simple: two fully connected layers, the last of which is a single neuron.

Since we are going to use it through the `keras` wrapper for `sklearn`, we need to write a function that builds and compiles the network.

In [None]:
def build_binary_classifier(n_neurons=5, input_dim=4):
    net = keras.models.Sequential(
        [
            keras.layers.Dense(
                units=n_neurons, 
                input_dim=input_dim, 
                activation="relu", 
                kernel_initializer='random_uniform'
            ),
            keras.layers.Dense(
                units=1, 
                activation="sigmoid"
            ),
        ]
    )
    net.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["binary_accuracy"]
    )
    return net
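
The `binary_crossentropy` loss compares the sigmoid output `p` with the 0/1 label `y` via `-(y*log(p) + (1-y)*log(1-p))`, averaged over the batch. A small NumPy sketch of that formula with made-up probabilities follows; the clipping constant is an assumption to avoid `log(0)` and need not match what Keras uses internally.

```python
import numpy as np

def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    p = np.clip(p_pred, eps, 1.0 - eps)  # keep log() finite
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # mostly right and sure of it
unsure = np.array([0.5, 0.5, 0.5, 0.5])     # no information at all

loss_confident = binary_crossentropy(y_true, confident)
loss_unsure = binary_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

Predicting 0.5 everywhere yields a loss of `log(2)`, so a trained classifier should end up well below that value.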

In [None]:
build_binary_classifier(n_neurons=5).summary()
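
For the default arguments, the summary should list 31 trainable parameters: 4·5 weights plus 5 biases in the hidden layer, and 5 weights plus 1 bias in the output layer. The following NumPy sketch reproduces that count and the forward pass of this architecture with randomly drawn, untrained weights. It illustrates the computation only and is not how Keras stores or initializes its parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, n_neurons = 4, 5

# Hidden layer: one weight per (input, neuron) pair plus one bias per neuron.
W1, b1 = rng.uniform(-0.05, 0.05, (input_dim, n_neurons)), np.zeros(n_neurons)
# Output layer: a single neuron fed by all hidden neurons.
W2, b2 = rng.uniform(-0.05, 0.05, (n_neurons, 1)), np.zeros(1)

def forward(X):
    hidden = np.maximum(0.0, X @ W1 + b1)             # ReLU activation
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))  # sigmoid -> probability

n_params = W1.size + b1.size + W2.size + b2.size
X_demo = rng.normal(size=(3, input_dim))  # three made-up samples
probabilities = forward(X_demo)

print(n_params)             # 31
print(probabilities.shape)  # (3, 1)
```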

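This function is now passed to a wrapper class that implements the `sklearn` estimator interface. Note that later TensorFlow releases deprecated this wrapper in favor of the separate SciKeras package, so the import path used here may be unavailable in a recent installation.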

In [None]:
keras.wrappers.scikit_learn.KerasClassifier(
    build_fn=build_binary_classifier,
    epochs=100, 
    batch_size=5,
)
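
Conceptually, such a wrapper is an adapter: it stores the build function and the training settings, builds the network when `fit` is called, and exposes `predict` in the form `sklearn` tools expect. The sketch below illustrates the pattern only. `MinimalAdapter` and `ThresholdNet` are hypothetical names, the stand-in network is a trivial threshold rule rather than a Keras model, and the real `KerasClassifier` does considerably more.

```python
class ThresholdNet:
    """Hypothetical stand-in for a compiled network with fit/predict methods."""

    def fit(self, x, y, epochs=1, batch_size=1):
        # "Training": put the threshold midway between the two class means.
        pos = [xi for xi, yi in zip(x, y) if yi == 1]
        neg = [xi for xi, yi in zip(x, y) if yi == 0]
        pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
        self.threshold = (pos_mean + neg_mean) / 2
        self.positive_above = pos_mean > self.threshold

    def predict(self, x):
        # Returns nested scores: one single-element row per sample.
        return [[1.0 if (xi > self.threshold) == self.positive_above else 0.0] for xi in x]


class MinimalAdapter:
    """Hypothetical adapter exposing an sklearn-style fit/predict interface."""

    def __init__(self, build_fn, **fit_params):
        self.build_fn = build_fn
        self.fit_params = fit_params

    def fit(self, X, y):
        self.model_ = self.build_fn()  # build a fresh network for every fit
        self.model_.fit(X, y, **self.fit_params)
        return self                    # sklearn convention: fit returns self

    def predict(self, X):
        # Turn the nested scores into a flat list of 0/1 class labels.
        return [int(row[0] > 0.5) for row in self.model_.predict(X)]


adapter = MinimalAdapter(build_fn=ThresholdNet, epochs=20, batch_size=5)
predictions = adapter.fit([0.2, 0.3, 1.4, 1.5], [1, 1, 0, 0]).predict([0.25, 1.45])
print(predictions)  # [1, 0]
```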


We can now pass the data in the usual `sklearn` form to train the classifier...

In [None]:
model = keras.wrappers.scikit_learn.KerasClassifier(
    build_fn=build_binary_classifier,
    epochs=20, 
    batch_size=5,
)
model.fit(
    x=X_train,
    y=y_train
)

... and ask for class predictions via the `predict` method. The predictions come back as a nested array, as is usual for TensorFlow, so we flatten it to obtain a 1D vector.

In [None]:
y_pred = model.predict(X_test).flatten()

In [None]:
y_pred

In [None]:
y_pred.shape
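
The effect of `flatten` can be seen on a small made-up array: a column of shape `(n, 1)` becomes a vector of shape `(n,)`, which is the form `sklearn` metrics expect for labels.

```python
import numpy as np

nested = np.array([[1], [0], [0], [1]])  # made-up predictions, shape (4, 1)
flat = nested.flatten()                  # copy of the data with shape (4,)

print(nested.shape, flat.shape)  # (4, 1) (4,)
```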

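A quick evaluation on the test set shows that we have a perfect, or nearly perfect, classifier. The exact score can vary between runs because the split and the weight initialization are random.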

In [None]:
from sklearn.metrics import f1_score

In [None]:
f1_score(y_test, y_pred)
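
The F1 score is the harmonic mean of precision and recall for the positive class, so it equals 1 only if there are no false positives and no false negatives. A sketch of the computation on made-up labels follows; the helper name `f1_binary` is hypothetical, and the case without any true positives is simply mapped to 0 here.

```python
def f1_binary(y_true, y_pred):
    """F1 score for the positive class (label 1) of a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # simplification for the degenerate case
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

labels = [1, 1, 1, 0, 0, 0]
f1_perfect = f1_binary(labels, [1, 1, 1, 0, 0, 0])   # no errors
f1_one_miss = f1_binary(labels, [1, 1, 0, 0, 0, 0])  # one false negative

print(f1_perfect, f1_one_miss)
```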

In [None]:
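# Attach the predictions as a new column; assign returns a new DataFrame
# instead of writing into the slice produced by train_test_split.
X_test = X_test.assign(pred=y_pred)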

In [None]:
seaborn.pairplot(
    X_test,
    vars=X_test.columns.difference(["pred"]),
    hue="pred"
)

## References

- [Keras Tutorial: Deep Learning in Python](https://www.datacamp.com/community/tutorials/deep-learning-python)
- [Binary Classification with Keras](https://machinelearningmastery.com/binary-classification-tutorial-with-the-keras-deep-learning-library/)

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2018-2025 [Point 8 GmbH](https://point-8.de)_