# Multi-Label Classification with Deep Learning

https://machinelearningmastery.com/multi-label-classification-with-deep-learning/

- Machine Learning Mastery

- Jason Brownlee

Multi-label classification involves predicting zero or more class labels.

Unlike normal classification tasks where class labels are mutually exclusive, multi-label classification requires specialized machine learning algorithms that support predicting multiple mutually non-exclusive classes or “labels.”

In multi-label classification, zero or more labels are required as output for each input sample, and the outputs are required simultaneously.

In summary, a neural network model for multi-label classification is configured as follows:

* Number of nodes in the output layer matches the number of labels.

* Sigmoid activation for each node in the output layer.

* Binary cross-entropy loss function.
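
The three settings above can be illustrated with a small NumPy sketch. A sigmoid squashes each output node independently, so several labels can be close to 1 at once, and binary cross-entropy scores each label as its own yes/no decision. The helper names and the example logits are illustrative assumptions, not Keras internals.

```python
import numpy as np

def sigmoid(z):
    # squash each raw output (logit) into (0, 1) independently of the others
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    # clip to avoid log(0), then average the per-label losses
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return losses.mean()

# hypothetical raw outputs for one sample with three labels
logits = np.array([4.0, 3.0, -5.0])
probs = sigmoid(logits)
y_true = np.array([1.0, 1.0, 0.0])
loss = binary_cross_entropy(y_true, probs)
print(probs, loss)
```

Unlike a softmax, the probabilities here do not need to sum to 1, which is what allows zero, one, or several labels to be predicted for the same sample.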



In [1]:
from numpy import mean
from numpy import std
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import RepeatedKFold
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import accuracy_score

Using TensorFlow backend.


In [16]:
# get the dataset
# n_labels - average number of labels per instance
def get_dataset():
    X, y = make_multilabel_classification(n_samples=1000, n_features=10, n_classes=3, n_labels=2, random_state=1)
    return X, y

In [17]:
X, y = get_dataset()

In [None]:
X.shape

In [18]:
y.shape

(1000, 3)

In [None]:
X[:10]

In [19]:
y[:10]

array([[1, 1, 0],
       [0, 0, 0],
       [1, 1, 0],
       [1, 1, 1],
       [0, 1, 0],
       [0, 0, 0],
       [0, 1, 0],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

In [24]:
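def get_model(n_inputs, n_outputs):
    model = Sequential()
    # one hidden layer with 20 ReLU units
    model.add(Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
    # one sigmoid output node per label
    model.add(Dense(n_outputs, activation='sigmoid'))
    # binary cross-entropy treats each label as a separate yes/no decision
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model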


Create a function that evaluates the model with repeated k-fold cross-validation.

RepeatedKFold:

* n_splits - number of folds; with n_splits=10 the data is split into 10 folds and each fold serves once as the test set

* n_repeats - number of times the whole 10-fold procedure is repeated with a different random split; n_repeats=3 gives 30 evaluations in total
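
As a quick check of how these two parameters combine, the sketch below counts the splits produced for a dataset of 1000 rows. The array `X_demo` is placeholder data assumed for illustration; only its number of rows matters for splitting.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

# placeholder data: only the number of rows matters for splitting
X_demo = np.zeros((1000, 10))

cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
splits = list(cv.split(X_demo))

# total number of train/test evaluations
n_evaluations = len(splits)
# sizes of the first train/test partition
train_ix, test_ix = splits[0]
print(n_evaluations, len(train_ix), len(test_ix))
```

Each split holds out one tenth of the rows for testing and trains on the rest, so every model is scored on data it has not seen.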


In [28]:
# evaluate a model using repeated k-fold cross-validation
def evaluate_model(X, y):
    results = list()
    n_inputs, n_outputs = X.shape[1], y.shape[1]
    # define evaluation procedure: 10 folds, repeated 3 times (30 splits)
    cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
    # enumerate folds
    split_count = 0
    for train_ix, test_ix in cv.split(X):
        split_count += 1
        # prepare data
        X_train, X_test = X[train_ix], X[test_ix]
        y_train, y_test = y[train_ix], y[test_ix]
        # define model
        model = get_model(n_inputs, n_outputs)
        # fit model
        model.fit(X_train, y_train, verbose=0, epochs=100)
        # make a prediction on the test set
        yhat = model.predict(X_test)
        # round probabilities to class labels
        yhat = yhat.round()
        # calculate subset (exact-match) accuracy
        acc = accuracy_score(y_test, yhat)
        # report and store result
        print(f'Split[{split_count}] >{acc:.3f}')
        results.append(acc)
    return results
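
For multi-label targets, `accuracy_score` computes subset accuracy: a sample counts as correct only when every one of its labels matches. The NumPy sketch below reproduces that idea on a tiny made-up example and contrasts it with a more lenient per-label accuracy; the arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

# hypothetical true and predicted label sets for three samples
y_true = np.array([[1, 1, 0],
                   [0, 1, 0],
                   [1, 1, 1]])
y_pred = np.array([[1, 1, 0],
                   [0, 0, 0],
                   [1, 1, 1]])

# exact-match (subset) accuracy: a row counts only if all labels agree
subset_acc = np.all(y_true == y_pred, axis=1).mean()
# per-label accuracy: fraction of individual label decisions that agree
per_label_acc = (y_true == y_pred).mean()
print(subset_acc, per_label_acc)
```

Because one wrong label makes the whole sample count as an error, subset accuracy is a strict metric, and the per-split scores reported by `evaluate_model` should be read with that in mind.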

In [22]:
X, y = get_dataset()

In [29]:
model_eval = evaluate_model(X, y)

Split[1] >0.830
Split[2] >0.810
Split[3] >0.870
Split[4] >0.880
Split[5] >0.840
Split[6] >0.840
Split[7] >0.810
Split[8] >0.810
Split[9] >0.790
Split[10] >0.800
Split[11] >0.830
Split[12] >0.830
Split[13] >0.770
Split[14] >0.820
Split[15] >0.810
Split[16] >0.830
Split[17] >0.800
Split[18] >0.880
Split[19] >0.820
Split[20] >0.840
Split[21] >0.780
Split[22] >0.830
Split[23] >0.800
Split[24] >0.850
Split[25] >0.850
Split[26] >0.810
Split[27] >0.800
Split[28] >0.800
Split[29] >0.770
Split[30] >0.860
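
The per-split scores are easier to compare when condensed into a mean and a standard deviation, which is what the `mean` and `std` imports at the top are suited for. A minimal sketch, with the scores copied from the output above into a list named `scores` (a name assumed here; in the notebook the same values are held in `model_eval`):

```python
from numpy import mean
from numpy import std

# accuracy per split, copied from the printed output above
scores = [0.830, 0.810, 0.870, 0.880, 0.840, 0.840, 0.810, 0.810, 0.790, 0.800,
          0.830, 0.830, 0.770, 0.820, 0.810, 0.830, 0.800, 0.880, 0.820, 0.840,
          0.780, 0.830, 0.800, 0.850, 0.850, 0.810, 0.800, 0.800, 0.770, 0.860]

# average accuracy across all splits and its spread
mean_acc = mean(scores)
std_acc = std(scores)
print(f'Accuracy: {mean_acc:.3f} ({std_acc:.3f})')
```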


## Model predictions on unseen data

In [43]:
import numpy as np
# print numpy array values without scientific notation
np.set_printoptions(suppress=True)


In [30]:
X, y = get_dataset()

In [31]:
n_inputs = X.shape[1]
n_outputs = y.shape[1]
model = get_model(n_inputs, n_outputs)

In [32]:
model.fit(X, y, verbose=0, epochs=100)

<keras.callbacks.callbacks.History at 0x15726b4d0>

In [44]:
new_data = [3, 3, 6, 7, 8, 2, 11, 11, 1, 3]
new_data = np.asarray([new_data])

In [37]:
y_pred = model.predict(new_data)

In [42]:
y_pred[0]

array([0.99989164, 0.9879595 , 0.00072322], dtype=float32)

In [46]:
new_data = [1, 1, 2, 4, 8, 10, 8, 8, 5, 6]
new_data = np.asarray([new_data])
y_pred = model.predict(new_data)
y_pred[0]

array([0.99590033, 0.9776813 , 0.9917393 ], dtype=float32)
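
The model returns one probability per label, so a final step is needed to turn them into label decisions. A minimal sketch, assuming a cut-off of 0.5 (the variable `threshold` is an assumption; a different value may suit other problems), applied to the two predictions shown above:

```python
import numpy as np

# predicted probabilities copied from the two outputs above
y_pred = np.array([[0.99989164, 0.9879595, 0.00072322],
                   [0.99590033, 0.9776813, 0.9917393]])

# assumed cut-off: a label is predicted when its probability reaches 0.5
threshold = 0.5
labels = (y_pred >= threshold).astype(int)
print(labels)
```

With this cut-off, the first sample is assigned the first two labels and the second sample is assigned all three.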