# Demonstrate Multi-Label Classification

Neural networks are one family of algorithms that natively support multi-label classification. Models for these tasks can be defined and evaluated in a few lines with the Keras deep learning library.

## All about multilabel classification
Typically, a classification task involves predicting a single label. Alternatively, it might involve predicting the likelihood across two or more class labels. In these cases, the classes are mutually exclusive, meaning the classification task assumes that the input belongs to one class only.

Some classification tasks require predicting more than one class label. This means that class labels or class membership are not mutually exclusive. These tasks are referred to as multiple label classification, or multi-label classification for short.

In multi-label classification, zero or more labels are required as output for each input sample, and the outputs are required simultaneously. The assumption is that the output labels are a function of the inputs.

https://machinelearningmastery.com/multi-label-classification-with-deep-learning/
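As a small illustration of the difference, the sketch below encodes a few made-up tag sets as multi-hot rows using scikit-learn's `MultiLabelBinarizer`. The tag names and samples are hypothetical and are not part of the dataset used later. The only point is that a row may contain zero, one, or several ones, whereas a one-hot row for a mutually exclusive task always contains exactly one.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical tag sets for four samples (illustration only).
# The third sample has no labels at all, which is valid in multi-label tasks.
tag_sets = [
    {"sports"},
    {"politics", "economy"},
    set(),
    {"sports", "economy", "politics"},
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tag_sets)

# Columns follow the sorted class names.
print(mlb.classes_)
print(Y)

# Number of active labels per sample: anything from 0 up to the number of classes.
labels_per_sample = Y.sum(axis=1)
print(labels_per_sample)
```

Each column of `Y` can be read as its own yes/no question, which is the view the model defined later takes.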


## Library imports

In [12]:
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import RepeatedKFold
from sklearn.metrics import accuracy_score
import tensorflow as tf

## Demonstrate the built-in multi-label dataset generator

In [13]:
# define dataset
X, y = make_multilabel_classification(n_samples=2000, n_features=10, n_classes=3, n_labels=2, random_state=101)

# summarize dataset shape
print(X.shape, y.shape)

# Summarize first few samples
for index in range(10):
    print(X[index], y[index])

(2000, 10) (2000, 3)
[3. 7. 2. 2. 0. 6. 2. 9. 5. 3.] [1 1 0]
[ 5.  3.  7.  3.  1.  4.  5.  4.  6. 13.] [1 1 1]
[5. 5. 4. 7. 4. 6. 6. 5. 6. 8.] [1 1 1]
[ 5. 11.  8.  6.  0.  3.  0. 13.  7.  7.] [0 1 0]
[ 4.  5.  9.  5.  5.  4. 11.  4.  6.  7.] [0 0 0]
[ 7.  7.  6.  6.  2.  5.  0. 10.  6.  3.] [0 1 0]
[4. 7. 7. 5. 7. 6. 4. 4. 3. 4.] [0 0 0]
[3. 2. 0. 5. 0. 5. 3. 7. 4. 2.] [1 1 0]
[ 4.  4.  5.  5.  2.  8.  3.  1.  3. 11.] [1 1 1]
[ 1.  3.  1.  6.  4. 13.  3.  7.  5. 10.] [1 1 0]
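The sample rows above already show inputs with zero, one, two, and three active labels. A quick way to see the whole picture is to count labels per sample and the frequency of each label. The sketch below regenerates the same dataset with the same arguments; the variable names `labels_per_sample`, `counts`, and `label_frequency` are introduced here for illustration. In scikit-learn, `n_labels` is the average number of labels per instance rather than a fixed count, so a spread is expected.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification

# Same arguments as the cell above.
X, y = make_multilabel_classification(
    n_samples=2000, n_features=10, n_classes=3, n_labels=2, random_state=101
)

# How many labels are active for each sample (0 to n_classes).
labels_per_sample = y.sum(axis=1)
counts = np.bincount(labels_per_sample, minlength=y.shape[1] + 1)
for k, c in enumerate(counts):
    print(f"{k} labels: {c} samples")

# Fraction of samples in which each label is active.
label_frequency = y.mean(axis=0)
print("label frequency:", label_frequency)
```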


## Define some functions

In [14]:
# to create/get the dataset

def get_dataset():
    X, y = make_multilabel_classification(n_samples=2000, n_features=10, n_classes=3, n_labels=2, random_state=101)
    return X, y

In [15]:
# to create and compile the neural network model
def get_model(n_inputs, n_outputs):
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(20, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
    model.add(tf.keras.layers.Dense(n_outputs, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model
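The model pairs a sigmoid output layer with binary cross-entropy, which treats each label as an independent yes/no decision. The NumPy sketch below is not part of the notebook's pipeline. It uses made-up logits and hand-written helper functions (`sigmoid`, `softmax`, `binary_crossentropy`, all hypothetical names) to show why softmax would be the wrong choice here: softmax probabilities must sum to one, so they cannot express "labels 0 and 1 are both present".

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function: each output is an independent probability.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Normalised exponentials: outputs compete and sum to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

def binary_crossentropy(y_true, p, eps=1e-7):
    # Mean of the per-label Bernoulli negative log-likelihoods.
    p = np.clip(p, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

logits = np.array([2.0, 1.5, -3.0])  # illustrative values, not model output

p_sigmoid = sigmoid(logits)
p_softmax = softmax(logits)
print("sigmoid:", p_sigmoid.round(3), "sum =", p_sigmoid.sum().round(3))
print("softmax:", p_softmax.round(3), "sum =", p_softmax.sum().round(3))

# The loss is low when the target agrees with the sigmoid outputs, high otherwise.
loss_match = binary_crossentropy(np.array([1, 1, 0]), p_sigmoid)
loss_mismatch = binary_crossentropy(np.array([0, 0, 1]), p_sigmoid)
print(loss_match, loss_mismatch)
```

With sigmoid, the first two labels both exceed 0.5 at the same time, which matches a target such as `[1 1 0]` from the dataset above.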

In [16]:
# evaluate a model using repeated k-fold cross-validation
def evaluate_model(X, y):
    results = list()
    n_inputs, n_outputs = X.shape[1], y.shape[1]
    # define evaluation procedure
    cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=101)
    # enumerate folds
    for train_ix, test_ix in cv.split(X):
        # prepare data
        X_train, X_test = X[train_ix], X[test_ix]
        y_train, y_test = y[train_ix], y[test_ix]
        # define model
        model = get_model(n_inputs, n_outputs)
        # fit model
        model.fit(X_train, y_train, verbose=1, epochs=10)
        # make a prediction on the test set
        yhat = model.predict(X_test)
        # round probabilities to class labels
        yhat = yhat.round()
        # calculate accuracy
        acc = accuracy_score(y_test, yhat)
        # store result
        print('>%.3f' % acc)
        results.append(acc)
    return results
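One detail worth keeping in mind when reading the scores: for multi-label targets, scikit-learn's `accuracy_score` computes subset accuracy, so a sample counts as correct only if every one of its labels is predicted correctly. The sketch below uses small made-up arrays (not model output) to contrast this with `hamming_loss`, which scores each label individually, and also shows the thresholding step that `yhat.round()` performs.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss

# Made-up probabilities standing in for model.predict output (illustration only).
probs = np.array([
    [0.9, 0.8, 0.1],
    [0.2, 0.7, 0.6],
    [0.8, 0.4, 0.9],
    [0.1, 0.3, 0.2],
])
y_true = np.array([
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
])

# Rounding is a 0.5 threshold applied to every label independently.
y_pred = probs.round().astype(int)

subset_acc = accuracy_score(y_true, y_pred)  # exact match on all labels of a row
ham = hamming_loss(y_true, y_pred)           # fraction of individual labels that are wrong

print("subset accuracy:", subset_acc)        # 2 of 4 rows match exactly
print("per-label accuracy:", 1 - ham)        # 10 of 12 labels are correct
```

Subset accuracy is the stricter of the two, which is part of why the fold scores can look modest even when most individual labels are predicted correctly.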

## Main Program

In [17]:
# load dataset
X, y = get_dataset()

In [18]:
# Evaluate Model
results = evaluate_model(X, y)

[Per-epoch progress lines (Epoch 1/10 ... Epoch 10/10) repeat for every fold and are omitted; the captured output is truncated during the ninth fold.]
>0.605
>0.465
>0.570
>0.565
>0.620
>0.610
>0.600
>0.585

In [20]:
# summarize performance
print('Accuracy: %.3f (%.3f)' % (np.mean(results), np.std(results)))

Accuracy: 0.579 (0.042)
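The mean and standard deviation above summarize one score per fold. With `n_splits=10` and `n_repeats=3`, `RepeatedKFold` yields 30 train/test splits, so 30 models are trained. The sketch below checks this on a zero-filled placeholder array with the same shape as the dataset; `X_dummy` and `fold_sizes` are names introduced here for illustration, since the splitter only needs the number of samples.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

# Placeholder with the same shape as X; the values are irrelevant to the splitter.
X_dummy = np.zeros((2000, 10))

cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=101)

# Collect (train size, test size) for every split.
fold_sizes = [(len(train_ix), len(test_ix)) for train_ix, test_ix in cv.split(X_dummy)]

print(len(fold_sizes))   # 30 splits: 10 folds x 3 repeats
print(set(fold_sizes))   # {(1800, 200)}: every split trains on 1800 and tests on 200
```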
