Sascha Spors,
Professorship Signal Theory and Digital Signal Processing,
Institute of Communications Engineering (INT),
Faculty of Computer Science and Electrical Engineering (IEF),
University of Rostock,
Germany

# Data Driven Audio Signal Processing - A Tutorial with Computational Examples

Winter Semester 2021/22 (Master Course #24512)

- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture
- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise

Feel free to contact lecturer frank.schultz@uni-rostock.de

# Exercise 12
CategoricalCrossentropy, i.e.
Multi-Class Classification Using Softmax Loss,
using convenience functions from scikit-learn
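
As a minimal sketch of what the loss named above computes, the following NumPy snippet evaluates softmax and the categorical cross-entropy for two hand-picked examples with one-hot labels. The helper names `softmax` and `categorical_crossentropy`, the clipping constant `eps`, and all numbers are illustrative assumptions and not part of the notebook; the Keras implementation may differ in numerical details.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    z = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=1, keepdims=True)


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over examples of -sum_k y_true[k] * log(y_prob[k])."""
    y_prob = np.clip(y_prob, eps, 1.0)  # guard against log(0)
    return np.mean(-np.sum(y_true * np.log(y_prob), axis=1))


# two examples, three classes (hypothetical pre-softmax activations)
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
# one-hot ground truth: class 0 for the first, class 2 for the second example
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print('probabilities\n', probs)
print('row sums', probs.sum(axis=1))
print('loss', loss)
```

Only the probability assigned to the true class enters the loss, because the one-hot label zeroes out all other terms of the sum.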

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, LabelBinarizer
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.backend as K

print('TF version', tf.__version__,
      '\nKeras version', keras.__version__)

In [None]:
# rng = np.random.RandomState(1)  # for debug
rng = np.random.RandomState()

verbose = 1  # 1: print training / evaluation progress, 0: silent

In [None]:
# DATA
nlabels = 3  # number of classes
labels = np.arange(nlabels)

m = int(5/4*80000)  # number of data examples
nx = 2*nlabels  # number of features

# train_size = 1/2  # 50% are used for training
train_size = 4/5  # 80% are used for training
# train_size = 95/100  # 95% are used for training

X, Y = make_classification(n_samples=m,
                           n_features=nx, n_informative=nx,
                           n_redundant=0,
                           n_classes=nlabels, n_clusters_per_class=1,
                           class_sep=1,
                           flip_y=1e-2,
                           random_state=None)
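# one-hot encode the integer labels; .toarray() avoids the `sparse` /
# `sparse_output` keyword, whose name depends on the scikit-learn version
encoder = OneHotEncoder()
Y = encoder.fit_transform(Y.reshape(-1, 1)).toarray()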

X_train, X_test,\
    Y_train, Y_test = train_test_split(
        X, Y, train_size=train_size, random_state=None)
m_train = X_train.shape[0]
m_test = X_test.shape[0]
print('m_train', m_train)
print('X train dim', X_train.shape, 'Y train dim', Y_train.shape)
print('m_test', m_test)
print('X test dim', X_test.shape, 'Y test dim', Y_test.shape, '\n')
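
To illustrate what the one-hot encoding step in the cell above produces, here is a small NumPy-only sketch. The label vector `y` is a made-up example, and `np.eye` indexing is used here as an assumed stand-in for `OneHotEncoder`, not the notebook's method.

```python
import numpy as np

nlabels = 3
y = np.array([0, 2, 1, 2])  # hypothetical integer class labels

# encode: pick rows of the identity matrix
Y_onehot = np.eye(nlabels)[y]
# decode: index of the largest entry per row
y_back = np.argmax(Y_onehot, axis=1)

print(Y_onehot)
print(y_back)
```

Each row contains exactly one 1, which is the target format that `CategoricalCrossentropy` expects.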

In [None]:
# SETUP of TensorFlow MODEL
# hyperparameters (should ideally be tuned as well;
# here they are fixed for this toy example such that
# computing time and results remain reasonable)
epochs = 5
batch_size = 32
no_perceptron_in_hl = np.array([2*nx, 4*nx, nlabels])

optimizer = keras.optimizers.Adam()
loss = keras.losses.CategoricalCrossentropy(
    from_logits=False, label_smoothing=0)
metrics = [keras.metrics.CategoricalCrossentropy(),
           keras.metrics.CategoricalAccuracy()]

model = keras.Sequential()
model.add(keras.Input(shape=(nx,)))
# hidden layers:
for n in no_perceptron_in_hl:
    model.add(keras.layers.Dense(n, activation='relu'))
# output layer with softmax for multi-class classification
model.add(keras.layers.Dense(nlabels, activation='softmax'))
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
tw = np.sum([K.count_params(w) for w in model.trainable_weights])
print('\ntrainable_weights', tw, '\n')
model.summary()  # prints the table itself; wrapping it in print() adds 'None'
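
The number of trainable weights reported by the cell above can be checked by hand. The sketch below assumes the default settings `nlabels = 3` and `nx = 2*nlabels = 6`, i.e. dense layers with 12, 24 and 3 units followed by the softmax output layer with 3 units; the variable names are chosen for illustration only.

```python
nx = 6  # number of features, 2 * nlabels with nlabels = 3
layer_units = [12, 24, 3, 3]  # 2*nx, 4*nx, nlabels, softmax output layer

n_in = nx
total = 0
for n_out in layer_units:
    n_params = n_in * n_out + n_out  # weight matrix plus bias vector
    print(n_in, '->', n_out, ':', n_params)
    total += n_params
    n_in = n_out
print('total trainable parameters', total)
```

Under these assumptions the layers contribute 84, 312, 75 and 12 parameters, 483 in total.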

In [None]:
# TRAINING PHASE
model.fit(X_train, Y_train,
          validation_data=(X_test, Y_test),
          epochs=epochs,
          batch_size=batch_size,
          verbose=verbose)
model.summary()
# print(model.get_weights())
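
As a rough plausibility check of the training effort, the number of weight updates follows from the data size, the batch size and the number of epochs. The sketch below assumes the default settings of this notebook and that the split yields `m * train_size` training examples.

```python
import math

m = int(5/4*80000)  # number of data examples
train_size = 4/5
batch_size = 32
epochs = 5

m_train = round(m * train_size)  # assumed size of the training set
steps_per_epoch = math.ceil(m_train / batch_size)
total_updates = steps_per_epoch * epochs
print('m_train', m_train)
print('steps per epoch', steps_per_epoch)
print('total weight updates', total_updates)
```

With these values each epoch consists of 2500 mini-batches, so the optimizer performs 12500 updates over 5 epochs.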

In [None]:
print('\n\nmetrics on train data:')
results = model.evaluate(X_train, Y_train,
                         batch_size=m_train,
                         verbose=verbose)
print('Cost', results[0],
      '\nCategoricalCrossentropy', results[1],
      '\nCategoricalAccuracy', results[2])
Y_train_pred = model.predict(X_train)
# map one-hot labels and softmax outputs back to integer class labels, cf.
# https://stackoverflow.com/questions/48908641/how-to-get-a-single-value-from-softmax-instead-of-probability-get-confusion-ma
lb = LabelBinarizer()
lb.fit(labels)
cm = tf.math.confusion_matrix(labels=lb.inverse_transform(Y_train),
                              predictions=lb.inverse_transform(Y_train_pred),
                              num_classes=nlabels)
print('confusion matrix in %\n', cm/m_train*100)
print('categorical_accuracy from confusion matrix = ',
      np.sum(np.diag(cm.numpy())) / m_train * 100, '%')


print('\n\nmetrics on test data:')
results = model.evaluate(X_test, Y_test,
                         batch_size=m_test,
                         verbose=verbose)
print('Cost', results[0],
      '\nCategoricalCrossentropy', results[1],
      '\nCategoricalAccuracy', results[2])
Y_test_pred = model.predict(X_test)
cm = tf.math.confusion_matrix(labels=lb.inverse_transform(Y_test),
                              predictions=lb.inverse_transform(Y_test_pred),
                              num_classes=nlabels)
print('confusion matrix in %\n', cm/m_test*100)
print('categorical_accuracy from confusion matrix = ',
      np.sum(np.diag(cm.numpy())) / m_test * 100, '%')
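
The confusion matrix and the accuracy derived from it can also be sketched with NumPy only. The four examples below are made up for illustration, and taking `np.argmax` over the softmax outputs is an assumed equivalent of the `LabelBinarizer.inverse_transform` call used above.

```python
import numpy as np

nlabels = 3
# hypothetical one-hot ground truth and softmax outputs for four examples
Y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
Y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.6, 0.3],
                   [0.5, 0.1, 0.4],
                   [0.2, 0.7, 0.1]])

y_true = np.argmax(Y_true, axis=1)  # true class per example
y_pred = np.argmax(Y_pred, axis=1)  # most probable class per example

# rows: true class, columns: predicted class
cm = np.zeros((nlabels, nlabels), dtype=int)
np.add.at(cm, (y_true, y_pred), 1)

accuracy = np.trace(cm) / len(y_true) * 100
print('confusion matrix\n', cm)
print('accuracy', accuracy, '%')
```

The diagonal counts the correct predictions, so the accuracy is the trace divided by the number of examples; here three of four examples are classified correctly.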

## Copyright

- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
- feel free to use the notebooks for your own purposes
- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)
- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)
- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year.