Sascha Spors,
Professorship Signal Theory and Digital Signal Processing,
Institute of Communications Engineering (INT),
Faculty of Computer Science and Electrical Engineering (IEF),
University of Rostock,
Germany

# Data Driven Audio Signal Processing - A Tutorial with Computational Examples

Winter Semester 2022/23 (Master Course #24512)

- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture
- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise

Feel free to contact lecturer frank.schultz@uni-rostock.de

In [None]:
import numpy as np
import tensorflow as tf
from scipy.linalg import toeplitz

# Correlation / Convolution of 1D-Arrays

- we learned about discrete-time convolution in a basic signals and systems course
- among other things, we typically learned about LTI systems with finite impulse responses, which we convolved with finite-length discrete-time input signals to obtain finite-length output signals
- it is a good idea to revisit this 1D convolution using the Toeplitz matrix approach, cf. https://github.com/spatialaudio/signals-and-systems-exercises/blob/master/convolution_dt/convolution_discrete_4CBF4358D5.ipynb and https://github.com/spatialaudio/signals-and-systems-exercises/blob/master/convolution_dt/convolution_discrete_FD58EEB1EC.ipynb
- in machine learning, convolution (or more specifically correlation, see below) of higher-dimensional arrays is a vital part of the optimization job
- we should therefore check which types of convolution are implemented in TensorFlow
- it is important to realize that in machine learning the term **convolution** most often refers to the actual **correlation** operation. As the weights of the filters/correlators are to be learned anyway, it does not really matter whether the kernel is flipped before processing (yielding a conv) or not (yielding a corr). We only need to be aware of it. As always, it is the tiny details that make life easy (if we are aware of them) or hard (if not).
- see the toy examples below to see the difference between conv and corr, and what TF is actually doing
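
The relation between the two operations can be checked with plain NumPy before turning to TensorFlow. The following is a minimal sketch, assuming a short toy signal `x` and kernel `h` chosen for illustration only: `np.correlate` slides `h` over `x` without flipping it, whereas `np.convolve` flips `h` first, so correlating with `h` equals convolving with the flipped `h`.

```python
import numpy as np

x = np.array([1, 1, 2, 0, -1])
h = np.array([2, 1, -1])

corr = np.correlate(x, h, mode='full')  # kernel is not flipped
conv = np.convolve(x, h, mode='full')   # kernel is flipped

# correlation with h equals convolution with the flipped h
print(np.array_equal(corr, np.convolve(x, np.flip(h), mode='full')))
# for a non-symmetric h the two results differ
print(np.array_equal(corr, conv))
```

For a symmetric kernel both operations would coincide, which is why the toy kernel is deliberately non-symmetric.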

## Correlation / Convolution with Toeplitz matrix

In [None]:
x = np.array([0, 1, 1, 2, 0, -1, 0], dtype=np.int32)
h = np.array([2, 1, -1], dtype=np.int32)

# 0-vec with full conv length
tmp = np.zeros(x.shape[0] + h.shape[0] - 1, dtype=np.int32)
tmp[0:x.shape[0]] = x  # insert x
r = np.zeros_like(h)  # prep 0-row vec as long as h
r[0] = x[0]  # 1,1 entry of toeplitz must match 1st col entry
A = toeplitz(tmp, r)  # create toeplitz matrix

print('full correlation:', A @ np.flip(h))
print('full convolution:', A @ h)
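
As a sanity check of the Toeplitz approach, the matrix products can be compared against `np.convolve` and `np.correlate`. The sketch below builds the same convolution matrix with NumPy indexing only; the helper name `conv_matrix` is hypothetical and not part of any library.

```python
import numpy as np


def conv_matrix(x, n_taps):
    """Return the (len(x) + n_taps - 1) x n_taps convolution matrix of x.

    Hypothetical helper: column j holds x delayed by j samples.
    """
    n_out = len(x) + n_taps - 1
    A = np.zeros((n_out, n_taps), dtype=x.dtype)
    for j in range(n_taps):
        A[j:j + len(x), j] = x
    return A


x = np.array([0, 1, 1, 2, 0, -1, 0], dtype=np.int32)
h = np.array([2, 1, -1], dtype=np.int32)
A = conv_matrix(x, len(h))

# A @ h is the full linear convolution
print(np.array_equal(A @ h, np.convolve(x, h, mode='full')))
# A @ flipped h is the full correlation
print(np.array_equal(A @ np.flip(h), np.correlate(x, h, mode='full')))
```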

## Correlation / Convolution with tf.nn.conv1d

### SAME Flag

- output signal y has same length as input signal x
- in order to realize a **full** correlation or convolution we need to zero-pad input signal x appropriately, since tf.nn.conv1d has **no 'FULL'** option
- if x is not zero-padded, the algorithm yields only a part of the linear corr/conv result; we need to decide whether this is a useful result for us
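
The zero-padding idea can be sketched with NumPy only. The function `corr1d_same` is a hypothetical stand-in for a stride-1 correlation with 'SAME'-like padding and an odd-length kernel: it pads (K-1)/2 zeros on both sides and then slides the unflipped kernel. It is an illustration under these assumptions, not TensorFlow's implementation.

```python
import numpy as np


def corr1d_same(x, h):
    """Stride-1 correlation with 'SAME'-like zero-padding (odd len(h))."""
    pad = (len(h) - 1) // 2
    xp = np.pad(x, (pad, pad))  # zeros on both sides
    return np.array([np.dot(xp[n:n + len(h)], h) for n in range(len(x))])


s = np.array([1, 1, 2, 0, -1])  # signal without extra padding
h = np.array([2, 1, -1])

# without extra padding only the centre part of the full result is returned
part = corr1d_same(s, h)

# zero-pad (K-1)/2 samples per side beforehand to obtain the full result
pad = (len(h) - 1) // 2
full = corr1d_same(np.pad(s, (pad, pad)), h)

print(part)
print(full)
print(np.correlate(s, h, mode='full'))
```

The output of the padded call has length N + K - 1, which is exactly the length of the full linear correlation.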

In [None]:
# for flag='SAME' the output y has the same length as the input x
# x is [1, 1, 2, 0, -1] zero-padded by one sample on the left and right,
# so that y becomes the full corr/conv result:
x = tf.constant([0, 1, 1, 2, 0, -1, 0],
                dtype=tf.int32, name='x')
h = tf.constant([2, 1, -1],
                dtype=tf.int32, name='h')

#### correlation

In [None]:
data = tf.reshape(x, [1, int(x.shape[0]), 1], name='data')

kernel = tf.reshape(h, [int(h.shape[0]), 1, 1], name='kernel')

# conv1d is actually a correlation
res = tf.squeeze(tf.nn.conv1d(data, kernel, 1, 'SAME'))
res = np.array(res)
# thus check with np.correlate
print('  ', res)
print('  ', np.correlate(x, h, 'same'))
print(np.correlate(x, h, 'full'))

#### convolution

In [None]:
data = tf.reshape(x, [1, int(x.shape[0]), 1], name='data')

# flip h to go for a real convolution
kernel = tf.reshape(np.flip(h), [int(h.shape[0]), 1, 1], name='kernel')

# conv1d with flipped h is a convolution
res = tf.squeeze(tf.nn.conv1d(data, kernel, 1, 'SAME'))
res = np.array(res)
# thus check with np.convolve
print('  ', res)
print('  ', np.array(tf.squeeze(tf.nn.convolution(data, kernel, 1, 'SAME'))))
print('  ', np.convolve(x, h, mode='same'))
print(np.convolve(x, h, mode='full'))

### VALID Flag

- only the fully overlapping part of input signal x and kernel (filter) h is considered; we need to decide whether this is a useful result for us
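
What "fully overlapping" means can be made explicit with a small NumPy sketch, assuming a toy signal of length N = 5 and a kernel of length K = 3: the unflipped kernel fits into the signal at N - K + 1 positions, and each position contributes one output sample.

```python
import numpy as np

x = np.array([1, 1, 2, 0, -1])
h = np.array([2, 1, -1])

n_valid = len(x) - len(h) + 1  # number of full-overlap positions
# slide the unflipped kernel over x, no zero-padding involved
valid = np.array([np.dot(x[n:n + len(h)], h) for n in range(n_valid)])

print(valid)
print(np.correlate(x, h, mode='valid'))
```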

In [None]:
# for flag='VALID' only the full-overlapping part
# is returned as result, thus no zero-padding
# required
x = tf.constant([1, 1, 2, 0, -1],
                dtype=tf.int32, name='x')
h = tf.constant([2, 1, -1],
                dtype=tf.int32, name='h')

#### correlation

In [None]:
data = tf.reshape(x, [1, int(x.shape[0]), 1], name='data')

kernel = tf.reshape(h, [int(h.shape[0]), 1, 1], name='kernel')

# conv1d is actually a correlation
res = tf.squeeze(tf.nn.conv1d(data, kernel, 1, 'VALID'))
res = np.array(res)
# thus check with np.correlate
print('       ', res)
print('       ', np.correlate(x, h, 'valid'))
print(np.correlate(x, h, 'full'))

#### convolution

In [None]:
data = tf.reshape(x, [1, int(x.shape[0]), 1], name='data')

# flip h to go for a real convolution
kernel = tf.reshape(np.flip(h), [int(h.shape[0]), 1, 1], name='kernel')

# conv1d with flipped h is a convolution
res = tf.squeeze(tf.nn.conv1d(data, kernel, 1, 'VALID'))
res = np.array(res)
# thus check with np.convolve
print('     ', res)
print('     ', np.convolve(x, h, mode='valid'))
print(np.convolve(x, h, mode='full'))

## How to handle circular convolutions with tf.nn.conv1d?

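- tf.nn.conv1d itself offers no circular mode, as the conv/corr kernel does not consider signal repetitions; however, pre-arranging the data by periodically extending x with len(h)-1 samples and then using the 'VALID' flag should yield one period of the circular result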
- do we actually need circ convs in machine learning applications?
- when do we deal with really periodic signals in ML practice?
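
One way to approach the first question is a periodic extension of the input followed by a 'valid' operation. The NumPy sketch below assumes a toy signal of length N and a kernel of length K <= N, prepends the last K-1 samples of `x`, and compares the result with an N-point circular convolution computed via the DFT. That the same pre-arranged input carries over to tf.nn.conv1d with the 'VALID' flag and a flipped kernel is an assumption that is not checked here.

```python
import numpy as np

x = np.array([1, 1, 2, 0, -1])
h = np.array([2, 1, -1])
N, K = len(x), len(h)

# periodic extension: prepend the last K-1 samples of x
x_wrap = np.concatenate((x[-(K - 1):], x))
# the 'valid' linear convolution of the extended signal has length N
y_wrap = np.convolve(x_wrap, h, mode='valid')

# reference: N-point circular convolution via the DFT
y_dft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n=N)))

print(y_wrap)
print(np.round(y_dft).astype(int))
```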

## Copyright

- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)
- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)
- feel free to use the notebooks for your own purposes
- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year.