Sascha Spors,
Professorship Signal Theory and Digital Signal Processing,
Institute of Communications Engineering (INT),
Faculty of Computer Science and Electrical Engineering (IEF),
University of Rostock,
Germany

# Tutorial Selected Topics in Audio Signal Processing

Winter Semester 2022/23 (Master Course)

- lecture: https://github.com/spatialaudio/selected-topics-in-audio-signal-processing-lecture
- tutorial: https://github.com/spatialaudio/selected-topics-in-audio-signal-processing-exercises

WIP...
The project is currently under heavy development; new material is being added for the winter term 2022/23.

Feel free to contact the lecturer at frank.schultz@uni-rostock.de

## Least Squares Regression vs. Orthogonal Regression (via SVD)

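Using a 2D toy example, we compare a least squares line fit, which minimizes the squared vertical distances between the data points and the line, with an orthogonal regression line fit via SVD, which minimizes the squared orthogonal distances.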
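
The two minimization criteria that the code cell below checks numerically can be sketched as follows. The notation is an assumption made for this sketch and does not come from the notebook: $(x_n, y_n)$ is the $n$-th mean-free data point, $\beta_0$ and $\beta_1$ are intercept and slope, and $\mathbf{v}$ is a unit vector along the regression line.

Least squares regression minimizes the squared vertical distances:

$$\min_{\beta_0,\beta_1} \sum_{n=1}^{N} \left(y_n - (\beta_0 + \beta_1 x_n)\right)^2$$

Orthogonal regression minimizes the squared orthogonal distances to a line through the origin:

$$\min_{\|\mathbf{v}\|_2=1} \sum_{n=1}^{N} \left\| \mathbf{d}_n - (\mathbf{d}_n^\mathrm{T}\mathbf{v})\,\mathbf{v} \right\|_2^2, \quad \mathbf{d}_n = [x_n, y_n]^\mathrm{T}$$

The minimizing $\mathbf{v}$ is the right singular vector that belongs to the largest singular value of the $N \times 2$ data matrix, and the slope of the line follows as $v_2 / v_1$.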

In [None]:
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(10)
N = 2**10

mean = [0, 0]

cov = [[1, 0.75], [0.75, 1]]
#cov = [[1, 0.2], [0.2, 0.25]]
#cov = [[1, 0.2], [0.2, 0.75]]
#cov = [[1, 0.9999999], [0.9999999, 1]]
#cov = [[1, 0.0001], [0.0001, 1]]

data = np.random.multivariate_normal(mean, cov, N)
# make each column mean free for a fair comparison of both approaches
data = data - np.mean(data, axis=0)
print('dim of data', data.shape)

# Least Squares Regression: X @ beta = meas, we want to find beta
X = np.array([np.ones(N), data[:, 0]]).T
print('dim of matrix X for LS:', X.shape)
meas = data[:, 1]
# analytical LS solution using left inverse
beta = (np.linalg.inv((X.T@X)) @ X.T) @ meas
print('intercept, slope for LS', beta)
# unit vector along the slope of LS reg line
v_ls = [1, beta[1]]
v_ls = v_ls / np.linalg.norm(v_ls, 2)

# SVD Regression
u, s, vh = np.linalg.svd(data, full_matrices=False)
# the right singular vector with the largest singular value (first row of vh)
# indicates the direction of the regression line
v_max_sing = vh[0, :]
m_svd = v_max_sing[1] / v_max_sing[0]
n_svd = 0  # the line spanned by a singular vector passes through the origin
print('intercept, slope for svd', [n_svd, m_svd])

# plot
x_predict = np.linspace(-10, 10, 2)
plt.figure(figsize=(6, 6))
plt.plot(data[:, 0], data[:, 1],
         'C0o', ms=2, label='data')
plt.plot(x_predict, beta[0] + beta[1] * x_predict,
         'C1-', label='LS regression line fit')
# plt.plot(v_max_sing[0] * x_predict, v_max_sing[1] * x_predict,
#         'C3-', label='SVD regression line fit')  # ==
plt.plot(x_predict, n_svd + m_svd * x_predict,
         'C3-', label='SVD regression line fit')
plt.axis('square')
plt.xlim(-4, 4)
plt.ylim(-4, 4)
plt.xticks(np.arange(-4, 5))
plt.yticks(np.arange(-4, 5))
plt.xlabel('data[:,0]')
plt.ylabel('data[:,1]')
plt.legend()
plt.grid(True)

# pick one data point far to the right as an example; taking the point
# with the largest x value always works, no matter how the data is drawn
tmp1 = data[np.argmax(data[:, 0]), :]
plt.plot(tmp1[0],tmp1[1], 'C0o', ms=7)  # draw it large

# connection line between this single data point and the
# regression lines
# SVD line regression: data point projection -> orthogonal
tmp2 = np.inner(tmp1, v_max_sing) * v_max_sing
plt.plot([tmp1[0], tmp2[0]],[tmp1[1], tmp2[1]], 'C3-.', lw=2)
# LS line regression: x of data point == x of regression line
tmp2 = beta[0] + beta[1] * tmp1[0]
plt.plot([tmp1[0], tmp1[0]], [tmp1[1], tmp2], 'C1-.', lw=2)

# check the minimization criteria:
# for mean squared vertical residuals
ss_ls, ss_svd = 0, 0
for n in range(data.shape[0]):
    x_data, y_data = data[n, 0], data[n, 1]
    # accumulate squared residuals (data minus prediction) for both approaches
    ss_ls += (y_data - (beta[0] + beta[1] * x_data))**2
    ss_svd += (y_data - (n_svd + m_svd * x_data))**2
    # compare data points vs. prediction
    # print(y_data, beta[0] + beta[1] * x_data)  # LS reg
    # print(y_data, n_svd + m_svd * x_data)  # SVD reg
ss_ls *= 1 / data.shape[0]
ss_svd *= 1 / data.shape[0]
print('\nmean squared vertical residuals:')
print('ss_ls =', ss_ls, '< ss_svd =', ss_svd)

# for mean squared orthogonal distances
sod_ls, sod_svd = 0, 0
for n in range(data.shape[0]):
    # get actual data point (as a copy, so that data itself is not altered)
    tmp1 = data[n, :].copy()

    # for SVD
    # project down to v => length, create weighted v
    tmp2 = np.inner(tmp1, v_max_sing) * v_max_sing
    # squared distance between tmp1 and tmp2
    sod_svd += np.linalg.norm(tmp2 - tmp1, 2)**2

    # for LS
    # shift data point by the intercept, such that the LS regression
    # line passes through the origin
    tmp1[1] -= beta[0]
    # then we can use straightforward projection
    tmp2 = np.inner(tmp1, v_ls) * v_ls
    # squared distance between tmp1 and tmp2
    sod_ls += np.linalg.norm(tmp2 - tmp1, 2)**2

sod_ls *= 1 / data.shape[0]
sod_svd *= 1 / data.shape[0]
print('\nsquared orthogonal distances:')
print('sod_ls =', sod_ls, '> sod_svd =', sod_svd)
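
As a cross-check of the two loops above, both criteria can also be evaluated without explicit loops. The following is a minimal sketch and not part of the original notebook: it assumes `np.random.default_rng` with seed 10 (so the random numbers differ from the cell above), uses `np.linalg.lstsq` in place of the explicit left inverse, and introduces the hypothetical helper functions `mean_sq_vertical` and `mean_sq_orthogonal`.

```python
import numpy as np

# assumption: new-style generator, so the samples differ from np.random.seed(10)
rng = np.random.default_rng(10)
N = 2**10
data = rng.multivariate_normal([0, 0], [[1, 0.75], [0.75, 1]], N)
data = data - np.mean(data, axis=0)  # make each column mean free
x, y = data[:, 0], data[:, 1]

# least squares line: solve X @ beta = y in the least squares sense
X = np.column_stack((np.ones(N), x))
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# orthogonal regression line: first row of vh is the dominant right singular vector
_, _, vh = np.linalg.svd(data, full_matrices=False)
v1 = vh[0, :]
n_svd, m_svd = 0.0, v1[1] / v1[0]


def mean_sq_vertical(n, m):
    """Mean squared vertical distance to the line y = n + m x (hypothetical helper)."""
    return np.mean((y - (n + m * x))**2)


def mean_sq_orthogonal(n, m):
    """Mean squared orthogonal distance to the line y = n + m x (hypothetical helper)."""
    return np.mean((m * x - y + n)**2 / (1 + m**2))


ss_ls, ss_svd = mean_sq_vertical(*beta), mean_sq_vertical(n_svd, m_svd)
sod_ls, sod_svd = mean_sq_orthogonal(*beta), mean_sq_orthogonal(n_svd, m_svd)

print('intercept, slope for LS ', beta)
print('intercept, slope for SVD', [n_svd, m_svd])
print('mean squared vertical residuals:   ss_ls =', ss_ls, ', ss_svd =', ss_svd)
print('mean squared orthogonal distances: sod_ls =', sod_ls, ', sod_svd =', sod_svd)
```

The least squares line should yield the smaller vertical criterion and the SVD line the smaller orthogonal criterion, since each line is the minimizer of its own cost function.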

## Copyright

- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)
- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)
- feel free to use the notebooks for your own purposes
- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year.