Sascha Spors,
Professorship Signal Theory and Digital Signal Processing,
Institute of Communications Engineering (INT),
Faculty of Computer Science and Electrical Engineering (IEF),
University of Rostock,
Germany

# Data Driven Audio Signal Processing - A Tutorial with Computational Examples

Winter Semester 2021/22 (Master Course #24512)

- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture
- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise

Feel free to contact lecturer frank.schultz@uni-rostock.de

# Exercise 4: SVD and Left Matrix Inverse

## Objectives

- SVD for tall/thin, full column rank matrix
- Four subspaces in SVD domain
- Projection matrices
- Least squares / normal equation
- Left inverse

## Special Python Packages

Some convenient functions are found in `scipy.linalg`, some in `numpy.linalg`.

## Some Initial Python Stuff

In [None]:
import numpy as np
from scipy.linalg import svd, diagsvd, inv, pinv, null_space, norm
from numpy.linalg import matrix_rank

np.set_printoptions(precision=2, floatmode='fixed', suppress=True)

rng = np.random.default_rng(1234)
mean, stdev = 0, 1

# we might check that everything works for complex data as well
# for complex data the ^H operator (conj().T) needs to be used instead of .T only!
use_complex = False

## SVD of Tall/Thin, Full Column Rank Matrix A

In [None]:
M = 7  # number of rows
N = 3  # number of cols

rank = min(M, N)  # set desired rank == full column rank == independent columns
print('desired rank of A:', rank)

if use_complex:
    dtype = 'complex128'
    A = np.zeros([M, N], dtype=dtype)
    for i in range(rank):
        col = rng.normal(mean, stdev, M) + 1j*rng.normal(mean, stdev, M)
        row = rng.normal(mean, stdev, N) + 1j*rng.normal(mean, stdev, N)
        A += np.outer(col, row)  # superposition of rank-1 matrices
else:
    dtype = 'float64'
    A = np.zeros([M, N], dtype=dtype)
    for i in range(rank):
        col = rng.normal(mean, stdev, M)
        row = rng.normal(mean, stdev, N)
        A += np.outer(col, row)  # superposition of rank-1 matrices
# check if rng produced desired rank
print('        rank of A:', matrix_rank(A))
print('tall/thin matrix with full column rank')
print('-> matrix V contains only the row space')
print('-> null space is only the zero vector')
print('A =\n', A)

In [None]:
[U, s, Vh] = svd(A)
S = diagsvd(s, M, N)
V = Vh.conj().T

print('U =\n', U)
# number of non-zero sing values along diag must match rank
print('non-zero singular values: ', s[:rank])
print('S =\n', S)
print('V =\n', V)
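
A quick sanity check on the factorization can be sketched as follows. This is a minimal, self-contained sketch: the matrix `A` is a fresh random example (not the one from the cell above), and the names `Ue`, `se`, `Vhe` for the economy-size SVD are introduced here only for illustration.

```python
import numpy as np
from scipy.linalg import svd, diagsvd

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))

U, s, Vh = svd(A)  # full SVD: U is MxM, Vh is NxN
S = diagsvd(s, M, N)  # MxN matrix with the singular values on the main diagonal

# the three factors must reproduce A
print('A == U S V^H ?', np.allclose(A, U @ S @ Vh))
# U and V are unitary (orthogonal for real data)
print('U^H U == I ?', np.allclose(U.conj().T @ U, np.eye(M)))
print('V^H V == I ?', np.allclose(Vh @ Vh.conj().T, np.eye(N)))

# economy-size SVD: keeps only the first N columns of U (the column space part)
Ue, se, Vhe = svd(A, full_matrices=False)
print('economy SVD shapes:', Ue.shape, se.shape, Vhe.shape)
print('A == Ue diag(se) Vhe ?', np.allclose(A, Ue @ np.diag(se) @ Vhe))
```

The economy-size variant drops exactly those columns of `U` that span the left null space, so it is sufficient for reconstructing `A` but not for inspecting all four subspaces.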

## Four Subspaces in SVD Domain

The null space $N(\mathbf{A})$ of the tall/thin, full column rank matrix $\mathbf{A}$ contains only the zero vector, i.e. $N(\mathbf{A})=\{\mathbf{0}\}$. Every $\mathbf{x} \neq \mathbf{0}$ is mapped to a non-zero vector in the column space $C(\mathbf{A})$. Correspondingly, all columns of the $\mathbf{V}$ matrix span the row space, and no $\mathbf{v}$ vector is left over to span a dedicated null space.

The tall/thin, full column rank matrix $\mathbf{A}$ has a rather large left null space $N(\mathbf{A}^\mathrm{H})$ with dimension $M-\mathrm{rank}(\mathbf{A})$.

We therefore deal with a set of linear equations with **more equations than unknowns** ($M>N$, more rows than columns), i.e. the **over-determined** case. For this case, we can find a solution in the **least-squares** sense with the help of the **left inverse**, as discussed below.
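
The subspace dimensions stated above can be checked numerically. The following is a minimal sketch under the assumption of a random $7 \times 3$ example matrix; it uses `null_space` from `scipy.linalg` as an independent way to obtain bases for $N(\mathbf{A})$ and $N(\mathbf{A}^\mathrm{H})$, and the names `N_A`, `N_AH`, `U_col`, `U_lnull` are chosen here only for illustration.

```python
import numpy as np
from scipy.linalg import svd, null_space
from numpy.linalg import matrix_rank

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))
r = matrix_rank(A)

U, s, Vh = svd(A)

N_A = null_space(A)  # orthonormal basis of N(A), has N-r columns
N_AH = null_space(A.conj().T)  # orthonormal basis of N(A^H), has M-r columns
print('dim N(A)   =', N_A.shape[1], ', N - rank =', N - r)
print('dim N(A^H) =', N_AH.shape[1], ', M - rank =', M - r)

# the last M-r columns of U span the left null space:
# A^H maps them to zero and they are orthogonal to the column space
U_col, U_lnull = U[:, :r], U[:, r:]
print('A^H U_lnull == 0 ?', np.allclose(A.conj().T @ U_lnull, 0))
print('U_col^H U_lnull == 0 ?', np.allclose(U_col.conj().T @ U_lnull, 0))
```

For full column rank the basis returned for $N(\mathbf{A})$ has zero columns, which is the numerical counterpart of a null space that contains only the zero vector.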

In [None]:
# all stuff that is in matrix U
print('U =\n', U)

# Column Space C(A)
print('\ncolumn space (ortho to left null space):')
print(U[:, :rank])

# Left Null Space, if empty only 0 vector
print('left null space (ortho to column space):')
print(U[:, rank:])

print('###')

# all stuff that is in matrix V
print('\nV =\n', V)

# Row Space
print('\nrow space (ortho to null space):')
print(V[:, :rank])

# Null Space N(A), if empty only 0 vector
print('null space (ortho to row space):')
print(V[:, rank:])  # for full column rank this is only the zero vector

## Left Inverse via SVD

In [None]:
Si = diagsvd(1/s, N, M)  # works if array s has only non-zero entries
print('Inverse singular value matrix with right zero block')
print('Si =\n', Si)
# left inverse using 'inverse' SVD:
Ali = V @ Si @ U.conj().T
# left inverse using a dedicated pinv algorithm
# proper choice is done by pinv() itself
Ali_pinv = pinv(A)
print('pinv == inverse SVD?', np.allclose(Ali, Ali_pinv))
print('Si @ S\n', Si @ S, '\nyields NxN identity matrix')
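
The name left inverse indicates that the inverse only works from the left. A minimal sketch, assuming a random $7 \times 3$ example matrix and using `pinv` to obtain the left inverse, illustrates this; the names `left` and `right` are introduced here only for illustration.

```python
import numpy as np
from scipy.linalg import pinv
from numpy.linalg import matrix_rank

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))
Ali = pinv(A)  # NxM left inverse

left = Ali @ A  # NxN
right = A @ Ali  # MxM
print('Ali @ A == I_N ?', np.allclose(left, np.eye(N)))
print('A @ Ali == I_M ?', np.allclose(right, np.eye(M)))
# A @ Ali has only rank N, so it cannot be the MxM identity
print('rank of A @ Ali:', matrix_rank(right))
```

The product from the right is not an identity but a rank-$N$ matrix, namely the projection onto the column space.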

## Projection Matrices for the Left Inverse Problem

In [None]:
u_col = U[:,0]
u_lnull = U[:,rank]
u_tmp = u_col + u_lnull

# projection onto row space == I_NxN
# full rank and identity since we don't have a null space
# so each vector of the row space is projected onto itself
P_CAH = Ali @ A
print('projection matrix P_CAH projects a row space vector onto itself:\n', P_CAH @ V[:,0], '==', V[:,0])

# projection onto column space
P_CA = A @ Ali
print('projection matrix P_CA projects U-space stuff to column space:\n', P_CA @ (u_tmp), '==', u_col)

# projection onto null space == null matrix
P_NA = np.eye(N, N) - P_CAH

# projection onto left null space
P_NAH = np.eye(M, M) - P_CA
print('projection matrix P_NAH projects U-space stuff to left null space:\n', P_NAH @ (u_tmp), '==', u_lnull)
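
The matrices above can be tested against the defining properties of an orthogonal projection. This is a minimal sketch, assuming a random $7 \times 3$ example matrix; it checks idempotence, Hermitian symmetry, and that the two projections onto $C(\mathbf{A})$ and $N(\mathbf{A}^\mathrm{H})$ are complementary.

```python
import numpy as np
from scipy.linalg import pinv

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))

P_CA = A @ pinv(A)  # projection onto column space
P_NAH = np.eye(M) - P_CA  # projection onto left null space

for name, P in [('P_CA', P_CA), ('P_NAH', P_NAH)]:
    print(name,
          '| idempotent:', np.allclose(P @ P, P),
          '| Hermitian:', np.allclose(P, P.conj().T))

# projecting onto one subspace and then onto the other leaves nothing
print('P_CA @ P_NAH == 0 ?', np.allclose(P_CA @ P_NAH, 0))
# the trace of a projection matrix equals the dimension of its subspace
print('traces:', np.trace(P_CA), np.trace(P_NAH))
```

The traces should come out as $N$ and $M-N$, i.e. the dimensions of the column space and the left null space.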

In [None]:
# design a vector: one entry from column space + one entry from left null space
# so we take some special left singular vectors:
b = U[:, 0] + U[:, rank]  # same as u_tmp above

# the vector b is now a linear combination and lives in column space+left null space
# we can project b
# (i) onto column space and
# (ii) onto left null space
# with above introduced projection matrices
bhat = P_CA @ b  # project b to column space C(A)
e = P_NAH @ b  # project b to left null space N(A^H)

# bring b to row space via left inverse:
# only the column space part of b is brought to the row space
# the left null space part of b is mapped to the zero vector
print(Ali @ b)
print(Ali @ bhat)
# we expect that this is the scaled first right singular vector
print(V[:, 0] / S[0, 0])
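
The same decomposition works for any vector, not only for specially chosen singular vectors. The following minimal sketch assumes a random $7 \times 3$ example matrix and a random vector `b`; it checks that the two projected parts are orthogonal and that the left inverse ignores the left null space part.

```python
import numpy as np
from scipy.linalg import pinv, norm

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))
b = rng.normal(size=M)  # arbitrary vector, not a special singular vector

Ali = pinv(A)
P_CA = A @ Ali
bhat = P_CA @ b  # column space part of b
e = b - bhat  # left null space part of b

print('bhat orthogonal to e ?', np.isclose(np.vdot(bhat, e), 0))
print('Pythagoras holds ?',
      np.isclose(norm(b)**2, norm(bhat)**2 + norm(e)**2))
# the left inverse maps the left null space part to zero
print('Ali @ e == 0 ?', np.allclose(Ali @ e, 0))
print('Ali @ b == Ali @ bhat ?', np.allclose(Ali @ b, Ali @ bhat))
```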

## Left Inverse from Normal Equation / Least Squares Optimization

### Derivation 1

Vector addition from example above

$\hat{\mathbf{b}} + \mathbf{e} = \mathbf{b} \rightarrow \mathbf{e} = \mathbf{b} - \hat{\mathbf{b}}$

We know that a pure row space vector $\hat{\mathbf{x}}$ maps to a pure column space vector $\hat{\mathbf{b}}$

$\hat{\mathbf{b}} = \mathbf{A} \hat{\mathbf{x}}$

Inserting this

$\mathbf{e} = \mathbf{b} - \hat{\mathbf{b}} = \mathbf{b} - \mathbf{A} \hat{\mathbf{x}}$

The vector $\mathbf{A} \mathbf{x}$ always lives in the column space, no matter how $\mathbf{x}$ is constructed.

The vector $\mathbf{e}$ is orthogonal to column space (since it lives in left null space).

So, we know that the inner product must be zero

$(\mathbf{A} \mathbf{x})^\mathrm{H} \mathbf{e} = 0 \rightarrow \mathbf{x}^\mathrm{H} \mathbf{A}^\mathrm{H} (\mathbf{b} - \mathbf{A} \hat{\mathbf{x}}) = 0$

Rearranging yields

$\mathbf{x}^\mathrm{H} \mathbf{A}^\mathrm{H} \mathbf{b} = \mathbf{x}^\mathrm{H} \mathbf{A}^\mathrm{H} \mathbf{A} \hat{\mathbf{x}}$

Since this must hold for every $\mathbf{x}$, we can drop $\mathbf{x}^\mathrm{H}$ and obtain the famous normal equation

$\mathbf{A}^\mathrm{H} \mathbf{b} = \mathbf{A}^\mathrm{H} \mathbf{A} \hat{\mathbf{x}}$

This can be solved using the inverse of $\mathbf{A}^\mathrm{H} \mathbf{A}$ (for full column rank $\mathbf{A}$ this square $N \times N$ matrix has full rank and is therefore invertible)

$(\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} \mathbf{A}^\mathrm{H} \mathbf{b} = (\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} (\mathbf{A}^\mathrm{H} \mathbf{A}) \hat{\mathbf{x}}$

Since $(\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} (\mathbf{A}^\mathrm{H} \mathbf{A}) = \mathbf{I}$ holds, we get the least-squares solution $\hat{\mathbf{x}}$ in the row space of $\mathbf{A}$

$(\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} \mathbf{A}^\mathrm{H} \mathbf{b} = \hat{\mathbf{x}}$

We find the **left inverse** of $\mathbf{A}$ as

$\mathbf{A}^{+L} = (\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} \mathbf{A}^\mathrm{H}$
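
The key step of this derivation, the orthogonality of $\mathbf{e}$ to the column space, can be verified numerically. This is a minimal sketch, assuming a random $7 \times 3$ example matrix and a random vector `b`. As a design choice that differs from the formula above, it solves the normal equation with `scipy.linalg.solve` rather than forming the explicit inverse, which is usually the numerically preferable route.

```python
import numpy as np
from scipy.linalg import solve, pinv

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))
b = rng.normal(size=M)  # arbitrary right-hand side

AH = A.conj().T
# solve the normal equation A^H A xhat = A^H b for xhat
xhat = solve(AH @ A, AH @ b)

e = b - A @ xhat  # error vector
# e lives in the left null space, so A^H e must vanish
print('A^H e == 0 ?', np.allclose(AH @ e, 0))
# the result matches the left inverse applied to b
print('xhat == pinv(A) @ b ?', np.allclose(xhat, pinv(A) @ b))
```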

### Derivation 2

Here, we see where the term least squares comes from.

We set up an optimization problem that asks for the **least** **squared** length of the error vector

$\mathrm{min}_\mathbf{x} ||\mathbf{e}||^2_2 = \mathrm{min}_\mathbf{x} ||\mathbf{b} - \mathbf{A} {\mathbf{x}}||_2^2$

We could solve this with the help of calculus. But we have a nice tool, i.e. the properties of subspaces, that does not require pages of calculation:

We must find the minimum distance from $\mathbf{b}$ to the column space of $\mathbf{A}$.

This can only be achieved if the error vector $\mathbf{e}=\mathbf{b} - \mathbf{A} {\mathbf{x}}$ is orthogonal to the column space of $\mathbf{A}$.

This in turn means that $\mathbf{e}$ must live in the left null space of $\mathbf{A}$, i.e. $\mathbf{e} \in N(\mathbf{A}^\mathrm{H})$. 

By definition of the left null space we have $\mathbf{A}^\mathrm{H} \mathbf{e} = \mathbf{0}$.

So, let us insert $\mathbf{e}$ into $\mathbf{A}^\mathrm{H} \mathbf{e} = \mathbf{0}$:

$\mathbf{A}^\mathrm{H} (\mathbf{b} - \mathbf{A} {\mathbf{x}}) = \mathbf{0}$

$\mathbf{A}^\mathrm{H} \mathbf{b} = \mathbf{A}^\mathrm{H} \mathbf{A} {\mathbf{x}}$

The optimum $\mathbf{x}$ which solves the problem is the same as in Derivation 1 above

$(\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} \mathbf{A}^\mathrm{H} \mathbf{b} = \hat{\mathbf{x}}$

And again, we find the **left inverse** of $\mathbf{A}$ as

$\mathbf{A}^{+L} = (\mathbf{A}^\mathrm{H} \mathbf{A})^{-1} \mathbf{A}^\mathrm{H}$
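
For comparison, the calculus route mentioned above can be sketched in a few lines. This sketch assumes real-valued data, so that $\mathbf{A}^\mathrm{H} = \mathbf{A}^\mathrm{T}$; the symbol $J$ for the cost function is introduced here only for illustration. Expanding the squared norm gives

$J(\mathbf{x}) = ||\mathbf{b} - \mathbf{A} \mathbf{x}||_2^2 = \mathbf{b}^\mathrm{T} \mathbf{b} - 2 \mathbf{x}^\mathrm{T} \mathbf{A}^\mathrm{T} \mathbf{b} + \mathbf{x}^\mathrm{T} \mathbf{A}^\mathrm{T} \mathbf{A} \mathbf{x}$

Setting the gradient with respect to $\mathbf{x}$ to zero yields

$\nabla_\mathbf{x} J(\mathbf{x}) = -2 \mathbf{A}^\mathrm{T} \mathbf{b} + 2 \mathbf{A}^\mathrm{T} \mathbf{A} \mathbf{x} = \mathbf{0} \rightarrow \mathbf{A}^\mathrm{T} \mathbf{b} = \mathbf{A}^\mathrm{T} \mathbf{A} \hat{\mathbf{x}}$

which is again the normal equation. The Hessian $2 \mathbf{A}^\mathrm{T} \mathbf{A}$ is positive definite for a full column rank $\mathbf{A}$, so this stationary point is the unique minimum.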


In [None]:
Ali_normaleq = inv(A.conj().T @ A) @ A.conj().T

xhat = Ali_normaleq @ b  # LS solution in row space
print('xhat = ', xhat)
bhat = A @ xhat  # map to column space

# thus this is the projection matrix that maps b to the column space of A in LS sense
P_CA2 = A @ Ali_normaleq

print('P_CA == P_CA2 ?', np.allclose(P_CA, P_CA2))  # check with above result

print('bhat = ', P_CA2 @ b)  # check that both outputs are equal
print('bhat = ', bhat)
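
As a cross-check, the normal equation solution can be compared with a dedicated least-squares solver. The following is a minimal sketch, assuming a random $7 \times 3$ example matrix and a random vector `b`; it uses `np.linalg.lstsq` and additionally checks that perturbing the solution never decreases the squared error. The perturbation size `0.1` and the names `e_min`, `e_pert` are arbitrary choices for illustration.

```python
import numpy as np
from scipy.linalg import inv, norm

rng = np.random.default_rng(1234)
M, N = 7, 3
# assumed example: random tall/thin matrix, full column rank with probability 1
A = rng.normal(size=(M, N))
b = rng.normal(size=M)  # arbitrary right-hand side

# left inverse from the normal equation
xhat_normaleq = inv(A.conj().T @ A) @ A.conj().T @ b
# dedicated least-squares solver
xhat_lstsq, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print('normal equation == lstsq ?', np.allclose(xhat_normaleq, xhat_lstsq))

# squared error at the optimum
e_min = norm(b - A @ xhat_lstsq)**2
print('lstsq residual == e_min ?', np.isclose(res[0], e_min))

# any perturbation dx adds ||A dx||^2 to the squared error
for _ in range(5):
    dx = 0.1 * rng.normal(size=N)
    e_pert = norm(b - A @ (xhat_lstsq + dx))**2
    print('perturbed error larger ?', e_pert > e_min)
```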

## Copyright

- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)
- feel free to use the notebooks for your own purposes
- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)
- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)
- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year.
