SVD

\begin{align*}
\underset{m\times n}{A}  &= U \Sigma V^T \\
\end{align*}

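where $U$ and $V$ are orthogonal matrices (their columns are orthonormal), and $\Sigma$ is an $m \times n$ rectangular diagonal matrix with the non-negative singular values on its diagonal.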

One common way of calculating the SVD is as follows:

\begin{align*}
\underset{n\times n}{A^TA} 
&= (V \Sigma^T U^T)U \Sigma V^T \\
&= V \Sigma^T \Sigma V^T \\
\end{align*}

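So calculating the eigenvalues and eigenvectors of $A^TA$ yields $\Sigma$ and $V^T$: the eigenvectors form the columns of $V$, and the singular values on the diagonal of $\Sigma$ are the square roots of the eigenvalues. Keeping only the positive (non-zero) eigenvalues, $\Sigma$ and $V^T$ have shapes $k \times k$ and $k \times n$, respectively, where $k$ is the number of positive eigenvalues, $k \le \min(m, n)$.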
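As a quick numerical check of this relation, the sketch below compares the eigenvalues of $A^TA$ with the squared singular values returned by `np.linalg.svd`. The matrix `A_demo` is just an illustrative choice, and `np.linalg.eigh` is used on the assumption that $A^TA$ is symmetric, which holds for any real $A$.

```python
import numpy as np

A_demo = np.array([[3.0, 2.0, 2.0],
                   [2.0, 3.0, -2.0]])

# eigh is meant for symmetric matrices; it returns eigenvalues in ascending order
eigvals_demo = np.linalg.eigh(A_demo.T @ A_demo)[0]

# singular values from the library routine, in descending order
singular_values = np.linalg.svd(A_demo, compute_uv=False)

# the k largest eigenvalues should equal the squared singular values
k = len(singular_values)
top_eigvals = eigvals_demo[::-1][:k]
print(np.allclose(top_eigvals, singular_values ** 2))
```

The remaining $n - k$ eigenvalues are zero up to floating-point error, which is why the experiment further down filters them with a small threshold.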

Then, $U$ could be obtained by

\begin{align*}
\underset{m\times n}{A}\; \underset{n \times k}{V}  &= \underset{m\times k}{U} \; \underset{k \times k}{\Sigma} \\
\underset{m\times k}{U} &= \underset{m\times n}{A}\; \underset{n \times k}{V}\; \underset{k \times k}{\Sigma^{-1}} \\
\end{align*}
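The $U$ obtained this way should have orthonormal columns, since $U^TU = \Sigma^{-1}V^TA^TAV\Sigma^{-1} = \Sigma^{-1}\Sigma^2\Sigma^{-1} = I$. A minimal sketch that checks this numerically follows; the names `A_demo`, `V_k`, and `U_k` are illustrative, and the `1e-8` threshold is an assumed tolerance for treating an eigenvalue as zero.

```python
import numpy as np

A_demo = np.array([[3.0, 2.0, 2.0],
                   [2.0, 3.0, -2.0]])

# eigendecomposition of the symmetric matrix A^T A (ascending eigenvalues)
w, vecs = np.linalg.eigh(A_demo.T @ A_demo)

# keep only eigenpairs with clearly positive eigenvalues
keep = w > 1e-8
sigma = np.sqrt(w[keep])        # singular values
V_k = vecs[:, keep]             # shape (n, k)

# U = A V Sigma^{-1}; dividing each column by its singular value
# is the same as right-multiplying by diag(1 / sigma)
U_k = (A_demo @ V_k) / sigma    # shape (m, k)

print(np.allclose(U_k.T @ U_k, np.eye(len(sigma))))    # orthonormal columns
print(np.allclose((U_k * sigma) @ V_k.T, A_demo))      # reconstruction
```

Note that $\Sigma^{-1}$ only exists because the zero eigenvalues were dropped first.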

So the solution is

\begin{align*}
\underset{m\times n}{A}  &= \underset{m \times k}{U}\; \underset{k \times k}{\Sigma}\; \underset{k \times n}{V^T} \\
\end{align*}

Since the resulting $U$ and $V$ are in general non-square, with the uninformative columns of $U$ and rows of $V^T$ discarded, this version of the SVD is called the reduced SVD.

To conduct the full SVD, keep all $n$ rows of $V^T$, including the $n-k$ rows that correspond to zero eigenvalues, and keep the zero eigenvalues in $\Sigma$. Analogously, the $m \times m$ matrix $U$ can be calculated from the eigendecomposition of $AA^T = U\Sigma V^T(V\Sigma^T U^T) = U \Sigma \Sigma^T U^T$.
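For comparison, `np.linalg.svd` exposes both variants through its `full_matrices` argument. The sketch below only inspects the shapes and rebuilds the matrix from the full factors; `A_demo` is an illustrative matrix. One caveat: with `full_matrices=False` NumPy truncates to $\min(m, n)$ columns rather than to the number of non-zero singular values, so it matches the reduced form described above only when $A$ has full rank.

```python
import numpy as np

A_demo = np.array([[3.0, 2.0],
                   [2.0, 3.0],
                   [2.0, -2.0]])
m, n = A_demo.shape

U_full, s_full, Vt_full = np.linalg.svd(A_demo, full_matrices=True)
U_red, s_red, Vt_red = np.linalg.svd(A_demo, full_matrices=False)

print(U_full.shape, Vt_full.shape)   # (3, 3) (2, 2)
print(U_red.shape, Vt_red.shape)     # (3, 2) (2, 2)

# the full Sigma is m x n: singular values on the diagonal, zeros elsewhere
Sigma_full = np.zeros((m, n))
Sigma_full[:len(s_full), :len(s_full)] = np.diag(s_full)

print(np.allclose(A_demo, U_full @ Sigma_full @ Vt_full))
```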

**A question not understood yet**: what does "uninformative" mean, and why does ignoring those columns and rows not affect the reconstruction?
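One possible way to look at this question: in the product $U\Sigma V^T$, the extra columns of $U$ are only ever multiplied by the all-zero rows of $\Sigma$ (and the extra rows of $V^T$ by its all-zero columns), so they contribute nothing to the result. The sketch below splits the product into a "kept" and a "dropped" part to illustrate this for the $U$ side; the variable names and the `1e-8` threshold are assumptions made for the example.

```python
import numpy as np

A_demo = np.array([[3.0, 2.0],
                   [2.0, 3.0],
                   [2.0, -2.0]])
m, n = A_demo.shape

U_full, s, Vt = np.linalg.svd(A_demo, full_matrices=True)
k = int(np.sum(s > 1e-8))            # number of non-zero singular values

# full m x n Sigma: singular values on the diagonal, zeros elsewhere
Sigma_full = np.zeros((m, n))
Sigma_full[:len(s), :len(s)] = np.diag(s)

# split U @ Sigma @ Vt into the part using the first k columns of U and the rest
kept = U_full[:, :k] @ Sigma_full[:k, :] @ Vt
dropped = U_full[:, k:] @ Sigma_full[k:, :] @ Vt

print(np.allclose(dropped, 0))       # the extra columns only multiply zeros
print(np.allclose(kept, A_demo))     # the kept part already equals A
```

In this reading, "uninformative" means that those columns span directions that $A$ never maps into, so any orthonormal completion would reconstruct $A$ equally well.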

# Experiment

In [1]:
import numpy as np

In [2]:
np.set_printoptions(suppress=True, precision=4)

Below are the SVDs of two matrices with $m < n$ and $m > n$, respectively. For each matrix, we first compute the SVD, and then reconstruct the original matrix from the resulting factors.

### $m < n$

In [3]:
# An example taken from http://www.d.umn.edu/~mhampton/m4326svd_example.pdf
A = np.array([[3, 2, 2], 
              [2, 3, -2]])

In [4]:
eigvals, eigvecs = np.linalg.eig(A.T @ A)

In [5]:
sorted_indices = np.argsort(eigvals)[::-1]
eigvals = eigvals[sorted_indices]
eigvecs = eigvecs[:, sorted_indices]    # reorder the eigenvector columns to match the sorted eigenvalues

In [6]:
positive_indices = eigvals > 1e-8    # mask that drops the (near-)zero eigenvalues

In [7]:
S = np.sqrt(eigvals[positive_indices])    # Sigma
V = eigvecs[:,positive_indices]

In [8]:
U = A @ V @ np.diag(1 / S)

In [9]:
print(A, A.shape)

[[ 3  2  2]
 [ 2  3 -2]] (2, 3)


In [10]:
print(U, U.shape)

[[-0.7071  0.7071]
 [-0.7071 -0.7071]] (2, 2)


In [11]:
print(np.diag(S), np.diag(S).shape)

[[5. 0.]
 [0. 3.]] (2, 2)


In [12]:
print(V.T, V.T.shape)

[[-0.7071 -0.7071 -0.    ]
 [ 0.2357 -0.2357  0.9428]] (2, 3)


In [13]:
print(U @ np.diag(S) @ V.T)

[[ 3.  2.  2.]
 [ 2.  3. -2.]]


In [14]:
np.allclose(A, U @ np.diag(S) @ V.T)

True

### $m > n$

In [15]:
A = np.array([[ 3,  2],
              [ 2,  3],
              [ 2, -2]])

In [16]:
eigvals, eigvecs = np.linalg.eig(A.T @ A)

In [17]:
sorted_indices = np.argsort(eigvals)[::-1]
eigvals = eigvals[sorted_indices]
eigvecs = eigvecs[:, sorted_indices]    # reorder the eigenvector columns to match the sorted eigenvalues

In [18]:
positive_indices = eigvals > 1e-8    # mask that drops the (near-)zero eigenvalues

In [19]:
S = np.sqrt(eigvals[positive_indices])    # Sigma
V = eigvecs[:,positive_indices]

In [20]:
U = A @ V @ np.diag(1 / S)

In [21]:
print(A, A.shape)

[[ 3  2]
 [ 2  3]
 [ 2 -2]] (3, 2)


In [22]:
print(U, U.shape)

[[ 0.7071 -0.2357]
 [ 0.7071  0.2357]
 [ 0.     -0.9428]] (3, 2)


In [23]:
print(np.diag(S), np.diag(S).shape)

[[5. 0.]
 [0. 3.]] (2, 2)


In [24]:
print(V.T, V.T.shape)

[[ 0.7071  0.7071]
 [-0.7071  0.7071]] (2, 2)


In [25]:
print(U @ np.diag(S) @ V.T)

[[ 3.  2.]
 [ 2.  3.]
 [ 2. -2.]]


In [26]:
np.allclose(A, U @ np.diag(S) @ V.T)

True
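Both matrices above have full rank, so $k = \min(m, n)$ in each case. As a further sketch, the steps can be wrapped in a function and tried on a rank-deficient matrix, where $k < \min(m, n)$. The helper name `reduced_svd_via_eig`, its `tol` parameter, and the matrix `A_rank1` are hypothetical choices for this illustration; `np.linalg.eigh` is swapped in for `np.linalg.eig` because $A^TA$ is symmetric.

```python
import numpy as np

def reduced_svd_via_eig(A, tol=1e-8):
    """Hypothetical helper: reduced SVD from the eigendecomposition of A^T A."""
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)   # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > tol                         # drop (near-)zero eigenvalues
    S = np.sqrt(eigvals[keep])                   # singular values
    V = eigvecs[:, keep]
    U = (A @ V) / S                              # same as A @ V @ diag(1 / S)
    return U, S, V.T

# every row is a multiple of [1, 2], so the rank is 1
A_rank1 = np.array([[1.0, 2.0],
                    [2.0, 4.0],
                    [3.0, 6.0]])

U1, S1, Vt1 = reduced_svd_via_eig(A_rank1)
print(U1.shape, S1.shape, Vt1.shape)             # (3, 1) (1,) (1, 2)
print(np.allclose(A_rank1, U1 @ np.diag(S1) @ Vt1))
```

Here only one eigenvalue of $A^TA$ survives the threshold, so $U$ and $V^T$ each keep a single column and row, and the reconstruction still holds.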