## basics

- Spectral Normalization for Generative Adversarial Networks
    - https://arxiv.org/abs/1802.05957
- Two PyTorch interfaces
    - old: `torch.nn.utils.spectral_norm`
    - new: `torch.nn.utils.parametrizations.spectral_norm`

## Spectral norm of a matrix

$$
\|A\|_2 = \max_{x\neq 0}\frac{\|Ax\|_2}{\|x\|_2}=\sqrt{\lambda_{\max}(A^TA)}=\sigma_{\max}(A)
$$

- Spectral norm of $A$ = largest singular value of $A$ = square root of the largest eigenvalue of $A^TA$
- The spectral norm (also known as the induced 2-norm) is the maximum singular value of a matrix. Intuitively, it is the largest factor by which the matrix can stretch a vector.

- The maximum singular value is the square root of the maximum eigenvalue of $A^TA$. If $A$ is symmetric/Hermitian, it equals the largest absolute value among the eigenvalues of $A$ itself.

- Two ways to compute it
    - SVD (singular value decomposition)
    - Power iteration method
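
The power iteration method listed above can be sketched in NumPy as follows. This is a minimal illustration, not PyTorch's implementation: the helper name `spectral_norm_power_iteration`, the iteration count, and the seed are assumptions chosen for this example. It alternates between $u \leftarrow Av/\|Av\|$ and $v \leftarrow A^Tu/\|A^Tu\|$ and estimates $\sigma_{\max}$ as $u^TAv$.

```python
import numpy as np

def spectral_norm_power_iteration(A, n_iters=100, seed=0):
    """Estimate the largest singular value of A by power iteration (illustrative helper)."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    # Start from a random unit vector in the input space.
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        # u approximates the top left singular vector.
        u = A @ v
        u /= np.linalg.norm(u)
        # v approximates the top right singular vector.
        v = A.T @ u
        v /= np.linalg.norm(v)
    # For unit vectors u and v, u^T A v approximates sigma_max.
    return float(u @ A @ v)

# Same matrix as in the notebook cells of this section.
A = np.array([[1, 1, 1, 2],
              [1, 0, 0, 1],
              [4, 4, 3, 0],
              [2, 1, 4, 4],
              [1, 2, 1, 4]])

sigma_pi = spectral_norm_power_iteration(A)
sigma_svd = np.linalg.svd(A, compute_uv=False)[0]
print(sigma_pi, sigma_svd)
```

The two values should agree closely here because the gap between the first and second singular values of this matrix is large, so the iteration converges quickly. When the top two singular values are nearly equal, more iterations are needed.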

In [32]:
import numpy as np
A = np.random.randint(0, 5, (5, 4))
A

array([[1, 1, 1, 2],
       [1, 0, 0, 1],
       [4, 4, 3, 0],
       [2, 1, 4, 4],
       [1, 2, 1, 4]])

In [49]:
x = np.random.randn(4, 1)
x

array([[0.3655181 ],
       [1.86415228],
       [0.98671532],
       [0.14176211]])

In [50]:
np.linalg.norm(x, 2)

2.14531369176711

In [51]:
np.linalg.norm(A.dot(x), 2)

15.363845704020635

In [52]:
np.linalg.svd(A)

(array([[-0.27555358,  0.15562145,  0.16779797, -0.10907471, -0.92725333],
        [-0.11022474,  0.08816295,  0.11996284, -0.96553443,  0.18283868],
        [-0.57781611, -0.80835168,  0.09880305,  0.01432065,  0.05223962],
        [-0.61963188,  0.36789398, -0.68201428,  0.04185011,  0.11753915],
        [-0.44057418,  0.42335663,  0.6946562 ,  0.23214104,  0.30037784]]),
 array([9.17512311, 4.43478733, 2.08144212, 0.90408843]),
 array([[-0.47703782, -0.44550882, -0.5371158 , -0.53428778],
        [-0.41275435, -0.42012793, -0.08444623,  0.80373827],
        [ 0.00653425,  0.610302  , -0.75389739,  0.24316146],
        [-0.77590339,  0.50253943,  0.368801  , -0.09702511]]))
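
The cells above can be tied back to the definition: for any nonzero $x$, the ratio $\|Ax\|_2/\|x\|_2$ is at most $\sigma_{\max}(A)$, and the first right singular vector attains it. The following sketch reuses the `A` and `x` values printed above (with `x` rounded as displayed); the variable names `ratio`, `sigma_max`, and `ratio_v1` are chosen only for this example.

```python
import numpy as np

A = np.array([[1, 1, 1, 2],
              [1, 0, 0, 1],
              [4, 4, 3, 0],
              [2, 1, 4, 4],
              [1, 2, 1, 4]])
x = np.array([[0.3655181],
              [1.86415228],
              [0.98671532],
              [0.14176211]])

# Stretch factor of this particular x.
ratio = np.linalg.norm(A @ x) / np.linalg.norm(x)

# Largest singular value; np.linalg.norm(A, 2) returns the same quantity.
_, s, Vh = np.linalg.svd(A)
sigma_max = s[0]

# The first right singular vector (first row of Vh) is stretched by exactly sigma_max.
v1 = Vh[0]
ratio_v1 = np.linalg.norm(A @ v1) / np.linalg.norm(v1)

print(ratio, sigma_max, ratio_v1)
```

With the values shown above, the ratio for `x` is about 15.36 / 2.145, which is below the largest singular value of about 9.175 reported by `np.linalg.svd`.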

## pytorch api

In [43]:
import numpy as np
import torch 
from torch import nn

In [44]:
m = nn.Linear(5, 4)

In [45]:
W = m.weight.clone()
W

tensor([[ 0.2564,  0.0280, -0.3555, -0.1643, -0.3643],
        [-0.0410, -0.4005,  0.2989, -0.2165,  0.2943],
        [ 0.2819,  0.1352,  0.4254,  0.1120,  0.1435],
        [-0.2132, -0.3698, -0.2455, -0.0125, -0.3623]],
       grad_fn=<CloneBackward0>)

In [46]:
m_sn = nn.utils.parametrizations.spectral_norm(m)
m_sn.weight

tensor([[ 0.2870,  0.0313, -0.3978, -0.1838, -0.4076],
        [-0.0459, -0.4482,  0.3345, -0.2423,  0.3293],
        [ 0.3154,  0.1513,  0.4760,  0.1253,  0.1606],
        [-0.2386, -0.4139, -0.2747, -0.0140, -0.4054]], grad_fn=<DivBackward0>)

In [47]:
U, s, Vh = np.linalg.svd(W.detach().numpy())  # third return value is V^T; s is sorted in descending order
s

array([0.89362526, 0.6472171 , 0.37808868, 0.25057557], dtype=float32)

In [48]:
W/s[0]

tensor([[ 0.2870,  0.0313, -0.3978, -0.1838, -0.4076],
        [-0.0459, -0.4482,  0.3345, -0.2423,  0.3293],
        [ 0.3154,  0.1513,  0.4760,  0.1253,  0.1606],
        [-0.2386, -0.4139, -0.2747, -0.0140, -0.4054]], grad_fn=<DivBackward0>)