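The objective function of LAD (least absolute deviations) regression is $$\min_x\ \|Ax - b\|_1$$
where $A_{n\times p}$ is the matrix of predictors, $b_{n\times1}$ is the response vector, and $x_{p\times1}$ is the vector of regression coefficients to be estimated.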

To cast it in the ADMM framework, take $f(x) = 0$, $g(z) = \|z\|_1$, $B = -I$, $c = b$.
The objective becomes $$\min\ \|z\|_1$$
subject to the constraint $$Ax - b = z$$
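
For reference, the scaled-form augmented Lagrangian of this problem can be written as below. This is a sketch under assumed notation: $u$ denotes the scaled dual variable and $\rho > 0$ the penalty parameter, matching the variables `u` and `rho` that appear in the code.

$$
L_{\rho}(x,z,u)=\|z\|_1+\frac{\rho}{2}\|Ax-z-b+u\|_2^2-\frac{\rho}{2}\|u\|_2^2
$$

ADMM alternates between minimizing this function over $x$, minimizing it over $z$, and updating $u$ with the constraint violation $Ax-z-b$.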


In [1]:
import numpy as np
np.set_printoptions(linewidth=100)

In [2]:
np.random.seed(123)
n = 1000
p = 10
A = np.random.normal(size=(n,p))
xtrue = np.random.normal(size=p)
b = A.dot(xtrue) + np.random.normal(size=n)

In [3]:
xtrue

array([-1.24096967, -0.31294679, -0.84894679,  2.37795259,  0.65750062,  0.21308689, -0.49097031,
       -1.0815104 ,  0.00480111, -0.36079657])

In [4]:
# Set the initial values; note the dimensions (x has length p, z and u have length n)
x = np.zeros(p)
z = np.zeros(n)
u = np.zeros(n)


In [5]:
# In theory rho can be any positive number; set it to the float 1.0 rather than the integer 1
rho = 1.0

In [6]:
# Write the soft-thresholding function so that it applies directly to vectors (using numpy functions)
# Write the zero here as the float 0.0
def soft_thresholding(a, k):
    return np.sign(a) * np.maximum(0.0, np.abs(a) - k)
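
As a small illustration of what the operator does, the sketch below applies it to a hand-picked vector with threshold 1. The array `a_demo` is made up for this example: entries with absolute value at most the threshold are set to zero, and the rest are shrunk toward zero by the threshold.

```python
import numpy as np

def soft_thresholding(a, k):
    # Elementwise S_k(a) = sign(a) * max(0, |a| - k)
    return np.sign(a) * np.maximum(0.0, np.abs(a) - k)

# Hypothetical input chosen to cover both the "zeroed" and the "shrunk" cases
a_demo = np.array([-2.5, -0.3, 0.0, 0.8, 1.7])
out_demo = soft_thresholding(a_demo, 1.0)
print(out_demo)  # -2.5 shrinks to -1.5, 1.7 shrinks to 0.7, the rest become zero
```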

The ADMM iteration (in scaled form) is

$$
\begin{align*}
x^{k+1} & =(A'A)^{-1}A'(b+z^{k}-u^{k})\\
z^{k+1} & =S_{1/\rho}(Ax^{k+1}-b+u^{k})\\
u^{k+1} & =u^{k}+Ax^{k+1}-z^{k+1}-b,
\end{align*}
$$
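
A brief sketch of where the first two updates come from, assuming $A$ has full column rank so that $A'A$ is invertible. Each one minimizes the quadratic penalty term of the scaled augmented Lagrangian in one block of variables while the others are held fixed:

$$
\begin{align*}
x^{k+1} & =\arg\min_{x}\ \frac{\rho}{2}\|Ax-z^{k}-b+u^{k}\|_2^2\ \Longrightarrow\ A'Ax^{k+1}=A'(b+z^{k}-u^{k})\\
z^{k+1} & =\arg\min_{z}\ \|z\|_1+\frac{\rho}{2}\|z-(Ax^{k+1}-b+u^{k})\|_2^2=S_{1/\rho}(Ax^{k+1}-b+u^{k})
\end{align*}
$$

Here $S_{\kappa}(a)=\mathrm{sign}(a)\max(0,|a|-\kappa)$ is applied elementwise. The $x$-update is an ordinary least-squares problem with response $b+z^{k}-u^{k}$, and the $u$-update simply accumulates the constraint violation $Ax^{k+1}-z^{k+1}-b$.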

In [8]:
# Before writing the loop, try each step once by hand (debug-style) to help make sure the code is correct
xnew = np.linalg.solve(A.T.dot(A), A.T.dot(b + z - u))
xnew

array([-1.24034079, -0.25873666, -0.90518866,  2.33812078,  0.69147325,  0.15743223, -0.4450978 ,
       -1.12812669, -0.02567582, -0.36984311])

In [10]:
znew = soft_thresholding(A.dot(xnew) -b + u, 1 / rho)
znew[ :10]

array([-0.        , -0.01952984,  0.        ,  0.73800732,  0.        , -0.        ,  0.87067037,
        0.72896027, -0.44397696, -0.        ])

In [11]:
unew = u + A.dot(xnew) - znew -b
unew[ :10]

array([-0.14818386, -1.        ,  0.37729366,  1.        ,  0.22272854, -0.99080063,  1.        ,
        1.        , -1.        , -0.38022739])

The primal residual of the LAD problem is $r^{k+1}=Ax^{k+1}-z^{k+1}-b$, and the dual residual is $s^{k+1}=-\rho A'(z^{k+1}-z^{k})$.

In [12]:
resid_r = unew - u
np.linalg.norm(resid_r)

22.870132559316534

In [13]:
resid_s = -rho * A.T.dot(znew - z)
np.linalg.norm(resid_s)

11.613498072547442
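
The primal residual above is computed as `unew - u` rather than from its definition. The two agree because the u-update adds exactly $Ax^{k+1}-z^{k+1}-b$ to $u^{k}$. The check below runs one iteration on a small synthetic problem; all names ending in `_s` are hypothetical and exist only for this example.

```python
import numpy as np

def soft_thresholding(a, k):
    return np.sign(a) * np.maximum(0.0, np.abs(a) - k)

# Hypothetical small problem, unrelated to the data used in the notebook
rng = np.random.default_rng(1)
A_s = rng.normal(size=(20, 3))
b_s = rng.normal(size=20)
z_s = np.zeros(20)
u_s = np.zeros(20)
rho_s = 1.0

# One ADMM iteration, written exactly like the step-by-step cells
x_s = np.linalg.solve(A_s.T.dot(A_s), A_s.T.dot(b_s + z_s - u_s))
z_new = soft_thresholding(A_s.dot(x_s) - b_s + u_s, 1 / rho_s)
u_new = u_s + A_s.dot(x_s) - z_new - b_s

r_def = A_s.dot(x_s) - z_new - b_s   # primal residual from its definition
r_diff = u_new - u_s                 # shortcut: difference of successive u
same = np.allclose(r_def, r_diff)
print(same)
```

Using the difference saves one subtraction of length-n vectors per iteration, which is why the loop can reuse it.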

In [15]:
# Now combine the steps into a loop, with at most 10000 iterations and a convergence tolerance of 0.001
max_iter = 10000
tol = 0.001

for i in range(max_iter):
    xnew = np.linalg.solve(A.T.dot(A), A.T.dot(b + z - u))
    Axnew = A.dot(xnew)
    znew = soft_thresholding(Axnew -b + u, 1 / rho)
    unew = u + Axnew - znew -b
    resid_r_norm = np.linalg.norm(unew - u)
    resid_s_norm = np.linalg.norm(-rho * A.T.dot(znew - z))
    x = xnew
    z = znew
    u = unew
    # Every 100 iterations, print the primal and dual residual norms to help debug and monitor convergence
    if i % 100 == 0:
        print(f"Iteration{i},||r|| = {resid_r_norm: .6f}, ||s|| = {resid_s_norm: .6f}")
    if resid_r_norm <= tol and resid_s_norm <= tol:
        break

Iteration0,||r|| =  22.870133, ||s|| =  11.613498
Iteration100,||r|| =  0.035877, ||s|| =  1.151433
Iteration200,||r|| =  0.017872, ||s|| =  0.373020
Iteration300,||r|| =  0.015117, ||s|| =  0.207696
Iteration400,||r|| =  0.009081, ||s|| =  0.150409
Iteration500,||r|| =  0.007534, ||s|| =  0.109300
Iteration600,||r|| =  0.005753, ||s|| =  0.115333
Iteration700,||r|| =  0.004239, ||s|| =  0.113253
Iteration800,||r|| =  0.003713, ||s|| =  0.090421
Iteration900,||r|| =  0.002634, ||s|| =  0.059329
Iteration1000,||r|| =  0.001756, ||s|| =  0.059668
Iteration1100,||r|| =  0.001794, ||s|| =  0.038889
Iteration1200,||r|| =  0.001600, ||s|| =  0.027407
Iteration1300,||r|| =  0.001165, ||s|| =  0.033147
Iteration1400,||r|| =  0.001115, ||s|| =  0.024602
Iteration1500,||r|| =  0.000734, ||s|| =  0.029877
Iteration1600,||r|| =  0.000734, ||s|| =  0.025740
Iteration1700,||r|| =  0.000589, ||s|| =  0.024537
Iteration1800,||r|| =  0.000775, ||s|| =  0.014789
Iteration1900,||r|| =  0.000645, ||s|| = 

In [16]:
x

array([-1.19230848, -0.28642899, -0.89053513,  2.35251214,  0.66217182,  0.14198784, -0.43247972,
       -1.11299057, -0.01374415, -0.38485577])

In [None]:
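# Optimizing the algorithm: speed up each step wherever possible. For example, every xnew update solves a system in A'A, so factor A'A once with Cholesky and reuse the factorization;
# likewise, the z- and u-updates both need A.dot(xnew), so compute it only once.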
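
The two optimizations in this note could look roughly like the sketch below. It is not part of the original notebook: the function name `lad_admm` and the demo data are hypothetical, and it assumes SciPy is available and that `A` has full column rank, so that `A'A` is symmetric positive definite and `scipy.linalg.cho_factor` applies.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def soft_thresholding(a, k):
    return np.sign(a) * np.maximum(0.0, np.abs(a) - k)

def lad_admm(A, b, rho=1.0, max_iter=10000, tol=1e-3):
    """ADMM for LAD with a cached Cholesky factorization (illustrative sketch)."""
    n, p = A.shape
    x = np.zeros(p)
    z = np.zeros(n)
    u = np.zeros(n)
    # Factor A'A once, outside the loop (assumes full column rank)
    chol = cho_factor(A.T.dot(A))
    for i in range(max_iter):
        # x-update: two triangular solves that reuse the cached factor
        xnew = cho_solve(chol, A.T.dot(b + z - u))
        Axnew = A.dot(xnew)  # computed once, shared by the z- and u-updates
        znew = soft_thresholding(Axnew - b + u, 1 / rho)
        unew = u + Axnew - znew - b
        resid_r_norm = np.linalg.norm(unew - u)
        resid_s_norm = np.linalg.norm(-rho * A.T.dot(znew - z))
        x, z, u = xnew, znew, unew
        if resid_r_norm <= tol and resid_s_norm <= tol:
            break
    return x, i + 1  # estimate and number of iterations performed

# Hypothetical demo data, smaller than the problem in the notebook
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(200, 5))
x_demo = rng.normal(size=5)
b_demo = A_demo.dot(x_demo) + rng.normal(size=200)
x_hat, n_iter = lad_admm(A_demo, b_demo)
print(x_hat, n_iter)
```

With p = 10 the saving per iteration is small, but the factorization cost grows as p cubed while each reuse costs only p squared, so caching matters more as the number of predictors grows.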