# 4. Matrix Operations

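#### Matrix Derivatives
1. Consider the matrix product $$ A \cdot B = C $$
2. Consider the loss function $$ L = \sum^m_{i=1}\sum^n_{j=1}{(C_{ij} - p)^2} $$ where $p$ is taken to be a fixed target value.
3. Consider the derivative of $L$ with respect to each entry of $C$: $$ \triangledown C_{ij} = \frac{\partial L}{\partial C_{ij}} $$
4. Let $A$, $B$, and $C$ all be 2x2 matrices, and define $G$ as the derivative of $L$ with respect to $C$: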
$$A = \begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
\quad
B = 
\begin{bmatrix} 
e & f \\
g & h
\end{bmatrix}
\quad
C = \begin{bmatrix} 
i & j \\
k & l 
\end{bmatrix}
\quad
G = \frac{\partial L}{\partial C} = \begin{bmatrix} 
\frac{\partial L}{\partial i} & \frac{\partial L}{\partial j} \\
\frac{\partial L}{\partial k} & \frac{\partial L}{\partial l} 
\end{bmatrix} = \begin{bmatrix} 
w & x \\
y & z 
\end{bmatrix}
$$

5. Expand the left-hand side, $A \cdot B$:
$$C = \begin{bmatrix}
i = ae + bg & j = af + bh\\
k = ce + dg & l = cf + dh
\end{bmatrix}
$$

6. The derivative of $L$ with respect to each entry of $A$ follows from the chain rule: $$ \triangledown A_{ij} = \frac{\partial L}{\partial A_{ij}} $$

$$ \frac{\partial L}{\partial a} = \frac{\partial L}{\partial i} \cdot \frac{\partial i}{\partial a} + \frac{\partial L}{\partial j} \cdot \frac{\partial j}{\partial a} $$

$$ \frac{\partial L}{\partial b} = \frac{\partial L}{\partial i} \cdot \frac{\partial i}{\partial b} + \frac{\partial L}{\partial j} \cdot \frac{\partial j}{\partial b} $$

$$ \frac{\partial L}{\partial c} = \frac{\partial L}{\partial k} \cdot \frac{\partial k}{\partial c} + \frac{\partial L}{\partial l} \cdot \frac{\partial l}{\partial c} $$

$$ \frac{\partial L}{\partial d} = \frac{\partial L}{\partial k} \cdot \frac{\partial k}{\partial d} + \frac{\partial L}{\partial l} \cdot \frac{\partial l}{\partial d} $$

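From step 5, $i = ae + bg$ and $j = af + bh$, so $\partial i / \partial a = e$ and $\partial j / \partial a = f$; the other entries follow the same way. Substituting these together with the entries $w, x, y, z$ of $G$ gives:

$$ \frac{\partial L}{\partial a} = we + xf $$
$$ \frac{\partial L}{\partial b} = wg + xh $$
$$ \frac{\partial L}{\partial c} = ye + zf $$
$$ \frac{\partial L}{\partial d} = yg + zh $$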

7. The gradient with respect to $A$ is therefore $$ 
\triangledown A = \begin{bmatrix}
we + xf & wg + xh\\
ye + zf & yg + zh
\end{bmatrix}
= \begin{bmatrix}
w & x\\
y & z
\end{bmatrix}
\begin{bmatrix}
e & g\\
f & h
\end{bmatrix}
$$

$$
\triangledown A = G \cdot B^T
$$

8. Similarly, the derivative of $L$ with respect to each entry of $B$ is:
$$ \frac{\partial L}{\partial e} = wa + yc $$
$$ \frac{\partial L}{\partial f} = xa + zc $$
$$ \frac{\partial L}{\partial g} = wb + yd $$
$$ \frac{\partial L}{\partial h} = xb + zd $$

$$ 
\triangledown B = \begin{bmatrix}
wa + yc & xa + zc\\
wb + yd & xb + zd
\end{bmatrix}
= \begin{bmatrix}
a & c\\
b & d
\end{bmatrix}
\begin{bmatrix}
w & x\\
y & z
\end{bmatrix}
$$

$$
\triangledown B = A^T \cdot G
$$
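
The two results above, $\triangledown A = G \cdot B^T$ and $\triangledown B = A^T \cdot G$, can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the loss from step 2 with a scalar target `p`, so that $G = 2(C - p)$, and compares the analytic gradients with central finite differences. The helper names `loss`, `analytic_grads`, and `numeric_grad` are hypothetical and chosen only for illustration.

```python
import numpy as np

def loss(A, B, p):
    """L = sum_ij (C_ij - p)^2 with C = A @ B."""
    return np.sum((A @ B - p) ** 2)

def analytic_grads(A, B, p):
    """Gradients from the derivation: G = dL/dC, dA = G @ B.T, dB = A.T @ G."""
    G = 2.0 * (A @ B - p)
    return G @ B.T, A.T @ G

def numeric_grad(f, M, eps=1e-6):
    """Central finite differences of the scalar f() with respect to each entry of M.

    M is perturbed in place and restored, so f must read M by reference.
    """
    grad = np.zeros_like(M)
    for idx in np.ndindex(M.shape):
        old = M[idx]
        M[idx] = old + eps
        f_plus = f()
        M[idx] = old - eps
        f_minus = f()
        M[idx] = old
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2))
p = 0.5  # assumed scalar target

dA, dB = analytic_grads(A, B, p)
dA_num = numeric_grad(lambda: loss(A, B, p), A)
dB_num = numeric_grad(lambda: loss(A, B, p), B)

print("dA matches:", np.allclose(dA, dA_num, atol=1e-5))
print("dB matches:", np.allclose(dB, dB_num, atol=1e-5))
```

Nothing in the check depends on the matrices being 2x2; changing the shapes to, say, `(5, 3)` and `(3, 1)` exercises the same formulas in the shape used by a linear model `X @ K`.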

In [None]:
import numpy as np
import pandas as pd

def get_data(file = "上海二手房价.csv"):
    datas = pd.read_csv(file,names=["y","x1","x2","x3","x4","x5","x6"],skiprows=1)   
    y = datas["y"].values.reshape(-1,1) # a single column; the number of rows is inferred from the data
    X = datas[[f"x{i}" for i in range(1,7)]].values
    
    # z-score normalization: (x - mean_x) / std_x, applied to both y and X
    mean_y = np.mean(y)
    std_y = np.std(y)
    
    mean_X = np.mean(X,axis =0,keepdims = True) # axis=0 averages each column, giving a 1-row result
    std_X = np.std(X,axis=0,keepdims = True)
    
    y = (y-mean_y) / std_y
    X = (X-mean_X) / std_X
    
    return X,y,mean_y,std_y,mean_X,std_X   

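if __name__ == "__main__":
    X,y,mean_y,std_y,mean_X,std_X = get_data()
    K = np.random.random((6,1))  # one randomly initialized weight per feature
    
    epoch = 1000
    lr = 0.1
    b = 0
    
    for e in range(epoch):
        pre = X @ K + b  # prediction; in some use cases a model with a bias b fits better than one without
        loss = np.sum((pre - y)**2)/len(X) # mean of the squared errors
        # The exact derivative of loss with respect to pre is 2*(pre - y)/len(X).
        # The constant factor 2 is dropped because it only rescales the learning rate.
        # Dividing by len(X) keeps the gradient independent of the number of samples;
        # without it, X.T @ G sums over every row and delta_K can blow up on large datasets.
        G = (pre - y)/len(X)
        delta_K = X.T @ G   # gradient for K, an instance of A^T @ G with A = X
        delta_b = np.sum(G) # gradient for b; G already carries the 1/len(X) factor, so sum rather than average again
        
        K = K - lr * delta_K
        b = b - lr * delta_b
        
        #print(f"loss:{loss:.3f}")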
        
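    while True:
        bedroom = (int(input("Enter the number of bedrooms:")))
        ting = (int(input("Enter the number of living rooms:")))
        wei = (int(input("Enter the number of bathrooms:")))
        area = (int(input("Enter the area:")))
        floor = (int(input("Enter the floor:")))
        year = (int(input("Enter the year built:")))
        # K and b were trained on normalized X, so the input is normalized with the same mean and std
        test_x = (np.array([bedroom,ting,wei,area,floor,year]).reshape(1,-1)- mean_X) / std_X
        p = test_x @ K + b
        # The model predicts normalized y, so the output is de-normalized to get the actual price
        print("House price:", p * std_y + mean_y)

Enter the number of bedrooms:3
Enter the number of living rooms:1
Enter the number of bathrooms:1
Enter the area:130
Enter the floor:11
Enter the year built:2010
House price: [[66476.90245092]]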
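
The training loop above can be tried without the CSV file. The following is a rough sketch under stated assumptions: the data is synthetic (`X_raw`, `true_w`, the intercept `7.0`, and the noise level are invented stand-ins, not the housing data), and `np.linalg.lstsq` is used only as an independent reference to see whether gradient descent on z-scored data lands near the least-squares solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the housing data: 200 samples, 6 features (hypothetical values).
X_raw = rng.normal(loc=50.0, scale=10.0, size=(200, 6))
true_w = np.array([[3.0], [-2.0], [0.5], [1.5], [0.0], [4.0]])
y_raw = X_raw @ true_w + 7.0 + rng.normal(scale=0.1, size=(200, 1))

# z-score normalization, as in get_data
mean_X = X_raw.mean(axis=0, keepdims=True)
std_X = X_raw.std(axis=0, keepdims=True)
mean_y, std_y = y_raw.mean(), y_raw.std()
X = (X_raw - mean_X) / std_X
y = (y_raw - mean_y) / std_y

# Gradient descent with the same update rule as the loop above
K = rng.random((6, 1))
b = 0.0
lr = 0.1
for _ in range(1000):
    pre = X @ K + b
    G = (pre - y) / len(X)   # dL/dpre with the constant factor 2 dropped
    K -= lr * (X.T @ G)      # gradient for K: X^T @ G
    b -= lr * np.sum(G)      # gradient for b: sum of G

# Reference: closed-form least squares on the normalized data, with a bias column appended
X1 = np.hstack([X, np.ones((len(X), 1))])
sol, *_ = np.linalg.lstsq(X1, y, rcond=None)

print("weights match:", np.allclose(K, sol[:-1], atol=1e-3))
print("bias matches:", abs(b - sol[-1, 0]) < 1e-3)

# Predict one sample: normalize the input, apply the model, then de-normalize the output
x_new = X_raw[:1]
pred = ((x_new - mean_X) / std_X @ K + b) * std_y + mean_y
print("prediction:", pred[0, 0], "observed:", y_raw[0, 0])
```

Because both `X` and `y` are centered, the bias gradient stays near zero throughout training, so `b` barely moves from its initial value; the bias matters more when the data is not normalized.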
