### House Price Prediction
* Input features: number of bedrooms, number of bathrooms, and living area, denoted $x_1,x_2,x_3$
* Sale price: $y = w_1x_1+w_2x_2+w_3x_3+b$

* $X = [x_1,x_2,\dots,x_n]^T$
* $W = [w_1,w_2,\dots,w_n]^T$
* $y_{pred} = w_1x_1+w_2x_2+\dots+w_nx_n+b = \langle W,X \rangle+b$
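
As a quick check of the formula above, the sketch below computes one prediction both as an explicit weighted sum and as an inner product. The feature values, weights, and bias are hypothetical numbers chosen only for illustration; they are not taken from the dataset.

```python
import numpy as np

# Hypothetical features for one house: bedrooms, bathrooms, living area
x = np.array([3.0, 2.0, 120.0])
# Hypothetical weights and bias, for illustration only
w = np.array([10.0, 5.0, 2.0])
b = 50.0

# Explicit weighted sum: w_1*x_1 + w_2*x_2 + w_3*x_3 + b
y_sum = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
# The same prediction written as an inner product <W, X> + b
y_dot = np.dot(w, x) + b

print(y_sum, y_dot)
```

Both expressions give the same value, which is why the model can be written compactly in vector form.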

*  **This model can be viewed as a single-layer neural network.**

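* Loss over $n$ samples, with the samples stacked as rows of $X$: $L = \frac{1}{2n} \sum_{i=1}^{n} (y_{true}^{(i)}-y_{pred}^{(i)})^2 = \frac{1}{2n} || y_{true}-Xw||^2$
* The bias is absorbed into the weights by appending a constant 1 to every sample: $X \leftarrow [X,1]$, $w \leftarrow [w,b]^T$. The goal is to find the optimal parameters.
* Set the gradient to zero: $\frac{\partial}{\partial w}L=0$
* $\Rightarrow -\frac{1}{n}(y_{true}-Xw)^TX=0$
* $\Rightarrow w^*=(X^TX)^{-1}X^Ty_{true}$ (assuming $X^TX$ is invertible)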
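
The closed-form solution can be checked numerically. The following is a minimal sketch on synthetic data; `true_w`, `true_b`, the sample count, and the noise level are assumptions made up for this example. It solves the linear system $(X^TX)w = X^Ty_{true}$ instead of forming the inverse explicitly, which is the usual numerically safer choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Synthetic data with known (hypothetical) parameters
true_w = np.array([2.0, -3.4, 1.5])
true_b = 4.2
X = rng.normal(size=(n, 3))
y = X @ true_w + true_b + 0.01 * rng.normal(size=n)

# Absorb the bias: append a column of ones, so w becomes [w, b]
X_aug = np.hstack([X, np.ones((n, 1))])

# Solve (X^T X) w = X^T y for the optimal parameters
w_star = np.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y)

# Gradient of the loss at w_star; it should be numerically zero
grad = -(1.0 / n) * X_aug.T @ (y - X_aug @ w_star)

print("weights:", w_star[:3], "bias:", w_star[3])
```

The recovered weights and bias should land close to the values used to generate the data, and the gradient at `w_star` should vanish up to floating-point error.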

In [None]:
%matplotlib inline
import random
import torch
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import torch.optim as optim

In [None]:
path = r'/kaggle/input/bostonhoustingmlnd/housing.csv'
import pandas as pd 
data = pd.read_csv(path)
data.head()

In [None]:
data.isnull().sum()

In [None]:
data.shape

In [None]:
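from sklearn.preprocessing import StandardScaler

# The first three columns are the features, the fourth is the target
X = data.iloc[:,:3].values
y = data.iloc[:,3].values
# Use separate scalers so the statistics fitted on X are not overwritten by y
x_scaler = StandardScaler()
y_scaler = StandardScaler()
X_scaled = x_scaler.fit_transform(X)
y_scaled = y_scaler.fit_transform(y.reshape(-1, 1))
X_tensor = torch.tensor(X_scaled,dtype = torch.float32)
y_tensor = torch.tensor(y_scaled,dtype = torch.float32).reshape(-1,1)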
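
Because the target is standardized, a model trained on it predicts in standardized units. The sketch below shows one way to map such predictions back to prices, assuming a scaler fitted only on the target is kept around. The prices and the "predictions" are hypothetical values for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical prices, used only for illustration
y = np.array([200.0, 300.0, 400.0, 500.0]).reshape(-1, 1)

# A scaler fitted only on the target
y_scaler = StandardScaler()
y_scaled = y_scaler.fit_transform(y)  # zero mean, unit variance

# Pretend these are model outputs in standardized units
pred_scaled = np.array([[0.0], [1.0]])
# Map them back to the original price scale
pred_price = y_scaler.inverse_transform(pred_scaled)

print(pred_price.ravel())
```

A standardized prediction of 0 maps to the mean price, and a prediction of 1 maps to one standard deviation above the mean.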

In [None]:
batch_size = 30

def data_iter(batch_size, features, labels):
    """Yield shuffled mini-batches of (features, labels)."""
    num = len(features)
    indices = list(range(num))
    random.shuffle(indices)  # shuffle the sample indices

    for i in range(0, num, batch_size):
        # Indices of the current batch; slicing handles the shorter final batch
        batch_indices = indices[i:i + batch_size]
        batch_features = features[batch_indices]
        batch_labels = labels[batch_indices]

        yield batch_features, batch_labels

In [None]:
# Inspect the shapes of the first mini-batch
for X_batch, y_batch in data_iter(batch_size,X_tensor,y_tensor):
    print(X_batch.shape, y_batch.shape)
    break
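
For comparison, PyTorch's built-in `TensorDataset` and `DataLoader` provide the same shuffled mini-batch iteration. This is a sketch of that alternative, not the approach used in this notebook; the random `features` and `labels` tensors are placeholders.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Placeholder data: 100 samples with 3 features and 1 target each
features = torch.randn(100, 3)
labels = torch.randn(100, 1)

# Pair features with labels, then iterate in shuffled mini-batches
dataset = TensorDataset(features, labels)
loader = DataLoader(dataset, batch_size=30, shuffle=True)

# Record the size of every batch in one pass over the data
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(batch_sizes)
```

As with the hand-written generator, the final batch is shorter when the sample count is not a multiple of the batch size.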

In [None]:
w = torch.normal(0,0.01,size=(3,1),requires_grad=True)
b = torch.zeros(1,requires_grad=True)

In [None]:
def linear(X,w,b):
    return torch.matmul(X,w)+b

In [None]:
def loss(p,y):
    # Per-sample squared error (not reduced); the caller sums or averages it
    return (p-y.reshape(p.shape))**2 / 2

In [None]:
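def sgd(params,lr,batch_size):
    """Mini-batch SGD step: expects gradients of the summed batch loss."""
    with torch.no_grad():
        for param in params:
            # Dividing by batch_size turns the summed gradient into an average
            param -= lr*param.grad / batch_size
            param.grad.zero_()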
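
The division by `batch_size` pairs with calling `backward()` on the summed loss: together they give the same update as plain SGD on the mean loss. The sketch below checks this against `torch.optim.SGD` on small random tensors; the data, seed, and learning rate are arbitrary choices for the example.

```python
import torch

torch.manual_seed(0)
X = torch.randn(10, 3)
y = torch.randn(10, 1)
batch_size = len(X)
lr = 0.1

# Two copies of the same initial weights
w1 = torch.randn(3, 1, requires_grad=True)
w2 = w1.detach().clone().requires_grad_(True)

# Manual step: gradient of the summed loss, divided by batch_size
((X @ w1 - y) ** 2 / 2).sum().backward()
with torch.no_grad():
    w1 -= lr * w1.grad / batch_size
    w1.grad.zero_()

# Built-in step: gradient of the mean loss
opt = torch.optim.SGD([w2], lr=lr)
((X @ w2 - y) ** 2 / 2).mean().backward()
opt.step()
opt.zero_grad()

print(torch.allclose(w1.detach(), w2.detach(), atol=1e-6))
```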
    

In [None]:
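lr = 0.01
epochs = 5
# Optional alternative to the hand-written sgd(): to use it, call
# opt.step() and opt.zero_grad() in place of sgd() below.
opt = optim.Adam([w,b], lr=lr)
for epoch in range(epochs):
    for X_batch, y_batch in data_iter(batch_size,X_tensor,y_tensor):
        l = loss(linear(X_batch,w,b),y_batch)
        # Sum the per-sample losses; sgd() divides the gradient by the batch size
        l.sum().backward()
        # Pass the actual batch length, since the final batch may be shorter
        sgd([w,b],lr,len(X_batch))
        # opt.step()  # update parameters
        # opt.zero_grad()  # clear gradients
    with torch.no_grad():
        train_loss = loss(linear(X_tensor,w,b),y_tensor).mean()
        print("epoch",epoch,"\tloss",train_loss.item(),"\n")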
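
The same training loop can be written more compactly with PyTorch's built-in layers. The following is a sketch on synthetic data rather than the housing data; `true_w`, `true_b`, the learning rate, and the batch size are assumptions chosen so that the result is easy to verify. Note that `nn.MSELoss` averages the squared error without the factor of 1/2 used in `loss` above.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)
# Synthetic data with known (hypothetical) parameters
true_w = torch.tensor([[2.0], [-3.4], [1.5]])
true_b = 4.2
features = torch.randn(1000, 3)
labels = features @ true_w + true_b + 0.01 * torch.randn(1000, 1)

loader = DataLoader(TensorDataset(features, labels), batch_size=10, shuffle=True)
net = nn.Linear(3, 1)  # single-layer network: 3 inputs, 1 output
loss_fn = nn.MSELoss()  # mean squared error
opt = torch.optim.SGD(net.parameters(), lr=0.03)

for epoch in range(5):
    for xb, yb in loader:
        opt.zero_grad()  # clear old gradients
        loss_fn(net(xb), yb).backward()  # compute new gradients
        opt.step()  # update parameters

# Evaluate on the full synthetic dataset without tracking gradients
with torch.no_grad():
    final_loss = loss_fn(net(features), labels).item()
print("final loss", final_loss)
```

After training, `net.weight` and `net.bias` should be close to the values used to generate the data.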
