# pytorch

## day1
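## PyTorch core: build a computation graph and differentiate it automatically
### Gradient descent is the key to understanding neural networks
### Steps
- Choose an initial value
- Compute the gradient
- Update the parameter along the negative gradient direction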
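
As a compact reference, the three steps correspond to the standard gradient descent update rule. The symbols $\eta$ (learning rate) and $k$ (iteration index) are notation introduced here, not taken from the notes:

$$
w_0 = \text{initial value}, \qquad g_k = \left.\frac{dJ}{dw}\right|_{w=w_k}, \qquad w_{k+1} = w_k - \eta\, g_k
$$

The minus sign matters: the gradient points uphill, so the step is taken against it.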

## Simple gradient descent

In [79]:
w=1
learning_rate=0.1
epochs=100
#find minimum
J_w=lambda w:w**2+2*w+1
J_w(0)

1

In [80]:
for epoch in range(epochs):
    dw=2*w+2
    w=w-learning_rate*dw
    
w

-0.9999999995925928
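
Why the loop lands so close to $-1$ can be checked by hand: $J(w)=w^2+2w+1=(w+1)^2$ has its minimum at $w=-1$, and each update $w \leftarrow w-0.1(2w+2)$ shrinks the distance $w+1$ by a factor of $0.8$. A minimal sketch that compares the loop with this closed form (the names `w_loop` and `w_closed` are illustrative):

```python
# Gradient descent on J(w) = w**2 + 2*w + 1 = (w + 1)**2
w_loop = 1.0
learning_rate = 0.1
epochs = 100

for _ in range(epochs):
    dw = 2 * w_loop + 2                  # dJ/dw
    w_loop = w_loop - learning_rate * dw

# Closed form: (w_k + 1) = (1 - 2*learning_rate)**k * (w_0 + 1), with w_0 = 1
w_closed = -1.0 + (1 - 2 * learning_rate) ** epochs * (1.0 + 1.0)

print(w_loop, w_closed)
```

Both numbers should agree up to floating-point rounding, which suggests the remaining gap to $-1$ is simply $2 \cdot 0.8^{100}$.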

In [81]:
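# PyTorch implementation
import torch
from torch.autograd import Variable

# Create a torch float tensor, similar to a NumPy array
w=torch.Tensor([1])
# Wrap w as a variable that requires gradients; it is the starting point (leaf) of the computation graph
# (since PyTorch 0.4, torch.tensor([1.], requires_grad=True) does the same without Variable)
w=Variable(w,requires_grad=True)
# w now carries a .grad attribute (None until backward() runs); w.data holds the same values as before
print('grad',w.grad,'data',w.data)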

grad None data tensor([1.])


In [82]:
y=w**2
z=y**2

Computation graph: w -> y -> z

In [83]:
# backward() runs backpropagation and computes the derivative dz/dw
z.backward()

In [84]:
w.grad

tensor([4.])
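
The value 4 agrees with the chain rule: $z=y^2=(w^2)^2=w^4$, so $dz/dw=4w^3=4$ at $w=1$. A small sketch that cross-checks this with a central finite difference in plain Python (the helper `z_of_w` and the step size `h` are illustrative choices):

```python
def z_of_w(w):
    """Same graph as in the notes: w -> y = w**2 -> z = y**2."""
    y = w ** 2
    return y ** 2

w0 = 1.0
h = 1e-5

# Central difference approximation of dz/dw at w0
numeric = (z_of_w(w0 + h) - z_of_w(w0 - h)) / (2 * h)
# Chain rule: dz/dw = dz/dy * dy/dw = (2*y) * (2*w) = 4*w**3
analytic = 4 * w0 ** 3

print(numeric, analytic)
```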

The essence of PyTorch: build a computation graph, then compute gradients automatically.

In [85]:
w=torch.Tensor([10])
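w=Variable(w,requires_grad=True) # this is the essential step
learning_rate=0.1
epochs=100

for epoch in range(epochs):
    J=w**2+2*w+1
    # compute the derivative dJ/dw
    J.backward()
    print('grad',w.grad.data)
    # parameter update using the gradient
    w.data=w.data-learning_rate*w.grad.data
    # backward() adds each new gradient to the one already stored
    w.grad.data.zero_() # so reset the gradient to zero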
    
w.data

grad tensor([22.])
grad tensor([17.6000])
grad tensor([14.0800])
grad tensor([11.2640])
grad tensor([9.0112])
grad tensor([7.2090])
grad tensor([5.7672])
grad tensor([4.6137])
grad tensor([3.6910])
grad tensor([2.9528])
grad tensor([2.3622])
grad tensor([1.8898])
grad tensor([1.5118])
grad tensor([1.2095])
grad tensor([0.9676])
grad tensor([0.7741])
grad tensor([0.6192])
grad tensor([0.4954])
grad tensor([0.3963])
grad tensor([0.3171])
grad tensor([0.2536])
grad tensor([0.2029])
grad tensor([0.1623])
grad tensor([0.1299])
grad tensor([0.1039])
grad tensor([0.0831])
grad tensor([0.0665])
grad tensor([0.0532])
grad tensor([0.0426])
grad tensor([0.0340])
grad tensor([0.0272])
grad tensor([0.0218])
grad tensor([0.0174])
grad tensor([0.0139])
grad tensor([0.0112])
grad tensor([0.0089])
grad tensor([0.0071])
grad tensor([0.0057])
grad tensor([0.0046])
grad tensor([0.0037])
grad tensor([0.0029])
grad tensor([0.0023])
grad tensor([0.0019])
grad tensor([0.0015])
grad tensor([0.0012])
... (output truncated)

tensor([-1.0000])
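
The comment about gradients accumulating is easy to see in isolation. A minimal sketch, assuming PyTorch 0.4 or later (so `torch.tensor(..., requires_grad=True)` is used instead of `Variable`); the names `first`, `accumulated`, and `fresh` are illustrative:

```python
import torch

w = torch.tensor([10.0], requires_grad=True)

# First backward pass: dJ/dw = 2*w + 2 = 22
J = w**2 + 2*w + 1
J.backward()
first = w.grad.clone()

# Second backward pass without zeroing: the new gradient is added to the stored one
J = w**2 + 2*w + 1
J.backward()
accumulated = w.grad.clone()

# After zeroing, a fresh backward pass gives the plain gradient again
w.grad.zero_()
J = w**2 + 2*w + 1
J.backward()
fresh = w.grad.clone()

print(first, accumulated, fresh)
```

Without the `zero_()` call in the training loop, every update would use the sum of all earlier gradients rather than the current one.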

## From one dimension to many

Use gradient descent to minimize $J(w,b)=w^2+b^2$

In [87]:
w=Variable(torch.FloatTensor([[[2],[3]]]),requires_grad=True)
# Otherwise w and b must be defined separately, as follows
#w=Variable(torch.FloatTensor([1]),requires_grad=True)
#b=Variable(torch.FloatTensor([2]),requires_grad=True)
J=torch.sum(w**2)
print(J)
J.backward()
w.grad.data

tensor(13., grad_fn=<SumBackward0>)


tensor([[[4.],
         [6.]]])
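
For reference, the printed values follow directly from the definition. With the two entries of the tensor written as $w_1=2$ and $w_2=3$ (component names introduced here for illustration):

$$
J=\sum_i w_i^2 = 2^2+3^2=13, \qquad \frac{\partial J}{\partial w_i}=2w_i \;\Rightarrow\; \nabla J=(4,\,6)
$$

The gradient has the same shape as `w`, which is why `w.grad` prints as a nested tensor.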

In [92]:
# The quickest way to implement gradient descent in PyTorch
w=Variable(torch.FloatTensor([2,5]),requires_grad=True)
lr=0.1
epochs=1000

for epoch in range(epochs):
    J=torch.sum(w**2)
    J.backward()
    
    # parameter update
    assert(isinstance(w,Variable))
    w.data=w.data-lr*w.grad.data
    w.grad.data.zero_()
    
print('w=',w[0].data.numpy(),'b=',w[1].data.numpy())

w= 3e-45 b= 3e-45
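
In recent PyTorch versions the same loop is usually written without `Variable` and `.data`. A minimal sketch, assuming PyTorch 0.4 or later, where the update is wrapped in `torch.no_grad()` so that it is not recorded in the computation graph (100 epochs are used here only to keep the run short):

```python
import torch

w = torch.tensor([2.0, 5.0], requires_grad=True)
lr = 0.1
epochs = 100

for _ in range(epochs):
    J = torch.sum(w ** 2)
    J.backward()

    # Update in place without tracking the operation in the graph
    with torch.no_grad():
        w -= lr * w.grad
    # Reset the accumulated gradient before the next iteration
    w.grad.zero_()

print(w)
```

Each step multiplies both entries by $1 - 2 \cdot 0.1 = 0.8$, so they decay toward zero just as in the cell above.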


## day2


## Three implementations of a linear model compared
- NumPy
- PyTorch autograd
- PyTorch neural network

In [119]:
# numpy
import numpy as np
x_data=np.array([1,2,3])
y_data=np.array([2,4,6])

epochs=3
lr=0.1
w=0
cost=[]
for epoch in range(epochs):
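    # forward pass: prediction and mean squared error
    yhat=x_data*w
    loss=np.average((yhat-y_data)**2)
    print(loss)
    cost.append(loss)
    # gradient of the loss with respect to w; x_data.shape[0] is the number of samples
    dw=-2*(y_data-yhat)@x_data.T/(x_data.shape[0])
    # this follows from loss=mean((x*w-y)**2): d(loss)/dw=mean(2*(x*w-y)*x)=-2*(y-yhat)@x/n
    # parameter update
    w=w-lr*dw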
    
w

18.666666666666668
0.08296296296296272
0.00036872427983540356


1.9994074074074075
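
The `dw` line comes from differentiating the mean squared error. Writing $n$ for the number of samples and $\hat{y}_i = w x_i$ (notation introduced here):

$$
L(w)=\frac{1}{n}\sum_{i=1}^{n}(w x_i-y_i)^2
\quad\Rightarrow\quad
\frac{dL}{dw}=\frac{2}{n}\sum_{i=1}^{n}(w x_i-y_i)\,x_i=-\frac{2}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)\,x_i
$$

In the code, `(y_data-yhat)@x_data.T` is the sum $\sum_i (y_i-\hat{y}_i)\,x_i$ written as a dot product (for a 1-D array, `.T` changes nothing), and `x_data.shape[0]` is $n$.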

In [127]:
# PyTorch, basic approach (autograd with a manual update)
torch.manual_seed(2)
x_data=Variable(torch.Tensor([[1.0],[2.0],[3.0]]))
y_data=Variable(torch.Tensor([[2.0],[4.0],[6.0]]))

epochs=10
lr=0.1
w=Variable(torch.FloatTensor([0]),requires_grad=True)
cost=[]

for epoch in range(epochs):
    # forward pass, loss, and gradient
    yhat=x_data*w
    loss=torch.mean((yhat-y_data)**2)
    cost.append(loss.data.numpy())
    loss.backward()
    
    # parameter update
    w.data=w.data-lr*w.grad.data
    w.grad.data.zero_()
    
w.data

tensor([2.])
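
That ten epochs are enough to print exactly `2.` can be seen from the update itself: for this data the gradient is $\frac{2}{3}(1^2+2^2+3^2)(w-2)=\frac{28}{3}(w-2)$, so with `lr=0.1` each step multiplies the error $w-2$ by $1-\frac{2.8}{3}=\frac{1}{15}$. A plain-Python sketch of the same recurrence (no PyTorch needed; the variable names are illustrative):

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
lr = 0.1
n = len(x_data)

w = 0.0
history = []
for epoch in range(10):
    # d(loss)/dw for loss = mean((w*x - y)**2)
    dw = sum(2 * (w * x - y) * x for x, y in zip(x_data, y_data)) / n
    w = w - lr * dw
    history.append(w)

print(history[2])   # value after 3 epochs, comparable to the NumPy cell
print(w)            # value after 10 epochs
```

After ten steps the remaining error is about $2 \cdot 15^{-10}$, far below what a float32 tensor can display.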

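The PyTorch class-based approach: this time we use PyTorch's built-in tools throughout. `torch.nn.Module` defines the network, `MSELoss` defines the loss, and `torch.optim` performs gradient descent, which makes linear regression very simple.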

In [129]:
# PyTorch nn.Module version
torch.manual_seed(2)
x_data=Variable(torch.Tensor([[1.0],[2.0],[3.0]]))
y_data=Variable(torch.Tensor([[2.0],[4.0],[6.0]]))

# define the model
class Model(torch.nn.Module):
    def __init__(self):
        # note: writing super(Model.self) here raises
        # AttributeError: type object 'Model' has no attribute 'self'
        super(Model,self).__init__()
        self.linear=torch.nn.Linear(1,1,bias=False)  # one in and one out
    # forward pass
    def forward(self,x):
        y_pred=self.linear(x)
        return y_pred
model=Model()

# define the loss and the optimizer
# reduction='sum' replaces the deprecated size_average=False
criterion=torch.nn.MSELoss(reduction='sum')
# equivalent formula
#criterion=lambda yhat,y:torch.sum((yhat-y)**2)
optimizer=torch.optim.SGD(model.parameters(),lr=0.01)

epochs=20
cost=[]
for epoch in range(epochs):
    # forward pass and loss
    y_pred=model(x_data)
    loss=criterion(y_pred,y_data)
    cost.append(loss.item())
    # reset gradients, backpropagate, then update the parameters
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
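
The "equivalent formula" left as a comment in the cell above can be checked directly. A small sketch, assuming a PyTorch version that accepts `reduction='sum'` (the replacement for the deprecated `size_average=False`); the sample predictions are made up for illustration:

```python
import torch

# Hypothetical predictions and the targets used in the notes
y_pred = torch.tensor([[1.5], [3.0], [7.0]])
y_true = torch.tensor([[2.0], [4.0], [6.0]])

# Summed squared error via the built-in loss module
criterion = torch.nn.MSELoss(reduction='sum')
by_module = criterion(y_pred, y_true)

# The same quantity written out by hand
by_formula = torch.sum((y_pred - y_true) ** 2)

print(by_module.item(), by_formula.item())
```

With the default `reduction='mean'` the module would divide by the number of elements instead, matching `torch.mean((yhat-y)**2)` from the autograd cell.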