In [3]:
# default_exp algo.dl.pytorch

%reload_ext autoreload
%autoreload 2

In [None]:
algo_dl_pytorch

# pytorch


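PyTorch is a machine learning and deep learning tool developed by Facebook's artificial intelligence group.

It is written in Python and C++.

PyTorch is a Python package that provides two high-level features:

     Tensor computation (like NumPy) with strong GPU acceleration

     Deep neural networks built on a tape-based autograd system

You can reuse your favorite Python packages (such as NumPy, SciPy, and Cython) to extend PyTorch when needed.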

https://pytorch.org/

https://github.com/pytorch/pytorch

https://pytorch.org/tutorials/

In [7]:
# !pip install torch -i https://pypi.tuna.tsinghua.edu.cn/simple

!pip freeze | grep torch

torch==1.5.0
torchtext==0.2.3
torchvision==0.4.2


In [1]:
import torch

In [2]:
torch.version.__version__

'1.5.0'

# Deep Learning with PyTorch: A 60 Minute Blitz
https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html



## What is PyTorch?
It’s a Python-based scientific computing package targeted at two sets of audiences:

    A replacement for NumPy to use the power of GPUs
    A deep learning research platform that provides maximum flexibility and speed

### Tensors

Tensors are similar to NumPy’s ndarrays, with the addition being that Tensors can also be used on a GPU to accelerate computing.

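#### Construct a 5x3 matrix, uninitialized:
An uninitialized matrix is declared, but it does not contain definite known values before it is used. Whatever values were in the allocated memory at the time appear as the initial values.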

In [4]:
x = torch.empty(5, 3)
x

tensor([[0.0000e+00, 8.5899e+09, 0.0000e+00],
        [8.5899e+09, 1.2612e-44, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00]])

#### Construct a randomly initialized matrix:

In [5]:
x = torch.rand(5, 3)
x

tensor([[0.7391, 0.9054, 0.3135],
        [0.4242, 0.3435, 0.2045],
        [0.7847, 0.9551, 0.5965],
        [0.8906, 0.2464, 0.2008],
        [0.1333, 0.1737, 0.9430]])

#### Construct a matrix filled with zeros and of dtype long:

In [6]:
x = torch.zeros(5, 3, dtype=torch.long)
x

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])

#### Construct a tensor directly from data:

In [7]:
x = torch.tensor([5.5, 3])
x

tensor([5.5000, 3.0000])

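Or create a tensor based on an existing tensor. These methods reuse properties of the input tensor, e.g. dtype, unless the user provides new values.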

In [8]:
x = x.new_ones(5, 3, dtype=torch.double)      # new_* methods take in sizes
x

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)

In [9]:
x = torch.randn_like(x, dtype=torch.float)    # override dtype!
x

tensor([[-0.0423, -2.0605,  0.8602],
        [-1.4862, -0.5069, -0.6064],
        [ 0.4644,  1.2580, -0.0618],
        [-1.2663, -2.2096,  3.1719],
        [-0.3130,  1.2931, -0.4638]])
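As a minimal sketch of what "reusing the properties of the input tensor" means, the snippet below checks which attributes are inherited. The variable names (`base`, `ones`, `noise`, `noise32`) are illustrative and not part of the original notebook.

```python
import torch

base = torch.zeros(2, 2, dtype=torch.float64)

# new_* methods take a new size but inherit dtype and device from `base`.
ones = base.new_ones(3, 4)

# *_like functions inherit size, dtype, and device from their argument.
noise = torch.randn_like(base)

# Passing dtype explicitly overrides the inherited value.
noise32 = torch.randn_like(base, dtype=torch.float32)

print(ones.dtype, tuple(ones.shape))
print(noise.dtype, tuple(noise.shape))
print(noise32.dtype, tuple(noise32.shape))
```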

#### Get its size:

In [11]:
x.size()

torch.Size([5, 3])

### Operations

In [12]:
y = torch.rand(5, 3)

#### Addition: syntax 1

In [13]:
x + y

tensor([[ 0.7032, -1.2390,  1.5754],
        [-1.3122, -0.1154, -0.2655],
        [ 1.1924,  1.6143,  0.1924],
        [-0.5584, -1.9609,  3.2666],
        [-0.1971,  1.6196, -0.4365]])

#### Addition: syntax 2

In [14]:
torch.add(x, y)

tensor([[ 0.7032, -1.2390,  1.5754],
        [-1.3122, -0.1154, -0.2655],
        [ 1.1924,  1.6143,  0.1924],
        [-0.5584, -1.9609,  3.2666],
        [-0.1971,  1.6196, -0.4365]])

#### Addition: providing an output tensor as argument

In [15]:
result = torch.empty(5, 3)
torch.add(x, y, out=result)
result

tensor([[ 0.7032, -1.2390,  1.5754],
        [-1.3122, -0.1154, -0.2655],
        [ 1.1924,  1.6143,  0.1924],
        [-0.5584, -1.9609,  3.2666],
        [-0.1971,  1.6196, -0.4365]])

#### Addition: in-place
Any operation that mutates a tensor in-place is post-fixed with an `_`. For example, `x.copy_(y)` and `x.t_()` will change `x`.

In [16]:
y.add_(x)
y

tensor([[ 0.7032, -1.2390,  1.5754],
        [-1.3122, -0.1154, -0.2655],
        [ 1.1924,  1.6143,  0.1924],
        [-0.5584, -1.9609,  3.2666],
        [-0.1971,  1.6196, -0.4365]])
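The difference between the out-of-place and in-place forms can be sketched as follows. This is an illustration with made-up values; comparing `data_ptr()` before and after is just one way to confirm that no new storage was allocated.

```python
import torch

x = torch.ones(2, 3)
y = torch.full((2, 3), 2.0)

# Out-of-place: returns a new tensor, x is left unchanged.
z = x.add(y)
x_unchanged = bool((x == 1).all())

# In-place: the trailing underscore means x's own storage is overwritten.
ptr_before = x.data_ptr()
x.add_(y)
same_storage = x.data_ptr() == ptr_before

print(x_unchanged, same_storage)
print(x)
```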

You can use standard NumPy-like indexing with all its features:

In [18]:
x[:, 1]

tensor([-2.0605, -0.5069,  1.2580, -2.2096,  1.2931])

#### Resizing: If you want to resize/reshape a tensor, you can use `Tensor.view`:

In [19]:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
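Two properties of `view` are easy to miss: the result shares storage with the original tensor, and the call needs a compatible memory layout. The sketch below illustrates both under the assumption of a small 4x4 example tensor. `reshape` is shown as the fallback that copies when a view is not possible.

```python
import torch

x = torch.arange(16.0).view(4, 4)
flat = x.view(16)

# A view shares storage with its base tensor: a write through one is visible in the other.
flat[0] = 100.0
shares_memory = x[0, 0].item() == 100.0

# view() needs a compatible memory layout; a transposed 4x4 tensor is not contiguous.
xt = x.t()
try:
    xt.view(16)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() returns a view when it can and falls back to a copy otherwise.
flat_t = xt.reshape(16)

print(shares_memory, view_failed)
```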


If you have a one-element tensor, use `.item()` to get the value as a Python number.

In [20]:
x = torch.randn(1)
print(x)
print(x.item())

tensor([-0.3911])
-0.3911091685295105


### NumPy Bridge

Converting a Torch Tensor to a NumPy array and vice versa is a breeze.

The Torch Tensor and NumPy array will share their underlying memory locations (if the Torch Tensor is on CPU), and changing one will change the other.

#### Converting a Torch Tensor to a NumPy Array

In [21]:
a = torch.ones(5)
print(a)

tensor([1., 1., 1., 1., 1.])


In [22]:
b = a.numpy()
b

array([1., 1., 1., 1., 1.], dtype=float32)

In [24]:
a.add_(1)
print(a)
print(b)



tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


#### Converting NumPy Array to Torch Tensor

In [25]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)


[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
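The memory sharing shown above applies to `torch.from_numpy`. As a short contrast, `torch.tensor` copies its input, so later changes to the array are not reflected. The names `shared` and `copied` are illustrative.

```python
import numpy as np
import torch

a = np.ones(3)

shared = torch.from_numpy(a)  # shares memory with `a`
copied = torch.tensor(a)      # copies the data

a += 1  # in-place NumPy update

print(shared)  # follows the change made to `a`
print(copied)  # keeps the original values
```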


### CUDA Tensors
Tensors can be moved onto any device using the `.to` method.



In [None]:
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

Out:

    tensor([-0.2550], device='cuda:0')
    tensor([-0.2550], dtype=torch.float64)
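A common variation, sketched here as one possible pattern rather than the tutorial's own code, is to pick the device once and fall back to the CPU, so the same cell runs with or without a GPU.

```python
import torch

# Choose the GPU when present, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(3, device=device)
y = torch.ones_like(x)               # created on the same device as x
z = (x + y).to("cpu", torch.double)  # move back and change dtype in one call

# .numpy() only works on CPU tensors, which is why z was moved first.
z_np = z.numpy()
print(device, z_np.dtype)
```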

## Autograd: Automatic Differentiation
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py

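Central to all neural networks in PyTorch is the autograd package. Let's first briefly visit this, and then go on to train our first neural network.

The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.

Let us see this in simpler terms with some examples.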


### Tensor
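`torch.Tensor` is the central class of the package. If you set its attribute `.requires_grad` to `True`, it starts to track all operations on it. When you finish your computation, you can call `.backward()` and have all the gradients computed automatically. The gradient for this tensor will be accumulated into the `.grad` attribute.

To stop a tensor from tracking history, you can call `.detach()` to detach it from the computation history and to prevent future computation from being tracked.

To prevent tracking history (and using memory), you can also wrap the code block in `with torch.no_grad():`. This is particularly helpful when evaluating a model, because the model may have trainable parameters with `requires_grad=True` for which we do not need the gradients.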
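A minimal sketch of both ways to stop history tracking, using a small illustrative tensor. The variable names are not from the original notebook.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

tracked = x * 2              # recorded in the graph, has a grad_fn

with torch.no_grad():
    untracked = x * 2        # computed without recording history

detached = tracked.detach()  # same values, cut off from the graph

print(tracked.requires_grad, untracked.requires_grad, detached.requires_grad)
```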

There’s one more class which is very important for autograd implementation - a Function.

`Tensor` and `Function` are interconnected and build up an acyclic graph that encodes a complete history of computation. Each tensor has a `.grad_fn` attribute that references the `Function` that created it (except for tensors created by the user, whose `grad_fn` is `None`).

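If you want to compute the derivatives, you can call `.backward()` on a `Tensor`. If the `Tensor` is a scalar (i.e. it holds one element of data), you do not need to specify any arguments to `backward()`. If it has more elements, you need to specify a `gradient` argument that is a tensor of matching shape.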
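For the non-scalar case, the sketch below passes a `gradient` tensor of matching shape. The vector `v` and the function `y = x * x` are chosen only for illustration. With this choice the result is the vector-Jacobian product, which works out elementwise to `2 * x * v`.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * x  # non-scalar output, so backward() needs a gradient argument

# Weights for each element of y; must have the same shape as y.
v = torch.tensor([1.0, 0.1, 0.01])
y.backward(v)

# dy_i/dx_i = 2 * x_i, so x.grad equals 2 * x * v elementwise.
expected = 2 * x.detach() * v
print(x.grad)
print(torch.allclose(x.grad, expected))
```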

In [26]:
x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)


In [27]:
y = x + 2
print(y)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)


In [28]:
print(y.grad_fn)

<AddBackward0 object at 0x120f0c198>


In [29]:
z = y * y * 3
out = z.mean()

print(z, out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)


`.requires_grad_( ... )` changes an existing Tensor's `requires_grad` flag in-place. A tensor's flag is `False` unless it was requested at creation, while the argument of `requires_grad_()` itself defaults to `True`.

In [30]:
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

False
True
<SumBackward0 object at 0x120f0cd30>


### Gradients
Let’s backprop now. Because out contains a single scalar, out.backward() is equivalent to out.backward(torch.tensor(1.)).

In [31]:
out

tensor(27., grad_fn=<MeanBackward0>)

In [32]:
out.backward()

Print gradients d(out)/dx

In [33]:
x.grad

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
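The value 4.5 can be checked by hand. Writing the computation above as `out = mean(z)` with `z = 3 * (x + 2)^2` over the four elements of `x`, the derivative at `x_i = 1` works out as follows:

```latex
\begin{aligned}
out &= \frac{1}{4}\sum_{i} z_i, \qquad z_i = 3\,(x_i + 2)^2 \\
\frac{\partial\, out}{\partial x_i} &= \frac{1}{4}\cdot 6\,(x_i + 2) = \frac{3}{2}\,(x_i + 2) \\
\left.\frac{\partial\, out}{\partial x_i}\right|_{x_i = 1} &= \frac{3}{2}\cdot 3 = 4.5
\end{aligned}
```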

## Neural Networks
https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py

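Neural networks can be constructed using the `torch.nn` package.

Now that you have had a glimpse of autograd, note that `nn` depends on autograd to define models and differentiate them. An `nn.Module` contains layers and a method `forward(input)` that returns the output.

The example in this section is a simple feed-forward network. It takes the input, feeds it through several layers one after the other, and then finally gives the output.

A typical training procedure for a neural network is as follows:

Define the neural network that has some learnable parameters (or weights)

Iterate over a dataset of inputs

Process input through the network

Compute the loss (how far the output is from being correct)

Propagate gradients back into the network's parameters

Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient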
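The update rule in the last step can be sketched by hand on a toy problem. Everything here is an assumption made for illustration: the stand-in `nn.Linear` model, the random data, and the learning rate of 0.01. In practice an optimizer from `torch.optim` would normally do this step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(3, 1)        # hypothetical stand-in for a real network
inputs = torch.randn(8, 3)     # dummy inputs
targets = torch.randn(8, 1)    # dummy targets
criterion = nn.MSELoss()
learning_rate = 0.01

# Forward pass, loss, and backward pass.
loss_before = criterion(model(inputs), targets)
model.zero_grad()
loss_before.backward()

# weight = weight - learning_rate * gradient, applied to every parameter.
with torch.no_grad():
    for p in model.parameters():
        p -= learning_rate * p.grad

loss_after = criterion(model(inputs), targets)
print(loss_before.item(), loss_after.item())
```

The update runs under `torch.no_grad()` so that the parameter change itself is not recorded in the autograd graph.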

### Define the network

In [34]:
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)

Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)


You just have to define the forward function, and the backward function (where gradients are computed) is automatically defined for you using autograd. You can use any of the Tensor operations in the forward function.

The learnable parameters of a model are returned by net.parameters()



In [35]:
params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight


10
torch.Size([6, 1, 3, 3])
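The count of 10 comes from five layers that each contribute a weight and a bias. The sketch below rebuilds the same five layers in an `nn.ModuleDict` (an illustrative container, not the `Net` class above) so the names and shapes can be listed with `named_parameters()`.

```python
import torch.nn as nn

# Same layer definitions as in Net, held in a plain container for inspection.
layers = nn.ModuleDict({
    "conv1": nn.Conv2d(1, 6, 3),
    "conv2": nn.Conv2d(6, 16, 3),
    "fc1": nn.Linear(16 * 6 * 6, 120),
    "fc2": nn.Linear(120, 84),
    "fc3": nn.Linear(84, 10),
})

names = []
total = 0
for name, p in layers.named_parameters():
    names.append(name)
    total += p.numel()  # number of scalar values in this parameter
    print(name, tuple(p.shape))

print(len(names), total)
```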


In [38]:
type(params[0])

torch.nn.parameter.Parameter

In [37]:
params[0]

Parameter containing:
tensor([[[[ 0.0294, -0.1984, -0.3040],
          [-0.1912,  0.2660, -0.0095],
          [-0.1220,  0.0849, -0.1303]]],


        [[[ 0.3041, -0.0132,  0.3122],
          [-0.0951,  0.0540, -0.0318],
          [ 0.3284, -0.2981, -0.2515]]],


        [[[-0.2222, -0.1646, -0.2835],
          [-0.2663, -0.2340,  0.1298],
          [ 0.0592, -0.0970,  0.0317]]],


        [[[ 0.2669,  0.2694,  0.1669],
          [ 0.1060, -0.3272, -0.3226],
          [-0.0579, -0.0754, -0.2268]]],


        [[[-0.0391, -0.2034,  0.2577],
          [-0.2776, -0.0932,  0.2575],
          [-0.2209, -0.2535,  0.2967]]],


        [[[-0.3170, -0.0817, -0.2218],
          [ 0.0960, -0.1737, -0.2646],
          [ 0.0075, -0.0719, -0.1432]]]], requires_grad=True)

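Let's try a random 32x32 input. Note: the expected input size of this network (LeNet) is 32x32. To use this network on the MNIST dataset, resize the images from the dataset to 32x32.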

In [39]:
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

tensor([[ 0.0570, -0.1429,  0.0542, -0.0648, -0.0954,  0.0221, -0.1047, -0.0779,
         -0.0952,  0.0130]], grad_fn=<AddmmBackward>)
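The `16 * 6 * 6` input size of `fc1` follows from the 32x32 input: each 3x3 convolution without padding removes 2 pixels per side length, and each 2x2 max pool halves it (rounding down). A short sketch that traces the shapes, using standalone layers with the same settings as `Net`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 6, 3)
conv2 = nn.Conv2d(6, 16, 3)

x = torch.randn(1, 1, 32, 32)
shapes = [tuple(x.shape)]

x = conv1(x)                    # 32 -> 30
shapes.append(tuple(x.shape))
x = F.max_pool2d(F.relu(x), 2)  # 30 -> 15
shapes.append(tuple(x.shape))
x = conv2(x)                    # 15 -> 13
shapes.append(tuple(x.shape))
x = F.max_pool2d(F.relu(x), 2)  # 13 -> 6 (rounded down)
shapes.append(tuple(x.shape))

for s in shapes:
    print(s)

flat_features = x.view(1, -1).size(1)
print(flat_features)
```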


In [40]:
# Zero the gradient buffers of all parameters and backprop with random gradients:
net.zero_grad()
out.backward(torch.randn(1, 10))

### Loss Function

In [41]:
output = net(input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)


tensor(0.9588, grad_fn=<MseLossBackward>)
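With its default settings, `nn.MSELoss` computes the mean of the squared elementwise differences. A small sketch with hand-picked values (chosen here only for illustration) compares it against the same quantity written out manually:

```python
import torch
import torch.nn as nn

output = torch.tensor([[0.5, -1.0, 2.0]])
target = torch.tensor([[0.0, 1.0, 2.0]])

loss = nn.MSELoss()(output, target)

# Squared differences are 0.25, 4.0 and 0.0; their mean is 4.25 / 3.
manual = ((output - target) ** 2).mean()

print(loss.item(), manual.item())
```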


## Training a Classifier
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py

