In [1]:
import os
import random
import math
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.optim as optim

# Tool tips

## vscode

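1. Go to the definition of a function or class:
`Ctrl` + left-click on the function or class name jumps to its definition; pressing `F12` on the name does the same.


2. Rename refactoring:
Press `F2` on a variable name to rename that variable.


3. Extract method:
Select a block of code and a light-bulb icon appears to its left; click it to extract the selection into a separate function.


4. Python breakpoint debugging:
Click to the left of a line number to set a breakpoint; the debug panel on the left shows how variables change.


5. Find where a function is called:
Select the function (or place the cursor on it) and press `Shift + F12` to see everywhere it is called. Quite handy.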

## jupyter notebook

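- Recover code you wrote earlier:

  Scenario: you wrote a lot of code in a notebook, then deleted many cells, and now want the old code back.
  
  Solution: run `history` in a cell to print the code history (only code that was actually executed is shown).
  
- Move selected cells

  Move the selected cells with the keyboard shortcuts `Alt + Up` and `Alt + Down`.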

## Windows 10

- Sticky Notes: Go to the Windows Ink Workspace > Sticky Notes to create reminders for yourself. 

- Stay focused: Select and hold the window you want to stay open, then give your mouse (or finger) a little back-and-forth shake.

# Text data processing

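## \r and \n
`\r` is a carriage return and `\n` is a line feed: the former moves the cursor to the start of the line, the latter moves it down one line.

On Windows, pressing Enter usually produces both together, i.e. `\r\n`, so `\r\n` counts as two characters.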
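
As a quick check of the point above, the sketch below counts the characters in `\r\n` and shows two common ways to deal with Windows-style line endings. The sample string `windows_text` is made up for illustration.

```python
# "\r\n" is two characters: a carriage return followed by a line feed.
windows_text = "first line\r\nsecond line\r\n"
print(len("\r\n"))            # 2

# splitlines() treats "\n", "\r\n" and "\r" as line boundaries.
lines = windows_text.splitlines()
print(lines)                  # ['first line', 'second line']

# Normalize to Unix-style line endings.
unix_text = windows_text.replace("\r\n", "\n")
print(repr(unix_text))        # 'first line\nsecond line\n'
```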

## str.strip() and str.split()
- Split on a character, e.g. '.':
```python
>>> s = 'www.google.com'
>>> print(s)
www.google.com
>>> s_split = s.split('.')
>>> print(s_split) # the result is a list
['www', 'google', 'com']
```

- Split on a character at most n times, e.g. split on '.' once:
```python
>>> s_split = s.split('.', 1)
>>> print(s_split)
['www', 'google.com']
```

- split returns a list, so [0] takes its first element: 
```python
>>> s_split = s.split('.')[0]
>>> print(s_split)
www
```
> `str.split()` only accepts a literal separator; to split on a regular expression, use `re.split()`.

------
- **I once used these two lines to take the first 200 words of a string (words separated by spaces) and join them back into a string**
```python
text = s.lower().split(' ')[:200]
text = ' '.join(text) # turn a list of strings into a single string
```
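
The heading mentions `strip()`, which the examples above do not show, so here is a small sketch of it together with regex-based splitting via the standard-library `re.split`. The sample strings are made up for illustration.

```python
import re

raw = "  www.google.com \n"
clean = raw.strip()               # removes leading and trailing whitespace
print(repr(clean))                # 'www.google.com'

# strip(chars) removes any of the given characters from both ends.
trimmed = "xxhixx".strip("x")
print(trimmed)                    # hi

# re.split splits on a pattern: here one or more spaces, commas or semicolons.
parts = re.split(r"[ ,;]+", "a, b;c  d")
print(parts)                      # ['a', 'b', 'c', 'd']
```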

## str.title(): convert a string to title case

In [21]:
my_string = "10 awesome python tricks"
print(my_string.title())

10 Awesome Python Tricks


# PyTorch notes

## Useful ways to inspect a model

### net.parameters( )

In [2]:
class Generator(nn.Module):
    def __init__(self, num_emb, emb_dim, hidden_dim, use_cuda):
        super(Generator, self).__init__()
        self.num_emb = num_emb
        self.emb_dim = emb_dim
        self.hidden_dim = hidden_dim
        self.use_cuda = use_cuda
        self.emb = nn.Embedding(num_emb, emb_dim) 
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True) 
        self.lin = nn.Linear(hidden_dim, num_emb)
        self.softmax = nn.LogSoftmax(dim = 1)
        self.init_params()

    def forward(self, x):
        emb = self.emb(x)
        h0, c0 = self.init_hidden(x.size(0))
        output, (h, c) = self.lstm(emb, (h0, c0)) 
        pred = self.softmax(self.lin(output.contiguous().view(-1, self.hidden_dim))) 
        return pred

    def init_hidden(self, batch_size):
        h = Variable(torch.zeros((1, batch_size, self.hidden_dim)))
        c = Variable(torch.zeros((1, batch_size, self.hidden_dim)))
        if self.use_cuda:
            h, c = h.cuda(), c.cuda()
        return h, c
    
    def init_params(self):
        for param in self.parameters():
            param.data.uniform_(-0.05, 0.05)

generator = Generator(num_emb = 5000,emb_dim = 128,hidden_dim = 64,use_cuda = True)
generator = generator.cuda()
x = torch.LongTensor([[2,50,100],
                      [40,3,1000]]).cuda()
pred = generator(x)
params = list(generator.parameters())
len(params)

7

In [3]:
# Examining a Model's Structure
print(generator)

Generator(
  (emb): Embedding(5000, 128)
  (lstm): LSTM(128, 64, batch_first=True)
  (lin): Linear(in_features=64, out_features=5000, bias=True)
  (softmax): LogSoftmax()
)


In [3]:
for name,parameters in generator.named_parameters():
    print(name,':',parameters.size())
# use parameters.data if you need the values

emb.weight : torch.Size([5000, 128])
lstm.weight_ih_l0 : torch.Size([256, 128])
lstm.weight_hh_l0 : torch.Size([256, 64])
lstm.bias_ih_l0 : torch.Size([256])
lstm.bias_hh_l0 : torch.Size([256])
lin.weight : torch.Size([5000, 64])
lin.bias : torch.Size([5000])


In [4]:
pred

tensor([[-8.4876, -8.4922, -8.5338,  ..., -8.5332, -8.5663, -8.5036],
        [-8.4900, -8.4903, -8.5346,  ..., -8.5317, -8.5671, -8.5047],
        [-8.4903, -8.4888, -8.5338,  ..., -8.5304, -8.5693, -8.5053],
        [-8.4885, -8.4912, -8.5348,  ..., -8.5337, -8.5656, -8.5034],
        [-8.4893, -8.4897, -8.5355,  ..., -8.5316, -8.5680, -8.5050],
        [-8.4893, -8.4881, -8.5349,  ..., -8.5301, -8.5694, -8.5057]],
       device='cuda:0', grad_fn=<LogSoftmaxBackward>)

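## The ubiquitous retain_graph parameter

By default, every call to backward() frees the whole computation graph, so the graph has to be rebuilt by a new forward() before the next backward(). Normally each iteration needs just one forward() and one backward(); the two come in pairs, and a single backward() is usually enough. However, with custom losses and similar setups you may need one forward() followed by several backward() calls on different losses, accumulating gradients on the same network before updating its parameters. So if you want to run another backward() after the current one without running forward() again, tell the current backward() to keep the graph: backward(retain_graph=True).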

In [1]:
import torch
x = torch.randn(1, requires_grad=True)
print("x = ",x,"\n",
     "x.grad = ",x.grad,"\n")
y = x*2
y.backward()
print("x's grad after the first operation:\n",x.grad,"\n")
y.backward()
print("x's grad:\n",x.grad,"\n")
# What are "buffers"? This error is a good way to understand what retain_graph does

x =  tensor([-0.4168], requires_grad=True) 
 x.grad =  None 

x's grad after the first operation:
 tensor([2.]) 



RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
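
The error message itself suggests the fix. A minimal sketch, assuming the same `y = x * 2` setup but starting from `torch.ones` so the printed values are deterministic:

```python
import torch

x = torch.ones(1, requires_grad=True)
y = x * 2

y.backward(retain_graph=True)   # keep the graph so backward can run again
first = x.grad.clone()
print(first)                    # tensor([2.])

y.backward()                    # no error: the graph was retained
second = x.grad.clone()
print(second)                   # tensor([4.]), because gradients accumulate

x.grad.zero_()                  # reset the accumulated gradient
```

The second value is 4 rather than 2 because `backward()` adds to `x.grad` instead of overwriting it, which is why gradients are normally zeroed between iterations.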

To understand the word `buffer` in the error above, see this passage from the Backprop section of the official tutorial:
> To backpropagate the error all we have to do is to `loss.backward()`. You need to clear the existing gradients though(use`net.zero_grad()`to zeroes the gradient **buffers** of all parameters), else gradients will be accumulated to existing gradients.

## torch.Tensor.detach()

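If x is an intermediate output, y = x.detach() creates a tensor with the same values as x but with requires_grad == False. In effect it drops the computation graph (grad_fn) behind y, so y becomes a leaf node. Backpropagation that used to continue past x now stops when it reaches y and goes no further. Also note that x and y share the same underlying data, i.e. x.data. detach_() does not create a new variable; it modifies x itself in place.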
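
The properties described above can be checked directly. This is a small sketch with made-up tensors `w`, `x` and `y`:

```python
import torch

w = torch.ones(3, requires_grad=True)
x = w * 2                  # intermediate result with a grad_fn
y = x.detach()             # same values, cut off from the graph

print(x.requires_grad, x.grad_fn is not None)   # True True
print(y.requires_grad, y.grad_fn, y.is_leaf)    # False None True

# x and y share storage: an in-place change to y is visible through x.
y[0] = 100.0
print(x[0].item())                              # 100.0
```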

## torch.nn.functional.pad(input, pad, mode='constant', value=0)
> Pads a tensor. I have only tried padding 2-D tensors.

* Signature: torch.nn.functional.pad(input, pad, mode='constant', value=0)
  - **input (Tensor)** – 2-dimensional tensor
  - **pad (tuple)** – 2-element tuple; I have only tried `(0, i)`
  - **mode** - I have only used the default, `constant`
  - **value** - fill value for padding. Default: `0`

In [3]:
a = Variable(nn.init.constant_(torch.zeros(3, 5), 2)).long()
print("a = ",a,"\n"
     "a.size = ",a.size(),"\n")
b = nn.functional.pad(a, (0,20 - 5), value=3)
print("b = ",b,"\n"
     "b.size = ",b.size(),"\n")

a =  tensor([[2, 2, 2, 2, 2],
        [2, 2, 2, 2, 2],
        [2, 2, 2, 2, 2]]) 
a.size =  torch.Size([3, 5]) 

b =  tensor([[2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
        [2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
        [2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]]) 
b.size =  torch.Size([3, 20]) 
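
The notes only try `(0, i)`. As a small extension, the sketch below pads both sides of the last dimension and then uses a 4-element tuple, which pads the last two dimensions in the order (left, right, top, bottom). The shapes are chosen arbitrarily for illustration.

```python
import torch
import torch.nn.functional as F

a = torch.ones(3, 5)

# A 2-tuple pads the last dimension: (left, right).
b = F.pad(a, (1, 2), value=0)
print(b.shape)                 # torch.Size([3, 8])

# A 4-tuple pads the last two dimensions: (left, right, top, bottom).
c = F.pad(a, (1, 2, 0, 1), value=0)
print(c.shape)                 # torch.Size([4, 8])
```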



## torch.nn.init.constant_()

> Fills the input Tensor with the value val.

* Signature: torch.nn.init.constant_(tensor, val)
  - **tensor** – an n-dimensional torch.Tensor
  - **val** – the value to fill the tensor with

* Examples
```python
>>> w = torch.empty(3, 5)
>>> nn.init.constant_(w, 0.3)
```

## TensorDataset() & DataLoader()

In [4]:
x = torch.linspace(1,10,10) # linspace(start, end, steps): 10 evenly spaced values from 1 to 10
y = torch.linspace(10,1,10)

import torch.utils.data as Data
torch_dataset = Data.TensorDataset(x,y)
loader = Data.DataLoader(
    dataset = torch_dataset,
    batch_size = 3,
    shuffle = True,
    num_workers = 4)

for epoch in range(3):
    for step,(batch_x,batch_y) in enumerate(loader):
        print('Epoch:',epoch,'|Step:',step,'|batch_x:',batch_x.numpy(),
             '|batch_y:',batch_y.numpy())

Epoch: 0 |Step: 0 |batch_x: [5. 7. 1.] |batch_y: [ 6.  4. 10.]
Epoch: 0 |Step: 1 |batch_x: [ 4. 10.  2.] |batch_y: [7. 1. 9.]
Epoch: 0 |Step: 2 |batch_x: [6. 9. 3.] |batch_y: [5. 2. 8.]
Epoch: 0 |Step: 3 |batch_x: [8.] |batch_y: [3.]
Epoch: 1 |Step: 0 |batch_x: [5. 9. 2.] |batch_y: [6. 2. 9.]
Epoch: 1 |Step: 1 |batch_x: [4. 6. 8.] |batch_y: [7. 5. 3.]
Epoch: 1 |Step: 2 |batch_x: [1. 3. 7.] |batch_y: [10.  8.  4.]
Epoch: 1 |Step: 3 |batch_x: [10.] |batch_y: [1.]
Epoch: 2 |Step: 0 |batch_x: [4. 7. 9.] |batch_y: [7. 4. 2.]
Epoch: 2 |Step: 1 |batch_x: [6. 2. 3.] |batch_y: [5. 9. 8.]
Epoch: 2 |Step: 2 |batch_x: [1. 5. 8.] |batch_y: [10.  6.  3.]
Epoch: 2 |Step: 3 |batch_x: [10.] |batch_y: [1.]


## torch.Tensor.detach()

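Suppose we have model A and model B, and we want to feed the output of A into B while training only B. We can do this:
```python
input_B = output_A.detach()
```
This cuts the gradient flow between the two computation graphs, which gives exactly the behavior we need.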
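
A minimal sketch of this pattern, using two small `nn.Linear` layers as stand-ins for the real models (the names `model_A` and `model_B` and the layer sizes are assumptions for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model_A = nn.Linear(4, 3)      # stands in for the upstream model
model_B = nn.Linear(3, 1)      # the only model we want to train

x = torch.randn(2, 4)
output_A = model_A(x)
input_B = output_A.detach()    # gradient flow stops here

loss = model_B(input_B).sum()
loss.backward()

print(model_A.weight.grad)               # None: A received no gradient
print(model_B.weight.grad is not None)   # True: B did
```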

## torch.nn.Dropout(p=0.5, inplace=False)
During training, randomly zeroes some of the elements of the input tensor with probability `p` using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.

Furthermore, the outputs are scaled by a factor of $\frac{1}{1-p}$ during training. This means that during evaluation the module simply computes an identity function.

* Parameters:
  - **p** – probability of an element to be zeroed. Default: 0.5
  - **inplace** – If set to True, will do this operation in-place. Default: False
* Shape:
  - Input: (*).Input can be of any shape
  - Output: (*).Output is of the same shape as input
  
> **NOTE:**
`Dropout` should take place only during training. If it was happening during inference time, you'd lose a chunk of your network's reasoning power, which is not what we want! Thankfully, PyTorch's implementation of `Dropout` works out which mode you're running in and passes all the data through the Dropout layer at inference time.

In [3]:
m = nn.Dropout(p=0.2)
input_1 = torch.randn(4, 5)
output_1 = m(input_1)
output_2 = input_1 * (1 / (1 - 0.2))
print(input_1,'\n',output_1,'\n',output_2) # note that the non-zero entries of output_1 equal those of output_2

tensor([[-0.4651, -1.5681, -0.1346,  0.3834,  0.5632],
        [ 0.6371, -0.0706,  1.1372,  0.7815, -1.2001],
        [ 0.5383,  0.5353, -0.2802, -0.9498, -1.0700],
        [ 1.2053, -0.4515,  0.7759, -0.8358,  0.1974]]) 
 tensor([[-0.5814, -0.0000, -0.1683,  0.0000,  0.7040],
        [ 0.7964, -0.0883,  1.4215,  0.9769, -1.5001],
        [ 0.6728,  0.6692, -0.0000, -1.1872, -1.3375],
        [ 0.0000, -0.0000,  0.9699, -1.0447,  0.2467]]) 
 tensor([[-0.5814, -1.9601, -0.1683,  0.4793,  0.7040],
        [ 0.7964, -0.0883,  1.4215,  0.9769, -1.5001],
        [ 0.6728,  0.6692, -0.3502, -1.1872, -1.3375],
        [ 1.5066, -0.5644,  0.9699, -1.0447,  0.2467]])
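
The note about training versus inference can be seen by switching the module's mode. A small sketch on an all-ones input, so that surviving entries are exactly 1/(1-p) = 1.25:

```python
import torch
import torch.nn as nn

m = nn.Dropout(p=0.2)
x = torch.ones(2, 5)

m.train()                          # training mode: zero some entries, scale the rest
out_train = m(x)
print(out_train)                   # every entry is either 0 or 1.25

m.eval()                           # evaluation mode: identity function
out_eval = m(x)
print(torch.equal(out_eval, x))    # True
```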


## torch.nn.Conv2d()
**Parameters:**
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

Note that the input is expected to be 4-dimensional: batch_size * in_channels * x * y 
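
A short shape check for the 4-D input layout. The channel counts and image size are arbitrary; with `kernel_size=3` and `padding=1` the spatial size is preserved.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=1)

x = torch.randn(4, 3, 32, 32)      # (batch_size, in_channels, height, width)
y = conv(x)
print(y.shape)                     # torch.Size([4, 8, 32, 32])

# A single image of shape (3, 32, 32) gets a batch dimension via unsqueeze(0).
single = torch.randn(3, 32, 32).unsqueeze(0)
single_out = conv(single)
print(single_out.shape)            # torch.Size([1, 8, 32, 32])
```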

## nn.CrossEntropyLoss() and nn.NLLLoss()

- **Softmax Activation Function:**

$$S(f_{y_i}) = \dfrac{e^{f_{y_i}}}{\sum_{j}e^{f_j}}$$

> In practice, the softmax function is used with the `negative log-likelihood (NLL)`

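- **NLLLoss (negative log-likelihood loss):**

$$L(\mathbf{y}) = -\log(\mathbf{y})$$

> Note that $y=-\log(x)$ is monotonically decreasing on $(0,1]$, so the smaller the argument, the larger the loss, and the larger the argument, the smaller the loss. Hence higher confidence (which comes from the softmax activation function) at the correct class leads to lower loss and vice versa.

- **CrossEntropyLoss:**

$Y = (y_1,y_2,...)$ is the target, a one-hot label (each entry is 0 or 1; this is the mathematical form, whereas in code the target is simply an integer between 0 and $C-1$), and $P = (P_1,P_2,...)$ is the prediction output by the softmax layer.

$$E = -\sum_{i=1}^{n} y_{i} \log(P_{i})$$

**The PyTorch code is explained below; note the difference between the mathematical formula and the code:**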

CrossEntropyLoss() = log_softmax() + NLLLoss()：
> This criterion combines `nn.LogSoftmax()` and `nn.NLLLoss()` in one single class.

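$$\text{CrossEntropyLoss}(X, class) = -\log\frac{e^{x_{class}}}{\sum_{j} e^{x_j}}$$

> Input: (N, C), where N = batch_size and C = num_classes. Each data point, i.e. each row, is the X above.

> Target: (N). Note that this is (N,), not (N, 1); use target.view(-1) if needed. It is not a one-hot label; each entry satisfies $0 \leq target[i]\leq C-1$.

> Output: scalar. Only the log-probability of the target class $x_{class}$ enters the loss, because in the mathematical expression all other $y_{i}$ are 0.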

In [3]:
target = torch.LongTensor([1,0,0,1]) # class indices; with 2 columns in pred, only 0 and 1 are valid
pred = torch.Tensor([-988,-0.01,-0.0005,-1080,-0.009,-3880,-180,-0.001]).view(4,2)
# Changing the line above to pred = torch.Tensor([-98,-0.01,-0.0005,-100,-0.009,-300,-10,-0.001]).view(4,2) gives the same result
dis_criterion = nn.NLLLoss(reduction='sum')
a = dis_criterion(pred,target)
a

tensor(0.0205)

In [4]:
pred

tensor([[-9.8800e+02, -1.0000e-02],
        [-5.0000e-04, -1.0800e+03],
        [-9.0000e-03, -3.8800e+03],
        [-1.8000e+02, -1.0000e-03]])

In [6]:
loss = nn.CrossEntropyLoss()
input1 = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5) # random_(5) because each row has 5 predictions
input1

tensor([[-0.0444,  0.0298,  0.4169,  1.1406, -0.8716],
        [-1.2420,  0.8809, -1.2497, -0.5093,  0.1507],
        [ 0.1278, -1.8586, -0.4689,  0.1224, -0.0527]], requires_grad=True)

In [7]:
target

tensor([2, 4, 4])

In [8]:
aaa = loss(input1,target)
aaa

tensor(1.4608, grad_fn=<NllLossBackward>)

In [9]:
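# Modify input1 to match target so that the loss becomes smaller
with torch.no_grad():  # recent PyTorch rejects in-place edits of a leaf that requires grad outside no_grad
    input1[0][2],input1[1][4],input1[2][4] = 10,10,10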
bbb = loss(input1,target)
bbb

tensor(0.0002, grad_fn=<NllLossBackward>)
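
The identity CrossEntropyLoss() = log_softmax() + NLLLoss() stated earlier in this section can be checked numerically. A minimal sketch with made-up logits and targets:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)                 # input of shape (N, C)
target = torch.tensor([0, 2, 1, 2])        # shape (N,), class indices in [0, C-1]

ce = nn.CrossEntropyLoss()(logits, target)

log_probs = F.log_softmax(logits, dim=1)
nll = nn.NLLLoss()(log_probs, target)

# Manual version: pick the log-probability of the target class in each row.
manual = -log_probs[torch.arange(4), target].mean()

print(torch.allclose(ce, nll), torch.allclose(ce, manual))   # True True
```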

## torch.nn.LSTM & torch.nn.LSTMCell 

* **Inputs**: input, (h_0, c_0)
  - `input` of shape (seq_len, batch, input_size)
  - `h_0` of shape (num_layers * num_directions, batch, hidden_size)
  - `c_0` of shape (num_layers * num_directions, batch, hidden_size)
  - If (h_0, c_0) is not provided, both h_0 and c_0 default to zero.
* **Outputs**: output, (h_n, c_n)
  - `output` of shape (seq_len, batch, num_directions * hidden_size)
  - `h_n` of shape (num_layers * num_directions, batch, hidden_size)
  - `c_n` of shape (num_layers * num_directions, batch, hidden_size)

In [2]:
# An LSTM with input_size = 10, hidden_size = 20, num_layers = 1
rnn = nn.LSTM(10,20,1,batch_first = True) # batch_first does not affect h and c; it affects input and output
input_1 = torch.randn(3, 5, 10)
h0 = torch.randn(1, 3, 20)
c0 = torch.randn(1, 3, 20)
output, (hn, cn) = rnn(input_1, (h0, c0))
output.size() #[batch_size,seq_len,feature]

torch.Size([3, 5, 20])

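Why do the RNN classes default to the counterintuitive convention of not putting the batch dimension first? I think a code snippet from Chapter 6 of a book I rate very highly explains it (it makes stepping through time with RNNCell convenient):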
```python
class ElmanRNN(nn.Module):
    """ an Elman RNN built using the RNNCell """
    def __init__(self, input_size, hidden_size, batch_first=False):
        """
        Args:
            input_size (int): size of the input vectors
            hidden_size (int): size of the hidden state vectors
            batch_first (bool): whether the 0th dimension is batch
        """
        super(ElmanRNN, self).__init__()
        
        self.rnn_cell = nn.RNNCell(input_size, hidden_size)
        
        self.batch_first = batch_first
        self.hidden_size = hidden_size

    def _initial_hidden(self, batch_size):
        return torch.zeros((batch_size, self.hidden_size))

    def forward(self, x_in, initial_hidden=None):
        """The forward pass of the ElmanRNN
        
        Args:
            x_in (torch.Tensor): an input data tensor. 
                If self.batch_first: x_in.shape = (batch, seq_size, feat_size)
                Else: x_in.shape = (seq_size, batch, feat_size)
            initial_hidden (torch.Tensor): the initial hidden state for the RNN
        Returns:
            hiddens (torch.Tensor): The outputs of the RNN at each time step. 
                If self.batch_first: hiddens.shape = (batch, seq_size, hidden_size)
                Else: hiddens.shape = (seq_size, batch, hidden_size)
        """
        if self.batch_first:
            batch_size, seq_size, feat_size = x_in.size()
            x_in = x_in.permute(1, 0, 2)
        else:
            seq_size, batch_size, feat_size = x_in.size()
        # Whatever the input layout, this if/else leaves x_in as (seq_size, batch_size, feat_size)
        hiddens = []

        if initial_hidden is None:
            initial_hidden = self._initial_hidden(batch_size)
            initial_hidden = initial_hidden.to(x_in.device)

        hidden_t = initial_hidden
                    
        for t in range(seq_size):
            hidden_t = self.rnn_cell(x_in[t], hidden_t) # probably why batch is not first: x_in[t] is the whole batch at step t
            hiddens.append(hidden_t)
            
        hiddens = torch.stack(hiddens)

        if self.batch_first:
            hiddens = hiddens.permute(1, 0, 2)

        return hiddens
```
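
The heading also mentions `nn.LSTMCell`, so here is a sketch of the same step-by-step idea with an LSTM: an `nn.LSTMCell` is looped over time and compared with `nn.LSTM` after copying the weights across. The sizes are arbitrary, and the tolerance in `allclose` is an assumption to absorb floating-point differences.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=1)   # batch_first=False
cell = nn.LSTMCell(input_size=10, hidden_size=20)

# Copy the single layer's weights so both modules compute the same function.
cell.load_state_dict({
    "weight_ih": lstm.weight_ih_l0, "weight_hh": lstm.weight_hh_l0,
    "bias_ih": lstm.bias_ih_l0, "bias_hh": lstm.bias_hh_l0,
})

x = torch.randn(5, 3, 10)          # (seq_len, batch, input_size)
h = torch.zeros(3, 20)
c = torch.zeros(3, 20)

steps = []
for t in range(x.size(0)):         # x[t] is the whole batch at time step t
    h, c = cell(x[t], (h, c))
    steps.append(h)
manual_output = torch.stack(steps)

output, (hn, cn) = lstm(x)         # initial states default to zeros
print(output.shape)                                       # torch.Size([5, 3, 20])
print(torch.allclose(output, manual_output, atol=1e-5))   # True
```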

## torch.multinomial()

> torch.multinomial(input, num_samples,replacement=False, out=None) → LongTensor

```python
>>> weights = torch.Tensor([0, 10, 3, 0]) # create a Tensor of weights
>>> torch.multinomial(weights, 4) # re-running this gave only two results in the version used here, [1 2 0 0] and [2 1 0 0], with [1 2 0 0] the more common
 1
 2
 0
 0
[torch.LongTensor of size 4]
 
>>> torch.multinomial(weights, 4, replacement=True)
 1
 2
 1
 2
[torch.LongTensor of size 4]
```
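- The input tensor can be viewed as a tensor of weights: each element is the weight of its index within its row. In the run above, an element with weight 0 is not drawn until all non-zero elements have been used up; note that the PyTorch documentation limits num_samples by the number of non-zero elements when replacement is False.
- num_samples is the number of draws per row. Without replacement it cannot exceed the number of elements in each row, otherwise an error is raised.
- replacement controls whether sampling is with replacement: True means with replacement, False without.
- For a 2-D input the result is also 2-D, with as many rows as the input and num_samples columns, i.e. every row is sampled num_samples times in the same way as a 1-D tensor.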
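
A small sketch of the 2-D case with replacement, using made-up weights. No exact sample values are shown because they depend on the random number generator; only the shape and the "zero weight is never drawn" property are checked.

```python
import torch

torch.manual_seed(0)
weights = torch.tensor([[0.0, 10.0, 3.0, 0.0],
                        [5.0, 0.0, 0.0, 5.0]])

# With replacement, num_samples may exceed the number of columns.
samples = torch.multinomial(weights, 6, replacement=True)
print(samples.shape)                       # torch.Size([2, 6])

# Indices whose weight is 0 never appear in the corresponding row.
row0 = set(samples[0].tolist())
row1 = set(samples[1].tolist())
print(row0 <= {1, 2}, row1 <= {0, 3})      # True True
```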

# Python

## One-line conditional expressions

**Format:** `[on_true] if [expression] else [on_false]`

**Example:**
```python
x = "Success!" if (y==2) else "Failed!"
```
------
**Several conditions can also be chained:**
```python
x = int(input())
if x >= 10:
    print("horse")
elif 1 < x < 10:
    print("Duck")
else:
    print("other")
```
The code above can be written in a single line:

`print('horse' if x >= 10 else "Duck" if 1 < x < 10 else "other")`
    

## assert statement

* Format: assert expression [, arguments]

```python
>>> assert 1==2, '1 is not equal to 2'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: 1 is not equal to 2
```

## \*args和\**kwargs

In [2]:
'''
 *args packs extra positional arguments into a tuple for the function body
 **kwargs packs keyword arguments into a dict for the function body
'''
def function(arg,*args,**kwargs):
    print(arg,args,kwargs)

function(6,7,8,9,a=1, b=2, c=3)

6 (7, 8, 9) {'a': 1, 'b': 2, 'c': 3}


## enumerate(str)

In [3]:
surname = "jian" 
for position_index, character in enumerate(surname):
    print(position_index,character)

0 j
1 i
2 a
3 n


# Different types

## Converting between types

- **Tensor and NumPy array:**

  Tensor ----> NumPy: use data.numpy(), where data is a Tensor

  NumPy ----> Tensor: use torch.from_numpy(data), where data is a NumPy array
  
  

- **list and numpy.array:**

  temp = np.array(list) 

  arr = temp.tolist() 

In [4]:
import numpy as np
import torch
new_array_1 = np.array([1,2, 3,4, 5,6])
new_array_2 = np.array([2, 3,4, 5,6,7])
tensor_1 = torch.from_numpy(new_array_1)
tensor_2 = torch.from_numpy(new_array_2)
tensor_1.numpy()[:-1]

array([1, 2, 3, 4, 5])
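
One detail worth knowing about these conversions is that `torch.from_numpy` and `Tensor.numpy()` share memory with the original object, whereas `torch.tensor` makes a copy. A minimal sketch with made-up values:

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy

arr[0] = 100
print(shared[0].item())          # 100
print(copied[0].item())          # 1

back = shared.numpy()            # also shares memory with the tensor
shared[1] = 200
print(back[1], arr[1])           # 200 200

as_list = shared.tolist()        # plain Python list
print(as_list)                   # [100, 200, 3]
```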

## list (some string operations are in the "Text data processing" section)

### List comprehensions
```python
[ expression for item in list if conditional ]
```

In [11]:
mylist = [i for i in range(10)]
squares = [x**2 for x in range(10)]
def my_function(a):
    return (a + 5) / 2
my_formula = [my_function(x) for x in range(10)]
filtered = [x for x in range(20) if x%2==0]
print('mylist=',mylist,'\n',
     'squares=',squares,'\n',
     'my_formula=',my_formula,'\n'
      'filtered=',filtered)

mylist= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 
 squares= [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] 
 my_formula= [2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0] 
filtered= [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]


### List slicing
```python
a[start:stop:step]
```

In [24]:
rev_string = "abcdefg"[::-1]
rev_array = [1,2,3,4,5][::-1]
print("rev_string=",rev_string,"\n\r",
     "rev_array=",rev_array)

rev_string= gfedcba 
 rev_array= [5, 4, 3, 2, 1]


### Counting helpers: set(list), max(list), list.count, Counter(list)

In [30]:
list_test = [1,1,2,3,4,5,5,5,6,6]
print(set(list_test))

{1, 2, 3, 4, 5, 6}


In [31]:
max(list_test)

6

In [38]:
# list.count is a built-in method of list
list_test.count(6)

2

In [36]:
list_test.count(5)

3

In [40]:
from collections import Counter

c = Counter(list_test) # this also works on a str to count characters
print(c)

Counter({5: 3, 1: 2, 6: 2, 2: 1, 3: 1, 4: 1})


### map()

* Syntax:
  `map(function, iterable, ...)`
* Description: calls function on every element of the iterable(s) and yields each return value. In Python 3 the result is a lazy iterator, so wrap it in `list()` to get a list.
* Parameters:
    - function --> a function
    - iterable --> one or more sequences
* Return value: an iterator
* Examples:

```python
>>> def square(x):            # compute the square
...     return x ** 2
...
>>> list(map(square, [1,2,3,4,5]))   # square every element of the list
[1, 4, 9, 16, 25]
```
------

```python
>>> list(map(lambda x: x ** 2, [1, 2, 3, 4, 5]))  # using an anonymous lambda function
[1, 4, 9, 16, 25]

>>> # with two lists, add the elements at matching positions
>>> list(map(lambda x, y: x + y, [1, 3, 5, 7, 9], [2, 4, 6, 8, 10]))
[3, 7, 11, 15, 19]

>>> def upper(s):
...     return s.upper()
...
>>> mylist = list(map(upper,['sentence','fragment']))
>>> print(mylist)
['SENTENCE', 'FRAGMENT']

>>> # Convert a string representation of a number into a list of ints.
>>> list_of_ints = list(map(int,"123456"))
>>> print(list_of_ints)
[1, 2, 3, 4, 5, 6]
```

### zip()

```python
>>> a = [1,2,3]
>>> b = [4,5,6]
>>> c = [4,5,6,7,8]
>>> zipped = zip(a,b) # returns a zip object
>>> zipped
<zip object at 0x103abc288>
>>> list(zipped)  # convert to a list with list()
[(1, 4), (2, 5), (3, 6)]
>>> list(zip(a,c))  # the length matches the shortest input
[(1, 4), (2, 5), (3, 6)]
 
>>> a1, a2 = zip(*zip(a,b)) # the inverse of zip: zip(*...) can be read as "unzip"; a1 and a2 are tuples
>>> list(a1)
[1, 2, 3]
>>> list(a2)
[4, 5, 6]
```

## Data classes (supported since Python 3.7)

In [15]:
# For details see: https://realpython.com/python-data-classes/
from dataclasses import dataclass

@dataclass
class Card:
    rank:str
    suit:str
        
card = Card("Q","hearts")

print('card.rank=',card.rank,'\n',
      '\n',
     'card=',card)

card.rank= Q 
 
 card= Card(rank='Q', suit='hearts')


## dict

### dict.items()

In [16]:
my_dict =  {'Google': 'www.google.com', 'taobao': 'www.taobao.com', 'Runoob': 'www.runoob.com'}
for key,value in my_dict.items():
    print(key,value)

Google www.google.com
taobao www.taobao.com
Runoob www.runoob.com


### Merging dictionaries

In [17]:
dict1 = {'a':1,'b':2}
dict2 = {'b':3,'c':4}
merged = { **dict1,**dict2 } # for a duplicate key, the value from the first dict is overwritten by the second
print(merged)

{'a': 1, 'b': 3, 'c': 4}


# Some model details

## residual connection
```python
# From "The Annotated Transformer" by harvardnlp
```
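
Since the block above only names the source, here is a minimal sketch of a residual connection in the pre-norm form `x + dropout(sublayer(norm(x)))`. It is written in the spirit of that article rather than copied from it; the class name `ResidualBlock`, the sizes and the dropout rate are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Hypothetical helper: computes x + dropout(sublayer(norm(x)))."""

    def __init__(self, size, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # The untouched input x is added back to the transformed branch.
        return x + self.dropout(sublayer(self.norm(x)))

block = ResidualBlock(size=16)
feed_forward = nn.Linear(16, 16)   # any layer that keeps the last dimension
x = torch.randn(2, 5, 16)
out = block(x, feed_forward)
print(out.shape)                   # torch.Size([2, 5, 16])
```

Because the branch output is added to `x`, the sublayer must preserve the shape of its input.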

## text Data Augmentation 
From "EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks" (2019)

* three augmentation strategies: 
  - random insertion
  - random swap
  - random deletion
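
The three strategies listed above can be sketched in plain Python. This is an illustrative simplification and not the paper's reference implementation: in particular, `random_insertion` here draws from a caller-supplied `candidates` list (a hypothetical parameter), and all function names and default values are assumptions.

```python
import random

def random_swap(words, n=1):
    """Swap two randomly chosen positions, n times."""
    words = list(words)
    for _ in range(n):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    """Drop each word with probability p, keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]

def random_insertion(words, candidates, n=1):
    """Insert n words drawn from candidates at random positions."""
    words = list(words)
    for _ in range(n):
        words.insert(random.randint(0, len(words)), random.choice(candidates))
    return words

sentence = "the cat sat on the mat".split()
print(random_swap(sentence))
print(random_deletion(sentence, p=0.3))
print(random_insertion(sentence, candidates=["small", "quietly"]))
```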
  
> The techniques in the EDA paper average about a 3% improvement in accuracy when used with small amounts of labeled examples (roughly 500). If you have more than 5,000 examples in your dataset, the paper suggests that this improvement may fall to 0.8% or lower, due to the model obtaining better generalization from the larger amounts of data available over the improvements that EDA can provide.

* Back Translation

```python
# Install first with: pip install googletrans
# Then, we can translate our sentence from English to French, and then back to English
# (assumes a googletrans release whose Translator.translate is synchronous and accepts a list)
from googletrans import Translator

translator = Translator()

sentences = ['The cat sat on the mat']

translations_fr = translator.translate(sentences, dest='fr')
fr_text = [t.text for t in translations_fr]
translations_en = translator.translate(fr_text, dest='en')
en_text = [t.text for t in translations_en]
print(en_text)

# [out]: ['The cat sat on the carpet']
```