# Collision Avoidance - Train Model

Welcome to this host-side Jupyter Notebook! It should look familiar if you ran through the notebooks that run on the robot (JetBot). In this notebook we'll train an image classifier to detect two classes, ``free`` and ``blocked``, which we'll use for avoiding collisions. For this, we'll use a popular deep learning library, *PyTorch*.

In [None]:
import torch
import torch.optim as optim
import torch.nn.functional as F
import torchvision
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

### Upload and extract dataset

Before you start, upload the ``dataset.zip`` file that you created in the ``data_collection.ipynb`` notebook on the robot.

Then extract the dataset by running the shell command below.

In [None]:
!unzip -q dataset.zip

You should see a folder named ``dataset`` appear in the file browser.

### Create dataset instance

Now we use the ``ImageFolder`` dataset class from the ``torchvision.datasets`` package, which treats each subfolder of ``dataset`` as one class. We attach transforms from the ``torchvision.transforms`` package to prepare the data for training.

In [None]:
dataset = datasets.ImageFolder(
    'dataset',
    transforms.Compose([
        transforms.ColorJitter(0.1, 0.1, 0.1, 0.1),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
)
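
To make the last two transforms concrete, here is a minimal plain-Python sketch of the per-channel arithmetic. ``ToTensor`` scales 8-bit pixel values to the range [0, 1], and ``Normalize`` then computes ``(value - mean) / std`` for each channel. The helper name ``normalize_pixel`` and the sample pixel are illustrative assumptions, not part of the notebook.

```python
# Per-channel statistics passed to transforms.Normalize in the cell above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Mimic ToTensor + Normalize for one 8-bit RGB pixel (hypothetical helper)."""
    scaled = [c / 255.0 for c in rgb]  # ToTensor: [0, 255] -> [0.0, 1.0]
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]

pixel = (255, 0, 128)  # made-up sample pixel
print(normalize_pixel(pixel))
```

After normalization, a channel value of 0 corresponds to that channel's mean, so inputs are roughly centered around zero.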

### Split dataset into train and test sets

Next, we split the dataset into *training* and *test* sets. The test set, for which the code below reserves 50 randomly chosen images, will be used to verify the accuracy of the model we train.

In [None]:
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [len(dataset) - 50, 50])
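
The split above assumes the dataset holds more than 50 images; with fewer, the first length would be zero or negative, which is not a meaningful split. A minimal sketch of a guard follows. The helper ``split_lengths`` is hypothetical and the dataset size of 200 is made up for illustration.

```python
def split_lengths(total, test_size=50):
    """Return [train_size, test_size], refusing splits that leave no training data."""
    if total <= test_size:
        raise ValueError(
            'dataset has %d images; need more than %d' % (total, test_size))
    return [total - test_size, test_size]

print(split_lengths(200))  # a made-up dataset of 200 images
```

Under these assumptions, the result could be passed directly as the second argument: ``torch.utils.data.random_split(dataset, split_lengths(len(dataset)))``.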

### Create data loaders to load data in batches

We'll create two ``DataLoader`` instances, which provide utilities for shuffling data, producing *batches* of images, and loading the samples in parallel with multiple workers.

In [None]:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)
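
As a rough illustration of what batching means here, the plain-Python sketch below lists the batch sizes a loader would produce for a given number of samples, assuming the final partial batch is kept (the ``DataLoader`` default). The helper ``batch_sizes`` is hypothetical.

```python
def batch_sizes(num_samples, batch_size=16):
    """List the size of each batch when the last partial batch is kept."""
    full, remainder = divmod(num_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])

# The 50-image test set yields three full batches and one batch of 2
print(batch_sizes(50))
```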

### Define the neural network

Now, we define the neural network we'll be training. The *torchvision* package provides a collection of pre-trained models that we can use.

In a process called *transfer learning*, we can repurpose a pre-trained model (trained on millions of images) for a new task that may have far less data available.

Important features that were learned in the original training of the pre-trained model are reusable for the new task. We'll use the ``alexnet`` model.

In [None]:
model = models.alexnet(pretrained=True)

The ``alexnet`` model was originally trained on a dataset with 1000 class labels, but our dataset has only two. We'll replace the final layer with a new, untrained layer that has only two outputs.


In [None]:
model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, 2)
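
The same layer swap can be tried on a small stand-in network without downloading any pre-trained weights. This is only a sketch: the ``head`` module and its layer sizes are made up for illustration and are much smaller than the real ``alexnet`` classifier.

```python
import torch

# Small stand-in for a classifier head; AlexNet itself is not loaded here
head = torch.nn.Sequential(
    torch.nn.Linear(32, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 1000),  # plays the role of the 1000-class output layer
)

# Swap the last layer for a new one, keeping its input width
head[2] = torch.nn.Linear(head[2].in_features, 2)

batch = torch.randn(4, 32)   # four made-up feature vectors
print(head(batch).shape)     # one score per class for each input
```

Reading ``in_features`` from the layer being replaced avoids hard-coding the input width, so the new layer always matches the layer before it.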

Finally, we transfer the model to the GPU for execution.

In [None]:
device = torch.device('cuda')
model = model.to(device)
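
The cell above assumes a CUDA-capable GPU is available. On a machine without one, a common pattern is to fall back to the CPU, as in the sketch below. The small ``layer`` module is a made-up stand-in for the real model.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

layer = torch.nn.Linear(4, 2).to(device)  # stand-in for the real model
print(next(layer.parameters()).device)
```

Training on the CPU should still work with this fallback, although it will likely be much slower.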

### Train the neural network

Using the code below, we will train the neural network for 30 epochs. After each epoch we measure accuracy on the test set and save the model whenever it beats the best accuracy so far.

> An epoch is one full pass through our training data.

In [None]:
NUM_EPOCHS = 30
BEST_MODEL_PATH = 'best_model.pth'
best_accuracy = 0.0

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(NUM_EPOCHS):

    # Training pass: enable dropout and update the weights
    model.train()
    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()
        optimizer.step()

    # Evaluation pass: disable dropout and skip gradient tracking
    model.eval()
    test_error_count = 0.0
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            # With labels in {0, 1}, |label - prediction| is 1 for each misclassified image
            test_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))

    test_accuracy = 1.0 - float(test_error_count) / float(len(test_dataset))
    print('%d: %f' % (epoch, test_accuracy))
    # Keep only the weights with the best test accuracy so far
    if test_accuracy > best_accuracy:
        torch.save(model.state_dict(), BEST_MODEL_PATH)
        best_accuracy = test_accuracy
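
The accuracy calculation in the loop relies on there being exactly two classes, labeled 0 and 1: the absolute difference between a label and a prediction is then 1 for a mistake and 0 otherwise. A plain-Python sketch of the same arithmetic follows. The helper ``binary_accuracy`` and the sample labels are made up for illustration.

```python
def binary_accuracy(labels, predictions):
    """Accuracy for labels in {0, 1}, counting errors as |label - prediction|."""
    errors = sum(abs(l - p) for l, p in zip(labels, predictions))
    return 1.0 - errors / len(labels)

labels = [0, 1, 1, 0, 1]       # made-up ground truth
predictions = [0, 1, 0, 0, 1]  # made-up model output, one mistake
print(binary_accuracy(labels, predictions))
```

With more than two classes the absolute difference could exceed 1, so a direct inequality test between labels and predictions would be the safer way to count errors.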

Once training has finished, you should see a file named ``best_model.pth`` in the Jupyter Lab file browser. Select ``Right click`` -> ``Download`` to download the model to your workstation.