# ResNet Training

This notebook trains a `ResNet34` model on the dataset produced by the notebook
`./notebooks/01. get_train_data`.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [21]:
!pip install -r drive/MyDrive/detectime/requirements.txt

In [4]:
cd drive/MyDrive/detectime/

/content/drive/MyDrive/detectime


In [5]:
import os
import sys
import logging
import numpy as np
import random
from tqdm import tqdm

sys.path.append('.')
from definitions import ROOT_DIR
from detectime.utils import (
    save_checkpoint,
    train_valid_split
)
from detectime.models import load_model
from detectime.loss_function import get_loss
from detectime.optimizers import (
    get_optimizer,
    get_scheduler
)
from detectime.dataset import get_data_loaders
from detectime.train import train, validation

%matplotlib inline
%load_ext autoreload
%autoreload 2

log = logging.getLogger(__name__)

{"asctime": "2021-07-07 10:23:08", "name": "matplotlib.pyplot", "filename": "pyplot.py", "levelname": "DEBUG", "message": "Loaded backend module://ipykernel.pylab.backend_inline version unknown."}


In [6]:
DATA_PATH = ROOT_DIR / 'data'
NOTEBOOK_PATH = ROOT_DIR / 'notebooks'
INPUT_DATA = DATA_PATH / 'INPUT_DATA'
INPUT_IMAGES_FOLDER = INPUT_DATA / 'TRAIN_DATA'
TRAIN_IMG_FOLDER = INPUT_DATA / 'TRAIN_IMG'
SAVE_TRAIN_IMAGES_HANDS = TRAIN_IMG_FOLDER / 'HANDS'
SAVE_TRAIN_IMAGES_FACES = TRAIN_IMG_FOLDER / 'FACES'

JSON_FOLDER = INPUT_DATA / 'JSON'
FACES_JSON_PRETRAINED = JSON_FOLDER / 'train_with_bboxes.json'
TRAIN_LABELS = INPUT_DATA / 'train.csv'
HAND_DETECTION_FOLDER = ROOT_DIR / 'model' / 'mask_rcnn_hand_detection.h5'
ANNOTATION_DATA = JSON_FOLDER / 'hands.json'

Load the project config and pick the `device` for training: `cuda` if it is available, otherwise `cpu`.

In [19]:
import torch
import yaml
from detectime.utils import convert_dict_to_tuple

CONFIG_PATH = ROOT_DIR / 'config.yml'

with open(CONFIG_PATH) as f:
    data = yaml.safe_load(f)
config = convert_dict_to_tuple(dictionary=data)

device_name = 'cuda' if torch.cuda.is_available() else 'cpu'
device = torch.device(device_name)
print(f'device: {device_name}')

device: cuda
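
The config is accessed with attribute syntax later on (`config.dataset.seed`, `config.train.n_epoch`). The implementation of `convert_dict_to_tuple` is not shown in this notebook, so the following is only a minimal sketch of how such a helper could work, assuming it recursively turns nested dicts into namedtuples. The function name `dict_to_namedtuple` and the sample values are hypothetical.

```python
from collections import namedtuple


def dict_to_namedtuple(dictionary, name="Config"):
    """Recursively convert a nested dict into nested namedtuples.

    Hypothetical stand-in for detectime.utils.convert_dict_to_tuple;
    the real helper may behave differently.
    """
    fields = {
        key: dict_to_namedtuple(value, name=key.capitalize())
        if isinstance(value, dict) else value
        for key, value in dictionary.items()
    }
    return namedtuple(name, list(fields))(**fields)


# Made-up config fragment; only the key names mirror this notebook.
data = {
    "exp_name": "detectime",
    "dataset": {"seed": 42},
    "train": {"n_epoch": 21},
}
config = dict_to_namedtuple(data)
print(config.dataset.seed, config.train.n_epoch)
```

A namedtuple is immutable, so a config built this way cannot be modified by accident during training.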


Fix the random seeds so that runs are consistent and reproducible.

In [8]:
seed = config.dataset.seed
torch.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)

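torch.backends.cudnn.deterministic = True
# Benchmark mode auto-tunes cuDNN kernels and may select different algorithms
# between runs, so it is disabled here for reproducibility.
torch.backends.cudnn.benchmark = False

# Only takes effect if set before CUDA is first initialized in this process.
os.environ['CUDA_VISIBLE_DEVICES'] = str(config.cuda_id)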
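
The seeding steps can be gathered into one helper so that every cell reruns from the same state. This is a minimal sketch, not part of `detectime`: the name `seed_everything` is an assumption, and NumPy and PyTorch are seeded only if they are installed. cuDNN benchmark mode is turned off in the sketch because its autotuning can choose different kernels from run to run.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed every RNG that is importable in the current environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch is not installed; nothing to seed


# Reseeding with the same value reproduces the same random sequence.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```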

Load the `ResNet34` model weights.

In [9]:
print("Loading model...")
net = load_model(config, device=device_name)
print("Done.")

Loading model...
{"asctime": "2021-07-07 10:23:11", "name": "detectime.models", "filename": "models.py", "levelname": "INFO", "message": "ResNet34"}
Done.


Set up the optimizer, the scheduler, and the loss criteria.

In [10]:
criterion, criterion_val = get_loss(config, device=device_name)

In [11]:
optimizer = get_optimizer(config, net)

{"asctime": "2021-07-07 10:23:14", "name": "detectime.optimizers", "filename": "optimizers.py", "levelname": "INFO", "message": "0.002"}
{"asctime": "2021-07-07 10:23:14", "name": "detectime.optimizers", "filename": "optimizers.py", "levelname": "INFO", "message": "Opt: SGD"}


In [12]:
n_epoch = config.train.n_epoch
scheduler = get_scheduler(config, optimizer)
train_epoch = tqdm(range(n_epoch),
                   dynamic_ncols=True,
                   desc='Epochs',
                   position=0)

Epochs:   0%|          | 0/21 [00:00<?, ?it/s]

Split the data into training and validation sets by the `video_name` key.

In [22]:
train_data, val_data = train_valid_split(
    ANNOTATION_DATA,
    JSON_FOLDER,
    val_size=0.15
)

{"asctime": "2021-07-07 11:05:07", "name": "detectime.utils", "filename": "utils.py", "levelname": "INFO", "message": "3402"}
{"asctime": "2021-07-07 11:05:07", "name": "detectime.utils", "filename": "utils.py", "levelname": "INFO", "message": "132"}
{"asctime": "2021-07-07 11:05:07", "name": "detectime.utils", "filename": "utils.py", "levelname": "INFO", "message": "train data length=3029"}
{"asctime": "2021-07-07 11:05:07", "name": "detectime.utils", "filename": "utils.py", "levelname": "INFO", "message": "validation data length=373"}
{"asctime": "2021-07-07 11:05:07", "name": "detectime.utils", "filename": "utils.py", "levelname": "INFO", "message": "Savedir train and valid data: /content/drive/MyDrive/detectime/data/INPUT_DATA/JSON"}
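
Splitting by `video_name` keeps all frames of one video in the same subset, so near-identical frames cannot leak from training into validation. The implementation of `train_valid_split` is not shown here; the sketch below illustrates the general idea under that assumption, with a hypothetical `split_by_group` function and made-up records. Because `val_size` is applied to videos rather than frames, the frame-level share can differ from it, which would be consistent with the log above (373 of 3402 samples, about 11%, with `val_size=0.15`).

```python
import random
from collections import defaultdict


def split_by_group(records, group_key, val_size=0.15, seed=0):
    """Split records so that every group lands entirely in one subset."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    names = sorted(groups)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(names)

    # The fraction is taken over groups (videos), not individual records.
    n_val = max(1, round(len(names) * val_size))
    val = [r for name in names[:n_val] for r in groups[name]]
    train = [r for name in names[n_val:] for r in groups[name]]
    return train, val


# Toy records: 10 videos with 3 frames each (made-up data).
records = [
    {"video_name": f"video_{v}", "frame": f}
    for v in range(10) for f in range(3)
]
train, val = split_by_group(records, "video_name", val_size=0.2)

train_videos = {r["video_name"] for r in train}
val_videos = {r["video_name"] for r in val}
print(len(train), len(val), train_videos.isdisjoint(val_videos))
```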


In [23]:
ANNOTATIONS_TRAIN = JSON_FOLDER / 'train.json'
ANNOTATIONS_VALID = JSON_FOLDER / 'valid.json'

In [24]:
dt, dv = get_data_loaders(config,
                          INPUT_IMAGES_FOLDER,
                          ANNOTATIONS_TRAIN,
                          ANNOTATIONS_VALID
                          )

out_dir = str(INPUT_DATA / config.outdir / config.exp_name)
print(f"Savedir: {out_dir}")
os.makedirs(out_dir, exist_ok=True)

{"asctime": "2021-07-07 11:05:13", "name": "detectime.dataset", "filename": "dataset.py", "levelname": "INFO", "message": "Preparing train reader..."}
{"asctime": "2021-07-07 11:05:13", "name": "detectime.dataset", "filename": "dataset.py", "levelname": "INFO", "message": "Done."}
{"asctime": "2021-07-07 11:05:13", "name": "detectime.dataset", "filename": "dataset.py", "levelname": "INFO", "message": "Preparing valid reader..."}
{"asctime": "2021-07-07 11:05:13", "name": "detectime.dataset", "filename": "dataset.py", "levelname": "INFO", "message": "Done."}
Savedir: /content/drive/MyDrive/detectime/data/INPUT_DATA/EXPERIMENTS/detectime




In [None]:
for epoch in train_epoch:
    train(net, dt, criterion, optimizer, config, epoch)
    validation(net, dv, criterion_val, epoch)
    save_checkpoint(net, optimizer, scheduler, epoch, out_dir)
    scheduler.step()



Train process of epoch: 0 is done; 
 loss: 0.8038; acc: 0.73






Validation of epoch: 0 is done; 
 loss: 0.4213; acc: 0.86


Epochs:   5%|▍         | 1/21 [43:53<14:37:51, 2633.59s/it]

Train process of epoch: 1 is done; 
 loss: 0.1914; acc: 0.97






Validation of epoch: 1 is done; 
 loss: 0.2139; acc: 0.94


Epochs:  10%|▉         | 2/21 [45:18<5:59:10, 1134.24s/it] 

Train process of epoch: 2 is done; 
 loss: 0.1261; acc: 0.98






Validation of epoch: 2 is done; 
 loss: 0.1703; acc: 0.95


Epochs:  14%|█▍        | 3/21 [46:42<3:16:27, 654.84s/it] 

Train process of epoch: 3 is done; 
 loss: 0.1026; acc: 0.99






Validation of epoch: 3 is done; 
 loss: 0.1658; acc: 0.94


Epochs:  19%|█▉        | 4/21 [48:06<2:01:39, 429.41s/it]

Train process of epoch: 4 is done; 
 loss: 0.0908; acc: 0.99






Validation of epoch: 4 is done; 
 loss: 0.1283; acc: 0.96


Epochs:  24%|██▍       | 5/21 [49:29<1:21:14, 304.67s/it]
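
The per-epoch summary lines are easy to lose among the progress bars. As a rough sketch, they can be pulled out of captured output with a regular expression. The pattern assumes the exact wording printed above, and `parse_metrics` is a hypothetical helper, not part of `detectime`. The sample text repeats the results of epochs 0 and 1 from this run.

```python
import re

# Summary lines copied from the output of epochs 0 and 1 above.
log_text = """
Train process of epoch: 0 is done;
 loss: 0.8038; acc: 0.73
Validation of epoch: 0 is done;
 loss: 0.4213; acc: 0.86
Train process of epoch: 1 is done;
 loss: 0.1914; acc: 0.97
Validation of epoch: 1 is done;
 loss: 0.2139; acc: 0.94
"""

PATTERN = re.compile(
    r"(Train process|Validation) of epoch: (\d+) is done;\s*"
    r"loss: ([\d.]+); acc: ([\d.]+)"
)


def parse_metrics(text):
    """Return {epoch: {'train': (loss, acc), 'val': (loss, acc)}}."""
    metrics = {}
    for phase, epoch, loss, acc in PATTERN.findall(text):
        key = "train" if phase.startswith("Train") else "val"
        metrics.setdefault(int(epoch), {})[key] = (float(loss), float(acc))
    return metrics


metrics = parse_metrics(log_text)
for epoch, m in sorted(metrics.items()):
    print(epoch, m["train"], m["val"])
```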