# Lyft: Training with multi-mode confidence

![](http://www.l5kit.org/_images/av.jpg)
<cite>The image from L5Kit official document: <a href="http://www.l5kit.org/README.html">http://www.l5kit.org/README.html</a></cite>

Continued from the previous kernel:
 - [Lyft: Comprehensive guide to start competition](https://www.kaggle.com/corochann/lyft-comprehensive-guide-to-start-competition)
 - [Lyft: Deep into the l5kit library](https://www.kaggle.com/corochann/lyft-deep-into-the-l5kit-library)

In this kernel, I will run **pytorch CNN model training**. Especially, followings are new items to try:
 - Predict **multi-mode with confidence**: As written in [evaluation metric](https://www.kaggle.com/c/lyft-motion-prediction-autonomous-vehicles/overview/evaluation) page, we can predict **3 modes** of motion trajectory.
 - Training loss with **competition evaluation metric**
 - Use Training abstraction library **`pytorch-ignite` and `pytorch-pfn-extras`**.


[Update 2020/9/6]<br/>
Published prediction kernel: [Lyft: Prediction with multi-mode confidence](https://www.kaggle.com/corochann/lyft-prediction-with-multi-mode-confidence)<br/>
Try yourself how good score you can get using only single model without ensemble! :)

# Environment setup

 - Please add [pestipeti/lyft-l5kit-unofficial-fix](https://www.kaggle.com/pestipeti/lyft-l5kit-unofficial-fix) as utility script.
    - Official utility script "[philculliton/kaggle-l5kit](https://www.kaggle.com/mathurinache/kaggle-l5kit)" does not work with pytorch GPU.
 - Please add [lyft-config-files](https://www.kaggle.com/jpbremer/lyft-config-files) as dataset
 
See previous kernel [Lyft: Comprehensive guide to start competition](https://www.kaggle.com/corochann/lyft-comprehensive-guide-to-start-competition) for details.

In [11]:
# https://github.com/pfnet/pytorch-pfn-extras/releases/tag/v0.3.1
# !pip install pytorch-pfn-extras==0.6.1

In [12]:
import gc
import os
import time
from pathlib import Path
import random
import sys

from tqdm.notebook import tqdm
import numpy as np
import pandas as pd
import scipy as sp


import matplotlib.pyplot as plt
import seaborn as sns

from IPython.core.display import display, HTML

# --- plotly ---
from plotly import tools, subplots
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.express as px
import plotly.figure_factory as ff
import plotly.io as pio
pio.templates.default = "plotly_dark"

# --- models ---
from sklearn import preprocessing
from sklearn.model_selection import KFold
import lightgbm as lgb
import xgboost as xgb
import catboost as cb

# --- setup ---
# pd.set_option('max_columns', 50)



Importing display from IPython.core.display is deprecated since IPython 7.14, please import from IPython display



In [13]:
import zarr

import l5kit
from l5kit.data import ChunkedDataset, LocalDataManager
from l5kit.dataset import EgoDataset, AgentDataset

from l5kit.rasterization import build_rasterizer
from l5kit.configs import load_config_data
from l5kit.visualization import draw_trajectory, TARGET_POINTS_COLOR
from l5kit.geometry import transform_points
from tqdm import tqdm
from collections import Counter
from l5kit.data import PERCEPTION_LABELS
from prettytable import PrettyTable

from matplotlib import animation, rc
from IPython.display import HTML

rc('animation', html='jshtml')
print("l5kit version:", l5kit.__version__)

l5kit version: 1.5.0


In [14]:
import torch
from pathlib import Path

from torch.utils.data import DataLoader
from torch.utils.data.dataset import Subset
import pytorch_pfn_extras.training.extensions as E

## Function

To define loss function to calculate competition evaluation metric **in batch**.<br/>
It works with **pytorch tensor, so it is differentiable** and can be used for training Neural Network.

In [15]:
# --- Function utils ---
# Original code from https://github.com/lyft/l5kit/blob/20ab033c01610d711c3d36e1963ecec86e8b85b6/l5kit/l5kit/evaluation/metrics.py
import numpy as np

import torch
from torch import Tensor


def pytorch_neg_multi_log_likelihood_batch(
    gt: Tensor, pred: Tensor, confidences: Tensor, avails: Tensor
) -> Tensor:
    """
    Compute a negative log-likelihood for the multi-modal scenario.
    log-sum-exp trick is used here to avoid underflow and overflow, For more information about it see:
    https://en.wikipedia.org/wiki/LogSumExp#log-sum-exp_trick_for_log-domain_calculations
    https://timvieira.github.io/blog/post/2014/02/11/exp-normalize-trick/
    https://leimao.github.io/blog/LogSumExp/
    Args:
        gt (Tensor): array of shape (bs)x(time)x(2D coords)
        pred (Tensor): array of shape (bs)x(modes)x(time)x(2D coords)
        confidences (Tensor): array of shape (bs)x(modes) with a confidence for each mode in each sample
        avails (Tensor): array of shape (bs)x(time) with the availability for each gt timestep
    Returns:
        Tensor: negative log-likelihood for this example, a single float number
    """
    assert len(pred.shape) == 4, f"expected 3D (MxTxC) array for pred, got {pred.shape}"
    batch_size, num_modes, future_len, num_coords = pred.shape
    # print(f"pred.shape : {pred.shape}")
    assert gt.shape == (batch_size, future_len, num_coords), f"wrong shape for gt, got {gt.shape}"
    assert confidences.shape == (batch_size, num_modes), f"expected 1D (Modes) array for gt, got {confidences.shape}"
    assert torch.allclose(torch.sum(confidences, dim=1), confidences.new_ones((batch_size,))), "confidences should sum to 1"
    assert avails.shape == (batch_size, future_len), f"expected 1D (Time) array for gt, got {avails.shape}"
    # assert all data are valid
    assert torch.isfinite(pred).all(), "invalid value found in pred"
    assert torch.isfinite(gt).all(), "invalid value found in gt"
    assert torch.isfinite(confidences).all(), "invalid value found in confidences"
    assert torch.isfinite(avails).all(), "invalid value found in avails"

    # convert to (batch_size, num_modes, future_len, num_coords)
    gt = torch.unsqueeze(gt, 1)  # add modes
    avails = avails[:, None, :, None]  # add modes and cords

    # error (batch_size, num_modes, future_len)
    error = torch.sum(((gt - pred) * avails) ** 2, dim=-1)  # reduce coords and use availability

    with np.errstate(divide="ignore"):  # when confidence is 0 log goes to -inf, but we're fine with it
        # error (batch_size, num_modes)
        error = torch.log(confidences) - 0.5 * torch.sum(error, dim=-1)  # reduce time

    # use max aggregator on modes for numerical stability
    # error (batch_size, num_modes)
    max_value, _ = error.max(dim=1, keepdim=True)  # error are negative at this point, so max() gives the minimum one
    error = -torch.log(torch.sum(torch.exp(error - max_value), dim=-1, keepdim=True)) - max_value  # reduce modes
    # print("error", error)
    return torch.mean(error)


def pytorch_neg_multi_log_likelihood_single(
    gt: Tensor, pred: Tensor, avails: Tensor
) -> Tensor:
    """

    Args:
        gt (Tensor): array of shape (bs)x(time)x(2D coords)
        pred (Tensor): array of shape (bs)x(time)x(2D coords)
        avails (Tensor): array of shape (bs)x(time) with the availability for each gt timestep
    Returns:
        Tensor: negative log-likelihood for this example, a single float number
    """
    # pred (bs)x(time)x(2D coords) --> (bs)x(mode=1)x(time)x(2D coords)
    # create confidence (bs)x(mode=1)
    batch_size, future_len, num_coords = pred.shape
    confidences = pred.new_ones((batch_size, 1))
    return pytorch_neg_multi_log_likelihood_batch(gt, pred.unsqueeze(1), confidences, avails)

## Model

pytorch model definition. Here model outputs both **multi-mode trajectory prediction & confidence of each trajectory**.

In [16]:
# --- Model utils ---
import torch
from torchvision.models import resnet18
from torch import nn
from typing import Dict



In [17]:
# --- Utils ---
import yaml


def save_yaml(filepath, content, width=120):
    with open(filepath, 'w') as f:
        yaml.dump(content, f, width=width)


def load_yaml(filepath):
    with open(filepath, 'r') as f:
        content = yaml.safe_load(f)
    return content


class DotDict(dict):
    """dot.notation access to dictionary attributes

    Refer: https://stackoverflow.com/questions/2352181/how-to-use-a-dot-to-access-members-of-dictionary/23689767#23689767
    """  # NOQA

    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

    

## Configs

In [18]:
# --- Lyft configs ---
cfg = {
    'format_version': 4,
    'model_params': {
        'model_architecture': 'resnet50',
        'history_num_frames': 10,
        'history_step_size': 1,
        'history_delta_time': 0.1,
        'future_num_frames': 50,
        'future_step_size': 1,
        'future_delta_time': 0.1,
        'render_ego_history':True,
        'step_time':0.1
    },

    'raster_params': {
        'raster_size': [224, 224],
        'pixel_size': [0.5, 0.5],
        'ego_center': [0.25, 0.5],
        'map_type': 'py_semantic',
        'satellite_map_key': 'aerial_map/aerial_map.png',
        'semantic_map_key': 'semantic_map/semantic_map.pb',
        'dataset_meta_key': 'meta.json',
        'filter_agents_threshold': 0.5,

        'set_origin_to_bottom': True,
        'disable_traffic_light_faces':False
    },

    'train_data_loader': {
        'key': 'scenes/train.zarr',
        'batch_size': 12,
        'shuffle': True,
        'num_workers': 4
    },

    'valid_data_loader': {
        'key': 'scenes/validate.zarr',
        'batch_size': 32,
        'shuffle': False,
        'num_workers': 4
    },

    'train_params': {
        'max_num_steps': 10000,
        'checkpoint_every_n_steps': 5000,

        # 'eval_every_n_steps': -1
    }
}





In [19]:
flags_dict = {
    "debug": True,
    # --- Data configs ---
    "l5kit_data_folder": "/home/zhuhe/kaggle/input/lyft-motion-prediction-autonomous-vehicles",
    # --- Model configs ---
    "pred_mode": "multi",
    # --- Training configs ---
    "device": "cuda:0",
    "out_dir": "results/multi_train",
    "epoch": 2,
    "snapshot_freq": 50,
}

# Main script

Now finished defining all the util codes. Let's start writing main script to train the model!

## Loading data

Here we will only use the first dataset from the sample set. (sample.zarr data is used for visualization, please use train.zarr / validate.zarr / test.zarr for actual model training/validation/prediction.)<br/>
We're building a `LocalDataManager` object. This will resolve relative paths from the config using the `L5KIT_DATA_FOLDER` env variable we have just set.

In [20]:
flags = DotDict(flags_dict)
out_dir = Path(flags.out_dir)
os.makedirs(str(out_dir), exist_ok=True)
print(f"flags: {flags_dict}")
save_yaml(out_dir / 'flags.yaml', flags_dict)
save_yaml(out_dir / 'cfg.yaml', cfg)
debug = flags.debug



flags: {'debug': True, 'l5kit_data_folder': '/home/zhuhe/kaggle/input/lyft-motion-prediction-autonomous-vehicles', 'pred_mode': 'multi', 'device': 'cuda:0', 'out_dir': 'results/multi_train', 'epoch': 2, 'snapshot_freq': 50}


In [21]:
# set env variable for data
os.environ["L5KIT_DATA_FOLDER"] = flags.l5kit_data_folder
dm = LocalDataManager(None)

print("Load dataset...")
train_cfg = cfg["train_data_loader"]
valid_cfg = cfg["valid_data_loader"]

# Rasterizer
rasterizer = build_rasterizer(cfg, dm)
train_path = "scenes/train.zarr" if debug else train_cfg["key"]
test_path = "scenes/test.zarr"
train_zarr = ChunkedDataset(dm.require(train_path)).open()
test_zarr = ChunkedDataset(dm.require(test_path)).open()
print("train_zarr", type(train_zarr))
print("test_zarr", type(test_zarr))
train_ego_dataset = EgoDataset(cfg, train_zarr, rasterizer)
train_agent_dataset = AgentDataset(cfg, train_zarr, rasterizer)
test_ego_dataset = EgoDataset(cfg, test_zarr, rasterizer)
test_agent_dataset = AgentDataset(cfg, test_zarr, rasterizer)

# valid_path = "scenes/sample.zarr" if debug else valid_cfg["key"]
# valid_zarr = ChunkedDataset(dm.require(valid_path)).open()
# print("valid_zarr", type(train_zarr))
# valid_agent_dataset = AgentDataset(cfg, valid_zarr, rasterizer)
# valid_dataset = TransformDataset(valid_agent_dataset, transform)


Load dataset...
train_zarr <class 'l5kit.data.zarr_dataset.ChunkedDataset'>
test_zarr <class 'l5kit.data.zarr_dataset.ChunkedDataset'>


In [22]:
def get_distance(centroid1, centroid2):
    return np.sqrt(np.square(centroid1[0]-centroid2[0]) + np.square(centroid1[1]-centroid2[1]))
# def change_all_list_to_ndarray(dic_in):

def transform_one_scene_dataset(scene_ego_dataset, scene_agent_dataset, debug: False):
    def add_element_into(ele_dict, ele):
        temp_x = []
        temp_x.extend(ele["curr_speed"].flatten())
        temp_x.extend(ele["history_positions"].flatten())
        temp_x.extend(ele["history_yaws"].flatten())
        temp_x.extend(ele["history_availabilities"].flatten())
        ele_dict["x"].append(temp_x)
        if len(ele_dict["index"])== 0:
            cur_index = 0
        else:
            cur_index = ele_dict["index"][-1] + 1
        for i in range(0,cur_index):
            ele_dict["edge_attr"].append(get_distance(ele["centroid"], ele_dict["centroid"][i])) # 双向图
            ele_dict["edge_index"][0].append(i)
            ele_dict["edge_index"][1].append(cur_index)
            ele_dict["edge_attr"].append(get_distance(ele["centroid"], ele_dict["centroid"][i]))
            ele_dict["edge_index"][1].append(i)
            ele_dict["edge_index"][0].append(cur_index)
        ele_dict["target_positions"].append(ele["target_positions"])
        ele_dict["target_availabilities"].append(ele["target_availabilities"])
        ele_dict["centroid"].append(ele["centroid"])  # 这个元素是为了建图方便
        ele_dict["index"].append(cur_index)  # 这个元素是为了建图方便

    return_np = [] # 这个的长度应该是247或者248的样子，并且这个就是frame_index

    for ele in scene_ego_dataset:
        ele_dict = {}
        for name in ["x","edge_index","edge_attr","centroid","index","target_positions","target_availabilities"]:
            if name == "edge_index":
                ele_dict[name] = [[],[]]
            else:
                ele_dict[name] = []
        add_element_into(ele_dict,ele)
        return_np.append(ele_dict)
    if debug:
        print(f"len(return_np) : {len(return_np)}")
    for ele in scene_agent_dataset:
        curr_frame_index = ele["frame_index"]
        if debug:
            print(ele["frame_index"])
        cur_dict = return_np[curr_frame_index]
        add_element_into(cur_dict, ele)

    for ele in return_np:
        for key_ in ele.keys():
            ele[key_] = np.array(ele[key_])

        ele["x"] = torch.tensor(ele["x"])
        ele["edge_index"] = torch.tensor(ele["edge_index"],dtype=torch.long)
        ele["edge_attr"] = torch.tensor(ele["edge_attr"])
        ele["target_positions"] = torch.tensor(ele["target_positions"])
        ele["target_availabilities"] = torch.tensor(ele["target_availabilities"])

    return return_np

## Prepare model & optimizer

In [23]:

import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers,
                 dropout, return_embeds=False, whether_dropout= False):
        # TODO: Implement this function that initializes self.convs,
        # self.bns, and self.softmax.

        super(GCN, self).__init__()

        # A list of GCNConv layers
        self.convs = None

        # A list of 1D batch normalization layers
        self.bns = None

        # The log softmax layer
        self.softmax = None
        self.dropout = dropout

        self.convs = torch.nn.ModuleList()

        for i in range(num_layers - 1):
            self.convs.append(GCNConv(input_dim, hidden_dim))
            input_dim = hidden_dim

        self.convs.append(GCNConv(input_dim, output_dim))
        self.whether_dropout = whether_dropout

        self.softmax=torch.nn.LogSoftmax()
        # Skip classification layer and return node embeddings
        self.return_embeds = return_embeds



    # def forward(self, x, edge_index,edge_attr):
    #     # x = x.to(device) # 在模型to_divice很慢？
    #     edge_attr = edge_attr.to(device)
    #     edge_index = edge_index.to(device)
    #     out = None
    #     for layer in range(len(self.convs)-1):  #layer：层数
    #         x=self.convs[layer](x,edge_index,edge_attr).to(torch.float)   #叠GCNConv
    #         # x= x.to(torch.float)# 这个是因为他这的输出搞成了float64,导致大家数据形式不兼容
    #         x=F.relu(x)  #叠relu,这个不会导致数据被转化成float,
    #         if self.whether_dropout is True:
    #             x=F.dropout(x,self.dropout,self.training)  #叠dropout。这个self.dropout看下文是概率。
    #     #最后一层
    #     out=self.convs[-1](x,edge_index,edge_attr)  #GCNVonv
    #     if not self.return_embeds:
    #         out=self.softmax(out)
    def forward(self, batch_data):
        out = None
        x = batch_data.x
        for layer in range(len(self.convs)-1):  #layer：层数
            x=self.convs[layer](x,batch_data.edge_index,batch_data.edge_attr).to(torch.float)   #叠GCNConv

            # x= x.to(torch.float)# 这个是因为他这的输出搞成了float64,导致大家数据形式不兼容
            x=F.relu(x)  #叠relu,这个不会导致数据被转化成float,
            if self.whether_dropout is True:
                x=F.dropout(x,self.dropout,self.training)  #叠dropout。这个self.dropout看下文是概率。
        #最后一层

        out=self.convs[-1](x,batch_data.edge_index,batch_data.edge_attr)  #GCNVonv
        if not self.return_embeds:
            out=self.softmax(out)

        return out




In [58]:
args = {
    'device' : 'cuda' if torch.cuda.is_available() else 'cpu',
    # 'device' = 'cpu'
    'num_layers': 2,
    'hidden_dim': 256,
    'dropout': 0,
    'lr': 0.001,
    'epochs': 30,
    'train_scene_start': 11,
    'train_scene_end': 100,
    'test_scene_start': 10,
    'test_scene_end': 30,
    'batch_size_gcn':512,
}

In [36]:
#dict_keys(['frame_index', 'image', 'target_positions', 'target_yaws', 'target_velocities', 'target_availabilities', 'history_positions', 'history_yaws', 'history_velocities', 'history_availabilities', 'world_to_image', 'raster_from_agent', 'raster_from_world', 'agent_from_world', 'world_from_agent', 'centroid', 'yaw', 'extent', 'history_extents', 'future_extents', 'curr_speed', 'scene_index', 'host_id', 'timestamp', 'track_id'])

In [37]:
scene_ego_dataset = train_ego_dataset.get_scene_dataset(60)
scene_agent_dataset = train_agent_dataset.get_scene_dataset(60)
frame_dic_array = transform_one_scene_dataset(scene_ego_dataset,scene_agent_dataset,debug=False)

In [38]:
from tqdm import tqdm,trange

input_dim = len(frame_dic_array[0]["x"][0])
model_gcn = GCN(input_dim=input_dim, hidden_dim=args["hidden_dim"], output_dim = cfg["model_params"]["future_num_frames"] * 2,num_layers=args["num_layers"],dropout=args["dropout"],return_embeds = True, whether_dropout=False)

model_gcn = model_gcn.to(device=args['device'])

print(model_gcn.eval())
epoch = flags.epoch

optimizer = torch.optim.Adam(model_gcn.parameters(), lr=args["lr"])

GCN(
  (convs): ModuleList(
    (0): GCNConv(45, 256)
    (1): GCNConv(256, 100)
  )
  (softmax): LogSoftmax(dim=None)
)


In [72]:
def train_with_batch(model, device, data_loader, optimizer, loss_fn):
    model.train()
    loss = 0

    # for step, batch in enumerate(tqdm(data_loader, desc="Iteration")):
    for batch in data_loader:
        batch = batch.to(device)
        if batch.x.shape[0] == 1 or batch.batch[-1] == 0:
            pass
        else:
            optimizer.zero_grad()
            out=model(batch)
            train_output=out.view([-1,50,2]).to(device)
            train_label = batch.y[:,:,0:2]
            train_availabilities = torch.squeeze(batch.y[:,:,0:1],dim=-1)
            loss=loss_fn(train_label,train_output,train_availabilities)
            loss.backward()
            optimizer.step()

    return loss.item()

def train(model, value_dic, train_idx, optimizer, loss_fn):
    model.train()
    loss = 0
    optimizer.zero_grad()

    out=model(torch.tensor(value_dic["x"]),edge_index = torch.tensor(value_dic["edge_index"],dtype=torch.long), edge_attr = torch.tensor(value_dic["edge_attr"]))
    if len(out.shape) <= 2:
        out = torch.unsqueeze(out,0)

    train_output=out.view([-1,50,2]).to(args['device'])  # 这里暂时是全部训练
    train_label= torch.tensor(value_dic["target_positions"], dtype = torch.float).to(args['device'])
    train_availabilities = torch.tensor(value_dic["target_availabilities"],dtype= torch.int).to(args['device'])
    loss=loss_fn(train_label,train_output,train_availabilities) # 只预测一条路
    loss.backward()
    optimizer.step()

    return loss.item()

In [71]:
@torch.no_grad()
def test_with_batch(model, device, data_loader, loss_fn):

    model.eval()
    y_true = []
    y_pred = []
    loss_array = []

    # for step, batch in enumerate(tqdm(data_loader, desc="Iteration")):
    for batch in data_loader:
        batch = batch.to(device)
        if batch.x.shape[0] == 1 or batch.batch[-1] == 0:
            pass
        else:
            out=model(batch)
            train_output=out.view([-1,50,2]).to(device)
            train_label = batch.y[:,:,0:2]
            train_availabilities = torch.squeeze(batch.y[:,:,0:1],dim=-1)
            loss=loss_fn(train_label,train_output,train_availabilities)
            y_true.append(train_label.detach().cpu())
            y_pred.append(train_output.detach().cpu())
            loss_array.append(loss.item())



    return y_true,y_pred,loss_array

In [59]:
from torch_geometric.data import Data
from torch_geometric.loader import  DataLoader
def convert_framearray_to_dataloader(frame_dic_array):
    pyg_data_list = []
    for ele in frame_dic_array:
        pyg_data_list.append(Data(x = ele["x"],edge_index=ele["edge_index"],edge_attr=ele["edge_attr"],y=torch.cat([ele["target_positions"],torch.unsqueeze(ele["target_availabilities"],dim=-1)],dim = -1)))
    return DataLoader(pyg_data_list,batch_size=args['batch_size_gcn'])



In [65]:
def load_all_data_into_memory(ego_dataset,agent_dataset,start_index,end_index):
    return_array = []
    for index_ in range(start_index,end_index):

        scene_ego_dataset = ego_dataset.get_scene_dataset(index_)
        scene_agent_dataset = agent_dataset.get_scene_dataset(index_)
        frame_dic_array = transform_one_scene_dataset(scene_ego_dataset,scene_agent_dataset,debug=False)
        dataloader = convert_framearray_to_dataloader(frame_dic_array)
        return_array.append(dataloader)
    return  return_array

def prepare_dataloader_array():
    train_dataloader = load_all_data_into_memory(train_ego_dataset,train_agent_dataset,args['train_scene_start'],args['train_scene_end'])
    test_dataloader = load_all_data_into_memory(test_ego_dataset,test_agent_dataset,args['test_scene_start'],args['test_scene_end'])
    return train_dataloader,test_dataloader


In [None]:
train_dataloader,test_dataloader = prepare_dataloader_array()

In [76]:
import sys
print(len(train_dataloader))
# print(sys.getsizeof(train_dataloader[0]) / 1024 / 1024, 'MB')
# for i in trange(0,100):
#     print(i)

89


In [87]:
import time




test_file = open( 'test_output.txt', 'w+',encoding = 'utf-8' )
for epoch in trange(1, 1 + args["epochs"]):
    loss_whole_scene_list = []
    loss_this_epoch = 0
    # for train_scene_index in trange(0,args['train_scene_end']-args['train_scene_start']):
    for train_scene_index in range(0,args['train_scene_end']-args['train_scene_start']):

        dataloader = train_dataloader[train_scene_index]


        loss_each_frame = train_with_batch(model_gcn,args["device"],dataloader,optimizer,pytorch_neg_multi_log_likelihood_single)
        loss_this_scene = np.mean(np.array(loss_each_frame))
        loss_this_epoch = loss_this_epoch + loss_this_scene

    # print(loss_this_epoch)
    if epoch % 10 == 1:
        for test_scene_index in range(0,args['test_scene_end']-args['test_scene_start']):

            dataloader = test_dataloader[test_scene_index]
            y_true,y_pred,loss = test_with_batch(model_gcn,args["device"],dataloader,pytorch_neg_multi_log_likelihood_single)
            test_file.write(str(loss))
            print(str(loss))

test_file.close()


  3%|██▊                                                                                   | 1/30 [00:00<00:28,  1.03it/s]

[647393.342000727]
[647002.188674229]
[266954.2976944361]
[179918.9125562549]
[287528.4271578373]
[47910.55611279118]
[47520.57730185112]
[16104.03513535693]
[252815.5855549566]
[171969.06117824133]
[13749.986412226992]
[1554404.1064684237]
[4760078.55407086]
[249091.41864932163]
[1030237.5139781014]
[728282.2960894003]
[1085630.1224063027]
[148973.63871449142]
[59225.46275559492]
[124765.53129293691]


 37%|███████████████████████████████▏                                                     | 11/30 [00:09<00:16,  1.15it/s]

[650625.7522564431]
[652877.1916116371]
[268943.0572492163]
[182512.56874179989]
[288742.4733879034]
[49074.28936321594]
[47204.82622315638]
[15614.920612944823]
[253275.20450429656]
[172629.22585301186]
[13972.72075478393]
[1540322.184937909]
[4724165.268985745]
[245291.2084626822]
[1020859.933374728]
[725364.6420098712]
[1089556.7431028723]
[148976.8217255361]
[58982.64408457332]
[124944.4216489022]


 70%|███████████████████████████████████████████████████████████▍                         | 21/30 [00:17<00:07,  1.15it/s]

[652774.6914742804]
[659071.9615227829]
[269467.47584993875]
[184583.53780154674]
[289357.4018570712]
[50628.924012689335]
[47047.371912733746]
[15548.008527054573]
[253032.12451184992]
[173458.8366506649]
[14107.142589349574]
[1528108.3969543693]
[4694049.073076369]
[240901.46151088143]
[1012455.5829319184]
[723594.6471719306]
[1091850.541712521]
[149526.8187291877]
[58765.16397829528]
[125447.96518156219]


100%|█████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:25<00:00,  1.17it/s]


In [None]:
print(frame_dic_array[11])

You can obtrain training history results really easily by just accessing `LogReport` class, which is useful for managing a lot of experiments during kaggle competitions.

The history log and model's weight are saved by "extensions" (`LogReport` and `E.snapshot_object` respectively) easily, which is a benefit of using training abstration.

In [None]:
# Let's see training results directory

!ls results/multi_train

# Items to try

This kernel shows demonstration run of the training (`debug=True`). You can try these things to see how the score changes at first
 - set debug=False to train with actual training dataset
 - change training hyperparameters (training epoch, change optimizer, scheduler learning rate etc...)
   - Especially, just training much longer time may improve the score.
 
To go further, these items may be nice to try:
 - Change the cnn model (now simple resnet18 is used as baseline modeling)
 - Training the model using full dataset: [lyft-full-training-set](https://www.kaggle.com/philculliton/lyft-full-training-set)
 - Write your own rasterizer to prepare input image as motivation explained in previous kernel.
 - Consider much better scheme to predict multi-trajectory
    - The model just predicts multiple trajectory at the same time in this kernel, but it is possible to collapse "trivial" solution where all trajectory converges to same. How to avoid this?

# Next to go

[Update 2020/9/6]<br/>
Published prediction kernel: [Lyft: Prediction with multi-mode confidence](https://www.kaggle.com/corochann/lyft-prediction-with-multi-mode-confidence)<br/>
Try yourself how good score you can get using only single model without ensemble! :)

To understand the competition in more detail, please refer my other kernels too.
 - [Lyft: Comprehensive guide to start competition](https://www.kaggle.com/corochann/lyft-comprehensive-guide-to-start-competition)
 - [Lyft: Deep into the l5kit library](https://www.kaggle.com/corochann/lyft-deep-into-the-l5kit-library)
 - [Save your time, submit without kernel inference](https://www.kaggle.com/corochann/save-your-time-submit-without-kernel-inference)
 - [Lyft: pytorch implementation of evaluation metric](https://www.kaggle.com/corochann/lyft-pytorch-implementation-of-evaluation-metric)

# Further reference

 - Paper of this Lyft Level 5 prediction dataset: [One Thousand and One Hours: Self-driving Motion Prediction Dataset](https://arxiv.org/abs/2006.14480)
 - [jpbremer/lyft-scene-visualisations](https://www.kaggle.com/jpbremer/lyft-scene-visualisations)

<h3 style="color:red">If this kernel helps you, please upvote to keep me motivated :)<br>Thanks!</h3>