Training using our own dataset #771

SaMeEr9597 · 2022-01-23T04:06:21Z

Good evening,
First, thank you for this project. I have a question about training on our own dataset. The demo shows how to save our own point cloud data in numpy format and test it with a pre-trained model. Is there a way to train on our own dataset as well? If so, could you please guide me through it?
Thanks for your help

155cannon · 2022-01-27T09:29:22Z

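Let me share how I trained on my own dataset and how I labeled point clouds to get 3D bounding boxes:

1. Labeling point clouds to get 3D bounding boxes
I use the CloudCompare software. I started with some .pcd files. After opening one in CloudCompare, use the cross section tool to box the object you want to detect and extract it as a new point cloud. With that new cloud selected, the properties panel shows the object's center coordinates and the length, width, and height of its 3D bounding box. Store these 6 values in a label file (you can copy the text from the properties panel with ctrl+c, then write a simple preprocessing program to turn it into the label file described below).
Use CloudCompare to re-save the .pcd file with everything except the xyz coordinates stripped out, saving it as a .txt file.
The end result is:
Point cloud files 0000.txt ~ xxxx.txt (each line: x y z)
Label files 0000.txt ~ xxxx.txt (each line: x, y, z, dx, dy, dz, heading). x, y, z, dx, dy, dz are the 6 values mentioned above, in that order. I set heading to 1.57 (90°) everywhere; adjust this as needed, that is simply what my case required.
This directly gives point clouds and 3D bounding boxes in OpenPCDet's unified coordinate system.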
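The "simple preprocessing program" is not shown in the post. A minimal sketch of the last step might look like the following, assuming the six values have already been read off CloudCompare's properties panel. The function names `format_label_line` and `write_label_file` are hypothetical. Values are separated by whitespace because the dataset class in this post reads the files with `np.loadtxt` and its default delimiter.

```python
from pathlib import Path


def format_label_line(center, size, heading=1.57):
    """Return one label line: x y z dx dy dz heading, whitespace-separated."""
    if len(center) != 3 or len(size) != 3:
        raise ValueError("center and size must each have three values")
    if any(s <= 0 for s in size):
        raise ValueError("box dimensions must be positive")
    values = [*center, *size, heading]
    return " ".join(f"{v:.4f}" for v in values)


def write_label_file(path, boxes, heading=1.57):
    """Write one line per (center, size) pair to a label file (hypothetical helper)."""
    lines = [format_label_line(c, s, heading) for c, s in boxes]
    Path(path).write_text("\n".join(lines) + "\n")


# Example: one box centred at (30.5, 6.2, 1.1) with size 1.8 x 1.2 x 1.2.
print(format_label_line((30.5, 6.2, 1.1), (1.8, 1.2, 1.2)))
# 30.5000 6.2000 1.1000 1.8000 1.2000 1.2000 1.5700
```

The fixed heading of 1.57 follows the post; pass a different value per box if your objects are not all oriented the same way.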

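2. Training on your own dataset
I read many earlier posts. Some people build their dataset in the KITTI format and reuse the original code; others write their own dataset.py. After some thought I chose the latter. I read demo.py, kitti_dataset.py and dataset.py fairly carefully (other people have also published annotated walkthroughs of OpenPCDet online that are worth a look), and once I understood enough I started writing my own dataset.py modeled on them. The main code follows: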

import glob
import numpy as np
from pathlib import Path
from ..dataset import DatasetTemplate


class CoilsDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None):
        """
        Args:
            root_path:
            dataset_cfg:
            class_names:
            training:
            logger:
        """
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )

        points_file_list = glob.glob(str(self.root_path / 'training/points' / '*.txt'))
        labels_file_list = glob.glob(str(self.root_path / 'training/labels' / '*.txt'))

        # Both lists are sorted so that index i refers to the same sample in each.
        points_file_list.sort()
        labels_file_list.sort()
        self.sample_file_list = points_file_list
        self.samplelabel_file_list = labels_file_list

    def __len__(self):
        return len(self.sample_file_list)

    def __getitem__(self, index):
        # 0000.txt -> 0000: sample id (file number). n: number of points, m: number of labeled boxes.
        sample_idx = Path(self.sample_file_list[index]).stem
        # All points in the point cloud file, shape (n, 3).
        points = np.loadtxt(self.sample_file_list[index], dtype=np.float32).reshape(-1, 3)
        # All boxes in the label file, shape (m, 7).
        points_label = np.loadtxt(self.samplelabel_file_list[index], dtype=np.float32).reshape(-1, 7)
        gt_names = np.array(['Coil'] * points_label.shape[0])

        input_dict = {
            'points': points,
            'frame_id': sample_idx,
            'gt_names': gt_names,
            'gt_boxes': points_label
        }

        data_dict = self.prepare_data(data_dict=input_dict)
        return data_dict

Above is my coils_dataset.py. It sits in the same directory as kitti_dataset.py.
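Because the class sorts the two file lists independently and pairs them by position, a single missing or extra file would silently pair points with the wrong labels. A small pre-flight check can catch that before training. This is only a sketch: `check_pairs` is a hypothetical helper, and it assumes the `training/points` and `training/labels` layout used above with whitespace-separated values.

```python
from pathlib import Path


def _bad_lines(path, width):
    """Yield 1-based numbers of non-empty lines that are not `width` floats."""
    for number, line in enumerate(path.read_text().splitlines(), 1):
        fields = line.split()
        if not fields:
            continue
        try:
            [float(field) for field in fields]
        except ValueError:
            yield number
            continue
        if len(fields) != width:
            yield number


def check_pairs(root):
    """Return a list of problems found under root/training/{points,labels}."""
    root = Path(root)
    points = {p.stem: p for p in (root / "training" / "points").glob("*.txt")}
    labels = {p.stem: p for p in (root / "training" / "labels").glob("*.txt")}
    problems = []
    # Stems present on only one side would shift the positional pairing.
    for stem in sorted(points.keys() ^ labels.keys()):
        problems.append(f"{stem}: missing points or labels file")
    for stem in sorted(points.keys() & labels.keys()):
        for kind, path, width in (("points", points[stem], 3), ("labels", labels[stem], 7)):
            for number in _bad_lines(path, width):
                problems.append(f"{kind}/{path.name} line {number}: expected {width} numbers")
    return problems
```

Calling `check_pairs('../data/coils')` (the `DATA_PATH` from the config in this post) should return an empty list when every sample has both files and every line parses.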

DATASET: 'CoilsDataset'
DATA_PATH: '../data/coils'

POINT_CLOUD_RANGE: [0, -40, -3, 70.4, 40, 1]

DATA_SPLIT: {
    'train': train,
    'test': val
}

INFO_PATH: {
    'train': [kitti_infos_train.pkl],
    'test': [kitti_infos_val.pkl],
}

GET_ITEM_LIST: ["points"]
FOV_POINTS_ONLY: True

DATA_AUGMENTOR:
    DISABLE_AUG_LIST: ['placeholder']
    AUG_CONFIG_LIST:
        - NAME: random_world_flip
          ALONG_AXIS_LIST: ['x']

        - NAME: random_world_rotation
          WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

        - NAME: random_world_scaling
          WORLD_SCALE_RANGE: [0.95, 1.05]

POINT_FEATURE_ENCODING: {
    encoding_type: absolute_coordinates_encoding,
    used_feature_list: ['x', 'y', 'z'],
    src_feature_list: ['x', 'y', 'z'],
}

DATA_PROCESSOR:
    - NAME: mask_points_and_boxes_outside_range
      REMOVE_OUTSIDE_BOXES: True

    - NAME: shuffle_points
      SHUFFLE_ENABLED: {
        'train': True,
        'test': False
      }

    - NAME: transform_points_to_voxels
      VOXEL_SIZE: [0.05, 0.05, 0.1]
      MAX_POINTS_PER_VOXEL: 5
      MAX_NUMBER_OF_VOXELS: {
        'train': 16000,
        'test': 40000
      }

Above is my coils_dataset.yaml. It sits in the same directory as kitti_dataset.yaml.

CLASS_NAMES: ['Coil']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/coils_dataset.yaml
    POINT_CLOUD_RANGE: [24, 0, 0, 90.56, 12.8, 4.5]
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True

        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
            'train': True,
            'test': False
          }

        - NAME: transform_points_to_voxels
          VOXEL_SIZE: [0.16, 0.16, 4.5]
          MAX_POINTS_PER_VOXEL: 32
          MAX_NUMBER_OF_VOXELS: {
            'train': 16000,
            'test': 40000
          }
    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x']

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

MODEL:
    NAME: PointPillar

    VFE:
        NAME: PillarVFE
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [64]

    MAP_TO_BEV:
        NAME: PointPillarScatter
        NUM_BEV_FEATURES: 64

    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [3, 5, 5]
        LAYER_STRIDES: [2, 2, 2]
        NUM_FILTERS: [64, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False

        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2

        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Coil',
                'anchor_sizes': [[3.9, 1.6, 1.56]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.78],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            }
        ]

        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False

        EVAL_METRIC: kitti

        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10

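Above is my pointpillar.yaml, placed in tools\cfgs\coils_models. I removed the gt_sampling augmentation. You need to change POINT_CLOUD_RANGE and VOXEL_SIZE. POINT_CLOUD_RANGE is the extent of your point cloud: the xyz coordinates of its lower-left corner followed by the xyz coordinates of its upper-right corner. You can also crop the raw point cloud data to this range.
Because I use PointPillars, I set the third value of VOXEL_SIZE to the z extent of POINT_CLOUD_RANGE. The following requirements also have to be met:
point cloud range along z-axis / voxel_size is 40
point cloud range along x,y-axis / voxel_size is a multiple of 16.
These come from members of the OpenPCDet team.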
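The x and y requirement can be checked with a few lines of arithmetic before starting a run. The sketch below uses the POINT_CLOUD_RANGE and VOXEL_SIZE from the pointpillar.yaml above; `grid_size` and `check_pillar_grid` are hypothetical helper names. Note that this config has a single voxel along z (4.5 / 4.5 = 1), which matches the pillar setup described in the post, so the "z-axis / voxel_size is 40" rule appears to be aimed at other, voxel-based models rather than at this one.

```python
def grid_size(point_cloud_range, voxel_size):
    """Number of voxels along x, y, z for a range [x0, y0, z0, x1, y1, z1]."""
    x0, y0, z0, x1, y1, z1 = point_cloud_range
    extents = (x1 - x0, y1 - y0, z1 - z0)
    cells = []
    for extent, voxel in zip(extents, voxel_size):
        ratio = extent / voxel
        # Tolerate floating-point noise, but reject genuinely fractional grids.
        if abs(ratio - round(ratio)) > 1e-6:
            raise ValueError(f"extent {extent} is not a whole number of voxels of size {voxel}")
        cells.append(round(ratio))
    return tuple(cells)


def check_pillar_grid(point_cloud_range, voxel_size, multiple=16):
    """Report whether the x/y grid is a multiple of `multiple` and z is one voxel."""
    nx, ny, nz = grid_size(point_cloud_range, voxel_size)
    return {
        "grid": (nx, ny, nz),
        "x_ok": nx % multiple == 0,
        "y_ok": ny % multiple == 0,
        "single_z_voxel": nz == 1,
    }


report = check_pillar_grid([24, 0, 0, 90.56, 12.8, 4.5], [0.16, 0.16, 4.5])
print(report["grid"])  # (416, 80, 1)
```

Here 416 = 16 × 26 and 80 = 16 × 5, so both horizontal axes satisfy the rule.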

Some code also needs to be modified:
Register the new CoilsDataset in pcdet\datasets\__init__.py:
from .kitti.coils_dataset import CoilsDataset
__all__ = {
    'DatasetTemplate': DatasetTemplate,
    'KittiDataset': KittiDataset,
    'CoilsDataset': CoilsDataset,
    'NuScenesDataset': NuScenesDataset,
    'WaymoDataset': WaymoDataset,
    'PandasetDataset': PandasetDataset,
    'LyftDataset': LyftDataset
}
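The reason the dictionary entry matters is that the `DATASET: 'CoilsDataset'` string in the YAML has to resolve to a class. A toy sketch of that lookup pattern follows. It assumes the loader indexes the dictionary by the `DATASET` value; the classes, `REGISTRY`, and `build_dataset` here are illustrative stand-ins, not pcdet's actual code.

```python
class DatasetTemplate:
    """Stand-in for pcdet's base class; it only stores its arguments."""

    def __init__(self, dataset_cfg, class_names, training=True):
        self.dataset_cfg = dataset_cfg
        self.class_names = class_names
        self.training = training


class CoilsDataset(DatasetTemplate):
    """Stand-in for the custom dataset class from this thread."""


# Hypothetical registry mirroring the name -> class dictionary shown above.
REGISTRY = {
    "DatasetTemplate": DatasetTemplate,
    "CoilsDataset": CoilsDataset,
}


def build_dataset(dataset_cfg, class_names, training=True):
    """Instantiate the class whose name matches dataset_cfg['DATASET']."""
    name = dataset_cfg["DATASET"]
    if name not in REGISTRY:
        raise KeyError(f"{name!r} is not registered; known: {sorted(REGISTRY)}")
    return REGISTRY[name](dataset_cfg, class_names, training)


dataset = build_dataset({"DATASET": "CoilsDataset"}, ["Coil"])
print(type(dataset).__name__)  # CoilsDataset
```

If the YAML name and the dictionary key differ even in capitalization, the lookup fails, so the two strings need to match exactly.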

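That is basically all the code that needs to be written or changed. I also deleted all the evaluation code from train.py. Now training can start!

After training, find the generated .pth file and use it to check the results. I made a few simple changes to demo.py: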

import argparse
import glob
from pathlib import Path

# Prefer open3d for visualization and fall back to mayavi if it is unavailable.
try:
    import open3d
    from visual_utils import open3d_vis_utils as V
    OPEN3D_FLAG = True
except:
    import mayavi.mlab as mlab
    from visual_utils import visualize_utils as V
    OPEN3D_FLAG = False

import numpy as np
import torch

from pcdet.config import cfg, cfg_from_yaml_file
from pcdet.datasets import DatasetTemplate
from pcdet.models import build_network, load_data_to_gpu
from pcdet.utils import common_utils


class DemoDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None, ext='.txt'):
        """
        Args:
            root_path:
            dataset_cfg:
            class_names:
            training:
            logger:
        """
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.root_path = root_path
        self.ext = ext
        data_file_list = glob.glob(str(root_path / f'*{self.ext}')) if self.root_path.is_dir() else [self.root_path]

        data_file_list.sort()
        self.sample_file_list = data_file_list

    def __len__(self):
        return len(self.sample_file_list)

    def __getitem__(self, index):
        if self.ext == '.txt':
            # Each line is "x y z"; reshape guards against single-point files.
            points = np.loadtxt(self.sample_file_list[index], dtype=np.float32).reshape(-1, 3)
        else:
            raise NotImplementedError

        input_dict = {
            'points': points,
            'frame_id': index,
        }

        data_dict = self.prepare_data(data_dict=input_dict)
        return data_dict


def parse_config():
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default='cfgs/kitti_models/second.yaml',
                        help='specify the config for demo')
    parser.add_argument('--data_path', type=str, default='demo_data',
                        help='specify the point cloud data file or directory')
    parser.add_argument('--ckpt', type=str, default=None, help='specify the pretrained model')
    parser.add_argument('--ext', type=str, default='.txt', help='specify the extension of your point cloud data file')

    args = parser.parse_args()

    cfg_from_yaml_file(args.cfg_file, cfg)

    return args, cfg


def main():
    args, cfg = parse_config()
    logger = common_utils.create_logger()
    logger.info('-----------------Quick Demo of OpenPCDet-------------------------')
    demo_dataset = DemoDataset(
        dataset_cfg=cfg.DATA_CONFIG, class_names=cfg.CLASS_NAMES, training=False,
        root_path=Path(args.data_path), ext=args.ext, logger=logger
    )
    logger.info(f'Total number of samples: \t{len(demo_dataset)}')

    model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=demo_dataset)
    model.load_params_from_file(filename=args.ckpt, logger=logger, to_cpu=True)
    model.cuda()
    model.eval()
    with torch.no_grad():
        for idx, data_dict in enumerate(demo_dataset):
            logger.info(f'Visualized sample index: \t{idx + 1}')
            data_dict = demo_dataset.collate_batch([data_dict])
            load_data_to_gpu(data_dict)
            pred_dicts, _ = model.forward(data_dict)

            # Column 0 of the collated points is the batch index, so it is dropped here.
            V.draw_scenes(
                points=data_dict['points'][:, 1:], ref_boxes=pred_dicts[0]['pred_boxes'],
                ref_scores=pred_dicts[0]['pred_scores'], ref_labels=pred_dicts[0]['pred_labels']
            )

            if not OPEN3D_FLAG:
                mlab.show(stop=True)

    logger.info('Demo done.')


if __name__ == '__main__':
    main()

Then you can look at the results!

Finally, thanks to the OpenPCDet team members for their hard work. My respect to you all!

jihanyang · 2022-01-27T10:56:58Z

@155cannon Many thanks for kindly sharing! I hope you can also post this comment in issue #253, which collects most of the discussion about custom datasets. It would also be more helpful if you could post it in English, since that is more widely used in our community.

VsionQing · 2022-02-23T02:52:30Z

Hi, we encountered a problem when training on the KITTI dataset. No error was reported, but we can't train. #820

yun9993 · 2022-02-23T07:54:39Z

Hello @155cannon, would you mind sharing your modified source files? Training on my own dataset has my head spinning. My email is 2448115060@qq.com. Thanks!

VsionQing · 2022-02-23T09:05:22Z

> Hello @155cannon, would you mind sharing your modified source files? Training on my own dataset has my head spinning. My email is 2448115060@qq.com. Thanks!

Are you training KITTI on Windows?

yun9993 · 2022-02-23T09:08:58Z

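> Hello @155cannon, would you mind sharing the files you modified?

> Are you training KITTI on Windows?

I made a small dataset with PCTA. I want to try training a model on my own dataset under Ubuntu, but I am still not sure how the dataset should be prepared.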

jeacwen · 2022-02-25T02:13:00Z

Hi, would you mind sharing the complete modified source code? My email is [jeacwen@163.com].

gashel · 2022-04-15T09:59:40Z

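@155cannon Hello, your code trains fine, but there is no performance evaluation after training, and training with the evaluation code added back raises an error. Have you made any follow-up changes? Thanks~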

155cannon · 2022-04-26T06:44:24Z

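> @155cannon Hello, your code trains fine, but there is no performance evaluation after training, and training with the evaluation code added back raises an error. Have you made any follow-up changes? Thanks~

I have not looked at the evaluation code. Training with the evaluation code added will certainly raise errors, which is why I removed it. I am not doing academic research; I just use the model and judge the results directly.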

shandongchong · 2022-05-18T06:50:43Z

@155cannon
Hello, did you generate pkl files from your custom data for training?

VsionQing · 2022-05-18T06:51:04Z

Received, thanks

github-actions · 2022-06-18T02:12:03Z

This issue is stale because it has been open for 30 days with no activity.

ghb0224 · 2022-06-29T16:47:16Z

@155cannon Hello, could you share your source files? I have run into many problems with this. Thanks. gong20160317@163.com

Christina-Soda · 2022-07-04T04:14:58Z

@155cannon Could you share your source files? A very grateful research beginner here! xinxinran_0288@126.com

OrangeSodahub · 2022-07-18T03:48:49Z

@Christina-Soda @ghb0224 @ccsself @jeacwen @yun9993 @VsionQing
Hello, you can refer to my working example that uses a custom dataset in KITTI format. The README describes how to label the data, train, and run inference, including the coordinate transformations. It may solve your problems!
https://github.com/OrangeSodahub/CRLFnet#lid-cam-fusion
https://github.com/OrangeSodahub/CRLFnet/blob/master/src/site_model/src/LidCamFusion/OpenPCDet/pcdet/datasets/custom/README.md

JulyLi2019 · 2022-08-10T02:24:21Z

Hello, I ran into the following error during training. Have you seen it before?
RecursionError: maximum recursion depth exceeded in comparison
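
The thread does not diagnose this error. One possible cause, offered as an assumption about how the base dataset's `prepare_data` behaves and not as a confirmed fix: when a training sample is left with no ground-truth boxes (for example because `gt_names` do not match `CLASS_NAMES`, or every box lies outside POINT_CLOUD_RANGE), the loader retries with another random sample, and if no sample has a usable box the retries never end. The hypothetical helper below counts usable boxes for one sample so that empty samples can be spotted. Its centre-in-range test is a simplification of the real range filter.

```python
def count_usable_boxes(boxes, names, class_names, point_cloud_range):
    """Count boxes with a known class and a centre inside the range.

    boxes: rows of [x, y, z, dx, dy, dz, heading]; names: one class per row.
    """
    x0, y0, z0, x1, y1, z1 = point_cloud_range
    usable = 0
    for box, name in zip(boxes, names):
        x, y, z = box[:3]
        inside = x0 <= x <= x1 and y0 <= y <= y1 and z0 <= z <= z1
        if name in class_names and inside:
            usable += 1
    return usable


pc_range = [24, 0, 0, 90.56, 12.8, 4.5]
boxes = [
    [30.0, 6.0, 1.0, 1.8, 1.2, 1.2, 1.57],  # centre inside the range
    [10.0, 6.0, 1.0, 1.8, 1.2, 1.2, 1.57],  # x is below the lower bound of 24
]
print(count_usable_boxes(boxes, ["Coil", "Coil"], ["Coil"], pc_range))  # 1
print(count_usable_boxes(boxes, ["coil", "coil"], ["Coil"], pc_range))  # 0
```

If every sample in the training set reports zero, checking the class-name spelling and the POINT_CLOUD_RANGE against the label coordinates would be a reasonable first step.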

VsionQing · 2022-08-10T02:24:44Z

Received, thanks

brillint · 2022-09-16T04:01:18Z

Hello @155cannon, would you mind sharing your modified source files? My email is 313089570@qq.com. Many thanks!

github-actions · 2022-10-17T02:13:32Z

This issue is stale because it has been open for 30 days with no activity.

github-actions · 2022-11-01T02:13:15Z

This issue was closed because it has been inactive for 14 days since being marked as stale.

xcc2731594 · 2022-12-02T12:49:52Z

> Hello, I ran into the following error during training. Have you seen it before? RecursionError: maximum recursion depth exceeded in comparison

Hello, have you solved this problem? I am running into it too.

LiangziWang9 · 2023-10-18T08:20:54Z

Hello, could you share the source files? My email is 769487692@qq.com. Thanks! @155cannon

daofeng2007 · 2023-11-07T06:16:20Z

I followed the instructions provided by @155cannon and received the error below when I tried to train the model.

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1250 but got size 1252 for tensor number 2 in the list.

The point cloud range and voxel size are set as follows.
POINT_CLOUD_RANGE: [-200, -200, -32, 200, 200, 32]
VOXEL_SIZE: [0.16, 0.16, 64]

How can I fix this error?

LiangziWang9 · 2023-11-08T02:45:09Z

You need to modify the point_cloud_range. Please refer to docs/CUSTOM_DATASET_TUTORIAL.md: the point cloud range along the x and y axes divided by voxel_size must be a multiple of 16. @daofeng2007
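
The numbers in the error message can be reproduced by hand. The sketch below assumes that each stage of the 2D backbone downsamples with a 3x3 convolution (padding 1) and upsamples with a transposed convolution whose kernel equals its stride, using the LAYER_STRIDES and UPSAMPLE_STRIDES from the pointpillar.yaml earlier in this thread. `bev_feature_sizes` and `snap_up` are hypothetical helper names.

```python
def bev_feature_sizes(cells, layer_strides=(2, 2, 2), upsample_strides=(1, 2, 4)):
    """Return the upsampled feature-map size produced by each backbone stage."""
    sizes = []
    n = cells
    for down, up in zip(layer_strides, upsample_strides):
        n = (n + 2 - 3) // down + 1  # 3x3 conv, padding 1, stride `down`
        sizes.append(n * up)         # transposed conv with kernel == stride
    return sizes


def snap_up(cells, multiple=16):
    """Round a cell count up to the next multiple of `multiple`."""
    return -(-cells // multiple) * multiple


cells = round(400 / 0.16)             # range of 400 m at 0.16 m per cell
print(cells)                          # 2500
print(bev_feature_sizes(cells))       # [1250, 1250, 1252]
print(snap_up(cells))                 # 2512
print(bev_feature_sizes(snap_up(cells)))  # [1256, 1256, 1256]
```

With 2500 cells the third stage yields 1252 while the first two yield 1250, which matches "Expected size 1250 but got size 1252 for tensor number 2". Rounding up to 2512 cells corresponds to an extent of 2512 × 0.16 = 401.92, for example an x and y range of [-200.96, 200.96]; shrinking the range to the next lower multiple would work equally well.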

Leozyc-waseda mentioned this issue Apr 29, 2022

Problems using custom data sets #253

Closed

github-actions bot added the stale label Jun 18, 2022

github-actions bot removed the stale label Jun 30, 2022

OrangeSodahub mentioned this issue Jul 26, 2022

Custom dataset support #1032

Closed

github-actions bot added the stale label Oct 17, 2022

github-actions bot closed this as completed Nov 1, 2022
