About other datasets #18

Zoeeeing · 2022-03-08T06:01:02Z

Hi, have you experimented on other outdoor datasets such as nuScenes? When I used SST to train on the nuScenes dataset, the results were not ideal. I only modified the voxel-size hyperparameters and replaced the head. Could you tell me whether there is a problem with this setup?
Thanks!

Abyssaledge · 2022-03-08T07:18:50Z

Thanks for using SST.
No, we have not tried SST on nuScenes. But if you share your config and detailed results, maybe we can help you.

Zoeeeing · 2022-03-08T08:13:52Z

Thanks!
The modified model is as follows:


voxel_size = (0.25, 0.25, 8)
window_shape = (16, 16, 1)
point_cloud_range = [-50, -50, -5, 50, 50, 3]
model = dict(
    type='DynamicVoxelNet',
    voxel_layer=dict(
        voxel_size=(0.25, 0.25, 8),
        max_num_points=-1,
        point_cloud_range=[-50, -50, -5, 50, 50, 3],
        max_voxels=(-1, -1)),
    voxel_encoder=dict(
        type='DynamicVFE',
        in_channels=4,
        feat_channels=[64, 128],
        with_distance=False,
        voxel_size=(0.25, 0.25, 8),
        with_cluster_center=True,
        with_voxel_center=True,
        point_cloud_range=[-50, -50, -5, 50, 50, 3],
        norm_cfg=dict(type='naiveSyncBN1d', eps=0.001, momentum=0.01)),
    middle_encoder=dict(
        type='SSTInputLayerV2',
        window_shape=(16, 16, 1),
        sparse_shape=(400, 400, 1),
        shuffle_voxels=True,
        debug=True,
        drop_info=({
            0: {
                'max_tokens': 100,
                'drop_range': (0, 100)
            },
            1: {
                'max_tokens': 200,
                'drop_range': (100, 200)
            },
            2: {
                'max_tokens': 250,
                'drop_range': (200, 10000)
            }
        }, {
            0: {
                'max_tokens': 100,
                'drop_range': (0, 100)
            },
            1: {
                'max_tokens': 200,
                'drop_range': (100, 200)
            },
            2: {
                'max_tokens': 256,
                'drop_range': (200, 10000)
            }
        }),
        pos_temperature=10000,
        normalize_pos=False),
    backbone=dict(
        type='SSTv2',
        d_model=[128, 128, 128, 128, 128, 128],
        nhead=[8, 8, 8, 8, 8, 8],
        num_blocks=6,
        dim_feedforward=[256, 256, 256, 256, 256, 256],
        output_shape=[400, 400],
        num_attached_conv=3,
        conv_kwargs=[
            dict(kernel_size=3, dilation=1, padding=1, stride=1),
            dict(kernel_size=3, dilation=1, padding=1, stride=1),
            dict(kernel_size=3, dilation=2, padding=2, stride=1)
        ],
        conv_in_channel=128,
        conv_out_channel=128,
        debug=True),
    neck=dict(
        type='SECONDFPN',
        norm_cfg=dict(type='naiveSyncBN2d', eps=0.001, momentum=0.01),
        in_channels=[128],
        upsample_strides=[1],
        out_channels=[384]),
    bbox_head=dict(
        type='Anchor3DHead',
        num_classes=10,
        in_channels=384,
        feat_channels=384,
        use_direction_classifier=True,
        anchor_generator=dict(
            type='AlignedAnchor3DRangeGenerator',
            ranges=[[-49.6, -49.6, -1.80032795, 49.6, 49.6, -1.80032795],
                    [-49.6, -49.6, -1.74440365, 49.6, 49.6, -1.74440365],
                    [-49.6, -49.6, -1.68526504, 49.6, 49.6, -1.68526504],
                    [-49.6, -49.6, -1.67339111, 49.6, 49.6, -1.67339111],
                    [-49.6, -49.6, -1.61785072, 49.6, 49.6, -1.61785072],
                    [-49.6, -49.6, -1.80984986, 49.6, 49.6, -1.80984986],
                    [-49.6, -49.6, -1.763965, 49.6, 49.6, -1.763965]],
            sizes=[[1.95017717, 4.60718145, 1.72270761],
                   [2.4560939, 6.73778078, 2.73004906],
                   [2.87427237, 12.01320693, 3.81509561],
                   [0.60058911, 1.68452161, 1.27192197],
                   [0.66344886, 0.7256437, 1.75748069],
                   [0.39694519, 0.40359262, 1.06232151],
                   [2.49008838, 0.48578221, 0.98297065]],
            custom_values=[0, 0],
            rotations=[0, 1.57],
            reshape_out=True),
        assigner_per_size=False,
        diff_rad_by_sin=True,
        dir_offset=0.7854,
        dir_limit_offset=0,
        bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0),
        loss_dir=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)),
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            iou_calculator=dict(type='BboxOverlapsNearest3D'),
            pos_iou_thr=0.6,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        allowed_border=0,
        code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2],
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        use_rotate_nms=True,
        nms_across_levels=False,
        nms_pre=1000,
        nms_thr=0.2,
        score_thr=0.05,
        min_bbox_size=0,
        max_num=500))
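A quick consistency check of the grid settings in this config can be sketched as follows. It only uses values copied from the config; the helper name `grid_shape` is made up for illustration, and the reading of `max_tokens` as "tokens kept per window" is an assumption about how `drop_info` is used.

```python
import math

# Values copied from the config above.
voxel_size = (0.25, 0.25, 8)
point_cloud_range = [-50, -50, -5, 50, 50, 3]
window_shape = (16, 16, 1)
sparse_shape = (400, 400, 1)
# max_tokens of the last bucket (key 2) in each of the two drop_info dicts.
max_tokens_last_bucket = (250, 256)


def grid_shape(pc_range, vsize):
    """Number of voxels along x, y, z implied by the range and voxel size."""
    return tuple(
        round((pc_range[i + 3] - pc_range[i]) / vsize[i]) for i in range(3)
    )


expected_shape = grid_shape(point_cloud_range, voxel_size)
window_capacity = math.prod(window_shape)  # voxels that fit in one window

print("grid implied by range / voxel size:", expected_shape)
print("matches sparse_shape:", expected_shape == sparse_shape)
print("voxels per window:", window_capacity)
for shift, max_tokens in enumerate(max_tokens_last_bucket):
    # Assumption: max_tokens should not exceed the window capacity.
    print(f"shift {shift}: max_tokens={max_tokens}, "
          f"fits window: {max_tokens <= window_capacity}")
```

The range and voxel size do imply the `sparse_shape` and `output_shape` of 400 x 400 used above. The only asymmetry is that the last bucket keeps 250 tokens in the first `drop_info` entry and 256 in the second; the thread does not say whether that is intended.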

After training for 24 epochs, I got the following detailed results.

pts_bbox_NuScenes metrics per class:

| class | AP_dist_0.5 | AP_dist_1.0 | AP_dist_2.0 | AP_dist_4.0 | trans_err | scale_err | orient_err | vel_err | attr_err |
|---|---|---|---|---|---|---|---|---|---|
| car | 0.4701 | 0.6067 | 0.6618 | 0.6832 | 0.2372 | 0.1477 | 0.1317 | 0.2814 | 0.2252 |
| truck | 0.0624 | 0.2224 | 0.3657 | 0.3988 | 0.5955 | 0.2285 | 0.2259 | 0.2660 | 0.2360 |
| trailer | 0.0000 | 0.0000 | 0.0073 | 0.0857 | 0.9790 | 0.2405 | 0.9358 | 0.3954 | 0.1308 |
| bus | 0.0105 | 0.1396 | 0.3895 | 0.4736 | 0.7881 | 0.1895 | 0.1455 | 0.6699 | 0.1602 |
| construction_vehicle | 0.0000 | 0.0036 | 0.0457 | 0.0629 | 0.9470 | 0.5084 | 1.3642 | 0.1244 | 0.4645 |
| bicycle | 0.0264 | 0.0287 | 0.0290 | 0.0298 | 0.1875 | 0.2586 | 0.8511 | 0.3377 | 0.0047 |
| motorcycle | 0.1205 | 0.1384 | 0.1415 | 0.1458 | 0.2381 | 0.2787 | 0.7527 | 0.6352 | 0.3060 |
| pedestrian | 0.5656 | 0.5758 | 0.5854 | 0.5960 | 0.1403 | 0.2611 | 0.3074 | 0.2499 | 0.0425 |
| traffic_cone | 0.0727 | 0.0775 | 0.0849 | 0.1073 | 0.1638 | 0.3195 | nan | nan | nan |
| barrier | 0.0680 | 0.2386 | 0.3307 | 0.3615 | 0.5643 | 0.2763 | 0.0374 | nan | nan |

pts_bbox_NuScenes/mATE: 0.4841, pts_bbox_NuScenes/mASE: 0.2709, pts_bbox_NuScenes/mAOE: 0.5280, pts_bbox_NuScenes/mAVE: 0.3700, pts_bbox_NuScenes/mAAE: 0.1962, pts_bbox_NuScenes/NDS: 0.4278, pts_bbox_NuScenes/mAP: 0.2253
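To see how much of the reported NDS comes from mAP and how much from the true-positive error terms, the score can be recomputed from the logged means. The sketch below uses the standard nuScenes detection score definition, NDS = (5 * mAP + sum over the five mean TP errors of (1 - min(1, err))) / 10; the function name is made up for illustration.

```python
# Mean metrics copied from the evaluation log above.
mean_ap = 0.2253
tp_errors = {
    "mATE": 0.4841,  # translation
    "mASE": 0.2709,  # scale
    "mAOE": 0.5280,  # orientation
    "mAVE": 0.3700,  # velocity
    "mAAE": 0.1962,  # attribute
}


def nuscenes_detection_score(mean_ap, tp_errors):
    """NDS = (5 * mAP + sum(1 - min(1, err))) / 10 over the five TP errors."""
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


nds = nuscenes_detection_score(mean_ap, tp_errors)
ap_part = 5.0 * mean_ap / 10.0
print(f"NDS recomputed from the means: {nds:.4f}")
print(f"contribution of mAP: {ap_part:.4f}, of TP errors: {nds - ap_part:.4f}")
```

The recomputed value is about 0.4277, which agrees with the logged 0.4278 up to rounding of the inputs. Most of the score comes from the TP error terms, so the low mAP of 0.2253 is the main weakness in this run.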

Abyssaledge · 2022-03-08T08:49:22Z

Your config looks fine to me. I am sorry that I do not have enough information to explain the poor results. We will try to run SST on nuScenes, but I cannot give a precise schedule for now.
My suggestion is to debug each component (backbone, head, etc.) on a small subset of the data. For example, change the anchor head to a center head to check whether the head module is correct.
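One way to get a small subset for this kind of debugging is sketched below. It assumes the training set is mmdet3d's `NuScenesDataset`, which accepts a `load_interval` argument and keeps every n-th sample; whether the SST nuScenes setup passes it through unchanged (for example if the dataset is wrapped in another dataset class) is an assumption, and the value 10 is arbitrary.

```python
# Sketch of a config override for a quick debugging run (not from the thread).
data = dict(
    train=dict(
        type='NuScenesDataset',
        load_interval=10,  # assumed: keep every 10th training sample
    ),
)


def subsample(infos, interval):
    """Mimic the effect of load_interval on a list of sample infos."""
    return infos[::interval]


# Illustration with placeholder sample ids instead of real annotation infos.
all_infos = list(range(100))
debug_infos = subsample(all_infos, data['train']['load_interval'])
print(f"{len(debug_infos)} of {len(all_infos)} samples kept")
```

With a subset like this, each component swap (for example a different head) can be compared in a fraction of the full training time before committing to a 24-epoch run.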

Zoeeeing · 2022-03-08T09:05:03Z

OK. I will debug the components and check back once you have run SST on nuScenes. Thanks for your work.

Devoe-97 · 2022-05-20T06:27:30Z

Hi, do you have more recent results on nuScenes? @Zoeeeing

Zoeeeing · 2022-05-24T08:40:40Z

@Devoe-97 Sorry, I have not been able to get better results.

gopi-erabati · 2022-09-05T12:11:03Z

@Abyssaledge did you try running experiments on the nuScenes dataset?
nuScenes has about 5 times fewer samples than Waymo. Could that explain the poor results when training from scratch on nuScenes? (Transformers are data hungry!)
What do you think about it?

Abyssaledge · 2022-09-07T15:47:59Z

@gopi231091 I have not run the experiments on nuScenes yet.
To my knowledge, SST is not that data-hungry. It performs better than the PointPillars baseline with 20% of the training data on Waymo.
However, its performance on nuScenes might be a little worse than the SOTA methods, because pillar-based models show inferior performance on nuScenes, as many researchers have observed.

Abyssaledge closed this as completed Mar 8, 2022

synsin0 mentioned this issue Mar 28, 2023

Have you tested masked SST with CenterHead? georghess/voxel-mae#9
