
Could not detect any unseen class #40

Closed
LHdagong opened this issue May 16, 2022 · 5 comments

Comments

@LHdagong

I'm trying to split the dataset into 72 seen classes and 8 unseen classes. I trained on the newly split dataset, but testing went wrong: no unseen classes were detected. Could you please offer any advice? This is my config file.

# model settings
model = dict(
    type='ZeroShotFasterRCNN',
    pretrained='torchvision://resnet101',
    backbone=dict(
        type='ResNet',
        depth=101,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='BackgroundAwareRPNHead',
        in_channels=256,
        semantic_dims=300,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[0.5, 1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[.0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        voc_path=None,
        vec_path='data/coco/word_w2v_withbg_72_8.txt',
        sync_bg=True,
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
    bbox_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    bbox_head=dict(
        type='SharedFCSemanticBBoxHead',
        num_fcs=2,
        in_channels=256,
        fc_out_channels=1024,
        roi_feat_size=7,
        num_classes=73,
        semantic_dims=300,
        seen_class=True,
        reg_with_semantic=False,
        share_semantic=False,
        with_decoder=True,
        sync_bg=True,
        voc_path='data/coco/vocabulary_w2v.txt',
        vec_path='data/coco/word_w2v_withbg_72_8.txt',
        target_means=[0., 0., 0., 0.],
        target_stds=[0.1, 0.1, 0.2, 0.2],
        reg_class_agnostic=False,
        loss_semantic=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
    bbox_with_decoder=True,
    bbox_sync_bg=True)
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=100))
# dataset settings
dataset_type = 'CocoDataset_72_8'
data_root = 'data/coco/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=0,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2014_seen_72_8.json',
        img_prefix=data_root + 'train2014/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2014_seen_72_8.json',
        img_prefix=data_root + 'val2014/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2014_gzsi_72_8.json',
        img_prefix=data_root + 'val2014/',
        pipeline=test_pipeline))
# optimizer
# optimizer = dict(type='SGD', lr=0.0075, momentum=0.9, weight_decay=0.0001)
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=7000,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
checkpoint_config = dict(interval=12)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
evaluation = dict(interval=1)
# runtime settings
total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/zsi/faster/72_8/'
load_from = None
resume_from = None
workflow = [('train', 1)]
@zhengye1995
Owner

Your config file looks OK.

For the GZSD task, you need to change the code here: the number of categories is hardcoded to 65 or 48, so you should change it to the number of seen classes in your own split.
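
For illustration, the kind of change meant here might look like the following. This is a hypothetical sketch, not the repository's actual code (the real file, function, and variable names may differ): the hardcoded 48 and 65 presumably come from the standard 48/17 and 65/15 COCO splits, and they need to be replaced by the seen-class count of the custom split (72 here) wherever test-time post-processing separates seen from unseen predictions.

# Hypothetical sketch of where a hardcoded seen-class count breaks a custom split.
# The actual code in the repository may be organized differently; the point is
# that any place assuming 48 or 65 seen classes must use 72 for a 72/8 split.
def split_scores(scores, num_seen=72, num_unseen=8):
    """Split per-proposal class scores into background / seen / unseen parts.

    Assumes `scores` has shape (num_proposals, 1 + num_seen + num_unseen),
    with the background class in column 0 (consistent with num_classes=73
    in the config above, i.e. 72 seen classes + background). If num_seen
    were still hardcoded to 48 or 65, the unseen columns would be
    mis-indexed and no unseen class could be detected.
    """
    bg_scores = scores[:, :1]
    seen_scores = scores[:, 1:1 + num_seen]
    unseen_scores = scores[:, 1 + num_seen:1 + num_seen + num_unseen]
    return bg_scores, seen_scores, unseen_scores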

@LHdagong
Author

Thank you very much for your reply. I will try it again.

@LHdagong
Author

I ran the test again but still no unseen classes were detected. Do I still need to retrain? Or could there be an error elsewhere?

@zhengye1995
Owner

You do not need to retrain the model.

I found some errors in your config file, in the bbox_head:

  1. seen_class should be set to False so that the model outputs results for the unseen classes
  2. If you want to evaluate your model in the GZSD setting, you also need to set gzsd=True

Please check the two config files for the zsd (zsi) and gzsd (gzsi) settings here; a sketch of the corresponding bbox_head changes is shown below.
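
As a sketch only, the bbox_head of a GZSD test config for this 72/8 split would then look roughly like the block below. Only seen_class and gzsd differ from the config posted earlier in this thread; the exact set of keys should be double-checked against the official zsd/gzsd config files in the repository.

    bbox_head=dict(
        type='SharedFCSemanticBBoxHead',
        num_fcs=2,
        in_channels=256,
        fc_out_channels=1024,
        roi_feat_size=7,
        num_classes=73,
        semantic_dims=300,
        seen_class=False,   # output results for the unseen classes at test time
        gzsd=True,          # evaluate in the generalized zero-shot (GZSD) setting
        reg_with_semantic=False,
        share_semantic=False,
        with_decoder=True,
        sync_bg=True,
        voc_path='data/coco/vocabulary_w2v.txt',
        vec_path='data/coco/word_w2v_withbg_72_8.txt',
        target_means=[0., 0., 0., 0.],
        target_stds=[0.1, 0.1, 0.2, 0.2],
        reg_class_agnostic=False,
        loss_semantic=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),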

@LHdagong
Author

Thank you very much for your answer. I have solved it.
