reproduce problem about swin-t in scene classification. #24

Open

Leiyi-Hu opened this issue Mar 23, 2023 · 1 comment

Comments

@Leiyi-Hu
Hi, I tried to follow your hyperparameters to reproduce the classification results in mmclassification. I trained AID (2:8) with max_epochs=200, base_lr=5e-4, and the other settings as follows:
_base_ = [
    # '../_base_/models/swin_transformer/base_224.py',
    # "../_base_/datasets/ucmerced_landuse_bs64_swin_224.py",
    "../_base_/datasets/aid_bs64_autoaug.py",
    "../_base_/schedules/imagenet_bs64_adamw_swin.py",
    "../_base_/default_runtime.py",
]
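# (Added note) The _base_ entries above are mmclassification-style config fragments:
# the dataset pipeline, the AdamW/Swin schedule, and the runtime defaults they define
# are inherited here and overridden by the settings below.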

# refer to SimMIM paper

ADJUST_FACTOR = 1.0
BATCH_SIZE = 64
BASE_LR = 5e-4 * ADJUST_FACTOR # todo: adjust.
WARMUP_LR = 5e-7 * ADJUST_FACTOR
MIN_LR = 5e-6 * ADJUST_FACTOR
NUM_GPUS = 1
DROP_PATH_RATE = 0.2
SCALE_FACTOR = 512.0
MAX_EPOCHS = 200

# model settings

model = dict(
    type="ImageClassifier",
    backbone=dict(
        type="SwinTransformer",
        # arch="base",
        arch="tiny",
        img_size=224,
        # drop_path_rate=0.1, # DROP_PATH_RATE
        drop_path_rate=DROP_PATH_RATE,
    ),
    neck=dict(type="GlobalAveragePooling"),
    head=dict(
        type="LinearClsHead",
        num_classes=21,
        # in_channels=1024,
        in_channels=768,
        init_cfg=None,  # suppress the default init_cfg of LinearClsHead.
        loss=dict(type="LabelSmoothLoss", label_smooth_val=0.1, mode="original"),
        cal_acc=False,
    ),
    init_cfg=[
        dict(type="TruncNormal", layer="Linear", std=0.02, bias=0.0),
        dict(type="Constant", layer="LayerNorm", val=1.0, bias=0.0),
    ],
    train_cfg=dict(
        augments=[
            dict(type="BatchMixup", alpha=0.8, num_classes=21, prob=0.5),
            dict(type="BatchCutMix", alpha=1.0, num_classes=21, prob=0.5),
        ]
    ),
)
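# (Added note) in_channels must match the final stage width of the backbone:
# Swin-T starts at 96 channels and doubles them over 3 downsampling stages,
# giving 96 * 2**3 = 768, while Swin-B uses 128 * 2**3 = 1024 (the commented-out
# value above). Also, AID has 30 scene classes whereas UC Merced has 21, so
# num_classes=21 may need to be changed when training on AID.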

# optimizer

paramwise_cfg = dict(
    norm_decay_mult=0.0,
    bias_decay_mult=0.0,
    custom_keys={
        ".absolute_pos_embed": dict(decay_mult=0.0),
        ".relative_position_bias_table": dict(decay_mult=0.0),
    },
)

optimizer = dict(
    type="AdamW",
    # lr=1e-3 * 64 / 256, # 5e-4 * 64 / 512, # 1e-3 * 64 / 256,
    # lr=1.25e-3 * 96 * 1 / 512.0,
    # BASE_LR * BATCH_SIZE * NUM_GPUS / 512.0, # 1e-3 * 64 / 256,
    lr=BASE_LR * BATCH_SIZE * NUM_GPUS / SCALE_FACTOR,
    weight_decay=0.05,
    eps=1e-8,
    betas=(0.9, 0.999),
    paramwise_cfg=paramwise_cfg,
)
optimizer_config = dict(grad_clip=dict(max_norm=5.0))
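# (Added note) With the linear scaling rule above, the effective peak AdamW
# learning rate for this run is
#     BASE_LR * BATCH_SIZE * NUM_GPUS / SCALE_FACTOR = 5e-4 * 64 * 1 / 512.0 = 6.25e-05,
# reached at the end of warmup; gradients are clipped to a max norm of 5.0.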

# learning policy

lr_config = dict(
    policy="CosineAnnealing",
    # min_lr=2.5e-7,
    # by_epoch=False, # todo: try
    by_epoch=False,
    # min_lr_ratio=(2.5e-7 * 96 * 1 / 512.0) / (1.25e-3 * 96 * 1 / 512.0), # 1e-2,
    min_lr_ratio=(MIN_LR * BATCH_SIZE * NUM_GPUS / SCALE_FACTOR)
    / (BASE_LR * BATCH_SIZE * NUM_GPUS / SCALE_FACTOR),
    # min_lr=2.5e-7, # MIN_LR,
    warmup="linear",
    # warmup_ratio=(2.5e-7 * 96 * 1 / 512.0) / (1.25e-3 * 96 * 1 / 512.0), # 1e-3,
    warmup_ratio=(WARMUP_LR * BATCH_SIZE * NUM_GPUS / SCALE_FACTOR)
    / (BASE_LR * BATCH_SIZE * NUM_GPUS / SCALE_FACTOR),
    # warmup_lr=2.5e-7, # WARMUP_LR,
    warmup_iters=20,  # todo: 0
    warmup_by_epoch=True,
)
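# (Added note) The BATCH_SIZE * NUM_GPUS / SCALE_FACTOR factors cancel in the two
# ratios, so min_lr_ratio = MIN_LR / BASE_LR = 1e-2 and
# warmup_ratio = WARMUP_LR / BASE_LR = 1e-3. With by_epoch=False the cosine decay
# is stepped per iteration, while warmup_iters=20 with warmup_by_epoch=True means
# a 20-epoch linear warmup.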

checkpoint_config = dict(interval=MAX_EPOCHS // 10)
evaluation = dict(
    interval=MAX_EPOCHS // 10, metric="accuracy", save_best="auto"
)  # save the checkpoint with highest accuracy
runner = dict(type="EpochBasedRunner", max_epochs=MAX_EPOCHS)
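# (Added note) MAX_EPOCHS // 10 = 20, so a checkpoint is saved and accuracy is
# evaluated every 20 epochs; save_best="auto" additionally keeps the
# highest-accuracy checkpoint.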

# data = dict(samples_per_gpu=96, workers_per_gpu=8,)

data = dict(samples_per_gpu=BATCH_SIZE, workers_per_gpu=8,)

# fp16 settings

fp16 = dict(loss_scale="dynamic")
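# (Added note) loss_scale="dynamic" enables mixed-precision training with a loss
# scale that is adjusted automatically when gradient overflows are detected
# (mmcv's Fp16OptimizerHook), rather than a fixed scaling constant.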

So could you help me with this, or provide your training log?
Thanks!

@DotWang
Collaborator

DotWang commented Mar 23, 2023
@Leiyi-Hu We use MAE, not SimMIM. In addition, our classification experiments do not use mmclassification. We have provided the corresponding code, and you should reproduce the results with it.
