Question about the dimensionality of the mask. #3

Closed
MsWik opened this issue Aug 28, 2021 · 4 comments
Comments

@MsWik

MsWik commented Aug 28, 2021

Thank you for your work.

It would be nice to see the actual performance of the models in FPS on specific hardware, particularly on devices like the Jetson.

Training requires the mask to be grayscale, yet in the file that describes the dataset, PALETTE has a dimension of 3. Can you tell me what the dimensionality of PALETTE should be (for example, should my labels be (1, 1, 1), or just 1, etc.)?

@sithu31296
Owner

Hello,
First, a speed comparison will be added in the near future.

About your question, the dimension of PALETTE should be (num_classes, 3). Each row is an (R, G, B) color for the corresponding categorical value in the 2-dimensional label image.
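
For example, a palette for a hypothetical 3-class dataset could look like the sketch below (class names and colors are made up, not taken from this repo):

```python
import torch

# Hypothetical 3-class example: each row of PALETTE is the (R, G, B) color
# for one class index, so PALETTE has shape (num_classes, 3).
CLASSES = ['background', 'road', 'car']
PALETTE = torch.tensor([
    [0, 0, 0],        # class 0 -> black
    [128, 64, 128],   # class 1 -> purple
    [0, 0, 142],      # class 2 -> dark blue
])

# The label image itself stays 2D: each pixel holds a class index.
label = torch.randint(0, len(CLASSES), (256, 256))   # dummy (H, W) label map
color_vis = PALETTE[label]                            # (H, W, 3) color image for visualization
```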

@MsWik
Author

MsWik commented Aug 30, 2021

The thing is, when I train with my own dataset, I get the following error:

{'DATASET': {'NAME': 'ade20k',
'ROOT': '/content/data/ADEChallenge/ADEChallengeData2016'},
'DEVICE': 'cuda',
'EVAL': {'IMAGE_SIZE': [256, 256],
'MODEL_PATH': 'checkpoints/pretrained/segformer/segformer.b0.ade.pth',
'MSF': {'ENABLE': False,
'FLIP': True,
'SCALES': [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]}},
'LOSS': {'CLS_WEIGHTS': True, 'NAME': 'ohemce', 'THRESH': 0.7},
'MODEL': {'NAME': 'segformer',
'PRETRAINED': '/content/semantic-segmentation/mit_b0.pth',
'VARIANT': 'B0'},
'OPTIMIZER': {'LR': 0.001, 'NAME': 'adamw', 'WEIGHT_DECAY': 0.01},
'SAVE_DIR': 'output',
'SCHEDULER': {'NAME': 'warmuppolylr',
'POWER': 0.9,
'WARMUP': 10,
'WARMUP_RATIO': 0.1},
'TEST': {'FILE': 'assests/ade',
'IMAGE_SIZE': [256, 256],
'MODEL_PATH': 'checkpoints/pretrained/segformer/segformer.b0.ade.pth',
'OVERLAY': False},
'TRAIN': {'AMP': True,
'BATCH_SIZE': 64,
'DDP': False,
'EPOCHS': 20,
'EVAL_INTERVAL': 10,
'IMAGE_SIZE': [256, 256]}}
Found 16443 training images.
Found 5481 validation images.
Epoch: [1/20] Iter: [0/256] LR: 0.00100000 Loss: 0.00000000: 0% 0/256 [00:02<?, ?it/s]
Traceback (most recent call last):
File "/content/semantic-segmentation/tools/train.py", line 132, in
main(cfg, gpu, save_dir)
File "/content/semantic-segmentation/tools/train.py", line 76, in main
loss = loss_fn(logits, lbl)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "./utils/losses.py", line 49, in forward
return self._forward(preds, labels)
File "./utils/losses.py", line 38, in _forward
loss = self.criterion(preds, labels).view(-1)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py", line 1121, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2824, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1only batches of spatial targets supported (3D tensors) but got targets of size: : [64, 3, 256, 256]

From which I conclude that the mask must be an [m x m] matrix...
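
As a sanity check (a minimal sketch, not code from this repo), PyTorch's cross-entropy indeed accepts only class-index targets of shape [B, H, W]:

```python
import torch
import torch.nn.functional as F

B, num_classes, H, W = 2, 150, 64, 64
logits = torch.randn(B, num_classes, H, W)

# Works: spatial targets must be class indices with shape [B, H, W].
target_ok = torch.randint(0, num_classes, (B, H, W))
print(F.cross_entropy(logits, target_ok))

# Fails with the RuntimeError above: an RGB mask loaded as [B, 3, H, W].
target_rgb = torch.randint(0, 255, (B, 3, H, W))
# F.cross_entropy(logits, target_rgb)  # "only batches of spatial targets supported (3D tensors)..."
```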

I would also like to understand why the learning rate increases over time rather than decreasing. I also get strange results:

Found 16443 training images.
Found 5481 validation images.
Epoch: [1/20] Iter: [0/256] LR: 0.00100000 Loss: 0.00000000: 0% 0/256 [00:00<?, ?it/s]/usr/local/lib/python3.7/dist-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Epoch: [0/20] Iter: [256/256] LR: 0.00019000 Loss: 2.17269479: 100% 256/256 [03:07<00:00, 1.36it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [1/20] Iter: [256/256] LR: 0.00028000 Loss: 1.29710598: 100% 256/256 [03:06<00:00, 1.37it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [2/20] Iter: [256/256] LR: 0.00037000 Loss: 1.13825008: 100% 256/256 [03:06<00:00, 1.37it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [3/20] Iter: [256/256] LR: 0.00046000 Loss: 1.02070744: 100% 256/256 [03:07<00:00, 1.36it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [4/20] Iter: [256/256] LR: 0.00055000 Loss: 1.02786795: 100% 256/256 [03:06<00:00, 1.38it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [5/20] Iter: [256/256] LR: 0.00064000 Loss: 0.94620133: 100% 256/256 [03:06<00:00, 1.37it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [6/20] Iter: [256/256] LR: 0.00073000 Loss: 0.93045863: 100% 256/256 [03:06<00:00, 1.37it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [7/20] Iter: [256/256] LR: 0.00082000 Loss: 1.36724860: 100% 256/256 [03:05<00:00, 1.38it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [8/20] Iter: [256/256] LR: 0.00091000 Loss: 1.74576994: 100% 256/256 [03:07<00:00, 1.37it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [9/20] Iter: [256/256] LR: 0.00100000 Loss: 1.71283879: 100% 256/256 [03:05<00:00, 1.38it/s]
Evaluating...
0% 0/5481 [00:00<?, ?it/s][W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
100% 5481/5481 [01:10<00:00, 78.11it/s]
Current mIoU: 0.4781 Best mIoU: 0.48
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [10/20] Iter: [111/256] LR: 0.00096089 Loss: 1.36222436: 43% 111/256 [01:21<01:44, 1.39it/s]

The loss begins to increase dramatically after 5-7 epochs.

@sithu31296
Owner

The mask should have shape [B, H, W]; during training, each value is a categorical value in range(0, num_classes). Only then can we use a cross-entropy-based loss.
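
If it helps, one possible way to convert an RGB-encoded mask into that categorical [H, W] form using the palette (a sketch with assumed names, not code from this repo):

```python
import torch

def rgb_to_index(mask_rgb: torch.Tensor, palette: torch.Tensor) -> torch.Tensor:
    """Map an (H, W, 3) RGB mask to an (H, W) tensor of class indices.

    Pixels whose color is not in `palette` keep the value 255, which is
    commonly used as the ignore index. Function name and details are
    illustrative only.
    """
    label = torch.full(mask_rgb.shape[:2], 255, dtype=torch.long)
    for cls_idx, color in enumerate(palette):
        matches = (mask_rgb == color).all(dim=-1)   # (H, W) boolean mask
        label[matches] = cls_idx
    return label
```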

The learning rate will increase until the warmup epochs end (defined in SCHEDULER > WARMUP) and then it will decrease. You can see the learning rate behavior by running scheduler.py. The loss increasing around the warmup epochs is actually normal; it will decrease later.
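
A standalone sketch of that schedule (not the repo's scheduler.py, just the same idea, with parameter names mirroring the config above):

```python
# Illustrative warmup-poly schedule: the LR ramps up linearly for the first
# `warmup` steps, then decays polynomially. Defaults mirror the config above
# (LR 0.001, WARMUP 10, WARMUP_RATIO 0.1, POWER 0.9).
def warmup_poly_lr(step, max_steps, base_lr=1e-3, warmup=10,
                   warmup_ratio=0.1, power=0.9):
    if step < warmup:
        alpha = step / warmup
        return base_lr * (warmup_ratio + (1 - warmup_ratio) * alpha)
    progress = (step - warmup) / max(1, max_steps - warmup)
    return base_lr * (1 - progress) ** power

for s in range(20):
    print(s, round(warmup_poly_lr(s, max_steps=20), 6))
# The LR rises for the first 10 steps and falls afterwards,
# roughly matching the pattern in the training log above.
```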

About the warning on thread-pool, see this issue pytorch/pytorch#57273.

@MsWik
Author

MsWik commented Aug 30, 2021

Thank you.

@MsWik MsWik closed this as completed Aug 30, 2021