Description
Describe the bug
I tried training the htc_x101_64x4d_fpn_20e_16gpu model on a custom dataset. I set 'seg_prefix' to the folder that contains my segmentation maps, but soon after training starts it fails with the error: RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 100, 3]
Also, can you please tell me what the difference is between HTC without semantic and HTC with semantic?
Reproduction
- What command or script did you run?
 
python tools/train.py ~/Prateek/Prateek/mmdetection2/mmdetection/configs/htc/htc_x101_64x4d_fpn_20e_16gpu.py
- Did you make any modifications on the code or config? Did you understand what you have modified?
I modified num_classes to match the custom dataset. I'm not sure what value of num_classes I should set in 'semantic_head'.
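One constraint on this value is known regardless of the model: a cross-entropy loss over C classes needs every target id to lie in [0, C-1], apart from the ignore index. The sketch below estimates that lower bound from the segmentation maps themselves. It is only an illustration: `min_num_classes` is a hypothetical helper, not an mmdetection function, and the ignore value of 255 is an assumption that should be checked against the config.

```python
import numpy as np

IGNORE_LABEL = 255  # assumed ignore value; adjust to match the config


def min_num_classes(seg_maps, ignore_label=IGNORE_LABEL):
    """Return the smallest num_classes that covers every label id found.

    seg_maps is an iterable of integer label arrays of shape (H, W).
    Pixels equal to ignore_label are skipped.
    """
    max_id = -1
    for seg in seg_maps:
        ids = np.unique(seg)
        ids = ids[ids != ignore_label]
        if ids.size:
            max_id = max(max_id, int(ids.max()))
    return max_id + 1


# Two toy label maps standing in for files under 'seg_prefix'.
maps = [
    np.array([[0, 1], [1, 255]], dtype=np.uint8),
    np.array([[4, 0], [255, 1]], dtype=np.uint8),
]
needed = min_num_classes(maps)
```

Here the largest non-ignored id is 4, so `needed` is 5. This gives a lower bound only; it does not say whether the semantic head follows any further labeling convention.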
Environment
sys.platform: linux
Python: 3.7.6 (default, Jan  8 2020, 19:59:22) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.168
GPU 0,1: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:
- GCC 7.3
 - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
 - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
 - OpenMP 201511 (a.k.a. OpenMP 4.5)
 - NNPACK is enabled
 - CUDA Runtime 10.1
 - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
 - CuDNN 7.6.3
 - Magma 2.5.1
 - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
 
TorchVision: 0.5.0
OpenCV: 4.1.2
MMCV: 0.2.16
MMDetection: 1.0rc1+4b984a7
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.1
Error traceback
2020-01-26 15:34:51,233 - INFO - workflow: [('train', 1)], max: 25 epochs
Traceback (most recent call last):
  File "tools/train.py", line 124, in <module>
    main()
  File "tools/train.py", line 120, in main
    timestamp=timestamp)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 133, in train_detector
    timestamp=timestamp)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 319, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 364, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 268, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 100, in batch_processor
    losses = model(**data)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/detectors/base.py", line 138, in forward
    return self.forward_train(img, img_meta, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/detectors/htc.py", line 230, in forward_train
    loss_seg = self.semantic_head.loss(semantic_pred, gt_semantic_seg)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/mask_heads/fused_semantic_head.py", line 108, in loss
    loss_semantic_seg = self.criterion(mask_pred, labels)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 1840, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 100, 3]
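The target size in the last line, [1, 100, 100, 3], suggests the segmentation map was loaded as a 3-channel image, while the 2D loss in the final frame (nll_loss2d) expects a target of shape (N, H, W). A minimal sketch of one possible fix follows, assuming the three channels hold identical class ids, as happens when a label map is saved as an RGB PNG. `to_single_channel` is a hypothetical helper, not part of mmdetection, and the array is a stand-in for a real map.

```python
import numpy as np


def to_single_channel(seg_map: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, 3) label image to (H, W) when all channels agree."""
    if seg_map.ndim == 2:
        return seg_map  # already a single-channel label map
    if seg_map.ndim == 3 and seg_map.shape[2] == 3:
        first = seg_map[..., 0]
        # Refuse color-coded maps: those need a color-to-id lookup instead.
        if not (seg_map == first[..., None]).all():
            raise ValueError("channels differ; map looks color-coded, not class ids")
        return first
    raise ValueError(f"unexpected segmentation map shape: {seg_map.shape}")


# Stand-in for a 100x100 label map stored with three identical channels.
labels = (np.arange(100 * 100).reshape(100, 100) % 5).astype(np.uint8)
rgb_like = np.stack([labels] * 3, axis=-1)  # shape (100, 100, 3)

fixed = to_single_channel(rgb_like)  # shape (100, 100)
batch = fixed[None]                  # shape (1, 100, 100), i.e. (N, H, W)
```

If the channels differ, the map is probably color-coded, and each color would need to be mapped to a class id before training rather than simply dropping channels.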
Thanks for the help!