setting NUM_CLASSES #27

Closed
heizie opened this issue Jun 4, 2021 · 13 comments

Comments

@heizie

heizie commented Jun 4, 2021

First, a note for others: even if you've installed PyTorch via Anaconda with cudatoolkit, that toolkit only serves PyTorch, not detectron2. Please consider installing the CUDA package locally or using Docker.

question 1:
train_log_init.txt

I saw your note that we should change MODEL.ROI_HEADS.NUM_CLASSES and MODEL.RETINANET.NUM_CLASSES. I changed them in detectron2/config/defaults.py.
I also tried adding the parameters in all.sh: MODEL.ROI_HEADS.NUM_CLASSES 2, MODEL.FCOS.NUM_CLASSES 2, MODEL.RETINANET.NUM_CLASSES 2
for my 2 classes (background not included). Neither approach helps.
The error:
AssertionError: A prediction has category_id=62, which is not available in the dataset.
question 2:
Training seems to stop immediately.
I changed MAX_ITER in the yaml file, but that did not help.

I think the two problems could be related, because the model is not being trained for 2 classes.
The log file is attached. Many thanks for your help!

@lkeab
Owner

lkeab commented Jun 5, 2021

Thanks for the beginning tips.

That's strange. I have looked at the error in your log. Could you remove the ',' after 2 (it turns the value into a tuple, but we need an int) and try modifying this line directly?
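
To illustrate the trailing-comma point: in Python, a comma after a value produces a one-element tuple rather than an int. A minimal sketch, with variable names that are illustrative only and not taken from the BCNet config:

```python
# A trailing comma turns the right-hand side into a one-element tuple.
num_classes_with_comma = 2,
num_classes_plain = 2

print(type(num_classes_with_comma).__name__)  # tuple
print(type(num_classes_plain).__name__)       # int
```

Code that expects an integer class count may then fail or behave unexpectedly when it receives the tuple.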

@heizie
Author

heizie commented Jun 6, 2021

Thanks for the beginning tips.

That's strange. I have looked at the error in your log. Could you remove the ',' after 2 (it turns the value into a tuple, but we need an int) and try modifying this line directly?

Hmm, something new came up: new errors.
train_log_init.txt

[06/06 18:32:03 d2.data.common]: Serializing 561 elements to byte tensors and concatenating them all ...
[06/06 18:32:03 d2.data.common]: Serialized dataset takes 11.80 MiB
[06/06 18:32:03 d2.data.detection_utils]: TransformGens used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[06/06 18:32:03 d2.data.build]: Using training sampler TrainingSampler
[06/06 18:32:04 fvcore.common.checkpoint]: [Checkpointer] Loading from ./pretrained_models/BCNet_models/fcos_Res101.pth ...
[06/06 18:32:04 d2.engine.train_loop]: Starting training from iteration 0
[06/06 18:32:04 d2.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [64,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [65,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [66,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [67,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [68,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [69,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [70,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [71,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [72,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [73,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [74,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [75,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [76,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [77,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [78,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [79,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [80,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [81,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [32,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [33,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [34,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [35,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [36,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [37,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [38,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [39,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [40,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [41,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [42,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [43,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [44,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [45,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [46,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [47,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [48,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [49,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [50,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [51,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [52,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [53,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [54,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [55,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [56,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [57,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [58,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [59,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [60,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [61,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [62,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [63,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [1,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [2,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [3,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [4,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [5,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [6,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [7,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [8,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [9,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [10,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [11,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [12,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [13,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [14,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [15,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [16,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [17,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [18,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [19,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [20,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [21,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [22,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [23,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [24,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [25,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [26,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [27,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [28,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [29,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [30,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [31,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
WARNING [06/06 18:32:04 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[06/06 18:32:04 d2.data.datasets.coco]: Loaded 70 images in COCO format from ./datasets/coco/annotations/instances_val.json
[06/06 18:32:04 d2.data.build]: Distribution of instances among all 3 categories:
|   category   | #instances   |  category  | #instances   |  category  | #instances   |
|:------------:|:-------------|:----------:|:-------------|:----------:|:-------------|
| _background_ | 0            |   Cable    | 308          |  CableEnd  | 616          |
|              |              |            |              |            |              |
|    total     | 924          |            |              |            |              |
[06/06 18:32:04 d2.data.common]: Serializing 70 elements to byte tensors and concatenating them all ...
[06/06 18:32:04 d2.data.common]: Serialized dataset takes 0.31 MiB
[06/06 18:32:04 d2.evaluation.evaluator]: Start inference on 70 images
Traceback (most recent call last):
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/train_loop.py", line 131, in train
    self.run_step()
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/train_loop.py", line 211, in run_step
    loss_dict = self.model(data, self.iter, self.max_iter)
  File "/home/iwb/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/modeling/meta_arch/fcos.py", line 176, in forward
    centerness, gt_instances, batched_inputs, images, c_iter, max_iter
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/modeling/meta_arch/fcos.py", line 188, in _forward_train
    locations, box_cls, box_regression, centerness, gt_instances
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/modeling/meta_arch/loss_fcos.py", line 260, in __call__
    reduction="sum",
RuntimeError: CUDA error: device-side assert triggered
The above operation failed in interpreter.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/fvcore-0.1.5.post20210604-py3.6.egg/fvcore/nn/focal_loss.py", line 34
    """
    p = torch.sigmoid(inputs)
    ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    p_t = p * targets + (1 - p) * (1 - targets)
    loss = ce_loss * ((1 - p_t) ** gamma)
  File "/home/iwb/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 2126, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))

    return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./tools/train_net.py", line 184, in <module>
    args=(args,),
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/launch.py", line 51, in launch
    main_func(*args)
  File "./tools/train_net.py", line 149, in main
    return trainer.train()
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/defaults.py", line 373, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/train_loop.py", line 134, in train
    self.after_train()
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/train_loop.py", line 142, in after_train
    h.after_train()
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/hooks.py", line 353, in after_train
    self._do_eval()
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/hooks.py", line 321, in _do_eval
    results = self._func()
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/defaults.py", line 324, in test_and_save_results
    self._last_eval_results = self.test(self.cfg, self.model)
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/engine/defaults.py", line 484, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/evaluation/evaluator.py", line 122, in inference_on_dataset
    outputs = model(inputs, idx, len(data_loader))
  File "/home/iwb/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/modeling/meta_arch/fcos.py", line 155, in forward
    images = self.preprocess_image(batched_inputs)
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/modeling/meta_arch/fcos.py", line 425, in preprocess_image
    images = [x["image"].to(self.device) for x in batched_inputs]
  File "/home/iwb/project/dlo_bcnet/BCNet/detectron2/modeling/meta_arch/fcos.py", line 425, in <listcomp>
    images = [x["image"].to(self.device) for x in batched_inputs]
RuntimeError: CUDA error: device-side assert triggered

@lkeab
Owner

lkeab commented Jun 6, 2021

The message: "Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you."

What are the category ids in your annotations?
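
One way to answer this question is to list the ids declared under "categories" next to the ids actually used by the annotations. A minimal sketch, assuming a standard COCO-style json layout; the helper name `summarize_category_ids` and the inline example data are made up for illustration:

```python
import json
from collections import Counter


def summarize_category_ids(coco):
    """Return (declared category ids, {used category id: annotation count})."""
    declared = sorted(c["id"] for c in coco.get("categories", []))
    used = Counter(a["category_id"] for a in coco.get("annotations", []))
    return declared, dict(sorted(used.items()))


# Tiny made-up example. For a real file, load it with json.load(open(path)).
example = json.loads("""
{
  "categories": [
    {"id": 0, "name": "_background_"},
    {"id": 1, "name": "Cable"},
    {"id": 2, "name": "CableEnd"}
  ],
  "annotations": [
    {"category_id": 1},
    {"category_id": 2},
    {"category_id": 2}
  ]
}
""")

declared, used = summarize_category_ids(example)
print("declared ids:", declared)
print("used ids and counts:", used)
```

If the declared ids do not run from 1 to the number of categories, detectron2 prints the mapping warning quoted above.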

@heizie
Author

heizie commented Jun 7, 2021

The message: "Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you."

What are the category ids in your annotations?

I haven't defined them yet. That should be fine, right? I've used the same setup for CenterMask2, which is also built on the detectron2 framework.

=========update ===============
I opened the json file. At the end it shows:
Screenshot from 2021-06-07 10-20-41

====== update again ========
The ids in the converted json file are shifted: they run from 1 to N,
while the original ids run from 0 to N-1.
pytorch/pytorch#21136
(left:converted, right:original instances_val.zip)
val

@lkeab
Owner

lkeab commented Jun 7, 2021

You don't need the background class in the annotation file. I think the problem with the converted json comes from an incorrect image size setting.

This is the original
"categories":
[{'id': 1, 'name': 'person'}, {'id': 2, 'name': 'bicycle'}, {'id': 3, 'name': 'car'}, {'id': 4, 'name': 'motorcycle'}, {'id': 5, 'name': 'airplane'}, {'id': 6, 'name': 'bus'}, {'id': 7, 'name': 'train'}, {'id': 8, 'name': 'truck'}, {'id': 9, 'name': 'boat'}, {'id': 10, 'name': 'traffic light'}, {'id': 11, 'name': 'fire hydrant'}, {'id': 13, 'name': 'stop sign'}, {'id': 14, 'name': 'parking meter'}, {'id': 15, 'name': 'bench'}, {'id': 16, 'name': 'bird'}, {'id': 17, 'name': 'cat'}, {'id': 18, 'name': 'dog'}, {'id': 19, 'name': 'horse'}, {'id': 20, 'name': 'sheep'}, {'id': 21, 'name': 'cow'}, {'id': 22, 'name': 'elephant'}, {'id': 23, 'name': 'bear'}, {'id': 24, 'name': 'zebra'}, {'id': 25, 'name': 'giraffe'}, {'id': 27, 'name': 'backpack'}, {'id': 28, 'name': 'umbrella'}, {'id': 31, 'name': 'handbag'}, {'id': 32, 'name': 'tie'}, {'id': 33, 'name': 'suitcase'}, {'id': 34, 'name': 'frisbee'}, {'id': 35, 'name': 'skis'}, {'id': 36, 'name': 'snowboard'}, {'id': 37, 'name': 'sports ball'}, {'id': 38, 'name': 'kite'}, {'id': 39, 'name': 'baseball bat'}, {'id': 40, 'name': 'baseball glove'}, {'id': 41, 'name': 'skateboard'}, {'id': 42, 'name': 'surfboard'}, {'id': 43, 'name': 'tennis racket'}, {'id': 44, 'name': 'bottle'}, {'id': 46, 'name': 'wine glass'}, {'id': 47, 'name': 'cup'}, {'id': 48, 'name': 'fork'}, {'id': 49, 'name': 'knife'}, {'id': 50, 'name': 'spoon'}, {'id': 51, 'name': 'bowl'}, {'id': 52, 'name': 'banana'}, {'id': 53, 'name': 'apple'}, {'id': 54, 'name': 'sandwich'}, {'id': 55, 'name': 'orange'}, {'id': 56, 'name': 'broccoli'}, {'id': 57, 'name': 'carrot'}, {'id': 58, 'name': 'hot dog'}, {'id': 59, 'name': 'pizza'}, {'id': 60, 'name': 'donut'}, {'id': 61, 'name': 'cake'}, {'id': 62, 'name': 'chair'}, {'id': 63, 'name': 'couch'}, {'id': 64, 'name': 'potted plant'}, {'id': 65, 'name': 'bed'}, {'id': 67, 'name': 'dining table'}, {'id': 70, 'name': 'toilet'}, {'id': 72, 'name': 'tv'}, {'id': 73, 'name': 'laptop'}, {'id': 74, 'name': 'mouse'}, {'id': 75, 
'name': 'remote'}, {'id': 76, 'name': 'keyboard'}, {'id': 77, 'name': 'cell phone'}, {'id': 78, 'name': 'microwave'}, {'id': 79, 'name': 'oven'}, {'id': 80, 'name': 'toaster'}, {'id': 81, 'name': 'sink'}, {'id': 82, 'name': 'refrigerator'}, {'id': 84, 'name': 'book'}, {'id': 85, 'name': 'clock'}, {'id': 86, 'name': 'vase'}, {'id': 87, 'name': 'scissors'}, {'id': 88, 'name': 'teddy bear'}, {'id': 89, 'name': 'hair drier'}, {'id': 90, 'name': 'toothbrush'}]

@heizie
Author

heizie commented Jun 7, 2021

I think the problem with the converted json comes from an incorrect image size setting.

It could be related to a wrong size, but the sizes are the same in both the converted and the original json files.
Besides, the input images should automatically go through "ResizeShortestEdge".
I don't know where I can "catch" the size of the actual input.

@heizie
Author

heizie commented Jun 7, 2021

Aha, we can use self.NUM_CLASSES instead:
https://github.com/lkeab/BCNet/blob/main/detectron2/modeling/meta_arch/inference_fcos.py#L145

Now I'm facing this issue:
#25

I think I can close this issue very soon.

@lkeab
Owner

lkeab commented Jun 7, 2021

Great. Leave a message here when you have new progress.

@heizie
Author

heizie commented Jun 8, 2021

Now I'm facing this issue:
#25

I'm not sure whether I should report the CUDA issue in this ticket, but I think the issue I mentioned is a serious one for your open source project. As you know, one solution is to reduce the batch size, but I've already reduced it to 1. I tried updating PyTorch to 1.7, but the detectron2 version you use is clearly an old one that does not support PyTorch 1.7 or some supporting functions (like RandomApply).

Please consider providing a stable environment as a Docker image, or at least as a .txt file, or updating detectron2 to a newer version.

Anyway, thanks again for your work.

@lkeab
Owner

lkeab commented Jun 8, 2021

That's strange. I didn't run into this issue on my machine, and some other users have trained on their own datasets and already reported results in the issues. Maybe you could take just the mask head design code and combine it with a codebase on which you can train and run inference smoothly. Thanks for following the work.

@heizie
Author

heizie commented Jun 8, 2021

That's strange. I didn't run into this issue on my machine,

I think you must have a local CUDA installation on your machine. Try another server, and you'll see that this depends on luck.

and some other users have trained on their own datasets and already reported results in the issues.

That is also strange to me. I noticed that he might be using Windows for training... what a hacker.

Maybe you could take just the mask head design code and combine it with a codebase on which you can train and run inference smoothly.

That is not very realistic for many researchers. I think it might take me more than 2 weeks of trying and coding.

@lkeab lkeab closed this as completed Jun 8, 2021
@lkeab
Owner

lkeab commented Jun 8, 2021

I actually trained and tested the model on two different servers because I changed workplaces, and both work fine. I guess that user is only using Windows to show results (you could probably contact him). Maybe we can wait for more users to report their experience. Thanks a lot for the feedback.

@TechyRio

The message: "Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you."
What are the category ids in your annotations?

I haven't defined them yet. That should be fine, right? I've used the same setup for CenterMask2, which is also built on the detectron2 framework.

=========update ===============
I opened the json file. At the end it shows: Screenshot from 2021-06-07 10-20-41

====== update again ========
The ids in the converted json file are shifted: they run from 1 to N, while the original ids run from 0 to N-1. pytorch/pytorch#21136 (left: converted, right: original instances_val.zip) val

I also encountered the same problem as you. How did you convert the json file to remove the background? Do you have a conversion script file?
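
The thread does not include a conversion script, so the following is only a sketch of one possible approach, not the script heizie used. It assumes a COCO-style json in which the background category is named `_background_` (as in the log table above), drops that category, and renumbers the remaining category ids to 1..N. The function name `drop_background_and_remap` is hypothetical.

```python
import json


def drop_background_and_remap(coco, background_name="_background_"):
    """Return a copy of a COCO-style dict without the background category,
    with the remaining category ids renumbered 1..N."""
    kept = [c for c in coco["categories"] if c["name"] != background_name]
    kept.sort(key=lambda c: c["id"])
    id_map = {c["id"]: new_id for new_id, c in enumerate(kept, start=1)}

    out = dict(coco)
    out["categories"] = [dict(c, id=id_map[c["id"]]) for c in kept]
    out["annotations"] = [
        dict(a, category_id=id_map[a["category_id"]])
        for a in coco["annotations"]
        if a["category_id"] in id_map  # drops annotations labelled as background
    ]
    return out


# Made-up example with non-contiguous ids to show the renumbering.
example = {
    "images": [{"id": 1, "width": 640, "height": 480}],
    "categories": [
        {"id": 0, "name": "_background_"},
        {"id": 3, "name": "Cable"},
        {"id": 7, "name": "CableEnd"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 3},
        {"id": 2, "image_id": 1, "category_id": 7},
    ],
}

converted = drop_background_and_remap(example)
print(json.dumps(converted["categories"]))
```

For a real file, read it with `json.load`, pass the result through the function, and write it back with `json.dump`. With the background category removed, the number of remaining categories is the value to use for NUM_CLASSES (2 in heizie's case). Whether this alone resolves the error in BCNet is not confirmed in this thread.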
