Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed #38

nkise · 2020-04-19T15:42:54Z

I wanted to reduce the number of classes to 8, but after starting training I get this error:

void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
Traceback (most recent call last):
  File "train.py", line 329, in <module>
    train(epoch)
  File "train.py", line 180, in train
    loss = region_loss(output, target)
  File "/home/nkise/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nkise/Documents/neuron/YOWO/region_loss.py", line 259, in forward
    loss_cls = self.class_scale * FL(cls, tcls)
  File "/home/nkise/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nkise/Documents/neuron/YOWO/FocalLoss.py", line 58, in forward
    self.alpha = self.alpha.cuda()
RuntimeError: CUDA error: device-side assert triggered
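
A debugging note on the trace above: CUDA reports device-side asserts asynchronously, so the Python line shown (`self.alpha = self.alpha.cuda()`) is not necessarily the operation that failed. A minimal sketch, assuming the variable is set at the very top of train.py before torch is imported, that forces synchronous kernel launches so the traceback points at the real failing call:

```python
import os

# Must be set before CUDA is initialised (simplest: before importing torch),
# so kernels run synchronously and the traceback points at the failing op.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

The same effect can be had from the shell with `CUDA_LAUNCH_BLOCKING=1 python train.py`. Training runs slower in this mode, so it is only meant for debugging.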

nkise · 2020-04-19T15:46:55Z

@wei-tim it seems that the number of classes is hardcoded in region_loss.py:

        # try focal loss with gamma = 2
        FL = FocalLoss(class_num=24, gamma=2, size_average=False)
        loss_cls = self.class_scale * FL(cls, tcls)
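
One way to avoid the hardcoded value is to pass the class count in from the loss module instead of writing `24` inline. The sketch below only illustrates that pattern: `FocalLossStub`, `RegionLossSketch`, `num_classes` and `build_focal_loss` are hypothetical names, not the repo's actual classes, and the stub does not compute any loss.

```python
class FocalLossStub:
    """Hypothetical stand-in for the repo's FocalLoss; it only stores its arguments."""

    def __init__(self, class_num, gamma=2, size_average=False):
        self.class_num = class_num
        self.gamma = gamma
        self.size_average = size_average


class RegionLossSketch:
    """Hypothetical loss module that keeps the class count in one place."""

    def __init__(self, num_classes, class_scale=1.0):
        self.num_classes = num_classes
        self.class_scale = class_scale

    def build_focal_loss(self):
        # class_num follows the dataset's class count instead of a literal 24.
        return FocalLossStub(class_num=self.num_classes, gamma=2, size_average=False)


loss = RegionLossSketch(num_classes=8)
print(loss.build_focal_loss().class_num)  # 8
```

With this shape, changing the dataset means changing `num_classes` once, wherever the loss module is constructed.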

wei-tim · 2020-04-21T06:01:45Z

@nkise
Hi,
that's true, you need to modify this part if you train with your own dataset.

nkise · 2020-04-21T10:38:00Z

> @nkise
> Hi,
> that's true, you need to modify this part if you train with your own dataset.

So, as I understand, I need to change class_num to my number of classes?

wei-tim · 2020-04-22T08:01:02Z

@nkise

exactly.

hfarhid · 2021-02-08T17:44:47Z

Did this fix your problem? I still have the same issue after fixing class_num.
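
If the assertion persists after changing class_num, one thing that may be worth checking is the target labels themselves: the assertion text requires every index to satisfy `indexValue >= 0 && indexValue < tensor.sizes[dim]`, so a label outside `[0, class_num)` (for example, 1-based labels) would trip it. A minimal sketch of such a check, where `find_out_of_range_labels` is a hypothetical helper and the label list is made-up example data:

```python
def find_out_of_range_labels(labels, num_classes):
    """Return (position, label) pairs whose label is outside [0, num_classes)."""
    return [
        (i, int(v))
        for i, v in enumerate(labels)
        if not 0 <= int(v) < num_classes
    ]


# Made-up example: 1-based labels for an 8-class dataset.
labels = [1, 3, 8, 2]
print(find_out_of_range_labels(labels, num_classes=8))  # [(2, 8)]
```

For a target tensor such as `tcls`, something like `tcls.view(-1).tolist()` could be passed as `labels` just before the loss call. An empty result would rule this cause out.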

okankop · 2021-02-08T17:52:50Z

I am currently refactoring the whole repo, so these kinds of bugs should go away. More importantly, I will add the training details for the AVA dataset and update the article with the AVA results.

Stay tuned!

nkise closed this as completed May 5, 2020
