Assertion indexValue >= 0 && indexValue < tensor.sizes[dim] failed #38

Closed
nkise opened this issue Apr 19, 2020 · 6 comments


nkise commented Apr 19, 2020

I wanted to reduce the number of classes to 8, but after starting training I get this error:

void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
Traceback (most recent call last):
  File "train.py", line 329, in <module>
    train(epoch)
  File "train.py", line 180, in train
    loss = region_loss(output, target)
  File "/home/nkise/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nkise/Documents/neuron/YOWO/region_loss.py", line 259, in forward
    loss_cls = self.class_scale * FL(cls, tcls)
  File "/home/nkise/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nkise/Documents/neuron/YOWO/FocalLoss.py", line 58, in forward
    self.alpha = self.alpha.cuda()
RuntimeError: CUDA error: device-side assert triggered
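
CUDA reports device-side asserts asynchronously, so the Python line named in the traceback (`self.alpha = self.alpha.cuda()`) is not necessarily where the failing kernel was launched; the kernel name in the first line points at a scatter-fill operation instead. A minimal sketch of forcing synchronous kernel launches so the traceback lands on the offending call. Placing this at the top of train.py is an assumption about where it would fit:

```python
import os

# Must be set before CUDA is initialized, so do it before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # imported only after the variable is set
```

This slows training down, so it is probably best used only while debugging.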

nkise commented Apr 19, 2020

@wei-tim it seems that the number of classes is hardcoded in region_loss.py:

        # try focal loss with gamma = 2
        FL = FocalLoss(class_num=24, gamma=2, size_average=False)
        loss_cls = self.class_scale * FL(cls, tcls)
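
One way to avoid this kind of mismatch is to pass the class count in from a single setting instead of repeating the literal. A minimal sketch, assuming a hypothetical `num_classes` value supplied by the training configuration; `focal_loss_kwargs` is an illustrative helper, not part of the repository, and the keyword names are taken from the call above:

```python
def focal_loss_kwargs(num_classes, gamma=2):
    """Build the keyword arguments for the FocalLoss call shown above."""
    if not isinstance(num_classes, int) or num_classes < 1:
        raise ValueError(f"num_classes must be a positive integer, got {num_classes!r}")
    return {"class_num": num_classes, "gamma": gamma, "size_average": False}


kwargs = focal_loss_kwargs(8)
# FL = FocalLoss(**kwargs)  # would replace the hardcoded class_num=24
```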


wei-tim commented Apr 21, 2020

@nkise
Hi,
that's true, you need to modify this part if you train with your own dataset.


nkise commented Apr 21, 2020

> @nkise
> Hi,
> that's true, you need to modify this part if you train with your own dataset.

So, as I understand it, I need to change class_num to my number of classes?


wei-tim commented Apr 22, 2020

@nkise

exactly.

nkise closed this as completed May 5, 2020

hfarhid commented Feb 8, 2021

Did this fix your problem? I still have the same issue after fixing class_num.
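
If the assertion persists after changing class_num, one possible cause (an assumption, not confirmed in this thread) is that some target labels still fall outside [0, class_num), for example 1-based class ids in the annotation files. A minimal sketch of a check that could be run on the labels before they reach the loss; `find_out_of_range_labels` is a hypothetical helper, and with a tensor one would pass something like `tcls.view(-1).tolist()`:

```python
def find_out_of_range_labels(labels, class_num):
    """Return (position, label) pairs whose label is outside [0, class_num)."""
    return [(i, int(v)) for i, v in enumerate(labels) if not 0 <= int(v) < class_num]


labels = [0, 3, 7, 8, 1]  # made-up labels; 8 is invalid when class_num=8
bad = find_out_of_range_labels(labels, class_num=8)
print(bad)  # [(3, 8)]
```

An empty list means every label is a valid index for a class dimension of size class_num.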


okankop commented Feb 8, 2021

I am currently refactoring the whole repo, so these kinds of bugs should no longer occur. More importantly, I will add the training details for the AVA dataset and update the article with AVA results.

Stay tuned!
