Train error #26

Open
StevenAZy opened this issue Apr 3, 2021 · 7 comments

Comments

@StevenAZy

Traceback (most recent call last):
  File "train.py", line 93, in <module>
    trainer[dataset_name](args, net, snapshot_path)
  File "/home/lichangyong/Code/TransUNet/trainer.py", line 57, in trainer_synapse
    loss_dice = dice_loss(outputs, label_batch, softmax=True)
  File "/home/lichangyong/.pyenv/versions/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lichangyong/Code/TransUNet/utils.py", line 43, in forward
    class_wise_dice.append(1.0 - dice.item())
RuntimeError: CUDA error: device-side assert triggered

@andife
Contributor

andife commented Apr 3, 2021

Hi,
I think I had a similar error message. In my case the cause was the names, or rather the numerical values, of my labels.
Maybe take a look at facebookresearch/SymbolicMathematics#2.
You should find several GitHub issues by searching for the expression "RuntimeError: CUDA error: device-side assert triggered".
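
A note on narrowing this down: CUDA kernels are launched asynchronously, so the Python line shown in the traceback (here `dice.item()`) may only be where the error surfaced, not the operation that actually failed. A minimal sketch, assuming you can edit the top of the training script, is to force synchronous launches with the `CUDA_LAUNCH_BLOCKING` environment variable so the traceback points closer to the failing call. The placement at the top of the script is an assumption about your setup, not something taken from the repository.

```python
import os

# Set this before CUDA is initialised. The simplest way to guarantee that
# is to set it before `import torch` runs anywhere in the process.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch only after the variable is set
# ... then start training as usual
```

The same effect can be had by exporting the variable in the shell before starting the script. Training will be slower with this setting, so it is best used only while debugging.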

@StevenAZy StevenAZy reopened this Apr 4, 2021
@SM-93

SM-93 commented Apr 5, 2021

I had a similar problem: I was training the model with 13 classes instead of 9 (8 organs + background), so I had to change num_classes in dataset_config (train.py) to 14. Maybe you are also training with 13 classes.
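
The arithmetic behind this fix can be sketched as follows: if the label values are 0-based integers, the value passed as `num_classes` has to be at least the largest label value plus one, so 13 foreground classes labelled 1..13 plus background 0 need 14. The helper name `required_num_classes` and the toy masks are hypothetical, and the sketch assumes the masks can be read as NumPy arrays of non-negative integers.

```python
import numpy as np

def required_num_classes(label_arrays):
    """Return the smallest num_classes that covers every label value seen."""
    max_label = max(int(np.max(a)) for a in label_arrays)
    # Labels are 0-based, so values 0..max_label need max_label + 1 classes.
    return max_label + 1

# Toy masks (made up): background 0 plus foreground labels up to 13.
masks = [
    np.array([[0, 1, 2], [3, 4, 5]]),
    np.array([[6, 7, 8], [9, 13, 0]]),
]
print(required_num_classes(masks))  # prints 14
```

Running this over the label arrays of the whole training set, not just one sample, gives the lower bound for `num_classes`.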

@AllenPiaoHong

> I had a similar problem: I was training the model with 13 classes instead of 9 (8 organs + background), so I had to change num_classes in dataset_config (train.py) to 14. Maybe you are also training with 13 classes.

Hi, have you fixed the problem now? I also changed num_classes in train.py, but it doesn't work.

@StevenAZy
Author

> > I had a similar problem: I was training the model with 13 classes instead of 9 (8 organs + background), so I had to change num_classes in dataset_config (train.py) to 14. Maybe you are also training with 13 classes.

> Hi, have you fixed the problem now? I also changed num_classes in train.py, but it doesn't work.

I changed num_classes to 14, and the code now runs successfully.

@rameses666

I think the cause is the values in your labels. Check those.
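
One way to do that check, as a rough sketch: list the distinct values in a label batch and fail with a readable message if any fall outside `[0, num_classes - 1]`. The function name `check_labels` is hypothetical. It assumes the batch can be converted to a NumPy array, so a tensor on the GPU would first need `label_batch.cpu().numpy()`.

```python
import numpy as np

def check_labels(label_batch, num_classes):
    """Return the distinct label values, or raise if any is out of range."""
    values = np.unique(np.asarray(label_batch))
    bad = values[(values < 0) | (values >= num_classes)]
    if bad.size:
        raise ValueError(
            f"label values {bad.tolist()} are outside [0, {num_classes - 1}]"
        )
    return values.tolist()

# Toy batch (made up): all values fit into 9 classes.
print(check_labels(np.array([[0, 1], [8, 2]]), num_classes=9))  # prints [0, 1, 2, 8]
```

Calling this on each batch just before the loss is computed turns the opaque device-side assert into an error that names the offending values.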

@JAYCHOU2020

> > > I had a similar problem: I was training the model with 13 classes instead of 9 (8 organs + background), so I had to change num_classes in dataset_config (train.py) to 14. Maybe you are also training with 13 classes.

> > Hi, have you fixed the problem now? I also changed num_classes in train.py, but it doesn't work.

> I changed num_classes to 14, and the code now runs successfully.

My dataset has 1 class, so I set num_classes=2, but it still reports the error. Even with num_classes=3 it does not work.
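
One possible explanation, which this thread does not confirm, is that a single-class mask is stored with raw pixel values such as 0 and 255 rather than 0 and 1. In that case no small `num_classes` would cover the value 255. A minimal sketch of remapping raw mask values to contiguous class indices follows. The raw values `[0, 255]` and the name `remap_mask` are assumptions for illustration, so check the actual values in your own masks first.

```python
import numpy as np

# Assumed raw mask values, sorted ascending: 0 = background, 255 = foreground.
RAW_VALUES = np.array([0, 255])

def remap_mask(mask, raw_values=RAW_VALUES):
    """Map each raw value to its index in raw_values (0, 1, 2, ...)."""
    mask = np.asarray(mask)
    if not np.isin(mask, raw_values).all():
        raise ValueError("mask contains a value missing from raw_values")
    # raw_values is sorted, so searchsorted returns the index of each value.
    return np.searchsorted(raw_values, mask).astype(np.int64)

# Toy mask (made up) stored as 0/255.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
print(remap_mask(mask).tolist())  # prints [[0, 1], [1, 0]]
```

A fixed table is used rather than a per-image mapping so that an image containing only one of the values still maps to the same class index as every other image.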

@buhuiya

buhuiya commented Apr 2, 2024

Has anyone solved this?
