Train error #26

StevenAZy · 2021-04-03T12:58:19Z

Traceback (most recent call last):
  File "train.py", line 93, in <module>
    trainer[dataset_name](args, net, snapshot_path)
  File "/home/lichangyong/Code/TransUNet/trainer.py", line 57, in trainer_synapse
    loss_dice = dice_loss(outputs, label_batch, softmax=True)
  File "/home/lichangyong/.pyenv/versions/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lichangyong/Code/TransUNet/utils.py", line 43, in forward
    class_wise_dice.append(1.0 - dice.item())
RuntimeError: CUDA error: device-side assert triggered

andife · 2021-04-03T20:23:47Z

Hi,
I think I had a similar error message. In my case the cause was the names, or rather the numerical values, of my labels.
Maybe take a look at facebookresearch/SymbolicMathematics#2.
You should find several GitHub issues by searching for the expression "RuntimeError: CUDA error: device-side assert triggered".
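
A general debugging note for this message: CUDA kernels are launched asynchronously, so the Python line shown in the traceback (here `dice.item()`) is not necessarily the operation that failed. A minimal sketch, assuming you can edit the top of the training script, is to force synchronous launches with the `CUDA_LAUNCH_BLOCKING` environment variable so that the traceback points closer to the real failure:

```python
import os

# Must be set before CUDA is initialized, so put it at the very top of the
# training script, before `import torch`.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Running a single batch on the CPU instead of the GPU is another option that usually produces a more readable index error, at the cost of speed.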

SM-93 · 2021-04-05T06:40:13Z

I had a similar problem. I was training the model with 13 classes instead of 9 (8 organs + background), so I had to change num_classes in dataset_config (train.py) to 14. Maybe you are also training with 13 classes.
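
To make the 13 + 1 = 14 reasoning concrete, here is a rough sketch of a range check you could run on a label array before training. The function name `check_label_range` and the toy array are made up for illustration and are not part of TransUNet; the only assumption is that labels are integer class indices that must lie in [0, num_classes).

```python
import numpy as np


def check_label_range(label: np.ndarray, num_classes: int) -> None:
    """Raise ValueError if any label value falls outside [0, num_classes)."""
    lo, hi = int(label.min()), int(label.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"label values span [{lo}, {hi}] but num_classes={num_classes} "
            f"only allows [0, {num_classes - 1}]"
        )


# Toy mask with 13 foreground classes plus background (values 0..13).
toy_label = np.arange(14).reshape(2, 7)

check_label_range(toy_label, num_classes=14)  # passes silently

try:
    check_label_range(toy_label, num_classes=9)
except ValueError as err:
    print(err)
```

With 13 foreground classes the largest label value is 13, so the smallest setting that covers it is 14.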

AllenPiaoHong · 2021-04-08T10:57:30Z

Hi, have you fixed the problem yet? I also changed num_classes in train.py, but it doesn't work.

StevenAZy · 2021-04-08T11:37:08Z

I changed the num_classes value to 14, and the code now runs successfully.

rameses666 · 2021-04-14T08:07:50Z

I think the cause is the values in your labels. Check them.
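
One way to check, sketched with NumPy: collect the unique values over all label arrays and compare them with num_classes. The helper `summarize_labels` and the toy arrays are hypothetical; in practice you would pass in the label arrays loaded from your own dataset.

```python
import numpy as np


def summarize_labels(labels):
    """Return the sorted unique values across an iterable of label arrays."""
    values = set()
    for label in labels:
        values.update(np.unique(label).tolist())
    return sorted(values)


# Toy stand-ins for label arrays loaded from a dataset.
toy_labels = [np.array([[0, 1], [2, 0]]), np.array([[0, 5], [5, 1]])]

values = summarize_labels(toy_labels)
print("label values found:", values)
print("smallest num_classes that covers them:", int(max(values)) + 1)
```

If the values are not contiguous (as in the toy data, where 3 and 4 never occur), num_classes still has to exceed the largest value, or the labels have to be remapped to 0..K-1 first.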

JAYCHOU2020 · 2023-04-15T15:48:35Z

My dataset has 1 class, so I set num_classes=2, but it still reports the error. Even with num_classes=3 it does not work.
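
One possible cause, and this is only a guess because the label values are not shown here: binary masks saved as images often store the foreground as 255 rather than 1, in which case neither num_classes=2 nor num_classes=3 covers the largest value. A minimal sketch of remapping such a mask, assuming every non-zero pixel is foreground (`binarize_mask` is a made-up helper, not part of TransUNet):

```python
import numpy as np


def binarize_mask(mask: np.ndarray) -> np.ndarray:
    """Map every non-zero pixel to class 1 and leave background as 0."""
    return (mask > 0).astype(np.uint8)


# Toy mask using the common 0/255 encoding (an assumption for illustration).
toy_mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

remapped = binarize_mask(toy_mask)
print(np.unique(remapped))
```

If the unique values of your masks are already only 0 and 1, this is not the issue and the cause lies elsewhere.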

buhuiya · 2024-04-02T06:01:42Z

Has anyone solved this?

StevenAZy closed this as completed Apr 4, 2021

StevenAZy reopened this Apr 4, 2021

StevenAZy mentioned this issue Apr 7, 2021

"num_classes" setting #33