
An error: RuntimeError: Function 'SigmoidBackward0' returned nan values in its 0th output. #24

Open
xixixihean opened this issue Mar 11, 2024 · 1 comment


@xixixihean
In nnUNetTrainerV2_DDP.py, the following line raises an error:

with autocast(enabled=False):
    output_act = output_ds[i].sigmoid() if is_sigmoid else softmax_helper(output_ds[i])  # bug occurs here

RuntimeError: Function 'SigmoidBackward0' returned nan values in its 0th output.

Could you tell me how to solve it, please?
Thank you.
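(Not from the repository, a minimal sketch of how to localize this kind of failure: `torch.autograd.set_detect_anomaly` reports the op that produced the NaN, and computing the activation in float32, as the `autocast(enabled=False)` block already intends, keeps sigmoid itself numerically stable. A NaN in `SigmoidBackward0` therefore usually means the *incoming* gradient was already NaN/Inf, e.g. from an overflowing loss under mixed precision. The tensor values below are illustrative.)

```python
import torch

# Report the exact autograd op that produced a NaN gradient
# (this is what adds the "Triggered internally at ... python_anomaly_mode.cpp"
# note to the stacktrace).
torch.autograd.set_detect_anomaly(True)

# Large-magnitude logits: sigmoid saturates but stays finite in float32.
logits = torch.tensor([50.0, -50.0], requires_grad=True)

# Cast to float32 before the activation, mirroring the intent of the
# autocast(enabled=False) block in nnUNetTrainerV2_DDP.py.
probs = torch.sigmoid(logits.float())
loss = probs.sum()
loss.backward()

# With a finite incoming gradient, the sigmoid backward is finite too.
assert torch.isfinite(logits.grad).all()
```

If this minimal case passes but training still crashes, the NaN is most likely entering from upstream (the loss or the half-precision forward pass) rather than from the sigmoid itself.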


TaWald commented Mar 12, 2024

I got the same issue.
For me it always happens around epoch 120.

Here is my stacktrace:

"/dkfz/cluster/gpu/data/OE0441/t006d/Code/transunet3d/nn_transunet/trainer/nnUNetTrainerV2_DDP.py", line 1039, in run_training
    l = self.run_iteration(self.tr_gen, True)
  File "/dkfz/cluster/gpu/data/OE0441/t006d/Code/transunet3d/nn_transunet/trainer/nnUNetTrainerV2_DDP.py", line 552, in run_iteration
    l = self.compute_loss(output, target, is_max, is_c2f, self.args.is_sigmoid, is_max_hungarian, is_max_ds, point_rend, num_point_rend, no_object_weight)
  File "/dkfz/cluster/gpu/data/OE0441/t006d/Code/transunet3d/nn_transunet/trainer/nnUNetTrainerV2_DDP.py", line 658, in compute_loss
    output_act = output_ds[i].sigmoid() if is_sigmoid else softmax_helper(output_ds[i]) # bug occurs here..
 (Triggered internally at ../torch/csrc/autograd/python_anomaly_mode.cpp:113.)
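(Again not a fix from this repository, just a common workaround for late-epoch NaN gradients under mixed precision: check the gradients after backward and skip the optimizer step when any are non-finite, which is the same behaviour `torch.cuda.amp.GradScaler` provides automatically. The model and data below are placeholders.)

```python
import torch

# Placeholder model/optimizer standing in for the nnU-Net trainer's.
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
loss = model(x).sigmoid().mean()
loss.backward()

# Skip the update if any gradient is NaN/Inf, so one bad batch does not
# poison the weights and make every later sigmoid backward return NaN.
grads_finite = all(
    torch.isfinite(p.grad).all()
    for p in model.parameters()
    if p.grad is not None
)
if grads_finite:
    opt.step()
opt.zero_grad()
```

Lowering the learning rate, or wrapping the loss in a GradScaler so overflowing steps are skipped, addresses the same root cause; anomaly detection (already visible in the stacktrace above) only tells you where the NaN first appears, not why.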
