
CoxPHLoss does not handle batches where all samples are censored #52

Closed
lenbrocki opened this issue Dec 21, 2023 · 10 comments

Comments

@lenbrocki

I'm training a custom model with CoxPHLoss and have noticed that, when using the Efron tie-handling method, training fails when a batch contains only censored events. The code raising the error is in lassonet/utils.py:

if hasattr(torch.Tensor, "scatter_reduce_"):
    # version >= 1.12
    def scatter_reduce(input, dim, index, reduce, *, output_size=None):
        src = input
        if output_size is None:
            # fails here when index is empty (all samples censored):
            # max() on an empty tensor raises a RuntimeError
            output_size = index.max() + 1
        return torch.empty(output_size, device=input.device).scatter_reduce(
            dim=dim, index=index, src=src, reduce=reduce, include_self=False
        )

else:
    scatter_reduce = torch.scatter_reduce

When all samples are censored, index is an empty tensor and index.max() fails, as this snippet demonstrates:
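A minimal standalone reproduction of just that failure (illustrative, outside lassonet):

import torch

index = torch.tensor([], dtype=torch.long)  # no uncensored samples -> empty index tensor
index.max()
# RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0.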
Also, if I understand correctly, the Cox likelihood would be zero in that case, so the log-likelihood is not defined.
For now I have resorted to skipping these problematic batches (see the sketch below), but it might be helpful to handle this edge case directly in CoxPHLoss. I'm not sure what the best way of doing that is, though.
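A rough sketch of the skipping workaround, for concreteness (loader, model, and optimizer here are generic placeholders, not lassonet API):

import torch
from lassonet.cox import CoxPHLoss

criterion = CoxPHLoss("efron")
for X, labels in loader:         # labels[:, 1] is the event indicator (1 = event)
    if labels[:, 1].sum() == 0:  # every sample in the batch is censored
        continue                 # skip the batch entirely
    optimizer.zero_grad()
    loss = criterion(model(X), labels)
    loss.backward()
    optimizer.step()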

@louisabraham
Collaborator

Hello, can you produce a minimal reproducible example?

> Also, if I understand correctly, the Cox likelihood would be zero in that case so that the log likelihood is not defined.

Wouldn't it be one? Can you test with the Breslow approximation?

@louisabraham
Collaborator

[image: Cox partial likelihood formula]

I think that if the sets are empty, the log-likelihood is just zero.
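For reference, a standard form of the Cox partial likelihood (my reconstruction; the screenshot above presumably showed something like this):

$$L = \prod_{i :\, \delta_i = 1} \frac{e^{h_i}}{\sum_{j \in R(t_i)} e^{h_j}}$$

where $h_i$ is the predicted log-hazard of sample $i$ and $R(t_i)$ is the risk set at time $t_i$. If all $\delta_i = 0$, the product is empty, hence $L = 1$ and $\log L = 0$.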

@lenbrocki
Author

Sorry for the late response. A minimal example would be:

import torch
from lassonet.cox import CoxPHLoss

loss = CoxPHLoss("breslow")
labels = torch.tensor([[5.0, 0], [2.0, 0]])
hazards = torch.tensor([5.0, 2.0])
print(loss(hazards, labels))
# prints nan

loss = CoxPHLoss("efron")
labels = torch.tensor([[5.0, 0], [2.0, 0]])
hazards = torch.tensor([5.0, 2.0])
print(loss(hazards, labels))
# fails with RuntimeError: max(): Expected reduction dim to be specified for
# input.numel() == 0. Specify the reduction dim with the 'dim' argument.

The error in the Efron case happens for the reason I described above. I think the nan in the Breslow case happens because the likelihood is zero. This can be seen, for example, from equation 1 of your paper https://arxiv.org/pdf/2208.09793.pdf: if all $\delta_i$ are zero, the product is zero, and the log of zero is not defined.

@louisabraham
Collaborator

louisabraham commented Jan 8, 2024 via email

@lenbrocki
Author

Oh, I wasn't aware of that, but yes, you're right of course. Then the question becomes why nan is returned rather than 0.
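One plausible source of the nan (my guess, not verified against the implementation): some PyTorch reductions over empty selections produce nan, e.g.:

import torch
print(torch.tensor([]).mean())  # tensor(nan) -- mean of an empty tensor
print(torch.tensor([]).sum())   # tensor(0.)  -- sums are fine, means are not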

@louisabraham
Collaborator

louisabraham commented Jan 8, 2024 via email

@lenbrocki
Author

lenbrocki commented Jan 11, 2024

CoxPHLoss now correctly returns 0. But when I try to use the fixed loss in training, I get this error for batches where all samples have $\delta_i = 0$:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
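That error occurs when backward() is called on a tensor that is not connected to the autograd graph, which is what happens if the loss returns a constant zero. A minimal illustration (my own, not lassonet code):

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

constant_zero = torch.tensor(0.0)  # detached from the graph, no grad_fn
# constant_zero.backward()  # RuntimeError: element 0 of tensors does not require grad ...

connected_zero = x.sum() * 0.0  # has a grad_fn, so backward() succeeds
connected_zero.backward()
print(x.grad)  # tensor([0., 0.])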

louisabraham added a commit that referenced this issue Jan 12, 2024
@louisabraham
Collaborator

I think I managed to find a better fix :) Can you test again?

@lenbrocki
Author

Sorry again for the delay. Yes, it's working now!

@louisabraham
Collaborator

Great!
