LogCumsumExp #26411

Closed

agadetsky opened this issue Sep 18, 2019 · 19 comments
Labels
feature A request for a proper, new feature. good first issue module: bootcamp We plan to do a full writeup on the issue, and then get someone to do it for onboarding triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@agadetsky commented Sep 18, 2019

🚀 Feature

Add a numerically stable cumulative logsumexp function. There is also an associated PR on cummax (#20240), which is needed for a numerically stable implementation.

Motivation

This is useful when computing sums of probabilities in log-space and has various applications.

Pitch

Torch already has cumsum and cumprod, so I suggest adding logcumsumexp. A hypothetical usage sketch is shown below (the name and signature are just the proposal, mirroring cumsum):
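
import torch

x = torch.randn(3, 5)
torch.cumsum(x, dim=1)   # existing cumulative sum
torch.cumprod(x, dim=1)  # existing cumulative product

# proposed, so that for every prefix:
# out[..., j] == torch.logsumexp(x[..., :j + 1], dim=-1)
# out = torch.logcumsumexp(x, dim=1)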

Alternatives

A current workaround for me, and for anyone who needs it, can be written as follows (it is quite fast; the only overhead is converting between numpy and torch):

import numpy as np
import torch

def cummax(x, dim):
    # torch.cummax does not exist yet, so fall back to numpy's accumulate
    x_np = x.detach().cpu().numpy()
    ret = np.maximum.accumulate(x_np, axis=dim)
    return torch.from_numpy(ret).to(x)


def logcumsumexp(x, dim=-1):
    # move the reduction dim to the end so the rest can be flattened
    if (dim != -1) and (dim != x.ndimension() - 1):
        x = x.transpose(dim, -1)

    init_size = x.size()
    last_dim_size = init_size[-1]
    x_resized = x.contiguous().view(-1, last_dim_size)
    d1, d2 = x_resized.size()
    # subtract the running max for stability, then zero out the "future"
    # entries with a lower-triangular mask before summing
    x_cummax = cummax(x_resized, -1).view(d1, d2, 1)
    x_expand = x_resized.unsqueeze(1).expand(d1, d2, last_dim_size)
    mask = torch.tril(torch.ones(last_dim_size, last_dim_size, device=x.device, dtype=x.dtype)).unsqueeze(0)
    ret = torch.log(torch.sum(torch.exp(x_expand - x_cummax) * mask, dim=-1)) + x_cummax.view(d1, d2)
    ret = ret.view(*init_size)

    if (dim != -1) and (dim != x.ndimension() - 1):
        ret = ret.transpose(-1, dim)

    return ret

UPDATE: the code above is not always numerically stable (for example, masked-out entries exp(x_i - m_j) with i > j can overflow to inf, and 0 * inf gives nan). The most numerically stable, though slow, way to compute logcumsumexp is to use a for-loop:

def logcumsumexp(x, dim):
    # slow implementation (one logsumexp per prefix), but ok for now
    if (dim != -1) and (dim != x.ndimension() - 1):
        x = x.transpose(dim, -1)

    out = []
    for i in range(1, x.size(-1) + 1):
        out.append(torch.logsumexp(x[..., :i], dim=-1, keepdim=True))
    out = torch.cat(out, dim=-1)

    if (dim != -1) and (dim != x.ndimension() - 1):
        out = out.transpose(-1, dim)
    return out
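
As a quick sanity check, the loop version should match the naive log(cumsum(exp(x))) on well-scaled inputs while staying finite where the naive formula overflows:

x = torch.randn(4, 10)
naive = torch.log(torch.cumsum(torch.exp(x), dim=-1))
assert torch.allclose(logcumsumexp(x, dim=-1), naive, atol=1e-6)

big = x + 1000.0  # the naive version overflows here, the stable one does not
assert torch.isinf(torch.log(torch.cumsum(torch.exp(big), dim=-1))).any()
assert torch.isfinite(logcumsumexp(big, dim=-1)).all()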
@mrshenli mrshenli added feature A request for a proper, new feature. good first issue module: operators module: bootcamp We plan to do a full writeup on the issue, and then get someone to do it for onboarding triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Sep 19, 2019
@opheliagame

I would like to work on this.

@agadetsky
Author

@opheliagame see #20240 (comment) for cummax implementation

@michiboo

Hi @agadetsky, I created PR #26832 for this issue, but I am not sure if I am doing the implementation part right. I stopped just before the line

mask = torch.tril(torch.ones(last_dim_size, last_dim_size)).unsqueeze(0)

in the Python implementation.

It would be great if you could review THTensor_(logcumsumexp) in THTensorMoreMath.cpp.

I want to make sure I am going in the right direction.

Many thanks!

@agadetsky
Author

@michiboo hello! Firstly, I think we need to finish this PR on cummax (#26637). Can you work on it?

@michiboo

Hi, yes, I can try to work on it.

@michiboo

@agadetsky what do you use to loop over a multi-dimensional tensor in C++? I am trying to convert the values of the tensor to 1s and 0s.

@agadetsky
Author

@michiboo unfortunately, I am not familiar with the PyTorch C++ implementation.

@arktrail
Contributor

arktrail commented Dec 5, 2019

@michiboo Hi, are you still working on this issue?

@pandeykartikey
Contributor

I will try to submit a PR for this by the weekend, if it is not needed urgently.

PS: @agadetsky I can't seem to assign myself the issue 😅

@agadetsky
Author

@pandeykartikey thank you, waiting for your PR

@agadetsky
Author

@pandeykartikey you can use the implemented cummax for a stable implementation of logcumsumexp

@pandeykartikey
Contributor

@agadetsky Wouldn't that create issues while building the feature, since cummax can undergo drastic changes in implementation while it is being reviewed?

@agadetsky
Author

@pandeykartikey better to ask someone else; atm I don't know, but I think that without cummax it is harder to implement logcumsumexp properly

@agadetsky
Author

@pandeykartikey hi! Have you made any progress?

@agadetsky
Author

@pandeykartikey cummax is merged now (#30881). Is it possible for you to implement logcumsumexp using cummax?
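
With cummax merged, the numpy-based helper in the workaround above can be swapped for the native op (a sketch; torch.cummax returns a (values, indices) namedtuple):

def cummax(x, dim):
    # native replacement for the numpy-based helper above;
    # torch.cummax returns (values, indices) and we only need the values
    return torch.cummax(x, dim=dim).values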

@pandeykartikey
Contributor

Hi @agadetsky, I am almost done with the implementation; only the backprop and tests are left. But I am having trouble figuring out the backprop algorithm. Can you help me with that?
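
For reference, one way to derive it: with y_j = log sum_{i <= j} exp(x_i), we get dy_j/dx_i = exp(x_i - y_j) for i <= j and 0 otherwise, so the input gradient reduces to a reversed cumulative sum (a naive sketch, not necessarily how the final kernel should do it):

def logcumsumexp_backward(grad_out, x, y, dim=-1):
    # grad_x[i] = sum_{j >= i} grad_out[j] * exp(x[i] - y[j])
    #           = exp(x[i]) * reverse_cumsum(grad_out * exp(-y))[i]
    # the bare exponentials can under/overflow, so a production kernel
    # would need to keep this computation in log-space
    w = grad_out * torch.exp(-y)
    rev_cumsum = w.flip([dim]).cumsum(dim).flip([dim])
    return torch.exp(x) * rev_cumsum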

@rebelmachina

@pandeykartikey what's the status on this? Still open?

@agadetsky
Author

@whoiscris yes, maybe @anjali411 will finish it soon here #32876

@agadetsky
Author

up, #32876 still needs to be finished
