
Backpropagation for sparse matrix indexing is problematic (colab provided) #45996

Open · Tracked by #44634
jasonbian97 opened this issue Oct 7, 2020 · 3 comments
Labels: module: sparse, triaged

Comments

jasonbian97 commented Oct 7, 2020

🐛 Bug

Indexing a sparse matrix works fine in the forward pass, but in backpropagation the gradients come back all zero (i.e., an empty sparse matrix).

To Reproduce

I put up a toy case in Colab to reproduce.

The strange behavior is that the resulting gradient (a sparse tensor) is empty.
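The Colab itself is linked rather than inlined; a minimal hypothetical sketch of the reported setup (standard COO construction, made-up values) looks like:

import torch

# Hypothetical minimal sketch (the actual Colab is not inlined here):
# index a sparse COO tensor that requires grad.
ind = torch.tensor([[0, 1], [2, 1]])
vals = torch.tensor([3., 4.])
sp = torch.sparse_coo_tensor(ind, vals, (3, 3)).requires_grad_(True)

row = sp[0]              # forward indexing works fine
print(row.to_dense())    # tensor([0., 0., 3.])
print(row.grad_fn)       # None: nothing for backward to propagate through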

Expected behavior

Non-zero gradients should appear at the non-zero entry locations.

cc @vincentqb @aocsa @nikitaved @pearu @mruberry

@jasonbian97 changed the title from "Backpropagation for sparse matrix is problematic (colab provided)" to "Backpropagation for sparse matrix indexing is problematic (colab provided)" on Oct 7, 2020
@ngimel added the module: sparse and triaged labels on Oct 8, 2020
mruberry (Collaborator) commented Oct 8, 2020

Thanks for reporting this issue, @jasonbian97! We're reviewing our sparse tensor implementation now, actually, and we'll be sure to look at this behavior, too.

aocsa (Contributor) commented Jan 18, 2021

I was looking into this issue and found that part of the code doesn't make sense for sparse tensors. The indexing (get_item) operation sp1[i], which internally calls the index_select_sparse function, creates a new sparse tensor; the strided-tensor version of the code, by contrast, returns a view. So in the sparse-tensor case the computational graph is no longer connected. This makes sense, since not every sub-tensor sp1[i] is materialized when sp1 is defined as a sparse tensor. IMO this is not a bug, as there is no way to create a view of a sparse tensor; for this case the torch.sparse.sum function can probably be used instead (see the sketch after the snippet below).

cc @jasonbian97, @mruberry, @rgommers

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Construct a sparse COO matrix with four non-zero entries.
ind = torch.LongTensor([[0, 1, 1, 3],
                        [2, 1, 2, 3]])
vals = torch.FloatTensor([3, 4, 5, 9])
sp1 = torch.sparse.FloatTensor(ind, vals, torch.Size([5, 5])).to(device)
sp1 = sp1.detach().requires_grad_(True)

# Index the sparse matrix.
print("sp1.to_dense() = \n", sp1.to_dense())
print("sp1[0] = \n", sp1[0])  # this is a new sparse tensor
print("sp1[0].to_dense() = \n", sp1[0].to_dense())

print(sp1[0].grad_fn)  # None: the indexing op records no backward function

losses = []
for i in range(sp1.shape[0]):
    # Not a view: for each i a new sparse tensor is created, so the
    # computational graph back to sp1 is broken here.
    loss = torch.sparse.sum(sp1[i])
    print(loss)
    losses.append(loss)
l = sum(losses)
print(l)
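For contrast, here is a minimal sketch of the suggested torch.sparse.sum route (an illustrative continuation of the snippet above, reusing ind, vals, and device; it assumes torch.sparse.sum's autograd support for dim reductions): reduce the whole tensor in one op instead of indexing rows, so the graph stays connected.

# Sketch: reduce without indexing, so the graph back to sp2 stays connected.
sp2 = torch.sparse_coo_tensor(ind, vals, (5, 5)).to(device).coalesce().requires_grad_(True)
row_sums = torch.sparse.sum(sp2, dim=1)  # differentiable sparse reduction
total = row_sums.to_dense().sum()        # to_dense is differentiable too
total.backward()
print(sp2.grad)  # sparse gradient with non-zeros at the stored entries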

jasonbian97 (Author) commented

Thanks for looking into this!

So what if I just want to use the first non-zero value in the sparse matrix for some downstream computation: is there any way I can keep the gradient flowing back to that value?
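The thread doesn't answer this, but one hypothetical workaround sketch (not from the thread; names are illustrative): if you construct the sparse tensor yourself, keep its values as a dense leaf tensor and index that directly, since dense indexing is differentiable.

import torch

# Hypothetical workaround: index the dense values tensor, not the sparse one.
ind = torch.tensor([[0, 1, 1, 3], [2, 1, 2, 3]])
vals = torch.tensor([3., 4., 5., 9.], requires_grad=True)
sp = torch.sparse_coo_tensor(ind, vals, (5, 5))  # sparse view over the data

out = 2.0 * vals[0]  # "first non-zero value", taken from the dense values
out.backward()
print(vals.grad)     # tensor([2., 0., 0., 0.]): the gradient reaches it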

@pearu added this to In progress in Sparse tensors on Aug 10, 2021
@pearu moved this from In progress to To do in Sparse tensors on Aug 10, 2021