
torch.sparse.mm breaking API changes #11

Closed
migalkin opened this issue Jun 21, 2022 · 7 comments · Fixed by #12 or #16

Comments

@migalkin

Suddenly, everything stopped working 😱 presumably because of the changes to torch.sparse.
Specifically, I am on PyTorch 1.10, the master branch of PyKEEN, and torch-ppr 0.0.5.

Problem 1: the allclose() check no longer passes:

if not torch.allclose(x_sum, torch.ones_like(x_sum)):
    raise ValueError(f"The entries do not sum to 1. {x_sum[x_sum != 0]}")

MWE:

import torch
from torch_ppr import page_rank

from pykeen.datasets import FB15k237

dataset = FB15k237(create_inverse_triples=False)
edges = dataset.training.mapped_triples[:, [0, 2]].t()
pr = page_rank(edge_index=torch.cat([edges, edges.flip(0)], dim=-1), num_nodes=dataset.num_entities)

>> ValueError: Invalid column sum: tensor([1.0000, 1.0000, 1.0000,  ..., 1.0000, 1.0000, 1.0000]). expected 1.0

Looking at it in the debugger:

  • adj_sum does sum up to the number of nodes
  • the check fails with the default tolerances, but if I loosen them to rtol=1e-4 or atol=1e-4, it passes (see the sketch below)
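
For illustration, a minimal sketch of the suspected effect (hypothetical sizes; how far the sum drifts depends on k and on the reduction order):

import torch

# One normalized column with many non-zero float32 entries: the summed
# rounding error may land outside allclose's default tolerances
# (rtol=1e-5, atol=1e-8), while a looser atol passes.
k = 1_000_000
col = torch.full((k,), 1.0 / k)
col_sum = col.sum()

print(col_sum.item())                                          # close to, but not exactly, 1.0
print(torch.allclose(col_sum, torch.tensor(1.0)))              # may be False
print(torch.allclose(col_sum, torch.tensor(1.0), atol=1e-4))   # passes with the looser atol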

Problem 2: the signature of torch.sparse.addmm has changed from the one used in power_iteration, so the API call fails with an unknown-kwarg error.

x = torch.sparse.addmm(input=x0, sparse=adj, dense=x, beta=alpha, alpha=beta)

In fact, I can't find where the kwargs input, sparse, and dense come from, because the current signature uses the less readable mat, mat1, mat2. I traced back as far as Torch 1.3.0 and still can't find where they originated. Where does this signature come from? 😅
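
If it helps, a hedged workaround sketch (assuming only the keyword names differ across versions, not the positional order):

# Positional call, equivalent to the keyword version above; computes
# beta * x0 + alpha * (adj @ x) regardless of how the kwargs are named.
x = torch.sparse.addmm(x0, adj, x, beta=alpha, alpha=beta)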

My test env

torch                 1.10.0
torch-ppr             0.0.5
@mberr
Owner

mberr commented Jun 21, 2022

On the current master and with

torch              1.11.0+cu113

the code runs through, though even there the signature seems to be mat, mat1, mat2 😅

@cthoyt
Collaborator

cthoyt commented Jun 21, 2022

I've never set this up before, but can we get the unit tests to run on multiple versions of PyTorch?
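
For reference, one hedged sketch of how this could look with nox (hypothetical file and version pins; the repo may prefer a different runner or a CI matrix):

# noxfile.py: hypothetical sketch that runs the test suite against several torch versions.
import nox

@nox.session
@nox.parametrize("torch", ["1.10.0", "1.11.0"])
def tests(session, torch):
    session.install(f"torch=={torch}")   # pin the torch version under test
    session.install(".")                 # install the package itself
    session.run("pytest")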

@mberr
Owner

mberr commented Jun 21, 2022

I am not sure whether I want to support multiple versions though, in particular as the sparse API is currently (1.11) declared beta 😅

@cthoyt
Collaborator

cthoyt commented Jun 21, 2022

okay up to you

@mberr
Owner

mberr commented Jun 21, 2022

@migalkin, for the first problem: if you do not provide a starting value yourself, it will be initialized to

return torch.full(size=(n,), fill_value=1.0 / n)

which should sum to 1. So I guess this is a numerical precision problem.
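
A quick way to check that (hypothetical sizes):

import torch

# The uniform start vector sums to roughly 1.0; how far off it is in
# float32 depends on n and on the reduction order used by sum().
for n in (1_000, 100_000, 10_000_000):
    x = torch.full(size=(n,), fill_value=1.0 / n)
    print(n, x.sum().item())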

@mberr
Owner

mberr commented Jun 21, 2022

I played around with it, and the error seemed to occur when there is a large number of non-zero entries in one column. This is intuitive, since the more values we sum, the more error can accumulate.

In natural graphs with hub nodes, this can occur (see the sketch below).
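
A hedged sparse repro of that intuition (hypothetical hub size k):

import torch

# One hub column with k non-zero entries, column-normalized to 1/k each;
# summing that column accumulates more float32 error than a column with
# only a handful of entries would.
k = 200_000
indices = torch.stack([torch.arange(k), torch.zeros(k, dtype=torch.long)])
values = torch.full((k,), 1.0 / k)
adj = torch.sparse_coo_tensor(indices, values, size=(k, k))

col_sums = torch.sparse.sum(adj, dim=0).to_dense()
print(col_sums[0].item())   # near 1.0; the deviation grows with k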

@mberr mberr closed this as completed in #16 Jun 21, 2022