⚓🔍 NodePiece: GPU-enabled BFS searcher #990
Conversation
Okay, this takes an interesting turn: the same BFS procedure on CPU and GPU gives different results 👀 On YAGO310, mining for 500 anchors / 20 per node and debugging on a laptop, tokenization succeeds. But running the same code on a GPU, nothing has been found.
Sounds like it is a (too-)early termination issue 🤔
btw, I could only run my checks on CPU today, and I never encountered the warning message about missing anchors.
oh maybe it's because I reduced
Debugging on a GPU led me to a revelation: torch_sparse apparently has some bugs processing bool tensors 😢 The workaround that returns the same results as on CPU converts the dense tensors to float and then converts the result back to bool (though those tricks largely remove the memory savings of working with bool tensors :( ):

```python
reachable = spmm(
    index=edge_list,
    value=values.float(),
    m=num_entities,
    n=num_entities,
    matrix=reachable.float(),
) > 0.0
```

I'll open an issue in the torch_sparse repo: rusty1s/pytorch_sparse#243. Or it may be a general issue with the spmm kernel on bool tensors, because it looks like we see the same even with the
Okay, then we should just directly use float instead of converting from/to float in each iteration, right? In that case, we can also use the torch-builtin spmm without requiring an extra dependency.
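To illustrate why the float round-trip is a valid workaround, here is a small sketch (using a dense numpy stand-in for the sparse tensors, not the actual PyKEEN code): multiplying in float and thresholding with `> 0` reproduces the boolean "reachable via any neighbor" semantics of the bool matmul.

```python
import numpy as np

# Hypothetical toy graph: adjacency[i, j] = True means there is an edge i -> j.
adjacency = np.array([
    [False, True,  False],
    [False, False, True],
    [False, False, False],
])

# Current frontier: only node 0 has been reached so far.
reachable = np.array([True, False, False])

# Boolean semantics, spelled out explicitly: node j becomes reachable
# if any predecessor i with an edge i -> j is already reachable.
expected = np.array(
    [(adjacency[:, j] & reachable).any() for j in range(adjacency.shape[1])]
)

# Float workaround: multiply in float, then threshold back to bool.
reachable_float = (adjacency.T.astype(np.float32) @ reachable.astype(np.float32)) > 0.0

assert (expected == reachable_float).all()
```

The threshold works because every entry of the float product counts the number of reachable predecessors, which is positive exactly when the boolean OR-of-ANDs would be True.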
In my experiments, torch_sparse (even its CPU version) is much faster than the vanilla torch sparse operators. I'd keep the dependency since it's a separate Searcher anyway, and that issue with bools might be solved soon-ish.
@cthoyt we are ready here
Great looking code - I left two minor comments about adding more context to two TODOs. One big question I was wondering about: why do we do these operations in numpy? Is this because it has data structures and functionality that pytorch doesn't?
The original
GPU-enabled BFS searcher

SparseBFSSearcher with torch_sparse and distance tracking. Uses sparse-dense matrix multiplication between bool tensors.
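The core idea can be sketched as follows (a minimal dense-numpy analogue, not the actual `SparseBFSSearcher` implementation; the function name and signature are hypothetical): BFS from all anchors at once by repeated matrix multiplication, recording the hop at which each (anchor, node) pair first becomes reachable.

```python
import numpy as np

def bfs_distances(adjacency: np.ndarray, anchors: list) -> np.ndarray:
    """Toy dense analogue of the sparse BFS: one matmul per hop.

    adjacency: (n, n) bool matrix, adjacency[i, j] = edge i -> j.
    anchors: list of anchor node indices.
    Returns a (num_anchors, n) int matrix of shortest hop counts (-1 = unreachable).
    """
    n = adjacency.shape[0]
    num_anchors = len(anchors)

    # reachable[a, j]: anchor a has reached node j so far.
    reachable = np.zeros((num_anchors, n), dtype=bool)
    reachable[np.arange(num_anchors), anchors] = True

    distances = np.full((num_anchors, n), -1, dtype=int)
    distances[np.arange(num_anchors), anchors] = 0

    # At most n - 1 hops are needed to reach every reachable node.
    for hop in range(1, n):
        # Float matmul + threshold, mirroring the bool-tensor workaround from the PR.
        frontier = (reachable.astype(np.float32) @ adjacency.astype(np.float32)) > 0.0
        newly = frontier & ~reachable
        if not newly.any():
            break  # early termination: no new nodes reached
        distances[newly] = hop  # first hop at which each pair became reachable
        reachable |= newly
    return distances
```

Tracking `newly` (nodes reached this hop but not before) is what turns plain reachability into shortest-path distances: each entry of `distances` is written exactly once, at the first hop it becomes reachable.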