[Feature request]: PyG installation instructions (esp. for XPUs) #166

chaitjo · 2024-03-25T00:43:30Z

Feature/behavior summary

I'm trying to get PyG to install and work well with Intel XPUs, and was hoping to use this repository as a reference. At present, I see that PyG is not installed by default, and there are no instructions for setting it up with XPUs.

Request attributes

Would this be a refactor of existing code?
Does this proposal require new package dependencies?
Would this change break backwards compatibility?
Does this proposal include a new model?
Does this proposal include a new dataset?
Does this proposal include a new task/workflow?

Related issues

No response

Solution description

Unknown.

Additional notes

At present, working with a different repository (https://github.com/a-r-j/ProteinWorkshop), I've been trying to integrate your code for the XPU as a new accelerator in PyTorch Lightning: https://github.com/IntelLabs/matsciml/blob/main/matsciml/lightning/xpu.py.

So far, I'm able to get my trainer to identify the XPU as a device, but it seems like some torch_cluster operations are not compatible with tensors stored on XPUs. I would like to perform torch_cluster operations such as knn graph creation on XPU tensors so that I can do data processing in a batched manner or on-the-fly, as opposed to on the CPU.

Here is a minimal example which fails:

import torch
import intel_extension_for_pytorch as ipex
from torch_geometric.nn import knn_graph

device = torch.device('xpu:0' if torch.xpu.is_available() else 'cpu')

x = torch.tensor([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]).to(device)
batch = torch.tensor([0, 0, 0, 0]).to(device)
edge_index = knn_graph(x, k=2, batch=batch, loop=False)

The resulting error is RuntimeError: x.device().is_cpu() INTERNAL ASSERT FAILED at "csrc/cpu/knn_cpu.cpp":12, please report a bug to PyTorch. x must be CPU tensor.

And here's a longer trace from the ProteinWorkshop codebase, which probably won't make any sense to MatSciML maintainers.

File "/home/ckj24/rds/hpc-work/envs/proteinworkshop/lib/python3.10/site-packages/torch_geometric/nn/pool/__init__.py", line 171, in knn_graph
    return torch_cluster.knn_graph(x, k, batch, loop, flow, cosine,
  File "/home/ckj24/rds/hpc-work/envs/proteinworkshop/lib/python3.10/site-packages/torch_cluster/knn.py", line 132, in knn_graph
    edge_index = knn(x, x, k if loop else k + 1, batch, batch, cosine,
  File "/home/ckj24/rds/hpc-work/envs/proteinworkshop/lib/python3.10/site-packages/torch_cluster/knn.py", line 81, in knn
    return torch.ops.torch_cluster.knn(x, y, ptr_x, ptr_y, k, cosine,
  File "/home/ckj24/rds/hpc-work/envs/proteinworkshop/lib/python3.10/site-packages/torch/_ops.py", line 692, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: x.device().is_cpu() INTERNAL ASSERT FAILED at "csrc/cpu/knn_cpu.cpp":12, please report a bug to PyTorch. x must be CPU tensor


laserkelvin · 2024-03-25T15:27:51Z

Thanks for bringing this up! That's a good point: I think we've been taking a lot of the dependencies for granted, and we'll update the documentation.

As of a few versions ago, a lot of the PyG core functionality has been upstreamed into PyTorch (e.g. torch_scatter stuff), but not everything. That means that, for the most part, PyG by itself should work out of the box on XPUs; however, functionality that lives outside of it (torch_scatter, torch_cluster, torch_sparse) isn't supported yet. So the error you're seeing is basically that the low-level implementation for knn_graph only exists for CUDA and for CPUs, and it's expecting a tensor that resides on the latter.
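Since the compiled knn kernel only exists for CPU and CUDA, one possible device-agnostic alternative is a brute-force kNN graph built from core PyTorch ops only. The following is a minimal sketch, not part of PyG or torch_cluster: the helper name `knn_graph_dense` is hypothetical, it assumes `torch.cdist` and `topk` are available on the target device, and it uses O(n^2) memory, so it is only reasonable for small or moderately sized batches.

```python
import torch


def knn_graph_dense(x, k, batch=None, loop=False):
    """Brute-force kNN graph using only core PyTorch ops (hypothetical helper).

    Returns an edge_index of shape [2, E] where row 0 holds the neighbour
    (source) and row 1 holds the centre node (target).
    """
    n = x.size(0)
    inf = float("inf")
    dist = torch.cdist(x, x)  # [n, n] pairwise Euclidean distances

    if batch is not None:
        # Forbid edges between nodes that belong to different graphs.
        other_graph = batch.unsqueeze(0) != batch.unsqueeze(1)
        dist = dist.masked_fill(other_graph, inf)

    if not loop:
        # Forbid self-loops by masking the diagonal.
        eye = torch.eye(n, dtype=torch.bool, device=x.device)
        dist = dist.masked_fill(eye, inf)

    k_eff = min(k, n)
    nbr_dist, nbr_idx = dist.topk(k_eff, dim=1, largest=False)
    target = torch.arange(n, device=x.device).unsqueeze(1).expand(-1, k_eff)

    # Drop masked entries (graphs with fewer than k candidate neighbours).
    valid = torch.isfinite(nbr_dist)
    return torch.stack([nbr_idx[valid], target[valid]], dim=0)
```

Calling `knn_graph_dense(x, k=2, batch=batch)` with tensors that already live on an XPU would keep the whole computation on that device, assuming the ops above are implemented there.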

I'm not 100% sure what our plans are for supporting those supplementary libraries, and so they might need to be treated on a case-by-case basis. Please reach out to me via email or Slack and we can discuss this further (even if it's not matsciml related). I'll keep this issue up still, since I agree we do need to update our PyG + XPU instructions.

chaitjo · 2024-03-25T22:25:04Z

Thanks!

What's the current recommended way to install PyG?

I'm currently using:

pip install torch_geometric
pip install torch-scatter torch-cluster

...and this seems fine unless I need some of the functions from torch-cluster to be run on tensors which are located on XPUs. PyG's docs also state regarding torch-scatter and torch-cluster that these packages 'come with their own CPU and GPU kernel implementations based on the PyTorch C++/CUDA/hip(ROCm) extension interface.' So I suppose there's no real fix yet for my particular use case apart from shifting my computation to the CPU.
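The CPU fallback mentioned above can be sketched as a small wrapper that moves tensor arguments to the CPU, calls the operation, and moves the result back to the original device. This is only an illustration: `run_on_cpu` is a hypothetical helper, not a PyG or torch_cluster API, and it pays for a device-to-host round trip on every call.

```python
import torch


def run_on_cpu(op, *args, **kwargs):
    """Run `op` with tensor arguments on CPU, then move a tensor result back.

    Hypothetical helper: the result is returned on the device of the first
    tensor argument; non-tensor arguments and results pass through unchanged.
    """
    tensors = [a for a in (*args, *kwargs.values()) if isinstance(a, torch.Tensor)]
    device = tensors[0].device if tensors else torch.device("cpu")

    def to_cpu(value):
        return value.cpu() if isinstance(value, torch.Tensor) else value

    out = op(*(to_cpu(a) for a in args), **{k: to_cpu(v) for k, v in kwargs.items()})
    return out.to(device) if isinstance(out, torch.Tensor) else out


# Intended usage with the failing example from this thread (assumes
# torch_geometric and torch_cluster are installed):
#   edge_index = run_on_cpu(knn_graph, x, k=2, batch=batch, loop=False)
```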

laserkelvin · 2024-03-25T23:01:16Z

Those pip commands should work. If you are super paranoid, you can tack on --no-cache-dir to make sure you're not using a cached version, and also --no-binary :all: to make sure it's built from source. If you have issues, I'd suggest you step through those :)
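After installing, a quick way to see which of the packages discussed in this thread are importable in the current environment is to look them up without importing them. This is a minimal sketch using only the standard library; the helper name `installed_modules` is hypothetical, and it only reports whether a module can be found, not whether its compiled kernels support a given device.

```python
import importlib.util

# Import names of the packages mentioned in this thread.
PYG_MODULES = ("torch", "torch_geometric", "torch_scatter", "torch_cluster", "torch_sparse")


def installed_modules(names=PYG_MODULES):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in installed_modules().items():
        print(f"{name}: {'found' if found else 'missing'}")
```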

I've brought up torch_cluster support internally, along with some things we can potentially do, but it will require some time. I'll send you an email separately.

laserkelvin · 2024-05-29T15:30:19Z

@chaitjo do you think I can close this issue?

#198 updated the README, and I think it should be pretty complete - within the bounds of the current status of broader framework support

chaitjo · 2024-05-29T19:29:48Z

Yes please.


chaitjo added the enhancement New feature or request label Mar 25, 2024

laserkelvin added the needs triage Issue needs decision making label Mar 25, 2024

laserkelvin mentioned this issue Apr 23, 2024

Revised XPU instructions and features #198

Merged

laserkelvin closed this as completed May 29, 2024
