Enable distributed link hetero sampling #8722

Merged
merged 17 commits into from
Feb 4, 2024

@kgajdamo kgajdamo commented Jan 5, 2024

This PR enables distributed edge sampling for heterogeneous graphs.

Added:

  • Distributed edge heterogeneous sampling.
  • Distributed edge heterogeneous node-level and edge-level temporal sampling.
  • DistEdgeHeteroSamplerInput class, which serves as the input to the node_sample function when the source and target node types of an input edge differ.
  • Unit tests.

Comments:

  • When an input edge has distinct source and destination node types, the data for each type must be handled separately. This differs slightly from the case with a single input node type.
  • This PR depends on: #8718
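
The first comment above can be sketched in plain Python. This is only an illustration under stated assumptions: `HeteroEdgeSeeds` and `split_edge_seeds` are hypothetical names, they use lists instead of tensors, and they are not the PR's `DistEdgeHeteroSamplerInput` or any PyG API. Merging both endpoints when the node types match is likewise an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

# (source node type, relation, destination node type)
EdgeType = Tuple[str, str, str]


@dataclass
class HeteroEdgeSeeds:
    """Hypothetical container: seed node IDs grouped by node type."""
    edge_type: EdgeType
    node_dict: Dict[str, List[int]] = field(default_factory=dict)


def split_edge_seeds(edge_type: EdgeType, src: Sequence[int],
                     dst: Sequence[int]) -> HeteroEdgeSeeds:
    """Group the endpoints of the seed edges by node type."""
    src_type, _, dst_type = edge_type
    if src_type == dst_type:
        # Single node type: both endpoints share one ID space,
        # so they can be kept together under one key.
        return HeteroEdgeSeeds(edge_type, {src_type: list(src) + list(dst)})
    # Distinct node types: IDs live in different ID spaces and must
    # be kept (and later sampled from) separately.
    return HeteroEdgeSeeds(edge_type,
                           {src_type: list(src), dst_type: list(dst)})


seeds = split_edge_seeds(("user", "rates", "movie"), [0, 3], [7, 7])
print(seeds.node_dict)  # {'user': [0, 3], 'movie': [7, 7]}
```

Keeping one entry per node type means that ID 7 for a movie is never confused with ID 7 for a user when neighbors are sampled.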


codecov bot commented Jan 5, 2024

Codecov Report

Attention: 91 lines in your changes are missing coverage. Please review.

Comparison is base (c65cff9) 86.46% compared to head (472a2a0) 86.27%.

| Files | Patch % | Lines |
|---|---|---|
| ...rch_geometric/distributed/dist_neighbor_sampler.py | 0.00% | 91 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8722      +/-   ##
==========================================
- Coverage   86.46%   86.27%   -0.20%     
==========================================
  Files         479      479              
  Lines       31189    31267      +78     
==========================================
+ Hits        26969    26976       +7     
- Misses       4220     4291      +71     


@denadai2

thanks for doing this!!

@github-actions github-actions bot removed the sampler label Feb 4, 2024
@rusty1s rusty1s merged commit e4568a2 into pyg-team:master Feb 4, 2024
15 of 16 checks passed
rusty1s added a commit that referenced this pull request Feb 5, 2024
…tributed solution (#8820)

**Description:**

The purpose of this PR is to add an end-to-end link prediction example
that runs distributed training on the MovieLens dataset, whose edges
carry temporal information.

This example is strongly inspired by:
[distributed_cpu.py](https://github.com/pyg-team/pytorch_geometric/blob/f9470530bd23bbcff93f73c6818af3fe03f0bf67/examples/distributed/pyg/distributed_cpu.py)
and
[temporal_link_pred.py](https://github.com/pyg-team/pytorch_geometric/blob/master/examples/hetero/temporal_link_pred.py)

**IMPORTANT INFORMATION:** This script depends on the [Enable distributed
link hetero sampling
PR](#8722) and will
not work without the changes made in that PR, so that PR should be
merged first.

**Script information:**
- Link prediction
- Edge based temporal information
- MovieLens dataset

**How to run:**
The example should be run on at least 2 machines. Before running it,
generate partitions of the MovieLens dataset with the
[partition_graph.py](https://github.com/pyg-team/pytorch_geometric/blob/master/examples/distributed/pyg/partition_graph.py)
script, using the command below:
`python partition_graph.py --MovieLens --num_partitions {number of
machines}`

Example commands to run the script:
Node 0:
`python ./distributed_link_temporal_cpu.py --num_nodes 2 --node_rank 0
--batch_size 1024 --master_addr {ip address of one of the machines}
--ddp_port 11111 --train_loader_port 11112 --test_loader_port 11113`

Node 1:
`python ./distributed_link_temporal_cpu.py --num_nodes 2 --node_rank 1
--batch_size 1024 --master_addr {ip address of one of the machines}
--ddp_port 11111 --train_loader_port 11112 --test_loader_port 11113`

(As you can see, the only difference is the `--node_rank` argument.)
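
Since the per-node commands above differ only in `--node_rank`, they can be generated rather than typed by hand. The helper below is a rough sketch, not part of the PR: `build_commands` is a hypothetical name, the flags and default values are copied from the commands above, and `192.0.2.10` is a placeholder address.

```python
import shlex
from typing import List


def build_commands(num_nodes: int, master_addr: str, batch_size: int = 1024,
                   ddp_port: int = 11111, train_loader_port: int = 11112,
                   test_loader_port: int = 11113) -> List[str]:
    """Return one launch command per node; only --node_rank varies."""
    commands = []
    for rank in range(num_nodes):
        args = [
            "python", "./distributed_link_temporal_cpu.py",
            "--num_nodes", str(num_nodes),
            "--node_rank", str(rank),
            "--batch_size", str(batch_size),
            "--master_addr", master_addr,
            "--ddp_port", str(ddp_port),
            "--train_loader_port", str(train_loader_port),
            "--test_loader_port", str(test_loader_port),
        ]
        # shlex.join quotes arguments where needed for a POSIX shell.
        commands.append(shlex.join(args))
    return commands


# Placeholder address; replace with the IP of one of your machines.
for rank, cmd in enumerate(build_commands(2, "192.0.2.10")):
    print(f"Node {rank}: {cmd}")
```

Each printed line is meant to be run on the machine with the matching rank.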

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>