Add base class DistLoader #8079

Merged
merged 15 commits into master from intel/dist-loader
Oct 2, 2023

Conversation

JakubPietrakIntel
Contributor

@JakubPietrakIntel JakubPietrakIntel commented Sep 27, 2023

[1/3] Distributed Loaders PRs
This PR adds the base class for distributed loaders (`DistLoader`), which manages the RPC connection and handles requests from `DistNeighborSampler` processes.

It also includes the basic `DistNeighborSampler` functions used by the loader.
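
The split of responsibilities described above (a base class owns the connection lifecycle, subclasses only decide what to request) can be illustrated with a minimal sketch. This is not the code from this PR: a thread and two in-process queues stand in for the RPC connection, and every name here (`ToyDistLoaderBase`, `ToyNeighborLoader`, `connect`, `request`, `shutdown`, `sample_fn`) is hypothetical.

```python
import queue
import threading


class ToyDistLoaderBase:
    """Hypothetical base class: owns the channel to a sampler worker.

    A thread plus two queues stand in for a real RPC connection.
    """

    def __init__(self, sample_fn):
        self._sample_fn = sample_fn      # work done on the "sampler" side
        self._requests = queue.Queue()   # loader -> sampler
        self._results = queue.Queue()    # sampler -> loader
        self._worker = None

    def connect(self):
        # Start the sampler worker (the stand-in for opening the connection).
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def _serve(self):
        # Sampler side: answer requests until a None sentinel arrives.
        while True:
            seeds = self._requests.get()
            if seeds is None:
                break
            self._results.put(self._sample_fn(seeds))

    def request(self, seeds):
        # Loader side: send one request and block until its result is back.
        if self._worker is None:
            raise RuntimeError("call connect() first")
        self._requests.put(seeds)
        return self._results.get(timeout=5)

    def shutdown(self):
        # Stop the worker and close the stand-in connection.
        if self._worker is not None:
            self._requests.put(None)
            self._worker.join(timeout=5)
            self._worker = None


class ToyNeighborLoader(ToyDistLoaderBase):
    """Hypothetical subclass: only decides how seeds are batched."""

    def __init__(self, seeds, batch_size, sample_fn):
        super().__init__(sample_fn)
        self.seeds = list(seeds)
        self.batch_size = batch_size

    def __iter__(self):
        for i in range(0, len(self.seeds), self.batch_size):
            yield self.request(self.seeds[i:i + self.batch_size])
```

In this sketch a subclass never touches the queues directly, which mirrors the idea of keeping connection handling in one shared base class while node-level and link-level loaders differ only in what they ask the sampler for.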

  1. Add base class DistLoader #8079
  2. Add DistributedNeighborLoader [3/6] #8080
  3. Add DistributedLinkNeighborLoader [4/6] #8085

Other PRs related to this module:
DistSampler: #7974
GraphStore\FeatureStore: #8083

@JakubPietrakIntel JakubPietrakIntel changed the title from "Add base class DistLoader and utils" to "Add base class DistLoader" Sep 27, 2023
@codecov

codecov bot commented Sep 27, 2023

Codecov Report

Merging #8079 (597d472) into master (a9776e6) will decrease coverage by 1.14%.
Report is 1 commit behind head on master.
The diff coverage is 38.36%.

@@            Coverage Diff             @@
##           master    #8079      +/-   ##
==========================================
- Coverage   89.42%   88.28%   -1.14%     
==========================================
  Files         467      471       +4     
  Lines       27913    28151     +238     
==========================================
- Hits        24962    24854     -108     
- Misses       2951     3297     +346     
| Files | Coverage Δ |
|---|---|
| torch_geometric/distributed/rpc.py | 87.62% <100.00%> (-1.04%) ⬇️ |
| torch_geometric/utils/__init__.py | 100.00% <ø> (ø) |
| torch_geometric/utils/subgraph.py | 98.27% <92.00%> (-1.73%) ⬇️ |
| torch_geometric/data/hypergraph_data.py | 85.00% <85.00%> (ø) |
| torch_geometric/distributed/utils.py | 0.00% <0.00%> (ø) |
| torch_geometric/distributed/dist_loader.py | 0.00% <0.00%> (ø) |
| ...rch_geometric/distributed/dist_neighbor_sampler.py | 0.00% <0.00%> (ø) |

... and 30 files with indirect coverage changes


@github-actions github-actions bot removed the loader label Oct 2, 2023
@rusty1s rusty1s merged commit 13b3243 into master Oct 2, 2023
15 of 16 checks passed
@rusty1s rusty1s deleted the intel/dist-loader branch October 2, 2023 13:50
rusty1s added a commit that referenced this pull request Oct 9, 2023
This code is part of the overall distributed training support for PyG.

`DistNeighborSampler` leverages the `NeighborSampler` class from
`pytorch_geometric` and the `neighbor_sample` function from `pyg-lib`.
However, because distributed training requires synchronising the results
between machines after each layer, the part of the code responsible for
sampling was implemented in Python.
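
To make the per-layer synchronisation concrete, here is a toy sketch in plain Python. It is not the `DistNeighborSampler` implementation: the two "machines" are just dictionaries in one process, the per-partition call stands in for a remote request, and all names (`PARTITIONS`, `NODE_TO_PARTITION`, `sample_one_hop_local`, `sample_layerwise`) are hypothetical.

```python
import random
from collections import defaultdict

# Toy setup (hypothetical): two "machines", each holding the adjacency
# lists of the nodes it owns.
PARTITIONS = {
    0: {0: [1, 2, 3], 1: [0, 4]},
    1: {2: [0, 5], 3: [4], 4: [1, 3, 5], 5: [2]},
}
NODE_TO_PARTITION = {n: p for p, adj in PARTITIONS.items() for n in adj}


def sample_one_hop_local(partition_id, nodes, k, rng):
    """Sample up to k neighbors per node using one partition's local data."""
    adj = PARTITIONS[partition_id]
    return {n: rng.sample(adj[n], min(k, len(adj[n]))) for n in nodes}


def sample_layerwise(seeds, num_neighbors, seed=0):
    """Sample hop by hop, merging partition results after every layer."""
    rng = random.Random(seed)
    frontier = list(seeds)
    visited = set(seeds)
    edges = []  # (node, sampled neighbor) pairs
    for k in num_neighbors:
        # Route each frontier node to the partition that owns it.
        by_partition = defaultdict(list)
        for n in frontier:
            by_partition[NODE_TO_PARTITION[n]].append(n)
        # One request per partition (a remote call in a real system).
        merged = {}
        for pid, nodes in sorted(by_partition.items()):
            merged.update(sample_one_hop_local(pid, nodes, k, rng))
        # Synchronisation point: the next frontier can only be built once
        # every partition has answered for the current layer.
        next_frontier = []
        for src in frontier:
            for dst in merged[src]:
                edges.append((src, dst))
                if dst not in visited:
                    visited.add(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return sorted(visited), edges
```

The point of the sketch is the loop structure: the frontier for layer `i + 1` depends on the merged results of layer `i` from all partitions, so the hop loop has to live in the coordinating code rather than inside a single fused local sampling call.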

Added support for the following sampling methods:
- node, edge, negative, disjoint, temporal

**TODOs:**

- [x] finish hetero part
- [x] subgraph sampling

**This PR should be merged together with other distributed PRs:**
pyg-lib: [#246](pyg-team/pyg-lib#246),
[#252](pyg-team/pyg-lib#252)
GraphStore\FeatureStore:
#8083
DistLoaders:
1.  #8079
2.  #8080
3.  #8085

---------

Co-authored-by: JakubPietrakIntel <jakub.pietrak@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZhengHongming888 <hongming.zheng@intel.com>
Co-authored-by: Jakub Pietrak <97102979+JakubPietrakIntel@users.noreply.github.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
rusty1s added a commit that referenced this pull request Nov 6, 2023
This code is part of the overall distributed training support for PyG.

Please be aware that this PR should be merged before the Loaders package! -
@JakubPietrakIntel
Loaders:
1.  #8079
2.  #8080
3.  #8085

Other PRs related to this module:
DistSampler: #7974

---------

Co-authored-by: JakubPietrakIntel <jakub.pietrak@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
rusty1s added a commit that referenced this pull request Nov 10, 2023
**[2/3] Distributed Loaders PRs**
This PR includes `DistributedNeighborLoader`, used for processing node
sampler output in a distributed training setup.


1.  #8079
2.  #8080
3.  #8085

Other PRs related to this module:
DistSampler: #7974
GraphStore\FeatureStore:
#8083

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
rusty1s added a commit that referenced this pull request Nov 10, 2023
**[3/3] Distributed Loaders PRs**
This PR includes `DistributedLinkNeighborLoader`, used for processing
edge sampler output in a distributed training setup.


1.  #8079
2.  #8080
3.  #8085

Other PRs related to this module:
DistSampler: #7974
GraphStore\FeatureStore:
#8083

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
2 participants