Conversation

guangyey (Collaborator) commented Jan 29, 2024

Stack from ghstack (oldest at bottom):

Motivation

As mentioned in [[RFC] Intel GPU Runtime Upstreaming](#114842), the last runtime component we would like to upstream is the `Generator`, which is responsible for pseudo-random number generation. To facilitate code review, we split the changes into 2 PRs. This is one of the 2 PRs and covers the changes under `aten`.

Design

Following the previous design, `c10::GeneratorImpl` is the device-agnostic abstraction of a random number generator. We therefore introduce an XPU generator, `XPUGeneratorImpl`, inheriting from `c10::GeneratorImpl`, to manage the random state of an Intel GPU device. The Intel GPU runtime `Generator` adopts the same algorithm as the CPU one. The corresponding C++ files are placed in the `aten/src/ATen/xpu/` folder and built into `libtorch_xpu.so`.
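For orientation, here is a trimmed sketch of what such a subclass can look like. The members mirror the `c10::GeneratorImpl` virtual interface and the CPU/CUDA generator conventions; the exact signatures are illustrative, not copied from the upstreamed header:

```cpp
#include <c10/core/GeneratorImpl.h>
#include <cstdint>

namespace at {

// Sketch: an XPU generator behind the device-agnostic c10::GeneratorImpl
// abstraction (names follow the CPU/CUDA generator conventions).
struct XPUGeneratorImpl : public c10::GeneratorImpl {
  XPUGeneratorImpl(c10::DeviceIndex device_index = -1);
  ~XPUGeneratorImpl() override = default;

  // Seed management required by the base interface.
  void set_current_seed(uint64_t seed) override;
  uint64_t current_seed() const override;
  uint64_t seed() override;  // draws a fresh nondeterministic seed

  // (De)serialization of the generator state as a CPU byte tensor.
  void set_state(const c10::TensorImpl& new_state) override;
  c10::intrusive_ptr<c10::TensorImpl> get_state() const override;

  static c10::DeviceType device_type();  // c10::DeviceType::XPU

 private:
  XPUGeneratorImpl* clone_impl() const override;
  uint64_t seed_ = c10::default_rng_seed_val;
  uint64_t offset_ = 0;  // position within the current random stream
};

} // namespace at
```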
This PR provides the following APIs (a usage sketch follows the list):

  • getDefaultXPUGenerator
  • createXPUGenerator
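
For illustration, a hedged usage sketch of these two entry points. The `at::xpu::detail` namespace, the header path, and the mutex discipline are assumptions carried over from the CUDA counterparts (`at::cuda::detail::getDefaultCUDAGenerator` / `createCUDAGenerator`), not guaranteed signatures:

```cpp
#include <ATen/ATen.h>
#include <ATen/xpu/XPUGeneratorImpl.h>  // assumed location, per the aten/src/ATen/xpu/ placement above
#include <mutex>

void seed_and_sample() {
  // One default generator per device, created lazily and shared process-wide.
  at::Generator default_gen = at::xpu::detail::getDefaultXPUGenerator(/*device=*/0);
  {
    // By convention, generator state changes are guarded by the generator's own mutex.
    std::lock_guard<std::mutex> lock(default_gen.mutex());
    default_gen.set_current_seed(42);
  }

  // A private generator whose random stream is independent of the default one.
  at::Generator private_gen = at::xpu::detail::createXPUGenerator(/*device=*/0);

  // Random ops accept an explicit generator argument.
  auto t = at::empty({4}, at::device(at::kXPU).dtype(at::kFloat));
  t.uniform_(0.0, 1.0, private_gen);
}
```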

Additional Context

The 2nd PR will cover the Python frontend.

The differences from CUDA:
The generator-related ATen C++ APIs map 1:1 to their CUDA counterparts.
`XPUGeneratorImpl`'s member functions differ slightly from CUDA's.
XPU lacks counterparts to the CUDA-specific APIs listed below (the first three relate to CUDA graph capture and its Philox state handling, and `reset_rnn_state` to cuDNN RNNs):

  • capture_prologue
  • capture_epilogue
  • philox_cuda_state
  • reset_rnn_state

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

pytorch-bot (bot) commented Jan 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118528

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 03217cd with merge base 685d862:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

guangyey added the ciflow/xpu, intel, and topic: new features labels Jan 30, 2024
guangyey added the release notes: xpu label Jan 30, 2024
guangyey added a commit that referenced this pull request Jan 30, 2024
ghstack-source-id: 2384013
Pull Request resolved: #118528
cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10

[ghstack-poisoned]
pytorch-bot added the ciflow/trunk label Feb 26, 2024
pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot (Collaborator)

Merge failed

Reason: 1 jobs have failed, first few of them are: xpu / linux-jammy-xpu-py3.8 / build

Details for Dev Infra team: raised by workflow job.

guangyey added a commit that referenced this pull request Feb 26, 2024

# Motivation
According to [[1/2] Intel GPU Runtime Upstreaming for Generator](#118528), as mentioned in [[RFC] Intel GPU Runtime Upstreaming](#114842), the second PR covers the changes under the `python frontend`.

# Design
Currently, it primarily offers generator-related APIs, including

- `torch.xpu.default_generators`
- `torch.xpu.get_rng_state`
- `torch.xpu.get_rng_state_all`
- `torch.xpu.initial_seed`
- `torch.xpu.manual_seed`
- `torch.xpu.manual_seed_all`
- `torch.xpu.seed`
- `torch.xpu.seed_all`
- `torch.xpu.set_rng_state`
- `torch.xpu.set_rng_state_all`

# Additional Context
The differences from CUDA:
The generator-related Python frontend APIs map 1:1 to their CUDA counterparts.

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10

[ghstack-poisoned]
guangyey requested a review from albanD February 26, 2024 08:30
albanD (Collaborator) left a comment

SGTM!

```cpp
c10::once_flag init_flag;
DeviceIndex num_gpus = -1;
std::deque<c10::once_flag> xpu_gens_init_flag;
```
Oh, interesting. Sounds good!
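
For readers unfamiliar with the pattern in that hunk: the globals implement lazy, thread-safe per-device initialization of a generator pool, the same scheme the CUDA generators use. Below is a self-contained sketch of how such flags are typically consumed, with `std::` stand-ins for the `c10::` types and a placeholder `int` in place of the real generator objects, so it compiles outside PyTorch:

```cpp
#include <cstdint>
#include <deque>
#include <iostream>
#include <mutex>
#include <vector>

using DeviceIndex = int8_t;  // stand-in for c10::DeviceIndex

namespace {
std::once_flag init_flag;                   // guards the one-time pool setup
DeviceIndex num_gpus = -1;                  // device count, discovered lazily
std::deque<std::once_flag> gens_init_flag;  // one init flag per device
std::vector<int> default_gens;              // placeholder per-device "generators"

void initGenPool() {
  num_gpus = 2;  // placeholder for a real device_count() query
  // once_flag is neither copyable nor movable, so std::vector cannot hold it;
  // std::deque never relocates existing elements, hence the deque above.
  gens_init_flag.resize(num_gpus);
  default_gens.resize(num_gpus);
}

int& getDefaultGenerator(DeviceIndex device) {
  std::call_once(init_flag, initGenPool);       // run the pool setup exactly once
  std::call_once(gens_init_flag[device], [&] {  // init this device's entry exactly once
    default_gens[device] = 42 + device;         // placeholder for create-and-seed
  });
  return default_gens[device];
}
} // namespace

int main() {
  std::cout << getDefaultGenerator(0) << '\n';  // prints 42
}
```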

guangyey (Collaborator, Author)

@pytorchbot merge

pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Feb 28, 2024
Pull Request resolved: #118613
Approved by: https://github.com/gujinghui, https://github.com/EikanWang, https://github.com/jgong5, https://github.com/albanD
github-actions (bot) deleted the gh/guangyey/9/head branch March 28, 2024 01:53
Labels
ciflow/trunk, ciflow/xpu, intel, Merged, open source, release notes: xpu, topic: new features
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

6 participants