
[Docs][MPS] Add MPS environment variable table #129008

Closed. Wants to merge 8 commits.

Conversation

qqaatw
Collaborator

@qqaatw qqaatw commented Jun 18, 2024

[ghstack-poisoned]

pytorch-bot bot commented Jun 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/129008

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 408c627 with merge base 9a7e251:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

qqaatw added a commit that referenced this pull request Jun 18, 2024
ghstack-source-id: 6905ac57e91d92d3c30d9be625cdfa7e403bdcf4
Pull Request resolved: #129008
[ghstack-poisoned]
qqaatw added a commit that referenced this pull request Jun 18, 2024
ghstack-source-id: 98f71f85ac9b9ea53c44cac317d22b2245b7db90
Pull Request resolved: #129008
@qqaatw qqaatw marked this pull request as ready for review June 18, 2024 23:18
[ghstack-poisoned]
qqaatw added a commit that referenced this pull request Jun 19, 2024
ghstack-source-id: e7fa7fbfd215d9727b61d530241a16ecd564cb49
Pull Request resolved: #129008
[ghstack-poisoned]
qqaatw added a commit that referenced this pull request Jun 19, 2024
ghstack-source-id: b7f196ebe495fd814d817a84b50cea35f48c0365
Pull Request resolved: #129008
Contributor

@malfet malfet left a comment


Looks good to me

[ghstack-poisoned]
qqaatw added a commit that referenced this pull request Jun 19, 2024
ghstack-source-id: 006b50c0d43befe8ec719905d6dba73b7f80b3e3
Pull Request resolved: #129008
[ghstack-poisoned]
@qqaatw
Collaborator Author

qqaatw commented Jun 20, 2024

@pytorchbot merge -f "doc tests passed"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f only as a last resort and instead consider -i/--ignore-current, which continues the merge while ignoring current failures; this allows currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Jun 25, 2024
Allow users to decide whether they want to have fast math enabled via env var
Pull Request resolved: #129007
Approved by: https://github.com/malfet
ghstack dependencies: #129006, #129008
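The stacked PR above (#129007) gates fast math behind an environment variable. A minimal sketch of how such a boolean-style toggle is typically read from the environment; the variable name below is a placeholder for illustration, not necessarily the one #129007 defines:

```python
import os

# Placeholder name for illustration; see #129007 for the actual variable.
FAST_MATH_ENV_VAR = "PYTORCH_MPS_FAST_MATH"

def fast_math_enabled(default: bool = False) -> bool:
    """Parse a boolean-style environment toggle; unset falls back to the default."""
    raw = os.environ.get(FAST_MATH_ENV_VAR)
    if raw is None:
        return default
    # Treat "0", "false", "off", and empty string as disabled; anything else enables.
    return raw.strip().lower() not in ("0", "false", "off", "")

os.environ[FAST_MATH_ENV_VAR] = "1"
print(fast_math_enabled())  # True
```

Reading the variable once at import time (rather than per call) is the usual pattern in native code, since the flag is not expected to change mid-run.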
pytorchmergebot pushed a commit that referenced this pull request Jun 26, 2024
This PR generalizes the multi_tensor_apply function for other fused optimizers

Pull Request resolved: #129105
Approved by: https://github.com/malfet
ghstack dependencies: #129006, #129008, #129007
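The commit message above describes generalizing multi_tensor_apply for other fused optimizers. A toy pure-Python sketch of the underlying idea, chunking lists of tensors so one call processes many tensors instead of issuing one launch per tensor; the helper names here are illustrative, not the actual PyTorch internals:

```python
def multi_tensor_apply(tensor_lists, op, chunk_size=4):
    """Apply `op` to aligned chunks of each list (e.g. [params, grads]),
    so work is batched per chunk rather than per tensor."""
    n = len(tensor_lists[0])
    for start in range(0, n, chunk_size):
        chunk = [lst[start:start + chunk_size] for lst in tensor_lists]
        op(*chunk)  # one "launch" per chunk instead of per tensor

# Toy SGD step (p -= lr * g), applied chunk-wise over one-element "tensors".
params = [[1.0], [2.0], [3.0]]
grads = [[0.5], [0.5], [0.5]]

def sgd_chunk(ps, gs, lr=0.1):
    for p, g in zip(ps, gs):
        p[0] -= lr * g[0]

multi_tensor_apply([params, grads], sgd_chunk, chunk_size=2)
```

The real implementation batches tensor metadata into a fixed-size kernel argument struct for the GPU, but the chunking structure is the same.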
pytorchmergebot pushed a commit that referenced this pull request Jun 27, 2024
```
[-------------------------------------- Fused SGD --------------------------------------]
                                                          |  Fused: True  |  Fused: False
1 threads: ------------------------------------------------------------------------------
      numel: 1024, num_tensors: 100, momentum: True       |        2      |       15
      numel: 1024, num_tensors: 100, momentum: False      |        2      |        5
      numel: 65536, num_tensors: 100, momentum: True      |        3      |       16
      numel: 65536, num_tensors: 100, momentum: False     |        2      |        5
      numel: 1048576, num_tensors: 100, momentum: True    |       11      |       16
      numel: 1048576, num_tensors: 100, momentum: False   |        8      |        6
      numel: 1024, num_tensors: 500, momentum: True       |       29      |       70
      numel: 1024, num_tensors: 500, momentum: False      |       20      |       24
      numel: 65536, num_tensors: 500, momentum: True      |       33      |       76
      numel: 65536, num_tensors: 500, momentum: False     |       22      |       26
      numel: 1048576, num_tensors: 500, momentum: True    |       70      |       80
      numel: 1048576, num_tensors: 500, momentum: False   |       43      |       40
      numel: 1024, num_tensors: 1000, momentum: True      |      108      |      139
      numel: 1024, num_tensors: 1000, momentum: False     |       72      |       48
      numel: 65536, num_tensors: 1000, momentum: True     |      116      |      150
      numel: 65536, num_tensors: 1000, momentum: False    |       77      |       52
      numel: 1048576, num_tensors: 1000, momentum: True   |      190      |      170
      numel: 1048576, num_tensors: 1000, momentum: False  |      120      |       50
```

```python
def profile_fused_sgd():
    import itertools

    import torch
    import torch.utils.benchmark as benchmark
    from torch.optim.sgd import sgd

    def profile(fn, params, grads, momentum_buffer_list, fused):
        fn(
            params,
            grads,
            momentum_buffer_list,
            momentum=len(momentum_buffer_list) > 0,
            dampening=0.0,
            nesterov=False,
            foreach=False,
            fused=fused,
            lr=1e-3,
            weight_decay=0.0,
            maximize=False,
            grad_scale=None,
            found_inf=None,
        )
        # Wait for the MPS stream so the timer measures the full optimizer step.
        torch.mps.synchronize()

    device = "mps"
    results = []

    for num_tensors, numel, momentum in itertools.product(
        [100, 500, 1000], [1024, 65536, 1048576], [True, False]
    ):
        sublabel = f"numel: {numel}, num_tensors: {num_tensors}, momentum: {momentum}"
        print(sublabel)
        params, grads = [
            [torch.arange(numel, dtype=torch.float32, device=device) + (numel * i) for i in range(num_tensors)]
            for _ in range(2)
        ]
        momentum_buffer_list = (
            [torch.arange(numel, dtype=torch.float32, device=device) + (numel * i) for i in range(num_tensors)]
            if momentum
            else []
        )
        fn = sgd

        for fused in [True, False]:
            t = benchmark.Timer(
                stmt="profile(fn, params, grads, momentum_buffer_list, fused)",
                label="Fused SGD",
                sub_label=sublabel,
                globals=locals(),
                description=f"Fused: {fused}",
            ).blocked_autorange(min_run_time=5)
            results.append(t)

    compare = benchmark.Compare(results)
    compare.trim_significant_figures()
    compare.colorize(rowwise=True)
    compare.print()
```
Pull Request resolved: #129350
Approved by: https://github.com/janeyx99
ghstack dependencies: #129006, #129008, #129007, #129105
4 participants