Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add opacus microbenchmarks code #368

Closed
ffuuugor opened this issue Mar 11, 2022 · 0 comments
Closed

Add opacus microbenchmarks code #368

ffuuugor opened this issue Mar 11, 2022 · 0 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@ffuuugor
Copy link
Contributor

Merge the code used for opacus microbenchmarks into opacus repo

@ffuuugor ffuuugor changed the title Microbenchmarks Merge microbenchmarks Mar 11, 2022
@ffuuugor ffuuugor added the enhancement New feature or request label Mar 11, 2022
@ffuuugor ffuuugor added this to the 1.1.1 milestone Mar 25, 2022
@karthikprasad karthikprasad changed the title Merge microbenchmarks Add opacus microbenchmarks code Apr 3, 2022
@jessijzhao jessijzhao mentioned this issue Apr 11, 2022
7 tasks
moaradwan pushed a commit to moaradwan/opacus that referenced this issue Aug 25, 2022
Summary:
## Types of changes

- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Docs change / refactoring / dependency upgrade

## Motivation and Context / Related issue
There's a task pytorch#368 for committing benchmark code. In this change I add these benchmarks into CI integration tests. To choose thresholds I ran the benchmarks locally on all  the layers with (batch size: 16, num_runs: 100, num_repeats: 20, forward_only: False), and generated the following report:

|              | memory*  | memory* | memory*     | memory* | memory*      | runtime                | runtime              | runtime            | runtime                | runtime            |
|--------------|---------|--------|------------|--------|-------------|------------------------|----------------------|--------------------|------------------------|--------------------|
| value        | control | dp     | dp/control | gsm    | gsm/control | control                | dp                   | dp/control         | gsm                    | gsm/control        |
| base_layer   |         |        |            |        |             |                        |                      |                    |                        |                    |
| conv         | 0.0     |        |            | 0.0    |             | 2.021756922606001      |                      |                    | 3.2889059911645036     | 1.6267563891534373 |
| embedding    | 0.0     |        |            | 0.0    |             | 0.002484286398502263   |                      |                    | 0.013664713416999803   | 5.5004581698946    |
| groupnorm    | 0.0     |        |            | 0.0    |             | 0.0001871487290072764  |                      |                    | 0.00043170701800136156 | 2.306759016165034  |
| gru          | 0.0     | 0.0    |            | 0.0    |             | 0.045029744959007065   | 0.057370035271503174 | 1.2740475284443677 | 0.2402042072270033     | 5.334345274344187  |
| instancenorm | 0.0     |        |            | 0.0    |             | 0.004493124293996517   |                      |                    | 0.006058429501005777   | 1.3483779002287433 |
| layernorm    | 0.0     |        |            | 0.0    |             | 0.00011227587499979562 |                      |                    | 0.0002241125804985131  | 1.9960884784814286 |
| linear       | 0.0     |        |            | 0.0    |             | 0.001010556231000001   |                      |                    | 0.003052972127999998   | 3.021080900148341  |
| lstm         | 0.0     | 0.0    |            | 0.0    |             | 0.052634652085002925   | 0.06508583683050075  | 1.2365586975931682 | 0.2982182763324963     | 5.665816425477371  |
| mha          | 0.0     | 0.0    |            | 0.0    |             | 0.018872260358001765   | 0.01870937360499738  | 0.9913689854890476 | 0.02688384014700477    | 1.424516175435558  |
| rnn          | 0.0     | 0.0    |            | 0.0    |             | 0.01576623683249454    | 0.02184348723049516  | 1.3854597937711604 | 0.10178373254250346    | 6.455803856296582  |

(*) This report wasn't generated on a machine with CUDA so the memory wasn't measured. Will update later when it runs in CI on a GPU machine.

Using the report and section 3 in the [paper](https://arxiv.org/pdf/2109.12298.pdf), I parameterised the runtime and memory thresholds for different layers.

## How Has This Been Tested (if it applies)
 I ran the jobs locally and generated reports.

## Checklist

- [X] The documentation is up-to-date with the changes I made.
- [X] I have read the **CONTRIBUTING** document and completed the CLA (see **CONTRIBUTING**).
- [ ] All tests passed, and additional code has been covered with new tests.

Pull Request resolved: pytorch#479

Differential Revision: D38999201

Pulled By: moaradwan

fbshipit-source-id: 3d02931970e39ea331674c9f0676db9e22c5edaa
This was referenced Aug 25, 2022
facebook-github-bot pushed a commit that referenced this issue Aug 31, 2022
Summary:
## Types of changes

- [ ] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Docs change / refactoring / dependency upgrade

 Issue: #368
## Motivation and Context / Related issue
There's a task #368 for committing benchmark code. In this change I add these benchmarks into CI integration tests. To choose thresholds I ran the benchmarks locally on all  the layers with (batch size: 16, num_runs: 100, num_repeats: 20, forward_only: False), ~and generated the following report:~ (please check the [comment below](#481 (comment)))

Using the [report](#481 (comment)) and section 3 in the [paper](https://arxiv.org/pdf/2109.12298.pdf), I parameterised the runtime and memory thresholds for different layers.

## How Has This Been Tested (if it applies)
 - I ran the jobs locally and generated reports.
 - Local CircleCI config validation `circleci config process .circleci/config.yml`
 - Local CircleCI job run: `circleci local execute --job JOB_NAME`
## Checklist

- [X] The documentation is up-to-date with the changes I made.
- [X] I have read the **CONTRIBUTING** document and completed the CLA (see **CONTRIBUTING**).
- [x] All tests passed, and additional code has been covered with new tests.

Pull Request resolved: #481

Reviewed By: ffuuugor

Differential Revision: D39026827

Pulled By: moaradwan

fbshipit-source-id: 5c6a7b1b5faaca9e5e95bd172b479a02f59b2b69
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants