
fix(aten::batch_norm): A new batch norm implementation that hopefully doesn't have the same performance cost #55


Merged: 1 commit merged into master on May 14, 2020

Conversation

narendasan
Collaborator

Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com

Description

Addresses the performance issues seen with large input sizes under the conv-based batch norm implementation. The new implementation uses scale layers instead of convolutions.
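The scale-layer approach works because inference-time batch norm is a per-channel affine transform, so its parameters can be folded into a single scale and shift rather than expressed as a 1x1 convolution. A minimal numpy sketch of that folding, assuming the usual batch norm formulation (function and variable names here are illustrative, not taken from the PR):

```python
import numpy as np

def batch_norm_as_scale(x, gamma, beta, mean, var, eps=1e-5):
    """Apply batch norm as a fused per-channel scale + shift.

    Equivalent to gamma * (x - mean) / sqrt(var + eps) + beta,
    rearranged into the y = x * scale + shift form that a
    scale layer applies in a single pass.
    """
    scale = gamma / np.sqrt(var + eps)   # fold gamma and variance
    shift = beta - mean * scale          # fold beta and running mean
    return x * scale + shift
```

Since `scale` and `shift` depend only on the frozen batch norm parameters, they can be precomputed once at conversion time, avoiding the per-inference overhead of a convolution kernel.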

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation and have regenerated the documentation (make html in docsrc)
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

doesn't have the same performance cost

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
@narendasan narendasan added the component: converters Issues re: Specific op converters label May 8, 2020
@narendasan narendasan marked this pull request as draft May 8, 2020 07:21
@narendasan
Collaborator Author

Fix was confirmed:

[JIT]: batch_size: 1
    Average latency: 61.2975 ms
    Average FPS: 16.3139 fps
    Latency Standard Deviation: 0.805419
    FPS Standard Deviation: 0.24558
(excluding initial warmup runs)
[JIT/TRT]: batch_size: 1
    Average latency: 31.1957 ms
    Average FPS: 32.0557 fps
    Latency Standard Deviation: 0.107248
    FPS Standard Deviation: 0.110173

@narendasan narendasan marked this pull request as ready for review May 14, 2020 23:15
@narendasan narendasan merged commit 227dea3 into master May 14, 2020
@narendasan narendasan deleted the batch_norm_alt branch May 14, 2020 23:15
frank-wei pushed a commit that referenced this pull request Jun 4, 2022
Summary:
Pull Request resolved: https://github.com/pytorch/fx2trt/pull/55

Apply pass manager to lower flow

Reviewed By: khabinov

Differential Revision: D35518483

fbshipit-source-id: 48bc9c364cd006cc5a2c1b04d667987827f0a4d4