
[not for land, ci only] fake_quant: add a more memory efficient version #50849

Closed

Commits on Jan 20, 2021

  1. fake_quant: add a more memory efficient version

    Summary:
    
    Not for review yet, a bunch of TODOs need finalizing.
    
    tl;dr: add an alternative implementation of `fake_quantize` which saves
    a mask during the forward pass and uses it to compute the backward.
    
    There are two benefits:
    
    1. the backward function no longer needs the input Tensor, so it can be
    gc'ed earlier by autograd. On MobileNetV2, this reduces QAT overhead
    by ~15% (TODO: link, and absolute numbers). We do add an extra mask Tensor
    to pass around, but it is 4x smaller than the input tensor (one byte per
    bool element vs. four bytes per fp32 element). A future optimization
    would be to pack the mask bitwise and unpack it in the backward.
    
    2. the computation of `qval` can be done only once, in the forward, and
    reused in the backward. No perf change observed so far; TODO: verify
    with better metrics.
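    The mask-saving idea above can be sketched as a custom autograd
    Function. This is an illustrative sketch, not the code in this PR;
    the name `_FakeQuantizeWithMask` and the argument layout are
    hypothetical:

```python
# Sketch of a fake-quantize op that saves only a boolean clipping mask
# for the backward pass, instead of the input tensor. Illustrative only.
import torch

class _FakeQuantizeWithMask(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, scale, zero_point, quant_min, quant_max):
        # Quantize once in the forward.
        q = torch.round(x / scale) + zero_point
        # Straight-through mask: True where the value was NOT clipped.
        mask = (q >= quant_min) & (q <= quant_max)
        q = torch.clamp(q, quant_min, quant_max)
        # Save only the bool mask (1 byte/element vs 4 for fp32 input),
        # so autograd can free the input earlier.
        ctx.save_for_backward(mask)
        return (q - zero_point) * scale

    @staticmethod
    def backward(ctx, grad_out):
        (mask,) = ctx.saved_tensors
        # Straight-through estimator: pass the gradient only where the
        # forward value was inside [quant_min, quant_max]. The input
        # tensor is not needed here.
        return grad_out * mask, None, None, None, None

# Example usage: one in-range element, one clipped element.
x = torch.tensor([0.05, 100.0], requires_grad=True)
y = _FakeQuantizeWithMask.apply(x, 0.1, 0, -128, 127)
y.sum().backward()
# x.grad is 1 where the value was in range, 0 where it was clipped.
```

    The backward matches the standard fake-quant straight-through
    estimator; the only difference from the stock implementation is what
    gets saved for the backward pass.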
    
    TODO: describe in more detail
    
    Test Plan:
    
    OSS / torchvision / MobileNetV2
    ```
    python references/classification/train_quantization.py \
      --print-freq 1 \
      --data-path /data/local/packages/ai-group.imagenet-256-smallest-side/prod/ \
      --output-dir ~/nfs/pytorch_vision_tests/ \
      --backend qnnpack \
      --epochs 5
    TODO paste results here
    ```
    
    TODO more
    
    Reviewers:
    
    Subscribers:
    
    Tasks:
    
    Tags:
    
    ghstack-source-id: f932055ee57b6a4e419d3896fb605c58fc063668
    Pull Request resolved: #50561
    vkuzo committed Jan 20, 2021
    8de18ba