[sparsity] Sparsity parametrization #58705

Conversation
💊 CI failures summary and remediations

As of commit 9a69b41 (more details on the Dr. CI page and at hud.pytorch.org/pr/58705):

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages.
@zafartahirov has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/ao/nn/sparse/linear.py (Outdated)

import torch
from torch.nn import functional as F

class Linear(torch.nn.Linear):
We should have the masking done by a module that can operate on the weights. This would be a FakeSparseModule, whose forward method masks the weights. Can you rewrite this implementation such that:
- We have a base FakeSparseModule class that contains a mask attribute.
- The FakeSparseModule is applied to the weight as a parameterization (see the sketch after this list).
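For illustration only (not something in this PR), a minimal sketch of what applying such a module to the weight as a parametrization could look like, using torch.nn.utils.parametrize; the FakeSparseModule name and its interface are assumptions here:

```
import torch
import torch.nn as nn
from torch.nn.utils import parametrize

class FakeSparseModule(nn.Module):
    """Parametrization that masks the tensor it is attached to."""
    def __init__(self, shape):
        super().__init__()
        # Default mask is all ones, i.e. a no-op until a sparsifier updates it
        self.register_buffer('mask', torch.ones(shape))

    def forward(self, weight):
        return self.mask * weight

linear = nn.Linear(16, 8)
# After registration, linear.weight transparently returns the masked weight
parametrize.register_parametrization(linear, 'weight', FakeSparseModule(linear.weight.shape))
```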
As discussed, the parametrization is going to be part of a separate PR
Sure, in this PR let's do the following:
- Define a FakeSparse module (similar to the mulby module):
class FakeSparse(nn.Module):
    def __init__(self, shape, sparsity_config):
        super().__init__()
        # Initialize mask to be all ones
        self.mask = torch.ones(shape)
        self.sparsity_config = sparsity_config

    def forward(self, x):
        assert self.mask.shape == x.shape
        return self.mask * x
If you are doing the reparameterization in a later PR, use this module manually, similar to how it is used in nn.qat.linear.
class Linear(torch.nn.Linear):
    def __init__(self, in_features, out_features, bias=True, device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super().__init__(in_features, out_features, bias, **factory_kwargs)
        # NOTE: FakeSparse above also expects a sparsity_config; it is omitted in this sketch
        self.fake_sparse = FakeSparse(self.weight.size())

    def forward(self, input):
        return F.linear(input, self.fake_sparse(self.weight), self.bias)
torch/ao/nn/sparse/linear.py (Outdated)

    @classmethod
    def from_dense(cls, dense):
        sparse = cls(dense.in_features, dense.out_features, (dense.bias is not None))
Can we handle this instead by default-initializing the mask attribute to all ones? That way from_dense would not be needed.
Can you elaborate on this? We need a transformation function for the prepare step.
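For readers outside the review, a rough sketch (an assumption, not the PR's actual code) of what such a from_dense transformation would presumably do, namely copy the dense module's parameters into the sparse wrapper:

```
@classmethod
def from_dense(cls, dense):
    sparse = cls(dense.in_features, dense.out_features, (dense.bias is not None))
    # Reuse the dense module's parameters; the mask starts out as all ones
    sparse.weight = dense.weight
    if dense.bias is not None:
        sparse.bias = dense.bias
    return sparse
```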
""" | ||
_version = 1 | ||
_FLOAT_MODULE = torch.nn.Linear | ||
_FLOAT_MODULE = (torch.nn.Linear, SparseLinear) |
I am assuming that nn.Linear is needed for the PTQ flow and SparseLinear for the QAT flow. Is that correct?
Correct
Let's remove the torch.nn.Linear as a supported option. In both flows, we will need to create a SparseLinearModule. In the post-training sparsity case, one could do:
m = nn.Linear(3, 5)
# Sparsifier.prepare would essentially do this:
m_sparse = torch.nn.sparse.linear.from_dense(m)
# m_sparse now has the same weights, but the mask is set to identity
m_sparse.set_mask(mask, pattern)  # Function to override the mask and define the sparsity pattern
# Now we can convert only the sparse linear module.
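Purely as an illustration (not from the PR) of the kind of mask set_mask might receive, a block-structured mask with 1x4 zero blocks, matching the row_block_size=1, col_block_size=4 defaults discussed below, could be built like this:

```
import torch

out_features, in_features = 8, 16
row_block, col_block = 1, 4

mask = torch.ones(out_features, in_features)
# Zero out every other 1x4 block in each row, as an arbitrary example pattern
for r in range(0, out_features, row_block):
    for c in range(0, in_features, 2 * col_block):
        mask[r:r + row_block, c:c + col_block] = 0
```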
weight = mod.weight
if getattr(mod.qconfig, 'mask', False):
    weight = mod.qconfig.mask * mod.weight
weight = mod.weight * mod.mask
Let us make mask an attribute of the FakeSparse module and do this via reparametrization.
The reparametrization will be a separate PR, as there are several things to iron out there.
    @classmethod
    def from_float(cls, mod):
    def from_float(cls, mod, row_block_size=1, col_block_size=4):
We should not need the row/col block size here. This block just applies the mask; the mask should enforce the required sparsity pattern.
I am not sure how we could avoid that -- once we call the from_float, we convert the float model into the sparse quantized model. In the latter we have to pack the weight, and we need the block shape there. The sparsifier has this information, but the current model does not.
I see. From the point of view of keeping the API clean, should we do the following:
- When we create a SparseLinear module, the FakeSparse module should have all the information needed to specify the mask.
- At convert, we rely only on the FakeSparse module:
def from_float(cls, mod):
    sparsity_pattern = mod.fake_sparse.sparsity_config.sparsity_pattern  # This is the shape of the sparse blocks
Is this being addressed in a later PR?
This will be addressed by the torch.ao.utils.convert -- because it will take the sparse_config, the rows/cols can be passed around by that utility. Currently, it will be an argument here to make sure the tests pass. I am going to keep the current implementation as a stopgap measure; this is addressed in the later PRs.
Please add tests so that functionality is verified.
The basic demo for this particular implementation can be found here: https://gist.github.com/z-a-f/1d06ae8d5a509d3c9c1596dcb924afe0

Test Plan: `python test/test_ao_sparsity.py`

Differential Revision: [D28970959](https://our.internmc.facebook.com/intern/diff/D28970959)
        self.assertEqual(model_save.seq[1].parametrizations['weight'][0].mask,
                         model_load.seq[1].parametrizations['weight'][0].mask)

    def test_jit_trace(self):
Can you also add a test to check if parameterized models are scriptable?
The parametrizations are not scriptable
Please mark a TODO to add tests for symbolic tracing as well. We need parametrization to compose with these APIs.
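A rough sketch of what such a TODO test could look like (an assumption, not part of this PR; whether fx tracing composes with parametrized modules is exactly what the test would need to verify):

```
import torch
import torch.nn as nn
from torch.nn.utils import parametrize

class _Mask(nn.Module):
    def __init__(self, shape):
        super().__init__()
        self.register_buffer('mask', torch.ones(shape))

    def forward(self, w):
        return self.mask * w

def test_fx_symbolic_trace_sketch():
    # TODO: move into test_ao_sparsity once parametrization composes
    # with scripting / symbolic tracing.
    linear = nn.Linear(4, 4)
    parametrize.register_parametrization(linear, 'weight', _Mask(linear.weight.shape))
    traced = torch.fx.symbolic_trace(linear)  # may bake the masked weight into the graph
    x = torch.randn(2, 4)
    assert torch.allclose(traced(x), linear(x))
```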
    not.
    """
    def __init__(self, mask):
        super().__init__()
Should we init with a shape instead of a mask? The mask could be initialized to torch.ones(shape):

def __init__(self, shape):
    super().__init__()
    self.register_buffer('mask', torch.ones(shape))
If the user wants to override the mask, they can directly set the attribute.
What if the user already has a mask?
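One way to support both cases, purely as a sketch and not something settled in this thread, would be to accept either a shape or an existing mask:

```
def __init__(self, shape=None, mask=None):
    super().__init__()
    if mask is None:
        # Default to a dense (all-ones) mask of the requested shape
        mask = torch.ones(shape)
    self.register_buffer('mask', mask)
```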
Looks good, a few comments
self.seq[0].weight = nn.Parameter(torch.zeros_like(self.seq[0].weight) + 2.0)
self.seq[1].weight = nn.Parameter(torch.zeros_like(self.seq[1].weight) + 3.0)
if bias:
    self.linear = nn.Parameter(torch.zeros_like(self.linear.bias) + 10.0)
self.seq[0] should be self.seq[1]
Looks good. Some previous comments were missed; please take a look before landing.
Also, please add TODOs for scripting and symbolic tracing test coverage.
Stack from ghstack:

The basic demo for this particular implementation can be found here:
https://gist.github.com/z-a-f/1d06ae8d5a509d3c9c1596dcb924afe0

Test Plan: python test/test_ao_sparsity.py

Differential Revision: D28970959