
inductor: enable weight prepack for LSTM #103071

Closed · wants to merge 35 commits

Conversation

@chunyuan-w (Collaborator) commented Jun 6, 2023

Stack from ghstack (oldest at bottom):

  • Enabled LSTM weight prepack in inductor (a minimal sketch of exercising this path follows this list).
  • Added an mkldnn decomposition for LSTM that does not change across different seq_lens. With the previous decomposition, the graph would differ in the dynamic-shapes use case where seq_lens changes.
  • Extended several inductor utility functions to support List[Tensor] input. Previously those functions only supported Tensor input.
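As a minimal illustration (not the PR's test code; the module sizes and inputs below are made up), the prepack path can be exercised end to end by compiling an inference-mode LSTM on CPU with torch.compile and checking that the compiled module matches eager. The actual UT additionally checks that aten.mkldnn_rnn_layer shows up in the generated code.

```python
# Minimal sketch with assumed sizes (not the PR's UT): with this PR, inductor
# is expected to prepack the LSTM weights via the mkldnn_rnn_layer
# decomposition on CPU while leaving the numerics unchanged.
import torch

mod = torch.nn.LSTM(input_size=16, hidden_size=32, num_layers=2, batch_first=True).eval()
opt_mod = torch.compile(mod)  # inductor is the default backend

inp = torch.randn(4, 10, 16)  # (batch, seq_len, input_size)
with torch.no_grad():
    ref_out, (ref_h, ref_c) = mod(inp)
    opt_out, (opt_h, opt_c) = opt_mod(inp)

torch.testing.assert_close(opt_out, ref_out)
torch.testing.assert_close(opt_h, ref_h)
torch.testing.assert_close(opt_c, ref_c)
```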

Update 2023-07-26:

  • #103851 has moved CPU weight packing to after AOTAutograd. The support in this PR has been fixed to follow the same approach (mainly in 3b207f7#diff-6dffed1ade0ba3e887f9a4eafa3bfcec267ab2365b8adcb91bd391f49b3fd2e3).
    LSTM is decomposed into aten.mkldnn_rnn_layer by layer and by direction, and the weight prepack is done at the mkldnn_rnn_layer level.
  • Added a fix in the RNN __getstate__ function for the case where an LSTM module needs to be recompiled.
    When the module is compiled, the weight tensors, which are the named_parameters of the module, are converted to functional tensors here
    (https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/nn/utils/stateless.py#L125-L128):

        orig_parameters_and_buffers, _ = accessor.swap_tensors_dict(
            untied_parameters_and_buffers, allow_missing=True
        )
        yield

    The forward function of LSTM is then called
    (https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/_functorch/aot_autograd.py#L3379-L3381):

            out = Interpreter(mod).run(*args[params_len:], **kwargs)
        else:
            out = mod(*args[params_len:], **kwargs)

    In the forward function, _flat_weights is updated to match the weights, so its entries also become functional tensors
    (https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/nn/modules/rnn.py#L775-L778):

        def forward(self, input, hx=None):  # noqa: F811
            if not torch.jit.is_scripting():
                if self._weights_have_changed():
                    self._init_flat_weights()

    The weight tensors are converted back to the original tensors (which are no longer functional tensors) before exiting the _reparametrize_module context here
    (https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/nn/utils/stateless.py#L130-L142):

        new_parameters_and_buffers, _ = accessor.swap_tensors_dict(
            orig_parameters_and_buffers, allow_missing=True
        )
        # Sometimes the module is not completely stateless and has some in-place modifications on
        # the _parameters and _buffers dictionaries.
        # Write the changed parameters and buffers back to the original dict.
        parameters_and_buffers.update(
            {
                k: new_parameters_and_buffers[k]
                for k in parameters_and_buffers
                if k in new_parameters_and_buffers
            }
        )

    But since _flat_weights is not among the named_parameters of the module, it still holds functional tensors (only the parameters listed at
    https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/_functorch/aot_autograd.py#L3695-L3698 are converted to functional tensors and reverted back).
    At this point, if the model needs to be recompiled, deepcopy is called
    (https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/_dynamo/utils.py#L915-L917):

        def deepcopy_to_fake_tensor(obj, fake_mode):
            with torch._subclasses.fake_tensor.FakeCopyMode(fake_mode):
                return wrap_fake_exception(lambda: copy.deepcopy(obj))

    This reports UnImplemented, because _flat_weights still contains functional tensors, and triggers a graph break, which is not what we expect
    (https://github.com/pytorch/pytorch/blob/76fb72e24a5a4a47ad1f50c5c94d5c0b7e703531/torch/_subclasses/meta_utils.py#L514):

        torch._is_functional_tensor(t),

    Added a fix in __getstate__ to update _flat_weights whenever the weights have changed, which resolves this issue. The fix is covered in the test_lstm_packed UT; a hedged illustration of the scenario follows below.
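As a hedged illustration of the scenario above (not the actual test_lstm_packed code; sizes are made up): run the compiled LSTM with two different sequence lengths so that a recompile can occur, then confirm that the module's _flat_weights holds plain, non-functional tensors.

```python
# Hedged illustration with assumed sizes (not the PR's exact UT): a second call
# with a different seq_len can trigger a recompile, which deepcopies the
# module; with the __getstate__ fix, _flat_weights no longer holds functional
# tensors at that point, so no graph break is expected.
import torch

mod = torch.nn.LSTM(input_size=8, hidden_size=8, batch_first=True).eval()
opt_mod = torch.compile(mod)

with torch.no_grad():
    opt_mod(torch.randn(2, 5, 8))   # first compilation
    opt_mod(torch.randn(2, 11, 8))  # different seq_len -> possible recompile

for w in mod._flat_weights:
    assert not torch._is_functional_tensor(w)
```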

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov

@pytorch-bot (bot) commented Jun 6, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/103071

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 572a694:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added ciflow/inductor module: cpu CPU specific problem (e.g., perf, algorithm) module: inductor labels Jun 6, 2023
chunyuan-w added a commit that referenced this pull request Jun 6, 2023
ghstack-source-id: 737c90dff8df6b21f3312b0f0292c27beb3ba0bc
Pull Request resolved: #103071
@chunyuan-w chunyuan-w added the topic: not user facing topic category label Jun 6, 2023
@chunyuan-w chunyuan-w marked this pull request as draft June 6, 2023 09:12
chunyuan-w added a commit that referenced this pull request Jun 6, 2023
ghstack-source-id: 0505af5e2b11cc766729c47f127a320878169b43
Pull Request resolved: #103071
@albanD albanD removed their request for review June 6, 2023 14:13
chunyuan-w added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: e0d0d9660c3b071d5d3bf44d06a3d887c514d880
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: cd712c39a1cc7b60961edea4315a0fd036ad75e9
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: 20bead60448b895bb6424ce963a6516eee796592
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: e79ad2a25f0fb95d45802011126611193777f3f7
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jun 7, 2023
ghstack-source-id: 7369d20ccac4c998176e6b65a5881f4864d39d8a
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jun 9, 2023
ghstack-source-id: 3545e5a28d23a8312a1781ee45844b6388ddc15e
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jun 9, 2023
ghstack-source-id: 75f0d346360938c13c93b894a2c0055a3b8549c3
Pull Request resolved: #103071
@chunyuan-w chunyuan-w marked this pull request as ready for review June 9, 2023 02:44
chunyuan-w added a commit that referenced this pull request Jul 27, 2023
ghstack-source-id: 98a914ee892eac94baf0b48b9116eb26f77e66cf
Pull Request resolved: #103071
Review thread on the added test code:

    self.assertFalse(torch._is_functional_tensor(_flat_weight))

    self.assertTrue("aten.mkldnn_rnn_layer" in code)
    self.assertEqual(fn_opt(*inps), mod(*inps))
Collaborator (reviewer) commented:

Can we also cover the case of changing input sizes?

chunyuan-w (Collaborator Author) replied:

Added the test for input of different sequence length to cover this case.
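For reference, a sketch of what such a check could look like (illustrative only; this is not the test added in the PR): compare the compiled LSTM against eager for inputs of different sequence lengths.

```python
# Illustrative sketch (assumed sizes): run the compiled LSTM with two sequence
# lengths and check the results against eager mode.
import torch

mod = torch.nn.LSTM(input_size=8, hidden_size=16, batch_first=True).eval()
opt_mod = torch.compile(mod, dynamic=True)

with torch.no_grad():
    for seq_len in (7, 13):
        inp = torch.randn(2, seq_len, 8)
        ref_out, _ = mod(inp)
        opt_out, _ = opt_mod(inp)
        torch.testing.assert_close(opt_out, ref_out)
```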

Review thread on the lstm decomposition (diff excerpt):

    @@ -2636,6 +2686,43 @@ def one_layer_lstm_data(inp, hidden, params, has_biases, batch_sizes, reverse=Fa
        return out, hidden_out


    def use_mkldnn(input, hx, params):
Collaborator (reviewer) commented:

Add a docstring to summarize the condition here? Also, since it is specific to LSTM, is the name use_mkldnn too general? Or maybe move the function to be local to lstm_impl?

chunyuan-w (Collaborator Author) replied:

Docstring has been updated. Moved use_mkldnn to be a local function.
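Purely as a hypothetical illustration of the refactor discussed above (lstm_impl's signature and the specific conditions below are assumptions, not this PR's code), a local helper with a summary docstring could look roughly like this:

```python
import torch

def lstm_impl(input, hx, params, has_biases, num_layers,
              dropout, train, bidirectional, batch_first):
    # Hypothetical sketch: keep `use_mkldnn` local to the LSTM decomposition,
    # as suggested in the review. The conditions below are placeholders; the
    # real gating logic in this PR may check different properties.
    def use_mkldnn(input, hx, params):
        """Summarize when the oneDNN (mkldnn) LSTM path is taken.

        Placeholder condition: every tensor is fp32/bf16 on CPU; anything else
        falls back to the reference decomposition.
        """
        tensors = [input, *hx, *params]
        return all(t.device.type == "cpu" for t in tensors) and all(
            t.dtype in (torch.float32, torch.bfloat16) for t in tensors
        )

    if use_mkldnn(input, hx, params):
        pass  # mkldnn path: decompose into aten.mkldnn_rnn_layer per layer/direction
    else:
        pass  # reference decomposition fallback
```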

chunyuan-w added a commit that referenced this pull request Jul 27, 2023
ghstack-source-id: b24aaafb4d80787db30d50cd0211dbb31dac876c
Pull Request resolved: #103071
chunyuan-w added a commit that referenced this pull request Jul 27, 2023
ghstack-source-id: dfbf504eca0beef01b52cec1ac1c4928f98255e9
Pull Request resolved: #103071
@chunyuan-w (Collaborator Author) commented:

@pytorchbot merge

@pytorchmergebot commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot commented:

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x 5e48cbabafa4c6d576c64b41c1735ee57b8b9034 returned non-zero exit code 1

    Auto-merging aten/src/ATen/native/native_functions.yaml
    Auto-merging test/inductor/test_cpu_repro.py
    Auto-merging torch/_inductor/fx_passes/mkldnn_fusion.py
    CONFLICT (content): Merge conflict in torch/_inductor/fx_passes/mkldnn_fusion.py
    Auto-merging torch/_inductor/graph.py
    error: could not apply 5e48cbabafa... inductor: enable weight prepack for LSTM
    hint: After resolving the conflicts, mark them with
    hint: "git add/rm <pathspec>", then run
    hint: "git cherry-pick --continue".
    hint: You can instead skip this commit with "git cherry-pick --skip".
    hint: To abort and get back to the state before "git cherry-pick",
    hint: run "git cherry-pick --abort".

Details for Dev Infra team (raised by workflow job).

chunyuan-w added a commit that referenced this pull request Jul 28, 2023
ghstack-source-id: edeae6eeca0bf8459f2f08d7963b51dedb674382
Pull Request resolved: #103071
@chunyuan-w (Collaborator Author) commented:

@pytorchbot merge

@pytorchmergebot commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot commented:

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / win-vs2019-cpu-py3 / test (default, 2, 3, windows.4xlarge.nonephemeral)

Details for Dev Infra team (raised by workflow job).

@chunyuan-w (Collaborator Author) commented:

@pytorchbot merge

@pytorchmergebot commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


bobby-palmer pushed a commit to bobby-palmer/pytorch that referenced this pull request Jul 29, 2023
Pull Request resolved: pytorch#103071
Approved by: https://github.com/jgong5, https://github.com/jansel
@facebook-github-bot facebook-github-bot deleted the gh/chunyuan-w/51/head branch July 31, 2023 14:16
Labels: ciflow/inductor · ciflow/trunk · Merged · module: cpu · module: inductor · open source · topic: not user facing