[ProphetNet] Bart-like Refactor #10501

Merged

@patrickvonplaten patrickvonplaten commented Mar 3, 2021

What does this PR do?

This PR refactors ProphetNet along the lines of Bart: the batch dimension now always comes first and the time dimension always second. The cache is also refactored to consist of tuples instead of a dict.
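
To make the two changes concrete, here is a minimal pure-Python sketch using nested lists in place of tensors. It is not the code from this PR: the helper names `to_batch_first` and `dict_cache_to_tuples` and the dict keys `prev_key` / `prev_value` are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def to_batch_first(time_first: List[List[Vector]]) -> List[List[Vector]]:
    """Turn a [seq_len][batch][hidden] nested list into [batch][seq_len][hidden]."""
    # zip(*...) swaps the two outer axes; the hidden vectors are left untouched.
    return [list(per_example) for per_example in zip(*time_first)]


def dict_cache_to_tuples(
    dict_cache: List[Dict[str, Vector]],
) -> Tuple[Tuple[Vector, Vector], ...]:
    """Convert a per-layer dict cache into a tuple of (key, value) tuples."""
    # "prev_key" / "prev_value" are hypothetical key names, not taken from the PR.
    return tuple((layer["prev_key"], layer["prev_value"]) for layer in dict_cache)


# Toy hidden states with seq_len=3, batch=2, hidden=1.
time_first = [[[1.0], [2.0]], [[3.0], [4.0]], [[5.0], [6.0]]]
batch_first = to_batch_first(time_first)

# Toy two-layer cache in the old dict style.
old_cache = [
    {"prev_key": [0.1], "prev_value": [0.2]},
    {"prev_key": [0.3], "prev_value": [0.4]},
]
new_cache = dict_cache_to_tuples(old_cache)
```

After the conversion, `batch_first[b][t]` holds what `time_first[t][b]` held, and `new_cache[i]` is the (key, value) pair for layer `i`, addressed by position rather than by string key.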

The model is thereby closely aligned with Bart (I cannot really add any "# Copied from" statements, though, because the weight names are different).

The PR is in spirit very similar to #8900.

I've verified that all slow tests pass. As a next step, I want to make a short notebook verifying that ProphetNet can be trained, since there have been some issues with training: #9804

Benchmarking

The PR doesn't change compute or memory complexity:

On this PR:

====================       INFERENCE - SPEED - RESULT       ====================
--------------------------------------------------------------------------------
          Model Name             Batch Size     Seq Length     Time in s   
--------------------------------------------------------------------------------
microsoft/prophetnet-large-unc       8               8             0.029     
microsoft/prophetnet-large-unc       8               32            0.044     
microsoft/prophetnet-large-unc       8              128            0.175     
microsoft/prophetnet-large-unc       8              512             N/A      
--------------------------------------------------------------------------------

====================      INFERENCE - MEMORY - RESULT       ====================
--------------------------------------------------------------------------------
          Model Name             Batch Size     Seq Length    Memory in MB 
--------------------------------------------------------------------------------
microsoft/prophetnet-large-unc       8               8              2562     
microsoft/prophetnet-large-unc       8               32             2756     
microsoft/prophetnet-large-unc       8              128             3628     
microsoft/prophetnet-large-unc       8              512             N/A      
--------------------------------------------------------------------------------

On master:

====================       INFERENCE - SPEED - RESULT       ====================
--------------------------------------------------------------------------------
          Model Name             Batch Size     Seq Length     Time in s   
--------------------------------------------------------------------------------
microsoft/prophetnet-large-unc       8               8             0.027     
microsoft/prophetnet-large-unc       8               32            0.044     
microsoft/prophetnet-large-unc       8              128            0.172     
microsoft/prophetnet-large-unc       8              512             N/A      
--------------------------------------------------------------------------------

====================      INFERENCE - MEMORY - RESULT       ====================
--------------------------------------------------------------------------------
          Model Name             Batch Size     Seq Length    Memory in MB 
--------------------------------------------------------------------------------
microsoft/prophetnet-large-unc       8               8              2562     
microsoft/prophetnet-large-unc       8               32             2768     
microsoft/prophetnet-large-unc       8              128             3740     
microsoft/prophetnet-large-unc       8              512             N/A      
--------------------------------------------------------------------------------
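
The comparison between the two runs can be made explicit with a few lines of arithmetic. The sketch below copies the numbers from the tables above (batch size 8; the sequence length 512 rows are skipped because both runs report N/A) and computes the relative change of this PR against master. The variable and function names are illustrative only.

```python
# Sequence length -> value, copied from the benchmark tables (batch size 8).
time_pr = {8: 0.029, 32: 0.044, 128: 0.175}      # seconds, this PR
time_master = {8: 0.027, 32: 0.044, 128: 0.172}  # seconds, master
mem_pr = {8: 2562, 32: 2756, 128: 3628}          # MB, this PR
mem_master = {8: 2562, 32: 2768, 128: 3740}      # MB, master


def relative_change(new: dict, old: dict) -> dict:
    """Return (new - old) / old per sequence length; negative means the PR is lower."""
    return {seq_len: (new[seq_len] - old[seq_len]) / old[seq_len] for seq_len in new}


time_change = relative_change(time_pr, time_master)
mem_change = relative_change(mem_pr, mem_master)

for seq_len in sorted(time_change):
    print(
        f"seq_len={seq_len:>3}: "
        f"time {time_change[seq_len]:+.1%}, memory {mem_change[seq_len]:+.1%}"
    )
```

On these numbers the time differences stay within a few milliseconds at every sequence length, and memory on this PR is equal to or slightly lower than on master (about 3% lower at sequence length 128).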

@patrickvonplaten patrickvonplaten changed the title [ProphetNet] Bart-like Refactor [WIP][ProphetNet] Bart-like Refactor Mar 3, 2021
@patrickvonplaten patrickvonplaten changed the title [WIP][ProphetNet] Bart-like Refactor [ProphetNet] Bart-like Refactor Mar 4, 2021

@sgugger sgugger left a comment

Thanks for refactoring this, LGTM!

@LysandreJik LysandreJik left a comment

Very cool! Thanks for running all the slow tests and verifying time/memory complexity.

LGTM!
