question about number of parameters #5

Closed
huchinlp opened this issue Oct 13, 2020 · 2 comments

huchinlp commented Oct 13, 2020

Hi, I trained some models using the pre-defined configurations, but the number of parameters is much larger than what you reported (55.1M vs. 31.5M).

Configuration:
HAT_iwslt14deen_titanxp@168.8ms_bleu@34.8.yml

Here is the code I used to calculate the number of parameters (embedding layers are excluded):

import torch

# Load the shared checkpoint and sum the sizes of all non-embedding tensors.
ckpt = 'checkpoints/iwslt14.de-en/subtransformer/HAT_iwslt14deen_titanxp@168.8ms_bleu@34.8/checkpoint_best.pt'
m = torch.load(ckpt, map_location='cpu')
m = m['model']

n = 0
for k in m:
    if 'emb' not in k:  # exclude embedding layers
        n += m[k].numel()

print(n)
Hanrui-Wang (Collaborator) commented

Hi Huchi,

Sorry for my late reply; I have been very busy over the past several weeks.
You need to extract the SubTransformer weights from the checkpoints we shared in order to get the correct model size. The reason is that we fine-tuned a SubTransformer by always sampling that same SubTransformer from the SuperTransformer. So the checkpoint contains all the weights of the SuperTransformer, but only the SubTransformer part should be used for testing and profiling.

Please refer to train.py, lines 61 to 64, for how to get the SubTransformer model size:

# Log model size
if args.train_subtransformer:
    print(f"| SubTransformer size (without embedding weights): {model.get_sampled_params_numel(utils.get_subtransformer_config(args))}")
    embed_size = args.decoder_embed_dim_subtransformer * len(task.tgt_dict)
    print(f"| Embedding layer size: {embed_size} \n")

Thanks!
Hanrui

huchinlp (Author) commented Nov 10, 2020 via email
