In order to do model averaging, we need to keep several checkpoints.
2 approaches:
Time-based: this is the TF approach, which saves a ckpt file every 10 minutes and keeps up to N files (20 by default).
Step-based: same as in Lua-onmt with the save_every option, which saves a checkpoint every X iterations.
Step-based is more appropriate, since the system's speed can vary a lot.
But it's also good to have a "keep last N" flag to minimize disk usage.
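To make the step-based idea concrete, here is a rough sketch of "save every X steps, keep the last N". Everything in it is hypothetical (the `save_checkpoint` and `train` names, the `ckpt_step_<step>.json` file naming, the JSON format and the toy weight update); it is not the Lua-onmt or TF implementation, just one way the flag could behave.

```python
import json
from pathlib import Path


def save_checkpoint(state, step, ckpt_dir, keep_last_n=5):
    """Write one checkpoint for `step`, then prune all but the newest N."""
    if keep_last_n < 1:
        raise ValueError("keep_last_n must be at least 1")
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    path = ckpt_dir / f"ckpt_step_{step}.json"
    path.write_text(json.dumps(state))

    # Order by the step number in the file name, not by string order,
    # so that step 10 sorts after step 8.
    ckpts = sorted(
        ckpt_dir.glob("ckpt_step_*.json"),
        key=lambda p: int(p.stem.rsplit("_", 1)[1]),
    )
    for old in ckpts[:-keep_last_n]:
        old.unlink()
    return path


def train(num_steps, save_every, ckpt_dir, keep_last_n=5):
    """Toy loop: the 'model' is a short weight list nudged at every step."""
    state = {"weights": [0.0, 0.0]}
    for step in range(1, num_steps + 1):
        # Stand-in for a real parameter update.
        state["weights"] = [w + 0.1 for w in state["weights"]]
        if step % save_every == 0:
            save_checkpoint(state, step, ckpt_dir, keep_last_n)
```

With `train(10, 2, some_dir, keep_last_n=3)`, checkpoints are written at steps 2, 4, 6, 8 and 10, and only the files for steps 6, 8 and 10 remain on disk.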
An external tool can then average the weights and output the averaged model.
This should give at least a 1 BLEU improvement.
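For reference, the averaging step itself is small. Below is a minimal sketch, assuming each checkpoint has already been loaded into a dict mapping parameter names to flat lists of floats. The `average_checkpoints` name and that layout are made up for illustration; a real tool would work on tensors loaded from the saved files.

```python
def average_checkpoints(checkpoints):
    """Return the element-wise mean of several {name: [floats]} checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint to average")

    names = checkpoints[0].keys()
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != names:
            raise ValueError("checkpoints have different parameter names")

    n = len(checkpoints)
    averaged = {}
    for name in names:
        values = [ckpt[name] for ckpt in checkpoints]
        # All checkpoints must agree on the size of each parameter.
        if len({len(v) for v in values}) != 1:
            raise ValueError(f"parameter {name!r} has inconsistent sizes")
        # zip(*values) groups the i-th entry of every checkpoint together.
        averaged[name] = [sum(column) / n for column in zip(*values)]
    return averaged
```

Averaging `{"w": [1.0, 2.0], "b": [0.0]}` and `{"w": [3.0, 4.0], "b": [1.0]}` gives `{"w": [2.0, 3.0], "b": [0.5]}`. All checkpoints are weighted equally here; whether that is the best choice is a separate question.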