Don't persist iteration number in model checkpoints #37

tcbegley · 2023-05-12T14:56:50Z

Currently, reward model training starts at the iteration number stored in the transformer checkpoint, which is confusing. The reward model training loop should start counting iterations from 0, regardless of how many iterations the transformer was trained for, assuming the reward model is being trained from scratch. If we are instead loading a reward model from a checkpoint, we can resume counting from that checkpoint's iteration number.
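A minimal sketch of the behaviour described above, not the repository's actual code. It assumes a checkpoint is a plain dict with hypothetical `"model_kind"` and `"iteration"` keys; the function name `initial_iteration` is also made up for illustration.

```python
from typing import Any, Mapping, Optional


def initial_iteration(
    checkpoint: Optional[Mapping[str, Any]],
    expected_kind: str,
) -> int:
    """Return the iteration a training loop should start counting from.

    Assumption: `checkpoint` is a dict-like object with hypothetical
    "model_kind" and "iteration" keys. Neither key name comes from the
    actual codebase.
    """
    if checkpoint is None:
        # Training from scratch with no checkpoint at all.
        return 0
    if checkpoint.get("model_kind") != expected_kind:
        # The weights come from a different model (e.g. the transformer
        # used to initialise the reward model), so its counter is ignored.
        return 0
    # Resuming the same kind of model: continue from its saved counter.
    return int(checkpoint.get("iteration", 0))


transformer_ckpt = {"model_kind": "transformer", "iteration": 5000}
reward_ckpt = {"model_kind": "reward", "iteration": 120}

start_from_transformer = initial_iteration(transformer_ckpt, "reward")
start_from_reward = initial_iteration(reward_ckpt, "reward")
```

With this check, initialising the reward model from the transformer checkpoint starts at iteration 0, while resuming from a reward model checkpoint continues from its saved iteration. An alternative would be to drop the iteration number from the checkpoint entirely, as the title suggests, and track it separately.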

tcbegley added the bug Something isn't working label May 12, 2023

tcbegley assigned vmoens and unassigned vmoens May 12, 2023
