
How do I load the model for continued pretraining? #24

Closed
fade-color opened this issue Jul 26, 2022 · 4 comments

Comments

@fade-color

When continuing pretraining with pretrain_glm.py and loading the downloaded glm-large-chinese/mp_rank_00_model_states.pt, I get the following error:

WARNING: could not find the metadata file /root/Data/zz/GitHub/GLM/blocklm-large-chinese/latest_checkpointed_iteration.txt 
Try to directly load the checkpoint from the directory
Traceback (most recent call last):
  File "pretrain_glm.py", line 663, in <module>
    main()
  File "pretrain_glm.py", line 580, in main
    args.iteration = load_checkpoint(model, optimizer, lr_scheduler, args)
  File "/root/Data/zz/GitHub/GLM/utils.py", line 337, in load_checkpoint
    checkpoint_name, sd = model.load_checkpoint(load_dir, tag,
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 2513, in load_checkpoint
    load_path, client_states = self._load_checkpoint(load_dir,
  File "/root/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 2671, in _load_checkpoint
    client_state['optimizer'] = optim_checkpoint['optimizer']
KeyError: 'optimizer'

How can the provided model file be loaded correctly?
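For context, a minimal reproduction of the failure mode above, using a plain dict to stand in for the released checkpoint (the real file would be read with torch.load). The key names here are assumptions inferred from the traceback: the published .pt file appears to contain only model weights and no optimizer state.

```python
# Stand-in for the released mp_rank_00_model_states.pt contents.
# Assumption: it holds model weights (e.g. under a "module" key) but
# carries no "optimizer" entry.
released_checkpoint = {
    "module": {"word_embeddings.weight": "..."},  # model weights only
    # no "optimizer" key
}

def deepspeed_style_load(sd):
    # Mirrors the failing line in deepspeed/runtime/engine.py,
    # _load_checkpoint: client_state['optimizer'] = optim_checkpoint['optimizer']
    return sd["optimizer"]

try:
    deepspeed_style_load(released_checkpoint)
except KeyError as exc:
    print(f"KeyError: {exc}")  # same error as in the traceback above
```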

@fade-color
Author

I tried setting --no-load-optim, but it had no effect.

@duzx16
Member

duzx16 commented Jul 28, 2022

DeepSpeed's load_checkpoint function will always load the optimizer state even if load_optimizer_states=False.
You can pull the latest commit and set --no-deepspeed-load.
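In effect, bypassing DeepSpeed's load_checkpoint means reading the .pt file directly and restoring only the model weights. A hedged sketch of that fallback, where the "module" key and the overall dict layout are assumptions, not the repo's confirmed format:

```python
def extract_model_state(sd):
    """Return only the model weights from a raw checkpoint dict,
    whether they are wrapped under a 'module' key or stored flat.
    The 'module' key name is an assumption."""
    return sd.get("module", sd)

# With PyTorch available, the bypass would look roughly like:
#   sd = torch.load(checkpoint_path, map_location="cpu")
#   model.load_state_dict(extract_model_state(sd), strict=False)
# No optimizer state is touched, so the KeyError cannot occur.
```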

@fade-color
Author

Thanks, that works now.

@zhangzai666


Could you share the format of your pretraining data for reference?
