minillm: After DeepSpeed initialization, some model parameters are missing, such as model_ori.transformer.h[0].attn.c_attn.weight #56

Closed
liuxy1103 opened this issue Aug 2, 2023 · 3 comments

@liuxy1103

import deepspeed

def setup_model_and_optimizer(args, ds_config, device, set_optim=True):
    # get the model
    model = get_model(args, device)
    # get the optimizer and lr_scheduler
    if set_optim:
        optimizer = get_optimizer(args, model)
        lr_scheduler = get_learning_rate_scheduler(args, optimizer)
    else:
        optimizer, lr_scheduler = None, None
    # keep a reference to the unwrapped model so its parameters can be inspected
    model_ori = model
    model, optimizer, _, lr_scheduler = deepspeed.initialize(
        model=model,
        optimizer=optimizer,
        args=args,
        lr_scheduler=lr_scheduler,
        mpu=mpu if args.model_parallel else None,
        config_params=ds_config
    )

After deepspeed.initialize runs, some of the model's parameters are missing, for example model_ori.transformer.h[0].attn.c_attn.weight.
@t1101675
Contributor

t1101675 commented Aug 2, 2023

Hi, can you provide more detailed information, like the running command, the scripts, and the environment?

@liuxy1103
Author

I modified the settings in minillm/configs/deepspeed/ds_config.json:
"zero_optimization": {
    "stage": 3,
    "offload_param": {
        "device": "cpu"
    }
},

@t1101675
Contributor

t1101675 commented Aug 2, 2023

We currently do not support ZeRO stage 3 or CPU offload. We may add this feature in a later version.
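For anyone hitting the same problem, here is a minimal sketch of a pre-flight check that flags the settings mentioned above before they reach `deepspeed.initialize`. It is not part of the minillm code base: the function name `find_unsupported_zero_settings` is hypothetical, and treating `offload_optimizer` the same way as `offload_param` is an assumption, since the thread only shows `offload_param`.

```python
def find_unsupported_zero_settings(ds_config):
    """Return a list of messages describing settings this thread reports as unsupported.

    Hypothetical helper for illustration only; it inspects a plain dict
    loaded from ds_config.json and does not import or call DeepSpeed.
    """
    problems = []
    zero = ds_config.get("zero_optimization", {})

    # The maintainers state that ZeRO stage 3 is not supported.
    if zero.get("stage", 0) == 3:
        problems.append("zero_optimization.stage is 3 (ZeRO stage 3)")

    # CPU offload is also reported as unsupported. Checking
    # offload_optimizer as well is an assumption, not something the thread says.
    for key in ("offload_param", "offload_optimizer"):
        device = zero.get(key, {}).get("device", "none")
        if device == "cpu":
            problems.append(f"zero_optimization.{key}.device is 'cpu' (CPU offload)")

    return problems


# Example: the configuration fragment posted in this issue.
config_from_issue = {
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu"},
    }
}

for problem in find_unsupported_zero_settings(config_from_issue):
    print("Unsupported setting:", problem)
```

Running a check like this on the dict loaded from ds_config.json, before building the model, would surface the configuration issue early instead of as missing parameters after initialization.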

@donglixp donglixp self-assigned this Aug 9, 2023
@donglixp donglixp closed this as completed Aug 9, 2023