def setup_model_and_optimizer(args, ds_config, device, set_optim=True):
    # get the model
    model = get_model(args, device)
    # get the optimizer and lr_scheduler
    if set_optim:
        optimizer = get_optimizer(args, model)
        lr_scheduler = get_learning_rate_scheduler(args, optimizer)
    else:
        optimizer, lr_scheduler = None, None
    model_ori = model
    model, optimizer, _, lr_scheduler = deepspeed.initialize(
        model=model,
        optimizer=optimizer,
        args=args,
        lr_scheduler=lr_scheduler,
        mpu=mpu if args.model_parallel else None,
        config_params=ds_config
    )
When I run DeepSpeed, some parameters of the original model are missing after deepspeed.initialize, e.g. model_ori.transformer.h[0].attn.c_attn.weight.
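This symptom is consistent with ZeRO stage 3, where DeepSpeed partitions every parameter across ranks and the full tensor only exists during an explicit gather. The following is a plain-Python sketch of that idea, not DeepSpeed code; `shard_parameter` and `gather_parameter` are invented names for illustration only.

```python
# Sketch (no DeepSpeed required) of why a weight can look "missing" under
# ZeRO stage 3: each rank keeps only its shard of a parameter, and the full
# tensor is only reassembled by an explicit all-gather.

def shard_parameter(flat_param, world_size):
    """Split a flat parameter into one contiguous shard per rank."""
    per_rank = (len(flat_param) + world_size - 1) // world_size
    return [flat_param[r * per_rank:(r + 1) * per_rank]
            for r in range(world_size)]

def gather_parameter(shards):
    """Reassemble the full parameter from all ranks' shards."""
    full = []
    for shard in shards:
        full.extend(shard)
    return full

weights = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
shards = shard_parameter(weights, world_size=4)

# Outside a gather, a single rank sees only its own slice; with 6 values
# split 4 ways, rank 3 even holds an empty shard -- which is exactly the
# "missing parameter" symptom reported above.
local_view = shards[3]                # empty shard on this rank
restored = gather_parameter(shards)   # full weight again
```

In real DeepSpeed code the gather step corresponds to a context manager such as `deepspeed.zero.GatheredParameters`; outside it, the module attributes point at the partitioned (effectively empty) storage.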
Hi, can you provide more detailed information, like the running command, the scripts, and the environment?
I modified the setting in minillm/configs/deepspeed/ds_config.json:

"zero_optimization": {
    "stage": 3,
    "offload_param": {
        "device": "cpu"
    }
}
We currently do not support ZeRO-3 or CPU offload. We may add this feature in a later version.
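Given that, keeping `zero_optimization` at a stage below 3 avoids parameter partitioning entirely (`offload_param` only applies at stage 3). A minimal sketch of such a config fragment; the stage and other fields in the repo's actual ds_config.json may differ:

```json
{
    "zero_optimization": {
        "stage": 2
    }
}
```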