
Memory issue when loading OPT #117

Closed
larry-fuy opened this issue Dec 9, 2022 · 3 comments

@larry-fuy

Recently I have been trying to run OPT models on MII but came across some memory issues. The OPT model I used is facebook/opt-13b. The MII config and deployment parameters look like this:

mii_configs = {
    "dtype": "fp32",        # run the model in full precision
    "tensor_parallel": 4,   # shard the model across 4 GPUs
}

name = "facebook/opt-13b"

mii.deploy(task='text-generation',
           model=name,
           deployment_name=name + "_deployment",
           model_path='/root/ckpt/opt_13b/mii',
           mii_config=mii_configs)

The checkpoint is already downloaded into the model_path. Since the checkpoint size of opt-13b is around 26 GB, I supposed it should work on a machine with 4 x V100 GPUs and 224 GB of system memory. But during the loading phase (even before the server started), MII reported that the server had crashed and exited quietly. I then checked the memory usage and was surprised to find that MII had used up all 224 GB of memory. So my question is: why does MII consume several times more memory than the checkpoint size? Is there any configuration to change this behavior?

@aponte411

@larry-fuy out of curiosity, have you tried adding load_with_sys_mem: True to the config? It may help, as it loads the model into system memory and then lets deepspeed.init_inference take care of moving the model to GPU memory.
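For example, a minimal tweak to the config from the original post (just a sketch, not verified on your setup) might look like:

mii_configs = {
    "dtype": "fp32",
    "tensor_parallel": 4,
    "load_with_sys_mem": True,  # load the checkpoint into system memory first
}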

@mrwyattii
Contributor

@aponte411 this is part of the solution. @larry-fuy you also have "dtype": "fp32", but the facebook/opt-13b checkpoint is stored in fp16, so you are doubling the size of the model by not running in half precision.
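Putting both suggestions together, the config would look something like this (a sketch based on the comments in this thread, assuming the same mii.deploy call as in the original post):

mii_configs = {
    "dtype": "fp16",            # match the precision of the facebook/opt-13b checkpoint
    "tensor_parallel": 4,
    "load_with_sys_mem": True,  # stage the weights in system memory before moving them to GPU
}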

@larry-fuy
Author

@mrwyattii Yes, facebook/opt-13b is stored in fp16, so I fixed that. @aponte411 I tried load_with_sys_mem but the issue was still there. I eventually fixed it by upgrading transformers to the latest version, so I guess it was an issue with transformers rather than MII.
