
RuntimeError: server crashed for some reason, unable to proceed #123

Closed
underlines opened this issue Dec 22, 2022 · 4 comments

@underlines
Using the default example for deploying MII-Public on Azure ML:
Compute instance: TeslaK80 12GB
Kernel: Python 3.8 - AzureML

pip install deepspeed-mii

restart kernel

Using this fails:

import mii

mii_configs = {"tensor_parallel": 1, "dtype": "fp16"}
mii.deploy(task='text-generation',
           model="bigscience/bloom-560m",
           deployment_name="bloom560m_deployment",
           mii_config=mii_configs)

AssertionError: text-generation only supports ['distilgpt2', 'gpt2-large'...

Using this, modified to tensor_parallel=1, fails:

import mii

mii_configs = {
    "dtype": "fp16",
    "tensor_parallel": 1,
    "port_number": 50950,
}
name = "microsoft/bloom-deepspeed-inference-fp16"

mii.deploy(task='text-generation',
           model=name,
           deployment_name=name + "_deployment",
           model_path="/data/bloom-mp",
           mii_config=mii_configs)

RuntimeError: server crashed for some reason, unable to proceed

Also switching to int8 didn't help.

Is my compute instance too small?

@DanDelluomo

I am getting this error too.

@mrwyattii
Contributor

Sorry for the late response on this. Please see the solution in #135

@mrwyattii
Contributor

mrwyattii commented Jan 19, 2023

@underlines to run the microsoft/bloom-deepspeed-inference-fp16 model you will need at least 8x80GB A100s for fp16 or 4x80GB A100s for int8.
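For context, a rough sketch of the memory arithmetic behind these numbers (assuming the ~176B-parameter BLOOM checkpoint; the parameter count and the per-GPU overhead for activations and the KV cache are approximations, not exact MII figures):

```python
# Back-of-envelope estimate of weight memory for a ~176B-parameter model.
# fp16 stores 2 bytes per parameter; int8 stores 1 byte per parameter.
params = 176e9  # approximate BLOOM parameter count

fp16_gb = params * 2 / 1e9  # ~352 GB of weights in fp16
int8_gb = params * 1 / 1e9  # ~176 GB of weights in int8

# Sharding across GPUs with tensor parallelism:
print(f"fp16: {fp16_gb:.0f} GB total -> {fp16_gb / 8:.0f} GB/GPU on 8 GPUs")
print(f"int8: {int8_gb:.0f} GB total -> {int8_gb / 4:.0f} GB/GPU on 4 GPUs")
```

Either way the weights alone land around 44 GB per 80 GB A100, leaving headroom for activations and the KV cache; a single 12 GB K80 cannot host this model in any precision.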

@underlines
Author

> @underlines to run the microsoft/bloom-deepspeed-inference-fp16 model you will need at least 8x80GB A100s for fp16 or 4x80GB A100s for int8.

Unfortunately, Azure isn't giving my account the resources. They force people to manually request GPU quota, including stating the reasons for and examples of the application. Off to AWS then, but without the Azure optimizations of DeepSpeed.
