@underlines: to run the microsoft/bloom-deepspeed-inference-fp16 model you will need at least 8x 80GB A100s for fp16, or 4x 80GB A100s for int8.
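For anyone wondering where those GPU counts come from, a quick back-of-envelope check of the weights alone lines up with them (these are my own estimates, not official numbers; activations and KV cache need extra headroom, which is why 8x/4x rather than the bare minimum):

```python
# Back-of-envelope GPU memory needed just to hold BLOOM-176B weights.
# Activations and KV cache require additional headroom on top of this.
PARAMS = 176e9  # BLOOM has ~176B parameters

def weights_footprint(bytes_per_param, gpu_mem_gb=80):
    """Return (total GB for the weights, minimum number of 80GB GPUs)."""
    total_gb = PARAMS * bytes_per_param / 1e9
    gpus = int(-(-total_gb // gpu_mem_gb))  # ceiling division
    return total_gb, gpus

print(weights_footprint(2))  # fp16: (352.0, 5) -> 8 GPUs leaves headroom
print(weights_footprint(1))  # int8: (176.0, 3) -> 4 GPUs leaves headroom
```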
Unfortunately, Azure isn't granting my account the resources. They force people to manually request GPU quota, including stating the reasons and examples of the application. Off to AWS then, but without the Azure optimizations of DeepSpeed.
Using the default example from Deploying MII-Public on Azure ML:
- Compute instance: Tesla K80, 12GB
- Kernel: Python 3.8 - AzureML
- `pip install deepspeed-mii`
- restart the kernel
Using this as-is fails with:

```
AssertionError: text-generation only supports ['distilgpt2', 'gpt2-large'...
```
Using this modified to `tensor_parallel=1` fails with:

```
RuntimeError: server crashed for some reason, unable to proceed
```
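For reference, a minimal sketch of the deployment call I'm running, adjusted to `tensor_parallel=1` (based on the `mii.deploy` API; the model name and `mii_config` keys here are my best guess at what the example notebook uses, not a verbatim copy):

```python
import mii

# Deployment sketch -- assumes the mii.deploy API from the Azure ML example.
# The exact model name and config keys may differ from the actual notebook.
mii.deploy(
    task="text-generation",
    model="gpt2-large",                  # one of the models the assertion lists as supported
    deployment_name="gpt2-large-deploy",
    mii_config={
        "tensor_parallel": 1,            # single GPU on this compute instance
        "dtype": "fp16",                 # also tried int8, same crash
    },
)
```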
Switching to int8 didn't help either.
Is my compute instance too small?