
[BUG] deep inference on T4 get Triton Error [CUDA]: invalid argument #2942

Open · stevensu1977 opened this issue Mar 4, 2023 · 2 comments
Labels: bug (Something isn't working), compression

@stevensu1977 commented:

Am I missing something?

env:
accelerate==0.16.0
deepspeed==0.7.4+fffca7df
diffusers==0.12.0
tokenizers==0.12.1
torch==1.11.0+cu113
torchvision==0.12.0+cu113
tqdm==4.65.0
transformers==4.26.1

error message:

ValueError: model must be a torch.nn.Module, got <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'>

import torch
import deepspeed
from diffusers import StableDiffusionPipeline

# Load the pipeline in fp16 and hand it to DeepSpeed inference
model = StableDiffusionPipeline.from_pretrained(
    HF_MODEL_ID,
    torch_dtype=torch.float16,
    revision="fp16")

model = deepspeed.init_inference(model=model.to("cuda"), dtype=torch.float16)
model("a photo of an astronaut riding a horse on moon")
stevensu1977 added the bug and compression labels on Mar 4, 2023
@stevensu1977 (Author) commented:

I changed to a different DeepSpeed version and now get "Triton Error [CUDA]: invalid argument". I want to confirm: is flash attention not supported on the T4?

deepspeed==0.8.0
diffusers==0.10.0
triton==2.0.0.dev20221202

File "/opt/conda/envs/deepspeed01/lib/python3.9/site-packages/deepspeed/ops/transformer/inference/diffusers_attention.py", line 81, in selfAttention_fp
context_layer = triton_flash_attn_kernel(qkv_out[0],
File "/opt/conda/envs/deepspeed01/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/deepspeed01/lib/python3.9/site-packages/deepspeed/ops/transformer/inference/triton_ops.py", line 119, in forward
_fwd_kernel[grid](
File "/opt/conda/envs/deepspeed01/lib/python3.9/site-packages/triton/runtime/jit.py", line 106, in launcher
return self.run(*args, grid=grid, **kwargs)
File "", line 43, in _fwd_kernel
RuntimeError: Triton Error [CUDA]: invalid argument
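
For context, one possible explanation (my assumption, not confirmed in this thread): the T4 is a Turing GPU with CUDA compute capability 7.5, and Triton flash-attention kernels of this era generally assumed newer architectures. A quick capability check looks like this:

import torch

# Print the CUDA compute capability of the current device.
# A T4 reports (7, 5); flash-attention kernels typically target
# (8, 0)+ (Ampere) -- the exact minimum here is an assumption.
major, minor = torch.cuda.get_device_capability()
print(f"compute capability: {major}.{minor}")

# Hypothetical guard for illustration: only take the Triton
# flash-attention path on sufficiently new hardware.
use_flash_attention = (major, minor) >= (8, 0)
print("flash attention path enabled:", use_flash_attention)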

stevensu1977 changed the title from "[BUG] deep inference get model must be a torch.nn.Module" to "[BUG] deep inference get Triton Error [CUDA]: invalid argument" on Mar 4, 2023
stevensu1977 changed the title from "[BUG] deep inference get Triton Error [CUDA]: invalid argument" to "[BUG] deep inference on T4 get Triton Error [CUDA]: invalid argument" on Mar 4, 2023
@stevensu1977 (Author) commented:

I tried DeepSpeed with the original CompVis txt2img.py, and DeepSpeed inference works there, so I think it may be a diffusers pipeline issue. If there is anything I missed, please tell me. Thanks!
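
If the failure really does come from the injected Triton kernels, one thing worth trying (a sketch under that assumption, not something verified in this thread) is initializing DeepSpeed without kernel injection, so the stock diffusers attention is used:

import torch
import deepspeed
from diffusers import StableDiffusionPipeline

HF_MODEL_ID = "runwayml/stable-diffusion-v1-5"  # hypothetical placeholder

pipe = StableDiffusionPipeline.from_pretrained(
    HF_MODEL_ID,
    torch_dtype=torch.float16,
    revision="fp16",
).to("cuda")

# replace_with_kernel_inject=False skips DeepSpeed's fused/Triton kernels,
# which may avoid the failing flash-attention path on a T4 (assumption).
pipe.unet = deepspeed.init_inference(
    pipe.unet,
    dtype=torch.float16,
    replace_with_kernel_inject=False,
)

image = pipe("a photo of an astronaut riding a horse on moon").images[0]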
