[Issue]: Memory_efficient error #118

Closed · AbyszOne opened this issue Apr 13, 2023 · 7 comments
Labels: question (Further information is requested)

@AbyszOne

Issue Description

NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(1, 4096, 1, 512) (torch.float16)
key : shape=(1, 4096, 1, 512) (torch.float16)
value : shape=(1, 4096, 1, 512) (torch.float16)
attn_bias : <class 'NoneType'>
p : 0.0
cutlassF is not supported because:
xFormers wasn't build with CUDA support
flshattF is not supported because:
xFormers wasn't build with CUDA support
max(query.shape[-1] != value.shape[-1]) > 128
tritonflashattF is not supported because:
xFormers wasn't build with CUDA support
max(query.shape[-1] != value.shape[-1]) > 128
triton is not available
requires A100 GPU
smallkF is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
max(query.shape[-1] != value.shape[-1]) > 32
unsupported embed per head: 512

It runs but doesn't show any image, just that error.
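
Every backend in the error is rejected for the same root cause: the installed xformers wheel was built without CUDA kernels. For reference, a minimal repro sketch under that assumption (my own illustration, not code from this repo; shapes and dtype mirror the error above):

```python
# Hedged repro sketch: assumes a CUDA-enabled torch plus a CPU-only xformers wheel.
import torch
from xformers.ops import memory_efficient_attention

# Same shapes/dtype as the error: (batch, seq_len, n_heads, head_dim)
q = torch.randn(1, 4096, 1, 512, dtype=torch.float16, device="cuda")

# With an xformers build lacking CUDA kernels, every candidate op
# (cutlassF, flshattF, tritonflashattF, smallkF) is rejected and this
# raises the NotImplementedError shown above.
out = memory_efficient_attention(q, q, q)  # attn_bias=None, p=0.0 by default
```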

Platform Description

Win 10. Python 3.10.9. Torch 2 (installed from the wiki guide). No warnings at launch.py start.

@vladmandic (Owner)

vladmandic commented Apr 13, 2023

What does startup show on the console for torch? How did you install xformers? And during which operation does this happen?

@vladmandic added the question (Further information is requested) label Apr 14, 2023
@vladmandic self-assigned this Apr 14, 2023
@Saelfen

Saelfen commented Apr 15, 2023

I had this error myself on first installation, but only because I had copied the venv over from the original A1111 fork. In my case, it was fixed by wiping the venv and installing the requirements again.
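
A quick way to confirm the rebuilt venv picked up a CUDA-enabled stack (a hedged sanity check, not an official diagnostic) is something like:

```python
# Hedged sanity check after rebuilding the venv: verify torch was built with
# CUDA and that an xformers wheel is actually installed in this environment.
import torch
import xformers

print(torch.__version__, torch.version.cuda)   # e.g. 2.0.0+cu118 / 11.8
print("CUDA available:", torch.cuda.is_available())
print("xformers:", xformers.__version__)
```

xformers also ships its own report via `python -m xformers.info`, which lists which attention ops (cutlassF, flshattF, etc.) are usable in the current environment.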

@rockiecxh

Does this mean the Mac M1 is not supported yet? I have a similar issue here.

memory_efficient_attention_forward

gradio call: NotImplementedError
╭────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────╮
│ /Users/rockie/Documents/git/automatic/modules/call_queue.py:59 in f │
│ │
│ 58 │ │ │ │ pr.enable() │
│ ❱ 59 │ │ │ res = list(func(*args, **kwargs)) │
│ 60 │ │ │ if shared.cmd_opts.profile: │
│ │
│ /Users/rockie/Documents/git/automatic/modules/call_queue.py:38 in f │
│ │
│ 37 │ │ │ try: │
│ ❱ 38 │ │ │ │ res = func(*args, **kwargs) │
│ 39 │ │ │ finally: │
│ │
│ ... 17 frames hidden ... │
│ │
│ /Users/rockie/Documents/git/automatic/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py:95 in _dispatch_fw │
│ │
│ 94 │ │ priority_list_ops.insert(0, triton.FwOp) │
│ ❱ 95 │ return _run_priority_list( │
│ 96 │ │ "memory_efficient_attention_forward", priority_list_ops, inp │
│ │
│ /Users/rockie/Documents/git/automatic/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py:70 in _run_priority_list │
│ │
│ 69 │ │ msg += "\n" + _format_not_supported_reasons(op, not_supported) │
│ ❱ 70 │ raise NotImplementedError(msg) │
│ 71 │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(1, 7680, 1, 512) (torch.float32)
key : shape=(1, 7680, 1, 512) (torch.float32)
value : shape=(1, 7680, 1, 512) (torch.float32)
attn_bias : <class 'NoneType'>
p : 0.0
cutlassF is not supported because:
device=mps (supported: {'cuda'})
flshattF is not supported because:
device=mps (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
max(query.shape[-1] != value.shape[-1]) > 128
tritonflashattF is not supported because:
device=mps (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
max(query.shape[-1] != value.shape[-1]) > 128
triton is not available
smallkF is not supported because:
device=mps (supported: {'cuda', 'cpu'})
max(query.shape[-1] != value.shape[-1]) > 32
unsupported embed per head: 512
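
Every rejection above reduces to device=mps: all of the xformers kernels in the dispatch priority list require CUDA. As a hedged workaround sketch (assuming PyTorch 2.x; a possible fallback, not the project's actual code path), PyTorch's built-in scaled-dot-product attention runs on Apple Silicon without xformers:

```python
# Hedged fallback sketch (PyTorch 2.x assumed): torch's native attention
# dispatches to a math implementation on the "mps" device, so it does not
# need any of the CUDA-only xformers kernels rejected above.
import torch
import torch.nn.functional as F

device = "mps" if torch.backends.mps.is_available() else "cpu"
# (batch, n_heads, seq_len, head_dim) -- sizes taken from the traceback
q = torch.randn(1, 1, 7680, 512, device=device)
out = F.scaled_dot_product_attention(q, q, q)
print(out.shape)  # torch.Size([1, 1, 7680, 512])
```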

@vladmandic (Owner)

The original issue is with CUDA; don't hijack the thread with M1. Search issues/discussions first and open a new issue if needed, as it's not related to this one.

@rockiecxh

I see, there is already an issue mentioning that M1 is not supported. What is the estimated timeframe for M1 support? Thanks for your effort!

@vladmandic (Owner)

I'm fully willing to support the effort, but I don't have an M1 system available. I'd love to have a contributor suggest what's needed.

The same applies to AMD optimizations.

@vladmandic (Owner)

There is no update on the original issue, so I'm closing it for now; it can be reopened once an update is provided.
