[Issue]: Memory_efficient error #118

Closed · AbyszOne opened this issue Apr 13, 2023 · 7 comments
Labels: question (Further information is requested)

@AbyszOne

Issue Description

NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(1, 4096, 1, 512) (torch.float16)
key : shape=(1, 4096, 1, 512) (torch.float16)
value : shape=(1, 4096, 1, 512) (torch.float16)
attn_bias : <class 'NoneType'>
p : 0.0
cutlassF is not supported because:
xFormers wasn't build with CUDA support
flshattF is not supported because:
xFormers wasn't build with CUDA support
max(query.shape[-1] != value.shape[-1]) > 128
tritonflashattF is not supported because:
xFormers wasn't build with CUDA support
max(query.shape[-1] != value.shape[-1]) > 128
triton is not available
requires A100 GPU
smallkF is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
max(query.shape[-1] != value.shape[-1]) > 32
unsupported embed per head: 512

It runs but doesn't show any image, just that error.
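
Every backend in the error is rejected for the same root cause: the installed xformers wheel was built without CUDA kernels. For reference, a minimal repro sketch under that assumption (my own illustration, not code from this repo; shapes and dtype mirror the error above):

```python
# Hedged repro sketch: assumes a CUDA-enabled torch plus a CPU-only xformers wheel.
import torch
from xformers.ops import memory_efficient_attention

# Same shapes/dtype as the error: (batch, seq_len, n_heads, head_dim)
q = torch.randn(1, 4096, 1, 512, dtype=torch.float16, device="cuda")

# With an xformers build lacking CUDA kernels, every candidate op
# (cutlassF, flshattF, tritonflashattF, smallkF) is rejected and this
# raises the NotImplementedError shown above.
out = memory_efficient_attention(q, q, q)  # attn_bias=None, p=0.0 by default
```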

Platform Description

Win 10. Python 3.10.9. Torch 2 (installed from the wiki guide). No warnings at launch.py start.

@vladmandic (Owner)

vladmandic commented Apr 13, 2023

What does startup show on the console for torch? How did you install xformers? And during which operation does this happen?

@vladmandic added the question (Further information is requested) label Apr 14, 2023
@vladmandic self-assigned this Apr 14, 2023
@Saelfen

Saelfen commented Apr 15, 2023

I had this error myself on first installation, but only because I had copied the venv over from the original A1111 fork. In my case, it was fixed by wiping the venv and installing the requirements again.
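
A quick way to confirm the rebuilt venv picked up a CUDA-enabled stack (a hedged sanity check, not an official diagnostic) is something like:

```python
# Hedged sanity check after rebuilding the venv: verify torch was built with
# CUDA and that an xformers wheel is actually installed in this environment.
import torch
import xformers

print(torch.__version__, torch.version.cuda)   # e.g. 2.0.0+cu118 / 11.8
print("CUDA available:", torch.cuda.is_available())
print("xformers:", xformers.__version__)
```

xformers also ships its own report via `python -m xformers.info`, which lists which attention ops (cutlassF, flshattF, etc.) are usable in the current environment.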

@rockiecxh

Does this mean the Mac M1 is not supported yet? I have a similar issue here.

memory_efficient_attention_forward

gradio call: NotImplementedError
╭────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────╮
│ /Users/rockie/Documents/git/automatic/modules/call_queue.py:59 in f │
│ │
│ 58 │ │ │ │ pr.enable() │
│ ❱ 59 │ │ │ res = list(func(*args, **kwargs)) │
│ 60 │ │ │ if shared.cmd_opts.profile: │
│ │
│ /Users/rockie/Documents/git/automatic/modules/call_queue.py:38 in f │
│ │
│ 37 │ │ │ try: │
│ ❱ 38 │ │ │ │ res = func(*args, **kwargs) │
│ 39 │ │ │ finally: │
│ │
│ ... 17 frames hidden ... │
│ │
│ /Users/rockie/Documents/git/automatic/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py:95 in _dispatch_fw │
│ │
│ 94 │ │ priority_list_ops.insert(0, triton.FwOp) │
│ ❱ 95 │ return _run_priority_list( │
│ 96 │ │ "memory_efficient_attention_forward", priority_list_ops, inp │
│ │
│ /Users/rockie/Documents/git/automatic/venv/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py:70 in _run_priority_list │
│ │
│ 69 │ │ msg += "\n" + _format_not_supported_reasons(op, not_supported) │
│ ❱ 70 │ raise NotImplementedError(msg) │
│ 71 │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(1, 7680, 1, 512) (torch.float32)
key : shape=(1, 7680, 1, 512) (torch.float32)
value : shape=(1, 7680, 1, 512) (torch.float32)
attn_bias : <class 'NoneType'>
p : 0.0
cutlassF is not supported because:
device=mps (supported: {'cuda'})
flshattF is not supported because:
device=mps (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
max(query.shape[-1] != value.shape[-1]) > 128
tritonflashattF is not supported because:
device=mps (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
max(query.shape[-1] != value.shape[-1]) > 128
triton is not available
smallkF is not supported because:
device=mps (supported: {'cuda', 'cpu'})
max(query.shape[-1] != value.shape[-1]) > 32
unsupported embed per head: 512
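
Every rejection above reduces to device=mps: all of the xformers kernels in the dispatch priority list require CUDA. As a hedged workaround sketch (assuming PyTorch 2.x; a possible fallback, not the project's actual code path), PyTorch's built-in scaled-dot-product attention runs on Apple Silicon without xformers:

```python
# Hedged fallback sketch (PyTorch 2.x assumed): torch's native attention
# dispatches to a math implementation on the "mps" device, so it does not
# need any of the CUDA-only xformers kernels rejected above.
import torch
import torch.nn.functional as F

device = "mps" if torch.backends.mps.is_available() else "cpu"
# (batch, n_heads, seq_len, head_dim) -- sizes taken from the traceback
q = torch.randn(1, 1, 7680, 512, device=device)
out = F.scaled_dot_product_attention(q, q, q)
print(out.shape)  # torch.Size([1, 1, 7680, 512])
```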

@vladmandic (Owner)

The original issue is with CUDA; don't hijack the thread with M1. Search issues/discussions first and open a new issue if needed, as it's not related to this one.

@rockiecxh

I see, there is already an issue mentioning that M1 is not supported. What is the estimated timeframe for M1 support? Thanks for your effort!

@vladmandic (Owner)

I'm fully willing to support the effort, but I don't have an M1 system available. I'd love to have a contributor suggest what's needed.

The same applies to AMD optimizations.

@vladmandic (Owner)

There is no update on the original issue, so I'm closing it for now; it can be reopened once an update is provided.
