
[Bug] No operator for 'memory_efficient_attention_forward' #412

Open
usamaa-saleem opened this issue Apr 14, 2023 · 7 comments

Comments

@usamaa-saleem

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(200, 9126, 1, 64) (torch.float32)
     key         : shape=(200, 9126, 1, 64) (torch.float32)
     value       : shape=(200, 9126, 1, 64) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 64
steps:   0%|                                                                                                                                                        | 0/98800 [56:23<?, ?it/s]
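For context (an editor's sketch, not actual xformers code): the trace above is xformers' dispatcher reporting why every backend rejects these inputs — the flash-attention backends (`flshattF`, `tritonflashattF`) need a CUDA device and half precision, `cutlassF` needs CUDA (but accepts float32), and `smallkF` needs a head dimension of at most 32. The gist of those checks can be mimicked in plain Python with illustrative names:

```python
# Simplified sketch of the per-backend checks behind the error above.
# Not the real xformers implementation; function and key names are illustrative.

def reasons_unsupported(device: str, dtype: str, embed_per_head: int) -> dict:
    """Return, per backend, the reasons an input would be rejected."""
    reasons = {}

    # flshattF / tritonflashattF: CUDA only, half precision only
    for name in ("flshattF", "tritonflashattF"):
        r = []
        if device != "cuda":
            r.append(f"device={device} (supported: {{'cuda'}})")
        if dtype not in ("torch.float16", "torch.bfloat16"):
            r.append(f"dtype={dtype} (supported: half precision)")
        reasons[name] = r

    # cutlassF: CUDA only, but float32 is accepted
    reasons["cutlassF"] = (
        [] if device == "cuda" else [f"device={device} (supported: {{'cuda'}})"]
    )

    # smallkF: head dimension must be at most 32
    reasons["smallkF"] = (
        [] if embed_per_head <= 32 else [f"unsupported embed per head: {embed_per_head}"]
    )

    return reasons


# The inputs from the trace: CPU float32 tensors with head dim 64
r = reasons_unsupported("cpu", "torch.float32", 64)
assert all(r.values())  # every backend rejects -> NotImplementedError
```

So the key symptom here is `device=cpu`: the model (or part of it) is running on the CPU rather than the GPU, which no xformers backend supports.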
@usamaa-saleem
Author

The GPU i am using is A100

@sdbds
Contributor

sdbds commented Apr 14, 2023

I hit the same problem when using the conda cudatoolkit.
When I installed torch with CUDA via pip (torch+cu), it worked.
Maybe you can try that.
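For reference, that suggestion usually means replacing a conda-installed (or CPU-only) PyTorch with a CUDA-enabled pip wheel. A possible command sequence (the `cu118` index URL is an assumption; check pytorch.org/get-started for the combination matching your CUDA version):

```shell
# Remove any existing CPU-only / conda-installed torch first
pip uninstall -y torch torchvision torchaudio
# Install CUDA-enabled wheels (cu118 is an example; pick your CUDA version)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

Afterwards, `python -c "import torch; print(torch.cuda.is_available())"` should print `True` on a machine with a working GPU driver.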

@liuchenbaidu

same

@usamaa-saleem
Author

pip install torch+cu

What's the exact CLI command?

@usamaa-saleem
Author

@kohya-ss can you guide me on how to solve this?
I need it resolved ASAP.

NotImplementedError: No operator found for `memory_efficient_attention_forward`
with inputs:
     query       : shape=(200, 9126, 1, 64) (torch.float32)
     key         : shape=(200, 9126, 1, 64) (torch.float32)
     value       : shape=(200, 9126, 1, 64) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 64
steps:   0%|                                          | 0/98800 [56:21<?, ?it/s]

@torvinx

torvinx commented Jul 4, 2023

@kohya-ss Can you tell me how to solve this? It needs to be done as quickly as possible.

NotImplementedError: No operator found for `memory_efficient_attention_forward`
with inputs:
     query       : shape=(200, 9126, 1, 64) (torch.float32)
     key         : shape=(200, 9126, 1, 64) (torch.float32)
     value       : shape=(200, 9126, 1, 64) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 64
steps:   0%|                                          | 0/98800 [56:21<?, ?it/s]

same problem

@YOlegY

YOlegY commented Jul 14, 2023

Exactly the same problem, but I'm running kohya_ss on CPU only (no GPU). What setting did I miss?
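On a CPU-only machine none of the xformers backends can run at all, so the memory-efficient attention option has to be disabled (in kohya_ss, don't enable the xformers option) and training falls back to ordinary scaled dot-product attention, which any device can run. A minimal NumPy sketch of that fallback (illustrative, not kohya_ss code):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Plain attention: softmax(q @ k^T / sqrt(d)) @ v.
    Shapes: (batch, seq, dim). Works on any device, unlike the xformers kernels."""
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q = k = v = np.ones((2, 4, 8), dtype=np.float32)
out = scaled_dot_product_attention(q, k, v)
assert out.shape == (2, 4, 8)
```

The trade-off is memory: this materializes the full (seq, seq) attention matrix, which is exactly what the memory-efficient kernels avoid — but it is the only option without a CUDA device.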
