Can you test with different flash attention settings? #23714 changed the default for llama-bench:
-fa, --flash-attn <on|off|auto> (default: auto)
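
If it helps, here is a rough sketch of how the comparison could be scripted. It only builds one llama-bench command line per flash attention value quoted above and prints it, so you can check the commands before running anything. The model path `model.gguf` and the helper names `build_commands` and `run_all` are placeholders I made up, and it assumes `llama-bench` is on your PATH and that your build accepts the `on|off|auto` values; older builds may expect different values.

```python
import shutil
import subprocess

# Values copied from the help text quoted above; an older build may not accept them.
FLASH_ATTN_MODES = ("on", "off", "auto")


def build_commands(model_path, binary="llama-bench"):
    """Return one llama-bench argument list per flash attention mode."""
    return [[binary, "-m", model_path, "-fa", mode] for mode in FLASH_ATTN_MODES]


def run_all(model_path, binary="llama-bench"):
    """Run the benchmark once per mode (assumes the binary is on PATH)."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    for cmd in build_commands(model_path, binary):
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: only print the commands. "model.gguf" is a hypothetical path.
    for cmd in build_commands("model.gguf"):
        print(" ".join(cmd))
```

Call `run_all("path/to/your-model.gguf")` to actually execute the three runs, then compare the results for `on`, `off`, and `auto` side by side.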

Answer selected by KaySees
Category
Q&A