Does it make sense to buy a powerful graphics card to speed up generation? #342

I currently have an RTX 3050, and the latest releases of koboldcpp have really sped up prompt processing. Obviously, a more powerful graphics card will speed that up even more. But what about generation? I might buy an RTX 4090 if it made the token generation rate significantly faster, but I suspect it won't. Can't the CUDA and RTX cores be used for at least some of the computation to speed up generation? Because 70B models on a CPU would be very slow...
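To make the question concrete, here is a rough back-of-envelope sketch. It assumes generation is limited by memory bandwidth, so every generated token has to read all the model weights once, and that the weights can be split between GPU and CPU memory. The function name and every number below (model size, bandwidths) are illustrative assumptions, not measurements from koboldcpp or any specific card.

```python
def tokens_per_second(model_gb, gpu_fraction, gpu_bw_gbs, cpu_bw_gbs):
    """Rough upper bound on generation speed for a bandwidth-bound model.

    model_gb     -- size of the quantized weights in GB (assumed)
    gpu_fraction -- share of the weights held in GPU memory, 0.0 to 1.0
    gpu_bw_gbs   -- GPU memory bandwidth in GB/s (assumed)
    cpu_bw_gbs   -- system RAM bandwidth in GB/s (assumed)
    """
    # Time to stream each part of the weights once per generated token.
    gpu_time = model_gb * gpu_fraction / gpu_bw_gbs
    cpu_time = model_gb * (1.0 - gpu_fraction) / cpu_bw_gbs
    return 1.0 / (gpu_time + cpu_time)


# Hypothetical figures: a 70B model quantized to about 40 GB,
# about 50 GB/s of system RAM bandwidth, about 1000 GB/s on a high-end GPU.
MODEL_GB = 40.0
CPU_BW = 50.0
GPU_BW = 1000.0

cpu_only = tokens_per_second(MODEL_GB, 0.0, GPU_BW, CPU_BW)
half_on_gpu = tokens_per_second(MODEL_GB, 0.5, GPU_BW, CPU_BW)
all_on_gpu = tokens_per_second(MODEL_GB, 1.0, GPU_BW, CPU_BW)

print(f"CPU only:    {cpu_only:.2f} tokens/s")
print(f"Half on GPU: {half_on_gpu:.2f} tokens/s")
print(f"All on GPU:  {all_on_gpu:.2f} tokens/s")
```

Under these assumptions, the part of the model left in system RAM dominates the time per token, so the gain from a faster card depends mostly on how much of the model fits in its memory.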
