This issue was moved to a discussion.

GGUF models inference speed - Why is GGUF model inference fast on my Mac but slow on cluster? #7715

Closed
eltonjohnfanboy opened this issue Jun 3, 2024 · 0 comments

Comments

@eltonjohnfanboy

Hi guys!

I've noticed that GGUF model inference is much faster on my Mac M3 than on my college's cluster, even when I request 8 or 16 cores. Both systems run the same GGUF model version and dependencies. Inference on the Mac takes seconds, while on the cluster it can take up to an hour to generate a response.
Are there known issues with GGUF models on certain CPUs? Any help would be greatly appreciated. Thank you!

Repository owner locked and limited conversation to collaborators Jun 3, 2024
@JohannesGaessler JohannesGaessler converted this issue into discussion #7717 Jun 3, 2024
