How to pass computer resource information #40
Are you trying to configure CatAI to use only some of your CPU/GPU cores? You can configure CatAI like this:
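The configuration snippet from the original comment was lost during extraction. As a rough sketch only, a model config that limits resource usage might look like the fragment below; the field names (`threads`, `gpuLayers`, `contextSize`) are assumptions based on common llama.cpp bindings, not confirmed CatAI settings:

```json
{
  "threads": 4,
  "gpuLayers": 20,
  "contextSize": 2048
}
```

Check the CatAI documentation for the actual setting names supported by your version.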
Yes, this may be related. The recommended model format is GGUF.

You can check this page for more models: https://huggingface.co/TheBloke. Most GGUF models are supported, and you can install a custom model by copying the model link. The more parameters a model has, the more resources it utilizes, so this 70B model (https://huggingface.co/TheBloke/ARIA-70B-V2-GGUF) can be pretty heavy.

Some models are split into multiple files, like https://huggingface.co/TheBloke/Falcon-180B-GGUF/tree/main. I have not tried this method yet, so if there is a problem feel free to report it :)
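To make the install step above concrete, here is a minimal sketch of installing a custom GGUF model from a copied link. The exact `catai` subcommands may differ by version, so treat these as assumptions and check `catai --help`:

```shell
# Install a custom GGUF model by pasting the direct model link
# copied from a Hugging Face model page (URL taken from this thread)
catai install https://huggingface.co/TheBloke/ARIA-70B-V2-GGUF

# Start the server once the model has finished downloading
catai serve
```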
Ok. I'll continue to look into it.
This gif was generated from a screen recording on a Mac Pro M1. llama.cpp has specific optimizations for Apple silicon. Is your CUDA GPU being used while the model is generating tokens?
I uploaded new models; on my Mac the tokens/s is faster than in the gif. Screen.Recording.2023-09-21.at.12.02.31.mov
Thank you. |
Hello,
I am looking to see where/how CPU and/or GPU information is passed during server start, but I am unable to find it.
Thank you