
feature: support for exllama and AutoGPTQ #796

Closed
mudler opened this issue Jul 24, 2023 · Discussed in #763 · 3 comments · Fixed by #881
Assignees: mudler
Labels: enhancement (New feature or request), high prio

Comments

@mudler
Owner

mudler commented Jul 24, 2023

Discussed in #763

Originally posted by yarray July 17, 2023
Although llama.cpp can now support GPU via cuBLAS, it seems that exllama runs several times faster given a good enough GPU (a 3090, for example). Is there any plan to support exllama or, more generally, other loaders for LLMs?
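For readers unfamiliar with the requested loaders, here is a rough, hypothetical sketch of how a GPTQ-quantized model is loaded with the standalone auto-gptq Python package (the checkpoint name and generation settings are purely illustrative); exllama fills the same role with its own, faster CUDA kernels for GPTQ weights:

```python
# Hypothetical example: loading a GPTQ checkpoint with auto-gptq outside LocalAI.
# The model id below is a placeholder; any GPTQ-quantized checkpoint would do.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/WizardLM-7B-uncensored-GPTQ"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",        # GPTQ inference runs on the GPU
    use_safetensors=True,
)

prompt = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**prompt, max_new_tokens=32)[0]))
```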

@mudler added the enhancement label Jul 24, 2023
@localai-bot
Contributor

⚠️⚠️⚠️⚠️⚠️

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

but... I can also be funny or helpful 😸 and I can generally give good tips on where to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me,
I don't support (yet) replying!

⚠️⚠️⚠️⚠️⚠️

Yes, we are planning to add support for more loaders in the future. Currently, we are working on improving the llama.cpp backend to make it faster and more efficient. We also plan to add support for other loaders such as exllama and others. However, this will require some time to implement and test properly. In the meantime, you can use exllama or other loaders to load LLM models and use them with LocalAI.
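For context, LocalAI picks the inference backend per model through its YAML model config. A hypothetical sketch of what such an entry could look like once these backends exist (the `autogptq`/`exllama` backend names and the model reference are assumptions pending #881, not something LocalAI supported at the time of this comment):

```yaml
# Hypothetical model definition, e.g. models/wizardlm-gptq.yaml
name: wizardlm-gptq
backend: autogptq            # assumption: or "exllama", once either backend lands
parameters:
  model: TheBloke/WizardLM-7B-uncensored-GPTQ   # placeholder GPTQ checkpoint
```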

Sources:

@mudler self-assigned this Jul 24, 2023
@mudler
Owner Author

mudler commented Jul 26, 2023

⚠️⚠️⚠️⚠️⚠️

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

but... I can also be funny or helpful 😸 and I can generally give good tips on where to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me, I don't support (yet) replying!

⚠️⚠️⚠️⚠️⚠️

Yes, we are planning to add support for more loaders in the future. Currently, we are working on improving the llama.cpp backend to make it faster and more efficient. We also plan to add support for other loaders such as exllama and others. However, this will require some time to implement and test properly. In the meantime, you can use exllama or other loaders to load LLM models and use them with LocalAI.

Sources:

* https://localai.io/features/embeddings/index.html

* https://localai.io/advanced/index.html

* https://localai.io/basics/news/index.html

* https://localai.io/basics/getting_started/index.html

lol 🤣

@mudler changed the title from "feature: support for exllama" to "feature: support for exllama and AutoGPTQ" Jul 26, 2023
@mudler
Owner Author

mudler commented Jul 26, 2023

maybe we can split the two - but for now keeping it here, open for discussion. any takers here? or I'll likely start to have a look at it sooner or later
