feat: LLM inference libraries support plan #3124

Open
hydai opened this issue Dec 29, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

hydai commented Dec 29, 2023

Summary

There are various LLM inference libraries. WasmEdge has already integrated llama.cpp, but we want to bring more of them to the community.

Details

Already supported:

  1. PyTorch
  2. TFLite
  3. OpenVINO
  4. llama.cpp

The support priority list:

  • Tier 1:
    • burn-rs
  • Tier 2:
    • Intel Extension for Transformers
    • whisper.cpp
    • RWKV
  • Tier 3:
    • vllm
    • CTranslate2
    • candle
    • mlx

Please feel free to add any comments and suggestions; we would like to hear the voice of the community.
Also, if you are interested in contributing support for new LLM inference libraries, please let us know and show us what you have :-)

Happy new year!


@hydai added the enhancement label on Dec 29, 2023

katopz commented Dec 30, 2023

Cool, I'm now playing with:

  • For RAG: llm-chain // only supports Qdrant so far
  • For Mac: mlx // still under heavy development
  • For fun: candle // inference tps seems a lot slower than llama.cpp at the moment, but I hope it will catch up
  • For co-pilot: Tabby // supports repo lookup, but the output doesn't satisfy me yet

@alabulei1 pinned this issue Jan 2, 2024