Stars
ai
4 repositories
Fork of Facebook's LLaMA model to run on CPU
Real-time transcription with OpenAI Whisper.
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.




