Load larger models by offloading model layers to both GPU and CPU

eniompw/llama-cpp-gpu

LLaMA.cpp GPU

Offloads some of the model's layers to the GPU while keeping the rest on the CPU, allowing models that would not fit entirely in VRAM to be loaded.
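A minimal sketch of how this works with llama.cpp's CLI: the `-ngl` (`--n-gpu-layers`) flag sets how many transformer layers are offloaded to the GPU, and any remaining layers run on the CPU. The model path and layer count below are placeholders; tune `-ngl` to your GPU's VRAM (a CUDA-enabled build of llama.cpp is assumed).

```shell
# Placeholder model path and layer count for illustration.
# -ngl 20 offloads 20 layers to the GPU; the rest stay on the CPU,
# so a model larger than available VRAM can still be loaded.
./main -m models/model.gguf -ngl 20 -p "Hello"
```

Raising `-ngl` speeds up inference but uses more VRAM; lowering it trades speed for the ability to load a larger model.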
