Load larger models by offloading model layers to both GPU and CPU

eniompw/llama-cpp-gpu

LLaMA.cpp GPU

Offloads some of the model's layers to the GPU while keeping the rest on the CPU, allowing models that would not fit entirely in VRAM to be loaded.
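A minimal sketch of how this works with llama.cpp's CLI: the `-ngl` (`--n-gpu-layers`) flag sets how many transformer layers are offloaded to the GPU, and any remaining layers run on the CPU. The model path and layer count below are placeholders; tune `-ngl` to your GPU's VRAM (a CUDA-enabled build of llama.cpp is assumed).

```shell
# Placeholder model path and layer count for illustration.
# -ngl 20 offloads 20 layers to the GPU; the rest stay on the CPU,
# so a model larger than available VRAM can still be loaded.
./main -m models/model.gguf -ngl 20 -p "Hello"
```

Raising `-ngl` speeds up inference but uses more VRAM; lowering it trades speed for the ability to load a larger model.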
