Skip to content

mglambda/llm-layers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

llm-layers

llm-layers determines suitable large language models for your hardware, downloads them from huggingface, and generates startup scripts for various backends that offload an appropriate number of layers onto your GPU.

The project got started because of my frustration with having to manage command line options to adjust the amount of layers that aN LLM backend (like llama.cpp) would load without going over my graphics card's vram. This was especially painful when loading several models at the same time, or having portions of vram be spoken for due to other reasons, like a TTS engine, whisper model, or streaming with OBS.

So at first this was just a csv file with model filename and number of GPU layers associated with that model. The layers-file. Now it looks something like this.

 $ llm-layers -d
# Running with -d (--dry_run), Nothing permanent will be written to disk. Here isa pretty version of the potential layers file.
# Run with -g to actually generate the layers file and the scripts.

name                                               gpu_layers    context  prompt_format           type
-----------------------------------------------  ------------  ---------  ----------------------  ----------
Noromaid-v0.4-Mixtral-Instruct-8x7b.q4_k_m.gguf            24       4096  chat-ml                 chat
daringmaid-20b.Q6_K.gguf                                   54       4096  alpaca                  chat
dolphin-2.1-mistral-7b.Q4_K_M.gguf                        999       8192  chat-ml                 default
dolphin-2.1-mistral-7b.Q8_0.gguf                          999       8192  chat-ml                 default
dolphin-2.7-mixtral-8x7b.Q4_0.gguf                         22       4096  chat-ml                 default
dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf                       22       4096  chat-ml                 default
llava-v1.6-34b.Q4_K_M.gguf                                 54       2048  chat-ml                 multimodal
llava-v1.6-mistral-7b.Q5_K_M.gguf                         999       4096  chat-ml                 multimodal
miqu-1-70b.q5_K_M.gguf                                     32        512  mistral                 default
mistral-7b-instruct-v0.2.Q4_K_M.gguf                      999       8192  mistral                 default
neuralhermes-2.5-mistral-7b.Q6_K.gguf                       1       2048  chat-ml                 default
solar-10.7b-instruct-v1.0.Q8_0.gguf                       999       4096  user-assistant-newline  mini
unholy-v2-13b.Q8_0.gguf                                   666       8192  alpaca                  chat

Usage

Coming soon.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages