Text Gen Web UI LLMs for API

LLMs I found that work with the Text Generation Web UI "API" out of the box with no other tweaking. LLMs that span one GPU, the API works, LLMs that span two GPUS, API does not work.

System: Proxmox 8 Host Ubuntu 22.04 Server VM Guest 64GB RAM 2 x 3090s 24Gb VRAM each (48 Total)

CUDA/Nvidia: Cuda compilation tools, release 12.3, V12.3.107 Nvidia Driver Version: 545.23.08 CUDA Version: 12.3

API Does Work (GUI Chat Also):

TheBloke_Llama-2-7B-Chat-GPTQ
ehartford_dolphin-2.0-mistral-7b
MaziyarPanahi_dolphin-2.1-mistral-7b-GPTQ
TheBloke_Wizard-Vicuna-7B-Uncensored-GPTQ
TheBloke_Wizard-Vicuna-30B-Uncensored-GPTQ
TheBloke_vicuna-7B-v1.3-GPTQ
TheBloke_Nous-Hermes-2-SOLAR-10.7B-GPTQ
TheBloke_SOLAR-10.7B-Instruct-v1.0-GPTQ
TheBloke_Llama-2-13B-chat-GPTQ
TheBloke_vicuna-33B-GPTQ

API Does NOT Work (GUI Chat DOES Work):

TheBloke_llama2_70b_chat_uncensored-GPTQ
TheBloke_guanaco-65B-GPTQ
TheBloke_MoMo-70B-V1.1-GPTQ
TheBloke_CodeLlama-70B-hf-GPTQ
TheBloke_CodeLlama-70B-Python-GPTQ
TheBloke_Mixtral-8x7B-Instruct-v0.1-GPTQ

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

text-gen-web-ui-api-working-llm-list.md

text-gen-web-ui-api-working-llm-list.md

Text Gen Web UI LLMs for API

API Does Work (GUI Chat Also):

API Does NOT Work (GUI Chat DOES Work):

Files

text-gen-web-ui-api-working-llm-list.md

Latest commit

History

text-gen-web-ui-api-working-llm-list.md

File metadata and controls

Text Gen Web UI LLMs for API

API Does Work (GUI Chat Also):

API Does NOT Work (GUI Chat DOES Work):