This is a transformers-library application that lets you choose a local LLM and run streaming inference on the GPU.
It uses:
- Python: 3.8.10
- transformers library: 4.36.2
- transformers_stream_generator library
The models are assumed to be in the oobabooga text-generation-webui models folder.
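As an illustration, here is a minimal sketch of how a model chooser might enumerate that folder. The path below is a hypothetical default, not part of this project; adjust `MODELS_DIR` to match your own installation:

```python
import os

# Hypothetical default location of the text-generation-webui models folder;
# change MODELS_DIR to wherever your copy of the UI lives.
MODELS_DIR = os.path.expanduser("~/text-generation-webui/models")

# Each model lives in its own subfolder, so a plain directory listing
# is enough to offer a choice of local LLMs.
models = sorted(
    entry for entry in os.listdir(MODELS_DIR)
    if os.path.isdir(os.path.join(MODELS_DIR, entry))
)
for index, name in enumerate(models):
    print(f"{index}: {name}")
```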
The openchat model is available at:
- https://huggingface.co/TheBloke/openchat-3.5-0106-GPTQ
- https://huggingface.co/sujitvasanth/TheBloke-openchat-3.5-0106-GPTQ
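For reference, below is a minimal streaming-inference sketch, assuming the `init_stream_support()` / `do_stream=True` pattern described in the transformers_stream_generator documentation; the prompt template follows the openchat-3.5 model card, and loading GPTQ weights through transformers additionally requires the `optimum` and `auto-gptq` packages. This is an illustration under those assumptions, not the application's exact code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_stream_generator import init_stream_support

# Patch transformers' generate() so it accepts do_stream=True.
init_stream_support()

# Either a Hugging Face repo id or a local folder under models/.
model_path = "TheBloke/openchat-3.5-0106-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="cuda:0")

# openchat-3.5 prompt template, per the model card.
prompt = "GPT4 Correct User: Hello!<|end_of_turn|>GPT4 Correct Assistant:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")

# With do_stream=True, generate() yields tokens one at a time
# instead of returning the full sequence at the end.
generator = model.generate(
    input_ids, do_stream=True, do_sample=True, max_new_tokens=256
)
for token in generator:
    print(tokenizer.decode(token, skip_special_tokens=True), end="", flush=True)
```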