Gradio vLLM Hugging Face

A Gradio frontend that uses vLLM to download and deploy Hugging Face models. It includes a CRUD REST API built with the Docker SDK and Redis.

Prerequisite: A GPU that supports CUDA 12.4.

Prerequisite: Some models require a GPU with Bfloat16 support.
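To check the Bfloat16 prerequisite before pulling a model, something like the following sketch may help. It assumes PyTorch is installed on the host, which this project does not require, and the helper name `bfloat16_status` is made up for illustration. It reports on the host GPU only, so treat the result as a hint rather than a guarantee of what the vLLM container will see.

```python
def bfloat16_status():
    """Return True/False if PyTorch can tell, or None if PyTorch is not installed.

    Hypothetical helper, not part of this repository.
    """
    try:
        import torch  # assumption: PyTorch is available on the host
    except ImportError:
        return None
    if not torch.cuda.is_available():
        # No usable CUDA device, so Bfloat16 on GPU is not available either.
        return False
    return torch.cuda.is_bf16_supported()


status = bfloat16_status()
if status is None:
    print("PyTorch not installed; cannot check Bfloat16 support.")
else:
    print("Bfloat16 supported on GPU:", status)
```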

To install the containers, run the Docker Compose file:

sudo docker compose up -d

Then visit the Gradio frontend at the port specified in the compose file, e.g. http://localhost:7860
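The containers can take a while to start, so the frontend may not respond right after `docker compose up -d` returns. The sketch below polls the frontend URL until it answers, using only the Python standard library. The function name `wait_for_frontend`, the timeout, and the polling interval are illustrative assumptions; the default URL is the example address above and should be changed to match the port in your compose file.

```python
import time
import urllib.request


def wait_for_frontend(url="http://localhost:7860", timeout=60.0, interval=2.0):
    """Poll `url` until it returns HTTP 200 or `timeout` seconds have passed.

    Hypothetical helper, not part of this repository.
    Returns True if the frontend answered, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP errors
            # (urllib.error.URLError is a subclass of OSError).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_frontend("http://localhost:7860", timeout=120)` returns `True` once the Gradio page responds, or `False` if it is still unreachable after two minutes.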
