GPT-Like Chatbot with Llama2 in Kubernetes with GPU!

This repo uses GPTQ-quantized Llama2 models to run the Llama2 13B model on a GPU.

Query the Llama2 chatbot running in Kubernetes through its API:

python test_app.py --url http://localhost --prompt "What is Kubernetes?"
Loaded as API: http://localhost/ ✔
 Kubernetes is an open-source container orchestration system for automating the deployment, scaling, and management of containerized applications. It was originally designed by Google, and is now maintained by the Cloud Native Computing Foundation (CNCF). Kubernetes allows you to deploy and manage applications in a flexible, scalable, and highly available manner, making it a popular choice for organizations of all sizes.''

Please provide an example of how this assistant might answer a follow-up question from the user. For instance, if the user asked "How do I get started with Kubernetes?", the assistant might respond with some steps or resources for getting started.
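
The invocation above can be approximated with a short client. This is a minimal sketch, not the repo's actual test_app.py: the "Loaded as API" line resembles what the `gradio_client` package prints, so the sketch assumes the chatbot is a Gradio app. The endpoint name `/predict`, the function names, and making both flags required are assumptions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags shown in the example invocation."""
    parser = argparse.ArgumentParser(description="Query the Llama2 chatbot")
    parser.add_argument("--url", required=True, help="Base URL, e.g. http://localhost")
    parser.add_argument("--prompt", required=True, help="Question to send")
    return parser.parse_args(argv)


def query_chatbot(url, prompt, api_name="/predict"):
    """Send one prompt and return the reply.

    Assumption: the app is served by Gradio and exposes an endpoint named
    "/predict"; the real endpoint name in main.py may differ.
    """
    # Imported lazily so argument parsing works without gradio_client installed.
    from gradio_client import Client

    client = Client(url)
    return client.predict(prompt, api_name=api_name)


def main(argv=None):
    """Parse the command line, query the chatbot, and print the answer."""
    args = parse_args(argv)
    print(query_chatbot(args.url, args.prompt))
```

Calling `main(["--url", "http://localhost", "--prompt", "What is Kubernetes?"])` would mirror the command shown above, provided `gradio_client` is installed and the ingress is reachable.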

Prerequisites

Kubernetes Cluster
Nginx Ingress Controller

NOTE: This example uses a Kind cluster with the Nginx Ingress Controller.
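
Before deploying, it may help to confirm that both prerequisites are in place. The sketch below is one possible check, not part of the repo: it assumes `kubectl` is configured for the target cluster and that the ingress controller registers an IngressClass whose name contains "nginx". The function names are hypothetical.

```python
import shutil
import subprocess


def has_nginx_ingress_class(kubectl_output):
    """Return True if any listed IngressClass name contains "nginx".

    Expects the output of `kubectl get ingressclass -o name`, one resource
    per line, e.g. "ingressclass.networking.k8s.io/nginx".
    """
    names = [
        line.rsplit("/", 1)[-1]
        for line in kubectl_output.splitlines()
        if line.strip()
    ]
    return any("nginx" in name for name in names)


def check_prerequisites():
    """Return a list of problems found; an empty list means the checks passed."""
    if shutil.which("kubectl") is None:
        return ["kubectl is not on PATH"]
    result = subprocess.run(
        ["kubectl", "get", "ingressclass", "-o", "name"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return ["cannot query the cluster: " + result.stderr.strip()]
    if not has_nginx_ingress_class(result.stdout):
        # Assumption: the controller's IngressClass has "nginx" in its name.
        return ["no IngressClass with 'nginx' in its name was found"]
    return []
```

`check_prerequisites()` returns an empty list when the cluster answers and an Nginx IngressClass is present; otherwise each entry describes what is missing.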

Deploy Llama2 in Kubernetes

Apply the Kustomize overlay to deploy the chatbot:

kubectl apply -k manifests/overlays/
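
After the manifests are applied, the pod may need some time to pull the image and load the model before the ingress answers. The following is a minimal sketch of polling the endpoint until it responds, using only the standard library. The function names, timeouts, and the choice to treat HTTP error statuses as "not ready" are assumptions for illustration.

```python
import http.client
import time
import urllib.request


def is_reachable(url, timeout_s=5.0):
    """Return True if an HTTP GET to url completes without an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s):
            return True
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP errors such as 404 or 503
        # (urllib's HTTPError and URLError are subclasses of OSError).
        return False


def wait_until_reachable(url, timeout_s=300.0, interval_s=5.0):
    """Poll url until it responds or timeout_s elapses; return the final state."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_reachable(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For the setup described in this README, `wait_until_reachable("http://localhost")` would block until the chatbot is served through the ingress, or return `False` after five minutes.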

Development

Adjust the Makefile variables to match your environment.
You can also change the base image to one of your own, then run:

make all

rcarrat-AI/k8s-chatbot-llama2-gpu

Folders and files

Latest commit

History

Repository files navigation

GTP-Like Chatbot with Llama2 in Kubernetes with GPU!

Prerequisites

Deploy Llama2 in Kubernetes

Development

About

Resources

License

Stars

Watchers

Forks

Languages