Deploying HuggingFaceH4/starchat-alpha Large Language Model on AWS EC2 Instance

Choosing an AMI and Instance

I chose the AMI "Deep Learning AMI GPU PyTorch 2.0.0 (Ubuntu 20.04) 20230530" and the instance type "g5.4xlarge", which comes with a 24 GiB GPU. The model requires approximately 20 GiB of GPU memory to run in 8-bit mode, so it fits on this GPU with a small margin.
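
As a rough sanity check on that figure, the sketch below estimates the memory taken by the weights alone. It assumes about 15.5 billion parameters (the size of the StarCoder family that starchat-alpha builds on; this number is an assumption, not something stated in this README) and one byte per parameter in 8-bit mode. Activations, the attention cache, and CUDA overhead come on top of the weights, which is consistent with the roughly 20 GiB quoted above.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: ~15.5e9 parameters; this count is not stated in the README.
N_PARAMS = 15.5e9
GPU_GIB = 24  # GPU memory of the g5.4xlarge, as stated above

for bits in (16, 8):
    gib = weight_memory_gib(N_PARAMS, bits)
    verdict = "fits" if gib < GPU_GIB else "does not fit"
    print(f"{bits}-bit weights: {gib:.1f} GiB ({verdict} in {GPU_GIB} GiB)")
```

Under these assumptions the 16-bit weights alone exceed 24 GiB, while the 8-bit weights leave several GiB of headroom for everything else.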

Setting up the Environment

First, we activate the virtual environment that comes preinstalled on our chosen AMI. The AMI also ships with Docker by default.

source activate pytorch
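
Before going further, it can be worth confirming that the GPU is visible from the instance. A minimal sketch, assuming the NVIDIA driver's nvidia-smi tool is on the PATH (the helper name gpu_summary is hypothetical and not part of the repository):

```python
import shutil
import subprocess
from typing import Optional


def gpu_summary() -> Optional[str]:
    """Return one 'name, total memory' line per GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver tools not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


print(gpu_summary() or "No NVIDIA GPU detected")
```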

Next, we clone the repository and navigate into the folder:

git clone https://github.com/SaturdaysAI/chatbot-docker-server.git
cd chatbot-docker-server/

Next, we download the model locally:

pip install huggingface_hub
python download_model.py

The pip install huggingface_hub command installs huggingface_hub, the Python client library for the Hugging Face Hub, which lets you download and use models hosted there. The python download_model.py command runs the script that downloads the model into a folder named "model" in the current directory.
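
The contents of download_model.py are not reproduced in this README. A minimal sketch of what such a script could look like, assuming it uses snapshot_download from huggingface_hub with its local_dir parameter to write into "model" (the function name download_model is hypothetical, and the repository's actual script may differ):

```python
from pathlib import Path

MODEL_ID = "HuggingFaceH4/starchat-alpha"
TARGET_DIR = Path("model")  # folder name taken from the description above


def download_model(model_id: str = MODEL_ID, target_dir: Path = TARGET_DIR) -> str:
    """Download every file of the model repository into target_dir."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target_dir.mkdir(parents=True, exist_ok=True)
    # Returns the path of the folder that now holds the model files.
    return snapshot_download(repo_id=model_id, local_dir=str(target_dir))
```

Calling download_model() starts the transfer. The model files are large, so make sure the instance has enough free disk space first.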

Dockerization

Now we need to create an image from the Dockerfile. We do this by running the following command:

docker build --tag chatbot-docker .

The docker build --tag chatbot-docker . command builds a Docker image from the Dockerfile located in the current directory and tags it as "chatbot-docker".

Once the image is created, we can verify it with the following command:

docker images

This command lists all Docker images currently on your machine.

The output should be something like this:

REPOSITORY	TAG	IMAGE ID	CREATED	SIZE
chatbot-docker	latest	3f8763f1593a	3 minutes ago	8.41GB
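
The same check can be scripted, for example in a deployment script. A minimal sketch, assuming the docker CLI is on the PATH and using the --format option of docker images (the helper name image_exists is hypothetical):

```python
import shutil
import subprocess


def image_exists(ref: str) -> bool:
    """Return True if `docker images` lists the given repository:tag."""
    if shutil.which("docker") is None:
        return False  # docker CLI not available
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Each output line has the form repository:tag.
    return ref in result.stdout.splitlines()


print("chatbot-docker:latest present:", image_exists("chatbot-docker:latest"))
```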

Now we can run the image in a container with the following command:

docker run -d -p 5000:5000/tcp --gpus all --mount type=bind,src=/home/ubuntu/chatbot-docker-server/model,dst=/model chatbot-docker:latest

The docker run -d -p 5000:5000/tcp --gpus all --mount type=bind,src=/home/ubuntu/chatbot-docker-server/model,dst=/model chatbot-docker:latest command runs a Docker container in detached mode (-d), publishes TCP port 5000 of the container on port 5000 of the host (-p 5000:5000/tcp), enables access to all GPUs (--gpus all), and binds the "model" directory on the host to the "/model" directory inside the container (--mount type=bind,src=...,dst=...).
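
Because the container runs in the background, it helps to confirm that something is listening on the published port. A minimal sketch that only tests TCP reachability of port 5000 on the host; it makes no assumption about the HTTP routes served by app.py, and the helper name port_open is hypothetical:

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("port 5000 reachable:", port_open("localhost", 5000))
```

A negative result right after docker run does not necessarily mean failure, since the server may still be loading the model; docker ps and docker logs show the container's state.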
