This repo provides the code for serving the original bloomz-3b model in production for german snippet generation. It can be used to spin up a simple HTTP server, that handles translation and snippet generation.
NOTE: There is an even faster server for this model, built with the Potassium framework. Please use the Potassium Server for optimal performance.
Curious to get your hand on a bloomz server capable of german snippet generation?
You can check it out with docker:
- Run
docker build -t bloomz-model-server . && docker run -it bloomz-model-server
to build and run the docker container.
Or you can check it out manually:
- Run
pip3 install -r requirements.txt
to download dependencies. - Run
python3 server.py
to start the server. - Run
python3 test.py
in a different terminal session to test against it.
Note: Model requires a GPU with ~ 15GB memory for generation!
app.py
contains the code to load and run the model for inference.- You can run a simple test with
test.py
!
if deploying using Docker:
download.py
is a script to download our finetuned model weights at build time.
This repo provides you with a functioning http server for our finetuned gptj-title-teaser-10k model. You can use it as is, or package it up with our provided Dockerfile
and deploy it to your favorite container hosting provider!
We are currently running this code on Banana, where you can get 1 hour of model hosting for free. Feel free to choose a different hosting provider. In the following section we provide instructions for deployment with Banana.
- Fork this repo
- Log in to the Banana App
- Select your forked repo for deploy
It'll then be built from the dockerfile, optimized, then deployed on Banana Serverless GPU cluster.
You can monitor buildtime and runtime logs by clicking the logs button in the model view on the Banana Dashboard.
When build and optimization finished successfully you will find your credentials printed in the build logs.
Your model was updated and is now deployed!
It is runnable with the same credentials:
API_KEY=Your-Personal-Api-Key
MODEL_KEY=Your-Personal-Model-Key
You need these keys to hook up the web app with the model.
To setup the frontend follow the instructions in the web-app repository.