This repository contains Docker files for deploying Marian as a REST service in a Docker container. At this point, each instance supports only a single translation direction. (In the future, the Marian REST server may also support multiple models in one instance, but currently it is one system per instance.)
- Put all necessary files into a directory:
  - the binarized model file(s). Binarize with `marian-conv` from the Marian distribution.
  - the vocabulary file(s)
  - the `decoder.yml` file. You will have to create this or adapt it from the `decoder.yml` file written by the Marian training process. Here is an example:

    ```yaml
    relative-paths: true
    models:
      - model.bin
    vocabs:
      - joint-vocab.spm
      - joint-vocab.spm
    beam-size: 4
    normalize: 1
    word-penalty: 0
    mini-batch: 128
    maxi-batch: 100
    maxi-batch-sort: src
    # The following are specific to the Marian REST server.
    # source-language and target-language are used for the demo
    # interface; the ssplit-prefix-file is from the Moses sentence splitter
    # and comes with the Marian REST server image. Pick the right one
    # for your source language. SSPLIT_ROOT_DIR is set to the appropriate
    # value in the `mariannmt/marian-rest-server` image.
    source-language: German
    target-language: English
    ssplit-prefix-file: ${SSPLIT_ROOT_DIR}/nonbreaking_prefixes/nonbreaking_prefix.de
    ```
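The binarization step mentioned above can be sketched as follows. This is a sketch, not a definitive invocation; `model.npz` and `model.bin` are placeholder file names for your trained model and its binarized output:

```shell
# Convert a trained Marian model to the binary format the server loads.
# --from/-f names the input model, --to/-t the binarized output file.
marian-conv --from model.npz --to model.bin
```

Run `marian-conv --help` in your Marian distribution to see the options your build supports.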
- without GPU utilization:

  ```
  docker run --rm -d -p 18080:18080 -v /path/to/the/model/directory:/opt/app/marian/model mariannmt/marian-rest-server
  ```

- with GPU utilization. This requires Docker 19.03 or above (see https://github.com/NVIDIA/nvidia-docker) and unfortunately currently does not work within docker-compose (see docker/compose#6691).

  ```
  docker run --rm --gpus device=${GPU_IDs} -d -p 18080:18080 -v /path/to/the/model/directory:/opt/app/marian/model mariannmt/marian-rest-server
  ```

  Here, `GPU_IDs` is a comma-separated list of GPUs on the host that should be made available to the Docker container.
The server currently supports two APIs:

- The ELG API can be accessed at `http://localhost:18080/api/elg/v1`.
- The Bergamot API can be accessed at `http://localhost:18080/api/bergamot/v1`.
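As an illustration, a minimal Python client for the ELG endpoint might look like the sketch below. The request body follows the generic ELG text-request shape (`{"type": "text", "content": ...}`); the helper names `build_elg_request` and `translate` are hypothetical, not part of the server:

```python
import json
import urllib.request

# Endpoint from the list above; adjust host/port to your deployment.
ELG_URL = "http://localhost:18080/api/elg/v1"

def build_elg_request(text):
    """Build the JSON body for an ELG-style text request."""
    return json.dumps({"type": "text", "content": text})

def translate(text, url=ELG_URL):
    """POST the request to the server and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=build_elg_request(text).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a running server):
# print(translate("Guten Morgen"))
```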
The server currently does not support HTTPS.
For easy deployment in a cluster, you may want to create a Docker image with the model integrated.
- Copy the `./marian-mt-service/Dockerfile` from this repository into your model directory.
- Run

  ```
  docker build -t ${IMAGE_ID} /path/to/the/model/directory
  ```

  where `IMAGE_ID` is the name of the resulting Docker image (e.g. /marian-rest-server:).
- To publish, push the image to Docker Hub:

  ```
  docker push ${IMAGE_ID}
  ```
Normally, this is not necessary. Do this only if you can't find mariannmt/marian-rest-server:latest, or if you are using your own custom version of Marian server.
In order to achieve compact images, we use a staged build process:
- Create an image that contains the build environment (this avoids you having to install all kinds of things on your local machine).
- Compile Marian in a separate build process that uses the build environment image to compile but uses host-mounted directories so that you can keep intermediate steps around.
- Create a new image and copy only the necessary bits and pieces into the new image.
```
make image/build-environment
make image/marian-rest-server
```