Skip to content

Repository with code for microservices that runs cpsign models and makes it possible to do predictions

License

Notifications You must be signed in to change notification settings

arosbio/cpsign_predict_services

Repository files navigation

CPSign REST Services

Web services for running CPSign predictive models as REST services. The models should be generated using the open source program CPSign - see preprint CPSign at bioRxiv.

Table of Contents

Introduction

Currently this server supports models trained from version 2.0.0 of CPSign. These REST services are automatically documented using OpenAPI by Swagger annotations of the code. Optionally the services can bundle in a drawing interface where molecules can be drawn in a JSME editor and generate images of atom contributions to the predictions as well as the Swagger UI for viewing/rendering the OpenAPI definition in a more human readable format.

Further reading

Quick start

We now publish base docker images using GitHub packages which simplifies spinning up your own services. See the available images and tags at the Packages tab at GitHub. There are one type of service for each type of CPSign model;

  • cpsign-cp-clf-server: Conformal classification models, supports both ACP and TCP models.
  • cpsign-cp-reg-server: Conformal regression models.
  • cpsign-vap-clf-server: Venn-ABERS probabilistic models.

To build a Docker image that includes your model you only need these lines in a Dockerfile (excluding the comment lines):

# Pick the base image to use
FROM ghcr.io/arosbio/[IMAGE-NAME]:[TAG]
# Copy your model to the default model-path
COPY [your-model-file] /var/lib/jetty/model.jar

This assumes that you have the model in the same directory as your Dockerfile. Building your image is then easily performed by:

docker build -t [YOUR TAG] .

And then you can start it up by:

docker run -p 80:8080 -u jetty [YOUR TAG]

The service exposes port 8080 in the container and the docker run command maps it to port 80 on your local machine - which allows you to check that it works using your standard web browser;

http://localhost/api/v2/modelInfo
# or:
http://localhost/api/v2/health

Note 1: Tag names follows the same convention as the profiles in the maven project, i.e. tags that have a suffix -full are built using the maven profile full - i.e. including the draw UI and Swagger UI. The tags that do not include the suffix are built using the default profile which do not include these additional resources.

Note 2: There are alternative ways of deploying your models using these Docker images, e.g. you do not have to build an image containing the model and can instead mount a filesystem and give a URI from where to load the model. Here's an example of how to deploy a conformal classification model in your current directory:

docker run -p 80:8080 \
    --name clf-model \
    --mount type=bind,source="$(pwd)",target=/app/data \
    --env MODEL_FILE=/app/data/clf-model.jar \
    ghcr.io/arosbio/cpsign-cp-clf-server:latest

Here the --mount argument mounts the current directory to the directory /app/data inside the container. In this example we assume that there is a conformal classification model called clf-model.jar in the current directory, which will then have the URI /app/data/clf-model.jar inside the container. The --env argument specifies an environment variable (MODEL_FILE) for the container which the server looks at when initiating the server (overriding the default model location).

License

The CPSign program is dual licensed, see more in the CPSign base repo. This extension is published under the GNU General Public License version 3 (GPLv3).

Origin filter

CORS headers are now set in the file CORSFilter and the Access-Control-Allow-Origin header can be configured at server startup by passing either an environment variable or system property ALLOWED_ORIGIN with the desired content. When no such information is given, the default is to set Access-Control-Allow-Origin: *.

Custom build

When custom you have custom requirements, e.g. wish to include the draw UI, here comes more details that required to build the services yourself.

Repo layout

This repo contains a Maven Parent pom in the root and three service-implementations (cp_classification, cp_regression and vap_classification) which handles the three types of CPSign predictive models that can be deployed. To reduce the code duplication a separate java project (service_utils) is used for grouping common utility function and models, so that updates can be applied more easily and pushed to all concrete model services. The folder web_res in the root is used for common resources such as the Swagger UI and some common code for the draw GUI. These resources are optionally included in the final WAR application (see more below).

Building and deployment

Building the WAR files are done either from one of the child modules (building a single service WAR) or from the root (parent) module (to build all three WARs). There are two profiles (full and thin), where the full profile is active by default. The full profile bundles in static files for serving the Swagger UI and draw GUI as part of the WAR application and thus make each service slightly larger. The thin profile excludes these static components and thus create more lightweight services (difference is around 7MB).

# Build the full:
mvn package -DskipTests
# Build the using the thin profile
mvn package -DskipTests -P thin

The generated WAR files are saved in each service /target folder using the default <service-name>-<version>.war name (but the WAR name can be modified by adding the optional argument -DfinalName=<your name>). This can simply be dropped in and deployed in a Jetty server together with the model you wish to deploy, currently tested and built for Jetty 11.0.18.

Start up

When starting a prediction service the server will need the prediction model that should be used, this can be injected and specified in three different ways and handled (in order):

  1. Firstly, the code will check if the environment variable MODEL_FILE is set, if it is set it will handle this as a URI to a model. If the content of MODEL_FILE either is not pointing to a model, or if the model is non-compatible in any way, the setup will fail.
  2. Secondly, the code with check if a JVM property MODEL_FILE was set and try to load this as a URI pointing to a model. Setup will fail in case the URI is invalid or points to an invalid/incompatible file/model.
  3. As a final step, the server will check in the location /var/lib/jetty/model.jar. If there is no model the setup will fail.

In case the setup fails, the web server will still be running but all calls to the REST endpoints should return a HTTP 503 error code.

Docker build

If you wish to deploy your services using Docker you can look at our multi-stage Dockerfile which builds all three server types and can be used for both building the thin (default) or full server version. Switching between these are achieved by passing the argument --build-arg SERVICE_TYPE=full for e.g. using the full profile instead. Note that the base jetty image can be replaced with the alpine version in order to make the container slimmer, but as it do not work for all platforms we have opted for the default jetty image for version 11.0.18 for our base docker images - but you might be able to slim down your services by testing out the alpine version.

Swagger UI

The prediction services are documented by an OpenAPI definition, which is located at <server-url>:<optional-port>/api/openapi.json once the REST service is up and running. This is however rater hard for a human to read, why we recommend to use the Swagger UI for an easier way to view it. Users can either download and run the Swagger UI locally or from another web service by pointing to your server-URL, or for convenience we make it possible to add the Swagger UI static files to the WAR files themselves in which case the Swagger UI is accessible within the service from the root URL of the service, e.g. http://localhost:8080 in case you run it locally using e.g. the start_test_server.sh script and use the default settings.

Draw GUI

Each service can optionally include a GUI where molecules can be drawn in one window and the atom gradient can be viewed interactively, i.e. see what parts of the molecule had the greatest impact on the prediction. If the WAR file is built with the profile full this interface is accessible from the URL <service-url>:<optional-port>/draw and looks e.g. in this way;

Draw GUI

In case you run the start_test_server.sh this web page is accessible from http://localhost:8080/draw/.

Check service health

The services all has a REST endpoint at <service-URL>:<optional-port>/api/v2/health that returns HTTP 200 if everything is OK or 503 if something is wrong.

Developer info

This section is intended for developers.

Software testing

Requirements

For all tests to run the services each need a valid model. For each service the tests rely on having a valid model (for the given service type) at location src/test/resources/test-model.cpsign. For the cp_classification service there's also an alternative test-suite for TCP which requires a TCP-model in the location src/test/resources/test-model-tcp.cpsign

Unit-testing utility code

There are a few unit-tests, i.e. tests that do not rely on having a web server running. These are executed by running mvn test from the root directory.

Integration tests

The integration tests requires a running REST service. These tests are run using mvn verify but in order for this to work, the environment variable MODEL_FILE must be set (and pointing to the correct model for each service type). Thus, these tests do not work by running mvn verify from the maven parent - instead each service type must be tested one-by-one. To make this simpler there's a run_IT_tests.sh script within each service module that sets these variables and runs the integration tests (i.e. pointing to the src/test/resources/test-model.cpsign model within each child module, then runs mvn verify). There is also a 'master script' in the root directory that calls each service test-script so all integration tests can be run from a single script.

User-interactive testing

To facilitate easy interactive testing, each of the services includes a shell script (start_test_server.sh) that sets the environment variable to point to the test-model (see Requirements) and spins up a jetty server for the user to try it out.

TODOs:

  • update draw GUI to use ketcher drawer. This is more frosting on the cake, will be postponed for now.
  • Add Dockerfile for how to start a server
  • add config to startup to include/exclude the draw GUI and swagger UI files - to make it possible to have as small services as possible
  • look over updates on jetty, swagger etc
  • refactor the "draw" thing as a separate folder that is pulled in during maven build
  • CORSFilter update?
  • Remove jackson included both in cpsign and from swagger stuff
  • Set up logging properly - need a logging config file
  • add Prediction-starter servlet thingy to web.xml again
  • local repo for CPSign in the root of git-repo, not for each individual Eclipse-project

Implementation details

  • Using JAX-RS annotations for the REST api
  • Using Jersey runtime "environment" and tooling
  • Using Jetty server (due to being light weight)
  • Using Jackson annotation such as @JsonProperty("code") on the models, which Swagger-core picks up and puts in the OpenAPI definition and for Jackson conversion POJO to json/xml

About

Repository with code for microservices that runs cpsign models and makes it possible to do predictions

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors 4

  •  
  •  
  •  
  •