Pet project to explore the "Model-as-a-Service" concept via API creation. Docker image available here.
Pet project for creating a simple Naïve Bayes classifier with the Titanic data set (as an example) and deploying it as an API through Docker and a container-hosting service.
This project is heavily dependent on two R packages:
- plumber, for creating the API with R (see more here)
- mlr, for creating and using the model (see more here)
HTTPS can be configured through the hosting platform (tested with Microsoft Azure) instead of burdening the Docker container with a web server. However, it is straightforward to integrate an Apache server with custom certificates. See the security note below for more information.
It is assumed that the model is created locally by sourcing createModel.R. The API itself needs only the final model and one row of example data (which contains all relevant metadata), which are both stored as
.rds files inside the
model folder, along with the parameter validation function contained in
valParams.R, already inside the
Running the API locally (without Docker)
To test the API locally (without the Docker container), assuming that both the plumber and mlr packages are installed, all that is needed is to source the
runAPI.R script, which will automatically plumb
You should be able to find the basic introductory page at http://localhost:8000.
The /titanic endpoint handles the actual prediction. The parameters to be passed into the model are included in the request itself for simplicity. The pattern is
Running the API locally (with Docker)
If you've built the Docker container (or pulled it from here), then you don't need to have an R installation - that's already taken care of via the container, with all the necessary packages installed.
- run locally with:
docker run --rm --user docker -p 8000:8000 titanic_api
- and see results on http://localhost:8000/
For an example query and response, see the previous section.
Running the API from a hosted container
Simply navigate to the URL for the container. That should load the introductory page. Appending
/titanic?... as above should result in the expected behaviour (see the running locally section), as the API maps to the endpoints directly, without any need for further manual configuration or routing.
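Because the same endpoint is exposed locally and on a hosted container, the request URL can be assembled in the same way for both. The following is a minimal sketch in base R, not the project's own code: `build_titanic_url` is a hypothetical helper, and the parameter names `sex` and `age` are placeholders rather than the API's documented parameters.

```r
# Hypothetical helper: builds a query URL for the /titanic endpoint.
build_titanic_url <- function(base_url, params) {
  # Percent-encode each value so spaces and symbols survive the query string
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste(names(params), encoded, sep = "=", collapse = "&")
  # Drop any trailing slash from the base URL before appending the endpoint
  paste0(sub("/+$", "", base_url), "/titanic?", query)
}

# Parameter names here are placeholders, not the API's documented ones
url <- build_titanic_url(
  "http://localhost:8000/",
  list(sex = "female", age = 29)
)
print(url)
```

With the API running, the response could then be fetched with, for example, `readLines(url)`. Swapping the base URL for the hosted container's address should be the only change needed.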
Data was acquired from Stanford's CS109 publicly accessible page here.
A note regarding security
It is assumed that the container would be online behind other security measures such as user authentication and HTTPS. The container itself validates the parameters passed to it (thus avoiding the most obvious security breach) but does not implement other security features. However, such measures are easily implemented and usually already in place. Container-hosting services may also offer their own solutions (as mentioned above, tested with Microsoft Azure).
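To illustrate what parameter validation of this kind can look like, here is a minimal base R sketch. It is not the content of valParams.R: the function name `validate_params`, the field names `sex` and `family`, and the allowed values are all assumptions made for the example.

```r
# Hypothetical validator; field names and allowed values are assumptions.
validate_params <- function(params, allowed_sex = c("male", "female")) {
  # Query-string values arrive as character strings, so check type and length
  if (!is.character(params$sex) || length(params$sex) != 1 ||
      !(params$sex %in% allowed_sex)) {
    stop("sex must be one of: ", paste(allowed_sex, collapse = ", "))
  }

  # Coerce to numeric; anything that is not a number becomes NA
  family <- suppressWarnings(as.numeric(params$family))
  if (length(family) != 1 || is.na(family) ||
      family < 0 || family != round(family)) {
    stop("family must be a non-negative integer")
  }

  # Return values in the types the model would expect
  list(
    sex = factor(params$sex, levels = allowed_sex),
    family = as.integer(family)
  )
}

# A valid request passes and is converted to typed values
str(validate_params(list(sex = "female", family = "2")))

# An invalid request is rejected before it reaches the model
result <- tryCatch(
  validate_params(list(sex = "female", family = "2.5")),
  error = function(e) conditionMessage(e)
)
print(result)
```

Rejecting anything outside a fixed set of levels or a numeric range means arbitrary strings from the query never reach the model or any downstream code.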
If needed, HTTPS can be implemented in the container itself by including an Apache server and the necessary certificates. For an example of such an implementation, see T-Mobile's repository.
Naïve Bayes probabilities derived from the Titanic dataset
Conditional probabilities (categorical features)
The Naïve Bayes model assumes a Gaussian distribution for each non-categorical feature.
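As a rough illustration of how the Gaussian assumption enters the prediction, the sketch below computes class posteriors for a single numeric feature in base R. The priors, means and standard deviations are made-up values for demonstration only, not the estimates fitted in this project, and `gaussian_posterior` is a hypothetical helper.

```r
# Illustrative numbers only; not the fitted values from this project
prior <- c(died = 0.6, survived = 0.4)
age_mean <- c(died = 30, survived = 28)
age_sd <- c(died = 14, survived = 15)

gaussian_posterior <- function(x, prior, mean, sd) {
  # Class-conditional likelihood of x under a Gaussian for each class
  likelihood <- dnorm(x, mean = mean, sd = sd)
  # Bayes' rule: prior times likelihood, normalised over the classes
  unnormalised <- prior * likelihood
  unnormalised / sum(unnormalised)
}

posterior <- gaussian_posterior(25, prior, age_mean, age_sd)
print(round(posterior, 3))
```

With several features, the naïve independence assumption means the per-feature likelihoods (Gaussian densities for numeric features, conditional probabilities for categorical ones) are multiplied together before normalising.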
Family members on board
While this feature is clearly ordinal (with equal steps of one person), it has been left as numeric so that the model can produce predictions for values not seen in the training data. Input validation ensures that an integer is passed to the model.