This repository hosts deployment artifacts for the reference architecture "Real-time model deployment with R". You can use these artifacts to deploy a containerised predictive service in Azure.
The workflow in this repository builds a sample machine learning model: a random forest for housing prices, using the Boston housing dataset that ships with R. It then builds a Docker image with the components to host a predictive service:
- The base R runtime.
- The model object plus any packages necessary to use it (randomForest in this case).
- The Plumber package for exposing R code as an API.
- A script that is run on container startup to create the API.
This image is pushed to a Docker registry hosted in Azure, and then deployed to a Kubernetes cluster, also in Azure.
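As an illustration of how Plumber exposes R code as an API, here is a minimal sketch of an annotated API file. It is not the repository's actual script: the `/score` path, the `score` helper and the `df` parameter are assumptions made for this example, and a linear model stands in for the saved random forest so that the sketch runs without extra packages.

```r
# Stand-in model so this sketch runs on its own. In the image, the saved
# random forest would be loaded with readRDS() instead (assumption).
model <- lm(medv ~ ., data = MASS::Boston)

# Plain R helper: return numeric predictions for a data frame of predictors.
score <- function(df) {
  as.numeric(predict(model, newdata = as.data.frame(df)))
}

#* Return predicted median house values
#* @param df rows of predictor values, sent in the JSON request body
#* @post /score
function(df) {
  score(df)
}
```

The `#*` comments are Plumber annotations; when run as ordinary R code they are ignored, so `score(MASS::Boston[1:3, ])` can be called directly to check the logic. A startup script could then serve the file with something like `plumber::plumb("api.R")$run(host = "0.0.0.0", port = 8000)`, where the file name and port are placeholders.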
To use this repository, you will need the following:
- A recent version of R. It's recommended to use Microsoft R Open, although the standard R distribution from CRAN will work perfectly well.
- The AzureContainers package, version 1.2.0 or later, for working with containers in Azure.
- The bcrypt package, which is used to generate the service authentication password.
- An Azure subscription.
Edit the file resource_specs.R to contain the following:
- Your Azure Active Directory tenant. This can be either your directory name or a GUID.
- Your subscription ID.
- The name of the resource group that will hold the resources created. The resource group will be created if it does not already exist.
- The location of the resource group. For the list of regions where AKS is available, see the Azure documentation.
- The names for the ACR and AKS resources to be created. The name of the AKS resource, along with its location, will also be used for the domain name label of the cluster.
- The number of nodes and node VM size for the AKS cluster.
- Your email address. This is used to obtain a TLS certificate from Let's Encrypt.
- A (generic) username and password for the predictive service.
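A filled-in file might look roughly like the sketch below. Every variable name and value here is a placeholder chosen for illustration; the names expected by the scripts are defined in the resource_specs.R that ships with the repository, so keep those and change only the values.

```r
# Hypothetical settings: variable names and values are placeholders only.
tenant    <- "mytenant"                              # AAD directory name or GUID
sub_id    <- "00000000-0000-0000-0000-000000000000"  # subscription ID
rg_name   <- "deployrg"                              # resource group (created if missing)
rg_loc    <- "australiaeast"                         # a region where AKS is available
acr_name  <- "deployacr"                             # container registry name
aks_name  <- "deployaks"                             # AKS cluster name, also used in the domain label
num_nodes <- 2                                       # number of cluster nodes
node_size <- "Standard_DS2_v2"                       # VM size for each node
email     <- "me@example.com"                        # used for the Let's Encrypt certificate
user      <- "modeluser"                             # generic service username
password  <- "change-me"                             # generic service password
```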
The script 00_train_model.R trains a simple model (a random forest for house prices, using the Boston dataset), and saves the model object to a .RDS file. This step is optional, as the repository already contains a suitable model object.
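The training step can be sketched as follows. This is an approximation rather than the script itself: the object name `bos_rf`, the file name, the seed and the number of trees are assumptions. The Boston data comes from the MASS package, which is distributed with R, and the randomForest package must be installed.

```r
library(randomForest)

# Boston housing data from MASS; medv is the median home value
data(Boston, package = "MASS")

# Fit a random forest that predicts medv from all other columns
set.seed(42)  # arbitrary seed, so repeated runs give the same forest
bos_rf <- randomForest(medv ~ ., data = Boston, ntree = 100)

# Save the fitted model; a temporary path is used here for illustration
model_path <- file.path(tempdir(), "model.rds")
saveRDS(bos_rf, model_path)

# Reload the model and score a few rows to confirm the round trip works
restored <- readRDS(model_path)
preds <- predict(restored, Boston[1:5, ])
```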
The script 01_create_resources.R creates the necessary Azure resources for the deployment. Note that creating an AKS cluster can take several minutes.
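For orientation, resource creation with AzureRMR and AzureContainers might look like the sketch below. It is not the repository's script: the function name `create_resources`, its argument names and the pool name "pool1" are assumptions. The code is wrapped in a function so that nothing touches Azure until it is called with real values.

```r
# Sketch only: calling this function needs the AzureRMR and AzureContainers
# packages, a valid Azure login, and will create billable resources.
create_resources <- function(tenant, sub_id, rg_name, rg_loc,
                             acr_name, aks_name, num_nodes, node_size) {
  # Loading AzureContainers adds the create_acr/create_aks methods
  # to AzureRMR resource group objects
  stopifnot(requireNamespace("AzureContainers", quietly = TRUE))

  # Authenticate (this may open a browser window) and select the subscription
  az  <- AzureRMR::create_azure_login(tenant = tenant)
  sub <- az$get_subscription(sub_id)

  # Reuse the resource group if it exists, otherwise create it
  rg <- if (sub$resource_group_exists(rg_name)) {
    sub$get_resource_group(rg_name)
  } else {
    sub$create_resource_group(rg_name, location = rg_loc)
  }

  # Container registry for the image, Kubernetes cluster for the service
  acr <- rg$create_acr(acr_name)
  aks <- rg$create_aks(
    aks_name,
    agent_pools = AzureContainers::aks_pools("pool1", num_nodes, size = node_size)
  )

  list(resource_group = rg, acr = acr, aks = aks)
}
```

Returning the three objects in a list lets later steps reuse them without logging in again.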
The script 02_install_ingress.R installs the Traefik reverse proxy on the Kubernetes cluster and sets the cluster's domain name.
The script 03_deploy_service.R pushes the model image to Azure and deploys the predictive service.
The script 04_test_service.R tests the service by sending a request to the API endpoint, so that you can check that the response is as expected.
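A request of this kind could be sent with the httr package, roughly as sketched below. The function name `score_remote`, the use of HTTP basic authentication, and the `df` field in the JSON body are assumptions for illustration; check the test script for the actual request format.

```r
# Sketch only: requires the httr package and a deployed service when called.
score_remote <- function(endpoint, user, password, df) {
  # POST the rows as JSON, authenticating with the service credentials
  resp <- httr::POST(
    endpoint,
    httr::authenticate(user, password),
    body = list(df = df),
    encode = "json"
  )

  # Raise an R error for any 4xx or 5xx response
  httr::stop_for_status(resp)

  # Parse the JSON response into R vectors
  httr::content(resp, simplifyVector = TRUE)
}
```

Comparing the returned values against `predict()` run locally on the same rows is one way to confirm the deployed model matches the saved one.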