# Using R models in Azure Machine Learning

This notebook gives a quick demonstration of importing a trained R model into Azure Machine Learning (AML) and registering it there. This integration makes it easier to author complex machine learning inference pipelines that combine multiple modeling technologies.

To bring an R model into AML, the following pipeline may be used:
- Train the R model in R (for example, in RStudio)
- Save the trained model as an `.rds` file, along with a list of the R packages it depends on (e.g. e1071)
- Wrap the R model as an MLflow model using the rpy2 Python library
- Register the wrapped MLflow model with your AML workspace

## Install R on your computer

Install R on your computer from one of the repositories listed on https://www.r-project.org/

Then set the R_HOME environment variable to the directory of your local R installation.

In [None]:
%env R_HOME=<Your path, example: C:\Program Files\R\R-4.3.3>
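
rpy2 relies on R_HOME to locate R, so it can help to confirm the variable before importing rpy2. The following is a minimal sketch using only the standard library; the helper name `check_r_home` is hypothetical and not part of rpy2 or this notebook, and it only checks that the variable is set and names an existing directory.

```python
import os
from pathlib import Path


def check_r_home(environ=os.environ):
    """Return R_HOME as a Path if it is set and is an existing directory.

    Hypothetical helper: it does not verify that the directory
    actually contains a working R installation.
    """
    r_home = environ.get("R_HOME")
    if not r_home:
        raise RuntimeError("R_HOME is not set; point it at your R installation")
    path = Path(r_home)
    if not path.is_dir():
        raise RuntimeError(f"R_HOME does not point to a directory: {r_home}")
    return path


try:
    print("R_HOME looks usable:", check_r_home())
except RuntimeError as err:
    print("Problem with R_HOME:", err)
```

If the check reports a problem, re-run the `%env` cell above with the correct path before continuing.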

## Install rpy2

Install rpy2 using pip

In [None]:
%pip install rpy2

## Convert the R model to MLflow

Define the model wrapper

In [None]:
# Imports for the MLflow wrapper and the rpy2 bridge to R
from sys import version_info
import mlflow.pyfunc
import numpy as np
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri

PYTHON_VERSION = "{major}.{minor}.{micro}".format(major=version_info.major,
                                                  minor=version_info.minor,
                                                  micro=version_info.micro)

# Base path of the saved R model; the .rds and .dep files share this prefix
r_model_path = "./svm_model/artifact"

artifacts = {
    "model_rds_path" : "{}.rds".format(r_model_path),
    "model_dep_path" : "{}.dep".format(r_model_path)
}

r = robjects.r
numpy2ri.activate()

# create wrapper
class MLFlowWrapper(mlflow.pyfunc.PythonModel):

    def load_context(self, context):

        self.model = r.readRDS(context.artifacts["model_rds_path"])

        with open(context.artifacts["model_dep_path"], "rt") as f:
            model_dep_list = [importr(dep.strip())
                              for dep in f.readlines()
                              if dep.strip()!='']

        return self
        
    
    def predict(self, context, X):
        # MLflow passes the context as the first argument; it is not used here
        if getattr(self, "model", None) is None:
            raise Exception("There is no Model")

        if not isinstance(X, np.ndarray):
            X = np.array(X)

        return np.array(r.predict(self.model, X))
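
The `load_context` method above reads the `.dep` artifact line by line and imports each non-blank entry as an R package. The notebook does not show how that file is produced, so the sketch below assumes a plain text file with one package name per line, which is the format the loader implies. The helper names `write_dep_file` and `read_dep_file` are hypothetical.

```python
import tempfile
from pathlib import Path


def write_dep_file(path, packages):
    """Write one R package name per line (assumed .dep format)."""
    Path(path).write_text("\n".join(packages) + "\n", encoding="utf-8")


def read_dep_file(path):
    """Return the package names, skipping blank lines as load_context does."""
    with open(path, "rt", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Round trip in a temporary directory so no real artifact is touched
with tempfile.TemporaryDirectory() as tmp:
    dep_path = Path(tmp) / "artifact.dep"
    write_dep_file(dep_path, ["e1071"])
    packages = read_dep_file(dep_path)

print(packages)
```

In the wrapper, each name returned this way would be passed to `importr`, so every listed package must already be installed in the R library that rpy2 uses.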

Test your MLflow wrapper logic:

In [None]:
class TestContext:
    def __init__(self, _artifacts) -> None:
        self.artifacts = _artifacts

In [None]:
test_context = TestContext(artifacts)
test_context.artifacts

In [None]:
wrapped_model = MLFlowWrapper()
wrapped_model.load_context(test_context)
test_data = [[5.1, 3.5, 1.4, 0.2], [5.9, 3.0, 5.1, 1.8]]
wrapped_model.predict(None,  # the context argument is not used by the wrapper
                      test_data)



Define model dependencies

In [None]:
conda_env = {
    'channels': ['defaults'],
    'dependencies': [
      'python={}'.format(PYTHON_VERSION),
      'pip',
      {
        'pip': [
          'mlflow',
          'rpy2',
        ],
      },
    ],
    'name': 'rpy2_env'
}


## Save the wrapped R model in MLflow format

In [None]:
mlflow_pyfunc_model_path = "r_mlflow_pyfunc_rpy2"
mlflow.pyfunc.save_model(path=mlflow_pyfunc_model_path, python_model=MLFlowWrapper(), conda_env=conda_env, artifacts=artifacts)


## Register the MLflow model with Azure Machine Learning

In [None]:
# Import the necessary libraries
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Authenticate
credential = DefaultAzureCredential()

# Get a handle to the workspace
ml_client = MLClient(
    credential=credential,
    subscription_id = "781b03e7-6eb7-4506-bab8-cf3a0d89b1d4",
    resource_group_name = "antonslutsky-rg",
    workspace_name = "gpu-workspace",
)



### Register MLflow model

In [None]:

# Provide the model details, including the
# path to the model files, if you've stored them locally.
mlflow_model = Model(
    path=mlflow_pyfunc_model_path,
    type=AssetTypes.MLFLOW_MODEL,
    name=mlflow_pyfunc_model_path,
    description="MLflow Model created from local files.",
)

# Register the model
ml_client.models.create_or_update(mlflow_model)

## Deploy the registered model as a real-time endpoint

### Create online endpoint

In [None]:
%%writefile endpoint.yml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: r-mlflow-pyfunc-rpy2
auth_mode: key

In [None]:
!az ml online-endpoint create --file endpoint.yml

### Define a scoring script that uses rpy2

In [None]:
%%writefile ./src/score.py
import os
import logging
import json
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
from rpy2.robjects.packages import importr

r = robjects.r
numpy2ri.activate()


class Model(object):

    def __init__(self):
        self.model = None

    def load(self, path):
        model_rds_path = "{}.rds".format(path)
        model_dep_path = "{}.dep".format(path)
        
        utils = importr('utils')
        utils.install_packages('e1071')

        self.model = r.readRDS(model_rds_path)

        with open(model_dep_path, "rt") as f:
            model_dep_list = [importr(dep.strip())
                              for dep in f.readlines()
                              if dep.strip()!='']
            
            print("imported packages: ", model_dep_list)

        return self

    def predict(self, X):
    
        if self.model is None:
            raise Exception("There is no Model")
        
        if type(X) is not np.ndarray:
            X = np.array(X)

        pred = r.predict(self.model, X)

        return np.array(pred)

def init():
    global model
    model_path = os.path.join(
        os.getenv("AZUREML_MODEL_DIR"), "r_mlflow_pyfunc_rpy2/artifacts/artifact"
    )

    model = Model()
    model.load(model_path)


def run(raw_data):

    logging.info("model 1: request received")
    data = json.loads(raw_data)["data"]
    data = np.array(data)
    result = model.predict(data)
    logging.info("Request processed")
    return result.tolist()
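
The `run` function above expects a JSON document whose `data` key holds the feature rows. As a rough illustration, a request body for the two sample rows used earlier in this notebook could be built as follows. The helper names `build_request` and `parse_request` are hypothetical, and `parse_request` only mirrors the parsing step of `run`; it does not call the model.

```python
import json


def build_request(rows):
    """Serialize feature rows into the JSON shape that run() reads."""
    return json.dumps({"data": rows})


def parse_request(raw_data):
    """Mirror of the parsing step in run(): return the rows under "data"."""
    return json.loads(raw_data)["data"]


rows = [[5.1, 3.5, 1.4, 0.2], [5.9, 3.0, 5.1, 1.8]]
raw = build_request(rows)

# The scoring script would receive `raw` and recover the same rows
print(parse_request(raw) == rows)
```

Once the deployment is live, a file containing this JSON could be sent to the endpoint, for example with `az ml online-endpoint invoke --name r-mlflow-pyfunc-rpy2 --request-file <file>`.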


### Create deployment configuration

In [None]:
%%writefile deployment.yml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: red
endpoint_name: r-mlflow-pyfunc-rpy2
model: azureml:r_mlflow_pyfunc_rpy2@latest
environment: azureml:r_environment@latest
code_configuration:
  code: src
  scoring_script: score.py
instance_type: Standard_DS3_v2
instance_count: 1

In [None]:
!az ml online-deployment create --file deployment.yml --skip-script-validation --all-traffic
