# Retraining Policies

Retraining closes the loop of the model lifecycle. It helps ensure that the best-performing model, trained on the latest available data, is always ready to deploy.

DataRobot provides [broad support for retraining policies](https://docs.datarobot.com/en/docs/workbench/nxt-console/nxt-mitigation/nxt-retraining.html#retraining), which automate the process of creating a new model (and deploying it if needed). First, some definitions.

A retraining policy has three parts:

1. **Trigger:** Determines when the policy runs. Three triggers are available: Scheduled (for example, "every Friday"), Accuracy Status (when accuracy falls below a threshold), and Drift Status.
2. **Selection:** DataRobot creates an AutoML/AutoTS project that trains a series of models. The selection criteria determine which of those models becomes the new retrained model.
3. **Action:** Determines what happens to the candidate model once it has been retrained and evaluated. The options are to replace the model in the deployment, add it as a challenger that runs in parallel, or simply save it until needed.
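
As a rough illustration of how these three parts show up in a policy request, the sketch below assembles a minimal payload. The helper `build_policy_body` is hypothetical (it is not part of the DataRobot client), and the field names and values are simply the ones used in the policy request later in this notebook. A real request will likely need more fields than this.

```python
def build_policy_body(name, trigger_type, selection_strategy, action):
    """Hypothetical helper: assemble a minimal policy payload from its three parts."""
    return {
        "name": name,
        "trigger": {"type": trigger_type},             # Trigger: when the policy runs
        "modelSelectionStrategy": selection_strategy,  # Selection: which model is chosen
        "action": action,                              # Action: what happens to that model
    }


# Values taken from the drift policy created later in this notebook.
example_body = build_policy_body(
    name="Example drift policy",
    trigger_type="data_drift_decline",
    selection_strategy="autopilot_recommended",
    action="create_challenger",
)
print(example_body)
```

Keeping the three parts as separate arguments makes it easier to see which part of a policy changes when, for example, the action is switched from adding a challenger to replacing the model.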


In addition to these three parts, a retraining policy can set options for the retraining project itself, such as the Autopilot mode, the feature list strategy, and the partitioning method.

Retraining policies are managed primarily through the DataRobot REST API. We will use the DataRobot Python API client to bootstrap authentication and make the REST API calls. You can access the REST API documentation through the "?" documentation app in DataRobot, which ensures you view the documentation version that matches your installed system.




In [None]:
# Create and save the DataRobot client (reading auth from environment variables)

import datarobot as dr

client = dr.Client()
deployment = dr.Deployment.get("6759e8aebd38a7fca6ba234a")  # Update with your deployment ID.
# Enable challengers so that a retrained model can be added as a challenger.
deployment.update_challenger_models_settings(challenger_models_enabled=True)


Before creating a retraining policy, we specify the dataset, prediction environment, and user to use for retraining.

In [10]:
RAW_DATASET_ID = "675c98d9f2e4b4189bff25f5"
PREDICTION_SERVER = "67521300fe4b98000d28270f"
RETRAINING_USER_ID = deployment.owners['preview'][0]['id'] # User who created the deployment. 

body = {
  "datasetId": RAW_DATASET_ID,
  "predictionEnvironmentId": PREDICTION_SERVER,
  "retrainingUserId": RETRAINING_USER_ID
}

resp = client.patch(f"deployments/{deployment.id}/retrainingSettings", json=body)
print(resp.status_code) # Should be 204



204
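
To confirm the settings were stored, they can be read back from the same route. This is a minimal sketch that assumes a `GET` on `deployments/{id}/retrainingSettings` is available in your DataRobot version; the helper name `get_retraining_settings` is hypothetical, and the shape of the returned JSON is not assumed here, so check it against the REST API documentation.

```python
def get_retraining_settings(client, deployment_id):
    """Hypothetical helper: fetch a deployment's retraining settings as a dict.

    `client` is expected to behave like the object returned by dr.Client(),
    i.e. expose .get(path) returning a response with a .json() method.
    """
    resp = client.get(f"deployments/{deployment_id}/retrainingSettings")
    return resp.json()


# Usage in this notebook (requires a live DataRobot connection):
# settings = get_retraining_settings(client, deployment.id)
# print(settings)
```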


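The following creates a drift-based retraining policy for the deployment from the previous notebooks. When the data drift status declines to warning, the policy retrains and adds the Autopilot-recommended model as a challenger.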

In [None]:

POLICY_NAME = "MultiClass Drift Tracking Policy"

body = {
  "action": "create_challenger",
  "autopilotOptions": {
    "blendBestModels": False,
    "mode": "auto",
    "runLeakageRemovedFeatureList": True,
    "scoringCodeOnly": False,
    "shapOnlyMode": False
  },
  "description": None,
  "featureListStrategy": "informative_features",
  "modelSelectionStrategy": "autopilot_recommended",
  "name": POLICY_NAME,
  "projectOptions": {
    "cvMethod": "RandomCV",
    "holdoutPct": None,
    "metric": "Accuracy",
    "reps": None,
    "validationPct": None,
    "validationType": "CV"
  },
  "projectOptionsStrategy": "same_as_champion", # Any class aggregation settings are inherited here. 
  "trigger": {
    "minIntervalBetweenRuns": None,
    "schedule": {
      "dayOfMonth": [
        "*"
      ],
      "dayOfWeek": [
        "*"
      ],
      "hour": [
        0
      ],
      "minute": [
        0
      ],
      "month": [
        "*"
      ]
    },
    "statusDeclinesToFailing": False,
    "statusDeclinesToWarning": True,
    "statusStillInDecline": False,
    "type": "data_drift_decline"
  }
}
url = f"deployments/{deployment.id}/retrainingPolicies/"
resp = client.post(url, json=body)
print(resp.text)
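
The `schedule` block in the request above uses a cron-like layout in which every field is a list and `"*"` stands for any value; as written, it describes hour 0, minute 0 of every day. The sketch below builds the same structure with a small helper. `make_schedule` is a hypothetical name, not part of the DataRobot client, and how specific days of the week are encoded should be checked in the REST API documentation before passing anything other than `"*"`.

```python
def make_schedule(hour=0, minute=0, day_of_week="*", day_of_month="*", month="*"):
    """Hypothetical helper: build a schedule dict in the list-per-field shape used above."""

    def as_list(value):
        # Accept either a single value or a list/tuple of values.
        return list(value) if isinstance(value, (list, tuple)) else [value]

    return {
        "dayOfMonth": as_list(day_of_month),
        "dayOfWeek": as_list(day_of_week),
        "hour": as_list(hour),
        "minute": as_list(minute),
        "month": as_list(month),
    }


# Same schedule as in the policy body: every day at hour 0, minute 0.
daily_midnight = make_schedule()
print(daily_midnight)
```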


