<h1 align="center">
  <a href="https://uptrain.ai">
    <img width="300" src="https://user-images.githubusercontent.com/108270398/214240695-4f958b76-c993-4ddd-8de6-8668f4d0da84.png" alt="uptrain">
  </a>
</h1>

<h1 style="text-align: center;">Monitoring Concept Drift on a Binary Classification Model</h1>

In this notebook, we use the UpTrain package to identify concept drift, i.e. a degradation in the model's performance over time. We use DDM (Drift Detection Method) to detect it.

### Install the required packages for this example

In [1]:
!pip install torch imgaug



In [2]:
import os
import subprocess
import sys
import zipfile

import numpy as np
import torch
import uptrain

# Add the parent directory to the import path so that helper_files can be found
sys.path.insert(0, '..')
from helper_files import read_json, write_json, KpsDataset



Let's first download the dataset from the remote server and define the paths to the testing and annotation files.

In [3]:
data_dir = "data"
remote_url = "https://oodles-dev-training-data.s3.amazonaws.com/data.zip"
orig_training_file = 'data/training_data.json'
if not os.path.exists(data_dir):
    # Most Linux distributions have wget installed by default.
    # On macOS, try to install it with Homebrew. subprocess.call does not raise
    # on failure, so check the exit code before reporting success.
    wget_installed_ok = subprocess.call("brew install wget", shell=True, stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
    if wget_installed_ok == 0:
        print("Successfully installed wget")
    try:
        if not os.path.exists("data.zip"):
            file_downloaded_ok = subprocess.call("wget " + remote_url, shell=True, stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
            # A zero exit code means wget finished successfully
            if file_downloaded_ok == 0:
                print("Data downloaded")
        with zipfile.ZipFile("data.zip", 'r') as zip_ref:
            zip_ref.extractall("./")
        full_training_data = read_json(orig_training_file)
        np.random.seed(1)
        np.random.shuffle(full_training_data)
        reduced_training_data = full_training_data[0:1000]
        write_json(orig_training_file, reduced_training_data)
        print("Prepared Example Dataset")
        os.remove("data.zip")
    except Exception as e:
        print(e)
        print("Could not load training data")

real_world_test_cases = 'data/real_world_testing_data.json'
golden_testing_file = 'data/golden_testing_data.json'
annotation_args = {'master_file': 'data/master_annotation_data.json'}

inference_batch_size = 16
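
The download step above shells out to `wget`, which may not be installed on every machine. As a rough alternative, the same download-and-extract step could be sketched with the Python standard library alone. The helper name `fetch_and_extract` is hypothetical and is not part of UpTrain or `helper_files`.

```python
import os
import urllib.request
import zipfile


def fetch_and_extract(url, archive_path, dest_dir):
    """Download `url` to `archive_path` unless it already exists, then unzip it.

    Hypothetical helper for illustration; not part of UpTrain or helper_files.
    Returns the list of member names stored in the archive.
    """
    if not os.path.exists(archive_path):
        # urlretrieve raises an exception on network or HTTP errors,
        # so a failed download is not silently reported as a success.
        urllib.request.urlretrieve(url, archive_path)
    with zipfile.ZipFile(archive_path, "r") as zip_ref:
        zip_ref.extractall(dest_dir)
        return zip_ref.namelist()
```

With the names used in this notebook, the call would look like `fetch_and_extract(remote_url, "data.zip", "./")`.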

Next, we train a deep neural network on the training data.

In [4]:
from helper_files import get_accuracy_torch, train_model_torch, BinaryClassification
train_model_torch('data/training_data.json', 'version_0')

Next, we measure the model's accuracy on the golden testing dataset.

In [5]:
get_accuracy_torch(golden_testing_file, 'version_0')

Evaluating model: version_0  on  15731  data-points


0.9253702879664357
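
The body of `get_accuracy_torch` is not shown in this notebook. As a minimal sketch of what a binary-classification accuracy of this kind usually measures, the function below thresholds each logit the same way the inference loop later does (sigmoid, then round) and counts matches against the labels. The function name and the sample values are made up for illustration.

```python
def binary_accuracy(logits, labels):
    """Fraction of examples whose thresholded prediction matches the label.

    Illustrative only; this is not the helper_files implementation.
    """
    correct = 0
    for logit, label in zip(logits, labels):
        # sigmoid(logit) > 0.5 exactly when logit > 0, so the sign of the
        # logit gives the rounded sigmoid output directly.
        pred = 1 if logit > 0 else 0
        correct += int(pred == label)
    return correct / len(labels)


# Made-up logits and labels: three of the four predictions are correct.
accuracy = binary_accuracy([2.0, -1.0, 0.5, -3.0], [1, 0, 0, 0])
print(accuracy)  # 0.75
```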

Let's define the UpTrain config to detect concept drift, using DDM as the detection algorithm.

In [6]:
cfg = {
    "checks": [{
        'type': uptrain.Monitor.CONCEPT_DRIFT,
        'algorithm': uptrain.DataDriftAlgo.DDM
    }],
    "data_identifier": "id",
    "logging_args": {"st_logging": True},
    "retraining_folder": "uptrain_smart_data_data_integrity",
}
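
The config above selects DDM but does not show what the method computes. The sketch below follows the commonly described DDM rule: track the running error rate `p` and its standard deviation `s = sqrt(p * (1 - p) / n)`, remember the point where `p + s` was lowest, and flag a warning or a drift when `p + s` rises more than 2 or 3 of those minimum standard deviations above the minimum error rate. This is not UpTrain's implementation. The class name, the `warm_up` parameter, and the strict comparison are assumptions, and the sketch does not reset its statistics after a drift.

```python
import math


class SimpleDDM:
    """Minimal Drift Detection Method over a stream of 0/1 prediction errors.

    Illustrative sketch only; not the UpTrain implementation.
    """

    def __init__(self, warm_up=30, warning_level=2.0, drift_level=3.0):
        self.warm_up = warm_up              # samples to see before testing
        self.warning_level = warning_level  # multiples of s_min for a warning
        self.drift_level = drift_level      # multiples of s_min for a drift
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, is_error):
        """Feed one outcome (True if the prediction was wrong); return the state."""
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n               # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)  # its standard deviation
        if self.n < self.warm_up:
            return "stable"
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s      # lowest p + s seen so far
        if p + s > self.p_min + self.drift_level * self.s_min:
            return "drift"
        if p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


# Synthetic stream: 10% errors for 100 samples, then every prediction is wrong.
detector = SimpleDDM()
stream = [(i % 10 == 0) for i in range(1, 101)] + [True] * 50
states = [detector.update(is_error) for is_error in stream]
print(states.index("warning"), states.index("drift"))
```

On this synthetic stream the detector stays stable during the first 100 samples, then passes through the warning state before reporting drift once the error rate climbs.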

In [7]:
framework_torch = uptrain.Framework(cfg)

model_dir = 'trained_models_torch/'
model_save_name = 'version_0'
real_world_dataset = KpsDataset(
    real_world_test_cases, batch_size=inference_batch_size, is_test=True
)
model = BinaryClassification()
model.load_state_dict(torch.load(model_dir + model_save_name))
model.eval()

gt_data = read_json(annotation_args['master_file'])
all_gt_ids = [x['id'] for x in gt_data]

for i,elem in enumerate(real_world_dataset):

    # Do model prediction
    inputs = {"kps": elem[0]["kps"], "id": elem[0]["id"]}
    x_test = torch.tensor(inputs["kps"]).type(torch.float)
    test_logits = model(x_test).squeeze() 
    preds = torch.round(torch.sigmoid(test_logits)).detach().numpy()
    idens = framework_torch.log(inputs=inputs, outputs=preds)

    # Attach ground truth for accuracy evaluation
    this_elem_gt = [gt_data[all_gt_ids.index(x)]['gt'] for x in elem[0]['id']]
    framework_torch.log(identifiers=idens, gts=this_elem_gt)

Deleting the folder:  uptrain_logs

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.6.92:8501
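
The inference loop above looks up each ground-truth label with `all_gt_ids.index(x)`, which scans the whole list for every item. A possible alternative, sketched here with made-up records, is to build a dictionary from id to label once and index into it. This assumes the ids are hashable plain values and that each id appears only once; with duplicate ids, `list.index` returns the first match while the dictionary keeps the last.

```python
def build_gt_lookup(gt_data):
    """Map each record's 'id' to its 'gt' label in a single pass."""
    return {record["id"]: record["gt"] for record in gt_data}


# Made-up records in the shape the loop above expects ('id' and 'gt' keys).
gt_data = [
    {"id": 11, "gt": 1},
    {"id": 12, "gt": 0},
    {"id": 13, "gt": 1},
]
gt_by_id = build_gt_lookup(gt_data)

# Ground truth for one batch of ids, in batch order.
batch_ids = [13, 11]
this_elem_gt = [gt_by_id[x] for x in batch_ids]
print(this_elem_gt)  # [1, 1]
```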


