# Overview

## Objective

This notebook provides an example of how to register a TensorFlow model in SAS Model Manager.

The goal is to manage the end-to-end workflow, including model deployment on Red Hat OpenShift.

## Assumption

At the time of writing, SAS does not provide a dedicated Python library for handling TensorFlow models.

However, we can still use some pzmm and sasctl functionality.

## Import and Setup

In [None]:
# General
import os
import shutil
import subprocess

# Data
import pandas as pd

# SAS Model Manager
import sasctl
from sasctl.services import model_repository, model_management
import sasctl.pzmm as pzmm

## Helpers

In [None]:
def setup(folder, modelname):
    """Create an empty model folder, removing any previous one with the same name."""
    model_folder = os.path.join(folder, modelname)
    # If the folder already exists, delete it first
    if os.path.exists(model_folder):
        shutil.rmtree(model_folder)
        print("Older", model_folder, "folder removed!")
    os.makedirs(model_folder)
    print("Directory", model_folder, "created!")
    return model_folder

def write_requirements(folder, filename):
    reqfile_path = os.path.join(folder, filename)
    with open(reqfile_path, "w") as f:
        # Write the output of `pip freeze` to the file and discard stderr
        returncode = subprocess.call(["pip", "freeze"], stdout=f, stderr=subprocess.DEVNULL)
    if returncode == 0:
        print("Requirements file created under", reqfile_path)
    else:
        print("pip freeze command failed!")

def get_output_variables(names, labels, eventprob):
    outputVar = pd.DataFrame(columns=names)
    outputVar[names[0]] = labels
    outputVar[names[1]] = eventprob
    return outputVar

def zip_folder(zipfolder, name, rmtree=False):
    folder_to_zip_path = os.path.join(zipfolder, name)
    shutil.make_archive(
        folder_to_zip_path,     # archive path without the .zip extension
        'zip',                  # the archive format - or tar, bztar, gztar
        root_dir=zipfolder,     # parent of the folder to zip
        base_dir=name)          # name of the folder to zip
    if rmtree:
        shutil.rmtree(folder_to_zip_path)  # remove the source folder once archived

def run_model_tracking():
    pass

## Define Variables

In [None]:
#Base
BASE_DIR_PATH = os.getcwd()
DATA_DIR_PATH = os.path.join(BASE_DIR_PATH, '../data')

# Data directories paths
TRAIN_DIR_PATH = os.path.join(DATA_DIR_PATH, 'train')

# Data file paths
TRAIN_DATA_PATH = os.path.join(TRAIN_DIR_PATH, 'train.csv')

# Models directory
MODELS_DIR = os.path.join(BASE_DIR_PATH, '../models')

# Deliverables directory
DELIVERS_DIR = os.path.join(BASE_DIR_PATH, '../deliverables')

# 1. Model Governance with SAS Model Manager Registry

In general, SAS Model Manager relies on a set of files to support model governance in the registry.

For example, a pickled model comes with the following files:

- Required

    1. requirements.json
    2. score.py
    3. model.pkl
    4. inputVar.json
    5. outputVar.json
    6. ModelProperties.json
    

- Optional

    7. train.py
    8. fileMetadata.json
    9. dmcas_fitstat.json
    10. dmcas_roc
    11. dmcas_lift

Because we are deploying on Red Hat OpenShift, we only need some of them for compliance.
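As a minimal sketch, a folder can be checked against a list of expected files before it is zipped. The helper name `missing_files` and the choice of which files to expect are assumptions made for illustration; the file names themselves come from the list above.

```python
import os
import tempfile

# File names taken from the list above. Which of them are actually needed
# for an OpenShift deployment is an assumption made for this sketch.
EXPECTED_FILES = ["inputVar.json", "outputVar.json", "ModelProperties.json"]

def missing_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in folder."""
    present = set(os.listdir(folder))
    return [name for name in expected if name not in present]

# Demo on a throwaway directory that contains only one of the files
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "inputVar.json"), "w").close()
    demo_missing = missing_files(tmp)

print(demo_missing)
```

Running the same check on the model folder just before archiving gives an early warning if a metadata file was not written.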

## Create Model Folder

### Setup Model folder

In [None]:
# working dir
WRK_DIR = setup(DELIVERS_DIR, 'champion')

In [None]:
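# TF SavedModel directory to zip.
# Assumes the exported SavedModel has already been copied into WRK_DIR
# and is the last entry returned by os.listdir.
TF_SAVEDMODEL_NAME = os.listdir(WRK_DIR)[-1]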
ZIP_TF_SAVEDMODEL_PATH = os.path.join(WRK_DIR, TF_SAVEDMODEL_NAME)
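`os.listdir` returns entries in arbitrary order, so picking the last one can select the wrong item once other files are in the folder. A possible alternative, sketched here under the assumption that the export uses the standard SavedModel layout (a directory containing `saved_model.pb`), is to look for that file explicitly. The helper name `find_saved_model_dirs` and the directory name in the demo are hypothetical.

```python
import os
import tempfile

def find_saved_model_dirs(folder):
    """Return sorted names of subdirectories of folder that contain saved_model.pb."""
    found = []
    for name in sorted(os.listdir(folder)):
        candidate = os.path.join(folder, name)
        if os.path.isfile(os.path.join(candidate, "saved_model.pb")):
            found.append(name)
    return found

# Demo on a throwaway directory that mimics a model folder
with tempfile.TemporaryDirectory() as tmp:
    export_dir = os.path.join(tmp, "1600000000")  # hypothetical export directory name
    os.makedirs(export_dir)
    open(os.path.join(export_dir, "saved_model.pb"), "w").close()
    open(os.path.join(tmp, "requirements.txt"), "w").close()
    demo_dirs = find_saved_model_dirs(tmp)

print(demo_dirs)
```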

### Write requirements.txt

In [None]:
write_requirements(WRK_DIR, 'requirements.txt')
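The helper shells out to `pip freeze`, which depends on `pip` being on the `PATH` of the notebook kernel. If that is not the case, a rough substitute is to read the installed distributions with the standard library's `importlib.metadata`. This is a different technique from `pip freeze`, not a drop-in replacement: it emits plain `name==version` pins and does not reproduce pip's handling of editable or VCS installs. The function name `freeze_lines` is hypothetical.

```python
from importlib import metadata

def freeze_lines():
    """Return sorted 'name==version' pins for the installed distributions."""
    pins = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
            version = dist.version
        except Exception:
            continue  # skip distributions whose metadata cannot be read
        if name and version:
            pins[name.lower()] = f"{name}=={version}"
    return [pins[key] for key in sorted(pins)]

requirement_lines = freeze_lines()
print(len(requirement_lines), "packages found")
```

The resulting lines could then be written to `requirements.txt` with a regular `open(..., "w")` call.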

### Write Metadata files

In [None]:
data_train = pd.read_csv(TRAIN_DATA_PATH, sep=',')

TARGET = 'BAD'
PREDICTORS = ['REASON', 'JOB', 'LOAN', 'MORTDUE', 'VALUE', 'YOJ', 'DEROG', 'DELINQ', 'CLAGE', 'NINQ', 'CLNO', 'DEBTINC']

In [None]:
JSONFiles = pzmm.JSONFiles()
# write inputVar.json
JSONFiles.writeVarJSON(data_train[PREDICTORS], isInput=True, jPath=WRK_DIR)

In [None]:
NAMES=['EM_CLASSIFICATION', 'EM_EVENTPROBABILITY']
LABELS=['0', '1']
EVENTPROB=0.5
outputVar = get_output_variables(NAMES, LABELS, EVENTPROB)

# write outputVar.json
JSONFiles.writeVarJSON(outputVar, isInput=False, jPath=WRK_DIR)
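The output-variable frame is only a template: it holds one column per output variable, with placeholder values that give each column a type. The sketch below builds an equivalent frame directly from a dictionary, which makes the shape and dtypes easy to inspect. How `writeVarJSON` maps those dtypes to variable types is not shown here and should be checked against the pzmm version in use.

```python
import pandas as pd

names = ["EM_CLASSIFICATION", "EM_EVENTPROBABILITY"]
labels = ["0", "1"]   # class labels, stored as strings
eventprob = 0.5       # placeholder probability, stored as a float

# One row per label; the probability placeholder is repeated for each row
example = pd.DataFrame({names[0]: labels, names[1]: [eventprob] * len(labels)})

print(example)
print(example.dtypes)
```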

In [None]:
MODELNAME = 'Tensorflow BoostedTreesClassifier'
# write ModelProperties.json
JSONFiles.writeModelPropertiesJSON(modelName=MODELNAME,
                                   modelDesc='A Classifier for Tensorflow Boosted Trees models',
                                   targetVariable=TARGET,
                                   modelType='Boosted Tree',
                                   modelPredictors=PREDICTORS,
                                   targetEvent=1,
                                   numTargetCategories=2,  # binary target: classes 0 and 1
                                   eventProbVar='EM_EVENTPROBABILITY',
                                   jPath=WRK_DIR,
                                   modeler='ivnard')

### Create zip files

In [None]:
# Zip the TF SavedModel directory
ZIP_TF_SAVEDMODEL_NAME = os.path.join(WRK_DIR, TF_SAVEDMODEL_NAME)
shutil.make_archive(
    ZIP_TF_SAVEDMODEL_NAME,                    # archive path without the .zip extension
    'zip',                                     # the archive format - or tar, bztar, gztar
    root_dir=WRK_DIR,                          # root for archive - current working dir if None
    base_dir=TF_SAVEDMODEL_NAME)               # start archiving from here - cwd if None too
# Remove the original SavedModel directory, keeping only the .zip archive
shutil.rmtree(ZIP_TF_SAVEDMODEL_NAME)

In [None]:
ZIP_CHAMPION_NAME = os.path.basename(WRK_DIR)  # portable alternative to splitting on '/'
# Zip the entire folder
shutil.make_archive(
    WRK_DIR, 
    'zip',                                     # the archive format - or tar, bztar, gztar 
    root_dir=DELIVERS_DIR,                          # root for archive - current working dir if None
    base_dir=ZIP_CHAMPION_NAME)
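To see how `root_dir` and `base_dir` shape the archive, the following self-contained sketch builds a small throwaway layout, zips it the same way, and lists the archive members with `zipfile`. The directory and file names in the demo are placeholders that mirror the ones used in this notebook.

```python
import os
import shutil
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder layout: <tmp>/deliverables/champion/inputVar.json
    root = os.path.join(tmp, "deliverables")
    os.makedirs(os.path.join(root, "champion"))
    with open(os.path.join(root, "champion", "inputVar.json"), "w") as f:
        f.write("[]")

    archive_path = shutil.make_archive(
        os.path.join(root, "champion"),  # archive path without the .zip extension
        "zip",
        root_dir=root,                   # paths in the archive are relative to this folder
        base_dir="champion",             # only this subfolder is archived
    )
    # Read the member list while the temporary directory still exists
    with zipfile.ZipFile(archive_path) as zf:
        members = zf.namelist()

archive_name = os.path.basename(archive_path)
print(archive_name)
print(members)
```

Because `base_dir` is the folder name, every member path inside the archive starts with `champion/`, and the archive is written next to the folder rather than inside it.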