#### Use Case Background 

##### For financial services organizations, model development for fraud detection and for surfacing potential money laundering activity is an area of increasing interest.

##### Bespoke models may be used by banks to replace rules-based scenarios or other fraud detection activities.

##### This use case models bank account holder activity to determine the probability of a money laundering event.

##### The analytics base table (aml_bank_prep) has already gone through an ETL process and is prepped for modeling.

In [1]:
####################################################
###  Train & Register Python Scikit Logit Model  ###
####################################################

###################
### Credentials ###
###################

import os
import sys
from pathlib import Path

filepath = input("file path to credentials: ")
sys.path.append(filepath)
from credentials import hostname, session, protocol, output_dir, git_dir, token, token_pem, username
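
The import above expects a `credentials.py` module on the supplied path. The sketch below shows what such a module might look like. It is inferred only from how the names are used later in this notebook: every value is a placeholder, and the meaning given in each comment is an assumption rather than a documented contract.

```python
# credentials.py (hypothetical example; every value below is a placeholder)
from pathlib import Path

hostname = "cas-host.example.com"      # assumption: CAS host passed to swat.CAS
session = "viya.example.com"           # assumption: SAS Viya host passed to sasctl Session
protocol = "https"                     # assumption: protocol passed to swat.CAS
output_dir = Path.home() / "python"    # assumption: root folder for model output files
git_dir = Path.home() / "git" / "sas_viya"            # assumption: local clone of the repository
token = Path.home() / ".sas" / "access_token.txt"     # assumption: file holding an access token
token_pem = Path.home() / ".sas" / "trustedcerts.pem" # assumption: CA certificate bundle
username = "first.last@example.com"    # assumption: recorded as the modeler
```

Keeping these values in a separate module outside the repository keeps host names and token paths out of the notebook itself.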

In [3]:
#############################
### Connect with SAS Viya ###
#############################

import swat

### read the access token and close the file handle afterwards
with open(token, "r") as token_file:
    access_token = token_file.read()
conn =  swat.CAS(hostname=hostname, username=None, password=access_token, ssl_ca_list=token_pem, protocol=protocol)
print(conn.serverstatus())

NOTE: Grid node action status report: 3 nodes, 24 total actions executed.
[About]

 {'CAS': 'Cloud Analytic Services',
  'CASCacheLocation': 'CAS Disk Cache',
  'CASHostAccountRequired': 'OPTIONAL',
  'Copyright': 'Copyright © 2014-2025 SAS Institute Inc. All Rights Reserved.',
  'GlobalReadOnlyMode': 'NO',
  'ServerTime': '2026-01-14T15:09:04Z',
  'System': {'Hostname': 'controller.sas-cas-server-default.viya.svc.cluster.local',
   'Linux Distribution': 'Red Hat Enterprise Linux release 9.7 (Plow)',
   'Model Number': 'x86_64',
   'OS Family': 'LIN X64',
   'OS Name': 'Linux',
   'OS Release': '5.15.0-1102-azure',
   'OS Version': '#111-Ubuntu SMP Fri Nov 21 22:22:11 UTC 2025'},
  'Transferred': 'NO',
  'Version': '4.00',
  'VersionLong': 'V.04.00M0P12082025',
  'Viya Release': '20260113.1768323213655',
  'Viya Version': 'Stable 2025.12',
  'license': {'expires': '06Mar2026:00:00:00',
   'gracePeriod': 0,
   'site': 'ENGAGE PLATFORM FINANCIAL CRIMES ANALYTICS PREMIER',
   'siteNum': 7

In [4]:
###############################
### Upload Data to SAS Viya ###
###############################

### upload if not already imported
conn.upload('https://raw.githubusercontent.com/christopher-parrish/sas_viya/refs/heads/main/poc/1_data_management/aml_bank/aml_bank.csv', casOut={"caslib":"public", "name":"aml_bank", "promote":True})
conn.upload('https://raw.githubusercontent.com/christopher-parrish/sas_viya/refs/heads/main/poc/1_data_management/aml_bank/aml_bank_prep.csv', casOut={"caslib":"public", "name":"aml_bank_prep", "promote":True})
conn.upload('https://raw.githubusercontent.com/christopher-parrish/sas_viya/refs/heads/main/poc/1_data_management/aml_bank/AML_BANK_PREP_1_Q1.csv', casOut={"caslib":"public", "name":"aml_bank_prep_1_q1", "promote":True})
conn.upload('https://raw.githubusercontent.com/christopher-parrish/sas_viya/refs/heads/main/poc/1_data_management/aml_bank/AML_BANK_PREP_2_Q2.csv', casOut={"caslib":"public", "name":"aml_bank_prep_2_q2", "promote":True})
conn.upload('https://raw.githubusercontent.com/christopher-parrish/sas_viya/refs/heads/main/poc/1_data_management/aml_bank/AML_BANK_PREP_3_Q3.csv', casOut={"caslib":"public", "name":"aml_bank_prep_3_q3", "promote":True})
conn.upload('https://raw.githubusercontent.com/christopher-parrish/sas_viya/refs/heads/main/poc/1_data_management/aml_bank/AML_BANK_PREP_4_Q4.csv', casOut={"caslib":"public", "name":"aml_bank_prep_4_q4", "promote":True})
### also available: conn.read_csv('https://...csv')

# promote tables to global scope if removing the promotion above
#conn.table.promote(caslib="public", name="aml_bank", targetlib="public", target="aml_bank")

NOTE: Cloud Analytic Services made the uploaded file available as table AML_BANK in caslib public.
NOTE: The table AML_BANK has been created in caslib public from binary data uploaded to Cloud Analytic Services.
NOTE: Cloud Analytic Services made the uploaded file available as table AML_BANK_PREP in caslib public.
NOTE: The table AML_BANK_PREP has been created in caslib public from binary data uploaded to Cloud Analytic Services.
NOTE: Cloud Analytic Services made the uploaded file available as table AML_BANK_PREP_1_Q1 in caslib public.
NOTE: The table AML_BANK_PREP_1_Q1 has been created in caslib public from binary data uploaded to Cloud Analytic Services.
NOTE: Cloud Analytic Services made the uploaded file available as table AML_BANK_PREP_2_Q2 in caslib public.
NOTE: The table AML_BANK_PREP_2_Q2 has been created in caslib public from binary data uploaded to Cloud Analytic Services.
NOTE: Cloud Analytic Services made the uploaded file available as table AML_BANK_PREP_3_Q3 in caslib p

In [5]:
#############################
### Identify Table in CAS ###
#############################

### caslib and table to use in modeling
caslib = 'public' # 'casuser'
in_mem_tbl = 'AML_BANK_PREP'

### load the table into memory if it is not already loaded
if conn.table.tableExists(caslib=caslib, name=in_mem_tbl).exists <= 0:
    conn.table.loadTable(caslib=caslib, path=in_mem_tbl + '.sashdat',
                         casout={'name': in_mem_tbl, 'caslib': caslib, 'promote': True})

### show table to verify
conn.table.tableInfo(caslib=caslib, wildIgnore=False, name=in_mem_tbl)

Unnamed: 0,Name,Rows,Columns,IndexedColumns,Encoding,CreateTimeFormatted,ModTimeFormatted,AccessTimeFormatted,JavaCharSet,CreateTime,View,MultiPart,SourceName,SourceCaslib,Compressed,Creator,Modifier,SourceModTimeFormatted,SourceModTime,TableRedistUpPolicy
0,AML_BANK_PREP,14302,27,0,utf-8,2026-01-14T15:09:38+00:00,2026-01-14T15:09:39+00:00,2026-01-14T15:09:39+00:00,UTF8,2084023000.0,0,0,,,0,chris.parrish@sas.com,,2026-01-14T15:09:38+00:00,2084023000.0,Not Specified


In [6]:
########################
### Create Dataframe ###
########################

dm_inputdf =  conn.CASTable(in_mem_tbl, caslib=caslib).to_frame()

### print columns for review of model parameters
print(dm_inputdf.dtypes)

account_id                     float64
num_transactions               float64
credit_score                   float64
marital_status_single          float64
marital_status_married         float64
marital_status_divorced        float64
analytic_partition             float64
ml_indicator                   float64
checking_only_indicator        float64
prior_ctr_indicator            float64
address_change_2x_indicator    float64
cross_border_trx_indicator     float64
in_person_contact_indicator    float64
linkedin_indicator             float64
atm_deposit_indicator          float64
trx_10ksum_indicator           float64
common_merchant_indicator      float64
direct_deposit_indicator       float64
citizenship_country_risk       float64
occupation_risk                float64
num_acctbal_chgs_gt2000        float64
distance_to_employer           float64
distance_to_bank               float64
income                         float64
primary_transfer_cash          float64
primary_transfer_check   

In [7]:
########################
### Model Parameters ###
########################

### import python libraries
import numpy as np
import pandas as pd
from sklearn.utils import shuffle

logit_params = {
             'penalty': 'l2', 
             'dual': False, 
             'tol': 0.0001, 
             'fit_intercept': True, 
             'intercept_scaling': 1, 
             'class_weight': None, 
             'random_state': None, 
             'solver': 'newton-cg', 
             'max_iter': 100, 
             'multi_class': 'auto', 
             'verbose': 0, 
             'warm_start': False, 
             'n_jobs': None, 
             'l1_ratio': None
             } 
print(logit_params)

### define macro variables for model
dm_dec_target = 'ml_indicator'
dm_partitionvar = 'analytic_partition'
create_new_partition = 'no' # 'yes', 'no'
dm_key = 'account_id' 
dm_classtarget_level = ['0', '1']
dm_partition_validate_val, dm_partition_train_val, dm_partition_test_val = [0, 1, 2]
dm_partition_validate_perc, dm_partition_train_perc, dm_partition_test_perc = [0.3, 0.6, 0.1]

### create list of regressors
keep_predictors = [
    'marital_status_single',
    'checking_only_indicator',
    'prior_ctr_indicator',
    'address_change_2x_indicator',
    'cross_border_trx_indicator',
    'in_person_contact_indicator',
    'linkedin_indicator',
    'citizenship_country_risk',
    'distance_to_employer',
    'distance_to_bank'
    ]
#rejected_predictors = []

### mlflow
use_mlflow = 'no' # 'yes', 'no'
mlflow_run_to_use = 0
mlflow_class_labels =['TENSOR']
mlflow_predict_syntax = 'predict'

### var to consider in bias assessment
bias_vars = ['marital_status_single']

### var to consider in partial dependency
pd_var1 = 'distance_to_employer'
pd_var2 = 'distance_to_bank'

### create partition column, if not already in dataset
if create_new_partition == 'yes':
    dm_inputdf = shuffle(dm_inputdf)
    dm_inputdf.reset_index(inplace=True, drop=True)
    validate_rows = round(len(dm_inputdf)*dm_partition_validate_perc)
    train_rows = round(len(dm_inputdf)*dm_partition_train_perc) + validate_rows
    test_rows = len(dm_inputdf)-train_rows
    dm_inputdf.loc[0:validate_rows,dm_partitionvar] = dm_partition_validate_val
    dm_inputdf.loc[validate_rows:train_rows,dm_partitionvar] = dm_partition_train_val
    dm_inputdf.loc[train_rows:,dm_partitionvar] = dm_partition_test_val



{'penalty': 'l2', 'dual': False, 'tol': 0.0001, 'fit_intercept': True, 'intercept_scaling': 1, 'class_weight': None, 'random_state': None, 'solver': 'newton-cg', 'max_iter': 100, 'multi_class': 'auto', 'verbose': 0, 'warm_start': False, 'n_jobs': None, 'l1_ratio': None}


In [8]:
##############################
### Final Modeling Columns ###
##############################

### create list of model variables
dm_input = list(dm_inputdf.columns.values)
macro_vars = (dm_dec_target + ' ' + dm_partitionvar + ' ' + dm_key).split()
rejected_predictors = [i for i in dm_input if i not in keep_predictors]
rejected_vars = rejected_predictors # + macro_vars (add macro_vars when rejected_predictors is listed explicitly rather than derived from keep_predictors)
for i in rejected_vars:
    dm_input.remove(i)
print(dm_input)

### create prediction variables
dm_predictionvar = [str('P_') + dm_dec_target + dm_classtarget_level[0], str('P_') + dm_dec_target + dm_classtarget_level[1]]
dm_classtarget_intovar = str('I_') + dm_dec_target

['marital_status_single', 'checking_only_indicator', 'prior_ctr_indicator', 'address_change_2x_indicator', 'cross_border_trx_indicator', 'in_person_contact_indicator', 'linkedin_indicator', 'citizenship_country_risk', 'distance_to_employer', 'distance_to_bank']
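
The remove-in-a-loop logic in the cell above amounts to a single filter over the table's columns. The sketch below illustrates that equivalent form along with the construction of the prediction column names. The short column lists are stand-ins chosen for illustration, not the full table.

```python
# Column names come from the notebook; the shortened lists are for illustration only.
all_columns = ['account_id', 'num_transactions', 'marital_status_single',
               'analytic_partition', 'ml_indicator', 'distance_to_bank']
keep_predictors = ['marital_status_single', 'distance_to_bank']
dm_dec_target = 'ml_indicator'
dm_classtarget_level = ['0', '1']

# Keep only the predictors, preserving the column order of the table.
dm_input = [col for col in all_columns if col in keep_predictors]

# One probability column per class level, plus one "into" column
# that holds the predicted class.
dm_predictionvar = [f'P_{dm_dec_target}{level}' for level in dm_classtarget_level]
dm_classtarget_intovar = f'I_{dm_dec_target}'

print(dm_input)
print(dm_predictionvar, dm_classtarget_intovar)
```

Filtering into a new list avoids mutating `dm_input` while iterating over a list derived from it.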


In [9]:
##################
### Data Split ###
##################

### create train, test, validate datasets using existing partition column
dm_traindf = dm_inputdf[dm_inputdf[dm_partitionvar] == dm_partition_train_val]
X_train = dm_traindf.loc[:, dm_input]
y_train = dm_traindf[dm_dec_target]
dm_testdf = dm_inputdf.loc[(dm_inputdf[dm_partitionvar] == dm_partition_test_val)]
X_test = dm_testdf.loc[:, dm_input]
y_test = dm_testdf[dm_dec_target]
dm_validdf = dm_inputdf.loc[(dm_inputdf[dm_partitionvar] == dm_partition_validate_val)]
X_valid = dm_validdf.loc[:, dm_input]
y_valid = dm_validdf[dm_dec_target]
fullX = dm_inputdf.loc[:, dm_input]
fully = dm_inputdf[dm_dec_target]
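
Before training, it can help to confirm that each partition has the expected share of rows and a comparable event rate. The sketch below runs that check on a synthetic stand-in for `dm_inputdf`. The data is randomly generated and `demo_df` is a hypothetical name, so the printed numbers say nothing about the real table.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for dm_inputdf; the real table is not reproduced here.
rng = np.random.default_rng(0)
n = 1000
demo_df = pd.DataFrame({
    'analytic_partition': rng.choice([0, 1, 2], size=n, p=[0.3, 0.6, 0.1]),
    'ml_indicator': rng.binomial(1, 0.05, size=n),
})

# Row count and event rate per partition value (0 = validate, 1 = train, 2 = test).
summary = demo_df.groupby('analytic_partition')['ml_indicator'].agg(
    rows='count', event_rate='mean')
summary['share'] = summary['rows'] / len(demo_df)
print(summary)
```

With a rare event such as money laundering, a partition with very few positive rows can make its accuracy figures unstable, which is why the event rate is shown next to the row share.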

In [10]:
###################
### Train Model ###
###################

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

### estimate & fit model
dm_model = LogisticRegression(**logit_params)
dm_model.fit(X_train, y_train)



In [11]:
###############################
### Evaluate Model Accuracy ###
###############################

description = 'Logistic Regression'
predictions = dm_model.predict(X_test)
cols = X_train.columns
predictors = np.array(cols)
tn, fp, fn, tp = confusion_matrix(y_test, predictions).ravel()
print(description)
print(description)
print('model_parameters')
print(dm_model)
print(' ')
print('model_performance')
print('score_full:', dm_model.score(fullX, fully))
print('score_train:', dm_model.score(X_train, y_train))
print('score_test:', dm_model.score(X_test, y_test))
print('score_valid:', dm_model.score(X_valid, y_valid))
print('confusion_matrix:')
print('(tn, fp, fn, tp)')
print((tn, fp, fn, tp))
print('classification_report:')
print(classification_report(y_test, predictions))

### print logit odds ratios
orat = np.exp(dm_model.coef_, out=None)
c1 = np.vstack([predictors,orat])
c2 = np.transpose(c1)
c = pd.DataFrame(c2, columns=['predictors', 'odds_ratio'])
print('intercept:')
print(dm_model.intercept_)
print('odds_ratios:')
print(c)

Logistic Regression
Logistic Regression
model_parameters
LogisticRegression(multi_class='auto', solver='newton-cg')
 
model_performance
score_full: 0.9649699342749266
score_train: 0.9660878685467894
score_test: 0.9615384615384616
score_valid: 0.9638778839431368
confusion_matrix:
(tn, fp, fn, tp)
(np.int64(1345), np.int64(15), np.int64(40), np.int64(30))
classification_report:
              precision    recall  f1-score   support

         0.0       0.97      0.99      0.98      1360
         1.0       0.67      0.43      0.52        70

    accuracy                           0.96      1430
   macro avg       0.82      0.71      0.75      1430
weighted avg       0.96      0.96      0.96      1430

intercept:
[-6.40166319]
odds_ratios:
                    predictors odds_ratio
0        marital_status_single   9.517659
1      checking_only_indicator   3.631675
2          prior_ctr_indicator   2.963293
3  address_change_2x_indicator   5.393762
4   cross_border_trx_indicator    5.10149
5  i
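
The figures in the output above can be reproduced by hand, which makes them easier to interpret. The sketch below recomputes the class 1 metrics from the printed confusion matrix counts, then converts the intercept and one odds ratio into probabilities. The probability part assumes every other predictor equals 0, which is a simplification used only for illustration (a distance of 0 is not a typical account).

```python
import math

# Counts copied from the confusion matrix printed above.
tn, fp, fn, tp = 1345, 15, 40, 30

precision = tp / (tp + fp)   # share of flagged accounts that are true events
recall = tp / (tp + fn)      # share of true events that were flagged
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tn + fp + fn + tp)
print(round(precision, 2), round(recall, 2), round(f1, 2), round(accuracy, 4))
# prints: 0.67 0.43 0.52 0.9615

# Intercept and the marital_status_single odds ratio copied from the output above.
intercept = -6.40166319
odds_ratio_single = 9.517659

def odds_to_prob(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)

baseline_odds = math.exp(intercept)   # odds when every predictor equals 0
p_baseline = odds_to_prob(baseline_odds)
p_single = odds_to_prob(baseline_odds * odds_ratio_single)
print(p_baseline, p_single)
```

An odds ratio multiplies the odds, not the probability. Accuracy of 0.96 also sits alongside a recall of 0.43 for the event class, so more than half of the true events in the test partition are missed at the default cutoff.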

In [12]:
############################
### Capture Scoring Data ###
############################

### create dataframes by split with model probabilities (p), indicator level (i), bias variables (b), and dependent variable (y)
def score_data(df_x, df_y):
    p = pd.DataFrame(dm_model.predict_proba(df_x), columns=dm_predictionvar)
    i = pd.DataFrame(dm_model.predict(df_x), columns=[dm_classtarget_intovar])
    b = df_x[bias_vars].reset_index(drop=True)
    y = pd.DataFrame(df_y.reset_index(drop=True))
    scored_df = pd.concat([p, i, b, y], axis=1)
    return scored_df

full_score = score_data(fullX, fully)
train_score = score_data(X_train, y_train)
test_score = score_data(X_test, y_test)
valid_score = score_data(X_valid, y_valid)

### the function may need to be altered depending upon the data type of the objects (e.g., dataframe vs. array)

# columns_keep = bias_vars + [dm_dec_target]
# dm_scoreddf_other_cols = pd.DataFrame(dm_inputdf, columns=columns_keep)

# scored = pd.concat([dm_scoreddf, dm_scoreddf_other_cols], axis=1)
# scored = scored.astype({dm_dec_target: int, bias_vars[0]: int, dm_classtarget_intovar: int})

# ### create tables with predicted values
# trainProba = dm_model.predict_proba(X_train)
# trainProbaLabel = dm_model.predict(X_train)
# testProba = dm_model.predict_proba(X_test)
# testProbaLabel = dm_model.predict(X_test)
# validProba = dm_model.predict_proba(X_valid)
# validProbaLabel = dm_model.predict(X_valid)
# trainData = pd.concat([y_train.reset_index(drop=True), pd.Series(data=trainProbaLabel), pd.Series(data=trainProba[:,1])], axis=1)
# testData = pd.concat([y_test.reset_index(drop=True), pd.Series(data=testProbaLabel), pd.Series(data=testProba[:,1])], axis=1)
# validData = pd.concat([y_valid.reset_index(drop=True), pd.Series(data=validProbaLabel), pd.Series(data=validProba[:,1])], axis=1)
# trainData.columns = [dm_dec_target, dm_classtarget_intovar, dm_predictionvar[1]]
# testData.columns = [dm_dec_target, dm_classtarget_intovar, dm_predictionvar[1]]
# validData.columns = [dm_dec_target, dm_classtarget_intovar, dm_predictionvar[1]]
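
The `reset_index(drop=True)` calls in `score_data` matter because `pd.concat(..., axis=1)` aligns rows by index label, not by position. The sketch below uses toy values and hypothetical variable names to show what happens when a filtered frame keeps its original index.

```python
import pandas as pd

# Toy feature frame with a non-contiguous index, as left behind by a partition filter.
features = pd.DataFrame({'marital_status_single': [1.0, 0.0, 1.0]}, index=[4, 9, 11])
# Predictions come back as a plain array, so the new frame gets a fresh 0..n-1 index.
probs = pd.DataFrame({'P_ml_indicator1': [0.14, 0.01, 0.30]})

misaligned = pd.concat([probs, features], axis=1)
aligned = pd.concat([probs, features.reset_index(drop=True)], axis=1)

print(misaligned.shape)  # rows are the union of both indexes, padded with NaN
print(aligned.shape)     # one row per scored record
```

Without the reset, the scored table silently gains extra rows filled with missing values instead of raising an error.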

In [13]:
test_score

Unnamed: 0,P_ml_indicator0,P_ml_indicator1,I_ml_indicator,marital_status_single,ml_indicator
0,0.999914,0.000086,0.0,0.0,0.0
1,0.998409,0.001591,0.0,0.0,0.0
2,0.999846,0.000154,0.0,0.0,0.0
3,0.999162,0.000838,0.0,0.0,0.0
4,0.999789,0.000211,0.0,0.0,0.0
...,...,...,...,...,...
1425,0.999460,0.000540,0.0,0.0,0.0
1426,0.999956,0.000044,0.0,0.0,0.0
1427,0.859987,0.140013,0.0,1.0,0.0
1428,0.999948,0.000052,0.0,0.0,0.0
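
The `I_ml_indicator` column reflects the default decision rule, which for a binary model corresponds roughly to a 0.5 cutoff on `P_ml_indicator1`. The sketch below shows how a different cutoff could be applied to event probabilities and how precision and recall shift as a result. The helper names and the toy values are illustrative assumptions, not results from this model.

```python
import numpy as np

def classify_with_cutoff(event_probs, cutoff=0.5):
    """Label a row 1 when its event probability meets or exceeds the cutoff."""
    return (np.asarray(event_probs) >= cutoff).astype(int)

def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(((y_true == 1) & (y_pred == 1)).sum())
    fp = int(((y_true == 0) & (y_pred == 1)).sum())
    fn = int(((y_true == 1) & (y_pred == 0)).sum())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy labels and probabilities, invented for illustration.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
event_probs = [0.02, 0.14, 0.40, 0.35, 0.80, 0.55, 0.20, 0.01]

for cutoff in (0.5, 0.2):
    preds = classify_with_cutoff(event_probs, cutoff)
    print(cutoff, precision_recall(y_true, preds))
```

Lowering the cutoff flags more accounts, which generally raises recall at the cost of more false positives to investigate.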


In [14]:
#######################################
### Register Model in Model Manager ###
#######################################

from sasctl import Session
import sasctl.pzmm as pzmm
from sasctl.services import model_repository as modelRepo 
from sasctl.tasks import register_model
import shutil
import json

### define parameters
metadata_output_dir = 'outputs'
model_name = 'logit_python_amlbank'
project_name = 'Anti-Money Laundering'
model_type = 'logistic_regression'
predict_syntax = 'predictproba'
input_df = X_train
target_df = y_train
predictors = np.array(X_train.columns)
#prediction_labels = ['I_ml_indicator', 'P_ml_indicator0', 'P_ml_indicator1']
prediction_labels = ['EM_CLASSIFICATION', 'EM_EVENTPROBABILITY']
target_event = dm_predictionvar[1]
non_target_event = dm_predictionvar[0]
target_event_level = dm_classtarget_level[1]
non_target_event_level = dm_classtarget_level[0]
target_level = 'BINARY'
num_target_categories = len(dm_classtarget_level)
predict_method = str('{}.')+str(predict_syntax)+str('({})')
output_vars = pd.DataFrame(columns=prediction_labels, data=[['A', 0.5]])

In [15]:
### create directories for files
output_name = 'logit_python_amlbank_demo'
output_path = Path(output_dir) / metadata_output_dir / output_name
if output_path.exists() and output_path.is_dir():
    shutil.rmtree(output_path)

### create output path
os.makedirs(output_path)

In [16]:
### create model files and metadata
pzmm.PickleModel.pickle_trained_model(trained_model=dm_model, model_prefix=model_name, pickle_path=output_path)
pzmm.JSONFiles().write_var_json(input_data=input_df, is_input=True, json_path=output_path)
pzmm.JSONFiles().write_var_json(input_data=output_vars, is_input=False, json_path=output_path)
pzmm.JSONFiles().write_model_properties_json(
    model_name=model_name, 
    target_variable=dm_dec_target,
    target_values=[dm_classtarget_level[1], dm_classtarget_level[0]],
    json_path=output_path,
    model_desc=description,
    model_algorithm=model_type,
    #model_function=model_function,
    modeler=username,
    #train_table=in_mem_tbl,
    #properties=None
    )
pzmm.JSONFiles().write_file_metadata_json(model_prefix=model_name, json_path=output_path, is_h2o_model=False, is_tf_keras_model=False)

Model logit_python_amlbank was successfully pickled and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\logit_python_amlbank.pickle.
inputVar.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\inputVar.json
outputVar.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\outputVar.json
ModelProperties.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\ModelProperties.json
fileMetadata.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\fileMetadata.json


In [17]:
### create requirements file
import json
requirements_json = pzmm.JSONFiles().create_requirements_json(model_path=output_path)
print(json.dumps(requirements_json, sort_keys=True, indent=4))
for requirement in requirements_json:
    if 'sklearn' in requirement['step']:
        requirement['command'] = requirement["command"].replace('sklearn', 'scikit-learn')
        requirement['step'] = requirement['step'].replace('sklearn', 'scikit-learn')
print(json.dumps(requirements_json, sort_keys=True, indent=4))
with open(Path(output_path) / "requirements.json", "w") as req_file:
    req_file.write(json.dumps(requirements_json, indent=4))

[
    {
        "command": "pip install sklearn==1.5.2",
        "step": "install sklearn"
    },
    {
        "command": "pip install numpy==2.3.2",
        "step": "install numpy"
    }
]
[
    {
        "command": "pip install scikit-learn==1.5.2",
        "step": "install scikit-learn"
    },
    {
        "command": "pip install numpy==2.3.2",
        "step": "install numpy"
    }
]
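
The rename above is needed because the import name `sklearn` differs from the package name `scikit-learn` on PyPI. If more packages with mismatched names appear, the fix could be generalized with a lookup table, as in the sketch below. The function name and the extra mapping entries are illustrative assumptions; only the `sklearn` entry comes from this notebook.

```python
import json

# Import name -> PyPI distribution name. Only the sklearn entry comes from the
# notebook; the other entries are common examples added for illustration.
IMPORT_TO_PYPI = {'sklearn': 'scikit-learn', 'cv2': 'opencv-python', 'PIL': 'Pillow'}

def fix_package_names(requirements, mapping=IMPORT_TO_PYPI):
    """Return a copy of the requirements list with import names replaced by PyPI names."""
    fixed = []
    for req in requirements:
        new_req = dict(req)  # copy so the input list is left untouched
        for import_name, pypi_name in mapping.items():
            if new_req['step'] == f'install {import_name}':
                new_req['step'] = f'install {pypi_name}'
                new_req['command'] = new_req['command'].replace(
                    f'pip install {import_name}==', f'pip install {pypi_name}==')
        fixed.append(new_req)
    return fixed

# Same shape as the first JSON listing printed above.
requirements = [
    {'command': 'pip install sklearn==1.5.2', 'step': 'install sklearn'},
    {'command': 'pip install numpy==2.3.2', 'step': 'install numpy'},
]
fixed_requirements = fix_package_names(requirements)
print(json.dumps(fixed_requirements, indent=4))
```

Matching on the exact `step` value, not on a substring, avoids rewriting a package whose name merely contains `sklearn`.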


In [18]:
### create sasctl session with SAS Viya
sess = Session(hostname=session, token=access_token, client_secret='access_token')

In [19]:
### create model statistics

validData=valid_score[[dm_dec_target, dm_classtarget_intovar, dm_predictionvar[1]]]
trainData=train_score[[dm_dec_target, dm_classtarget_intovar, dm_predictionvar[1]]]
testData=test_score[[dm_dec_target, dm_classtarget_intovar, dm_predictionvar[1]]]

pzmm.JSONFiles().calculate_model_statistics(
    target_value=int(dm_classtarget_level[1]),
    validate_data=validData,
    train_data=trainData, 
    test_data=testData, 
    json_path=output_path,
    #target_type=model_function,
    #cutoff=None
    )

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data["predict_proba2"] = 1 - data["predict_proba"]


dmcas_fitstat.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\dmcas_fitstat.json
dmcas_roc.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\dmcas_roc.json
dmcas_lift.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\dmcas_lift.json
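
The pandas warning above is raised when a column is added to a frame that pandas regards as a slice of another frame. One common remedy is to pass an explicit copy of the column subset. Whether that silences the warning here is an assumption, since it depends on the pandas version and on how `calculate_model_statistics` handles its inputs. The sketch below shows the pattern on toy values.

```python
import warnings
import pandas as pd

# Toy scored table using the notebook's column names; values are invented.
scored = pd.DataFrame({
    'ml_indicator': [0, 1, 0],
    'I_ml_indicator': [0, 1, 0],
    'P_ml_indicator1': [0.01, 0.83, 0.14],
})

# An explicit copy makes the subset an independent DataFrame, so adding a
# column to it later cannot be mistaken for a write into `scored`.
subset = scored[['ml_indicator', 'I_ml_indicator', 'P_ml_indicator1']].copy()

with warnings.catch_warnings():
    warnings.simplefilter('error')  # turn any warning into an exception
    subset['P_ml_indicator0'] = 1 - subset['P_ml_indicator1']

print(subset)
```

In the notebook this would mean appending `.copy()` to the `validData`, `trainData`, and `testData` selections.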


In [20]:
### create model bias measures
# scored_table requires a specific format and column order; default is scored "test" table
scored_table_keep = [target_event, non_target_event, dm_dec_target, bias_vars[0]]
scored_table = test_score.astype({dm_dec_target: int, bias_vars[0]: int, dm_classtarget_intovar: int})

pzmm.JSONFiles().assess_model_bias(
    score_table=scored_table,
    sensitive_values=bias_vars, 
    actual_values=dm_dec_target,
    #pred_values=None,
    prob_values=[target_event, non_target_event],
    levels=[target_event_level, non_target_event_level],
    json_path=output_path,
    #cutoff=0.5,
    #datarole="TEST",
    return_dataframes=True
    )



maxDifferences.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\maxDifferences.json
groupMetrics.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\groupMetrics.json




{'maxDifferencesData':     BASE  COMPARE           Metric  \
 0    1.0      0.0  P_ml_indicator1   
 1    0.0      1.0  P_ml_indicator0   
 2    1.0      0.0              TPR   
 3    1.0      0.0              FPR   
 4    0.0      1.0              TNR   
 5    0.0      1.0              FNR   
 6    0.0      1.0              FDR   
 7    0.0      1.0              ACC   
 8    1.0      0.0                C   
 9    1.0      0.0               F1   
 10   1.0      0.0             GINI   
 11   1.0      0.0        MISCEVENT   
 12   1.0      0.0      MISCEVENTKS   
 13   1.0      0.0              MCE   
 14   1.0      0.0              ASE   
 15   1.0      0.0             RASE   
 16   1.0      0.0             MCLL   
 17   0.0      1.0            maxKS   
 18   1.0      0.0         cutoffKS   
 19   0.0      1.0             GAIN   
 20   1.0      0.0             LIFT   
 21   1.0      0.0             RESP   
 22   0.0      1.0          CUMRESP   
 23   0.0      1.0          CUMLIFT   
 24

In [21]:
print(X_train.columns)
print(X_train.dtypes)

Index(['marital_status_single', 'checking_only_indicator',
       'prior_ctr_indicator', 'address_change_2x_indicator',
       'cross_border_trx_indicator', 'in_person_contact_indicator',
       'linkedin_indicator', 'citizenship_country_risk',
       'distance_to_employer', 'distance_to_bank'],
      dtype='object')
marital_status_single          float64
checking_only_indicator        float64
prior_ctr_indicator            float64
address_change_2x_indicator    float64
cross_border_trx_indicator     float64
in_person_contact_indicator    float64
linkedin_indicator             float64
citizenship_country_risk       float64
distance_to_employer           float64
distance_to_bank               float64
dtype: object


In [22]:
dm_inputdf_pd = pd.DataFrame(dm_inputdf)

In [23]:
### create model card information
pzmm.JSONFiles().generate_model_card(
        model_prefix=model_name,
        model_files=output_path,
        algorithm=description,
        train_data=dm_inputdf_pd,
        train_predictions=train_score[dm_classtarget_intovar],
        target_type="classification",
        target_value=int(target_event_level),
        interval_vars=['citizenship_country_risk', 'distance_to_employer', 'distance_to_bank'],
        class_vars=['marital_status_single', 'checking_only_indicator', 'prior_ctr_indicator', 'address_change_2x_indicator', 
                    'cross_border_trx_indicator', 'in_person_contact_indicator', 'linkedin_indicator'],
        #selection_statistic="_KS_",
        training_table_name="training_table",
        server="cas-shared-default",
        caslib=caslib
        )

dmcas_relativeimportance.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\dmcas_relativeimportance.json
dmcas_misc.json was successfully written and saved to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\dmcas_misc.json


In [25]:
### copy this notebook (.ipynb) to the output path
### right-click the notebook, copy its path, and change backslashes to forward slashes
src = str(git_dir) + str('/poc/5_model_building_sas_python_r_apis/python_r_sas_actions/logit_python_amlbank.ipynb')
print(src)
dst = output_path
shutil.copy(src, dst)
output_path

C:/Users/chparr/OneDrive - SAS/git/sas_viya/poc/5_model_building_sas_python_r_apis/python_r_sas_actions/logit_python_amlbank.ipynb


WindowsPath('C:/Users/chparr/OneDrive - SAS/python/outputs/logit_python_amlbank_demo')

In [26]:
### import to model manager
pzmm.ImportModel().import_model(
    model_files=output_path, 
    model_prefix=model_name, 
    project=project_name, 
    input_data=input_df,
    predict_method=[dm_model.predict_proba, [int, int]],
    score_metrics=prediction_labels,
    pickle_type='pickle',
    project_version='latest',
    missing_values=False,
    overwrite_model=False,
    mlflow_details=None,
    predict_threshold=None,
    target_values=dm_classtarget_level,
    overwrite_project_properties=False,
    target_index=1,
    model_file_name=model_name + str('.pickle'))



Model score code was written successfully to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo\score_logit_python_amlbank.py and uploaded to SAS Model Manager.
All model files were zipped to C:\Users\chparr\OneDrive - SAS\python\outputs\logit_python_amlbank_demo.


  warn(f"No project with the name or UUID {project} was found.")


A new project named Anti-Money Laundering was created.
Model was successfully imported into SAS Model Manager as logit_python_amlbank with the following UUID: ee1a97dc-5f02-4625-81d1-482aa0954665.


(<class 'sasctl.core.RestObj'>(headers={'Date': 'Wed, 14 Jan 2026 15:29:40 GMT', 'Content-Type': 'application/vnd.sas.collection+json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Cache-Control': 'no-cache, no-store, max-age=0, must-revalidate', 'Content-Security-Policy': "default-src 'self'; object-src 'none'; frame-ancestors 'self'; form-action 'self';", 'Expires': '0', 'Pragma': 'no-cache', 'Sas-Activity-Correlator-Id': '9a858f82-336d-4d2b-a58e-7dc4caaaf5bc', 'Sas-Service-Response-Flag': 'true', 'Vary': 'Origin', 'X-Content-Type-Options': 'nosniff', 'X-Csrf-Header': 'X-CSRF-Token', 'X-Csrf-Token': 'VFQ22EH2WJWEUBEELVWYFAKFTY', 'X-Xss-Protection': '1; mode=block', 'Strict-Transport-Security': 'max-age=6.3072e+07; includeSubDomains'}, data={'creationTimeStamp': '2026-01-14T15:29:38.309Z', 'createdBy': 'chris.parrish@sas.com', 'modifiedTimeStamp': '2026-01-14T15:29:39.991Z', 'modifiedBy': 'chris.parrish@sas.com', 'id': 'ee1a97dc-5f02-4625-81d1-482aa09546

Additional information to complete the model card in SAS Viya Model Manager:

Models (select model) --> Properties --> Model Usage

Model Purpose - To replace rules-based scenarios with machine learning methods while improving accuracy and minimizing manual processes.

Intended Use - The model will be used to identify entities and transactions that might require further investigation for potential money-laundering activities.

Expected Benefit - By replacing manually developed and maintained scenarios, the expectation is that money-laundering detection will be more efficient.

Out-of-scope use cases - Entities that have already been identified as high-risk are out of scope. The model is not intended to completely replace the current process.

Limitations - The model should be tracked alongside the existing scenarios for a period of time to evaluate its accuracy.