# AWS Fraud Detector SDK Example 

Following this guide: https://docs.aws.amazon.com/frauddetector/latest/ug/building-a-model.html  
and using the sample data-set from here: https://docs.aws.amazon.com/frauddetector/latest/ug/samples/training_data.zip

Also see the boto3 FraudDetector API reference: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/frauddetector.html

### Authentication to AWS Account ###
This notebook assumes you have exported a token before starting the notebook session, or configured the AWS environment so that this notebook can execute AWS operations with the necessary Amazon Fraud Detector privileges.  

***Alternatively***, provide values for the AWS access key, AWS secret key, and session token using the next cell.  
If the session times out and a new token needs to be provided, **restart the notebook kernel**.

In [None]:
# Set authentication to AWS via temporary access tokens
import json
import os
from getpass import getpass
ACCESS_KEY = getpass("Enter the AWS Access Key:")
SECRET_KEY =  getpass("Enter the AWS Secret Key:")
SESSION_TOKEN = getpass("Enter the AWS Session Token to use:")

os.environ['AWS_ACCESS_KEY_ID'] = ACCESS_KEY
os.environ['AWS_SECRET_ACCESS_KEY'] = SECRET_KEY
os.environ['AWS_SESSION_TOKEN'] = SESSION_TOKEN
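
Before creating a client, it can help to confirm that the credential variables are actually set. The following is a minimal sketch; `missing_aws_credentials` is a hypothetical helper, not part of boto3, and it assumes temporary credentials (so the session token is treated as required).

```python
import os

# Environment variables boto3 reads when temporary credentials are used.
REQUIRED_AWS_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def missing_aws_credentials(env=None):
    """Return the names of required credential variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
```

If long-lived keys are used instead of temporary tokens, `AWS_SESSION_TOKEN` can be dropped from the tuple.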

### Amazon Fraud Detector SDK Client

In [None]:
import boto3
fraudDetector = boto3.client('frauddetector')

In [None]:
response = fraudDetector.get_detectors()
print(response)

### Sample Data

Sample data for building a model is in the `training_data` folder of this repo.  Unzip it and copy it to an S3 bucket so that Amazon Fraud Detector can use it.

In [None]:
# Set BUCKET_PATH to where the training data is located (change this!)
BUCKET_PATH='s3://<my_bucket>/training/'

In [None]:
TRAINING_DATA=BUCKET_PATH+'registration_data_20K_minimum.csv'
print(TRAINING_DATA)
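
The copy to S3 can also be done from the notebook. This is a sketch only: `split_s3_uri` and `upload_training_data` are hypothetical helpers, and the S3 client is passed in so the functions can be tried without credentials. `upload_file(Filename, Bucket, Key)` is the standard boto3 S3 client call.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/prefix/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def upload_training_data(s3_client, local_path, s3_uri):
    """Upload a local file to the bucket and key named by s3_uri."""
    bucket, key = split_s3_uri(s3_uri)
    s3_client.upload_file(local_path, bucket, key)
    return bucket, key


# Example (requires credentials and a real bucket in BUCKET_PATH):
# import boto3
# upload_training_data(
#     boto3.client("s3"),
#     "training_data/registration_data_20K_minimum.csv",
#     TRAINING_DATA,
# )
```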

In [None]:
# Load the data into Pandas for easy access and exploring the data
import pandas as pd
#training_df = pd.read_csv(TRAINING_DATA)   # Can't do this if public access not allowed
training_df = pd.read_csv('training_data/registration_data_20K_minimum.csv.zip')

In [None]:
training_df.head()

## Amazon Fraud Detector Role Privileges
See the following link for guidance on setting up an IAM user and role with the `AmazonFraudDetectorFullAccessPolicy` policy attached:  
https://docs.aws.amazon.com/frauddetector/latest/ug/security-iam.html

##### Role ARN
Identify the ARN of the AWS role that will run the Amazon Fraud Detector operations.  Set this in the `ROLE_ARN` variable below.

In [None]:
# ARN URL for Fraud Detector operations
ROLE_ARN = 'arn:aws:iam::999999999999:role/RoleToUseForFraudDetector'

In the console, grant the policy  `AmazonFraudDetectorFullAccessPolicy` to the role referred to by the ROLE_ARN above.
  
It is necessary to update the trust relationship for the specified role by specifying Amazon Fraud Detector as a trusted entity.
https://docs.aws.amazon.com/frauddetector/latest/ug/security_iam_troubleshoot.html#security_iam_troubleshoot-assume-role  
- Open the IAM console  
- In the navigation pane choose Roles. 
- Choose the name of the role that you want to modify, and choose the Trust relationships tab. 
- Choose Edit trust relationship. 
- Under Policy Document, paste the following, and then choose Update Trust Policy.  

```
       {
           "Version": "2012-10-17",
           "Statement": [ {
               "Effect": "Allow",
               "Principal": {
                   "Service": "frauddetector.amazonaws.com"
               },
               "Action": "sts:AssumeRole"
           } ]
       }
```
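
The same trust policy can be applied from Python instead of the console. This is a sketch under assumptions: `role_name_from_arn` and `apply_trust_policy` are hypothetical helpers, the IAM client is supplied by the caller, and the caller has permission to call `update_assume_role_policy`. Note that this call replaces the role's existing trust policy rather than merging with it.

```python
import json

# Same document as the JSON above, expressed as a Python dict.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "frauddetector.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def role_name_from_arn(role_arn):
    """Return the role name, i.e. the text after the last '/' in the ARN."""
    return role_arn.rsplit("/", 1)[-1]


def apply_trust_policy(iam_client, role_arn):
    """Replace the role's trust policy with TRUST_POLICY."""
    iam_client.update_assume_role_policy(
        RoleName=role_name_from_arn(role_arn),
        PolicyDocument=json.dumps(TRUST_POLICY),
    )


# Example (requires IAM permissions):
# import boto3
# apply_trust_policy(boto3.client("iam"), ROLE_ARN)
```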



## Create variables, entity type, and labels

### Variable Types Concept

You can optionally assign variables a variable type. Variable types represent common data elements used during fraud predictions.  
Only variables with an associated variable type can be used for model training.  
  
Variables must have a data type for the data element that the variable represents. For variables that are mapped to a variable type, the data type is pre-selected. 

Possible data types include: String, Integer, Boolean, Float.

ref: https://docs.aws.amazon.com/frauddetector/latest/ug/create-a-variable.html

Variable types ref (Boto Docs ver 1.18.46, FraudDetector client): https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/frauddetector.html#FraudDetector.Client.get_variables  


| Category | Variable type | Description | Data type | Example  |
| - | - | - | - | - |
| Custom | NUMERIC | Any variable that can be represented as a real number | Float | 1.224 |  
| Custom | CATEGORICAL | Any variable that describes categories, segments, or groups  | String |  Large |  
| Custom | FREE_FORM_TEXT | Any free form text that is captured as part of the event. For example, a customer review or comment.   | String |  Example of a free form text input |
| Email |	EMAIL_ADDRESS |	Email address collected during the event |	String | abc@domain.com |
| IP address | IP_ADDRESS | IP address collected during the event | String | 1.1.1.1 |
| Phone number | PHONE_NUMBER | Phone number collected during the event | String | 1-123-456-7891 |
| Browser/Device | USERAGENT | User agent collected during the event | String | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 |
| Browser/Device | FINGERPRINT | Unique identifier for a device | String | sadfow987u234 |
| Payment Instrument | PAYMENT_TYPE | Payment instrument type used for payment during the event | String | Credit Card |
| Payment Instrument |CARD_BIN | First six digits of the credit card | Integer | 123456|  
| Payment Instrument |AUTH_CODE  |Alphanumerical code sent by a credit card issuer or issuing bank | String | 00|  
| Payment Instrument |AVS | Address Verification System response code from card processor | String | Y|
| Billing Address | BILLING_NAME |  Name associated with billing address  | String |  John Doe |  
| Billing Address | BILLING_PHONE |  Phone associated with billing address  | String |  1-123-456-7891 |  
| Billing Address | BILLING_ADDRESS_L1 |  Billing address line 1  | String |  123 4th St. |  
| Billing Address | BILLING_ADDRESS_L2 |  Billing address line 2  | String |  Unit 123 |  
| Billing Address | BILLING_CITY |  Billing address city  | String |  Seattle |  
| Billing Address | BILLING_STATE |  Billing address state or province  | String |  WA |  
| Billing Address | BILLING_COUNTRY |  Billing address country  | String |  US |  
| Billing Address | BILLING_ZIP |  Billing address postal code  | String |  98109 |  
| Shipping Address | SHIPPING_NAME |  Name associated with shipping address  | String |  John Doe |  
| Shipping Address | SHIPPING_PHONE |  Phone associated with shipping address  | String |  1-123-456-7891 |  
| Shipping Address | SHIPPING_ADDRESS_L1 |  Shipping address line 1  | String |  123 4th St. |  
| Shipping Address | SHIPPING_ADDRESS_L2 |  Shipping address line 2  | String |  Unit 123 |  
| Shipping Address | SHIPPING_CITY |  Shipping address city  | String |  Seattle |  
| Shipping Address | SHIPPING_STATE |  Shipping address state or province  | String |  WA |  
| Shipping Address | SHIPPING_COUNTRY |  Shipping address country  | String |  US |  
| Shipping Address | SHIPPING_ZIP |  Shipping address postal code  | String |  98109 |  
| Order | ORDER_ID |  Unique identifier for transaction  | String |  LUX60 |  
| Order | PRODUCT_CATEGORY |  Product category of order item  | String |  kitchen |  
| Order | CURRENCY_CODE |  ISO 4217 currency code  | String |  USD |  
| Order | PRICE |  Total order price  | String |  560.00 |  


In [None]:
#You can optionally assign variables a variable type: 
variable_types = [ "NUMERIC" , "CATEGORICAL" , "FREE_FORM_TEXT" , "EMAIL_ADDRESS" , "IP_ADDRESS" , "PHONE_NUMBER" , "USERAGENT" , "FINGERPRINT" , "PAYMENT_TYPE" , "CARD_BIN" , "AUTH_CODE" , "AVS" , "BILLING_NAME" , "BILLING_PHONE" , "BILLING_ADDRESS_L1" , "BILLING_ADDRESS_L2" , "BILLING_CITY" , "BILLING_STATE" , "BILLING_COUNTRY" , "BILLING_ZIP" , "SHIPPING_NAME" , "SHIPPING_PHONE" , "SHIPPING_ADDRESS_L1" , "SHIPPING_ADDRESS_L2" , "SHIPPING_CITY" , "SHIPPING_STATE" , "SHIPPING_COUNTRY" , "SHIPPING_ZIP" , "ORDER_ID" , "PRODUCT_CATEGORY" , "CURRENCY_CODE" , "PRICE" ]

### Create Variables ###
Variables represent data elements that you want to use in a fraud prediction.  
Variables must have a data type for the data element that the variable represents.  
For variables that are mapped to a variable type, the data type is pre-selected.
https://docs.aws.amazon.com/frauddetector/latest/ug/create-a-variable.html

In [None]:
#Create variable email_address
fraudDetector.create_variable(
name = 'email_address',
variableType = 'EMAIL_ADDRESS',
dataSource = 'EVENT',
dataType = 'STRING',
defaultValue = '<unknown>'
)

#Create variable ip_address
fraudDetector.create_variable(
name = 'ip_address',
variableType = 'IP_ADDRESS',
dataSource = 'EVENT',
dataType = 'STRING',
defaultValue = '<unknown>'
)
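
When many variables are needed, the repeated calls can be driven from a mapping of variable name to variable type. A minimal sketch, assuming every variable is an `EVENT` string with the same default value as above; `variable_request` and `create_variables` are hypothetical helpers, and the client is passed in rather than created here.

```python
def variable_request(name, variable_type, data_type="STRING", default_value="<unknown>"):
    """Build the keyword arguments for one create_variable call."""
    return {
        "name": name,
        "variableType": variable_type,
        "dataSource": "EVENT",
        "dataType": data_type,
        "defaultValue": default_value,
    }


def create_variables(client, type_by_name):
    """Call create_variable once per entry and return the names created."""
    created = []
    for name, variable_type in type_by_name.items():
        client.create_variable(**variable_request(name, variable_type))
        created.append(name)
    return created


# Example (equivalent to the two calls above):
# create_variables(fraudDetector, {
#     "email_address": "EMAIL_ADDRESS",
#     "ip_address": "IP_ADDRESS",
# })
```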

In [None]:
response = fraudDetector.get_variables(
    name='ip_address',
    maxResults=100  # pass nextToken only when a previous response returned one
)

In [None]:
response.items()
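
To list every variable rather than a single one, `nextToken` should be supplied only when an earlier response returned it. A sketch of that loop, assuming the response carries its results under `variables` and an optional `nextToken` key; `list_all_variables` is a hypothetical helper.

```python
def list_all_variables(client, page_size=100):
    """Collect variables across all pages of get_variables."""
    variables = []
    kwargs = {"maxResults": page_size}
    while True:
        page = client.get_variables(**kwargs)
        variables.extend(page.get("variables", []))
        token = page.get("nextToken")
        if not token:
            return variables
        # Ask for the next page using the token from this response.
        kwargs["nextToken"] = token


# Example:
# for variable in list_all_variables(fraudDetector):
#     print(variable["name"], variable["variableType"])
```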

### Create Entity Types ###
An entity represents who is performing the event. Example classifications include customer, merchant, or account.
https://docs.aws.amazon.com/frauddetector/latest/ug/create-an-entity-type.html  

In [None]:
fraudDetector.put_entity_type(
name = 'customer',
description = 'sample customer entity type'
)

### Create Event Labels ###
Labels classify an event as fraudulent ("fraud") or legitimate ("legit").  
https://docs.aws.amazon.com/frauddetector/latest/ug/create-a-label.html  

In [None]:
fraudDetector.put_label(
name = 'fraud',
description = 'label for fraud events'
)

fraudDetector.put_label(
name = 'legit',
description = 'label for legitimate events'
)

### Create an Event Type ###
An event type defines the structure for an individual event sent to Amazon Fraud Detector.   
https://docs.aws.amazon.com/frauddetector/latest/ug/create-event-type.html  
  
The structure of an event includes:  
- *Entity Type*: Classifies who is performing the event. During prediction, specify the entity type and entity Id to define who performed the event.

- *Variables*: Defines what variables can be sent as part of the event. Variables are used by models and rules to evaluate fraud risk. Once added, variables cannot be removed from an event type.

- *Labels*: Classifies an event as fraudulent or legitimate. Used during model training. Once added, labels cannot be removed from an event type.  

In [None]:
fraudDetector.put_event_type (
   name = 'registrations',
   eventVariables = ['ip_address', 'email_address'],
   labels = ['legit', 'fraud'],
   entityTypes = ['customer']
)
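
Before training, it may be worth checking that the training file matches the event type. The sketch below assumes the CSV uses the `EVENT_TIMESTAMP` and `EVENT_LABEL` header names that Amazon Fraud Detector expects for training data; `check_training_data` is a hypothetical helper that works on plain lists, so it can be used with or without pandas.

```python
def check_training_data(columns, label_values, event_variables, labels):
    """Return a list of problems found; an empty list means the data looks consistent."""
    problems = []
    present = set(columns)
    for name in ["EVENT_TIMESTAMP", "EVENT_LABEL", *event_variables]:
        if name not in present:
            problems.append(f"missing column: {name}")
    unknown = sorted(set(label_values) - set(labels))
    if unknown:
        problems.append(f"unknown labels: {unknown}")
    return problems


# Example with the DataFrame loaded earlier:
# check_training_data(
#     training_df.columns,
#     training_df["EVENT_LABEL"].unique(),
#     ["ip_address", "email_address"],
#     ["legit", "fraud"],
# )
```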

### Define a Model
https://docs.aws.amazon.com/frauddetector/latest/ug/building-a-model.html

In [None]:
fraudDetector.create_model (
   modelId = 'sample_fraud_detection_model',
   eventTypeName = 'registrations',
   modelType = 'ONLINE_FRAUD_INSIGHTS'
)

### Train a Model 
https://docs.aws.amazon.com/frauddetector/latest/ug/building-a-model.html


In [None]:
fraudDetector.create_model_version (
modelId = 'sample_fraud_detection_model',
modelType = 'ONLINE_FRAUD_INSIGHTS',
trainingDataSource = 'EXTERNAL_EVENTS',
trainingDataSchema = {
    'modelVariables' : ['ip_address', 'email_address'],
    'labelSchema' : {
        'labelMapper' : {
            'FRAUD' : ['fraud'],
            'LEGIT' : ['legit']
        }
    }
}, 
externalEventsDetail = {
    'dataLocation' : TRAINING_DATA,
    'dataAccessRoleArn' : ROLE_ARN
}
)

### Check the Model Training Progress 

In [None]:
response = fraudDetector.get_model_version(
    modelId='sample_fraud_detection_model',
    modelType='ONLINE_FRAUD_INSIGHTS',
    modelVersionNumber='1.0'
)

Training can take several hours. Re-run the cell above to refresh `response` before checking the status.

In [None]:
response['status']
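
Rather than re-running the cell by hand, the status can be polled. A sketch only: `wait_for_model_status` is a hypothetical helper, the five-minute interval and poll limit are arbitrary assumptions, and it stops early if the status becomes `ERROR`.

```python
import time


def wait_for_model_status(client, model_id, model_type, version, wanted,
                          poll_seconds=300, max_polls=100, sleep=time.sleep):
    """Poll get_model_version until the status is `wanted` or 'ERROR'."""
    for _ in range(max_polls):
        status = client.get_model_version(
            modelId=model_id,
            modelType=model_type,
            modelVersionNumber=version,
        )["status"]
        if status in (wanted, "ERROR"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"model did not reach {wanted} after {max_polls} polls")


# Example:
# wait_for_model_status(
#     fraudDetector, "sample_fraud_detection_model",
#     "ONLINE_FRAUD_INSIGHTS", "1.0", "TRAINING_COMPLETE",
# )
```

The same helper can be reused later with `wanted="ACTIVE"` while waiting for activation.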

In [None]:
response = fraudDetector.describe_model_versions(
    modelId='sample_fraud_detection_model',
    modelVersionNumber='1.0',
    modelType='ONLINE_FRAUD_INSIGHTS',
    maxResults=5  # pass nextToken only when a previous response returned one
)

In [None]:
# Get metrics including AUC
response['modelVersionDetails'][0]['trainingResult']['trainingMetrics']['auc']
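
The direct lookup above raises a `KeyError` if the training result is not yet present, for example while training is still in progress. A more tolerant sketch follows; `auc_by_version` is a hypothetical helper, and it assumes each entry in `modelVersionDetails` carries a `modelVersionNumber`.

```python
def auc_by_version(response):
    """Map each model version number to its AUC, or None if not available yet."""
    result = {}
    for detail in response.get("modelVersionDetails", []):
        metrics = detail.get("trainingResult", {}).get("trainingMetrics", {})
        result[detail.get("modelVersionNumber")] = metrics.get("auc")
    return result


# Example:
# print(auc_by_version(response))
```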

## Build a Detector

https://docs.aws.amazon.com/frauddetector/latest/ug/create-a-detector.html  

A detector contains the detection logic, such as the models and rules, for a particular event that you want to evaluate for fraud.    
Specify the detector that you want to use to evaluate your event.  
1. Create a detector. 
2. Create rules.  These are the conditions that determine the outcome.  
3. Create outcomes.  These are the results of a fraud prediction by the detector, for example one of *high_risk*, *medium_risk*, and *low_risk*. 
4. Create a detector version.  This specifies the model version and rules that will be used to run a fraud prediction.  

In [None]:
fraudDetector.put_detector (
detectorId = 'registration_detector',
eventTypeName = 'registrations'
)

### Activate the Model ###
Possible model status values are:
- TRAINING_IN_PROGRESS
- TRAINING_COMPLETE
- ACTIVATE_REQUESTED
- ACTIVATE_IN_PROGRESS
- ACTIVE
- INACTIVATE_REQUESTED
- INACTIVATE_IN_PROGRESS
- INACTIVE
- ERROR

In [None]:
fraudDetector.get_model_version(modelId="sample_fraud_detection_model"
, modelType="ONLINE_FRAUD_INSIGHTS"
, modelVersionNumber="1.0")['status']


Once the model reaches a state of `TRAINING_COMPLETE`, activate it:

In [None]:
fraudDetector.update_model_version_status (
modelId = 'sample_fraud_detection_model',
modelType = 'ONLINE_FRAUD_INSIGHTS',
modelVersionNumber = '1.0',
status = 'ACTIVE'
)

Check the model status and wait until it moves from `ACTIVATE_IN_PROGRESS` to `ACTIVE`

In [None]:
fraudDetector.get_model_version(modelId="sample_fraud_detection_model"
, modelType="ONLINE_FRAUD_INSIGHTS"
, modelVersionNumber="1.0")['status']

### Create Outcomes
An outcome is the result of a fraud prediction. Create an outcome for each possible fraud prediction result.

In [None]:
fraudDetector.put_outcome(
name = 'verify_customer',
description = 'this outcome initiates a verification workflow'
)

fraudDetector.put_outcome(
name = 'review',
description = 'this outcome sidelines event for review'
)

fraudDetector.put_outcome(
name = 'approve',
description = 'this outcome approves the event'
)

### Create Rules 
 A detector must have at least one associated rule
 https://docs.aws.amazon.com/frauddetector/latest/ug/create-a-rule.html  
 https://docs.aws.amazon.com/frauddetector/latest/ug/rule-language-reference.html  

In [None]:

fraudDetector.create_rule(
ruleId = 'high_fraud_risk',
detectorId = 'registration_detector',
expression = '$sample_fraud_detection_model_insightscore > 900',
language = 'DETECTORPL',
outcomes = ['verify_customer']
)

fraudDetector.create_rule(
ruleId = 'low_fraud_risk',
detectorId = 'registration_detector',
expression = '$sample_fraud_detection_model_insightscore <= 900 and $sample_fraud_detection_model_insightscore > 700',
language = 'DETECTORPL',
outcomes = ['review']
)
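
To see which scores the two rules cover, the expressions can be mirrored in plain Python. This is an illustration, not how Amazon Fraud Detector evaluates rules internally; it assumes rules are checked in order and the first match wins, as in the `FIRST_MATCHED` execution mode. `RULES` and `first_matched` are hypothetical names.

```python
# (rule id, predicate on the model score, outcomes) in evaluation order.
RULES = [
    ("high_fraud_risk", lambda score: score > 900, ["verify_customer"]),
    ("low_fraud_risk", lambda score: 700 < score <= 900, ["review"]),
]


def first_matched(score, rules=RULES):
    """Return (rule_id, outcomes) for the first matching rule, or None."""
    for rule_id, predicate, outcomes in rules:
        if predicate(score):
            return rule_id, outcomes
    return None


for score in (950, 900, 700):
    print(score, first_matched(score))
```

Scores of 700 or below match neither rule, so the `approve` outcome is never returned unless a third rule is added for it.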


### Create a Detector Version ###

In [None]:
fraudDetector.create_detector_version(
detectorId = 'registration_detector',
rules = [{
    'detectorId' : 'registration_detector',
    'ruleId' : 'high_fraud_risk',
    'ruleVersion' : '1'
},
{
    'detectorId' : 'registration_detector',
    'ruleId' : 'low_fraud_risk',
    'ruleVersion' : '1'
}
],
modelVersions = [{
    'modelId' : 'sample_fraud_detection_model',
    'modelType': 'ONLINE_FRAUD_INSIGHTS',
    'modelVersionNumber' : '1.0'
}],
ruleExecutionMode = 'FIRST_MATCHED'
)

## Make a Prediction

The response from the `get_event_prediction` method shows whether the event matched any of the rule conditions set earlier.  This can be used to trigger an action in response to suspected fraud.

In [None]:
response = fraudDetector.get_event_prediction(
    detectorId='registration_detector',
    detectorVersionId='1',
    eventId='1234',
    eventTypeName='registrations',
    entities=[
        {
            'entityType': 'customer',
            'entityId': 'unknown'
        },
    ],
    eventTimestamp='2021-11-13T12:18:21Z',
    eventVariables = {
            'email_address' : 'johndoe@exampledomain.com',
            'ip_address' : '1.2.3.4'
        }
)

In [None]:
response.keys()

In [None]:
response['ruleResults']

In [None]:
response['modelScores']
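
The two lookups above can be combined into one summary. A sketch, assuming each `modelScores` entry holds a `scores` mapping and each `ruleResults` entry holds an `outcomes` list; `summarize_prediction` is a hypothetical helper.

```python
def summarize_prediction(response):
    """Flatten model scores and matched-rule outcomes from a prediction response."""
    scores = {}
    for model in response.get("modelScores", []):
        scores.update(model.get("scores", {}))
    outcomes = []
    for rule in response.get("ruleResults", []):
        outcomes.extend(rule.get("outcomes", []))
    return {"scores": scores, "outcomes": outcomes}


# Example:
# summary = summarize_prediction(response)
# if "verify_customer" in summary["outcomes"]:
#     print("Start the verification workflow")
```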

In [None]:
response = fraudDetector.get_event_prediction(
    detectorId='registration_detector',
    detectorVersionId='1',
    eventId='1234',
    eventTypeName='registrations',
    entities=[
        {
            'entityType': 'customer',
            'entityId': 'unknown'
        },
    ],
    eventTimestamp='2021-11-11T12:18:21Z',
    eventVariables = {
            'email_address' : 'johndoe@gmail.com',
            'ip_address' : '82.24.61.42'
        }
)

In [None]:
response['ruleResults']

In [None]:
print(response)