## In this notebook we will use our anomaly detection model to infer whether an embedding vector represents a subgraph of normal or anomalous transactions.
---
**NOTE**: 

In real-life scenarios, financial transactions form dynamically evolving graphs. Performing anomaly detection inference on graph embeddings in a live transaction monitoring system first requires updating the graph and node representations after new transactions arrive. Recomputing the entire graph for every newly arrived transaction will lead to unacceptable delays and even monitoring system failure. The problem becomes more severe if a large number of updates happen in a short time window.

Contact us at Logical Clocks and we will help you set up an end-to-end, graph-based deep anomaly detection system for live transaction monitoring.

---
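
One common way to avoid full recomputation is to refresh only the nodes near a new transaction. The sketch below is a minimal illustration and not the method used in this notebook: it assumes node representations depend only on a `hops`-sized neighbourhood (as in message-passing models), and the helper names `add_transaction` and `affected_nodes` are hypothetical.

```python
from collections import defaultdict, deque

def add_transaction(adjacency, src, dst):
    """Insert an undirected edge for a new transaction between two accounts."""
    adjacency[src].add(dst)
    adjacency[dst].add(src)

def affected_nodes(adjacency, seeds, hops):
    """Breadth-first search: all nodes within `hops` edges of any seed node."""
    seen = set(seeds)
    frontier = deque((node, 0) for node in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the assumed receptive field
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen

# Toy graph (made-up accounts): a chain a - b - c - d - e
graph = defaultdict(set)
for u, v in [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]:
    add_transaction(graph, u, v)

# A new transaction arrives between "a" and a previously unseen account "f"
add_transaction(graph, "a", "f")
to_refresh = affected_nodes(graph, seeds=["a", "f"], hops=2)
print(sorted(to_refresh))
```

Only the returned nodes would need their representations recomputed; accounts further away keep their cached embeddings under the stated assumption.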

In [1]:
spark

Starting Spark application


ID,YARN Application ID,Kind,State,Spark UI,Driver log
60,application_1612947411090_0005,pyspark,idle,Link,Link


SparkSession available as 'spark'.
<pyspark.sql.session.SparkSession object at 0x7f36f025bf90>

In [2]:
import json
import numpy as np
from hops import model
from hops.model import Metric

# Query Model Repository for best anomaly detection model

In [3]:
MODEL_NAME="gansimaml"
EVALUATION_METRIC="metric"

In [4]:
best_model = model.get_best_model(MODEL_NAME, EVALUATION_METRIC, Metric.MIN)

In [5]:
print('Model name: ' + best_model['name'])
print('Model version: ' + str(best_model['version']))
print(best_model['metrics'])

Model name: gansimaml
Model version: 1
{'metric': '6.566298961639404'}

# Create Model Serving of Exported Model

In [6]:
from hops import serving

In [7]:
MODEL_NAME

'gansimaml'

In [8]:
best_model['version']

1

In [9]:
model_path="/Models/" + best_model['name']
model_path

'/Models/gansimaml'

In [10]:
# Create (or update) a TensorFlow serving for the best model version
model_path = "/Models/" + best_model['name']
response = serving.create_or_update(artifact_path=model_path, serving_name=MODEL_NAME, serving_type="TENSORFLOW",
                                    model_version=best_model['version'])

Creating a serving for model gansimaml ...
Serving for model gansimaml successfully created

In [11]:
# List all available servings in the project
for s in serving.get_all():
    print(s.name)

gansimaml

In [12]:
# Get serving status
serving.get_status(MODEL_NAME)

'Stopped'

# Start Model Serving Server

In [13]:
if serving.get_status(MODEL_NAME) == 'Stopped':
    serving.start(MODEL_NAME)

Starting serving with name: gansimaml...
Serving with name: gansimaml successfully started

In [14]:
import time
# Poll until the serving reports "Running"
while serving.get_status(MODEL_NAME) != "Running":
    time.sleep(5)
time.sleep(5)  # Give the server a few extra seconds to finish starting up
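
The polling loop above never exits if the serving fails to start. A minimal sketch of a bounded wait is shown here; `wait_until_running` and its parameters are hypothetical names, and the status source is passed in as a callable so the example runs without a Hopsworks cluster.

```python
import time

def wait_until_running(get_status, timeout_s=120.0, poll_s=5.0, sleep=time.sleep):
    """Poll `get_status()` until it returns "Running" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "Running":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"serving still in state {status!r} after {timeout_s}s")
        sleep(poll_s)

# Simulated status sequence; no real waiting is done in this demo
statuses = iter(["Stopped", "Starting", "Running"])
final_status = wait_until_running(lambda: next(statuses), sleep=lambda seconds: None)
print(final_status)

# In the notebook this could be called as (assuming `serving` and MODEL_NAME from earlier cells):
# wait_until_running(lambda: serving.get_status(MODEL_NAME))
```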

# Send Prediction Requests to the Served Model using Hopsworks REST API

In [16]:
import hsfs
# Create a connection
connection = hsfs.connection()
# Get the feature store handle for the project's feature store
fs = connection.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

In [17]:
eval_td = fs.get_training_dataset("eval_td", 1)

In [18]:
eval_td.show(5)

+------+--------------------+
|target|                 mms|
+------+--------------------+
|   0.0|[-0.4065766, -0.2...|
|   0.0|[-0.37156734, -0....|
|   0.0|[-0.3331596, -0.0...|
|   0.0|[-0.30843127, -0....|
|   0.0|[-0.26306894, -0....|
+------+--------------------+
only showing top 5 rows

In [20]:
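def model_server(model_name, embedding):
    """Send one embedding to the served model and return the 'outputs' field of the response."""
    data = {"signature_name": "serving_default", "inputs": [embedding]}
    return serving.make_inference_request(model_name, data)['outputs']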
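
To make the request body concrete, the sketch below builds the same dictionary without contacting the server and serializes it to JSON. The notebook converts each `mms` vector with `.tolist()` before sending it; the likely reason is that NumPy scalars are not JSON serializable, so the values must be plain Python floats. `build_payload` is a hypothetical helper and the embedding values are toy numbers, not rows from `eval_td`.

```python
import json

def build_payload(embedding, signature_name="serving_default"):
    """Build the request body used by model_server for a single embedding."""
    # Cast to plain floats so json.dumps accepts the values
    row = [float(value) for value in embedding]
    return {"signature_name": signature_name, "inputs": [row]}

payload = build_payload([0.1, 0.2, 0.3])  # toy values for illustration only
body = json.dumps(payload)
print(body)
```

The outer list in `"inputs"` is the batch dimension: it holds one row here because the notebook scores one embedding per request.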

In [22]:
# Score every row through the serving endpoint (one REST request per row)
scored_df = eval_td.read()\
                   .rdd.map(lambda x: (x.target, model_server(MODEL_NAME, np.array(x.mms).tolist())))\
                   .map(lambda f: (f[0], f[1][0]))\
                   .toDF(["target", "score"])
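
The cell above issues one REST request per row, which can be slow on large datasets. A possible alternative, sketched here under assumptions, is to group rows into batches inside each Spark partition. `score_partition` is a hypothetical helper, and `predict` stands for any function that takes a list of embeddings and returns one score per embedding; whether the served model accepts batched inputs this way is an assumption that should be checked against its signature.

```python
from itertools import islice

def score_partition(rows, predict, batch_size=64):
    """Yield (target, score) pairs, calling `predict` once per batch of rows."""
    rows = iter(rows)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        scores = predict([list(mms) for _, mms in batch])
        for (target, _), score in zip(batch, scores):
            yield (target, score)

# Local demo with a stand-in predictor (sum of the vector), not the real model
calls = []
def fake_predict(embeddings):
    calls.append(len(embeddings))
    return [sum(vector) for vector in embeddings]

demo_rows = [(0.0, [1.0, 2.0]), (1.0, [3.0, 4.0]), (0.0, [5.0, 6.0])]
scored = list(score_partition(demo_rows, fake_predict, batch_size=2))
print(scored)

# With Spark this could be applied per partition, for example:
# eval_td.read().rdd.map(lambda r: (r.target, r.mms)) \
#        .mapPartitions(lambda it: score_partition(it, predict)) \
#        .toDF(["target", "score"])
```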

In [23]:
scored_df.show()

+------+----------+
|target|     score|
+------+----------+
|   0.0|16.3264771|
|   0.0|32.0050278|
|   0.0|8.96398163|
|   0.0|9.12968445|
|   0.0|10.6783476|
|   0.0|3.54688883|
|   0.0|2.80013108|
|   0.0|5.92823219|
|   0.0|6.65167141|
|   0.0|7.45862341|
|   0.0| 3.1319952|
|   0.0|17.8965607|
|   0.0|16.2934685|
|   0.0|7.09727764|
|   0.0|7.13687325|
|   0.0|7.36101723|
|   0.0|   6.18361|
|   0.0|5.83205318|
|   0.0|7.72574568|
|   0.0|13.0513754|
+------+----------+
only showing top 20 rows

In [25]:
scored_df.where(scored_df.target == 1.0).show()

+------+----------+
|target|     score|
+------+----------+
|   1.0|   9.60008|
|   1.0|13.9220219|
|   1.0|12.3727131|
|   1.0|8.70950413|
|   1.0|11.6165619|
|   1.0|15.2314949|
|   1.0|11.2959318|
|   1.0|11.9945049|
|   1.0|13.9294739|
|   1.0|9.27066612|
|   1.0|18.3138161|
|   1.0|17.8045673|
|   1.0|7.22413254|
|   1.0|26.7689133|
|   1.0|11.7971725|
|   1.0|12.8136816|
|   1.0|10.2775126|
|   1.0|8.90372562|
|   1.0|7.90231133|
|   1.0|10.8522491|
+------+----------+
only showing top 20 rows

In [26]:
scored_df.where(scored_df.target == 0.0).show()

+------+----------+
|target|     score|
+------+----------+
|   0.0|16.3264771|
|   0.0|32.0050278|
|   0.0|8.96398163|
|   0.0|9.12968445|
|   0.0|10.6783476|
|   0.0|3.54688883|
|   0.0|2.80013108|
|   0.0|5.92823219|
|   0.0|6.65167141|
|   0.0|7.45862341|
|   0.0| 3.1319952|
|   0.0|17.8965607|
|   0.0|16.2934685|
|   0.0|7.09727764|
|   0.0|7.13687325|
|   0.0|7.36101723|
|   0.0|   6.18361|
|   0.0|5.83205318|
|   0.0|7.72574568|
|   0.0|13.0513754|
+------+----------+
only showing top 20 rows
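
The two tables above can be compared numerically. The sketch below computes a pairwise AUC: the fraction of (anomalous, normal) pairs in which the anomalous row has the higher score. It assumes that a higher score means "more anomalous", which the notebook does not state, and it uses only the 20 rows printed for each class, so the result says nothing definitive about the full dataset. `pairwise_auc` is a hypothetical helper.

```python
def pairwise_auc(normal_scores, anomalous_scores):
    """Fraction of (anomalous, normal) pairs where the anomalous score is higher.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    wins = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous_scores) * len(normal_scores))

# Scores copied from the printed output above (target == 0.0)
normal = [16.3264771, 32.0050278, 8.96398163, 9.12968445, 10.6783476,
          3.54688883, 2.80013108, 5.92823219, 6.65167141, 7.45862341,
          3.1319952, 17.8965607, 16.2934685, 7.09727764, 7.13687325,
          7.36101723, 6.18361, 5.83205318, 7.72574568, 13.0513754]
# Scores copied from the printed output above (target == 1.0)
anomalous = [9.60008, 13.9220219, 12.3727131, 8.70950413, 11.6165619,
             15.2314949, 11.2959318, 11.9945049, 13.9294739, 9.27066612,
             18.3138161, 17.8045673, 7.22413254, 26.7689133, 11.7971725,
             12.8136816, 10.2775126, 8.90372562, 7.90231133, 10.8522491]

auc = pairwise_auc(normal, anomalous)
print(f"Pairwise AUC on the displayed rows: {auc:.2f}")
```

A value near 0.5 would mean the scores do not separate the classes, while a value near 1.0 would mean anomalous rows almost always score higher.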