# Model Fingerprinting 

Model fingerprinting is a technique for deriving a fingerprint from a model. The fingerprint can be used to determine whether a given input belongs to the set of reasonable inputs for the model or is an outlier to that dataset.
**The fingerprint is calculated with a trained model.**
## Two methods: layered and non-layered models
- Layered
  - The first component takes the activations of the training data at each layer and trains a separate autoencoder for each layer.
  - The second component is the distribution of reconstruction errors, calculated by running the training-data activations through the autoencoder and taking the root-mean-square difference between the input activations and the output values.
- Non-layered
  - The first component is an autoencoder trained directly on the training data.
  - The second component is the distribution of reconstruction errors, calculated by running the training data through the autoencoder and taking the root-mean-square difference between the input training data and the output values.
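
The two components described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the service's implementation: a PCA-style linear autoencoder stands in for a trained neural autoencoder, the data is synthetic, and the helper names `fit_linear_autoencoder` and `reconstruction_rmse` are hypothetical. For the layered method, the per-layer activations would take the place of `train`.

```python
import numpy as np

def fit_linear_autoencoder(data, n_components):
    """Fit a PCA-style linear autoencoder (a stand-in for a trained neural autoencoder)."""
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_rmse(data, mean, components):
    """Per-sample root-mean-square difference between the input and its reconstruction."""
    encoded = (data - mean) @ components.T   # project onto the learned subspace
    decoded = encoded @ components + mean    # map back to the input space
    return np.sqrt(np.mean((data - decoded) ** 2, axis=1))

rng = np.random.default_rng(0)

# Synthetic "training data": 500 samples lying near a 3-dimensional subspace of a 20-dimensional space.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
train = latent @ mixing + 0.05 * rng.normal(size=(500, 20))

# Component 1: the autoencoder. Component 2: the distribution of training reconstruction errors.
mean, components = fit_linear_autoencoder(train, n_components=3)
train_errors = reconstruction_rmse(train, mean, components)

# Unstructured noise does not lie near the learned subspace, so it reconstructs poorly.
outliers = 3.0 * rng.normal(size=(100, 20))
outlier_errors = reconstruction_rmse(outliers, mean, components)

print(train_errors.mean(), outlier_errors.mean())
```

Inputs that resemble the training data produce errors inside the training distribution, while unrelated inputs produce much larger errors, which is what makes the error distribution usable as a fingerprint.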


# Installation & REST service startup

To install from Artifactory, run `pip install edgeai_model_mgmt` 

After successful installation, run the command `python -m modelmgmt.apis.rest_services`

A REST server should start up and report the IP address and port number it uses to accept requests.

# Import libraries for calling APIs and data serialization

In [None]:
import requests
import pickle
import json

# Fingerprint Service Requirements 

The Edge AI Model Management Fingerprinting service expects a pretrained model and the original training dataset in order to generate a fingerprint. If a layered model is not available, you can provide only the training dataset, and the fingerprint is generated from the data alone.

We will train a Keras MNIST model to use for this example.


# Prepare MNIST Dataset 

 

In [None]:
from keras import utils
from keras.datasets import mnist, fashion_mnist

img_rows, img_cols = 28, 28
(mnist_x_train, mnist_y_train), _ = mnist.load_data()

x = mnist_x_train.astype('float32').reshape(mnist_x_train.shape[0], img_rows, img_cols, 1) / 255.
y = utils.to_categorical(mnist_y_train, 10)

# Build Model 

In [None]:
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1), padding='same'))
model.add(layers.MaxPool2D(2, 2))
model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPool2D(2, 2))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dropout(0.4))
model.add(layers.Dense(10))
model.add(layers.Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train Model

In [None]:
model.fit(x, y, batch_size=128, epochs=3)

# Save newly trained model and dataset to desired path

In [None]:
mfilename = 'model.h5'
dsfilename = 'mnist.pickle'

# Save the trained model in HDF5 format and the training inputs as a pickled NumPy array.
model.save(mfilename)
with open(dsfilename, 'wb') as file:
    pickle.dump(x, file)

# Invoke REST service endpoint

Retrieve the REST endpoint (IP and port) as reported when the REST server was started and invoke the *generate_fingerprint* service.

The generate_fingerprint service accepts multipart/form-data requests with the following arguments.
- **dataset(required)**: pickled file of the NumPy array dataset used during model training. Type=file
- **model**: saved model file that will be fingerprinted. Type=file
- **model_type**: type of model to fingerprint; 'keras' or 'pytorch'. Type=string
- **model_def**: required if model_type is pytorch; the model definition file. Type=file
- **partial_activations**: percentage of partial activations to retain when fingerprinting. Type=float

In [None]:
url = 'http://10.0.0.101:5000/generate_fingerprint'  # your URL may differ; change as necessary

data = {'model_type': 'keras', 'partial_activations': 0.2}

# Open both files in a context manager so the handles are closed after the upload.
with open(dsfilename, 'rb') as ds_file, open(mfilename, 'rb') as model_file:
    multiple_files = [
        ('dataset', ('mnist.pickle', ds_file, 'file/pickle')),
        ('model', ('mnist.h5', model_file, 'file/h5'))]
    r = requests.post(url, data=data, files=multiple_files)

# Check response code and save zip file returned from server (generated fingerprint)

The saved fingerprint can be used for outlier detection, that is, to determine whether a given input is an outlier to the trained model.

In [None]:
r

In [None]:
saved_fingerprint = 'keras_fingerprint.zip'
# Stream the response body to disk in chunks.
with open(saved_fingerprint, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=128):
        fd.write(chunk)
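
Before reusing the saved file, it can help to confirm that it really is a zip archive, since an error response from the server would otherwise be written to disk unnoticed. The sketch below uses only the standard-library `zipfile` module. The helper name `list_fingerprint_archive` and the member name `layer_0/errors.txt` are made up for illustration; the actual layout of a fingerprint archive is not documented here.

```python
import io
import zipfile

def list_fingerprint_archive(source):
    """Return the member names of a zip archive, or raise ValueError if it is not a zip."""
    if not zipfile.is_zipfile(source):
        raise ValueError("not a valid zip archive")
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()

# Build a small in-memory archive (hypothetical contents) so the sketch runs without the server.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("layer_0/errors.txt", "0.01\n0.02\n")

print(list_fingerprint_archive(buffer))  # → ['layer_0/errors.txt']
```

The same helper accepts a file path, so in this notebook it could be called as `list_fingerprint_archive(saved_fingerprint)` after the download completes.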


# Utilize generated fingerprint for outlier detection

After a fingerprint is successfully generated, this fingerprint can be used to determine if any given input belongs to a dataset of reasonable input to the model, or if it is an outlier to the dataset.

We can use the **outlier_detection** service to make this determination for a provided dataset against a generated fingerprint.
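
As a rough intuition for how a distribution of reconstruction errors can yield an outlier decision, the sketch below flags a sample when its error exceeds a high percentile of the training errors and reports the flagged fraction. This is an assumption made for illustration: the scoring rule, the 95th-percentile threshold, and the function name `outlier_fraction` are hypothetical and may differ from what the service computes.

```python
import numpy as np

def outlier_fraction(train_errors, new_errors, percentile=95.0):
    """Fraction of new samples whose error exceeds a percentile of the training errors."""
    threshold = np.percentile(train_errors, percentile)
    return float(np.mean(np.asarray(new_errors) > threshold))

# Hypothetical reconstruction errors, for illustration only.
train_errors = np.linspace(0.0, 1.0, 101)            # 0.00, 0.01, ..., 1.00
in_distribution = np.array([0.1, 0.5, 0.9])          # all below the 0.95 threshold
out_of_distribution = np.array([1.5, 2.0, 0.2, 3.0]) # three of four above the threshold

print(outlier_fraction(train_errors, in_distribution))      # → 0.0
print(outlier_fraction(train_errors, out_of_distribution))  # → 0.75
```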

# Load the Fashion-MNIST dataset to use as input for outlier detection

In [None]:
from keras.datasets import fashion_mnist

(fm_x, fm_y), _ = fashion_mnist.load_data()
fm_x = fm_x.astype('float32').reshape(fm_x.shape[0], 28, 28, 1) / 255.
fm_x = fm_x[:500]

with open('fmnist.pickle', 'wb') as file:
    pickle.dump(fm_x, file)

# Invoke REST service endpoint

Retrieve the REST endpoint (IP and Port) as reported when the REST server was started and invoke the outlier_detection service.

The outlier_detection service accepts multipart/form-data requests with the following arguments.

- **fingerprint(required)**: zipped directory containing the fingerprint. Type=file
- **num_layers(required)**: the number of layers in the model fingerprint. Type=int
- **dataset(required)**: pickled file of the NumPy array dataset to test for outliers. Type=file
- **model**: saved model file that was used to generate the fingerprint. Type=file
- **model_type**: type of model fingerprinted; 'keras' or 'pytorch'. Type=string
- **model_def**: required if model_type is pytorch; the model definition file. Type=file
- **activation_mappings**: indicates whether activation mapping was used for the fingerprint. Type=bool

In [None]:
num_layers = 8
activation_mapping = True

url = 'http://10.0.0.101:8443/outlier_detection'  # your URL may differ; use the IP and port reported by the server

payload = {'num_layers': num_layers, 'model_type': 'keras', 'activation_mapping': activation_mapping}

# The model was saved to mfilename ('model.h5') earlier, so open that path rather than 'mnist.h5'.
with open('fmnist.pickle', 'rb') as ds_file, \
        open(mfilename, 'rb') as model_file, \
        open(saved_fingerprint, 'rb') as fp_file:
    multiple_files = [
        ('dataset', ('fmnist.pickle', ds_file, 'file/pickle')),
        ('model', ('mnist.h5', model_file, 'file/h5')),
        ('fingerprint', (saved_fingerprint, fp_file, 'file/zip'))]
    r = requests.post(url, data=payload, files=multiple_files)

# Retrieve outlier score returned from server; should be < 10%

In [None]:
score = r.json()['score']
print(score)