# Exercise: Test of blackbox AI algorithm
 Thomas Liebig, 2023

## Recall two-sided tests
In the lecture you learned about statistical tests as a method for testing AI algorithms. To test a blackbox model, the distribution of the expected output must be compared with the distribution of the model's actual output. Thus, two probability distributions need to be compared.
The two-sample Kolmogorov-Smirnov (KS) test is one way to perform such a comparison. Below you find details on comparing two distributions with the KS test.

In [24]:
from scipy.stats import ks_2samp

from numpy.random import seed
from numpy.random import randn
from numpy.random import lognormal

# reproducible distributions
seed(0)

# generate two distributions
dist1 = randn(100)
dist2 = lognormal(3, 1, 100)

p=ks_2samp(dist1, dist2)[1]
alpha=.05

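# a small p-value is evidence against the null hypothesis "same distribution";
# a large p-value does not prove it, it only means the test found no evidence against it
if p < alpha:
    print('null hypothesis rejected: both samples not from the same distribution.')
else:
    print('null hypothesis not rejected: no evidence that the samples come from different distributions.')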

null hypothesis rejected: both samples not from the same distribution.
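
For reference, the KS statistic is the largest vertical distance between the two empirical distribution functions. A minimal sketch of that computation follows; the helper name `ks_statistic` and the sample names `a` and `b` are illustrative and not part of the exercise, and the result is checked against the statistic reported by `ks_2samp`.

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a = np.sort(a)
    b = np.sort(b)
    # the empirical CDFs only change at observed values, so evaluating there suffices
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

rng = np.random.default_rng(0)
a = rng.normal(size=100)            # illustrative sample 1
b = rng.lognormal(3, 1, size=100)   # illustrative sample 2

d_manual = ks_statistic(a, b)
d_scipy = ks_2samp(a, b).statistic
print("manual:", d_manual, "scipy:", d_scipy)
```

The p-value returned by `ks_2samp` is derived from this statistic and the two sample sizes.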


## Task Description
Suppose an unknown supervised model `model_foo` is given as $f:\mathbb{R}^5\rightarrow \mathbb{R}$. It is applied row-wise, so the two points in the example below yield two output values.

In [146]:
import numpy as np
import types
import marshal

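# construct the secret function from marshalled bytecode
# (the marshal format is specific to the Python version, so loading may fail on a different interpreter)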
code_string= bytes([227,   1,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   8,   0,   0,   0,   8,   0,   0,   0,  67,   0,   0,
                      0, 243, 168,   0,   0,   0, 116,   0, 160,   1, 103,   0, 100,   1, 162,   1, 103,   0, 100,   2, 162,   1, 103,   0,
                    100,   3, 162,   1, 103,   0, 100,   4, 162,   1, 103,   0, 100,   5, 162,   1, 103,   5, 161,   1, 125,   1, 116,   0,
                    160,   2, 103,   0, 100,   6, 162,   1, 161,   1, 125,   2, 116,   0, 160,   3, 124,   0, 124,   1, 161,   2, 124,   2,
                    107,   4, 100,   7,  20,   0, 125,   3, 116,   0, 160,   1, 103,   0, 100,   8, 162,   1, 103,   0, 100,   9, 162,   1,
                    103,   0, 100,  10, 162,   1, 103,   0, 100,  11, 162,   1, 103,   4, 161,   1, 125,   4, 116,   0, 160,   2, 103,   0,
                    100,  12, 162,   1, 161,   1, 125,   5, 116,   0, 160,   2, 103,   0, 100,  13, 162,   1, 161,   1, 125,   6, 116,   0,
                    160,   3, 124,   3, 124,   4, 161,   2, 124,   5, 107,   2, 100,   7,  20,   0, 125,   7, 116,   0, 160,   3, 124,   7,
                    124,   6, 161,   2,  83,   0, 169,  14,  78,  41,   4, 233,   0,   0,   0,   0, 114,   3,   0,   0,   0, 233, 255, 255,
                    255, 255, 114,   3,   0,   0,   0,  41,   4, 114,   3,   0,   0,   0, 233,   1,   0,   0,   0, 114,   3,   0,   0,   0,
                    114,   3,   0,   0,   0,  41,   4, 114,   5,   0,   0,   0, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0, 114,   3,
                      0,   0,   0,  41,   4, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0, 114,   5,   0,   0,
                      0,  41,   4, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0,  41,
                      4, 233,   3,   0,   0,   0, 233,   5,   0,   0,   0, 103,  51,  51,  51,  51,  51,  51, 243, 191, 114,   6,   0,   0,
                      0, 114,   5,   0,   0,   0,  41,   5, 114,   5,   0,   0,   0, 114,   5,   0,   0,   0, 114,   5,   0,   0,   0, 114,
                      4,   0,   0,   0, 114,   4,   0,   0,   0,  41,   5, 114,   5,   0,   0,   0, 114,   4,   0,   0,   0, 114,   4,   0,
                      0,   0, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0,  41,   5, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0,
                    114,   3,   0,   0,   0, 114,   5,   0,   0,   0, 114,   4,   0,   0,   0,  41,   5, 114,   3,   0,   0,   0, 114,   5,
                      0,   0,   0, 114,   4,   0,   0,   0, 114,   3,   0,   0,   0, 114,   3,   0,   0,   0,  41,   5, 233,   2,   0,   0,
                      0, 114,   8,   0,   0,   0, 114,   5,   0,   0,   0, 114,   5,   0,   0,   0, 114,   3,   0,   0,   0,  41,   5, 233,
                     10,   0,   0,   0, 233,  40,   0,   0,   0, 233,  50,   0,   0,   0, 233,  30,   0,   0,   0, 233,  20,   0,   0,   0,
                    169,   4, 218,   2, 110, 112, 218,   6, 109,  97, 116, 114, 105, 120, 218,   5,  97, 114, 114,  97, 121, 218,   6, 109,
                     97, 116, 109, 117, 108, 169,   8, 218,   1,  88, 218,   1,  65, 218,   1,  66, 218,  10, 105, 110, 112, 117, 116,  95,
                    112,  97, 116, 104, 218,   1,  67, 218,   1,  68, 218,   1,  69, 218,  11, 111, 117, 116, 112, 117, 116,  95, 112,  97,
                    116, 104, 169,   0, 114,  28,   0,   0,   0, 250,  30,  60, 105, 112, 121, 116, 104, 111, 110,  45, 105, 110, 112, 117,
                    116,  45,  54,  45,  54,  49,  57,  54, 102,  98,  50,  49,  53,  50, 101,  57,  62, 218,   9, 109, 111, 100, 101, 108,
                     95, 102, 111, 111,   1,   0,   0,   0, 243,  16,   0,   0,   0,  40,   1,  14,   1,  20,   1,  34,   1,  14,   1,  14,
                      1,  20,   1,  12,   1])

code = marshal.loads(code_string)
model_foo=types.FunctionType(code, globals(), "model_foo")

# define two points in feature space
D = np.matrix([[1.5,2,3.5,5,4],[1.5,2,3,1,4]])
# apply
model_foo(D)

matrix([[40, 20]])

## Tasks

* draw random samples from the feature space and observe the behaviour of the output
* formulate a hypothesis about the blackbox function
* test your hypothesis

# Solution

In [139]:
# sample data in R^5
dim=5
n=1000 # number of training samples
X=(np.random.rand(n,dim)-np.random.rand(n,dim))*70
Y=np.asarray(np.transpose(model_foo(X)))

Hypothesis: it is a Decision Tree
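
One way to arrive at such a hypothesis is to check whether the output is piecewise constant with jumps at fixed values of single features, which is what axis-parallel tree splits produce. The sketch below sweeps one feature at a time while holding the others at 0. It is an illustration under assumptions: `toy_blackbox` is a hypothetical stand-in (not `model_foo`), and `find_jumps`, the sweep range, and the base point are choices made here, not part of the exercise.

```python
import numpy as np

def toy_blackbox(X):
    # hypothetical stand-in for illustration only, NOT model_foo:
    # the label depends on a single threshold on feature 1
    return np.where(X[:, 1] > 0.55, 40, 20)

def find_jumps(f, dim=5, low=-70.0, high=70.0, steps=1401):
    """Sweep each feature over [low, high] with the others fixed at 0 and
    return, per feature, the grid values at which the output changes."""
    grid = np.linspace(low, high, steps)
    jumps = {}
    for k in range(dim):
        X = np.zeros((steps, dim))
        X[:, k] = grid
        y = np.asarray(f(X)).ravel()
        # keep the grid value right after each change of the output
        jumps[k] = grid[1:][y[1:] != y[:-1]]
    return jumps

jumps = find_jumps(toy_blackbox)
for k, values in jumps.items():
    print(f"feature {k}: output changes near {values}")
```

A sweep through a single base point only reveals the splits that are active along that line, so it suggests a hypothesis rather than confirming it; random sampling and a surrogate model cover the rest of the feature space.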

### Train surrogate model

In [140]:
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree

model_dt = DecisionTreeClassifier()

# train the decision tree classifier on the blackbox outputs
model_dt = model_dt.fit(X,Y)

# print the learned tree as text
print("Decision Tree")
print(tree.export_text(model_dt))

Decision Tree
|--- feature_2 <= 3.03
|   |--- feature_0 <= 1.24
|   |   |--- class: 30
|   |--- feature_0 >  1.24
|   |   |--- class: 20
|--- feature_2 >  3.03
|   |--- feature_1 <= 4.98
|   |   |--- feature_3 <= 2.69
|   |   |   |--- class: 50
|   |   |--- feature_3 >  2.69
|   |   |   |--- class: 40
|   |--- feature_1 >  4.98
|   |   |--- class: 10



### Compare Hypothesis

In [127]:
from numpy.random import seed
from scipy.stats import ks_2samp

# reproducible distributions
seed(0)

# sample data in R^5
dim=5
n=10000 # number of test samples
X=(np.random.rand(n,dim)-np.random.rand(n,dim))*70

# apply models
dist1=np.squeeze(np.asarray(model_foo(X))) # blackbox model
dist2=(np.array(model_dt.predict(X))) # test model

# KS test
p=ks_2samp(dist1, dist2)[1]
alpha=.05

print('p-value = '+str(p))

if p < alpha:
    print('null hypothesis rejected: both samples not from the same distribution.')
else:
    print('null hypothesis not rejected: no evidence that the samples come from different distributions.')

p-value = 1.0
null hypothesis not rejected: no evidence that the samples come from different distributions.
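
A caveat on interpreting this result: the KS test compares only the overall distributions of the two output vectors, not whether the models agree on each individual input. A minimal sketch with two made-up label vectors (`labels` and `shifted` are illustrative, not outputs of `model_foo`) shows the difference.

```python
import numpy as np
from scipy.stats import ks_2samp

# illustrative label vectors: the same labels, but in a different order
labels = np.repeat([10, 20, 30, 40, 50], 200)
shifted = np.roll(labels, 200)   # every position now carries a different label

p = ks_2samp(labels, shifted).pvalue
agreement = np.mean(labels == shifted)

print("p-value   =", p)
print("agreement =", agreement)
```

Both vectors contain exactly the same labels, so the KS statistic is 0 and the test cannot reject, even though the two vectors disagree at every position. A pointwise measure on the same inputs, such as `np.mean(dist1 == dist2)`, complements the KS test when checking a surrogate against a blackbox model.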


### Code inspection

#### Disassembly

In [13]:
import dis
dis.dis(model_foo.__code__)  # dis.dis prints the listing itself and returns None

  2           0 LOAD_GLOBAL              0 (np)
              2 LOAD_METHOD              1 (matrix)
              4 BUILD_LIST               0
              6 LOAD_CONST               1 ((0, 0, -1, 0))
              8 LIST_EXTEND              1
             10 BUILD_LIST               0
             12 LOAD_CONST               2 ((0, 1, 0, 0))
             14 LIST_EXTEND              1
             16 BUILD_LIST               0
             18 LOAD_CONST               3 ((1, 0, 0, 0))
             20 LIST_EXTEND              1
             22 BUILD_LIST               0
             24 LOAD_CONST               4 ((0, 0, 0, 1))
             26 LIST_EXTEND              1
             28 BUILD_LIST               0
             30 LOAD_CONST               5 ((0, 0, 0, 0))
             32 LIST_EXTEND              1
             34 BUILD_LIST               5
             36 CALL_METHOD              1
             38 STORE_FAST               1 (A)

  3          40 LOAD_GLOBAL              0 (n

#### Decompile

*Note:* the following code does not work in JupyterLite because `decompyle3` is not available there.
```python
import decompyle3
code=marshal.loads(code_string)
decompyle3.code_deparse(code)
```
It would produce this output:
```
A = np.matrix([[0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
B = np.array([3, 5, -1.2, 3])
input_path = (np.matmul(X, A) > B) * 1
C = np.matrix([[1,1,1,-1,-1], [1,-1,-1,0,0], [0,0,0,1,-1], [0,1,-1,0,0]])
D = np.array([2,2,1,1,0])
E = np.array([10,40,50,30,20])
output_path = (np.matmul(input_path, C) == D) * 1
return np.matmul(output_path, E)
```

In [None]:
# for comparison: the original source code of model_foo

import numpy as np

def model_foo(X):
    A=np.matrix([[0,0,-1,0],[0,1,0,0],[1,0,0,0],[0,0,0,1],[0,0,0,0]])
    B=np.array([3,5,-1.2,3])
    input_path = (np.matmul(X,A)>B)*1
    C=np.matrix([[1,1,1,-1,-1],[1,-1,-1,0,0],[0,0,0,1,-1],[0,1,-1,0,0]])
    D=np.array([2,2,1,1,0])
    E=np.array([10,40,50,30,20])
    output_path=(np.matmul(input_path,C)==D)*1
    return np.matmul(output_path,E)


### Understanding this Decision Tree Implementation

* this is the matrix notation of a decision tree, proposed in the [Hummingbird paper](https://web.eecs.umich.edu/~mosharaf/Readings/Hummingbird.pdf)
  <div class="alert alert-block alert-info"> Nakandala, S., Saur, K., Yu, G. I., Karanasos, K., Curino, C., Weimer, M., & Interlandi, M. (2020, November). A tensor compiler for unified machine learning prediction serving. In Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation (pp. 899-917).</div>
* compare the GitHub sources: https://github.com/microsoft/hummingbird
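
To see how the matrices encode the tree, the following sketch traces the first example point from the task description through the four steps of `model_foo`. It uses plain `np.array` instead of `np.matrix`, and the intermediate names `node_values` and `leaf_scores` are illustrative and do not appear in the original source.

```python
import numpy as np

A = np.array([[0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
B = np.array([3, 5, -1.2, 3])
C = np.array([[1, 1, 1, -1, -1], [1, -1, -1, 0, 0], [0, 0, 0, 1, -1], [0, 1, -1, 0, 0]])
D = np.array([2, 2, 1, 1, 0])
E = np.array([10, 40, 50, 30, 20])

x = np.array([1.5, 2, 3.5, 5, 4])     # first example point from the task description

node_values = x @ A                   # value tested at each internal node: [x2, x1, -x0, x3]
input_path = (node_values > B) * 1    # 1 where the node's test is true
leaf_scores = input_path @ C          # per leaf: true tests on +1 nodes minus true tests on -1 nodes
output_path = (leaf_scores == D) * 1  # one-hot vector marking the leaf that was reached
label = output_path @ E               # label stored at that leaf

print("input_path :", input_path)
print("leaf_scores:", leaf_scores)
print("output_path:", output_path)
print("label      :", label)
```

Read this way, each column of `A` picks the feature tested at one internal node (the third column negates `x0`, which turns `x0 < 1.2` into a `>` test), `B` holds the thresholds, each column of `C` marks with +1 or -1 which tests must be true or false on the path to one leaf, `D` counts the +1 entries of that column, and `E` holds the leaf labels.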