# Bernoulli Naive Bayes - Transparency

Data source: https://archive.ics.uci.edu/ml/datasets/Tic-Tac-Toe+Endgame

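The dataset contains 958 objects described by 9 features. Because the file has no header row and Step 1 loads it with the default `read_csv` settings, the first record is consumed as column names, so 957 objects are used below.
This database encodes the complete set of possible board configurations at the end of tic-tac-toe games, where "x" is assumed to have played first. The target concept is "win for x" (i.e., true when "x" has one of 8 possible ways to create a "three-in-a-row").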

Attribute Information: (x=player x has taken, o=player o has taken, b=blank)
1. top-left-square: {x,o,b}
2. top-middle-square: {x,o,b}
3. top-right-square: {x,o,b}
4. middle-left-square: {x,o,b}
5. middle-middle-square: {x,o,b}
6. middle-right-square: {x,o,b}
7. bottom-left-square: {x,o,b}
8. bottom-middle-square: {x,o,b}
9. bottom-right-square: {x,o,b}
10. Class: {positive,negative}
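
The target concept can be made concrete with a small check. The sketch below assumes a board is a sequence of nine characters in the attribute order listed above; `WIN_LINES` and `x_wins` are illustrative names and are not part of the dataset or of this notebook. The first board is the record that ends up as column names in Step 2 (x,x,x,x,o,o,x,o,o, labelled positive); the second is a made-up drawn board.

```python
# The 8 winning lines as index triples, squares numbered 0..8 row by row
# (0 = top-left, 8 = bottom-right), matching the attribute order above.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def x_wins(board):
    """Return True if 'x' occupies all three squares of any winning line."""
    return any(all(board[i] == "x" for i in line) for line in WIN_LINES)

print(x_wins("xxxxooxoo"))  # top row (and left column) are all 'x'
print(x_wins("xoxxoooxx"))  # hypothetical board with no complete line for 'x'
```

Every object in the dataset is a finished game, so a negative label means either a win for "o" or a draw.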


Note: class distributions are reported in the order [negative, positive].

## Step 1: Import data

In [53]:
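import pandas as pd
import numpy as np
df = pd.read_csv('tic-tac-toe.data')  # the file has no header row, so the first record becomes the column names ("x", "x.1", ..., "positive")
df = df.rename(columns={"positive": "class"})  # the last column holds the class label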

## Step 2: Binarize features

In [54]:
df_new = pd.get_dummies(
    df,
    columns=["x", "x.1", "x.2", "x.3", "o", "o.1", "x.4", "o.2", "o.3"],
    prefix=["top-left-square", "top-middle-square", "top-right-square",
            "middle-left-square", "middle-middle-square", "middle-right-square",
            "bottom-left-square", "bottom-middle-square", "bottom-right-square"],
    dtype=int,  # keep 0/1 integers; newer pandas versions return booleans by default
)

df_new["class"] = df_new["class"].map({"positive": 1, "negative": 0})  # encode the target as 1/0
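
The effect of this step can be illustrated without pandas. The sketch below is a hand-rolled equivalent, not the `get_dummies` implementation: each square becomes three 0/1 indicators, giving 27 features. It assumes the indicator columns are ordered `b`, `o`, `x` within each square (alphabetical); `PREFIXES`, `VALUES`, and `one_hot_board` are illustrative names.

```python
PREFIXES = [
    "top-left-square", "top-middle-square", "top-right-square",
    "middle-left-square", "middle-middle-square", "middle-right-square",
    "bottom-left-square", "bottom-middle-square", "bottom-right-square",
]
VALUES = ["b", "o", "x"]  # assumed column order within each square

def one_hot_board(board):
    """Map 9 square values to 27 (feature name, 0/1) pairs, three per square."""
    return [
        (f"{prefix}_{value}", int(square == value))
        for prefix, square in zip(PREFIXES, board)
        for value in VALUES
    ]

encoded = one_hot_board("xxxxooxoo")
print(len(encoded))   # number of binary features
print(encoded[:3])    # the three indicators for the top-left square
```

Exactly one indicator per square is 1, so the three indicators of a square are strongly dependent, which is at odds with the independence assumption of naive Bayes.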

## Step 3: Split data into train and test sets

In [55]:
from sklearn.model_selection import train_test_split

x = df_new.iloc[:, 1:28].values  # the 27 one-hot feature columns
y = df_new.iloc[:, 0:1].values  # the class column, shape (n, 1)
feature_name = list(df_new.drop(columns='class'))  # feature names in column order
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.33, random_state=0)


## Step 4: Fit BernoulliNB to the data

In [56]:
from sklearn.naive_bayes import BernoulliNB

nb = BernoulliNB()
nb.fit(X_train, Y_train.ravel())
y_pred = nb.predict(X_test)
print("The accuracy_score of train set", nb.score(X_train,Y_train))
print("The accuracy_score of test set", nb.score(X_test,Y_test))

The accuracy_score of train set 0.700468018721
The accuracy_score of test set 0.721518987342
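
For reference, a Bernoulli naive Bayes model scores each class by adding the class log-prior to one log-likelihood term per feature, and features with value 0 contribute as well. The sketch below is a minimal pure-Python illustration on made-up parameters, not scikit-learn's implementation; `bernoulli_nb_posterior`, `priors`, and `probs` are hypothetical names and numbers.

```python
import math

def bernoulli_nb_posterior(x, priors, probs):
    """Posterior over classes for a binary feature vector x.

    priors[c]   = P(class c)
    probs[c][i] = P(feature i = 1 | class c)
    """
    scores = []
    for prior, p in zip(priors, probs):
        score = math.log(prior)
        for xi, pi in zip(x, p):
            # A present feature contributes log p, an absent one log(1 - p).
            score += math.log(pi) if xi else math.log(1.0 - pi)
        scores.append(score)
    # Normalise in a numerically stable way (subtract the maximum first).
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

priors = [0.4, 0.6]                 # [negative, positive], made-up values
probs = [[0.2, 0.5], [0.6, 0.5]]    # made-up P(feature = 1 | class)
posterior = bernoulli_nb_posterior([1, 0], priors, probs)
print(posterior)
```

With these toy numbers the unnormalised class scores are 0.4 * 0.2 * 0.5 = 0.04 and 0.6 * 0.6 * 0.5 = 0.18, so the positive class receives 0.18 / 0.22 of the probability mass.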


## Step 5: Get feature distribution 

In [57]:
feature_yes_prob = np.exp(nb.feature_log_prob_)  # P(feature = 1 | class); row 0 = negative, row 1 = positive
feature_no_prob = 1 - feature_yes_prob  # P(feature = 0 | class)
feature_yes_ratio = np.log(feature_yes_prob[1] / feature_yes_prob[0])  # evidence when the feature is 1
feature_no_ratio = np.log(feature_no_prob[1] / feature_no_prob[0])  # evidence when the feature is 0
feature_ratio = np.array([feature_no_ratio, feature_yes_ratio])  # shape (2, 27), indexed as [feature value][feature index]
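
The resulting table holds, for each feature and each of its two values, the log-ratio of the value's probability under the positive class to its probability under the negative class: positive entries favour "positive", negative entries favour "negative". A minimal sketch on made-up probabilities shows the construction (the numbers in `p_one` are illustrative and do not come from the fitted model):

```python
import numpy as np

# p_one[c, i] = P(feature i = 1 | class c); row 0 = negative, row 1 = positive.
p_one = np.array([[0.2, 0.5],
                  [0.6, 0.5]])

evidence_if_one = np.log(p_one[1] / p_one[0])
evidence_if_zero = np.log((1 - p_one[1]) / (1 - p_one[0]))

# Row 0: evidence when the feature is 0; row 1: evidence when it is 1.
evidence_table = np.vstack([evidence_if_zero, evidence_if_one])
print(evidence_table)
```

Feature 0 is three times as likely to be 1 under the positive class, so observing it gives evidence log(3) for "positive", while its absence gives log(0.5), i.e. evidence for "negative". Feature 1 is equally likely under both classes and carries no evidence either way.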

## Step 6: Define total pos/neg log-evidence

In [58]:
def log_evidence(obj):
    # Appends every feature's evidence to the module-level total_evidence_list,
    # which must be reset to [] before each call (find_top_feature reads it).
    positive_evidence_list = []
    negative_evidence_list = []
    for i, value in enumerate(obj):  # i: feature index, value: 0/1 value of the i-th feature
        evidence = feature_ratio[int(value)][i]
        total_evidence_list.append(evidence)
        if evidence > 0:
            positive_evidence_list.append(evidence)
        else:
            negative_evidence_list.append(evidence)
    total_positive_evidence = sum(positive_evidence_list)
    total_negative_evidence = sum(negative_evidence_list)
    return (total_negative_evidence, total_positive_evidence)
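
The function above communicates with later steps through a shared list. As a hedged alternative, the sketch below returns the per-feature contributions together with the two totals, so no global state is needed. `split_evidence` and `toy_table` are hypothetical names, and the table values are made up.

```python
def split_evidence(obj, table):
    """Return (total_negative, total_positive, per_feature) for a 0/1 vector.

    table[v][i] is the log-evidence of feature i taking value v.
    """
    per_feature = [table[int(v)][i] for i, v in enumerate(obj)]
    total_positive = sum(e for e in per_feature if e > 0)
    total_negative = sum(e for e in per_feature if e <= 0)
    return total_negative, total_positive, per_feature

toy_table = [[-0.7, 0.0, 0.3],   # evidence when the feature is 0
             [1.1, -0.4, 0.0]]   # evidence when the feature is 1
neg, pos, contributions = split_evidence([1, 1, 0], toy_table)
print(neg, pos, contributions)
```

Here the three features contribute 1.1, -0.4 and 0.3, so the positive total collects the first and third and the negative total collects the second.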

## Step 7: Define the top 3 features with the most pos/neg evidence

In [59]:
def find_top_feature(obj):
    # Reads the module-level total_evidence_list filled by the latest log_evidence call.
    # argpartition selects the 3 largest / 3 smallest entries, in no particular order.
    feature_positive_index = np.argpartition(total_evidence_list, -3)[-3:]
    feature_negative_index = np.argpartition(total_evidence_list, 3)[:3]
    print('(d)top 3 features values that contribute most to the positive evidence')
    for i in feature_positive_index:
        print(feature_name[i], obj[i])
    print('(e)top 3 features values that contribute most to the negative evidence')
    for i in feature_negative_index:
        print(feature_name[i], obj[i])
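
Two properties of `np.argpartition` matter when reading the listings in Step 8: the selected indices come back in no guaranteed order, and the selection is by value only, so a "top 3 positive" list can in principle contain a feature whose contribution is not positive. A small sketch on made-up contributions, with an optional extra sort for a ranked listing (the array values are illustrative):

```python
import numpy as np

contributions = np.array([0.3, -1.2, 2.0, 0.1, -0.5, 1.4, -2.2])

top3 = np.argpartition(contributions, -3)[-3:]    # indices of the 3 largest values, unordered
bottom3 = np.argpartition(contributions, 3)[:3]   # indices of the 3 smallest values, unordered

# Optional: rank the selected indices by their contribution.
top3_ranked = top3[np.argsort(contributions[top3])[::-1]]      # largest first
bottom3_ranked = bottom3[np.argsort(contributions[bottom3])]   # most negative first
print(top3_ranked, bottom3_ranked)
```

The partition costs linear time in the number of features; the extra sort only touches the three selected entries.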

## Step 8: Report the following values for objects no.1 to no.5
a) the total positive log-evidence <br/>
b) the total negative log-evidence <br/>
c) the probability distribution <br/>
d) the top 3 feature values that contribute the most to the positive evidence <br/>
e) the top 3 feature values that contribute the most to the negative evidence <br/>

### no.1 : The most positive object with respect to the probabilities

In [60]:
total_evidence_list = []
predict_proba = nb.predict_proba(X_test)
positive_object_index = np.argmax(predict_proba[:,1])

a_obj = X_test[positive_object_index,]
total = log_evidence(a_obj)
print('(a)the total positive log-evidence',total[1])
print('(b)the total negative log-evidence',total[0])
print("(c)probability distribution",predict_proba[positive_object_index,])
find_top_feature(a_obj)

(a)the total positive log-evidence 4.7197885665
(b)the total negative log-evidence -0.572949307799
(c)probability distribution [ 0.00836624  0.99163376]
(d)top 3 features values that contribute most to the positive evidence
bottom-right-square_b 1
middle-middle-square_x 1
middle-middle-square_o 0
(e)top 3 features values that contribute most to the negative evidence
top-middle-square_b 0
middle-right-square_o 0
bottom-right-square_x 0


### no.2 : The most negative object with respect to the probabilities

In [61]:
total_evidence_list = []
negative_object_index = np.argmax(predict_proba[:,0])

b_obj = X_test[negative_object_index,]
total = log_evidence(b_obj)
print('(a)the total positive log-evidence',total[1])
print('(b)the total negative log-evidence',total[0])
print("(c)probability distribution",predict_proba[negative_object_index,])
find_top_feature(b_obj)

(a)the total positive log-evidence 0.724058792054
(b)the total negative log-evidence -4.82874019321
(c)probability distribution [ 0.97000794  0.02999206]
(d)top 3 features values that contribute most to the positive evidence
bottom-middle-square_x 0
top-right-square_o 0
top-right-square_x 1
(e)top 3 features values that contribute most to the negative evidence
middle-middle-square_o 1
middle-middle-square_x 0
bottom-left-square_o 1


### no.3 : The object that has the largest positive evidence

In [62]:
pos_evidence_testset_list = []
neg_evidence_testset_list = []
for row in X_test:  # compute both evidence totals for every test object
    total_evidence_list = []
    total = log_evidence(row)
    pos_evidence_testset_list.append(total[1])
    neg_evidence_testset_list.append(total[0])
largest_pos_evidence_index = np.argmax(pos_evidence_testset_list)

total_evidence_list = []
c_obj = X_test[largest_pos_evidence_index,]
total = log_evidence(c_obj)
print('(a)the total positive log-evidence',total[1])
print('(b)the total negative log-evidence',total[0])
print("(c)probability distribution",predict_proba[largest_pos_evidence_index,])
find_top_feature(c_obj)

(a)the total positive log-evidence 4.7197885665
(b)the total negative log-evidence -0.572949307799
(c)probability distribution [ 0.00836624  0.99163376]
(d)top 3 features values that contribute most to the positive evidence
bottom-right-square_b 1
middle-middle-square_x 1
middle-middle-square_o 0
(e)top 3 features values that contribute most to the negative evidence
top-middle-square_b 0
middle-right-square_o 0
bottom-right-square_x 0


### no.4 : The object that has the largest negative evidence

In [63]:
total_evidence_list = []
largest_neg_evidence_index = np.argmin(neg_evidence_testset_list)

d_obj = X_test[largest_neg_evidence_index,]
total = log_evidence(d_obj)
print('(a)the total positive log-evidence',total[1])
print('(b)the total negative log-evidence',total[0])
print("(c)probability distribution",predict_proba[largest_neg_evidence_index,])
find_top_feature(d_obj)

(a)the total positive log-evidence 0.724058792054
(b)the total negative log-evidence -4.82874019321
(c)probability distribution [ 0.97000794  0.02999206]
(d)top 3 features values that contribute most to the positive evidence
bottom-middle-square_x 0
top-right-square_o 0
top-right-square_x 1
(e)top 3 features values that contribute most to the negative evidence
middle-middle-square_o 1
middle-middle-square_x 0
bottom-left-square_o 1


### no.5 : The most uncertain object (the probabilities are closest to 0.5)

In [66]:
total_evidence_list = []
ratio = predict_proba[:, 1] / predict_proba[:, 0]  # posterior odds; 1 means both classes are equally likely
neutral_index = np.argmin(np.abs(ratio - 1))

e_obj = X_test[neutral_index,]
total = log_evidence(e_obj)
print('(a)the total positive log-evidence',total[1])
print('(b)the total negative log-evidence',total[0])
print("(c)probability distribution",predict_proba[neutral_index,])
find_top_feature(e_obj)

(a)the total positive log-evidence 2.34181134155
(b)the total negative log-evidence -2.97538386302
(c)probability distribution [ 0.50131571  0.49868429]
(d)top 3 features values that contribute most to the positive evidence
top-middle-square_o 1
top-right-square_x 1
bottom-left-square_x 1
(e)top 3 features values that contribute most to the negative evidence
middle-middle-square_o 1
middle-middle-square_x 0
bottom-right-square_o 1
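
The reported values can be cross-checked against each other. The evidence totals are built from the per-feature likelihoods only, so they leave out the class prior; for a naive Bayes model the posterior log-odds equal the prior log-odds plus the total positive and total negative evidence. The sketch below, which uses only numbers copied from the printed results above, recovers the implied prior log-odds from three objects (`reported` and `implied_prior` are illustrative names):

```python
import math

# (total positive evidence, total negative evidence, P(negative), P(positive)),
# copied from the printed results for objects no.1, no.2 and no.5.
reported = {
    "no.1": (4.7197885665, -0.572949307799, 0.00836624, 0.99163376),
    "no.2": (0.724058792054, -4.82874019321, 0.97000794, 0.02999206),
    "no.5": (2.34181134155, -2.97538386302, 0.50131571, 0.49868429),
}

implied_prior = {}
for name, (pos_ev, neg_ev, p_neg, p_pos) in reported.items():
    posterior_log_odds = math.log(p_pos / p_neg)
    # Whatever the feature evidence does not explain must come from the prior.
    implied_prior[name] = posterior_log_odds - (pos_ev + neg_ev)
    print(name, round(implied_prior[name], 4))
```

All three objects give the same prior log-odds of about 0.628 in favour of the positive class, which is why the most uncertain object has a clearly negative evidence total yet a posterior close to 0.5.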
