# Dataset description

This is the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, collected at the University of Wisconsin and published in the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/.

The task is two-class classification. Class 'B' means benign (treated here as the positive class), and class 'M' means malignant (treated as the negative class).

The dataset has 569 instances in total. Each has 30 attributes/features (excluding ID) and 1 label/class as follows. The original feature datatype is Real, and the label datatype is Boolean. The dataset has no missing values.

Feature names:

    a) radius (mean of distances from center to points on the perimeter)
    b) texture (standard deviation of gray-scale values)
    c) perimeter
    d) area
    e) smoothness (local variation in radius lengths)
    f) compactness (perimeter^2 / area - 1.0)
    g) concavity (severity of concave portions of the contour)
    h) concave points (number of concave portions of the contour)
    i) symmetry
    j) fractal dimension ("coastline approximation" - 1)

The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each instance, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
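The field layout described above can be sketched as a small lookup table. This is only an illustration: the name `field_names` and the 1-based field numbering (field 1 = ID, field 2 = diagnosis) are assumptions made for this sketch, chosen to match the "field 3 is Mean Radius" convention in the text.

```python
# The ten base measurements, in the order listed above.
base = ['radius', 'texture', 'perimeter', 'area', 'smoothness',
        'compactness', 'concavity', 'concave points', 'symmetry',
        'fractal dimension']
stats = ['mean', 'SE', 'worst']

# Hypothetical lookup: 1-based field number -> descriptive name.
field_names = {1: 'ID', 2: 'diagnosis'}
for s, stat in enumerate(stats):
    for b, name in enumerate(base):
        # Each statistic occupies a block of 10 consecutive fields, starting at field 3.
        field_names[3 + 10 * s + b] = '%s %s' % (stat, name)

print(field_names[3], '|', field_names[13], '|', field_names[23])
```

Note that when the file is read with `header=None`, pandas numbers the columns from 0, so field 3 in this numbering is column 2 of the DataFrame.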

# Load the dataset

In [4]:
import numpy as np
import pandas as pd

# load the dataset using pandas; the file has no header row and column 0 holds the sample ID
data = pd.read_csv('wdbc.data', header=None, index_col=0)  # "data" is a pandas DataFrame indexed by ID
print(data.shape)         # show the number of rows and columns of the dataset
data.head(10)             # show the first 10 rows of the dataset


(569, 31)


ID (column 0),1,2,3,4,5,6,7,8,9,10,...,22,23,24,25,26,27,28,29,30,31
842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
84348301,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
84358402,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678
843786,M,12.45,15.7,82.57,477.1,0.1278,0.17,0.1578,0.08089,0.2087,...,15.47,23.75,103.4,741.6,0.1791,0.5249,0.5355,0.1741,0.3985,0.1244
844359,M,18.25,19.98,119.6,1040.0,0.09463,0.109,0.1127,0.074,0.1794,...,22.88,27.66,153.2,1606.0,0.1442,0.2576,0.3784,0.1932,0.3063,0.08368
84458202,M,13.71,20.83,90.2,577.9,0.1189,0.1645,0.09366,0.05985,0.2196,...,17.06,28.14,110.6,897.0,0.1654,0.3682,0.2678,0.1556,0.3196,0.1151
844981,M,13.0,21.82,87.5,519.8,0.1273,0.1932,0.1859,0.09353,0.235,...,15.49,30.73,106.2,739.3,0.1703,0.5401,0.539,0.206,0.4378,0.1072
84501001,M,12.46,24.04,83.97,475.9,0.1186,0.2396,0.2273,0.08543,0.203,...,15.09,40.68,97.65,711.4,0.1853,1.058,1.105,0.221,0.4366,0.2075


# Binarize all the features

In [12]:
from sklearn.preprocessing import Binarizer

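# calculate the mean of each feature; it serves as that feature's binarization threshold
# NOTE: the loop below overwrites `data` in place, so re-running this cell recomputes
# the thresholds from columns that are already binarized
thrsd = np.mean(data.iloc[:, 1:], axis=0)
print(thrsd)


# binarize each feature: 1 if the value is above its threshold, else 0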
for j in range(1, len(data.columns)):
    for i in range(len(data.index)):
        if data.iloc[i,j] > thrsd.iloc[j-1]:
            data.iat[i,j] = 1
        else:
            data.iat[i,j] = 0

data.head(10)

2       0.397188
3      19.289649
4      91.969033
5     654.889104
6       0.096360
7       0.104341
8       0.088799
9       0.048919
10      0.181162
11      0.062798
12      0.405172
13      1.216853
14      2.866059
15     40.337079
16      0.007041
17      0.025478
18      0.031894
19      0.011796
20      0.020542
21      0.003795
22     16.269190
23     25.677223
24    107.261213
25    880.583128
26      0.132369
27      0.254265
28      0.272188
29      0.114606
30      0.290076
31      0.083946
dtype: float64


ID (column 0),1,2,3,4,5,6,7,8,9,10,...,22,23,24,25,26,27,28,29,30,31
842302,M,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
842517,M,1.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,1.0,...,1.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
84300903,M,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
84348301,M,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,...,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0
84358402,M,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,...,1.0,0.0,1.0,1.0,1.0,0.0,1.0,1.0,0.0,0.0
843786,M,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,...,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0
844359,M,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0,0.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0
84458202,M,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,0.0,1.0,1.0,1.0
844981,M,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,...,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0
84501001,M,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,...,0.0,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0
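The element-by-element loop above can also be written as a single vectorized comparison. The following is a minimal sketch on a toy table, not the notebook's own code: `toy`, `thresholds`, `binary`, and `toy_bin` are hypothetical names, and the four rows are illustrative values only. Computing the thresholds once and writing the result to a new frame avoids overwriting the original values.

```python
import pandas as pd

# Toy stand-in for the real table: column 1 is the label, columns 2 and 3 are features.
toy = pd.DataFrame({1: ['M', 'B', 'B', 'M'],
                    2: [17.99, 11.42, 12.45, 20.57],
                    3: [10.38, 20.38, 15.70, 17.77]})

features = toy.iloc[:, 1:]             # everything except the label column
thresholds = features.mean(axis=0)     # one threshold per feature (its mean)

# Comparing a DataFrame with a Series aligns the Series index with the columns,
# so each column is compared against its own threshold.
binary = (features > thresholds).astype(float)

# Reattach the label column without modifying `toy`.
toy_bin = pd.concat([toy.iloc[:, :1], binary], axis=1)
print(toy_bin)
```

The strict `>` mirrors the comparison used in the loop, so a value exactly equal to the mean maps to 0.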


# Train-test split & Train BernoulliNB classifier

In [17]:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

X = data.iloc[:, 1:]     # extract the last 30 columns as features
y = data.iloc[:, 0]      # extract the first column as label
#print(X)

# train-test split, 2:1 
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)    # shuffle the data before splitting
print(X_test, y_test)


            2    3    4    5    6    7    8    9    10   11 ...    22   23  \
0                                                           ...              
923465     0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  1.0 ...   0.0  1.0   
885429     1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0 ...   1.0  0.0   
8811779    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  1.0 ...   0.0  0.0   
864292     0.0  1.0  0.0  0.0  1.0  1.0  0.0  0.0  1.0  1.0 ...   0.0  0.0   
869931     0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 ...   0.0  0.0   
915452     1.0  0.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0 ...   1.0  0.0   
846381     1.0  1.0  1.0  1.0  0.0  0.0  1.0  1.0  1.0  0.0 ...   1.0  1.0   
903516     1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 ...   1.0  1.0   
9010598    0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0 ...   0.0  1.0   
911685     0.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0  1.0 ...   0.0  0.0   
9010259    0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  1.0  1.0 ... 

In [46]:
clf = BernoulliNB()
clf.fit(X_train, y_train)
pred = clf.predict_proba(X_test)
pred  # column 0 = P(B) (positive), column 1 = P(M) (negative)

array([[  1.00000000e+00,   3.41511718e-11],
       [  1.00620839e-15,   1.00000000e+00],
       [  1.00000000e+00,   2.66187522e-14],
       [  1.00000000e+00,   1.35470739e-10],
       [  1.00000000e+00,   3.38810097e-16],
       [  7.28741348e-01,   2.71258652e-01],
       [  7.43916110e-04,   9.99256084e-01],
       [  9.90404234e-18,   1.00000000e+00],
       [  1.00000000e+00,   8.65963929e-15],
       [  1.00000000e+00,   2.36961185e-13],
       [  9.99245837e-01,   7.54163184e-04],
       [  1.00000000e+00,   3.38810097e-16],
       [  1.08682608e-01,   8.91317392e-01],
       [  9.99782655e-01,   2.17344972e-04],
       [  2.73914802e-13,   1.00000000e+00],
       [  1.00000000e+00,   4.34080702e-15],
       [  9.48111919e-04,   9.99051888e-01],
       [  1.00000000e+00,   2.21471960e-14],
       [  1.00000000e+00,   5.11465267e-16],
       [  2.35995221e-15,   1.00000000e+00],
       [  5.33985120e-04,   9.99466015e-01],
       [  2.84303280e-07,   9.99999716e-01],
       [  

# The most positive object with respect to the probabilities

In [47]:
min_idx = np.argmin(pred, axis=0)  # per column, the row index with the smallest probability
print(min_idx)                     # min_idx[1] has the smallest P(M): the most positive object; min_idx[0] has the smallest P(B): the most negative object

[  7 178]
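To make the indexing concrete, here is a small sketch on a made-up probability array (`toy_pred` is hypothetical and not taken from the classifier output). It shows that the row with the smallest P(M) is the same row that has the largest P(B), because each row sums to 1.

```python
import numpy as np

# Made-up predict_proba-style output: column 0 = P(B), column 1 = P(M).
toy_pred = np.array([[0.90, 0.10],
                     [0.20, 0.80],
                     [0.99, 0.01],
                     [0.55, 0.45]])

min_idx = np.argmin(toy_pred, axis=0)   # one row index per column
most_negative = int(min_idx[0])         # smallest P(B)
most_positive = int(min_idx[1])         # smallest P(M)

print(most_negative, most_positive)
# The same rows are found by taking the largest probability in the other column.
print(int(np.argmax(toy_pred[:, 1])), int(np.argmax(toy_pred[:, 0])))
```

If several rows tie for the minimum, `np.argmin` returns the first of them.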


In [110]:
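# feature_log_prob_[c, j] is log P(x_j = 1 | class c); row 0 is class B, row 1 is class M
# (despite the names, prob_XB_y / prob_XM_y refer to the feature value 1 / 0, not to the class)
prob_XB_y = clf.feature_log_prob_
# log P(x_j = 0 | class c), the complement of the above
prob_XM_y = np.log(1 - np.exp(prob_XB_y))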


# calculate the total positive log-evidence
tpe = []
for j in range(len(X_test.columns)):
    if X_test.iloc[min_idx[1], j] == 1:
        tpe.append(prob_XB_y[0,j])
    else:
        tpe.append(prob_XM_y[0,j])
print("The total positive log-evidence is %f" % np.sum(tpe))


# calculate the total negative log-evidence
tne = []
for j in range(len(X_test.columns)):
    if X_test.iloc[min_idx[1], j] == 1:
        tne.append(prob_XB_y[1,j])
    else:
        tne.append(prob_XM_y[1,j])
print("The total negative log-evidence is %f" % np.sum(tne))


# probability distribution
print("Probability distribution:")
print("    B(positive): %f" % pred[min_idx[1], 0])
print("    M(negative): %f" % pred[min_idx[1], 1])


# top 3 features whose log-probabilities contribute most to the positive evidence
feature_names = ['mean radius', 'mean texture', 'mean perimeter', 'mean area', 'mean smoothness', 'mean compactness', 'mean concavity', 'mean concave points', 'mean symmetry', 'mean fractal dimension',
                 'SE radius', 'SE texture', 'SE perimeter', 'SE area', 'SE smoothness', 'SE compactness', 'SE concavity', 'SE concave points', 'SE symmetry', 'SE fractal dimension',
                 'worst radius', 'worst texture', 'worst perimeter', 'worst area', 'worst smoothness', 'worst compactness', 'worst concavity', 'worst concave points', 'worst symmetry', 'worst fractal dimension']
tpe_top_idx = np.argsort(tpe)[-3:][::-1]
print('Top 3 features that contribute positive evidence and their values:')
print('    %s: %f' % (feature_names[tpe_top_idx[0]], tpe[tpe_top_idx[0]]))
print('    %s: %f' % (feature_names[tpe_top_idx[1]], tpe[tpe_top_idx[1]]))
print('    %s: %f' % (feature_names[tpe_top_idx[2]], tpe[tpe_top_idx[2]]))


# top 3 features whose log-probabilities contribute most to the negative evidence
tne_top_idx = np.argsort(tne)[-3:][::-1]
print('Top 3 features that contribute negative evidence and their values:')
print('    %s: %f' % (feature_names[tne_top_idx[0]], tne[tne_top_idx[0]]))
print('    %s: %f' % (feature_names[tne_top_idx[1]], tne[tne_top_idx[1]]))
print('    %s: %f' % (feature_names[tne_top_idx[2]], tne[tne_top_idx[2]]))

The total positive log-evidence is -8.738989
The total negative log-evidence is -44.395900
Probability distribution:
    B(positive): 1.000000
    M(negative): 0.000000
Top 3 features that contribute positive evidence and their values:
    worst area: -0.028988
    SE area: -0.041673
    worst perimeter: -0.085158
Top 3 features that contribute negative evidence and their values:
    mean fractal dimension: -0.487295
    SE texture: -0.487295
    SE fractal dimension: -0.637577
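The per-feature loop used above amounts to choosing, for every feature, between log P(x_j = 1 | class) and log P(x_j = 0 | class), and summing the choices. The sketch below restates that with NumPy on invented numbers; `log_p1`, `log_p0`, `log_prior`, and `x` are hypothetical stand-ins, not values from the fitted model. It also illustrates that the totals are not posteriors by themselves: a naive Bayes posterior additionally includes the class log prior (in scikit-learn this is stored in `class_log_prior_`) before normalizing.

```python
import numpy as np

# Invented log P(x_j = 1 | class) for 2 classes (rows: B, M) and 3 features.
log_p1 = np.log(np.array([[0.1, 0.2, 0.7],
                          [0.8, 0.9, 0.6]]))
log_p0 = np.log1p(-np.exp(log_p1))          # log P(x_j = 0 | class)
log_prior = np.log(np.array([0.6, 0.4]))    # invented class priors

x = np.array([0, 0, 1])                     # one binarized instance

# Per-feature log-evidence, shape (2, 3): log_p1 where x_j == 1, else log_p0.
evidence = np.where(x == 1, log_p1, log_p0)
total = evidence.sum(axis=1)                # total log-evidence per class

# Posterior: add the log prior, then normalize in log space.
joint = total + log_prior
posterior = np.exp(joint - np.logaddexp.reduce(joint))

print("total log-evidence (B, M):", total)
print("posterior (B, M):", posterior)
```

Because the prior enters only as a constant per class, comparing the two totals gives the same ranking as the posterior only when the priors are close to equal.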


# The most negative object with respect to the probabilities

In [111]:
# calculate the total positive log-evidence
tpe = []
for j in range(len(X_test.columns)):
    if X_test.iloc[min_idx[0], j] == 1:
        tpe.append(prob_XB_y[0,j])
    else:
        tpe.append(prob_XM_y[0,j])
print("The total positive log-evidence is %f" % np.sum(tpe))


# calculate the total negative log-evidence
tne = []
for j in range(len(X_test.columns)):
    if X_test.iloc[min_idx[0], j] == 1:
        tne.append(prob_XB_y[1,j])
    else:
        tne.append(prob_XM_y[1,j])
print("The total negative log-evidence is %f" % np.sum(tne))


# probability distribution
print("Probability distribution:")
print("    B(positive): %f" % pred[min_idx[0], 0])
print("    M(negative): %f" % pred[min_idx[0], 1])


# top 3 features whose log-probabilities contribute most to the positive evidence
tpe_top_idx = np.argsort(tpe)[-3:][::-1]
print('Top 3 features that contribute positive evidence and their values:')
print('    %s: %f' % (feature_names[tpe_top_idx[0]], tpe[tpe_top_idx[0]]))
print('    %s: %f' % (feature_names[tpe_top_idx[1]], tpe[tpe_top_idx[1]]))
print('    %s: %f' % (feature_names[tpe_top_idx[2]], tpe[tpe_top_idx[2]]))


# top 3 features whose log-probabilities contribute most to the negative evidence
tne_top_idx = np.argsort(tne)[-3:][::-1]
print('Top 3 features that contribute negative evidence and their values:')
print('    %s: %f' % (feature_names[tne_top_idx[0]], tne[tne_top_idx[0]]))
print('    %s: %f' % (feature_names[tne_top_idx[1]], tne[tne_top_idx[1]]))
print('    %s: %f' % (feature_names[tne_top_idx[2]], tne[tne_top_idx[2]]))

The total positive log-evidence is -49.600487
The total negative log-evidence is -9.881091
Probability distribution:
    B(positive): 0.000000
    M(negative): 1.000000
Top 3 features that contribute positive evidence and their values:
    SE symmetry: -0.483978
    SE texture: -0.497312
    SE smoothness: -0.517652
Top 3 features that contribute negative evidence and their values:
    worst perimeter: -0.074108
    worst concave points: -0.074108
    mean concave points: -0.074108


# The object that has the largest positive evidence

In [130]:
# the positive log-evidence of each feature for each instance in test set
pe_all = []                                # two-dimensional array
for i in range(len(X_test.index)):
    pe = []
    for j in range(len(X_test.columns)):
        if X_test.iloc[i, j] == 1:
            pe.append(prob_XB_y[0,j])
        else:
            pe.append(prob_XM_y[0,j])
    pe_all.append(pe)


# calculate the total positive log-evidence for each instance
tpe_all = []
for i in range(len(X_test.index)):
    tpe_all.append(np.sum(pe_all[i]))

    
# select the instance with the largest total positive log-evidence
idx = np.argmax(tpe_all)
pe = []
for j in range(len(X_test.columns)):
    pe.append(pe_all[idx][j])


# print the total positive log-evidence
print("The total positive log-evidence is %f" % tpe_all[idx])


# calculate the total negative log-evidence
tne = []
for j in range(len(X_test.columns)):
    if X_test.iloc[idx, j] == 1:
        tne.append(prob_XB_y[1,j])
    else:
        tne.append(prob_XM_y[1,j])
print("The total negative log-evidence is %f" % np.sum(tne))


# probability distribution
print("Probability distribution:")
print("    B(positive): %f" % pred[idx, 0])
print("    M(negative): %f" % pred[idx, 1])


# top 3 features whose log-probabilities contribute most to the positive evidence
tpe_top_idx = np.argsort(pe)[-3:][::-1]
print('Top 3 features that contribute positive evidence and their values:')
print('    %s: %f' % (feature_names[tpe_top_idx[0]], pe[tpe_top_idx[0]]))
print('    %s: %f' % (feature_names[tpe_top_idx[1]], pe[tpe_top_idx[1]]))
print('    %s: %f' % (feature_names[tpe_top_idx[2]], pe[tpe_top_idx[2]]))


# top 3 features whose log-probabilities contribute most to the negative evidence
tne_top_idx = np.argsort(tne)[-3:][::-1]
print('Top 3 features that contribute negative evidence and their values:')
print('    %s: %f' % (feature_names[tne_top_idx[0]], tne[tne_top_idx[0]]))
print('    %s: %f' % (feature_names[tne_top_idx[1]], tne[tne_top_idx[1]]))
print('    %s: %f' % (feature_names[tne_top_idx[2]], tne[tne_top_idx[2]]))

The total positive log-evidence is -7.876517
The total negative log-evidence is -42.931801
Probability distribution:
    B(positive): 1.000000
    M(negative): 0.000000
Top 3 features that contribute positive evidence and their values:
    worst area: -0.028988
    SE area: -0.041673
    worst perimeter: -0.085158
Top 3 features that contribute negative evidence and their values:
    SE symmetry: -0.366931
    SE smoothness: -0.419854
    mean fractal dimension: -0.487295
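The nested loops that build the per-instance evidence table can be expressed as array operations. This is a sketch under invented inputs: `X`, `log_p1_B`, and `log_p0_B` are hypothetical stand-ins for the binarized test set and for one class's rows of the log-probability arrays.

```python
import numpy as np

# Invented binarized instances (rows) and features (columns).
X = np.array([[1, 0, 1],
              [0, 0, 0],
              [1, 1, 1]])

# Invented log P(x_j = 1 | B) and its complement log P(x_j = 0 | B).
log_p1_B = np.log(np.array([0.1, 0.2, 0.7]))
log_p0_B = np.log1p(-np.exp(log_p1_B))

# Per-instance, per-feature positive log-evidence, shape (n_instances, n_features).
pe_all = np.where(X == 1, log_p1_B, log_p0_B)
tpe_all = pe_all.sum(axis=1)

# The same totals as a matrix product: x . (log_p1 - log_p0) + sum(log_p0).
tpe_matrix = X @ (log_p1_B - log_p0_B) + log_p0_B.sum()

idx = int(np.argmax(tpe_all))                  # instance with the largest total
top3 = np.argsort(pe_all[idx])[-3:][::-1]      # its three largest per-feature terms

print(idx, top3, tpe_all)
```

Since every term is a log-probability and therefore at most 0, the "largest" total is the one closest to zero.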


# The object that has the largest (in magnitude) negative evidence

In [131]:
# the negative log-evidence of each feature for each instance in test set
ne_all = []                                # two-dimensional array
for i in range(len(X_test.index)):
    ne = []
    for j in range(len(X_test.columns)):
        if X_test.iloc[i, j] == 1:
            ne.append(prob_XB_y[1,j])
        else:
            ne.append(prob_XM_y[1,j])
    ne_all.append(ne)


# calculate the total negative log-evidence for each instance
tne_all = []
for i in range(len(X_test.index)):
    tne_all.append(np.sum(ne_all[i]))

    
# select the instance with the largest total negative log-evidence
idx = np.argmax(tne_all)
ne = []
for j in range(len(X_test.columns)):
    ne.append(ne_all[idx][j])


# calculate the total positive log-evidence
tpe = []
for j in range(len(X_test.columns)):
    if X_test.iloc[idx, j] == 1:
        tpe.append(prob_XB_y[0,j])
    else:
        tpe.append(prob_XM_y[0,j])
print("The total positive log-evidence is %f" % np.sum(tpe))


# print the total negative log-evidence
print("The total negative log-evidence is %f" % tne_all[idx])


# probability distribution
print("Probability distribution:")
print("    B(positive): %f" % pred[idx, 0])
print("    M(negative): %f" % pred[idx, 1])


# top 3 features whose log-probabilities contribute most to the positive evidence
tpe_top_idx = np.argsort(tpe)[-3:][::-1]
print('Top 3 features that contribute positive evidence and their values:')
print('    %s: %f' % (feature_names[tpe_top_idx[0]], tpe[tpe_top_idx[0]]))
print('    %s: %f' % (feature_names[tpe_top_idx[1]], tpe[tpe_top_idx[1]]))
print('    %s: %f' % (feature_names[tpe_top_idx[2]], tpe[tpe_top_idx[2]]))


# top 3 features whose log-probabilities contribute most to the negative evidence
tne_top_idx = np.argsort(ne)[-3:][::-1]
print('Top 3 features that contribute negative evidence and their values:')
print('    %s: %f' % (feature_names[tne_top_idx[0]], ne[tne_top_idx[0]]))
print('    %s: %f' % (feature_names[tne_top_idx[1]], ne[tne_top_idx[1]]))
print('    %s: %f' % (feature_names[tne_top_idx[2]], ne[tne_top_idx[2]]))

The total positive log-evidence is -49.600487
The total negative log-evidence is -9.881091
Probability distribution:
    B(positive): 0.000000
    M(negative): 1.000000
Top 3 features that contribute positive evidence and their values:
    SE symmetry: -0.483978
    SE texture: -0.497312
    SE smoothness: -0.517652
Top 3 features that contribute negative evidence and their values:
    worst perimeter: -0.074108
    worst concave points: -0.074108
    mean concave points: -0.074108


# The most uncertain object (the probabilities are closest to 0.5)

In [142]:
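# select the instance whose two class probabilities are closest to each other,
# i.e. the row with the smallest |P(M) - P(B)|
idx = np.argmin(abs(np.diff(pred)))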


# calculate the total positive log-evidence
tpe = []
for j in range(len(X_test.columns)):
    if X_test.iloc[idx, j] == 1:
        tpe.append(prob_XB_y[0,j])
    else:
        tpe.append(prob_XM_y[0,j])
print("The total positive log-evidence is %f" % np.sum(tpe))


# calculate the total negative log-evidence
tne = []
for j in range(len(X_test.columns)):
    if X_test.iloc[idx, j] == 1:
        tne.append(prob_XB_y[1,j])
    else:
        tne.append(prob_XM_y[1,j])
print("The total negative log-evidence is %f" % np.sum(tne))


# probability distribution
print("Probability distribution:")
print("    B(positive): %f" % pred[idx, 0])
print("    M(negative): %f" % pred[idx, 1])


# top 3 features whose log-probabilities contribute most to the positive evidence
tpe_top_idx = np.argsort(tpe)[-3:][::-1]
print('Top 3 features that contribute positive evidence and their values:')
print('    %s: %f' % (feature_names[tpe_top_idx[0]], tpe[tpe_top_idx[0]]))
print('    %s: %f' % (feature_names[tpe_top_idx[1]], tpe[tpe_top_idx[1]]))
print('    %s: %f' % (feature_names[tpe_top_idx[2]], tpe[tpe_top_idx[2]]))


# top 3 features whose log-probabilities contribute most to the negative evidence
tne_top_idx = np.argsort(tne)[-3:][::-1]
print('Top 3 features that contribute negative evidence and their values:')
print('    %s: %f' % (feature_names[tne_top_idx[0]], tne[tne_top_idx[0]]))
print('    %s: %f' % (feature_names[tne_top_idx[1]], tne[tne_top_idx[1]]))
print('    %s: %f' % (feature_names[tne_top_idx[2]], tne[tne_top_idx[2]]))


The total positive log-evidence is -27.417979
The total negative log-evidence is -26.860673
Probability distribution:
    B(positive): 0.502125
    M(negative): 0.497875
Top 3 features that contribute positive evidence and their values:
    worst area: -0.028988
    SE area: -0.041673
    worst perimeter: -0.085158
Top 3 features that contribute negative evidence and their values:
    worst concave points: -0.074108
    mean concave points: -0.074108
    mean concavity: -0.129458
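The selection rule for the most uncertain object can be checked on a small made-up array (`toy_pred` is hypothetical). For two classes, the smallest |P(M) - P(B)| picks the same row as the smallest |P(B) - 0.5|, since P(M) - P(B) = 1 - 2 P(B).

```python
import numpy as np

# Made-up predict_proba-style output: column 0 = P(B), column 1 = P(M).
toy_pred = np.array([[0.90, 0.10],
                     [0.20, 0.80],
                     [0.99, 0.01],
                     [0.55, 0.45]])

# np.diff works along the last axis, giving P(M) - P(B) with shape (n, 1);
# argmin then operates on the flattened array and returns a row index.
by_diff = int(np.argmin(np.abs(np.diff(toy_pred))))

# Equivalent formulation: distance of P(B) from 0.5.
by_half = int(np.argmin(np.abs(toy_pred[:, 0] - 0.5)))

print(by_diff, by_half)
```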
