# This example shows how to use Simpson's fuzzy min-max neural network classifier trained with an enhanced online learning algorithm and k-nearest hyperbox selection (KNEFMNN)

Note that all numerical features in the training and testing datasets must lie in the range [0, 1], because the KNEFMNN classifier requires features in the unit hypercube.
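If your own data are not yet in [0, 1], a min-max rescaling is one common way to get there. The sketch below is illustrative only: the helper `min_max_scale` and the toy values are assumptions made for this example and are not part of hbbrain. It fits the column ranges on the training data, reuses them for the testing data, and clips anything that falls outside the training range.

```python
def min_max_scale(rows, col_min=None, col_max=None):
    """Rescale each column of `rows` (a list of equal-length lists) into [0, 1].

    Hypothetical helper for illustration; not part of hbbrain.
    """
    n_cols = len(rows[0])
    # Fit the ranges on this data only when none are supplied.
    if col_min is None:
        col_min = [min(r[j] for r in rows) for j in range(n_cols)]
    if col_max is None:
        col_max = [max(r[j] for r in rows) for j in range(n_cols)]

    scaled = []
    for r in rows:
        new_row = []
        for j in range(n_cols):
            span = col_max[j] - col_min[j]
            # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
            value = 0.0 if span == 0 else (r[j] - col_min[j]) / span
            # Clip so that values outside the fitted range stay inside [0, 1].
            new_row.append(min(1.0, max(0.0, value)))
        scaled.append(new_row)
    return scaled, col_min, col_max


# Toy data (made up for this example).
train = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
test = [[5.0, 40.0]]

train_scaled, lo, hi = min_max_scale(train)
test_scaled, _, _ = min_max_scale(test, lo, hi)  # reuse the training ranges

print(train_scaled)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
print(test_scaled)   # [[0.75, 1.0]]
```

Reusing the training ranges for the testing data keeps both sets on the same scale. The clipping step is a design choice: it guarantees the [0, 1] requirement at the cost of flattening out-of-range test values.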

## 1. Execute directly from the Python file

In [1]:
%matplotlib notebook

In [2]:
import os
import warnings
warnings.filterwarnings('ignore')

#### Get the path to the folder containing this Jupyter notebook

In [3]:
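# "__file__" is quoted on purpose: a notebook does not define __file__, so this
# resolves relative to the current working directory (the notebook's folder).
this_notebook_dir = os.path.dirname(os.path.abspath("__file__"))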
this_notebook_dir

'C:\\hyperbox-brain\\examples\\numerical_data\\incremental_learner'

#### Get the home folder of the Hyperbox-Brain project

In [4]:
from pathlib import Path
project_dir = Path(this_notebook_dir).parent.parent.parent
project_dir

WindowsPath('C:/hyperbox-brain')

#### Create the path to the Python file containing the implementation of the Simpson's FMNN classifier using the enhanced online learning algorithm with k-nearest hyperboxes selection

In [5]:
knefmnn_file_path = os.path.join(project_dir, Path("hbbrain/numerical_data/incremental_learner/knefmnn.py"))
knefmnn_file_path

'C:\\hyperbox-brain\\hbbrain\\numerical_data\\incremental_learner\\knefmnn.py'

#### Run the file with the -h flag to show its usage instructions

In [6]:
!python "{knefmnn_file_path}" -h

usage: knefmnn.py [-h] -training_file TRAINING_FILE -testing_file TESTING_FILE
                  [--theta THETA] [--gamma GAMMA] [--k_neighbors K_NEIGHBORS]
                  [--is_draw IS_DRAW]

The description of parameters

required arguments:
  -training_file TRAINING_FILE
                        A required argument for the path to training data file
                        (including file name)
  -testing_file TESTING_FILE
                        A required argument for the path to testing data file
                        (including file name)

optional arguments:
  --theta THETA         Maximum hyperbox size (in the range of (0, 1])
                        (default: 0.5)
  --gamma GAMMA         A sensitivity parameter describing the speed of
                        decreasing of the membership function in each
                        dimension (larger than 0) (default: 1)
  --k_neighbors K_NEIGHBORS
                        The number of nearest hyperboxes is considered for the
 

#### Create the paths to the training and testing datasets stored in the dataset folder

In [7]:
training_data_file = os.path.join(project_dir, Path("dataset/syn_num_train.csv"))
training_data_file

'C:\\hyperbox-brain\\dataset\\syn_num_train.csv'

In [8]:
testing_data_file = os.path.join(project_dir, Path("dataset/syn_num_test.csv"))
testing_data_file

'C:\\hyperbox-brain\\dataset\\syn_num_test.csv'

#### Run a demo program

In [9]:
!python "{knefmnn_file_path}" -training_file "{training_data_file}" -testing_file "{testing_data_file}" --theta 0.1 --k_neighbors 5 --gamma 1

Number of hyperboxes = 51
Testing accuracy =  86.60%
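Outside a notebook, the `!python` shell escape is not available. A minimal sketch of the same call using the standard `subprocess` module follows. It uses only the flags shown in the help text above. The helper names `build_knefmnn_command` and `run_knefmnn` and the placeholder file names are assumptions made for this example.

```python
import subprocess
import sys


def build_knefmnn_command(script_path, training_file, testing_file,
                          theta=0.1, k_neighbors=5, gamma=1):
    """Assemble the argument list for knefmnn.py (hypothetical helper)."""
    return [
        sys.executable, str(script_path),
        "-training_file", str(training_file),
        "-testing_file", str(testing_file),
        "--theta", str(theta),
        "--k_neighbors", str(k_neighbors),
        "--gamma", str(gamma),
    ]


def run_knefmnn(cmd):
    """Run the command and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Placeholder paths; replace them with the real script and dataset locations.
cmd = build_knefmnn_command("knefmnn.py", "syn_num_train.csv", "syn_num_test.csv")
print(cmd[1:])

# Uncomment once the script and data files exist at the given paths:
# print(run_knefmnn(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces, which is what the double quotes in the notebook command guard against.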


## 2. Using the KNEFMNN classifier through its init, fit, and predict functions

In [10]:
from hbbrain.numerical_data.incremental_learner.knefmnn import KNEFMNNClassifier
import pandas as pd

#### Create training and testing data sets

In [11]:
df_train = pd.read_csv(training_data_file, header=None)
df_test = pd.read_csv(testing_data_file, header=None)

Xy_train = df_train.to_numpy()
Xy_test = df_test.to_numpy()

Xtr = Xy_train[:, :-1]
ytr = Xy_train[:, -1]

Xtest = Xy_test[:, :-1]
ytest = Xy_test[:, -1]

#### Initializing parameters

In [12]:
theta = 0.1
k_neighbors = 5
gamma = 1
is_draw = True
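To give a feel for what `theta` and `k_neighbors` control, here is a rough sketch of an expansion step. It assumes the per-dimension size test usually described for the enhanced fuzzy min-max network (the expanded box may not be wider than `theta` along any axis) and assumes that the candidates arrive already sorted by decreasing membership. It is not taken from the hbbrain source, and the function names and toy boxes are hypothetical.

```python
def can_expand(v, w, x, theta):
    """True if the box [v, w], stretched to cover x, stays within theta on every axis."""
    return all(max(wi, xi) - min(vi, xi) <= theta
               for vi, wi, xi in zip(v, w, x))


def expand(v, w, x):
    """Return the smallest box that contains both the box [v, w] and the point x."""
    new_v = [min(vi, xi) for vi, xi in zip(v, x)]
    new_w = [max(wi, xi) for wi, xi in zip(w, x)]
    return new_v, new_w


def first_expandable(candidates, x, theta, k_neighbors):
    """Index of the first of the k leading candidates that can absorb x, else None.

    `candidates` is a list of (v, w) pairs, assumed sorted best-first.
    """
    for j, (v, w) in enumerate(candidates[:k_neighbors]):
        if can_expand(v, w, x, theta):
            return j
    return None


# Toy same-class hyperboxes (made up), listed best candidate first.
candidates = [
    ([0.40, 0.40], [0.48, 0.45]),  # would become 0.15 wide on axis 0: too large
    ([0.50, 0.50], [0.52, 0.53]),  # stays within theta = 0.1 on both axes
]
x = [0.55, 0.50]

chosen = first_expandable(candidates, x, theta=0.1, k_neighbors=5)
print(chosen)  # 1
print(first_expandable(candidates, x, theta=0.1, k_neighbors=1))  # None
print(expand(*candidates[chosen], x))
```

Under these assumptions, a smaller `theta` yields more, smaller hyperboxes, and a larger `k_neighbors` gives existing hyperboxes more chances to absorb a sample before a new one has to be created.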

### Training

In [13]:
knefmnn_clf = KNEFMNNClassifier(theta=theta, k_neighbors=k_neighbors, gamma=gamma, is_draw=is_draw)
knefmnn_clf.fit(Xtr, ytr)


KNEFMNNClassifier(C=array([1, 2, 1, 1, 1, 2, 1, 2, 2, 1, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 1,
       1, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 1, 2,
       1, 2, 2, 2, 2, 2, 1]),
                  V=array([[0.42413   , 0.53516   ],
       [0.70577   , 0.397105  ],
       [0.82785   , 0.78025   ],
       [0.66038   , 0.51128   ],
       [0.48794   , 0.672     ],
       [0.26651   , 0.18424   ],
       [0.32289   , 0.59994   ],
       [0.19944   , 0.03      ],
       [0.28077   , 0.26124   ],
       [0.63683   , 0.6936    ],
       [0.28822   , 0.55512   ],
       [0.03      , 0.47757   ],
       [0...
       [0.91185   , 0.5761    ],
       [0.2246    , 0.13567   ],
       [0.25929   , 0.81558   ],
       [0.815     , 0.397095  ],
       [0.67906   , 0.83605   ],
       [0.52197   , 0.91371   ],
       [0.66037   , 0.57837   ],
       [0.49408   , 0.66846   ],
       [0.80583   , 0.43242   ],
       [0.79935   , 0.7757    ],
       [0.35813   , 0.58772   ],
       [0.79

#### The code below shows how to display the decision boundaries between classes when the input data are two-dimensional

In [14]:
knefmnn_clf.draw_hyperbox_and_boundary("The trained KNEFMNN classifier and its decision boundaries")


In [15]:
print("Number of existing hyperboxes = %d"%(knefmnn_clf.get_n_hyperboxes()))

Number of existing hyperboxes = 51


### Prediction

In [16]:
from sklearn.metrics import accuracy_score

In [17]:
y_pred = knefmnn_clf.predict(Xtest)
acc = accuracy_score(ytest, y_pred)
print(f'Accuracy = {acc * 100: .2f}%')

Accuracy =  86.60%
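For reference, the accuracy reported here is simply the fraction of testing samples whose predicted label equals the true label. A plain-Python sketch of that computation, with made-up labels and a hypothetical helper named `accuracy`, might look like this:

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration: four of the five predictions are correct.
toy_true = [1, 2, 2, 1, 2]
toy_pred = [1, 2, 1, 1, 2]

toy_acc = accuracy(toy_true, toy_pred)
print(f"Accuracy = {toy_acc * 100:.2f}%")  # Accuracy = 80.00%
```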


#### Explaining the predicted result for the input sample by showing membership values and hyperboxes for each class

In [18]:
sample_need_explain = 10
y_pred_input_0, mem_val_classes, min_points_classes, max_points_classes = knefmnn_clf.get_sample_explanation(Xtest[sample_need_explain])

In [19]:
print("Predicted class for sample X = [%f, %f] is %d and real class is %d" % (Xtest[sample_need_explain, 0], Xtest[sample_need_explain, 1], y_pred_input_0, ytest[sample_need_explain]))

Predicted class for sample X = [0.571640, 0.233700] is 2 and real class is 2


In [20]:
print("Membership values:")
for key, val in mem_val_classes.items():
    print("Class %d has the maximum membership value = %f" % (key, val))
    
for key in min_points_classes:
    print("Class %d has the representative hyperbox: V = %s and W = %s" % (key, min_points_classes[key], max_points_classes[key]))

Membership values:
Class 1 has the maximum membership value = 0.964263
Class 2 has the maximum membership value = 0.990050
Class 1 has the representative hyperbox: V = [0.58339 0.3649 ] and W = [0.66091    0.38616125]
Class 2 has the representative hyperbox: V = [0.57285 0.27229] and W = [0.66773 0.36489]
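The membership values above can be checked by hand. The sketch below assumes the classifier scores a sample with Simpson's original membership function, averaging over the n dimensions how far the sample falls outside the box on each side, scaled by `gamma`. With the rounded sample and hyperbox coordinates printed above and `gamma = 1`, this formula agrees with the reported values to about five decimal places. The function name and layout are illustrative and not copied from hbbrain.

```python
def simpson_membership(x, v, w, gamma=1.0):
    """Simpson's fuzzy min-max membership of point x in the hyperbox [v, w]."""
    n = len(x)
    total = 0.0
    for xi, vi, wi in zip(x, v, w):
        # Penalty for exceeding the maximum point w on this axis.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, xi - wi)))
        # Penalty for falling below the minimum point v on this axis.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, vi - xi)))
        total += above + below
    return total / (2 * n)


# Sample and representative hyperboxes as printed in the output above.
x = [0.571640, 0.233700]
boxes = {
    1: ([0.58339, 0.3649], [0.66091, 0.38616125]),
    2: ([0.57285, 0.27229], [0.66773, 0.36489]),
}

memberships = {label: simpson_membership(x, v, w, gamma=1)
               for label, (v, w) in boxes.items()}
for label, value in memberships.items():
    print("Class %d: membership = %.4f" % (label, value))

# The predicted class is the one with the largest membership value.
predicted = max(memberships, key=memberships.get)
print("Predicted class:", predicted)
```

A point inside a hyperbox gets a membership of exactly 1. Here the sample lies slightly outside both boxes, and class 2 wins because its box is closer along the second feature.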


### Show the input sample and the hyperboxes belonging to each class. In 2D, we can draw rectangles or use parallel coordinates

#### Using rectangles to show explanations

In [21]:
knefmnn_clf.show_sample_explanation(Xtest[sample_need_explain], Xtest[sample_need_explain], min_points_classes, max_points_classes, y_pred_input_0, "2D")


#### Using parallel coordinates. This mode works for data of any dimensionality

In [22]:
knefmnn_clf.show_sample_explanation(Xtest[sample_need_explain], Xtest[sample_need_explain], min_points_classes, max_points_classes, y_pred_input_0)