# FPGA Accelerated - Naive Bayes Classification

#### In this notebook we use the **InAccel modified GaussianNB** class to accelerate the classification part of the algorithm. We also use the **original scikit-learn implementation** to compute the overall **speedup** and to compare the predictions of the two runs. For training and classification we create a custom dataset, whose number of **samples**, **features** and **classes** can then be adjusted to inspect how the speedup is affected.

### Import all necessary libraries

In [None]:
from inaccel.sklearn.naive_bayes import GaussianNB
import numpy as np
from sklearn import datasets
from sklearn import metrics
from time import time

### Create a custom dataset with the defined number of samples, features and classes

In [None]:
samples = 10000
features = 500
classes = 10

X, y = datasets.make_classification(n_samples = samples, n_features = features, n_informative = 400, n_redundant = 50,
                                    n_repeated = 50, n_classes = classes, class_sep = 10.0, random_state = 0)
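
The informative, redundant and repeated counts above are hard-coded for `features = 500`. If `features` is lowered below 500, their sum exceeds it and `make_classification` raises an error; if it is raised, the extra columns become noise features. Below is a minimal sketch of one way to derive the counts from `features`, assuming the same 80/10/10 split as the values above. `split_feature_counts` is a hypothetical helper, not part of scikit-learn or InAccel.

```python
def split_feature_counts(features, classes, clusters_per_class=2):
    """Hypothetical helper: split `features` 80/10/10 into
    (n_informative, n_redundant, n_repeated) using integer arithmetic."""
    n_informative = features * 8 // 10
    n_redundant = features // 10
    # Whatever is left over is assigned to the repeated features
    n_repeated = features - n_informative - n_redundant
    # make_classification needs enough informative features to place
    # every class cluster (2 clusters per class is its default)
    if classes * clusters_per_class > 2 ** n_informative:
        raise ValueError("too few informative features for this many classes")
    return n_informative, n_redundant, n_repeated


print(split_feature_counts(500, 10))  # (400, 50, 50)
print(split_feature_counts(100, 10))  # (80, 10, 10)
```

The three returned values can be passed to `datasets.make_classification` as `n_informative`, `n_redundant` and `n_repeated`.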

### Use only **10%** of the generated samples for the training part and the rest for the classification

In [None]:
# Samples used for training the model
train_samples = int(0.1 * samples)

train_labels = y[:train_samples]
train_features = X[:train_samples]
print("Train data shape:\n\tLabels: " + str(train_labels.shape) + "\n\tFeatures: " + str(train_features.shape))

test_labels = y[train_samples:]
test_features = X[train_samples:]
print("Test data shape:\n\tLabels: " + str(test_labels.shape) + "\n\tFeatures: " + str(test_features.shape))

### Create a Naive Bayes object and **train** a model

In [None]:
nb = GaussianNB()

startTime = time()
nb_model = nb.fit(train_features, train_labels)
elapsedTime = int((time() - startTime) * 100) / 100

print("Naive Bayes training (CPU) took: " + str(elapsedTime) + " sec")

### Calculate the predictions using the FPGA resources

In [None]:
startTime = time()
predictions = nb_model.predict(test_features)
elapsedTime = int((time() - startTime) * 100) / 100

print("Accuracy: " + str(int(metrics.accuracy_score(test_labels, predictions) * 10000) / 100) + "%")
print("Naive Bayes classification (FPGA) took: " + str(elapsedTime) + " sec")

### Import the original scikit-learn GaussianNB class to compare the execution time of the classification part

In [None]:
from sklearn.naive_bayes import GaussianNB as OriginalNB

cpuNB = OriginalNB()

cpu_model = cpuNB.fit(train_features, train_labels)

### Calculate the predictions using the CPU resources

In [None]:
startTimeCPU = time()
predictionsCPU = cpu_model.predict(test_features)
elapsedTimeCPU = int((time() - startTimeCPU) * 100) / 100

print("Accuracy: " + str(int(metrics.accuracy_score(test_labels, predictionsCPU) * 10000) / 100) + "%")
print("Naive Bayes classification (CPU) took: " + str(elapsedTimeCPU) + " sec")
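
The two accuracy figures show whether the runs agree overall, but not whether individual predictions differ. Below is a minimal sketch of an element-wise comparison. `compare_predictions` is a hypothetical helper, and the toy lists stand in for `predictions` and `predictionsCPU`.

```python
def compare_predictions(pred_a, pred_b):
    """Return (number of differing predictions, fraction of differing predictions)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction sequences must have the same length")
    # Count positions where the two runs predicted different labels
    mismatches = sum(1 for a, b in zip(pred_a, pred_b) if a != b)
    fraction = mismatches / len(pred_a) if len(pred_a) else 0.0
    return mismatches, fraction


# Toy labels standing in for `predictions` (FPGA) and `predictionsCPU`
fpga_demo = [0, 1, 2, 2, 1]
cpu_demo = [0, 1, 2, 1, 1]
mismatches, fraction = compare_predictions(fpga_demo, cpu_demo)
print("Differing predictions: " + str(mismatches) + " (" + str(fraction * 100) + "%)")
```

In the notebook the call would be `compare_predictions(predictions, predictionsCPU)`. For NumPy arrays of equal length, `np.mean(predictions != predictionsCPU)` gives the same fraction.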

### Speedup Calculation

In [None]:
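# Guard against division by zero: the timings above are truncated to 1/100 sec
speedup = int(elapsedTimeCPU / elapsedTime * 100) / 100 if elapsedTime > 0 else float("inf")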
print("Speedup: " + str(speedup))
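
A single `time()` measurement truncated to two decimals can be noisy, and a very fast run can even truncate to zero. Below is a minimal sketch of a more robust measurement, assuming that the fastest of a few repeated runs is a fair basis for comparison. `best_time` and `safe_speedup` are hypothetical helpers, not part of InAccel or scikit-learn.

```python
from time import perf_counter


def best_time(func, *args, repeats=3):
    """Run func(*args) `repeats` times; return (fastest wall-clock seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = perf_counter()
        result = func(*args)
        best = min(best, perf_counter() - start)
    return best, result


def safe_speedup(cpu_seconds, fpga_seconds):
    """CPU time divided by FPGA time; infinity if the FPGA time measured as zero."""
    return cpu_seconds / fpga_seconds if fpga_seconds > 0 else float("inf")


# Toy workload standing in for a call to `predict`
seconds, total = best_time(sum, range(100000))
print("sum took " + str(seconds) + " sec, result " + str(total))
print("Speedup: " + str(safe_speedup(2.0, 0.5)))
```

In the notebook, the two timings would come from `best_time(nb_model.predict, test_features)` and `best_time(cpu_model.predict, test_features)`. Taking the fastest run may hide any one-off setup cost of the first accelerated call, so report the first-call time separately if that cost matters.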