
A Gaussian Naive Bayes (GNB) classifier built from scratch in C++ without any external dependencies such as Eigen. The project provides a generic implementation for predicting the behaviour of a vehicle on a highway.

Aparajith-S/Gaussian_Naive_Bayes


Gaussian Naive Bayes Classifier

author : s.aparajith@live.com
date : 14/5/2021


Building the project

Windows

  • requires the MSVC 15 compiler or above.
  • requires a recent version of CMake.
  • the build can be triggered with the following commands:
  •  mkdir build
     cd build
     cmake .. -G "Visual Studio 16 2019" -A x64
     cmake --build . --config Release
     cd Release
     GNBClassifier.exe  
    

Linux

  • requires the GNU g++ 5.4 compiler or above.
  • requires a recent version of CMake.
  • the build can be triggered with the following commands:
  •  mkdir build
     cd build
     cmake .. && make
     ./GNBClassifier
    

Introduction

This project covers the theory of the Gaussian naive Bayes classifier and its implementation in C++. It uses example data of a vehicle making lane changes. The Gaussian NB classifier predicts the behavior of the vehicle on the highway given its Frenet coordinates s and d and their first-order derivatives.

Theory

The Gaussian naive Bayes classifier is an extension of the naive Bayes classifier. Abstractly, naive Bayes is a conditional probability model: given a problem instance to be classified, represented by a vector x = (x1, x2, x3 ... xn) of n features (independent variables), it assigns to this instance the probabilities

$$p(C_k \mid x_1, \ldots, x_n)$$

for each of the K possible outcomes or classes Ck.

The problem with the above formulation is that if the number of features n is large or if a feature takes on a large number of values, then basing such a model on probability tables is infeasible. The model must therefore be reformulated to make it more tractable.

Using Bayes' theorem, the conditional probability can be decomposed as

$$p(C_k \mid \mathbf{x}) = \frac{p(C_k)\, p(\mathbf{x} \mid C_k)}{p(\mathbf{x})}$$

In plain terms, this reads as

$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}$$

In practice, only the numerator is of interest: the denominator does not depend on C, and the feature values xi are given, so the denominator is effectively constant.

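The numerator is equivalent to the joint probability model, which can be rewritten by repeatedly applying the chain rule of conditional probability:

$$p(C_k, x_1, \ldots, x_n) = p(x_1 \mid x_2, \ldots, x_n, C_k)\, p(x_2 \mid x_3, \ldots, x_n, C_k) \cdots p(x_{n-1} \mid x_n, C_k)\, p(x_n \mid C_k)\, p(C_k)$$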

Now make the naive assumption that all features in x are mutually independent, conditional on the category Ck. Under this assumption,

$$p(x_i \mid x_{i+1}, \ldots, x_n, C_k) = p(x_i \mid C_k)$$

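Hence, the joint model can be expressed as:

$$p(C_k \mid x_1, \ldots, x_n) \propto p(C_k, x_1, \ldots, x_n) = p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k)$$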

Thus, with the above independence assumptions, the conditional distribution over the class variable C is:

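$$p(C_k \mid x_1, \ldots, x_n) = \frac{1}{Z}\, p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k), \quad Z = p(\mathbf{x}) = \sum_{k} p(C_k)\, p(\mathbf{x} \mid C_k)$$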

Training the classifier

For a feature x and label C with mean μ and standard deviation σ, the conditional probability can be computed using the formula

$$p(x = v \mid C) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(v - \mu)^2}{2\sigma^2}\right)$$

Here v is the observed state of the vehicle. It is used in the prediction step to evaluate the conditional probability of x given C, from which the probability of C given x can be found.
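The training step can be sketched as estimating μ and σ for one feature of one label and then evaluating the Gaussian likelihood. This is a minimal sketch, not the code in src/classifier.h: the names `FeatureStats`, `fit_feature` and `gaussian_pdf` are hypothetical, and the population standard deviation (dividing by N) is an assumption.

```cpp
#include <cmath>
#include <vector>

// Hypothetical container for the statistics of one feature under one label.
struct FeatureStats {
    double mean;
    double stddev;
};

// Estimates the mean and population standard deviation from the training
// values of a single feature that all share the same label.
// Assumes `values` is non-empty.
FeatureStats fit_feature(const std::vector<double>& values) {
    double sum = 0.0;
    for (double v : values) sum += v;
    const double mean = sum / static_cast<double>(values.size());

    double sq = 0.0;
    for (double v : values) sq += (v - mean) * (v - mean);
    const double variance = sq / static_cast<double>(values.size());
    return {mean, std::sqrt(variance)};
}

// Gaussian likelihood p(x = v | C) for the given statistics.
// Assumes stddev > 0.
double gaussian_pdf(double v, const FeatureStats& s) {
    const double kPi = 3.14159265358979323846;
    const double var = s.stddev * s.stddev;
    const double diff = v - s.mean;
    return std::exp(-(diff * diff) / (2.0 * var)) / std::sqrt(2.0 * kPi * var);
}
```

A full classifier would call something like `fit_feature` once per (label, feature) pair, for example for each of s, d and their derivatives under each behavior label.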

Prediction

In this formula, the argmax is taken over all possible labels Ck and the product is taken over all features Xi with values vi.

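$$\hat{y} = \underset{k \in \{1, \ldots, K\}}{\operatorname{argmax}}\; p(C_k) \prod_{i=1}^{n} p(x_i = v_i \mid C_k)$$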
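The argmax over labels could look roughly like the sketch below. It is self-contained and does not reproduce the GNB class from the project: the `LabelModel` struct, the `predict_label` function and the label strings used in the usage note are assumptions for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-label model: the prior p(C_k) plus one
// (mean, stddev) pair per feature.
struct LabelModel {
    std::string label;
    double prior;
    std::vector<double> means;
    std::vector<double> stddevs;
};

// Gaussian likelihood p(x = v | C); assumes stddev > 0.
double gaussian(double v, double mean, double stddev) {
    const double kPi = 3.14159265358979323846;
    const double var = stddev * stddev;
    const double diff = v - mean;
    return std::exp(-(diff * diff) / (2.0 * var)) / std::sqrt(2.0 * kPi * var);
}

// Returns the label maximising p(C_k) * prod_i p(x_i = v_i | C_k).
// Assumes every model stores one mean and one stddev per sample feature.
std::string predict_label(const std::vector<LabelModel>& models,
                          const std::vector<double>& sample) {
    std::string best;
    double best_score = -1.0;  // all scores are >= 0, so any label beats this
    for (const LabelModel& m : models) {
        double score = m.prior;
        for (std::size_t i = 0; i < sample.size(); ++i) {
            score *= gaussian(sample[i], m.means[i], m.stddevs[i]);
        }
        if (score > best_score) {
            best_score = score;
            best = m.label;
        }
    }
    return best;
}
```

With three illustrative models on a single feature (means 1.0, 0.0 and -1.0 for hypothetical labels "left", "keep" and "right", equal priors, standard deviation 0.5), a sample of 0.9 is assigned to "left" and a sample of 0.1 to "keep".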

Code

src/classifier.h contains the class GNB, which represents a Gaussian naive Bayes classifier object.

  • the void train(...) method trains the model using the theory presented above.
  • the string predict(...) method makes a prediction using the trained model.
    Note: the member possible_labels needs to be extended or changed if the data files contain more or different labels.

Data

The image below shows the behaviors possible on a 3-lane highway (with lanes 4 meters wide). The dots represent the d (y axis) and s (x axis) coordinates of vehicles as they either...

  • change lanes left (shown in blue)
  • keep lane (shown in black)
  • or change lanes right (shown in red)

[Image: d (y axis) against s (x axis) coordinates of vehicles for the three behaviors]

Each data point contains the following four features:

  • s
  • d
  • d(s)/dt
  • d(d)/dt

The lane width is 4 m.
