# <center>Naïve Bayes Classifier</center>
<font size="3">
This unit explains the basic idea behind the Naïve Bayes classifier in simple words. No background in statistics or data science is required, but knowing the basics of programming will help with the exercises at the end. 

***
    
##  1. Introduction 


Suppose that you work in an organization where, every few years, an employee may receive a promotion that is announced as a surprise at the end-of-year party. You wonder if you could find a way to make a good guess about your own case so that you could be prepared and well dressed for the surprise. So you start shaping a story about it.
You think about the relevant features that may lead to an employee's promotion and end up with four: **appearance, punctuality, client satisfaction** and **charisma**. 
    
You have the impression that good-looking or well-dressed people are more likely to be promoted. People who are punctual with their tasks and working hours, as well as those with happy clients, also have a higher chance of promotion. Finally, you think that charismatic employees are more likely to be promoted. 

You are aware that some of these __features are related to each other__, which makes your analysis inaccurate. For instance, good-looking employees or those with a charismatic character may also get more positive feedback from clients. But you don’t want to bother with being super precise about your story, so you decide to stick with your **naïve** thoughts: you treat each feature as if it were unrelated to the others. 

Ultimately, you are interested in **finding out your chance of promotion given your scores on the features**. In other words, assuming that there are two classes of employees, **promoted** and **not yet promoted**, you want to find out which class you will belong to, or how likely you are to be in each of these classes.
So the first step is to give yourself some scores. You are often well dressed, often punctual, often receive positive feedback from clients, and you do not have a charismatic character.
You may think that your chance depends on two elements: 

1. **How often there is a promotion in general**.<br> 
    The frequency of promoted employees or, more precisely, the chance of a promotion happening at all in your company is going to matter. Imagine that no one in your company has ever received a promotion. Then the chance of an employee receiving a promotion is zero irrespective of her scores on the features. On the other hand, suppose that half of the employees in your company have received a promotion so far, that is, the likelihood of such an event is 50%. Clearly, the higher this likelihood is, the higher is your chance of eventually being one of them. 


2. **How often promoted persons have a particular feature**. <br>
    The number or frequency of your promoted colleagues who have the same scores as you also matters. This makes a lot of sense! Imagine that your office mate has almost the same scores as you and she got promoted last year. That makes you excited, and you may think that you will be next. 


<div class="alert alert-info">
<b>If you are convinced that these two elements are relevant then you are already using the Bayes rule. Welcome to the club!</b><br>
Note that the full specification of the Bayes rule <a href="https://en.wikipedia.org/wiki/Bayes%27_theorem">(see here)</a> is slightly different from what we have sketched. However, one does not need the full form of the rule to develop the Bayes Classifier.<br>   
<b>Bayes rule implies that</b>:

<center>chance of promotion given the scores is proportional to $\{$chance of each score given the promotion $\times$ chance of promotion$\}$</center>
</div>


Now let's put things into action. For the first element, let's suppose that out of 100 employees in total, 50 have already received a promotion. For the second element, let's count the promoted persons who have the same scores as you. 

* You find that out of the 50 promoted persons, 45 have been as punctual as you. 
* 10 are in the same class of appearance as you. 
* 40 have client feedback rates close to yours. 
* 20 are, like you, not charismatic.

We can translate these numbers into chances:

* $\frac{45}{50}$ is the chance that a promoted person is as punctual as you. 
* $\frac{10}{50}$ is the chance that a promoted person is in the same range of appearance as you. 
* $\frac{40}{50}$ and $\frac{20}{50}$ are the corresponding chances for client satisfaction and charisma. 
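
The translation from counts to chances can be sketched in a few lines of Python. This is only an illustration of the counting step; the names `n_promoted`, `promoted_counts` and `chances_given_promotion`, as well as the feature keys, are chosen here for the example and are not part of the story.

```python
# Counts from the story: of the 50 promoted employees, how many share
# each of your scores. All names below are illustrative.
n_promoted = 50
promoted_counts = {
    "punctuality": 45,
    "appearance": 10,
    "client_satisfaction": 40,
    "charisma": 20,  # promoted employees who, like you, are not charismatic
}

# Chance of each score given promotion = matching count / number promoted.
chances_given_promotion = {
    feature: count / n_promoted for feature, count in promoted_counts.items()
}

for feature, chance in chances_given_promotion.items():
    print(f"{feature}: {chance:.2f}")
# punctuality: 0.90
# appearance: 0.20
# client_satisfaction: 0.80
# charisma: 0.40
```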

Now you can quantify your chance. Recall that you want to calculate your chance of promotion given your scores. Putting the two elements (1) and (2) together, it is proportional to: 

$\frac{50}{100}\times \big[\frac{45}{50} \times \frac{10}{50} \times \frac{40}{50} \times \frac{20}{50}\big]$ which is about 3%. 
On the other hand, your chance of not receiving a promotion is proportional to: 

$\frac{50}{100}\times \big[(1-\frac{45}{50}) \times (1-\frac{10}{50}) \times (1-\frac{40}{50}) \times (1-\frac{20}{50})\big]$ which is about 0.5%, that is, 6 times smaller than the value for receiving the promotion. Here $\frac{50}{100}$ is the share of employees who have not been promoted and, to keep the story simple, the chance of each score among them is taken to be one minus the corresponding chance among promoted employees. With real data you would count these chances directly among the 50 employees who have not been promoted. 

Thus, it is 6 times more likely that you will be in the promotion class than in the other class.
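
The arithmetic above can be checked with a short Python sketch. It follows the story's simplification that the chance of each score in the not-promoted class is one minus the chance in the promoted class; all variable names are illustrative. The last step, dividing by the sum of the two scores to obtain a proper probability, is an addition for illustration and is not needed for the comparison itself.

```python
import math

# Element (1): how often a promotion happens at all.
prior_promoted = 50 / 100
prior_not_promoted = 1 - prior_promoted

# Element (2): chances of your four scores among promoted employees.
chances_given_promotion = [45 / 50, 10 / 50, 40 / 50, 20 / 50]

# Simplification used in the story: take the complement for the other class.
chances_given_no_promotion = [1 - c for c in chances_given_promotion]

# Scores proportional to the chance of each class given your scores.
score_promoted = prior_promoted * math.prod(chances_given_promotion)
score_not_promoted = prior_not_promoted * math.prod(chances_given_no_promotion)

ratio = score_promoted / score_not_promoted

# Optional: normalise so that the two classes sum to one.
prob_promoted = score_promoted / (score_promoted + score_not_promoted)

print(f"score (promoted):     {score_promoted:.4f}")      # 0.0288
print(f"score (not promoted): {score_not_promoted:.4f}")  # 0.0048
print(f"ratio:                {ratio:.1f}")               # 6.0
print(f"P(promoted | scores): {prob_promoted:.3f}")       # 0.857
```

`math.prod` requires Python 3.8 or later.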

<div class="alert alert-success">
Congratulations! You have just built your first Naïve Bayes Classifier. You can also use it to assess the chance of promotion for your colleagues, given their specific features.
</div>

If you have followed all the steps so far, you will be able to read the following expression, which gives some structure to the story:<br>

$ P(A \mid B) \propto 
\prod_{i=1}^{4} P(B_i\mid A)
P(A) $

where $A$ is the promotion and $B = (B_1, \dots, B_4)$ are your scores on the four features.
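
As a minimal sketch, the expression can be turned into a small reusable Python function. The function names `naive_bayes_score` and `classify` and the input layout (a dictionary mapping each class name to its prior and its list of per-feature chances) are assumptions made for this example, not a standard API.

```python
import math


def naive_bayes_score(prior, likelihoods):
    """Return P(A) * prod_i P(B_i | A), a score proportional to P(A | B)."""
    return prior * math.prod(likelihoods)


def classify(class_inputs):
    """Score every class, normalise the scores, and return the best class.

    class_inputs maps a class name to a (prior, likelihoods) pair.
    """
    scores = {
        name: naive_bayes_score(prior, likelihoods)
        for name, (prior, likelihoods) in class_inputs.items()
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all class scores are zero; cannot normalise")
    posteriors = {name: score / total for name, score in scores.items()}
    best = max(scores, key=scores.get)
    return best, posteriors


# Usage with the numbers from the story (the second class uses the story's
# complement simplification for its per-feature chances).
best, posteriors = classify({
    "promoted": (0.5, [0.9, 0.2, 0.8, 0.4]),
    "not promoted": (0.5, [0.1, 0.8, 0.2, 0.6]),
})
print(best)
print({name: round(p, 3) for name, p in posteriors.items()})
```

With more features, multiplying many small chances can underflow; summing logarithms of the chances instead is a common remedy.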

### **Summary:**
* ...
* ...
* ...
</font>