# Naive Bayes, Part 3

### Naive Bayes by Example 3
In Part 2, you saw a Naive Bayes classifier for rain prediction with three input features. Now let's work through a new example: an intelligent lighting system. The light has two states, 'On' or 'Off', and the state depends on user behaviour as captured by the time of day and the light intensity in the environment.

Consider the training data shown in the following table. It has two input features (light sensor and time) and one label (light state). The light sensor value is normalized to the range 0 to 1. The time runs from 00:00 to 23:00 in 24-hour notation. Each tuple pairs a light sensor reading and a time with the light state, 'On' or 'Off'.

In [5]:
import pandas as pd
df = pd.read_csv("intelligent-lighting.csv")
df

Unnamed: 0,Light Sensor,Time,Light State
0,0.0,00:00,Off
1,0.0,01:00,Off
2,0.0,02:00,Off
3,0.0,03:00,Off
4,0.0,04:00,Off
5,0.1,04:00,Off
6,0.1,05:00,On
7,0.2,06:00,On
8,0.3,07:00,On
9,0.4,08:00,Off


Before we build our Naive Bayes model, I want to group light sensor values into four levels as follows:
$$
\text{Light Sensor}\gets
\begin{cases}
    0 & \text{if $0.00\leq$ Light Sensor $\leq 0.25$} \\
    1 & \text{if $0.25<$ Light Sensor $\leq 0.50$} \\
    2 & \text{if $0.50<$ Light Sensor $\leq 0.75$} \\
    3 & \text{if $0.75<$ Light Sensor $\leq 1.00$} \\
\end{cases}
$$

In [6]:
# group light sensor values into 4 levels: [0, 0.25], (0.25, 0.50], (0.50, 0.75], (0.75, 1.0]
# pd.cut leaves no gaps between levels, so a reading such as 0.255 still gets a level
df['Light Sensor'] = pd.cut(
    df['Light Sensor'],
    bins=[0.0, 0.25, 0.50, 0.75, 1.0],
    labels=[0, 1, 2, 3],
    include_lowest=True,    # make the first bin include 0.0
).astype('int64')    # convert the categorical result to int64
df

Unnamed: 0,Light Sensor,Time,Light State
0,0,00:00,Off
1,0,01:00,Off
2,0,02:00,Off
3,0,03:00,Off
4,0,04:00,Off
5,0,04:00,Off
6,0,05:00,On
7,0,06:00,On
8,1,07:00,On
9,1,08:00,Off


Now, let's build our Naive Bayes model for predicting the light state. Let $X_{i}$, where $i\in\{1, 2\}$, be the features that correspond to light sensor and time, respectively. Let $y$ be the label (light state). You can predict the light state using the following Naive Bayes equation:
$$\hat{y}=\arg\max_{y}P(y)\prod_{i=1}^{n}P(X_{i}|y)$$
where $\hat{y}$ is the prediction and $n=2$.

Let's say the light sensor level is 2 and the time is 12:00. Can you predict the light state? Let's apply the same steps as in the previous examples. Here, we want to calculate the following:

$$\hat{y}=\arg\max_{y}\big(P(y=off)P(X_{1}=2|y=off)P(X_{2}=12|y=off),\ P(y=on)P(X_{1}=2|y=on)P(X_{2}=12|y=on)\big)$$

#### **Step 0: create a frequency table for each feature of the training data**

| Light Sensor | Off   | On     | Total |
|--------------|-------|--------|-------|
| Level 0      | 6     | 12     | 18    |
| Level 1      | 4     | 5      | 9     |
| Level 2      | 5     | 0      | 5     |
| Level 3      | 5     | 0      | 5     |
| Total        | 20    | 17     | 37    |

| Time      | Off   | On     | Total |
|-----------|-------|--------|-------|
| 00:00     | 1     | 0      | 1     |
| 01:00     | 1     | 0      | 1     |
| 02:00     | 1     | 0      | 1     |
| 03:00     | 1     | 0      | 1     |
| 04:00     | 2     | 0      | 2     |
| 05:00     | 0     | 1      | 1     |
| 06:00     | 0     | 1      | 1     |
| 07:00     | 0     | 1      | 1     |
| 08:00     | 1     | 1      | 2     |
| 09:00     | 1     | 1      | 2     |
| 10:00     | 1     | 0      | 1     |
| 11:00     | 1     | 0      | 1     |
| 12:00     | 2     | 1      | 3     |
| 13:00     | 2     | 0      | 2     |
| 14:00     | 1     | 1      | 2     |
| 15:00     | 1     | 1      | 2     |
| 16:00     | 1     | 0      | 1     |
| 17:00     | 1     | 1      | 2     |
| 18:00     | 1     | 1      | 2     |
| 19:00     | 1     | 1      | 2     |
| 20:00     | 0     | 2      | 2     |
| 21:00     | 0     | 2      | 2     |
| 22:00     | 0     | 1      | 1     |
| 23:00     | 0     | 1      | 1     |
| Total     | 20    | 17     | 37    |
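
A frequency table like the ones above is just a count of (feature value, label) pairs. The sketch below tallies the light sensor level against the light state with `collections.Counter`. It uses only the ten binned rows displayed earlier, so its counts are a subset of the 37-row totals in the tables. The `rows` list and the `freq` name are illustrative and not part of the original notebook.

```python
from collections import Counter

# The ten binned rows displayed earlier: (light sensor level, time, light state).
# The full training set has 37 rows, so these counts are only a subset.
rows = [
    (0, "00:00", "Off"), (0, "01:00", "Off"), (0, "02:00", "Off"),
    (0, "03:00", "Off"), (0, "04:00", "Off"), (0, "04:00", "Off"),
    (0, "05:00", "On"),  (0, "06:00", "On"),  (1, "07:00", "On"),
    (1, "08:00", "Off"),
]

# Count how often each (level, state) pair occurs.
freq = Counter((level, state) for level, _, state in rows)

for level in sorted({level for level, _, _ in rows}):
    off, on = freq[(level, "Off")], freq[(level, "On")]
    print(f"Level {level}: Off={off}, On={on}, Total={off + on}")
```

With pandas, `pd.crosstab(df['Light Sensor'], df['Light State'], margins=True)` should give the same kind of table directly from the DataFrame.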

#### **Step 1: compute the probabilities for each value of the label**
Out of 37 observations, you have 20 'Off' and 17 'On'. So the respective probabilities are:
$$P(y=off)=\frac{20}{37}$$
$$P(y=on)=\frac{17}{37}$$

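Because some of the conditional probabilities in Step 2 are zero, Laplace smoothing is applied to every estimate, including the label probabilities. Add 1 to each count and add the number of classes (2) to the denominator:
$$P(y=off)=\frac{20+1}{37+2\cdot 1}=\frac{21}{39}$$
$$P(y=on)=\frac{17+1}{37+2\cdot 1}=\frac{18}{39}$$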
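
The smoothed label probabilities can be checked with exact arithmetic. This is a minimal sketch that assumes add-one smoothing with the label counts from the frequency tables. The names `counts` and `priors` are illustrative. Note that `Fraction` reduces its result, so 21/39 prints as 7/13.

```python
from fractions import Fraction

# Label counts taken from the frequency tables.
counts = {"off": 20, "on": 17}
total = sum(counts.values())    # 37 observations
n_classes = len(counts)         # 2 possible labels

# Add-one (Laplace) smoothing: (count + 1) / (total + number of classes).
priors = {
    label: Fraction(count + 1, total + n_classes)
    for label, count in counts.items()
}

for label, p in priors.items():
    print(label, p, round(float(p), 4))
```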

#### **Step 2: compute the conditional probability**
Out of 20 'Off', you have 5 'Level 2'. So the probability is:
$$P(X_{1}=2|y=off)=\frac{5}{20}$$
Out of 20 'Off', you have 2 '12:00'. So the probability is:
$$P(X_{2}=12|y=off)=\frac{2}{20}$$
Out of 17 'On', you have 0 'Level 2'. So the probability is:
$$P(X_{1}=2|y=on)=\frac{0}{17}$$
Out of 17 'On', you have 1 '12:00'. So the probability is:
$$P(X_{2}=12|y=on)=\frac{1}{17}$$

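Because $P(X_{1}=2|y=on)$ is zero, you need to apply Laplace smoothing. Add 1 to each count and add the number of possible feature values to the denominator (4 light sensor levels, 24 time values):
$$P(X_{1}=2|y=off)=\frac{5+1}{20+4\cdot 1}=\frac{6}{24}$$
$$P(X_{2}=12|y=off)=\frac{2+1}{20+24\cdot 1}=\frac{3}{44}$$
$$P(X_{1}=2|y=on)=\frac{0+1}{17+4\cdot 1}=\frac{1}{21}$$
$$P(X_{2}=12|y=on)=\frac{1+1}{17+24\cdot 1}=\frac{2}{41}$$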
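
The same smoothing rule can be written as one small helper. This is a sketch only: the `laplace` function, its `alpha` parameter, and the constant names are hypothetical, and the counts are read off the frequency tables.

```python
from fractions import Fraction

def laplace(count, class_total, n_values, alpha=1):
    """Smoothed estimate of P(feature value | class)."""
    return Fraction(count + alpha, class_total + alpha * n_values)

N_LEVELS = 4    # light sensor levels 0..3
N_TIMES = 24    # hourly time values 00:00..23:00

p_level2_off = laplace(5, 20, N_LEVELS)   # 6/24
p_noon_off = laplace(2, 20, N_TIMES)      # 3/44
p_level2_on = laplace(0, 17, N_LEVELS)    # 1/21, no longer zero
p_noon_on = laplace(1, 17, N_TIMES)       # 2/41

print(p_level2_off, p_noon_off, p_level2_on, p_noon_on)
```

Without smoothing, the zero count for 'Level 2' under 'On' would force the whole 'On' product to zero regardless of the other factors.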

#### **Step 3: substitute the probabilities into the Naive Bayes classifier**
$$\hat{y}=\arg\max_{y}\big(P(y=off)P(X_{1}=2|y=off)P(X_{2}=12|y=off),\ P(y=on)P(X_{1}=2|y=on)P(X_{2}=12|y=on)\big)$$
$$\hat{y}=\arg\max_{y}\left(\frac{21}{39}\cdot\frac{6}{24}\cdot\frac{3}{44},\ \frac{18}{39}\cdot\frac{1}{21}\cdot\frac{2}{41}\right)$$
$$\hat{y}=\arg\max_{y}(0.00918, 0.00107)$$
$$\hat{y}=off$$

So, our predicted light state when light sensor level is 2 and time is 12:00 is 'Off'.
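
To double-check the arithmetic, the sketch below multiplies the smoothed prior and the two smoothed conditionals for each label and picks the larger product. The `factors`, `scores`, and `prediction` names are illustrative. The fractions are the ones derived in Steps 1 and 2.

```python
from fractions import Fraction
from math import prod

# Smoothed prior followed by the two smoothed conditionals, per label.
factors = {
    "off": [Fraction(21, 39), Fraction(6, 24), Fraction(3, 44)],
    "on": [Fraction(18, 39), Fraction(1, 21), Fraction(2, 41)],
}

# Naive Bayes score: product of the prior and the conditionals.
scores = {label: prod(fs) for label, fs in factors.items()}

# arg max over the labels.
prediction = max(scores, key=scores.get)

for label, score in scores.items():
    print(f"{label}: {float(score):.5f}")
print("prediction:", prediction)
```

These scores are not normalized probabilities. Only their ranking matters for the arg max.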