# Understanding the Naive Bayes Classifier with Email Classification

In email filtering, the Naive Bayes classifier is a widely used and efficient method for separating legitimate email from spam. This article walks through how it works and how it applies to email classification.

## Exploring the Basics

### Bayes' Theorem

Before delving into Naive Bayes, let's acquaint ourselves with Bayes' theorem, a cornerstone of probability theory:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

where:

- **$P(A \mid B)$**: the probability of event $A$ given that event $B$ has occurred (the posterior). In other words, it's how likely $A$ is once we know $B$.

- **$P(B \mid A)$**: the conditional probability of event $B$ given that event $A$ has occurred (the likelihood).

- **$P(A)$**: the probability of event $A$ on its own, before any evidence is taken into account (the prior).

- **$P(B)$**: the probability of event $B$ on its own, across all cases (the evidence, or marginal probability).


This theorem lays the groundwork for probabilistic inference, guiding our understanding of how evidence influences our beliefs.
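
To make the formula concrete, here is a minimal sketch that applies it to a single word. The numbers are hypothetical and chosen only for illustration (they do not come from any dataset in this article), and the function name `bayes_posterior` is likewise an assumption. Here $A$ is "the email is spam" and $B$ is "the email contains the word offer".

```python
from fractions import Fraction


def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A | B) from P(A), P(B | A) and P(B | not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers, for illustration only.
p_spam = Fraction(2, 5)                # P(A): 40% of email is spam
p_offer_given_spam = Fraction(1, 2)    # P(B | A)
p_offer_given_legit = Fraction(1, 20)  # P(B | not A)

posterior = bayes_posterior(p_spam, p_offer_given_spam, p_offer_given_legit)
print(posterior, float(posterior))  # 20/23, about 0.87
```

Under these assumed numbers, seeing "offer" raises the probability of spam from 0.4 to roughly 0.87, which is the sense in which evidence updates a prior belief.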

## Unveiling the 'Naive' Assumption  

The "naive" in Naive Bayes refers to its strong assumption of feature independence: given the class, the presence or absence of one feature is assumed to have no effect on the presence or absence of any other. This assumption rarely holds in practice, yet it keeps the model simple and fast to train, and the classifier often performs well despite it.

**Example Scenario**:  

   - **Email 1 (Spam)**: "Urgent offer! Get exclusive deals now!"  
   - **Email 2 (Legitimate)**: "This is an urgent reminder about your appointment."  
   - **Email 3 (Spam)**: "Urgent! Amazing offer awaits! Act now!"  

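In this scenario, Email 1 and Email 3 both contain the words "urgent" and "offer", and both are spam. In real mail these words tend to appear together, so they are not independent. Naive Bayes ignores that correlation and treats each word as a separate piece of evidence: "offer" appears only in the spam examples, while "urgent" also appears in the legitimate Email 2 and is therefore weaker evidence on its own.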
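
Written out, the independence assumption lets the likelihood of an email's words factor into per-word terms. The symbols $C$ (the class, spam or legitimate) and $w_1, \dots, w_n$ (the words of the email) are introduced here for illustration:

$$P(w_1, \dots, w_n \mid C) = \prod_{i=1}^{n} P(w_i \mid C)$$

Combined with Bayes' theorem, this gives the quantity the classifier compares across classes:

$$P(C \mid w_1, \dots, w_n) \propto P(C) \prod_{i=1}^{n} P(w_i \mid C)$$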

## Multinomial Naive Bayes Classifier

### Exploring Multinomial Naive Bayes for Email Classification

The following short example applies multinomial Naive Bayes to classify emails as either spam or legitimate.

#### Dataset:

**Legitimate Emails:**
1. Email 1: "Hello, I am interested in your business proposal."
2. Email 2: "Please find attached the meeting agenda for tomorrow."
3. Email 3: "Reminder: Your appointment is scheduled for next week."

**Spam Emails:**
1. Email 1: "Get rich quick! Buy our amazing products now!"
2. Email 2: "Congratulations! You have won a free vacation."
3. Email 3: "Enlarge your bank account with our guaranteed investment plan."

### Step 1: Count the Occurrences of Words

#### Legitimate Emails:
- Total words: 24 (22 distinct, after lowercasing and stripping punctuation)

      word_count = {
          "hello": 1,
          "i": 1,
          "am": 1,
          "interested": 1,
          "in": 1,
          "your": 2,
          "business": 1,
          "proposal": 1,
          "please": 1,
          "find": 1,
          "attached": 1,
          "the": 1,
          "meeting": 1,
          "agenda": 1,
          "for": 2,
          "tomorrow": 1,
          "reminder": 1,
          "appointment": 1,
          "is": 1,
          "scheduled": 1,
          "next": 1,
          "week": 1
      }


#### Spam Emails:
- Total words: 24 (23 distinct, after lowercasing and stripping punctuation)

      word_count = {
          "get": 1,
          "rich": 1,
          "quick": 1,
          "buy": 1,
          "our": 2,
          "amazing": 1,
          "products": 1,
          "now": 1,
          "congratulations": 1,
          "you": 1,
          "have": 1,
          "won": 1,
          "a": 1,
          "free": 1,
          "vacation": 1,
          "enlarge": 1,
          "your": 1,
          "bank": 1,
          "account": 1,
          "with": 1,
          "guaranteed": 1,
          "investment": 1,
          "plan": 1
      }
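
The counting step can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the `tokenize` helper, and its choice to lowercase the text and keep only runs of letters, are assumptions made here so that "Your" and "your" (or "quick!" and "quick") count as the same word.

```python
import re
from collections import Counter

legitimate_emails = [
    "Hello, I am interested in your business proposal.",
    "Please find attached the meeting agenda for tomorrow.",
    "Reminder: Your appointment is scheduled for next week.",
]
spam_emails = [
    "Get rich quick! Buy our amazing products now!",
    "Congratulations! You have won a free vacation.",
    "Enlarge your bank account with our guaranteed investment plan.",
]


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens (assumed scheme)."""
    return re.findall(r"[a-z]+", text.lower())


def count_words(emails):
    """Count how often each token occurs across a list of emails."""
    counts = Counter()
    for email in emails:
        counts.update(tokenize(email))
    return counts


legit_counts = count_words(legitimate_emails)
spam_counts = count_words(spam_emails)

# Total tokens and number of distinct tokens per class.
print(sum(legit_counts.values()), len(legit_counts))  # 24 22
print(sum(spam_counts.values()), len(spam_counts))    # 24 23
print(legit_counts["for"], spam_counts["our"])        # 2 2
```

With this tokenization each class contains 24 tokens; "for" and "your" occur twice in the legitimate emails, and "our" occurs twice in the spam emails.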


### Step 2: Calculate Probabilities  

**Note:** We use Laplace smoothing to avoid zero probabilities, with smoothing parameter alpha = 1. For each word, we add alpha to its count in the numerator and alpha times the vocabulary size (the number of distinct words across both classes) to the denominator: P(word | class) = (count of word in class + alpha) / (total words in class + alpha * vocabulary size). This ensures that a word that never appeared in the training data for a particular class still has a non-zero probability in that class.

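#### Legitimate Emails:
- Total words: 24
- Prior probability (P(Legitimate)): 3/6 = 0.5
- Word probabilities (P(word|Legitimate)), using a vocabulary of 44 distinct words across both classes (after lowercasing and stripping punctuation):
  - P("hello" | Legitimate) = (1 + 1) / (24 + 44) = 2/68
  - P("tomorrow" | Legitimate) = (1 + 1) / (24 + 44) = 2/68
  - P("business" | Legitimate) = (1 + 1) / (24 + 44) = 2/68
  - P("get" | Legitimate) = (0 + 1) / (24 + 44) = 1/68 (a word seen only in spam)
  (and so on for other words)

#### Spam Emails:
- Total words: 24
- Prior probability (P(Spam)): 3/6 = 0.5
- Word probabilities (P(word|Spam)), using the same 44-word vocabulary:
  - P("get" | Spam) = (1 + 1) / (24 + 44) = 2/68
  - P("rich" | Spam) = (1 + 1) / (24 + 44) = 2/68
  - P("quick" | Spam) = (1 + 1) / (24 + 44) = 2/68
  - P("hello" | Spam) = (0 + 1) / (24 + 44) = 1/68 (a word seen only in legitimate email)
  (and so on for other words)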
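
The smoothed estimates can be checked with a short sketch. It assumes the standard Laplace formula, (count + alpha) / (class total + alpha * vocabulary size), and the same assumed tokenization as before (lowercase, letters only). The names `training_data`, `tokenize`, and `smoothed_prob` are hypothetical and used only for this illustration.

```python
import re
from collections import Counter
from fractions import Fraction

training_data = {
    "legitimate": [
        "Hello, I am interested in your business proposal.",
        "Please find attached the meeting agenda for tomorrow.",
        "Reminder: Your appointment is scheduled for next week.",
    ],
    "spam": [
        "Get rich quick! Buy our amazing products now!",
        "Congratulations! You have won a free vacation.",
        "Enlarge your bank account with our guaranteed investment plan.",
    ],
}


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens (assumed scheme)."""
    return re.findall(r"[a-z]+", text.lower())


# Per-class word counts, plus the vocabulary shared by both classes.
counts = {
    label: Counter(tok for email in emails for tok in tokenize(email))
    for label, emails in training_data.items()
}
vocabulary = set().union(*counts.values())


def smoothed_prob(word, label, alpha=1):
    """Laplace-smoothed estimate of P(word | label)."""
    class_total = sum(counts[label].values())
    return Fraction(counts[label][word] + alpha,
                    class_total + alpha * len(vocabulary))


print(len(vocabulary))                       # 44
print(smoothed_prob("hello", "legitimate"))  # 1/34, i.e. 2/68
print(smoothed_prob("hello", "spam"))        # 1/68
print(smoothed_prob("our", "spam"))          # 3/68
```

Using `Fraction` keeps the results exact, which makes them easy to compare with hand calculations. Because the denominator includes the whole vocabulary, the probabilities for each class sum to 1 over all 44 words.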

### Step 3: Make Predictions

Suppose we receive a new email: "Urgent: Double your income with our exclusive offer!"

For each class we compute a score: the prior multiplied by the Laplace-smoothed probability of every word in the email. These scores are proportional (∝) to the posterior probabilities, because the shared denominator P(email) is the same for both classes and can be omitted:

- P(Legitimate | email) ∝ 0.5 * P("urgent" | Legitimate) * P("double" | Legitimate) * ... * P("offer" | Legitimate)
- P(Spam | email) ∝ 0.5 * P("urgent" | Spam) * P("double" | Spam) * ... * P("offer" | Spam)

We then classify the email as legitimate or spam according to whichever score is higher.
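
The prediction step can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: tokens are lowercased runs of letters, words that never appeared in training are skipped, and scores are summed in log space. The names `log_score` and `classify` are hypothetical.

```python
import math
import re
from collections import Counter

training_data = {
    "legitimate": [
        "Hello, I am interested in your business proposal.",
        "Please find attached the meeting agenda for tomorrow.",
        "Reminder: Your appointment is scheduled for next week.",
    ],
    "spam": [
        "Get rich quick! Buy our amazing products now!",
        "Congratulations! You have won a free vacation.",
        "Enlarge your bank account with our guaranteed investment plan.",
    ],
}


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens (assumed scheme)."""
    return re.findall(r"[a-z]+", text.lower())


counts = {
    label: Counter(tok for email in emails for tok in tokenize(email))
    for label, emails in training_data.items()
}
vocabulary = set().union(*counts.values())
total_emails = sum(len(emails) for emails in training_data.values())


def log_score(text, label, alpha=1):
    """log P(label) plus the sum of log P(word | label) over known words."""
    class_total = sum(counts[label].values())
    score = math.log(len(training_data[label]) / total_emails)
    for word in tokenize(text):
        if word not in vocabulary:
            continue  # assumption: ignore words never seen in training
        score += math.log((counts[label][word] + alpha)
                          / (class_total + alpha * len(vocabulary)))
    return score


def classify(text):
    """Return the predicted label and the normalised posterior per label."""
    scores = {label: log_score(text, label) for label in training_data}
    prediction = max(scores, key=scores.get)
    norm = sum(math.exp(s) for s in scores.values())
    posteriors = {label: math.exp(s) / norm for label, s in scores.items()}
    return prediction, posteriors


prediction, posteriors = classify(
    "Urgent: Double your income with our exclusive offer!")
print(prediction, round(posteriors["spam"], 3))  # spam 0.8
```

Only "your", "with", and "our" appear in the training vocabulary, so they decide the outcome: the spam score is proportional to 2 * 2 * 3 = 12 and the legitimate score to 3 * 1 * 1 = 3, giving a spam posterior of 12/15 = 0.8. Summing logarithms instead of multiplying probabilities is a design choice that avoids numerical underflow on longer emails.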