### **Problem: Emotion Detection in Text**

We need to classify the emotion expressed in a text message using Bayes' Theorem, based on predefined probabilities of words given each emotion and the prior probabilities of the emotions themselves.

### **Solution Design**
**1. Data Preprocessing**

We will clean the text input by:

* Removing punctuation.
* Converting all text to lowercase.
* Tokenizing the text into words.
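
As a minimal sketch of these three steps, the snippet below runs them on one sample string. The helper name `clean_and_tokenize` and the sample message are illustrative assumptions, not part of the original design. Note that stripping punctuation also removes apostrophes, so "I'm" becomes "im".

```python
import string

def clean_and_tokenize(text):
    """Hypothetical helper: strip punctuation, lowercase, split on whitespace."""
    no_punct = text.translate(str.maketrans('', '', string.punctuation))
    return no_punct.lower().split()

tokens = clean_and_tokenize("I'm SO happy today!!!")
print(tokens)  # ['im', 'so', 'happy', 'today']
```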

**2. Bayesian Algorithm**

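* Prior Probabilities $P(E)$: The probability of each emotion before seeing the text.
* Likelihood $P(W∣E)$: The probability of each word given an emotion. For a message with several words $w_1, \dots, w_n$, the words are treated as independent given the emotion (the naive Bayes assumption), so the word likelihoods multiply: $P(W∣E) = \prod_{i=1}^{n} P(w_i∣E)$.
* Posterior Probability $P(E∣W)$: The probability of an emotion given the text message. This is calculated using:
$$ P(E∣W) = \frac{P(W∣E)P(E)}{P(W)} $$
where $P(W)$ is calculated by summing over all emotions $E'$:

$$ P(W) = \sum_{E'} P(W∣E')P(E') $$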
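
As a worked example, take the single-word message "great" with the example values from the implementation: priors $P(\text{happy}) = 0.4$, $P(\text{sad}) = 0.3$, $P(\text{angry}) = 0.3$ and likelihoods $0.125$, $0.03$, $0.02$ for "great" under each emotion. The numerators are:

$$ P(\text{great}∣\text{happy})P(\text{happy}) = 0.125 \times 0.4 = 0.05 $$
$$ P(\text{great}∣\text{sad})P(\text{sad}) = 0.03 \times 0.3 = 0.009 $$
$$ P(\text{great}∣\text{angry})P(\text{angry}) = 0.02 \times 0.3 = 0.006 $$

Summing them gives $P(W) = 0.05 + 0.009 + 0.006 = 0.065$, so:

$$ P(\text{happy}∣\text{great}) = \frac{0.05}{0.065} \approx 0.769 $$

The remaining posteriors are $0.009 / 0.065 \approx 0.138$ for sad and $0.006 / 0.065 \approx 0.092$ for angry, so the predicted emotion is happy.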

**3. Handling Errors**

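If a word is not found in the probability table, we assign it a small fixed probability (0.001 in the implementation) for every emotion, so that a single unknown word does not zero out the results. Because the same value is used for all emotions, an unknown word scales every score equally and does not change which emotion ranks highest.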
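
The sketch below illustrates why a shared fallback probability leaves the result unchanged. It assumes a message containing the known word "great" plus one unknown word; the `normalize` helper and variable names are hypothetical, and the numbers are taken from the example table in the implementation.

```python
import math

def normalize(scores):
    """Divide each score by the total so the values sum to 1."""
    total = sum(scores.values())
    return {emotion: value / total for emotion, value in scores.items()}

# Unnormalized P(W|E) * P(E) for the known word "great"
scores = {'happy': 0.125 * 0.4, 'sad': 0.03 * 0.3, 'angry': 0.02 * 0.3}

# Assumed fallback probability, applied identically to every emotion
UNKNOWN_WORD_PROBABILITY = 0.001
scores_with_unknown = {e: s * UNKNOWN_WORD_PROBABILITY for e, s in scores.items()}

before = normalize(scores)
after = normalize(scores_with_unknown)

# The common factor cancels during normalization
unchanged = all(math.isclose(before[e], after[e]) for e in scores)
print(unchanged)  # True
```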

**4. User Interface**

We'll create a simple input mechanism that accepts a text message from the user and outputs the most probable emotion based on Bayesian classification.

### **Python Implementation**

The following Python code implements the above logic:

```python
import string

# Example data: emotion labels, priors P(E), and word likelihoods P(W|E)
emotions = ['happy', 'sad', 'angry']
prior_probabilities = {'happy': 0.4, 'sad': 0.3, 'angry': 0.3}
conditional_probabilities = {
    'happy': {'happy': 0.25, 'joyful': 0.125, 'great': 0.125, 'sad': 0.05, 'down': 0.03, 'angry': 0.02, 'mad': 0.01, 'frustrated': 0.005},
    'sad': {'happy': 0.05, 'joyful': 0.02, 'great': 0.03, 'sad': 0.20, 'down': 0.125, 'angry': 0.03, 'mad': 0.02, 'frustrated': 0.01},
    'angry': {'happy': 0.10, 'joyful': 0.03, 'great': 0.02, 'sad': 0.05, 'down': 0.02, 'angry': 0.15, 'mad': 0.125, 'frustrated': 0.10}
}

# Small probability used for words missing from the table
UNKNOWN_WORD_PROBABILITY = 0.001

# Function to preprocess the input text
def preprocess_text(text):
    # Remove punctuation and convert to lowercase
    text = text.translate(str.maketrans('', '', string.punctuation)).lower()
    # Tokenize on whitespace
    return text.split()

# Function to calculate P(E|W) for each emotion and return the most probable one
def calculate_posterior_probability(message):
    words = preprocess_text(message)

    # Initialize each emotion's score with its prior P(E)
    posterior_probabilities = {emotion: prior_probabilities[emotion] for emotion in emotions}

    # Calculate P(W|E) * P(E) for each emotion
    for emotion in emotions:
        for word in words:
            # Words not found in the table fall back to the small probability
            posterior_probabilities[emotion] *= conditional_probabilities[emotion].get(
                word, UNKNOWN_WORD_PROBABILITY
            )

    # Calculate P(W)
    total_prob = sum(posterior_probabilities.values())

    # Normalize the probabilities by dividing by P(W). Skip this if the products
    # underflowed to zero, which would otherwise raise ZeroDivisionError.
    if total_prob > 0:
        for emotion in emotions:
            posterior_probabilities[emotion] /= total_prob

    # Return the emotion with the highest posterior probability
    return max(posterior_probabilities, key=posterior_probabilities.get)

# Example usage
message = input("Enter a text message: ")
predicted_emotion = calculate_posterior_probability(message)
print(f"The predicted emotion is: {predicted_emotion}")
```

### **Explanation of the Code:**

**1. Preprocessing**: The preprocess_text() function removes punctuation from the input message, converts it to lowercase for consistency, and splits it into a list of words.

**2. Bayes' Theorem Calculation**: The calculate_posterior_probability() function starts from the prior probability of each emotion and multiplies it by the likelihood of each word in the message given that emotion. If a word is not in the conditional probabilities, a small probability (0.001) is used instead. The resulting products are then divided by their sum, $P(W)$, to give the posterior probability of each emotion.

**3. Prediction**: The system predicts the most probable emotion by selecting the emotion with the highest posterior probability.

### **Potential Error:**
* **Unknown Words**: Each unknown word is assigned a small probability (0.001) so that no emotion's probability is forced to zero. However, if the message is long or contains many unknown words, the product of probabilities can become too small to represent as a floating-point number, and the prediction is no longer reliable.
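
One common way to avoid this underflow is to sum log-probabilities instead of multiplying probabilities. The sketch below is an alternative to the implementation above, not the original method. The function name `classify_log_space`, its parameters, and the trimmed likelihood table are assumptions made for illustration.

```python
import math

def classify_log_space(words, priors, likelihoods, unknown_prob=0.001):
    """Hypothetical variant: return (best_emotion, posteriors) using log-probabilities."""
    log_scores = {}
    for emotion, prior in priors.items():
        # log P(E) + sum of log P(w|E) replaces the product P(E) * prod P(w|E)
        score = math.log(prior)
        for word in words:
            score += math.log(likelihoods[emotion].get(word, unknown_prob))
        log_scores[emotion] = score

    # Shift by the largest log score before exponentiating, so the best
    # emotion maps to exp(0) = 1 and the sum can never be zero
    max_score = max(log_scores.values())
    shifted = {e: math.exp(s - max_score) for e, s in log_scores.items()}
    total = sum(shifted.values())
    posteriors = {e: v / total for e, v in shifted.items()}
    return max(posteriors, key=posteriors.get), posteriors

# Trimmed copy of the example data, for illustration only
priors = {'happy': 0.4, 'sad': 0.3, 'angry': 0.3}
likelihoods = {
    'happy': {'great': 0.125, 'down': 0.03, 'mad': 0.01},
    'sad': {'great': 0.03, 'down': 0.125, 'mad': 0.02},
    'angry': {'great': 0.02, 'down': 0.02, 'mad': 0.125},
}

# One known word followed by 200 unknown words: a plain product of
# probabilities would underflow to 0.0, but the log scores stay finite
long_message = ['great'] + ['xyz'] * 200
best, posteriors = classify_log_space(long_message, priors, likelihoods)
print(best, posteriors)
```

Because the unknown-word factor is the same for every emotion, the posteriors for this long message should match those for the single word "great" alone, up to floating-point rounding.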