In [1]:
# Q1. What is Bayes' theorem?
# Bayes' theorem, named after the Reverend Thomas Bayes, is a fundamental theorem in probability theory that describes how to 
# update the probability of a hypothesis (an event or proposition) based on new evidence or information. 

# Mathematically, Bayes' theorem can be expressed as:

# \[ P(H | E) = \frac{P(E | H) \cdot P(H)}{P(E)} \]

# Where:
# - \( P(H | E) \) is the posterior probability of hypothesis \( H \) given evidence \( E \).
# - \( P(E | H) \) is the probability of observing evidence \( E \) given that \( H \) is true (likelihood).
# - \( P(H) \) is the prior probability of hypothesis \( H \) being true before considering the evidence.
# - \( P(E) \) is the total probability of observing the evidence \( E \) (also known as the marginal likelihood or evidence).

# Bayes' theorem is foundational in statistics, machine learning, and various fields where inference and decision-making under
# uncertainty are crucial. It allows us to revise our beliefs about the likelihood of hypotheses in light of new data or 
# observations.
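
# A minimal sketch of the formula above as a function. The name `bayes_posterior` and the numbers used below are
# illustrative assumptions, not values taken from this notebook.

```python
def bayes_posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Return P(H | E) = P(E | H) * P(H) / P(E)."""
    if not 0.0 < evidence <= 1.0:
        raise ValueError("P(E) must lie in (0, 1]")
    return likelihood * prior / evidence


# Made-up inputs, chosen only to exercise the formula.
p_h = 0.3          # prior P(H)
p_e_given_h = 0.8  # likelihood P(E | H)
p_e = 0.5          # evidence P(E)

posterior = bayes_posterior(p_h, p_e_given_h, p_e)
print(f"P(H | E) = {posterior:.2f}")
```

# With these inputs the evidence raises the probability of H from 0.30 to about 0.48.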

In [2]:
# Q2. What is the formula for Bayes' theorem?
# \[ P(H | E) = \frac{P(E | H) \cdot P(H)}{P(E)} \]

 



In [3]:
# Q3. How is Bayes' theorem used in practice?
# Bayes' theorem is applied in practice to update beliefs or probabilities in light of new evidence.
# It's used in various fields such as statistics, machine learning, and medical diagnosis. 
# For example, in medical diagnosis, it helps calculate the probability of a disease given certain symptoms by
# combining the initial probability of having the disease (prior), the probability of exhibiting those symptoms
# if the disease is present (likelihood), and the overall probability of the symptoms across all patients (evidence).
# This updated probability (posterior) guides decision-making, treatment plans, and risk assessments based on the
# latest available information.
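
# The diagnosis example can be sketched numerically. The prevalence, sensitivity, and false-positive rate below are
# hypothetical values chosen for illustration, and `posterior_disease` is a helper name introduced here. P(E) is
# expanded with the law of total probability over "disease" and "no disease".

```python
def posterior_disease(prevalence: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(disease | positive finding) using Bayes' theorem."""
    # P(E) = P(E | disease) * P(disease) + P(E | no disease) * P(no disease)
    p_positive = sensitivity * prevalence + false_positive_rate * (1.0 - prevalence)
    return sensitivity * prevalence / p_positive


# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p_disease_given_positive = posterior_disease(0.01, 0.95, 0.05)
print(f"P(disease | positive) = {p_disease_given_positive:.3f}")
```

# Under these assumed rates the posterior stays well below 50% because the prior (prevalence) is so small.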

In [4]:
# Q4. What is the relationship between Bayes' theorem and conditional probability?
# Bayes' theorem and conditional probability are closely related concepts in probability theory:

# 1. **Bayes' Theorem**: Bayes' theorem provides a way to update the probability of a hypothesis \( H \) given evidence \( E \), using the conditional probabilities of \( E \) given \( H \) (\( P(E | H) \)), the prior probability of \( H \) (\( P(H) \)), and the total probability of \( E \) (\( P(E) \)).

#    \[ P(H | E) = \frac{P(E | H) \cdot P(H)}{P(E)} \]

# 2. **Conditional Probability**: Conditional probability refers to the probability of an event \( E \) occurring given that another event \( H \) has occurred, denoted as \( P(E | H) \).

#    \[ P(E | H) = \frac{P(E \cap H)}{P(H)} \]

# The relationship is that Bayes' theorem follows directly from the definition of conditional probability. Writing the joint probability both ways gives \( P(E \cap H) = P(E | H) \cdot P(H) = P(H | E) \cdot P(E) \), and dividing by \( P(E) \) (assumed non-zero) yields Bayes' theorem.
# In other words, Bayes' theorem uses the conditional probability \( P(E | H) \) to turn the prior \( P(H) \) into the posterior \( P(H | E) \), adjusting the initial belief according to how likely the evidence is under the hypothesis. This makes it a fundamental tool in probabilistic reasoning and decision-making under uncertainty.
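
# A small numerical check of this relationship. The joint distribution below is hypothetical, with values picked
# only so the arithmetic is easy to follow. The posterior is computed twice: directly from the definition of
# conditional probability, and through Bayes' theorem.

```python
from fractions import Fraction

# Hypothetical joint distribution P(H, E), keyed by (H is true, E is observed).
joint = {
    (True, True): Fraction(3, 20),
    (True, False): Fraction(1, 20),
    (False, True): Fraction(4, 20),
    (False, False): Fraction(12, 20),
}

p_h = sum(p for (h, _), p in joint.items() if h)  # marginal P(H)
p_e = sum(p for (_, e), p in joint.items() if e)  # marginal P(E)
p_h_and_e = joint[(True, True)]                   # joint P(E and H)

# Definition of conditional probability.
p_e_given_h = p_h_and_e / p_h
p_h_given_e_direct = p_h_and_e / p_e

# Bayes' theorem, built from P(E | H), P(H), and P(E).
p_h_given_e_bayes = p_e_given_h * p_h / p_e

print(p_h_given_e_direct, p_h_given_e_bayes)
```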

In [5]:
# Q5. How do you choose which type of Naive Bayes classifier to use for any given problem?
# Choosing the appropriate type of Naive Bayes classifier depends on the nature of the problem and the characteristics of the
# data. Here are considerations for selecting the type of Naive Bayes classifier:

# 1. **Gaussian Naive Bayes**: 
#    - **Nature of Features**: It assumes continuous features that follow a Gaussian (normal) distribution.
#    - **Example**: Suitable for problems where features are real-valued and roughly normally distributed within
#     each class, such as sensor readings or physical measurements.

# 2. **Multinomial Naive Bayes**: 
#    - **Nature of Features**: It is used when features represent counts or frequencies of occurrences 
#     (e.g., word counts in text classification).
#    - **Example**: Commonly used in text classification tasks where the frequency of words or terms is used as features.

# 3. **Bernoulli Naive Bayes**: 
#    - **Nature of Features**: It assumes binary features (presence or absence of a feature).
#    - **Example**: Useful for text classification tasks where each term occurrence is binary
#     (e.g., presence of a word in a document).

# To decide which type to use:

# - **Data Representation**: Understand how features are represented in your dataset (continuous, counts, binary).
# - **Assumptions**: Consider whether the assumptions of each Naive Bayes type (like Gaussian distribution for Gaussian NB) 
#     align with your data.
# - **Performance**: Sometimes, testing multiple types and evaluating their performance via cross-validation can help determine 
#     which works best for your specific problem.
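
# The data-representation check above can be sketched as a rough heuristic. `suggest_nb_variant` is a hypothetical
# helper introduced here, not a library function; the strings it returns match the class names in scikit-learn's
# `sklearn.naive_bayes` module. It only inspects feature values, so cross-validation is still the better guide.

```python
def suggest_nb_variant(rows):
    """Suggest a Naive Bayes variant from the kind of values in a feature matrix."""
    values = [v for row in rows for v in row]
    if all(v in (0, 1) for v in values):
        return "BernoulliNB"      # binary presence/absence features
    if all(v >= 0 and float(v).is_integer() for v in values):
        return "MultinomialNB"    # non-negative counts or frequencies
    return "GaussianNB"           # general real-valued features


binary_features = [[0, 1, 1], [1, 0, 0]]
count_features = [[3, 0, 2], [0, 5, 1]]
continuous_features = [[5.1, 3.5], [4.9, 3.0]]

for name, data in [("binary", binary_features),
                   ("counts", count_features),
                   ("continuous", continuous_features)]:
    print(name, "->", suggest_nb_variant(data))
```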



In [None]:
# Q6 - You have a dataset with two features, X1 and X2, and two possible classes, A and B. You want to use Naive
# Bayes to classify a new instance with features X1 = 3 and X2 = 4. The following table shows the frequency of
# each feature value for each class:
# Class | X1=1 | X1=2 | X1=3 | X2=1 | X2=2 | X2=3 | X2=4
# A     |  3   |  3   |  4   |  4   |  3   |  3   |  3
# B     |  2   |  2   |  1   |  2   |  2   |  2   |  3
# Assuming equal prior probabilities for each class, which class would Naive Bayes predict the new instance
# to belong to?
# To predict the class of a new instance with features \( X1 = 3 \) and \( X2 = 4 \) using Naive Bayes, we will calculate the posterior probabilities for each class \( A \) and \( B \) based on the given data and assuming equal prior probabilities (\( P(A) = P(B) = 0.5 \)).

# Likelihoods from the table. The X1 and X2 counts do not sum to the same total within a class (A: 10 vs. 13, B: 5 vs. 9), so each likelihood is normalized by the total count of its own feature:
# - Class A:
#   - \( P(X1=3 | A) = \frac{4}{3+3+4} = \frac{4}{10} \)
#   - \( P(X2=4 | A) = \frac{3}{4+3+3+3} = \frac{3}{13} \)
# - Class B:
#   - \( P(X1=3 | B) = \frac{1}{2+2+1} = \frac{1}{5} \)
#   - \( P(X2=4 | B) = \frac{3}{2+2+2+3} = \frac{3}{9} = \frac{1}{3} \)

# Since the Naive Bayes assumption states that features are conditionally independent given the class, we can compute the posterior probabilities \( P(A | X1=3, X2=4) \) and \( P(B | X1=3, X2=4) \) using Bayes' theorem:

# \[ P(A | X1=3, X2=4) \propto P(X1=3 | A) \cdot P(X2=4 | A) \cdot P(A) \]
# \[ P(B | X1=3, X2=4) \propto P(X1=3 | B) \cdot P(X2=4 | B) \cdot P(B) \]

# Substituting the values:

# For Class A:
# \[ P(X1=3 | A) \cdot P(X2=4 | A) \cdot P(A) = \frac{4}{10} \cdot \frac{3}{13} \cdot 0.5 = \frac{3}{65} \approx 0.0462 \]

# For Class B:
# \[ P(X1=3 | B) \cdot P(X2=4 | B) \cdot P(B) = \frac{1}{5} \cdot \frac{1}{3} \cdot 0.5 = \frac{1}{30} \approx 0.0333 \]

# Normalize so that the two posteriors sum to 1:
# \[ P(A | X1=3, X2=4) = \frac{3/65}{3/65 + 1/30} = \frac{18}{31} \approx 0.5806 \]
# \[ P(B | X1=3, X2=4) = \frac{1/30}{3/65 + 1/30} = \frac{13}{31} \approx 0.4194 \]

# Therefore, the Naive Bayes classifier would predict that the new instance with \( X1 = 3 \) and \( X2 = 4 \) belongs to **Class A** because \( P(A | X1=3, X2=4) > P(B | X1=3, X2=4) \).
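
# A sketch that reads the likelihoods straight from the frequency table in the question. It assumes each feature's
# counts are normalized by that feature's own total within the class (the X1 and X2 totals differ within a class, so
# this is a modelling choice, and the exact posterior values depend on it). The variable and function names are
# introduced here for illustration.

```python
from fractions import Fraction

# Frequency table from the question: counts[class][feature][value].
counts = {
    "A": {"X1": {1: 3, 2: 3, 3: 4}, "X2": {1: 4, 2: 3, 3: 3, 4: 3}},
    "B": {"X1": {1: 2, 2: 2, 3: 1}, "X2": {1: 2, 2: 2, 3: 2, 4: 3}},
}
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}  # equal priors, as stated
instance = {"X1": 3, "X2": 4}


def unnormalized_score(cls):
    """Prior times the product of per-feature likelihoods (naive independence)."""
    score = priors[cls]
    for feature, value in instance.items():
        table = counts[cls][feature]
        score *= Fraction(table[value], sum(table.values()))
    return score


scores = {cls: unnormalized_score(cls) for cls in counts}
total = sum(scores.values())
posteriors = {cls: score / total for cls, score in scores.items()}
prediction = max(posteriors, key=posteriors.get)

print(prediction, {cls: round(float(p), 4) for cls, p in posteriors.items()})
```

# Under this normalization the predicted class is A.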